[swift-dev] [RFC] UnsafeBytePointer API for In-Memory Layout

Andrew Trick atrick at apple.com
Fri May 13 13:36:05 CDT 2016


> On May 12, 2016, at 4:03 PM, John McCall via swift-dev <swift-dev at swift.org> wrote:
> 
>> On May 12, 2016, at 3:21 PM, Joe Groff <jgroff at apple.com> wrote:
>>> On May 12, 2016, at 11:21 AM, John McCall <rjmccall at apple.com> wrote:
>>> 
>>>> On May 12, 2016, at 10:45 AM, Jordan Rose via swift-dev <swift-dev at swift.org> wrote:
>>>>> On May 12, 2016, at 10:44, Joe Groff <jgroff at apple.com> wrote:
>>>>> 
>>>>> 
>>>>>> On May 12, 2016, at 9:27 AM, Jordan Rose via swift-dev <swift-dev at swift.org> wrote:
>>>>>> 
>>>>>> 
>>>>>> - I’m uncomfortable with using the term “undefined behavior” as if it’s universally understood. Up until now we haven't formally had that notion in Swift, just “type safety” and “memory safety” and “invariant-preserving” and the like. Maybe we need it now, but I think it needs to be explicitly defined. (I’d actually talk to Dave about exactly what terms make the most sense for users.)
>>>>> 
>>>>> We do have undefined behavior, and use that term in the standard library docs where appropriate:
>>>>> 
>>>>> stdlib/public/core/Optional.swift-  /// `!` (forced unwrap) operator. However, in optimized builds (`-O`), no
>>>>> stdlib/public/core/Optional.swift-  /// check is performed to ensure that the current instance actually has a
>>>>> stdlib/public/core/Optional.swift-  /// value. Accessing this property in the case of a `nil` value is a serious
>>>>> stdlib/public/core/Optional.swift:  /// programming error and could lead to undefined behavior or a runtime
>>>>> stdlib/public/core/Optional.swift-  /// error.
>>>>> stdlib/public/core/Optional.swift-  ///
>>>>> stdlib/public/core/Optional.swift-  /// In debug builds (`-Onone`), the `unsafelyUnwrapped` property has the same
>>>>> --
>>>>> stdlib/public/core/StringBridge.swift-  /// The caller of this function guarantees that the closure 'body' does not
>>>>> stdlib/public/core/StringBridge.swift-  /// escape the object referenced by the opaque pointer passed to it or
>>>>> stdlib/public/core/StringBridge.swift-  /// anything transitively reachable form this object. Doing so
>>>>> stdlib/public/core/StringBridge.swift:  /// will result in undefined behavior.
>>>>> stdlib/public/core/StringBridge.swift-  @_semantics("self_no_escaping_closure")
>>>>> stdlib/public/core/StringBridge.swift-  func _unsafeWithNotEscapedSelfPointer<Result>(
>>>>> stdlib/public/core/StringBridge.swift-    _ body: @noescape (OpaquePointer) throws -> Result
>>>>> --
>>>>> stdlib/public/core/Unmanaged.swift-  /// reference's lifetime fixed for the duration of the
>>>>> stdlib/public/core/Unmanaged.swift-  /// '_withUnsafeGuaranteedRef' call.
>>>>> stdlib/public/core/Unmanaged.swift-  ///
>>>>> stdlib/public/core/Unmanaged.swift:  /// Violation of this will incur undefined behavior.
>>>>> stdlib/public/core/Unmanaged.swift-  ///
>>>>> stdlib/public/core/Unmanaged.swift-  /// A lifetime of a reference 'the instance' is fixed over a point in the
>>>>> stdlib/public/core/Unmanaged.swift-  /// programm if:
>>>> 
>>>> Those latter two are in stdlib-internal declarations. I think I have the same objection with using the term for 'unsafelyUnwrapped'.
>>> 
>>> Well, we can say "A program has undefined behavior if it does X or Y", or we can say "A program which does X or Y lacks type safety". In all cases we are referring to a concept defined elsewhere.  If we say "undefined behavior", we are using an easily-googled term whose popular discussions will quickly inform the reader of the consequences of the violation.  If we say "type safety", we are using a term that's popularly used in very vague, hand-wavey ways and whose consequences aren't usually discussed outside of formal contexts.  If we say "memory safety", we're using a term that doesn't even have that precedent.  So we can use the latter two terms if we want, but that just means we need to have a standard place where we define them and describe the consequences of violating them, probably with at least a footnote saying "this is analogous to the undefined behavior rules of C and C++".
>> 
>> In other places where the standard library intentionally has undefined behavior, it looks like we use the term "serious programming error", for instance in the doc comment for `assert`:
>> 
>> /// * In -Ounchecked builds, `condition` is not evaluated, but the
>> ///   optimizer may assume that it *would* evaluate to `true`. Failure
>> ///   to satisfy that assumption in -Ounchecked builds is a serious
>> ///   programming error.
>> 
>> which feels a bit colloquial to me, and doesn't provide much insight into the full consequences of UB. I think we're better off using an established term.
> 
> Agreed.
> 
> Do we have a good place to document common terms?  Preferably one that isn't a book?
> 
> John.


Am I the only one who sees defining "undefined behavior" as a paradox?

I'm not disagreeing with better documentation, but there's no way to specify the behavior of compiled code once you feed the compiler an incorrect fact. Violating a simple constraint that two pointers cannot alias can easily lead to executing code paths that would not otherwise be executed, hence unknown side effects. We could make statements about the current implementation of the compiler, but that would only be misleading, as it's impossible to make any guarantee about future compilers once you've violated the contract. The implementation should make common cases less surprising, but limits on the possible side effects can't be specified. Once you intentionally step beyond the protection that the Swift language provides, you're firmly in C/C++ compiler territory. So for more on that, see one of the many discussions out there on the topic in general.
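
As a rough illustration of what "feeding the compiler an incorrect fact" can look like, here is a minimal sketch. The function name `storeThenRead` is hypothetical, and the comments describe what an optimizer is permitted to assume, not what any particular compiler version does. The call shown respects the contract; passing two pointers to the same memory would break it, and nothing could then be said about the result.

```swift
// Hypothetical helper: the two typed pointers tell the compiler that
// `ints` and `floats` refer to distinct, correctly typed memory.
func storeThenRead(_ ints: UnsafeMutablePointer<Int32>,
                   _ floats: UnsafeMutablePointer<Float>) -> Int32 {
    ints.pointee = 1
    floats.pointee = 0.0   // assumed not to overwrite `ints`
    return ints.pointee    // so an optimizer may treat this as the constant 1
}

// Contract respected: separate allocations, each accessed as its own type.
let ints = UnsafeMutablePointer<Int32>.allocate(capacity: 1)
let floats = UnsafeMutablePointer<Float>.allocate(capacity: 1)
ints.initialize(to: 0)
floats.initialize(to: 1.5)
let result = storeThenRead(ints, floats)
ints.deallocate()
floats.deallocate()
```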

What we should try really hard to do is to make it clear what rules programmers need to follow to safely use "unsafe" constructs. Once you have those rules, you have a contract with future compilers and you can write code sanitizers. 

I'm specifically focussing on UnsafePointer's Pointee type, because making that safer requires source breaking changes, and because the rules were so nonobvious. This is an API that programmers use when they are comfortable taking responsibility for the lifetime and bounds of an object. They are probably not expecting to take responsibility for type safety, and likely not even aware of strict aliasing rules.
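
To make the Pointee concern concrete, here is a minimal sketch of reinterpreting memory without going through a typed pointer of the wrong type. It assumes a current Swift toolchain and uses `UnsafeMutableRawPointer`, which is not the `UnsafeBytePointer` name proposed in this RFC. The helper `bitPattern(of:)` is hypothetical and exists only for illustration.

```swift
// Sketch: read the bits of a Float as a UInt32 using untyped (raw) memory,
// so no typed pointer ever claims the memory holds the wrong Pointee.
func bitPattern(of value: Float) -> UInt32 {
    // Raw memory has no Pointee type, so there is no typed-access rule to break.
    let raw = UnsafeMutableRawPointer.allocate(
        byteCount: MemoryLayout<Float>.size,
        alignment: MemoryLayout<Float>.alignment)
    defer { raw.deallocate() }

    // Write the Float's bytes, then read the same bytes back as a UInt32.
    raw.storeBytes(of: value, as: Float.self)
    return raw.load(as: UInt32.self)
}
```

The caller is still responsible for lifetime, bounds, and alignment here; the raw pointer only removes the implied promise about the type stored in memory.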

-Andy
