[swift-evolution] [late pitch] UnsafeBytes proposal

Andrew Trick atrick at apple.com
Fri Aug 19 15:48:07 CDT 2016


> On Aug 19, 2016, at 12:43 PM, Xiaodi Wu <xiaodi.wu at gmail.com> wrote:
> 
> On Fri, Aug 19, 2016 at 2:32 PM, Karl via swift-evolution <swift-evolution at swift.org> wrote:
> 
>> On 19 Aug 2016, at 19:35, Andrew Trick <atrick at apple.com> wrote:
>> 
>> 
>>> On Aug 16, 2016, at 7:13 PM, Karl via swift-evolution <swift-evolution at swift.org> wrote:
>>> 
>>> 
>>>> On 16 Aug 2016, at 01:14, David Sweeris via swift-evolution <swift-evolution at swift.org> wrote:
>>>> 
>>>>> On Aug 15, 2016, at 13:55, Michael Ilseman via swift-evolution <swift-evolution at swift.org> wrote:
>>>> 
>>>>> 
>>>>> It seems like there’s a potential for confusion here, in that people may see “UInt8” and assume there is some kind of typed-ness, even though the whole point is that this is untyped. Adjusting the header comments slightly might help:
>>>>> 
>>>>> 
>>>>> /// A non-owning view of raw memory as a collection of bytes.
>>>>> ///
>>>>> /// Reads and writes on memory via `UnsafeBytes` are untyped operations that
>>>>> /// do not require binding the memory to a type. These operations are expressed
>>>>> /// in terms of `UInt8`, though the underlying memory is untyped.
>>>>> 
>>>>> 
>>>>> You could go even further towards hinting this fact with a `typealias Byte = UInt8`, and use Byte throughout. But, I don’t know if that’s getting too excessive.
>>>> 
>>>> I don't think that's too excessive at all. I might even go further and say that we should call it "Untyped" instead of "Byte", to really drive home the point (many people see "byte" and think "8-bit int", which is merely a side effect of CPUs generally not having support for types *other* than ints and floats, rather than a reflection of the true "type" of the data).
>>>> 
>>>> - Dave Sweeris
>>> 
>>> ‘Byte’ is sufficient, I think.
>>> 
>>> In some sense, it is typed as bytes. It reflects the fact that anything that is representable to the computer must be expressible as a sequence of bits (the same way we have string de/serialisation — which of course is not to say that the byte representation is good for serialisation purposes). “withUnsafeBytes” can be seen as doing a reversible type conversion the same way LosslessStringConvertible does; only in this case the conversion is free.
>> 
>> Yes. Byte clearly refers to a value's in-memory representation. But typealias Byte = UInt8 would imply the opposite of what needs to be conveyed. The name Byte refers to raw memory being accessed, not the value being returned by the collection. The in-memory value's bytes are loaded from memory and reinterpreted as UInt8 values. UInt8 is the correct type for the value after it is loaded. Calling the collection’s element type Byte sends the wrong message. e.g. [Byte] or UnsafePointer<Byte> would be nonsense.
>> 
>> Keep in mind the important use case is code that needs to work with a collection of UInt8 values without knowing the type of the values in memory. This makes it intuitive and convenient to implement correctly without needing to reason about the Swift-specific notions of raw vs. typed pointers and binding memory to a type.
>> 
>> The documentation should be fixed to clarify that the in-memory value is not the same as the loaded value.
>> 
>> -Andy
> 
> Well, a byte is a numerical type as much as a UInt8 is. We attach meaning to it (e.g. a memory location), but it’s just a number.
> 
> But I thought what Andy's saying is that he's proposing to standardize the usage of the word byte to mean raw memory and not a number?

That’s right. That’s exactly how the name “bytes” is being used in APIs and method names. A byte is not itself a number, but it is common practice to reinterpret a byte as a number in [0, 256). IMO this isn’t a problem that needs to be fixed.

> Perhaps it shouldn’t be a typealias then (if the alias would have some kind of impure semantics), but its own type which is exactly the same as UInt8. Typing raw memory accesses with `Byte` to indicate that the number was read from raw memory is a good idea for type-safety IMO.
> 
> You’d wonder if we could have initialisers for other integer types which take a fixed-size array of `Byte`s - e.g. UInt16(_: [2 * Byte]). That wouldn’t make as much sense with two UInt8s.

You would always go through memory to reinterpret the bits. There’s nothing wrong with this if you know the underlying pointer is suitably aligned for the type being loaded:

  bytes.load(as: UInt16.self)
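
A slightly fuller sketch of that load, assuming the API in the form it later took in the standard library (`UnsafeRawBufferPointer`, obtained through `withUnsafeBytes(of:)`) rather than the `UnsafeBytes` name used in this pitch. The `value` variable and its contents are hypothetical, chosen only so the result is easy to check:

```swift
// Hypothetical value; .littleEndian pins its in-memory byte order to
// 01 02 03 04 regardless of the host platform.
var value = UInt32(0x0403_0201).littleEndian

let (first, second) = withUnsafeBytes(of: &value) { bytes -> (UInt16, UInt16) in
    // A UInt32 is 4-byte aligned, so byte offsets 0 and 2 both satisfy
    // UInt16's 2-byte alignment requirement.
    let low = bytes.load(as: UInt16.self)
    let high = bytes.load(fromByteOffset: 2, as: UInt16.self)
    // Interpret each loaded value as little-endian so the result is portable.
    return (UInt16(littleEndian: low), UInt16(littleEndian: high))
}

print(first, second)  // 513 1027 (0x0201, 0x0403)
```

Loading at an odd byte offset would violate the alignment requirement mentioned above, which is why the offsets here are 0 and 2.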

UInt8 is the right default for the collection API because it’s common practice to work with buffers of [UInt8].
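
To illustrate the [UInt8] use case, here is a minimal sketch, again assuming the `withUnsafeBytes(of:)` form that later shipped in the standard library. The helper name `rawBytes(of:)` is hypothetical and not part of the proposal:

```swift
// Hypothetical helper: copies the in-memory bytes of a value into [UInt8]
// without the caller needing to reason about the type the memory is bound to.
// Intended for trivial types such as fixed-width integers.
func rawBytes<T>(of value: T) -> [UInt8] {
    // The raw buffer is a collection of UInt8, so Array(...) copies it directly.
    return withUnsafeBytes(of: value) { Array($0) }
}

// .bigEndian pins the in-memory byte order to 01 02 on any host.
let copied = rawBytes(of: UInt16(0x0102).bigEndian)
print(copied)  // [1, 2]
```

The elements come back as ordinary UInt8 values even though the memory was never bound to UInt8, which is the distinction between the in-memory value and the loaded value discussed above.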

Most use cases are not going to exercise the numeric properties of UInt8, but I don’t see that as a problem in practice.

-Andy