[swift-users] raw buffer pointer load alignment

Johannes Weiß johannesweiss at apple.com
Thu Nov 9 10:37:06 CST 2017


Hi Kelvin,

> On 9 Nov 2017, at 12:30 am, Kelvin Ma <kelvin13ma at gmail.com> wrote:
> 
> For context, the problem I’m trying to solve is efficiently parsing JPEG chunks. This means reading each chunk of the JPEG from a file into a raw buffer pointer, and then parsing the chunk according to its expected layout. For example, a frame header chunk looks like this:
> 
> 0                 1                 2                 3                  4                  5              6
> [ precision:UInt8 |          height:UInt16            |             width:UInt16            |   Nf:UInt8   | ... ]
> 
> what I want to do is to be able to load the height and width into something I can pass into UInt16.init(bigEndian:) without failing because of alignment. I’ve thought of several options but none of them seem to be great.
> 
> 1 - bind the entire buffer to UInt8.self, and then do UInt16(buffer[1]) << UInt8.bitWidth | UInt16(buffer[2]). probably most straightforward, but doesn’t generalize well at all to larger Int types.
> 
> 2 - copy MemoryLayout<UInt16>.size bytes from offset 1 into the beginning of a new raw buffer, aligned to MemoryLayout<UInt16>.alignment, and do load(fromByteOffset: 0, as: UInt16.self) from that. Seems very inefficient because you have to allocate a new heap buffer, copy everything over, and then free it just to hold the bytes in the right alignment.
> 
> 3 - use withUnsafeMutablePointer(to:_:) on a local variable of type UInt16, cast it to a raw pointer, and copy MemoryLayout<UInt16>.size bytes into it. Like 2 it involves declaring a temporary variable which is annoying, and also, while the default initialization isn’t that big a problem, it’s introducing a meaningless value into the source code and can be problematic for non-integer types. Also, wasn’t Swift supposed to be designed so that Optional is the only thing which has a “default” value; Bool does not default to false and Int does not default to 0. Default constructors are evil.
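
Option 1 does not have to be written out by hand for each integer width. As a rough sketch (the function name bigEndianInteger and the sample bytes are hypothetical, not from this thread), the shift-and-or can be folded over MemoryLayout<T>.size bytes. It reads one byte at a time, so it never performs a misaligned load:

```swift
// Assemble a big-endian integer byte by byte, starting at `offset`.
func bigEndianInteger<T: FixedWidthInteger>(_ buffer: UnsafeRawBufferPointer, at offset: Int) -> T {
    let size = MemoryLayout<T>.size
    precondition(offset >= 0 && offset <= buffer.count - size)

    var result: T = 0
    for i in 0..<size {
        // Shift the bytes read so far up by one byte, then append the next one.
        result = (result << 8) | T(truncatingIfNeeded: buffer[offset + i])
    }
    return result
}

// Made-up frame header bytes: precision 8, height 0x01E0, width 0x0280, Nf 3.
let sampleChunk: [UInt8] = [8, 0x01, 0xE0, 0x02, 0x80, 3]
let sampleHeight: UInt16 = sampleChunk.withUnsafeBytes { bigEndianInteger($0, at: 1) }
let sampleWidth: UInt16 = sampleChunk.withUnsafeBytes { bigEndianInteger($0, at: 3) }
print(sampleHeight, sampleWidth) // 480 640
```

This only covers the big-endian case that JPEG needs; a little-endian variant would walk the bytes in the opposite order.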

I agree. However, that meaningless value would just exist very temporarily in a function. I think you'd need a fancier type system to express 'this is an uninitialised value on the stack that can only be read after it has been written to'. Sure you could use a local Int16? but that'd come with some overhead.
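
One visible part of that overhead is storage. A small sketch, assuming the overhead meant here includes the extra space an Optional needs to represent nil (the unwrap on every read is a separate cost that this does not measure):

```swift
// Int16 uses every bit pattern of its two bytes, so Int16? needs
// additional storage to represent nil.
let plainSize = MemoryLayout<Int16>.size
let optionalSize = MemoryLayout<Int16?>.size
print("Int16: \(plainSize) bytes, Int16?: \(optionalSize) bytes")
```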

With endianness I still think you can use that function below and you'll get it super efficient. The compiler will (likely) inline that whole function anyway.

What's the problem with the local temporary variable? You'd need that in C too. Could you post the C code that you'd like to write? Then we can work from there and create some Swift code that does the same.


enum Endianness {
    case little
    case big
}

func integerFromBuffer<T: FixedWidthInteger>(_ pointer: UnsafeRawBufferPointer, index: Int, endianness: Endianness = .big) -> T {
    precondition(index >= 0)
    precondition(index <= pointer.count - MemoryLayout<T>.size)

    var value = T()
    withUnsafeMutableBytes(of: &value) { valuePtr in
        valuePtr.copyBytes(from: UnsafeRawBufferPointer(start: pointer.baseAddress!.advanced(by: index),
                                                        count: MemoryLayout<T>.size))
    }
    switch endianness {
        case .little:
            return value.littleEndian /* does nothing on little endian, swaps on big */
        case .big:
            return value.bigEndian /* does nothing on big endian, swaps on little */
    }
}
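
For readers on a newer toolchain: the standard library later gained loadUnaligned(fromByteOffset:as:) on raw pointers and raw buffer pointers, which removes the need for the temporary variable. The sketch below assumes Swift 5.7 or later, which postdates this message, and uses made-up frame header bytes rather than data from a real file:

```swift
// Made-up frame header bytes: precision 8, height 0x01E0, width 0x0280, Nf 3.
let frameHeader: [UInt8] = [8, 0x01, 0xE0, 0x02, 0x80, 3]

let (frameHeight, frameWidth) = frameHeader.withUnsafeBytes { raw -> (UInt16, UInt16) in
    // loadUnaligned has no alignment requirement, so the odd offsets are fine.
    // UInt16(bigEndian:) byte-swaps on little-endian hosts and is a no-op on big-endian ones.
    let h = UInt16(bigEndian: raw.loadUnaligned(fromByteOffset: 1, as: UInt16.self))
    let w = UInt16(bigEndian: raw.loadUnaligned(fromByteOffset: 3, as: UInt16.self))
    return (h, w)
}
print(frameHeight, frameWidth) // 480 640
```

On older toolchains the copy-into-a-local approach in integerFromBuffer remains the portable option.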

-- Johannes


> 
> 
> 
> On Wed, Nov 8, 2017 at 7:49 PM, Johannes Weiß <johannesweiss at apple.com> wrote:
> Hi Kelvin,
> 
> > On 8 Nov 2017, at 5:40 pm, Kelvin Ma <kelvin13ma at gmail.com> wrote:
> >
> > yikes there’s no less verbose way to do that? and if the type isn’t an integer there’s no way to avoid the default initialization? Can this be done with opaques or something?
> 
> well, it's 5 lines for the generic case to rule all the integers. You could just put that in a function and never think about it again, right?
> 
> func integerFromBuffer<T: FixedWidthInteger>(_ pointer: UnsafeRawBufferPointer, index: Int) -> T {
>     precondition(index >= 0)
>     precondition(index <= pointer.count - MemoryLayout<T>.size)
> 
>     var value = T()
>     withUnsafeMutableBytes(of: &value) { valuePtr in
>         valuePtr.copyBytes(from: UnsafeRawBufferPointer(start: pointer.baseAddress!.advanced(by: index),
>                                                         count: MemoryLayout<T>.size))
>     }
>     return value
> }
> 
> should work (untested). Also you might need to handle endianness.
> 
> Regarding types that are not integers, what types are you thinking of? For normal Swift types the layout isn't guaranteed so you can't just read the bytes from somewhere. For C types where the layout is known you can just use the above code and either relax the constraint or specialise it to the very type you need. The only requirement of the type (besides that its layout is defined) is that it has an empty constructor.
> 
> This isn’t what I’m trying to do atm but does this mean it’s not possible to save a memory dump of a Swift struct to a file and then read it back in from the file to reconstitute it? Also the empty constructor requirement is problematic as explained before, especially when the type isn’t super simple like an Int.


