[swift-users] Getting at the bytes of Data

Ray Fix rayfix at gmail.com
Fri Dec 9 13:16:52 CST 2016


Keeping the closure open until the method ends and returning the result was eye opening. Thanks again!

I thought about making the destination buffer a Data as well but I kind of feel like it is better the way it is as it keeps down the number of scopes.  It kind of starts feeling like a pyramid of doom.

The example you sent back doesn’t actually work for a subtle reason.  I think what is happening is compression_stream_init resets all of the pointers and sizes back to zero.  It feels strange to init, call compression_stream_init and then re-assign which is why I did things the way I did.

Since Swift doesn’t require break statements, I usually omit them. Watch out when I get back into the C/C++ code!

Good call on adding fatalError to default. I wish compression_status was a finite enum so I could get a compiler error instead.

The error information coming back from compression_stream module is rather lacking is which is why I am just returning nil instead of throwing. Also, that, and this is for a tutorial so I am trying to keep the number of concepts to a minimum (within reason anyway).

Anyway, I updated the gist here:

https://gist.github.com/rayfix/a286bb55accffef09249ba3535993782 <https://gist.github.com/rayfix/a286bb55accffef09249ba3535993782>

With great thanks,
Ray


> On Dec 9, 2016, at 10:00 AM, Philippe Hausler <phausler at apple.com> wrote:
> 
> What you have is probably ok in common usage; but there is a chance that it would not properly do a copy on write change as you expect if the Data was somehow obtained from a  multi-threaded environment. Here is a slight refactor (with a few extra fixes and comments for you that might save you some time later)
> 
> import Foundation
> import Compression
> 
> public enum CompressionAlgorithm {
>     case lz4   // speed is critical
>     case lz4a  // space is critical
>     case zlib  // reasonable speed and space
>     case lzfse // better speed and space
> }
> 
> private enum CompressionOperation {
>     case compression
>     case decompression
> }
> 
> private func perform(_ operation: CompressionOperation,
>                      on input: Data,
>                      using algorithm: CompressionAlgorithm,
>                      workingBufferSize: Int = 2000) -> Data?  {
>     
>     
>     // set the algorithm
>     let streamAlgorithm: compression_algorithm
>     switch algorithm {
>     case .lz4:   streamAlgorithm = COMPRESSION_LZ4
>     case .lz4a:  streamAlgorithm = COMPRESSION_LZMA
>     case .zlib:  streamAlgorithm = COMPRESSION_ZLIB
>     case .lzfse: streamAlgorithm = COMPRESSION_LZFSE
>     }
>     
>     // set the stream operation, and flags
>     let streamOperation: compression_stream_operation
>     let flags: Int32
>     switch operation {
>     case .compression:
>         streamOperation = COMPRESSION_STREAM_ENCODE
>         flags = Int32(COMPRESSION_STREAM_FINALIZE.rawValue)
>     case .decompression:
>         streamOperation = COMPRESSION_STREAM_DECODE
>         flags = 0
>     }
> 
>     let dstSize = workingBufferSize
>     let dstPointer = UnsafeMutablePointer<UInt8>.allocate(capacity: dstSize) // you could technically use a Data for this too by accessing the mutable bytes ;)
>     defer {
>         dstPointer.deallocate(capacity: dstSize)
>     }
>     
>     let inputCount = input.count
>     return input.withUnsafeBytes { (bytes: UnsafePointer<UInt8>) in // the closure here can return a Data so the entire compression operation can be encapsulated in the lambda
>         var output = Data()
>         var stream = compression_stream(dst_ptr: dstPointer, dst_size: dstSize, src_ptr: bytes, src_size: inputCount, state: nil)
>         var status = compression_stream_init(&stream, streamOperation, streamAlgorithm)
>         guard status != COMPRESSION_STATUS_ERROR else {
>             return nil // perhaps instead of making this nullable a throw?
>         }
>         defer {
>             compression_stream_destroy(&stream)
>         }
>         
>         
>         while status == COMPRESSION_STATUS_OK {
>             // process the stream
>             status = compression_stream_process(&stream, flags)
>             
>             // collect bytes from the stream and reset
>             switch status {
>             case COMPRESSION_STATUS_OK:
>                 output.append(dstPointer, count: dstSize)
>                 stream.dst_ptr = dstPointer
>                 stream.dst_size = dstSize
>                 break // I am certain you probably meant to break here?
>             case COMPRESSION_STATUS_ERROR:
>                 return nil // perhaps instead of making this nullable a throw?
>             case COMPRESSION_STATUS_END:
>                 output.append(dstPointer, count: stream.dst_ptr - dstPointer)
>                 break // here too, this should break right?
>             default:
>                 break // this probably should be a fatal error since it is an unexpected result.
>             }
>         }
>         return output
>     }
> }
> 
> // Compressed keeps the compressed data and the algorithm
> // together as one unit, so you never forget how the data was
> // compressed.
> public struct Compressed {
>     public let data: Data // this was what was saving your state of the pointer
>     public let algorithm: CompressionAlgorithm
>     
>     public init(data: Data, algorithm: CompressionAlgorithm) {
>         self.data = data
>         self.algorithm = algorithm
>     }
>     
>     // Compress the input with the specified algorith. Returns nil if it fails.
>     public static func compress(input: Data,
>                                 with algorithm: CompressionAlgorithm) -> Compressed? {
>         guard let data = perform(.compression, on: input, using: algorithm) else {
>             return nil
>         }
>         return Compressed(data: data, algorithm: algorithm)
>     }
>     
>     // Factory method to return uncompressed data. Returns nil if it cannot be decompressed.
>     public func makeDecompressed() -> Data? {
>         return perform(.decompression, on: data, using: algorithm)
>     }
> }
> 
> // For discoverability, add a compressed method to Data
> extension Data {
>     // Factory method to make compressed data or nil if it fails.
>     public func makeCompressed(with algorithm: CompressionAlgorithm) -> Compressed? {
>         return Compressed.compress(input: self, with: algorithm)
>     }
> }
> 
> // Example usage:
> let input = Data(bytes: Array(repeating: UInt8(123), count: 10000))
> 
> let compressed = input.makeCompressed(with: .lzfse)
> compressed?.data.count // in most cases much less than orginal input count
> let restoredInput = compressed?.makeDecompressed()
> input == restoredInput // true
> 
> 
>> On Dec 9, 2016, at 9:04 AM, Ray Fix <rayfix at gmail.com <mailto:rayfix at gmail.com>> wrote:
>> 
>> Hi Philippe,
>> 
>> Thank you for your response!
>> 
>>> On Dec 9, 2016, at 8:44 AM, Philippe Hausler <phausler at apple.com <mailto:phausler at apple.com>> wrote:
>>> 
>>> So letting the bytes escape from the closure could potentially be in that territory of “bad things may happen”. If you know for certain that the Data will not be mutated and it’s underlying storage will last for the duration of the usage then you should be able to get away with it.
>>> 
>> 
>> So far it is looking like I am getting away with it. I believe the conditions that you state hold.
>> 
>>> The bytes pointer is an inner pointer return from the encapsulating object backing the struct Data. Since that backing object can potentially be deallocated via mutation or the buffer may be reallocated (which very often can cause the base address of the pointer to change) we decided it would be safer to use a closure to ensure the lifespan of the bytes over the access to the bytes.
>>> 
>>> Perhaps it might be best to understand more of what you are trying to do to offer a better solution.
>> 
>> I am trying to use the compression_stream C API in the Compression framework and wrap it so it is a bit more Swifty.  :]
>> 
>> A full gist of the playground is here: https://gist.github.com/rayfix/a286bb55accffef09249ba3535993782 <https://gist.github.com/rayfix/a286bb55accffef09249ba3535993782>
>> 
>> See line 91.
>> 
>> Appreciate the hints. 
>> 
>> Ray
> 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.swift.org/pipermail/swift-users/attachments/20161209/fc207e0b/attachment.html>


More information about the swift-users mailing list