[swift-dev] statically initialized arrays

Erik Eckstein eeckstein at apple.com
Wed Jun 14 16:08:53 CDT 2017


> On Jun 14, 2017, at 12:03 PM, Jordan Rose <jordan_rose at apple.com> wrote:
> 
> 
> 
>> On Jun 14, 2017, at 11:24, Erik Eckstein via swift-dev <swift-dev at swift.org> wrote:
>> 
>> Hi,
>> 
>> I’m about to implement statically initialized arrays. The idea is to allocate storage for arrays in the data section rather than on the heap.
>> 
>> Info: the array storage is a heap object. So in the following I’m using the general term “object” but the optimization will (probably) only handle array buffers.
>> 
>> This optimization can be done for array literals containing only other literals as elements.
>> Example:
>> 
>> func createArray() -> [Int] {
>>   return [1, 2, 3]
>> }
>> 
>> The compiler can allocate the whole array buffer as a statically initialized global llvm-variable with a reference count of 2 to make it immortal.
>> It avoids heap allocations for array literals in cases where stack promotion can’t kick in. It also saves code size.
>> 
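To make the "only other literals as elements" condition concrete, here is a small sketch. The function names are made up for illustration, and whether a given compiler version actually emits a static buffer for the first function is an assumption; nothing in the code can observe it.

```swift
// Every element is a literal, so the buffer contents are fully known at
// compile time. This is the kind of array literal the proposed pass targets.
func literalElements() -> [Int] {
    return [1, 2, 3]
}

// One element comes from a parameter, so the contents are only known at
// run time. Under the rule above this literal would not qualify.
func runtimeElement(_ x: Int) -> [Int] {
    return [1, x, 3]
}
```

Both functions behave identically from the caller's point of view; the difference would only be in where the storage for the returned array comes from.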
>> What’s needed for this optimization?
>> 
>> 1) An optimization pass (GlobalOpt) which detects such array literal initialization patterns and “outlines” those into a statically initialized global variable
>> 2) A representation of statically initialized global variables in SIL
>> 3) IRGen to create statically initialized objects as global llvm-variables
>> 
>> ad 2) Changes in SIL:
>> 
>> Currently a statically initialized sil_global is represented by having a reference to a globalinit function which has to match a very specific pattern (e.g. must contain a single store to the global).
>> This is somewhat quirky and would get even more complicated for statically initialized objects.
>> 
>> I’d like to change that so that the sil_global itself contains the initialization value.
>> This part is not yet related to statically initialized objects. It just improves the representation of statically initialized globals in general.
>> 
>> @@ -1210,7 +1210,9 @@ Global Variables
>>  ::
>>  
>>    decl ::= sil-global-variable
>> +  static-initializer ::= '{' sil-instruction-def* '}'
>>    sil-global-variable ::= 'sil_global' sil-linkage identifier ':' sil-type
>> +                             (static-initializer)?
>>  
>>  SIL representation of a global variable.
>>  
>> @@ -1221,6 +1223,19 @@ SIL instructions. Prior to performing any access on the global, the
>>  Once a global's storage has been initialized, ``global_addr`` is used to
>>  project the value.
>>  
>> +A global can also have a static initializer if its initial value can be
>> +composed of literals. The static initializer is represented as a list of
>> +literal and aggregate instructions where the last instruction is the top-level
>> +value of the static initializer::
>> +
>> +  sil_global hidden @_T04test3varSiv : $Int {
>> +    %0 = integer_literal $Builtin.Int64, 27
>> +    %1 = struct $Int (%0 : $Builtin.Int64)
>> +  }
>> +
>> +In case a global has a static initializer, no ``alloc_global`` is needed before
>> +it can be accessed.
>> +
>> 

I just talked to MichaelG in person. He pointed out that the static initializer should not look like a function. Also the implicit convention that the last value is the actual top-level value is not obvious.
I think it makes sense to add some syntactic sugar to make this clearer (adding ‘=’ and ‘init_value’):

+  sil_global hidden @_T04test3varSiv : $Int = {
+    %0 = integer_literal $Builtin.Int64, 27
+    init_value = struct $Int (%0 : $Builtin.Int64)
+  }



>> Now to represent a statically initialized object, we need a new instruction. Note that this “instruction” can only appear in the initializer of a sil_global.
>> 
>> +object
>> +``````
>> +::
>> +
>> +  sil-instruction ::= 'object' sil-type '(' (sil-operand (',' sil-operand)*)? ')'
>> +
>> +  object $T (%a : $A, %b : $B, ...)
>> +  // $T must be a non-generic or bound generic reference type
>> +  // The first operands must match the stored properties of T
>> +  // Optionally there may be more elements, which are tail-allocated to T
>> +
>> +Constructs a statically initialized object. This instruction can only appear
>> +as the final instruction in a global variable static initializer list.
>> 
>> Finally we need an instruction to use such a statically initialized global object.
>> 
>> +global_object
>> +`````````````
>> +::
>> +
>> +  sil-instruction ::= 'global_object' sil-global-name ':' sil-type
>> +
>> +  %1 = global_object @v : $T
>> +  // @v must be a global variable with a statically initialized object
>> +  // $T must be a reference type
>> +
>> +Creates a reference to the address of a global variable which has a static
>> +initializer which is an object, i.e. the last instruction of the global's
>> +static initializer list is an ``object`` instruction.
>> 
>> 
>> ad 3) IRGen support
>> 
>> Generating statically initialized globals is already done today for structs and tuples.
>> What’s needed is the handling of objects.
>> In addition to creating the global itself, we also need a runtime call to initialize the object header. In other words: the object is statically initialized, except for the header.
>> 
>> HeapObject *swift::swift_initImmortalObject(HeapMetadata const *metadata, HeapObject *object)
>> 
>> There are 2 reasons for that: first, the object header format is not part of the ABI. And second, in case of a bound generic type (e.g. array buffers) the metadata is not statically available.
>> 
>> One way to call this runtime function is dynamically at the global_object instruction whenever the metadata pointer is still null (via swift_once).
>> Another possibility would be to call it in a global constructor.
>> 
>> If you have any feedback, please let me know
> 
> Please do not use a global constructor. :-) Globals are already set up to handle one-time initialization; the fact that that initialization is now cheaper is still a good thing.
> 
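For readers less familiar with the one-time initialization mentioned above, here is a minimal source-level sketch. Swift initializes static stored properties lazily and at most once. The names `Tables`, `initializerRuns` and `readTwice` are hypothetical, and the sketch only shows the language-level behavior; it does not show how the compiler or runtime would initialize the object header.

```swift
// Counts how often the initializer closure below runs (illustration only).
var initializerRuns = 0

enum Tables {
    // A static stored property is initialized lazily on first access,
    // and the initializer runs at most once.
    static let small: [Int] = {
        initializerRuns += 1
        return [1, 2, 3]
    }()
}

// Reads the property twice and reports how often the initializer ran.
func readTwice() -> Int {
    _ = Tables.small
    _ = Tables.small
    return initializerRuns
}
```

With a statically initialized buffer, the work done on that first access would shrink to setting up the header, which is the "cheaper initialization" referred to above.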
> To be clear, this sort of operation is only safe when the layout of the instance is statically known. The layout of an array buffer is especially brittle, since we use trailing storage, so this kind of operation really will be hardcoded in that case. I think that's fine.

I assume you are referring to the fact that the tail allocated array buffer layout was implemented in the stdlib originally. But this is not the case anymore. It’s already hard-coded in the compiler (we have a dedicated instruction for this).

> 
> Jordan
