[swift-dev] Metadata Representation

Saleem Abdulrasool compnerd at compnerd.org
Mon Sep 25 23:35:08 CDT 2017

On Mon, Sep 25, 2017 at 11:47 AM, John McCall <rjmccall at apple.com> wrote:

> > On Sep 25, 2017, at 12:24 PM, Joe Groff <jgroff at apple.com> wrote:
> >> On Sep 24, 2017, at 10:30 PM, John McCall <rjmccall at apple.com> wrote:
> >>> On Sep 22, 2017, at 8:39 PM, Saleem Abdulrasool <compnerd at compnerd.org>
> wrote:
> >>>
> >>> On Thu, Sep 21, 2017 at 10:28 PM, John McCall <rjmccall at apple.com>
> wrote:
> >>>
> >>>> On Sep 21, 2017, at 10:10 PM, Saleem Abdulrasool <
> compnerd at compnerd.org> wrote:
> >>>>
> >>>> On Thu, Sep 21, 2017 at 5:18 PM, John McCall <rjmccall at apple.com>
> wrote:
> >>>>> On Sep 21, 2017, at 1:26 PM, Saleem Abdulrasool via swift-dev <
> swift-dev at swift.org> wrote:
> >>>>> On Thu, Sep 21, 2017 at 12:04 PM, Joe Groff <jgroff at apple.com>
> wrote:
> >>>>>
> >>>>>
> >>>>>> On Sep 21, 2017, at 11:49 AM, Saleem Abdulrasool <
> compnerd at compnerd.org> wrote:
> >>>>>>
> >>>>>> On Thu, Sep 21, 2017 at 10:53 AM, Joe Groff <jgroff at apple.com>
> wrote:
> >>>>>>
> >>>>>>
> >>>>>>> On Sep 21, 2017, at 9:32 AM, Saleem Abdulrasool via swift-dev <
> swift-dev at swift.org> wrote:
> >>>>>>>
> >>>>>>> Hello,
> >>>>>>>
> >>>>>>> The current layout for the swift metadata for structure types, as
> emitted, seems to be unrepresentable in PE/COFF (at least for x86_64).
> There is a partial listing of the generated code following the message for
> reference.
> >>>>>>>
> >>>>>>> When building the standard library, LLVM encounters a relocation
> which cannot be represented.  Tracking down the relocation led to the type
> metadata for SwiftNSOperatingSystemVersion.  The metadata here is
> _T0SC30_SwiftNSOperatingSystemVersionVN.  At +32-bytes we find the Kind (1).  So,
> this is a struct metadata type.  Thus at Offset 1 (+40 bytes) we have the
> nominal type descriptor reference.  This is the relocation which we fail to
> represent correctly.  If I'm not mistaken, it seems that the field is
> supposed to be a relative offset to the nominal type descriptor.  However,
> currently, the nominal type descriptor is emitted in a different section
> (.rodata) as opposed to the type metadata (.data).  This cross-section
> relocation cannot be represented in the file format.
> >>>>>>>
> >>>>>>> My understanding is that the type metadata will be adjusted during
> the load for the field offsets.  Furthermore, my guess is that the relative
> offset is used to encode the location to avoid a relocation for the load
> address base.  In the case of Windows, base relocations are a given,
> and I'm not sure if there is a better approach to be taken.  There are a
> couple of solutions which immediately spring to mind: one is to move the nominal
> type descriptor into the (RW) data segment, and the other is to adjust the
> ABI to use an absolute relocation which would be rebased.  Given that the
> type metadata may be adjusted, we cannot emit it into the RO data
> segment.  Is there another solution that I am overlooking which may be
> simpler or better?
> >>>>>>
> >>>>>> IIRC, this came up when someone was trying to port Swift to Windows
> on ARM as well, and they were able to conditionalize the code so that we
> used absolute pointers on Windows/ARM, and we may have to do the same on
> Windows in general. It may be somewhat more complicated on Win64 since we
> generally assume that relative references can be 32-bit, whereas an
> absolute reference will be 64-bit, so some formats may have to change
> layout to make this work too. I believe Windows' executable loader still
> ultimately maps the final PE image contiguously, so alternatively, you
> could conceivably build a Swift toolchain that used ELF or Mach-O or some
> other format with better support for PIC as the intermediate object format
> and still linked a final PE executable. Using relative references should
> still be a win on Windows both because of the size benefit of being 32-bit
> and the fact that they don't need to be slid when running under ASLR or
> when a DLL needs to be rebased.
> >>>>>>
> >>>>>>
> >>>>>> Yeah, I tracked down the relativePointer thing.  There is a nice
> subtle little warning that it is not fully portable :-).  Would you happen
> to have a pointer to where the adjustment for the absolute pointers on WoA
> is?
> >>>>>>
> >>>>>> You are correct that the image should be contiguously mapped on
> Windows.  The idea of MachO as an intermediary is rather intriguing.
> Thinking longer term, maybe we want to use that as a global solution?  It
> would also provide a nicer autolinking mechanism for ELF which is the one
> target which currently is missing this functionality.  However, if I'm not
> mistaken, this would require a MachO linker (and the only current viable
> MachO linker would be ld64).  The MachO binary would then need to be
> converted into ELF or COFF.  This seems like it could take a while to
> implement though, but would not really break ABI, so pushing that off to
> later may be wise.
> >>>>>
> >>>>> Intriguingly, LLVM does support `*-*-win32-macho` as a target triple
> already, though I don't know what Mach-O to PE linker (if any) that's
> intended to be used with. We implemented relative references using
> current-position-relative offsets for Darwin and Linux both because that
> still allows for a fairly convenient pointer-like C++ API for working with
> relative offsets, and because the established toolchains on those platforms
> already have to support PIC so had most of the relocations we needed to
> make them work already; is there another base we could use for relative
> offsets on Windows that would fit in the set of relocations supported by
> standard COFF linkers?
> >>>>>
> >>>>>
> >>>>> Yes, the `-windows-macho` target is used for UEFI :-).  The MachO
> binary is translated later to PE/COFF as required by the UEFI specification.
> >>>>>
> >>>>> There are only two relocation types which can be used for relative
> displacements: __ImageBase relative (IMAGE_REL_*_ADDR32NB) and section
> relative (IMAGE_REL_*_SECREL) which are relative to the beginning of the
> section.  The latter is why I mentioned that moving them into the same
> section could be a solution as that would allow the relative distance to be
> encoded.  Unfortunately, the section relative relocation is relative to the
> section within which the symbol is.
> >>>>
> >>>> What's wrong with IMAGE_REL_AMD64_REL32?  We'd have to adjust the
> relative-pointer logic to store an offset from the end of the relative
> pointer instead of the beginning, but it doesn't seem to have a section
> requirement.
> >>>>
> >>>> Hmm, is it possible to use RIP relative addressing in data?  If so,
> yes, that could work.
> >>>
> >>> There's no inherent reason, but I wouldn't put it past the linker to
> fall over and die.  But it should at least be section-agnostic about the
> target, since this is likely to be used for all sorts of PC-relative
> addressing.
> >>>
> >>>
> >>> At least MC doesn't seem to like it.  Something like this for example:
> >>>
> >>> ```
> >>>  .data
> >>> data:
> >>>  .long 0
> >>>
> >>>  .section .rodata
> >>> rodata:
> >>>  .quad data(%rip)
> >>> ```
> >>>
> >>> Bails out due to the unexpected modifier.  Now, theoretically, we
> could support that modifier, but it does seem pretty odd.
> >>>
> >>> Now, as it so happens, both PE and PE+ have limitations on the file
> size at 4GiB.  This means that the relative
> difference is guaranteed to fit within 32 bits. This is where things get
> really interesting!
> >>>
> >>> We cannot generate the relocation because we are emitting the values
> at pointer width.  However, the value that we are emitting is a relative
> offset, which we just determined to be limited to 32-bits in width.  The
> thing is, the IMAGE_REL_AMD64_REL32 doesn't actually seem to care about the
> cross-sectionness as you pointed out.  So, rather than emitting a
> pointer-width value (`.quad`), we could emit a pad (`.long 0`) and follow
> that with the relative displacement (`.long <expr>`).  This would be
> representable in the PE/COFF model.
> >>>
> >>> If I understand the layout correctly, the type metadata fields are
> supposed to be pointer sized.  I assume that we would like to maintain that
> across the formats.  It may be possible to alter the emission to change the
> relative pointer emission to emit a pair of longs instead for PE/COFF with
> a 64-bit pointer value.  Basically, we cannot truncate the relocation to a
> IMAGE_REL_AMD64_REL32 but we could generate the appropriate relocation and
> pad to the desired width.
> >>>
> >>> Are there any pitfalls that I should be aware of trying to adjust the
> emission to do this?  The only downside that I can see is that the
> emission would need to be target dependent (that is, check the output object
> format and the target pointer width).
> >>>
> >>> Thanks for the hint John!  It seems that was spot on :-).
> >>
> >> Honestly, I don't know that there's a great reason for this pointer to
> be relative in the first place.  The struct metadata will already have an
> absolute pointer to the value witness table which requires load-time
> relocation, so maybe we should just make this an absolute pointer, too,
> unless we're seriously considering making that a relative pointer before
> allocation.
> >>
> >> In practice this will just be a rebase, not a full relocation, so it
> should be relatively cheap.
> >
> > At one point we discussed the possibility of also making the value
> witness table pointer relative, which would allow concrete value type
> metadata to be fully read-only, and since code invoking a value witness is
> almost certainly going to have the base type metadata pointer live,
> probably not an undue burden on code size.
>
> Yes, that's true.  It would make the base of the load (metadata +
> loaded-offset + immediate-offset), which I think would require an extra
> instruction even on x86, but maybe that's not so bad.
> On the other hand, yes, it would not be possible to refer to prebuilt
> vwtables from the runtime, and it would need to be a 64-bit relative offset
> in order to handle dynamic instantiation correctly, which as you say is
> problematic on some platforms.

Hmm, I'm not sure I understand the desired approach.  Would we want to
switch to a rebased pointer?  Would this be for all of the metadata or just
the struct type?  Are there no other instances of the same pattern?
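
As a rough illustration of the two options being weighed in this thread, here
is a minimal C++ sketch.  All of the names (`PaddedRelativeSlot`,
`encodeRelative`, `decodeRelative`, `rebase`) are hypothetical and are not the
runtime's actual API.  It assumes the displacement is measured from the end of
the 32-bit field, as suggested above for IMAGE_REL_AMD64_REL32, and that the
slot is padded to pointer width with a leading zero word (`.long 0` followed
by `.long <expr>`).

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical names throughout; none of this is the Swift runtime's API.

// Pointer-width slot: 32 bits of padding followed by a 32-bit displacement,
// mirroring `.long 0` followed by `.long <expr>`.
struct PaddedRelativeSlot {
  std::int32_t pad;
  std::int32_t disp;
};
static_assert(sizeof(PaddedRelativeSlot) == 8, "slot must be 64 bits wide");

// Address that the displacement is measured from: the end of the 32-bit
// field (an assumption of this sketch, matching a REL32-style fixup).
inline std::intptr_t relativeBase(const PaddedRelativeSlot *slot) {
  return reinterpret_cast<std::intptr_t>(&slot->disp) +
         static_cast<std::intptr_t>(sizeof(slot->disp));
}

// Store `target` as a 32-bit displacement; returns false if it does not fit.
inline bool encodeRelative(PaddedRelativeSlot *slot, const void *target) {
  std::intptr_t delta =
      reinterpret_cast<std::intptr_t>(target) - relativeBase(slot);
  if (delta < std::numeric_limits<std::int32_t>::min() ||
      delta > std::numeric_limits<std::int32_t>::max())
    return false;
  slot->pad = 0;
  slot->disp = static_cast<std::int32_t>(delta);
  return true;
}

// Recover the target address; no load-time fixup is needed for this form.
inline const void *decodeRelative(const PaddedRelativeSlot *slot) {
  assert(slot->pad == 0 && "padding word is expected to stay zero");
  return reinterpret_cast<const void *>(relativeBase(slot) + slot->disp);
}

// The alternative: an absolute pointer that the loader rebases by the
// difference between the actual and the preferred image base.
inline std::uint64_t rebase(std::uint64_t linkedAddress,
                            std::uint64_t preferredBase,
                            std::uint64_t actualBase) {
  return linkedAddress - preferredBase + actualBase;
}
```

The relative form keeps working if the whole image is mapped at a different
address, since only the distance between the slot and its target matters,
whereas the absolute form relies on the loader applying the rebase.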

> John.
> > It's a fair question though whether we'll ever get around to that
> analysis, and I think the nominal type descriptor reference is the only
> place we statically emit a pointer-sized rather than 32-bit relative
> offset, which has caused problems for ports to other platforms that only
> support 32-bit relative offsets.
> >
> > -Joe

Saleem Abdulrasool
compnerd (at) compnerd (dot) org