[swift-dev] utf-8 issues in gyb and line-directive

Ryan Lovelett swift-dev at ryan.lovelett.me
Tue Jan 12 20:06:52 CST 2016


Attached is patch
`0001-gyb-Force-UTF-8-encoding-when-parsing-templates-on-L.patch` that
should fix this issue for both Python 2 and 3. I’ve tested it on OS X
and Arch as well.
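
For readers without the attachment, here is a minimal sketch of the general idea (decode input as UTF-8 explicitly instead of relying on the locale). It is not a copy of the patch and may differ from it. The helper names `read_utf8` and `read_template_arg` are hypothetical and exist only for this illustration. Under Python 3 the built-in `open()` decodes with `locale.getpreferredencoding(False)` when no encoding is given, which is why an unset locale can end up selecting the ASCII codec.

```python
import argparse
import io
import os
import tempfile


def read_utf8(path):
    """Hypothetical helper: read `path` as UTF-8 regardless of the locale."""
    # io.open accepts `encoding` on Python 2.6+ and on Python 3, unlike the
    # built-in open() in Python 2.
    with io.open(path, encoding='utf-8') as f:
        return f.read()


def read_template_arg(argv):
    """Hypothetical helper: parse one file argument and decode it as UTF-8."""
    parser = argparse.ArgumentParser()
    # 'rb' yields bytes, so argparse performs no locale-dependent decoding.
    parser.add_argument('file', type=argparse.FileType('rb'))
    args = parser.parse_args(argv)
    try:
        return args.file.name, args.file.read().decode('utf-8')
    finally:
        args.file.close()


# Write a small sample file containing a non-ASCII character (U+2264).
fd, sample_path = tempfile.mkstemp(suffix='.gyb')
with os.fdopen(fd, 'wb') as f:
    f.write(u'// count \u2264 capacity\n'.encode('utf-8'))

from_open = read_utf8(sample_path)
name, from_argparse = read_template_arg([sample_path])
os.remove(sample_path)
```

Both helpers return text (`unicode` on Python 2, `str` on Python 3), so downstream code sees the same type no matter what the environment's locale is.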

I’ve submitted the patch as a pull request [1].

[1] https://github.com/apple/swift/pull/950

On Tue, Jan 12, 2016, at 06:26 PM, Ryan Lovelett via swift-dev wrote:
> Ok I was able to reproduce this. Basically it comes down to you not
> having your locale set.
> 
> I've almost got a patch that works for Python 2 and 3.
> 
> On Tue, Jan 12, 2016, at 03:04 PM, Lukas Stabe wrote:
> > This is a separate issue. Here are the backtraces for the two issues:
> > 
> > This is the issue in gyb (note that it surfaces here because the file is
> > opened lazily; the real issue is a few lines earlier, where the
> > argparse.FileType is instantiated):
> > 
> >    FAILED: cd /build/swiftc/src/swift/stdlib/public/core &&
> >    /usr/bin/cmake -E make_directory
> >    /build/swiftc/src/build/buildbot_linux/swift-linux-x86_64/stdlib/public/core/8
> >    && /build/swiftc/src/swift/utils/gyb --test
> >    -DunicodeGraphemeBreakPropertyFile=/build/swiftc/src/swift/utils/UnicodeData/GraphemeBreakProperty.txt
> >    -DunicodeGraphemeBreakTestFile=/build/swiftc/src/swift/utils/UnicodeData/GraphemeBreakTest.txt
> >    -DCMAKE_SIZEOF_VOID_P=8 -o
> >    /build/swiftc/src/build/buildbot_linux/swift-linux-x86_64/stdlib/public/core/8/Arrays.swift.tmp
> >    Arrays.swift.gyb && /usr/bin/cmake -E copy_if_different
> >    /build/swiftc/src/build/buildbot_linux/swift-linux-x86_64/stdlib/public/core/8/Arrays.swift.tmp
> >    /build/swiftc/src/build/buildbot_linux/swift-linux-x86_64/stdlib/public/core/8/Arrays.swift
> >    && /usr/bin/cmake -E remove
> >    /build/swiftc/src/build/buildbot_linux/swift-linux-x86_64/stdlib/public/core/8/Arrays.swift.tmp
> >    Traceback (most recent call last):
> >      File "/build/swiftc/src/swift/utils/gyb", line 3, in <module>
> >        gyb.main()
> >      File "/build/swiftc/src/swift/utils/gyb.py", line 1064, in main
> >        ast = parseTemplate(args.file.name, args.file.read())
> >      File "/usr/lib/python3.5/encodings/ascii.py", line 26, in decode
> >        return codecs.ascii_decode(input, self.errors)[0]
> >    UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position
> >    13116: ordinal not in range(128)
> > 
> > After fixing this (in a way incompatible with OSX Python), I get this
> > error in line-directive:
> > 
> >    FAILED: cd
> >    /build/swiftc/src/build/buildbot_linux/swift-linux-x86_64/stdlib/public/core
> >    && /usr/bin/cmake -E make_directory
> >    /build/swiftc/src/build/buildbot_linux/swift-linux-x86_64/stdlib/public/core/linux/x86_64
> >    && /usr/bin/cmake -E make_directory
> >    /build/swiftc/src/build/buildbot_linux/swift-linux-x86_64/./lib/swift/linux/x86_64
> >    && /build/swiftc/src/swift/utils/line-directive
> >    /build/swiftc/src/swift/stdlib/public/core/Algorithm.swift [… more
> >    files omitted for brevity …]
> >    /build/swiftc/src/swift/stdlib/public/core/VarArgs.swift
> >    /build/swiftc/src/swift/stdlib/public/core/Zip.swift
> >    /build/swiftc/src/swift/stdlib/public/core/Prespecialized.swift
> >    Traceback (most recent call last):
> >      File "/build/swiftc/src/swift/utils/line-directive", line 104, in
> >      <module>
> >        run()
> >      File "/build/swiftc/src/swift/utils/line-directive", line 92, in run
> >        file, line_num = map_line(m.group(1), int(m.group(2)))
> >      File "/build/swiftc/src/swift/utils/line-directive", line 60, in
> >      map_line
> >        map = fline_map(filename)
> >      File "/build/swiftc/src/swift/utils/line-directive", line 54, in
> >      fline_map
> >        map = _make_line_map(filename)
> >      File "/build/swiftc/src/swift/utils/line-directive", line 43, in
> >      _make_line_map
> >        for i, l in enumerate(input.readlines()):
> >      File "/usr/lib/python3.5/encodings/ascii.py", line 26, in decode
> >        return codecs.ascii_decode(input, self.errors)[0]
> >    UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position
> >    6851: ordinal not in range(128)
> > 
> > — Lukas
> > 
> > > On 12 Jan 2016, at 20:31, Ryan Lovelett via swift-dev <swift-dev at swift.org> wrote:
> > > 
> > > I thought this was resolved in master already. Are you running the
> > > latest version of master?
> > > 
> > > See
> > > https://github.com/apple/swift/commit/97b98b193d1490685a0aaba78cccb76e2caba4c1
> > > 
> > > On Tue, Jan 12, 2016, at 01:12 PM, Lukas Stabe via swift-dev wrote:
> > >> I’ve tried to compile Swift in a clean Arch Linux chroot. In this
> > >> environment, `sys.getdefaultencoding()` does not return `utf-8`, which
> > >> causes gyb and line-directive to fail with non-ascii characters in input
> > >> files.
> > >> 
> > >> One example of such a file is Arrays.swift.gyb, which contains a
> > >> ≤-character
> > >> [here](https://github.com/apple/swift/blob/master/stdlib/public/core/Arrays.swift.gyb#L358).
> > >> 
> > >> In gyb, `argparse.FileType` is used to open input files, and
> > >> line-directive uses plain old `open(…)`. Sadly, both of those don’t
> > >> provide functionality to specify the file encoding in Python 2.7.10,
> > >> which is what OSX ships with.
> > >> 
> > > >> I’m not that experienced in Python and don’t know the Pythonic way to
> > > >> solve this, so if anyone with more Python experience could help, that
> > > >> would be great.
> > >> 
> > >> — Lukas
> > >> _______________________________________________
> > >> swift-dev mailing list
> > >> swift-dev at swift.org
> > >> https://lists.swift.org/mailman/listinfo/swift-dev
> > >> Email had 1 attachment:
> > >> + signature.asc
> > >>  1k (application/pgp-signature)
> > 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-gyb-Force-UTF-8-encoding-when-parsing-templates-on-L.patch
Type: application/octet-stream
Size: 4758 bytes
Desc: not available
URL: <https://lists.swift.org/pipermail/swift-dev/attachments/20160112/c5a6bf98/attachment.obj>

