[swift-dev] utf-8 issues in gyb and line-directive

Ted Kremenek kremenek at apple.com
Tue Jan 12 23:56:36 CST 2016


Merged!

> On Jan 12, 2016, at 6:06 PM, Ryan Lovelett via swift-dev <swift-dev at swift.org> wrote:
> 
> Attached is a patch,
> `0001-gyb-Force-UTF-8-encoding-when-parsing-templates-on-L.patch`, that
> should fix this issue for both Python 2 and 3. I’ve tested it on OS X
> and Arch as well.
> 
> I’ve submitted the patch as a pull request [1].
> 
> [1] https://github.com/apple/swift/pull/950
> 
>> On Tue, Jan 12, 2016, at 06:26 PM, Ryan Lovelett via swift-dev wrote:
>> OK, I was able to reproduce this. Basically, it comes down to you not
>> having your locale set.
>> 
>> I've almost got a patch that works for Python 2 and 3.
>> 
>>> On Tue, Jan 12, 2016, at 03:04 PM, Lukas Stabe wrote:
>>> This is a separate issue. Here are the backtraces for the two issues:
>>> 
>>> This is the issue in gyb (note that it occurs here because the file is
>>> opened lazily; the real issue is a few lines before, where the
>>> argparse.FileType is instantiated):
>>> 
>>>   FAILED: cd /build/swiftc/src/swift/stdlib/public/core &&
>>>   /usr/bin/cmake -E make_directory
>>>   /build/swiftc/src/build/buildbot_linux/swift-linux-x86_64/stdlib/public/core/8
>>>   && /build/swiftc/src/swift/utils/gyb --test
>>>   -DunicodeGraphemeBreakPropertyFile=/build/swiftc/src/swift/utils/UnicodeData/GraphemeBreakProperty.txt
>>>   -DunicodeGraphemeBreakTestFile=/build/swiftc/src/swift/utils/UnicodeData/GraphemeBreakTest.txt
>>>   -DCMAKE_SIZEOF_VOID_P=8 -o
>>>   /build/swiftc/src/build/buildbot_linux/swift-linux-x86_64/stdlib/public/core/8/Arrays.swift.tmp
>>>   Arrays.swift.gyb && /usr/bin/cmake -E copy_if_different
>>>   /build/swiftc/src/build/buildbot_linux/swift-linux-x86_64/stdlib/public/core/8/Arrays.swift.tmp
>>>   /build/swiftc/src/build/buildbot_linux/swift-linux-x86_64/stdlib/public/core/8/Arrays.swift
>>>   && /usr/bin/cmake -E remove
>>>   /build/swiftc/src/build/buildbot_linux/swift-linux-x86_64/stdlib/public/core/8/Arrays.swift.tmp
>>>   Traceback (most recent call last):
>>>     File "/build/swiftc/src/swift/utils/gyb", line 3, in <module>
>>>       gyb.main()
>>>     File "/build/swiftc/src/swift/utils/gyb.py", line 1064, in main
>>>       ast = parseTemplate(args.file.name, args.file.read())
>>>     File "/usr/lib/python3.5/encodings/ascii.py", line 26, in decode
>>>       return codecs.ascii_decode(input, self.errors)[0]
>>>   UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position
>>>   13116: ordinal not in range(128)
>>> 
>>> After fixing this (in a way incompatible with the OS X Python), I get this
>>> error in line-directive:
>>> 
>>>   FAILED: cd
>>>   /build/swiftc/src/build/buildbot_linux/swift-linux-x86_64/stdlib/public/core
>>>   && /usr/bin/cmake -E make_directory
>>>   /build/swiftc/src/build/buildbot_linux/swift-linux-x86_64/stdlib/public/core/linux/x86_64
>>>   && /usr/bin/cmake -E make_directory
>>>   /build/swiftc/src/build/buildbot_linux/swift-linux-x86_64/./lib/swift/linux/x86_64
>>>   && /build/swiftc/src/swift/utils/line-directive
>>>   /build/swiftc/src/swift/stdlib/public/core/Algorithm.swift [… more
>>>   files omitted for brevity …]
>>>   /build/swiftc/src/swift/stdlib/public/core/VarArgs.swift
>>>   /build/swiftc/src/swift/stdlib/public/core/Zip.swift
>>>   /build/swiftc/src/swift/stdlib/public/core/Prespecialized.swift
>>>   Traceback (most recent call last):
>>>     File "/build/swiftc/src/swift/utils/line-directive", line 104, in
>>>     <module>
>>>       run()
>>>     File "/build/swiftc/src/swift/utils/line-directive", line 92, in run
>>>       file, line_num = map_line(m.group(1), int(m.group(2)))
>>>     File "/build/swiftc/src/swift/utils/line-directive", line 60, in
>>>     map_line
>>>       map = fline_map(filename)
>>>     File "/build/swiftc/src/swift/utils/line-directive", line 54, in
>>>     fline_map
>>>       map = _make_line_map(filename)
>>>     File "/build/swiftc/src/swift/utils/line-directive", line 43, in
>>>     _make_line_map
>>>       for i, l in enumerate(input.readlines()):
>>>     File "/usr/lib/python3.5/encodings/ascii.py", line 26, in decode
>>>       return codecs.ascii_decode(input, self.errors)[0]
>>>   UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position
>>>   6851: ordinal not in range(128)
>>> 
>>> — Lukas
>>> 
>>>> On 12 Jan 2016, at 20:31, Ryan Lovelett via swift-dev <swift-dev at swift.org> wrote:
>>>> 
>>>> I thought this was resolved in master already. Are you running the
>>>> latest version of master?
>>>> 
>>>> See
>>>> https://github.com/apple/swift/commit/97b98b193d1490685a0aaba78cccb76e2caba4c1
>>>> 
>>>>> On Tue, Jan 12, 2016, at 01:12 PM, Lukas Stabe via swift-dev wrote:
>>>>> I’ve tried to compile Swift in a clean Arch Linux chroot. In this
>>>>> environment, `sys.getdefaultencoding()` does not return `utf-8`, which
>>>>> causes gyb and line-directive to fail on non-ASCII characters in input
>>>>> files.
>>>>> 
>>>>> One example of such a file is Arrays.swift.gyb, which contains a
>>>>> ≤-character
>>>>> [here](https://github.com/apple/swift/blob/master/stdlib/public/core/Arrays.swift.gyb#L358).
>>>>> 
>>>>> In gyb, `argparse.FileType` is used to open input files, and
>>>>> line-directive uses plain old `open(…)`. Sadly, neither of those
>>>>> provides a way to specify the file encoding in Python 2.7.10,
>>>>> which is what OS X ships with.
>>>>> 
>>>>> I’m not that experienced in Python and don’t know what the Pythonic way
>>>>> to solve this would be, so if anyone with more Python experience could
>>>>> help solve this, that would be great.
>>>>> 
>>>>> — Lukas
>>>>> _______________________________________________
>>>>> swift-dev mailing list
>>>>> swift-dev at swift.org
>>>>> https://lists.swift.org/mailman/listinfo/swift-dev
> <0001-gyb-Force-UTF-8-encoding-when-parsing-templates-on-L.patch>
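
For readers following the thread, here is a minimal sketch of one way to read input files as UTF-8 regardless of the locale, on both Python 2.7 and Python 3. It is not the code from the merged patch; the helper names `parse_args`, `read_utf8`, and `read_utf8_lines` are hypothetical. It assumes the script takes a plain path argument instead of `argparse.FileType`, and relies on `io.open`, which accepts an `encoding` argument on both Python versions.

```python
import argparse
import io


def parse_args(argv):
    """Parse a single positional path argument (hypothetical CLI shape)."""
    parser = argparse.ArgumentParser()
    # Take the path as a string rather than argparse.FileType, so the file
    # can be opened later with an explicit encoding.
    parser.add_argument('file', help='path to the input file')
    return parser.parse_args(argv)


def read_utf8(path):
    """Return the whole file decoded as UTF-8, independent of the locale."""
    # io.open takes `encoding` on Python 2.6+ and Python 3; the Python 2
    # built-in open() does not.
    with io.open(path, encoding='utf-8') as f:
        return f.read()


def read_utf8_lines(path):
    """Return the file's lines decoded as UTF-8, independent of the locale."""
    with io.open(path, encoding='utf-8') as f:
        return f.readlines()
```

With helpers like these, a script could call `read_utf8(parse_args(sys.argv[1:]).file)` and get a Unicode string even when no locale is set, which is the situation that produced the `'ascii' codec can't decode byte 0xe2` errors quoted above.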

