[swift-dev] Testing fails in GYBUnicodeDataUtils.py

Ryan Lovelett swift-dev at ryan.lovelett.me
Mon Jan 4 16:21:00 CST 2016


What are LC_ALL, LC_CTYPE, and LANG set to in your environment? On my
system LC_CTYPE=en_US.UTF-8 and LANG=en_US.UTF-8. My understanding of
Python on Linux is that it reads these environment variables to decide
what `sys.getfilesystemencoding()` returns. That encoding exists so
Python decodes filenames and such the same way the OS presents them. If
the variables are unset or set to C/POSIX, I'd expect Python to fall
back to ASCII, which would match the 'ascii' codec error in your
traceback.

https://docs.python.org/2/library/sys.html#sys.getfilesystemencoding
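
If it helps, here is a rough sketch for checking this on your machine.
The two helper names (`report_encoding_environment` and `can_decode`)
are made up for illustration and are not part of GYB; `can_decode`
mirrors the `codecs.open(..., errors='strict')` call from
GYBUnicodeDataUtils.py, so you can point it at GraphemeBreakTest.txt and
compare encodings. I'm assuming nothing beyond the standard library.

```python
import codecs
import locale
import os
import sys


def report_encoding_environment():
    """Collect the locale variables and the encodings Python reports."""
    report = {}
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        # None means the variable is not set at all.
        report[name] = os.environ.get(name)
    report["filesystemencoding"] = sys.getfilesystemencoding()
    report["preferredencoding"] = locale.getpreferredencoding()
    return report


def can_decode(path, encoding):
    """Return True if every line of `path` decodes with `encoding`."""
    try:
        # Same style of call as in GYBUnicodeDataUtils.py.
        with codecs.open(path, encoding=encoding, errors="strict") as f:
            for _ in f:
                pass
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    for key, value in sorted(report_encoding_environment().items()):
        print("%s = %r" % (key, value))
```

If `filesystemencoding` comes back as something ASCII-like, then
`can_decode(path, sys.getfilesystemencoding())` should fail on the
Unicode data files while `can_decode(path, 'utf-8')` succeeds.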

On Mon, Jan 4, 2016, at 05:12 PM, Ryan Lovelett via swift-dev wrote:
> On Mon, Jan 4, 2016, at 03:40 PM, Tom Gall via swift-dev wrote:
> > Building with: ./swift/utils/build-script -R -t --foundation
> > 
> > on Linux (gentoo amd64) fails with
> > 
> > + /usr/bin/cmake --build
> > /home/tgall/swift/build/Ninja-ReleaseAssert/swift-linux-x86_64 -- -j4
> > SwiftUnitTests
> > 
> > [6/29] Generating UnicodeGraphemeBreakTest.cpp from
> > UnicodeGraphemeBreakTest.cpp.gyb with ptr size = 8
> > 
> > FAILED: cd /home/tgall/swift/swift/unittests/Basic && /usr/bin/cmake
> > -E make_directory
> > /home/tgall/swift/build/Ninja-ReleaseAssert/swift-linux-x86_64/unittests/Basic/8
> > && /home/tgall/swift/swift/utils/gyb --test
> > -DunicodeGraphemeBreakPropertyFile=/home/tgall/swift/swift/utils/UnicodeData/GraphemeBreakProperty.txt
> > -DunicodeGraphemeBreakTestFile=/home/tgall/swift/swift/utils/UnicodeData/GraphemeBreakTest.txt
> > -DCMAKE_SIZEOF_VOID_P=8 -o
> > /home/tgall/swift/build/Ninja-ReleaseAssert/swift-linux-x86_64/unittests/Basic/8/UnicodeGraphemeBreakTest.cpp.tmp
> > UnicodeGraphemeBreakTest.cpp.gyb && /usr/bin/cmake -E
> > copy_if_different
> > /home/tgall/swift/build/Ninja-ReleaseAssert/swift-linux-x86_64/unittests/Basic/8/UnicodeGraphemeBreakTest.cpp.tmp
> > /home/tgall/swift/build/Ninja-ReleaseAssert/swift-linux-x86_64/unittests/Basic/8/UnicodeGraphemeBreakTest.cpp
> > && /usr/bin/cmake -E remove
> > /home/tgall/swift/build/Ninja-ReleaseAssert/swift-linux-x86_64/unittests/Basic/8/UnicodeGraphemeBreakTest.cpp.tmp
> > 
> > Traceback (most recent call last):
> > 
> >   File "/home/tgall/swift/swift/utils/gyb", line 3, in <module>
> >     gyb.main()
> >   File "/home/tgall/swift/swift/utils/gyb.py", line 1071, in main
> >     args.target.write(executeTemplate(ast, args.line_directive,
> >     **bindings))
> >   File "/home/tgall/swift/swift/utils/gyb.py", line 974, in
> >   executeTemplate
> >     ast.execute(executionContext)
> >   File "/home/tgall/swift/swift/utils/gyb.py", line 591, in execute
> >     x.execute(context)
> >   File "/home/tgall/swift/swift/utils/gyb.py", line 667, in execute
> >     result = eval(self.code, context.localBindings)
> >   File
> >   "/home/tgall/swift/swift/unittests/Basic/UnicodeGraphemeBreakTest.cpp.gyb",
> > line 23, in <module>
> >     get_grapheme_cluster_break_tests_as_UTF8(unicodeGraphemeBreakTestFile)
> >   File "/home/tgall/swift/swift/utils/GYBUnicodeDataUtils.py", line
> > 553, in get_grapheme_cluster_break_tests_as_UTF8
> >     for line in f:
> >   File "/usr/lib64/python2.7/codecs.py", line 687, in next
> >     return self.reader.next()
> >   File "/usr/lib64/python2.7/codecs.py", line 618, in next
> >     line = self.readline()
> >   File "/usr/lib64/python2.7/codecs.py", line 533, in readline
> >     data = self.read(readsize, firstline=True)
> >   File "/usr/lib64/python2.7/codecs.py", line 480, in read
> >     newchars, decodedbytes = self.decode(data, self.errors)
> > UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position
> > 0: ordinal not in range(128)
> > [6/29] Building CXX object
> > unittests/Parse/CMakeFiles/SwiftParseTests.dir/LexerTests.cpp.o
> > ninja: build stopped: subcommand failed.
> > 
> > Ah yes ... the joys of python stack dumps...  anyway, tracing this a bit:
> > 
> > in swift/utils/GYBUnicodeDataUtils.py there is:
> > 
> > with codecs.open(grapheme_break_test_file_name,
> > encoding=sys.getfilesystemencoding(), errors='strict') as f:
> > 
> 
> I wrote that code and patch (see:
> https://github.com/apple/swift/commit/7dbb4127f55022bca7b191d448652b5decf8626e).
> The change was in service of adding Python 3 support to GYB. So first of
> all let me say: I'm sorry. 😏
> 
> Open up your Python interpreter and check what encoding it reports for
> the filesystem (i.e., `sys.getfilesystemencoding()`). On OS X and my
> copy of Arch Linux it reports `'utf-8'`, which is why I don't see the
> issue. Worst case, we can just force it to be
> `with codecs.open(grapheme_break_test_file_name, encoding='utf-8',
> errors='strict') as f:`, but I went with the filesystem encoding in the
> hope that it is always UTF-8.
> 
> > It appears to be our offending bit of Python code. Now my Unicode &
> > Python-fu isn't the strongest, but if I change what is passed as
> > encoding to encoding='utf-8', the Swift test cases seem to run quite
> > a bit better and end up reporting:
> > 
> > Testing Time: 65.82s
> >   Expected Passes    : 1748
> >   Expected Failures  : 83
> >   Unsupported Tests  : 585
> > -- check-swift-linux-x86_64 finished --
> > --- Finished tests for swift ---
> > 
> > Question is, is that little fix the 'right thing' (TM)? If so, I'm
> > happy to submit this as my first 'lame' patch.
> > 
> > Thanks
> > 
> > -- 
> > Regards,
> > Tom
> > 
> > "Where's the kaboom!? There was supposed to be an earth-shattering
> > kaboom!" Marvin Martian
> > Director, Linaro Mobile Group
> > Tech Lead, GPGPU
> > Linaro.org │ Open source software for ARM SoCs
> > irc: tgall_foo | skype : tom_gall
> > _______________________________________________
> > swift-dev mailing list
> > swift-dev at swift.org
> > https://lists.swift.org/mailman/listinfo/swift-dev
