[swift-evolution] [Review] SE-0168: Multi-Line String Literals

Wed Apr 12 03:11:11 CDT 2017

Great explanation, thank you Brent. I’m convinced about the closing delimiter now. =)

If I understood correctly what Xiaodi Wu meant in his reply, then we could simplify the whole multi-line string literal and also remove the need to disable the stripping algorithm.

We should ban these examples completely:

"""Hello·world!"""

"""Hello↵
world!"""

"""Hello↵
world!↵
"""

"""↵
Hello↵
world!"""

Instead, an empty multi-line string literal would look like this:

"""↵
"""

To fix the above examples, you’d need to write them like this:

"""↵
Hello·world!\↵
"""

"""↵
Hello↵
world!\↵
"""

• Each line between the delimiters adds an implicit newline unless a backslash disables it.
• Trailing-whitespace precision is also handled by the backslash.
• The indent is handled by the closing delimiter.
• It’s easier to learn and teach.
• It’s easier to read, because most of the time the line containing the starting delimiter is already filled with other code:

let myString = """↵
⇥  ⇥  Hello↵
⇥  ⇥  world!\↵
⇥  ⇥  """

Now that would be a true multi-line string literal, which needs at least two lines of code. If you need a single-line literal, "" is the obvious pick.
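
The rules listed above can be sketched as a small function. This is only an illustration of the suggestion in this message, not the proposal's algorithm or the prototype's code: `evaluateLiteral`, `body` and `closingIndent` are hypothetical names, and escape processing other than a trailing backslash is ignored.

```swift
// Hypothetical helper, assumed for illustration only.
// `body` holds the raw source lines between the two delimiters;
// `closingIndent` is the whitespace in front of the closing delimiter.
func evaluateLiteral(body: [String], closingIndent: String) -> String {
    var result = ""
    for line in body {
        // The closing delimiter decides how much indentation is stripped.
        var content = line.hasPrefix(closingIndent)
            ? String(line.dropFirst(closingIndent.count))
            : line
        if content.hasSuffix("\\") {
            // A trailing backslash disables the implicit newline.
            content.removeLast()
            result += content
        } else {
            // Every other line contributes an implicit newline.
            result += content + "\n"
        }
    }
    return result
}

// The tab-indented example from above: two content lines, the last one escaped.
let hello = evaluateLiteral(body: ["\t\tHello", "\t\tworld!\\"], closingIndent: "\t\t")
// The empty literal: no lines between the delimiters.
let empty = evaluateLiteral(body: [], closingIndent: "")
```

Under these assumptions `hello` is "Hello", a newline, then "world!", and `empty` is the empty string.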

-- 
Adrian Zubarev

On 12 April 2017 at 02:32:33, Brent Royal-Gordon (brent at architechies.com) wrote:

On Apr 11, 2017, at 8:08 AM, Adrian Zubarev via swift-evolution <swift-evolution at swift.org> wrote:

That’s also the example that kept me thinking for a while.

Overall the proposal is a great compromise to some issues I had with the first version. However I have a few more questions:

Why can’t we make it consistent and let the compiler add a newline after the starting delimiter?
 let string = """↵
    Swift↵
    """

// result
↵Swift↵

If one wants the behavior from the proposal, it’s really easy to add a backslash after the starting delimiter.

 let string = """\↵
    Swift\↵
    """

// result
Swift
This would be consistent and less confusing to learn.

That would mean that code like this:

print("""
A whole bunch of 
multiline text
""")
print("""
A whole bunch more 
multiline text
""")

Will print (with - to indicate blank lines):

-
A whole bunch of
multiline text
-
-
A whole bunch more
multiline text
-

This is, to a first approximation, never what you actually want the computer to do.
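
To make the effect concrete, here is a rough sketch that models the suggested rule (a newline after the starting delimiter plus one after every content line) with an ordinary function and captures what the two print calls would emit. `literalWithLeadingNewline` is a hypothetical helper, not anything from the proposal or the prototype.

```swift
// Hypothetical model of the rule under discussion: one newline right after
// the starting delimiter, and one implicit newline after each content line.
func literalWithLeadingNewline(_ contentLines: [String]) -> String {
    return "\n" + contentLines.map { $0 + "\n" }.joined()
}

// Capture the output in a string instead of writing to standard output.
// print(_:to:) still appends its usual terminating newline.
var output = ""
print(literalWithLeadingNewline(["A whole bunch of", "multiline text"]), to: &output)
print(literalWithLeadingNewline(["A whole bunch more", "multiline text"]), to: &output)
```

In this model `output` starts with a blank line, has two blank lines between the two blocks of text, and ends with another blank line, which matches the listing above.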

Can’t we make the indent algorithm work like this instead?
let string = """\↵
····<tag>↵
······content text↵
····</tag>""" // Indent starts with the first non space character

// result

<tag>↵
··content text↵
</tag>

The line containing the closing delimiter trims all space characters, and the indent for the whole multi-line string starts at the first non-space character in that line.

We could; I discuss that briefly in the very last section, on alternatives to the indentation stripping we specify:

• Stripping indentation to match the depth of the least indented line: Instead of removing indentation to match the end delimiter, you remove indentation to match the least indented line of the string itself. The issue here is that, if all lines in a string should be indented, you can't use indentation stripping. Ruby 2.3 does this with its heredocs, and Python's dedent function also implements this behavior.

That doesn't quite capture the entire breadth of the problem with this algorithm, though. What you'd like to do is say, "all of these lines are indented four columns, so we should remove four columns of indentation from each line". But you don't have columns; you have tabs and spaces, and they're incomparable because the compiler can't know what tab stops you set. So we'd end up calculating a common prefix of whitespace for all lines and removing that. But that means, when someone mixes tabs and spaces accidentally, you end up stripping an amount of indentation that is unrelated to anything visible in your code. We could perhaps emit a warning in some suspicious circumstances (like "every line has whitespace just past the end of indentation, but some use tabs and others use spaces"), but if we do, we can't know which one is supposed to be correct. With the proposed design, we know what's correct—the last line—and any deviation from it can be flagged *at the particular line which doesn't match our expectation*.

Even without the tabs and spaces issue, consider the case where you accidentally don't indent a line far enough. With your algorithm, that's indistinguishable from wanting the other lines to be indented more than that one, so we generate a result you don't want and we don't (can't!) emit a warning to point out the mistake. With the proposed algorithm, we can notice there's an error and point to the line at fault.

Having the closing delimiter always be on its own line and using it to decide how much whitespace to strip is better because it gives the compiler a firm baseline to work from. That means it can tell you what's wrong and where, instead of doing the dumb computer thing and computing a result that's technically correct but useless.
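
The difference between the two strategies can be sketched as follows. Both functions are hypothetical illustrations written for this comparison, not the prototype's code; they ignore blank lines and escape processing, and `IndentationMismatch` is an assumed error type.

```swift
// Alternative discussed above: strip the longest whitespace prefix
// that every line has in common.
func stripCommonPrefix(_ lines: [String]) -> [String] {
    func leadingWhitespace(_ line: String) -> String {
        return String(line.prefix(while: { $0 == " " || $0 == "\t" }))
    }
    guard var common = lines.first.map(leadingWhitespace) else { return [] }
    for line in lines.dropFirst() {
        var shared = ""
        for (a, b) in zip(common, leadingWhitespace(line)) {
            if a != b { break }
            shared.append(a)
        }
        common = shared
    }
    return lines.map { String($0.dropFirst(common.count)) }
}

// Assumed error type carrying the 1-based number of the offending line.
struct IndentationMismatch: Error, Equatable {
    let line: Int
}

// Proposed design: the closing delimiter's indentation is the baseline,
// so a line that does not start with it can be reported precisely.
func stripToDelimiter(_ lines: [String], closingIndent: String) throws -> [String] {
    var result: [String] = []
    for (index, line) in lines.enumerated() {
        guard line.hasPrefix(closingIndent) else {
            throw IndentationMismatch(line: index + 1)
        }
        result.append(String(line.dropFirst(closingIndent.count)))
    }
    return result
}

// The second line is accidentally indented too little.
let lines = ["    <tag>", "  content text", "    </tag>"]

// The common-prefix strategy silently produces a result nobody asked for.
let guessed = stripCommonPrefix(lines)

// The delimiter strategy can name the line at fault.
var reported: Int? = nil
do {
    _ = try stripToDelimiter(lines, closingIndent: "    ")
} catch let mismatch as IndentationMismatch {
    reported = mismatch.line
} catch {
    // No other error is thrown in this sketch.
}
```

Here `guessed` keeps two spaces in front of both tag lines without any diagnostic, while `reported` ends up as 2. With mixed tabs and spaces the common prefix shrinks to whatever the lines happen to share, which is the "unrelated to anything visible" case described above.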

PS: If we get this feature in Swift, it would be nice if Xcode and other IDEs that support Swift could show space characters inside a string literal (not other space characters, which is already supported), so it would be easier to tell what’s part of the string and what is not.

That would be very nice indeed. The prototype's tokenizer simply concatenates together and computes the string literal's contents after whitespace stripping, but in principle, I think it could probably preserve enough information to tell SourceKit where the indentation ends and the literal content begins. (The prototype is John's department, though, not mine.) Xcode would then have to do something with that information, though, and swift-evolution can't make the Xcode team do so. But I'd love to see a faint reddish background behind tripled string literal content or a vertical line at the indentation boundary.

In the meantime, this design *does* provide an unambiguous indicator of how much whitespace will be trimmed: however much is to the left of the closing delimiter. You just have to imagine the line extending upwards from there. I think that's an important thing to have.

-- 
Brent Royal-Gordon
Architechies
