[swift-evolution] multi-line string literals.

Tyler Cloutier cloutiertyler at aol.com
Fri May 6 14:09:05 CDT 2016


> On May 5, 2016, at 10:52 PM, Brent Royal-Gordon <brent at architechies.com> wrote:
> 
>> As far as mixed whitespace, I think the only sane thing to do would be to only allow leading tabs *or* spaces.  Mixing tabs and spaces in the leading whitespace would be a syntax error.  All lines in the string would need to use tabs or all lines use spaces, you could not have one line with tabs and another with spaces.  This would keep the compiler out of the business of making any assumptions or guesses, would not be a problem often, and would be very easy to fix if it ever happens accidentally.
> 
> The sane thing to do would be to require every line be prefixed with *exactly* the same sequence of characters as the closing delimiter line. Anything else (except perhaps a completely blank line, to permit whitespace trimming) would be a syntax error.
> 

Yes, this I think would be the way to do it.
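
To make that rule concrete, here is a rough sketch of how I read it. The names `stripMargin` and `MarginError` are made up for illustration and are not part of any proposal; I am also assuming that a "completely blank line" means an empty line, and that `margin` is whatever whitespace precedes the closing delimiter.

```swift
// Hypothetical error type, for illustration only.
enum MarginError: Error, Equatable {
    case prefixMismatch(line: Int)
}

/// Strips `margin` (assumed to be the exact leading characters of the
/// closing-delimiter line) from every content line.
/// Empty lines are passed through unchanged; any other line that does not
/// start with `margin` is reported as an error (1-based line number).
func stripMargin(_ lines: [String], margin: String) throws -> [String] {
    var result: [String] = []
    for (index, line) in lines.enumerated() {
        if line.isEmpty {
            // A blank line is allowed so that editors can trim trailing whitespace.
            result.append("")
        } else if line.hasPrefix(margin) {
            result.append(String(line.dropFirst(margin.count)))
        } else {
            // Covers both too-short indentation and tabs mixed with spaces.
            throw MarginError.prefixMismatch(line: index + 1)
        }
    }
    return result
}
```

With a four-space margin, a line indented by eight spaces keeps four of them, and a line indented with a tab is rejected instead of being guessed at.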

> But take a moment to consider the downsides before you leap to adopt this solution.
> 
> 1. You have introduced tab-space confusion into the equation.

Agreed, and that’s never fun since they are invisible. Confusing for people new to programming, I imagine.

> 
> 2. You have introduced trailing-newline confusion into the equation.

Yes, you are absolutely right, and I missed this one in my response. I assume you are referring to whether or not there is a newline after </catalog>, and if so, how you get rid of it without messing up the whitespace trimming.

Is this not also a problem for heredocs (if you want to use the closing delimiter to mark how much whitespace to trim)?
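
If I understand the ambiguity correctly, it comes down to two readings of the same literal, which could be sketched as follows. Both function names are hypothetical, and the input is assumed to be the content lines after margin stripping.

```swift
/// Reading A: the newline before the closing delimiter belongs to the string,
/// so every content line, including the last, ends with "\n".
func joinKeepingFinalNewline(_ lines: [String]) -> String {
    return lines.map { $0 + "\n" }.joined()
}

/// Reading B: the newline before the closing delimiter is part of the syntax,
/// so newlines appear only between content lines.
func joinDroppingFinalNewline(_ lines: [String]) -> String {
    return lines.joined(separator: "\n")
}
```

Whichever reading is chosen, the other one needs some way to be expressed, which is where the confusion comes from.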

> 
> 3. The #escaped and #marginStripped keywords are now specific to multiline strings; #escaped in particular will be attractive there for tasks like regexes. You will have to invent a different syntax for it there.

These were just straw-man proposals; I don’t think that is what they should or would be. I was just throwing the general idea out there.

> 
> 4. This form of `"""` is not useful for not having to escape `"` in a single-line string; you now have to invent a separate mechanism for that.

True, unless you don’t mind taking up 3 lines to do it.

> 
> 5. You can't necessarily look at a line and tell whether it's code or string. And—especially with the #escaped-style constructs—the delimiters don't necessarily "pop" visually; they're too small and easy to miss compared to the text they contain. In extremis, you actually have to look at the entire file from top to bottom, counting the `"""`s to figure out whether you're in a string or not. Granted, you *usually* can tell from context, but it's a far cry from what continuation quotes offer.

To be fair, syntax highlighting also helps with this, but it’s quite possible you are looking at the code in a context where it is not available.

I don’t see how the #compilerDirective modifiers make the delimiters any less visible, though.

The same could be said for heredoc delimiters, I think, although that really depends on what the delimiters are.
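
For what it’s worth, the "counting from the top" exercise amounts to something like the sketch below. The function names are hypothetical, and I am assuming that every `"""` toggles the string state and that none of them is escaped or sits inside a comment.

```swift
/// Counts non-overlapping `"""` sequences in a single line.
func countTripleQuotes(in line: String) -> Int {
    var count = 0
    var run = 0  // length of the current run of consecutive quote characters
    for character in line {
        if character == "\"" {
            run += 1
            if run == 3 {
                count += 1
                run = 0
            }
        } else {
            run = 0
        }
    }
    return count
}

/// Returns true if the line at `index` (0-based) starts inside a `"""` literal.
/// Every earlier line has to be scanned, which is the cost being described.
func startsInsideTripleQuotedString(lines: [String], index: Int) -> Bool {
    var inside = false
    for line in lines.prefix(index) {
        // An odd number of delimiters on a line flips the state.
        if countTripleQuotes(in: line) % 2 == 1 {
            inside.toggle()
        }
    }
    return inside
}
```

A syntax highlighter does the same scan for you, but the answer for any given line still depends on everything above it.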

> 
> 6. You are now forcing *any* string literal of more than one line to include two extra lines devoted wholly to the quoting syntax. In my Swift-generating example, that would change shorter snippets like this:
> 
> code +=      "    
>              "    static var messages: [HTTPStatus: String] = [
>              ""
> 
> Into things like this:
> 
> code +=      """
>                  
>                  static var messages: [HTTPStatus: String] = [
>                             
>              """
> 
> To my mind, the second syntax is actually *heavier*, despite not requiring every line be marked, because it takes two extra lines and additional punctuation.

> 
> 7. You are also introducing visual ambiguity into the equation—in the above example, the left margin is now ambiguous to the eye (even if it's not ambiguous to the compiler). You could recover it by permitting non-whitespace prefix characters:
> 
> code +=      """
>             |    
>             |    static var messages: [HTTPStatus: String] = [
>             |
>             |"""
> 
> ...but then we're back to annotating every line, *plus* we have the leading and trailing `"""` lines. Worst of both worlds.
> 

This is a good point. It takes up 5 lines, and you quite possibly will still have to go about counting spaces. The more leading whitespace you have, the worse it gets.


> 8. In longer examples, you are dividing the expression in half in a way that makes it difficult to read. For instance, consider this code:
> 
>         socket.send( 
>             """ #escaped #marginStripped 
>             <?xml version="1.0"?>
>             <catalog>
>                <book id="bk101" empty="">
>                    <author>\(author)</author>
>                    <title>XML Developer's Guide</title>
>                    <genre>Computer</genre>
>                    <price>44.95</price>
>                    <publish_date>2000-10-01</publish_date>
>                    <description>An in-depth look at creating applications with XML.</description>
>                </book>
>             </catalog>
>             """.data(using: NSUTF8StringEncoding))
> 
> The effect—particularly with even larger literals than this—is not unlike pausing in the middle of reading an article to watch a movie. What were we talking about again?
> 
> This problem is neatly avoided by a heredoc syntax, which keeps the expression together and then collects the string below it:
> 
>         socket.send(""".data(using: NSUTF8StringEncoding))
>             <?xml version="1.0"?>
>             <catalog>
>                <book id="bk101" empty="">
>                    <author>\(author)</author>
>                    <title>XML Developer's Guide</title>
>                    <genre>Computer</genre>
>                    <price>44.95</price>
>                    <publish_date>2000-10-01</publish_date>
>                    <description>An in-depth look at creating applications with XML.</description>
>                </book>
>             </catalog>
>             """
> 
> (I'm assuming there's no need for #escaped or #marginStripped; they're both enabled by default.)

I don’t really see the argument about pausing in the middle of the code. Isn’t that the mental model that most people have for string literals? If anything, heredoc syntax would be more confusing.

Where would you put the modifiers then? I assume as modifying letters before the `"""`?

        socket.send(ei""".data(using: NSUTF8StringEncoding))

That would work and would also be consistent with single-line and continuation-quote strings if this feature were added there.

Heredocs look like they would be harder to parse than the alternative syntax, no? 
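
To illustrate what I mean, here is a very rough line-based sketch of the heredoc form quoted above. The function name is hypothetical, and I am assuming the body starts on the line after the expression and ends at a line containing only `"""` after optional indentation. A real lexer would have to do more than this.

```swift
/// Splits heredoc-style source into the expression line and the body lines.
/// Returns nil if the input is empty or the body is never terminated.
func splitHeredoc(_ source: [String]) -> (expression: String, body: [String])? {
    guard let expression = source.first else { return nil }
    var body: [String] = []
    for line in source.dropFirst() {
        // Ignore leading spaces and tabs when looking for the terminator.
        let trimmed = line.drop(while: { $0 == " " || $0 == "\t" })
        if trimmed == "\"\"\"" {
            return (expression, body)
        }
        body.append(line)
    }
    return nil  // reached the end of input without a closing delimiter
}
```

The point is that the value of the `"""` placeholder in the expression is not known until the lines below it have been consumed, whereas with the triple-quote form the string is complete by the time the rest of the expression is read.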

> 
> * * *
> 
> Let's actually talk about heredocs. Leaving aside indentation (which can be applied to either feature) and the traditional token choices (which can be changed), I think these are the pros of heredocs compared to Python triple-quotes:
> 
> H1: Doesn't break up expressions, as discussed above.
> H2: Literal content formatting is completely unaffected by code formatting, including the first and last lines.
> 
> Here are the pros of Python triple-quotes compared to heredocs:
> 
> P1: Simpler to explain: "like a string literal, but really big".
> P2: Lighter syntactic weight, enough to make `"""` usable as a single-line syntax.
> P3: Less trailing-newline confusion.
> 
> (There is one other difference: `"""` is simpler to parse, so we might be able to get it in Swift 3, whereas heredocs probably have to wait for Swift 4. But I don't think we should pick one feature over another merely so we can get it sooner. It's one thing if you plan to eventually introduce both features, as I plan to eventually have both continuation quotes and heredocs, to introduce each of them as soon as you can; it's another to actually choose one feature over another specifically to get something you can implement sooner.)
> 
> But the design you're discussing trades P2 and P3—and frankly, with the mandatory newlines, part of P1—away in an attempt to get H2. So we end up deciding between these two selling points:
> 
> * This triple-quotes design: Simpler to explain.
> * Heredocs: Doesn't break up expressions.
> 
> Simplicity is good, but I really like the code reading benefits of heredocs. Your code is your code and your text is your text. The interface between them is a bit funky, but within their separate worlds, they're both pretty nice.
> 

I would support having both. I think they have sufficiently different use cases and tradeoffs to warrant two solutions. It is also nice that, if implementing both were the way to go, continuation quotes could be added in Swift 3 and heredocs could come later if necessary.


> * * *
> 
> Either way, heredocs or multiline-only triple quotes could be tweaked to support indentation by using the indentation of the end delimiter. But as I explained above, I don't think that's a great idea for either triple quotes *or* heredocs—the edge of the indentation is not visually well defined enough.
> 
> That's why I came to the conclusion that trying to cram every multiline literal into one syntax is trying to cram too many peg shapes into one hole shape. Indentation should *only* be supported by a dedicated syntax which is also designed for the smallest multiline strings, where indentation support is most useful. A separate feature without indentation support should handle longer strings, where the length alone is so disruptive to the flow of your code that there's just no point even trying to indent them to match (and the break with normal indentation itself assists you in finding the end of the string).
> 
> And I think that the best choice for the first feature is continuation quotes, and for the second is heredocs. Triple-quote syntaxes—either Python's or this modification—are jacks of all trades, but masters of none.
> 
> -- 
> Brent Royal-Gordon
> Architechies
> 
