[swift-evolution] multi-line string literals.
brent at architechies.com
Wed Apr 27 03:52:54 CDT 2016
> If you agree that these are all orthogonal pieces, then treat them as such: I’d suggest that you provide a proposal that just tackles the continuation string literals. This seems simple, and possible to get in for Swift 3.
I've gone ahead and drafted this proposal, with some small extensions and adjustments. See the "Draft Notes" section for details of what I've changed and what concerns I have.
Multiline string literals
Proposal: SE-NNNN <https://github.com/apple/swift-evolution/blob/master/proposals/NNNN-name.md>
Author(s): Brent Royal-Gordon <https://github.com/brentdax>
Status: First Draft
Review manager: TBD
In Swift 2.2, the only means to insert a newline into a string literal is the \n escape. String literals specified in this way are generally ugly and unreadable. We propose a multiline string feature, inspired by English punctuation, that is a straightforward extension of our existing string literals.
Swift-evolution thread: multi-line string literals. <https://lists.swift.org/pipermail/swift-evolution/Week-of-Mon-20160418/015500.html>
Draft Notes
This draft differs from the prototypes being thrown around on the list in that it specifies that comments should be treated as whitespace, and that whitespace-only lines in the middle of a multiline string should be ignored. I'm not sure if this is feasible from a parsing standpoint, and I'd like feedback from implementers on this point.
This draft also specifies diagnostics which should be included. Feedback on whether these are good choices would be welcome.
I am considering allowing you to put a backslash before the newline to indicate it should not be included in the literal. In other words, this code:
print("foo\
      "bar")
Would print "foobar". However, I think this should probably be proposed separately, because there may be a better way to do it.
I've listed only myself as an author because I don't want to put anyone else's name to a document they haven't seen, but there are others who deserve to be listed (John Holdsworth at least). Let me know if you think you should be included.
As Swift begins to move into roles beyond app development, code which needs to generate text becomes a more important use case. Consider, for instance, generating even a small XML string:
let xml = "<?xml version=\"1.0\"?>\n<catalog>\n\t<book id=\"bk101\" empty=\"\">\n\t\t<author>\(author)</author>\n\t</book>\n</catalog>"
The string is practically unreadable, its structure drowned in escapes and run-together characters; it looks like little more than line noise. We can improve its readability somewhat by concatenating separate strings for each line and using real tabs instead of \t escapes:
let xml = "<?xml version=\"1.0\"?>\n" +
          "<catalog>\n" +
          "    <book id=\"bk101\" empty=\"\">\n" +
          "        <author>\(author)</author>\n" +
          "    </book>\n" +
          "</catalog>"
However, this creates a more complex expression for the type checker, and there's still far more punctuation than ought to be necessary. If the most important goal of Swift is making code readable, this kind of code falls far short of that goal.
We propose that, when Swift is parsing a string literal, if it reaches the end of the line without encountering an end quote, it should look at the next line. If it sees a quote mark there (a "continuation quote"), the string literal contains a newline and then continues on that line. Otherwise, the string literal is unterminated and syntactically invalid.
Our sample above could thus be written as:
let xml = "<?xml version=\"1.0\"?>
          "<catalog>
          "    <book id=\"bk101\" empty=\"\">
          "        <author>\(author)</author>
          "    </book>
          "</catalog>"
(Note that GitHub is applying incorrect syntax highlighting to this code sample, because it's applying Swift 2 rules.)
This format's unbalanced quotes might strike some programmers as strange, but it attempts to mimic the way multiple lines are quoted in English prose. As an English Stack Exchange answer illustrates <http://english.stackexchange.com/a/96613/64636>:
“That seems like an odd way to use punctuation,” Tom said. “What harm would there be in using quotation marks at the end of every paragraph?”
“Oh, that’s not all that complicated,” J.R. answered. “If you closed quotes at the end of every paragraph, then you would need to reidentify the speaker with every subsequent paragraph.
“Say a narrative was describing two or three people engaged in a lengthy conversation. If you closed the quotation marks in the previous paragraph, then a reader wouldn’t be able to easily tell if the previous speaker was extending his point, or if someone else in the room had picked up the conversation. By leaving the previous paragraph’s quote unclosed, the reader knows that the previous speaker is still the one talking.”
“Oh, that makes sense. Thanks!”
Similarly, omitting the ending quotation mark tells the code's reader (and compiler) that the literal continues on the next line, while including the continuation quote reminds the reader (and compiler) that this line is part of a string literal.
Benefits of continuation quotes
It would be simpler to not require continuation quotes, so why are they required by this proposal? There are three reasons:
1. They help the compiler pinpoint errors in string literal delimiting. If continuation quotes were not required, then a missing end quote would be interpreted as a multiline string literal. This string literal would continue until the compiler encountered either another quote mark (perhaps at the site of another string literal or in a comment) or the end of the file. In either case, the compiler could at best only indicate the start of the runaway string literal; in pathological cases (for instance, if the next string literal was "+"), it might not even be able to do that properly.
   With continuation quotes required, if you forget to include an end quote, the compiler can tell that you did not intend to create a multiline string and flag the line that actually has the problem. It can also provide immediately actionable fix-it assistance. Because every line redundantly signals the programmer's intent to include it in a multiline literal, the compiler can infer what the code was meant to say.
2. They separate indentation from the string's contents. Without continuation quotes, there would be no obvious indication of whether whitespace at the start of the line was intended to indent the string literal so it matched the surrounding code, or whether that whitespace was actually meant to be included in the resulting string. Multiline string literals would either have to put subsequent lines against the left margin, or apply error-prone heuristics to try to guess which whitespace was indentation and which was string literal content.
3. They improve the ability to quickly recognize the literal. The " on each line serves as an immediately obvious indication that the line is part of a string literal, not code, and the row of " characters in a well-formatted file allows you to quickly scan up and down the file to see the extent of the literal.
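To make the first reason concrete, here is a minimal sketch of how a scanner might behave if continuation quotes were not required. The function naiveLiterals(in:) is a hypothetical model written only for this illustration; it is not part of any real Swift parser, and it assumes there are no escapes or comments in the source.

```swift
/// Hypothetical model of the "no continuation quote" alternative:
/// a literal simply runs from one quote mark to the next, across newlines.
func naiveLiterals(in source: String) -> [String] {
    // After splitting on quote marks, the odd-numbered pieces are the
    // ones that sit "inside" a literal.
    let pieces = source.split(separator: "\"", omittingEmptySubsequences: false)
    return pieces.enumerated()
        .filter { $0.offset % 2 == 1 }
        .map { String($0.element) }
}

// The first line is missing its end quote.
let source = [
    "let greeting = \"hello",
    "let op = \"+\"",
    "print(greeting)",
].joined(separator: "\n")

for literal in naiveLiterals(in: source) {
    print("literal:", literal.debugDescription)
}
```

In this model the missing end quote swallows `let op = ` into the first literal, the real literal "+" is read as code, and a second "literal" runs to the end of the file, so the scanner has no good place to report the mistake.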
When Swift is parsing a string literal and reaches the end of a line without finding a closing quote, it examines the next line, applying the following rules:
1. If the next line is all whitespace, it is ignored; Swift moves on to the line afterward, applying these rules again.
2. If the next line begins with whitespace followed by a continuation quote, then the string literal contains a newline followed by the contents of the string literal starting on that line. (This line may itself have no closing quote, in which case the same rules apply to the line which follows.)
3. If the next line contains anything else, Swift raises a syntax error for an unterminated string literal. This syntax error should offer two fix-its: one to close the string literal at the end of the current line, and one to include the next line in the string literal by inserting a continuation quote.
Rules 1 and 2 should treat comments as though they are whitespace; this allows you to comment out individual lines in a multiline string literal. (However, commenting out the last line of the string literal will still make it unterminated, so you don't have a completely free hand in commenting.)
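The rules above can be sketched in ordinary Swift as a line-by-line scan. This is only an illustration under simplifying assumptions: scanLiteral and ScanResult are hypothetical names, the input is assumed to begin on the line holding the opening quote, and escapes, interpolation, and the comments-as-whitespace extension are all ignored.

```swift
/// Outcome of scanning one literal under the simplified rules.
enum ScanResult: Equatable {
    case literal(String)          // decoded contents, lines joined by "\n"
    case unterminated(line: Int)  // index of the line where rule 3 fired
}

/// Hypothetical helper, not part of any real parser. `lines` holds source
/// lines, the first of which starts with the opening quote.
func scanLiteral(_ lines: [String]) -> ScanResult {
    var pieces: [String] = []
    for (index, line) in lines.enumerated() {
        // Indentation before the quote is never part of the string.
        let stripped = line.drop(while: { $0 == " " || $0 == "\t" })
        if stripped.isEmpty { continue }            // Rule 1: skip blank lines
        guard stripped.first == "\"" else {
            return .unterminated(line: index)       // Rule 3: anything else
        }
        let body = stripped.dropFirst()
        if let close = body.firstIndex(of: "\"") {
            // A closing quote ends the literal on this line.
            pieces.append(String(body[..<close]))
            return .literal(pieces.joined(separator: "\n"))
        }
        pieces.append(String(body))                 // Rule 2: keep going
    }
    // Ran out of input without ever seeing a closing quote.
    return .unterminated(line: lines.count)
}

let result = scanLiteral([
    "\"<catalog>",
    "      \"    <book/>",
    "",
    "      \"</catalog>\"",
])
print(result == .literal("<catalog>\n    <book/>\n</catalog>"))  // prints "true"
```

Note how whitespace before each continuation quote is dropped while whitespace after it is kept, which is the separation of indentation from content described earlier.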
Impact on existing code
Failing to close a string literal before the end of the line is currently a syntax error, so no valid Swift code should be affected by this change.
Requiring no continuation character
The main alternative is to not require a continuation quote, and simply extend the string literal from the starting quote to the ending quote, including all newlines between them. For example:
let xml = "<?xml version=\"1.0\"?>
<catalog>
    <book id=\"bk101\" empty=\"\">
        <author>\(author)</author>
    </book>
</catalog>"
This has several advantages:
It is simpler.
It is less offensive to programmers' sensibilities (since there are no unmatched " characters).
It does not require that you edit the string literal to insert a continuation quote in each line.
Balanced against the advantages, however, is the loss of the improved diagnostics, code formatting, and visual affordances mentioned in the "Benefits of continuation quotes" section above.
In practice, we believe that editor support (such as "Paste as String Literal" or "Convert to String Literal" commands) can make adding continuation quotes less burdensome, while also providing other conveniences like automatic escaping. We believe the other two factors are outweighed by the benefits of continuation quotes.
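As a rough idea of what such an editor command might do, here is a sketch that turns raw text into a continuation-quoted literal. Both escape and continuationLiteral(from:indent:) are hypothetical helpers invented for this example; a real editor command would also need to handle interpolation and the surrounding code's actual indentation.

```swift
/// Backslash-escape the two characters that cannot appear raw in a literal.
func escape(_ line: Substring) -> String {
    var out = ""
    for ch in line {
        if ch == "\\" || ch == "\"" { out.append("\\") }
        out.append(ch)
    }
    return out
}

/// Hypothetical "Convert to String Literal" helper: every line gets a
/// leading quote, and only the last line gets a closing quote.
func continuationLiteral(from text: String, indent: String = "    ") -> String {
    let lines = text.split(separator: "\n", omittingEmptySubsequences: false)
        .map(escape)
    let quoted = lines.enumerated().map {
        // The first line follows `let x = `, so it is not indented.
        ($0.offset == 0 ? "" : indent) + "\"" + $0.element
    }
    return quoted.joined(separator: "\n") + "\""
}

print("let xml = " + continuationLiteral(
    from: "<book id=\"bk101\">\n    <author/>\n</book>",
    indent: "          "
))
```

Empty lines in the input come out as a lone continuation quote, so they survive as newlines in the literal and are not skipped as whitespace-only lines.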
Use a different delimiter for multiline strings
The initial suggestion was that multiline strings should use a different delimiter, """, at the beginning and end of the string, with no continuation characters between. This solution was rejected because it has the same issues as the "no continuation character" solution, and because it was mixing two orthogonal issues (multiline strings and alternate delimiters).
Another suggestion was to support a heredoc syntax, which would allow you to specify a placeholder string literal on one line whose content begins on the next line, running until some arbitrary delimiter. For instance, if Swift adopted Perl 5's syntax, it might support code like:
<book id="bk101" empty="">
In addition to the issues with the """ syntax, heredocs are complicated both to explain and to parse, and are not a natural extension of Swift's current string syntax.
Both of these suggestions address interesting issues with string literals, solving compelling use cases. They're just not that good at fixing the specific issue at hand. We might consider them in the future to address those problems to which they are better suited.
Fixing other string literal readability issues
This proposal is narrowly aimed at multiline strings. It intentionally doesn't tackle several other problems with string literals:
Reducing the amount of double-backslashing needed when working with regular expression libraries, Windows paths, source code generation, and other tasks where backslashes are part of the data.
Alternate delimiters or other strategies for writing strings with " characters in them.
String literals consisting of very long pieces of text which are best represented completely verbatim.
These are likely to be subjects of future proposals, though not necessarily during Swift 3.
This proposal also does not attempt to address regular expression literals. The members of the core team who are interested in regular expression support have ambitions for that feature which put it out of scope for Swift 3.