[swift-evolution] multi-line string literals.
Chris Lattner
clattner at apple.com
Tue Apr 26 01:04:31 CDT 2016
On Apr 25, 2016, at 5:22 PM, Brent Royal-Gordon <brent at architechies.com> wrote:
>>> 3. It might be useful to make multiline `"` strings trim trailing whitespace and comments like Perl's `/x` regex modifier does.
>>
>> If you have modifier characters already, it is easy to build a small zoo full of these useful beasts.
>
> Modifiers are definitely a workable alternative, and can be quite flexible, particularly if a future macro system can let you create new modifiers.
Right. I consider modifiers to be highly precedented in other languages, and therefore proven to work. If we go this way, I greatly prefer prefix to postfix modifiers.
>>> * Alternative delimiters: If a string literal starts with three, or five, or seven, or etc. quotes, that is the delimiter, and fewer quotes than that in a row are simply literal quote marks. Four, six, etc. quotes is a quote mark abutting the end of the literal.
>>>
>>> let xml: String = """<?xml version="1.0"?>
>>> """<catalog>
>>> """\t<book id="bk101" empty="">
>>> """\t\t<author>\(author)</author>
>>> """\t</book>
>>> """</catalog>"""
>>>
>>> You can't use this syntax to express an empty string, or a string consisting entirely of quote marks, but `""` handles empty strings adequately, and escaping can help with quote marks. (An alternative would be to remove the abutting rule and permit `""""""` to mean "empty string", but abutting quotes seem more useful than long-delimiter empty strings.)
>>
>> I agree that there is a need to support alternative delimiters, but subjectively, I find this to be pretty ugly. It is also a really unfortunate degenerate case for “I just want a large blob of XML” because you’d end up using “"” almost all the time, and you have to use it on every line.
>
> On the other hand, the `"""` does form a much larger, more obvious continuation indicator. It is *extremely* obvious that the above line is not Swift code, but something else embedded in it. It's also extremely obvious what its extent is: when you stop seeing `"""`, you're back to normal Swift code.
Right, but it is also heavyweight and ugly. In your previous email you said about the single quote approach: “The quotation marks on the left end up forming a column that marks the lines as special”, so I don’t see a need for a triple quote syntax to solve this specific problem.
> I *really* don't like the idea of our only alternatives being "one double-quote mark with backslashing" or "use an entire heredoc". Heredocs have their place, but they are a *very* heavyweight quoting mechanism, and relatively short strings with many double-quotes are pretty common. (Consider, for instance, strings containing unparsed JSON.) I think we need *some* alternative to double-quotes, either single-quotes (with the same semantics, just as an alternative) or this kind of quote-stacking.
I agree that this is a real problem that would be great to solve.
If I step back and look at the string literal space we’re discussing, I feel like there are three options:
1) single-line and simple multiline strings, using "
2) your triple quote sort of string, specifically tuned to avoid having to escape " when it occurs once or twice in sequence.
3) heredoc, which is a very general (but also very heavyweight) solution to quoting problems.
I’m trying to eliminate the middle one, so we only have to have “two things”. Here are some alternative ways to solve the problem, which might have less of an impact on the language:
A) Introduce single quoted string literals to avoid double quote problems specifically, e.g.: 'look "here" I say!'. This is another form of #2 which is less ugly. It also doesn’t help you if you have both " and ' in your string.
B) Introduce a modifier character that requires a more complex closing sequence to close off the string, see C++ raw string literals for prior art on this approach. Perhaps something like:
Rxxx"look " here " I can use quotes "xxx
That said, I still prefer C) “ignore this issue for now”. In other words, I wouldn’t want to block progress on improving the string literal situation overall on this issue, because anything we do here is a further extension to a proposal that doesn’t solve this problem.
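For reference, the closing rule in option B can be sketched in a few lines of Swift. This is only an illustration under stated assumptions: the `R` prefix and the `xxx` delimiter come from the example above, while the function name `scanRawLiteral` and its handling of malformed input are hypothetical and not a proposed design.

```swift
// Hypothetical helper, not part of any proposal: scans a literal of the form
// R<delimiter>"<body>"<delimiter> and returns the body, or nil if malformed.
func scanRawLiteral(_ source: String) -> String? {
    // The literal must start with the (assumed) modifier character `R`.
    guard source.hasPrefix("R") else { return nil }
    let afterModifier = source.dropFirst()

    // Everything up to the first double quote is the user-chosen delimiter.
    guard let openQuote = afterModifier.firstIndex(of: "\"") else { return nil }
    let delimiter = String(afterModifier[..<openQuote])

    // The body ends at the first `"` that is immediately followed by the delimiter.
    let rest = afterModifier[afterModifier.index(after: openQuote)...]
    let closing = "\"" + delimiter
    var index = rest.startIndex
    while index < rest.endIndex {
        if rest[index...].hasPrefix(closing) {
            return String(rest[..<index])
        }
        index = rest.index(after: index)
    }
    return nil  // No closing sequence was found.
}

let body = scanRawLiteral("Rxxx\"look \" here \" I can use quotes \"xxx")
print(body ?? "malformed literal")
```

The point of the sketch is that a bare " inside the body never ends the literal; only " followed by the chosen delimiter does, which is why no escaping is needed.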
>
>> For cases like this, I think it would be reasonable to have a “heredoc” like scheme, which does not allow leading indentation, and does work with all the same modifier characters above. I do not have a preference on a particular syntax, and haven’t given it any thought, but this would allow you to do things like:
>>
>> let str = <<EOF
>> <?xml version="1.0"?>
>> <catalog>
>> \t<book id="bk101" empty="">
>> \t\t<author>\(author)</author>
>> \t</book>
>> </catalog>
>> EOF
>>
>> for example. You could then turn off escaping and other knobs using the modifier character (somehow, it would have to be incorporated into the syntax of course).
>
> There are two questions and a suggestion I have whenever heredoc syntax comes up.
>
> Q1: Does the heredoc begin immediately, at the next line, or at the next valid place for a statement to start? Heredocs traditionally take the second approach.
>
> Q2: Do you permit heredocs to stack—that is, for a single line to specify multiple heredocs?
>
> S: During the Perl 6 redesign, they decided to use the delimiter's indentation to determine the base indentation for the heredoc:
>
>     func x() -> String {
>         return <<EOF
>         <?xml version="1.0"?>
>         <catalog>
>         \t<book id="bk101" empty="">
>         \t\t<author>\(author)</author>
>         \t</book>
>         </catalog>
>         EOF
>     }
>
> Does that seem like a good approach?
I think that either approach could work. You have a lot more experience with these topics than I do, and I would expect a vigorous community debate about them. :-)
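To make the delimiter-indentation rule quoted above concrete, here is a minimal Swift sketch of how the base indentation could be stripped. It is an illustration under assumptions, not a proposed design: the helper name `stripHeredocIndentation`, the representation of the heredoc as an array of lines ending with the delimiter line, and the choice to leave under-indented lines untouched are all hypothetical.

```swift
// Hypothetical helper: removes the closing delimiter's leading whitespace
// from every body line of a heredoc. `lines` holds the body lines followed by
// the delimiter line. Returns nil if the last line is not the delimiter.
func stripHeredocIndentation(_ lines: [String], delimiter: String = "EOF") -> String? {
    guard let last = lines.last else { return nil }

    // The delimiter line's leading whitespace defines the base indentation.
    let indentation = String(last.prefix(while: { $0 == " " || $0 == "\t" }))
    guard String(last.dropFirst(indentation.count)) == delimiter else { return nil }

    let body = lines.dropLast().map { line -> String in
        // Assumption: lines that lack the base indentation are kept as-is.
        // A real design would have to pick a rule (or an error) for them.
        line.hasPrefix(indentation) ? String(line.dropFirst(indentation.count)) : line
    }
    return body.joined(separator: "\n")
}

let heredoc = [
    "        <catalog>",
    "        \t<book id=\"bk101\" empty=\"\">",
    "        </catalog>",
    "        EOF",
]
print(stripHeredocIndentation(heredoc) ?? "missing delimiter")
```

Only whitespace matching the delimiter's own indentation is removed, so any deeper indentation in the body survives. Escape processing and interpolation are deliberately left out of the sketch.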
That said, if you look at what we’re discussing:
1. “Continuation” string literals, to allow a multi-line string literal. You and I appear to completely agree about this.
2. Heredoc: You and I seem to agree that they are a good “fully general” solution to have, but there are the details you outline above to iron out.
3. Modifier characters: I’m in favor, but I don’t know where you stand. There is also still much to iron out here (such as the specific characters).
4. A way to avoid having to escape " in a non-heredoc literal. I’m still unconvinced, and think that any solution to this problem will be orthogonal to the problems solved by 1-3 (and therefore can be added after getting experience with the other parts).
If you agree that these are all orthogonal pieces, then treat them as such: I’d suggest that you provide a proposal that just tackles the continuation string literals. This seems simple, and possible to get in for Swift 3. After that, we can discuss heredoc and modifiers (if you think they’re a good solution) on their own threads. If those turn out to be uncontroversial, then perhaps they can get in too.
On the heredoc aspects specifically, unless others chime in with strong opinions about the topics you brought up, I’d suggest that you craft a proposal for adding them with your preferred solution to these. You can mention the other answers (along with their tradeoffs and rationale for why you picked whatever you think is right) in the proposal, and we can help the community hash it out.
What do you think?
-Chris