[swift-evolution] multi-line string literals.

Peter Dillinger Peter.Dillinger at synopsys.com
Mon Apr 3 11:58:41 CDT 2017

Can we try to enumerate the potential hazards and potential useful features associated with multi-line strings?  Then perhaps you can judge various proposals based on them.

Potential hazards:

H1) Forgotten '+' (plus).  This affects the current idiom of concatenating string fragments on adjacent lines: forget the + between two fragments and part of your intended string goes missing, with only a compiler warning.  (Making this a compiler error was rejected in thread "Disallowing many expressions with unused result".)
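
As a rough illustration of H1 (the constant names are made up for this sketch), the second declaration below drops its + and loses the second fragment; the compiler is expected to report only an unused-literal warning:

```swift
// With the '+', both fragments end up in the string.
let withPlus = "first line\n"
    + "second line\n"

// Without the '+', the second literal is parsed as a separate statement
// whose value is discarded; the compiler warns but still builds.
let withoutPlus = "first line\n"
    "second line\n"

print(withPlus.count, withoutPlus.count)
```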

H2) Forgotten ',' (comma).  If adjacent string tokens (potentially on separate lines) are implicitly concatenated (as in C/C++), that makes it easy to mis-specify arrays of strings, as in
var a = [
    "one",
    "two"
    "three"
]
which would have only two elements.  This could also affect intended tuples, but with much higher likelihood of being caught by the compiler.

H3) No recovery for tokenization / syntax highlighting.  IMHO, this is the big drawback of Python-style """ strings.  If you jump to an arbitrary point in the source code, you don't know whether you're inside a """, and AFAIK there's no reliable, automatic way to figure out if the next """ enters or exits a multi-line string.  As someone who has dealt with building syntax highlighters, the property of a predictable tokenizer state after newline (as in languages like Java or C# - either default state or multiline comment) is really nice.  Yes, multiline comments are kind of ugly, but they at least tend to be self-correcting because the entry and exit character sequences are different!  Requiring some kind of continuation character generally facilitates immediate recovery.
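
A minimal sketch of why H3 bites, under a deliberately simplified model in which every """ toggles the tokenizer state (the function name and sample lines are hypothetical, and escapes, comments, and ordinary "..." strings are ignored):

```swift
import Foundation

// Simplified model: every """ flips between "code" and "inside a string".
func insideStringAfterEachLine(_ lines: [String], startingInside: Bool = false) -> [Bool] {
    let delimiter = "\"\"\""
    var inside = startingInside
    var states: [Bool] = []
    for line in lines {
        // Number of delimiter occurrences on this line.
        let occurrences = line.components(separatedBy: delimiter).count - 1
        if occurrences % 2 == 1 { inside.toggle() }
        states.append(inside)
    }
    return states
}

let lines = [
    "let a = \"\"\"",
    "some text",
    "\"\"\"",
    "let b = 1",
]

// Scanning from the top of the file gives the right state for each line.
let fromTop = insideStringAfterEachLine(lines)
// A highlighter that jumps to the second line and guesses "not in a string"
// gets the inverse state for every remaining line, and nothing in the text
// lets it notice the mistake.
let fromMiddle = insideStringAfterEachLine(Array(lines.dropFirst()))
print(fromTop, fromMiddle)
```

Because the begin and end delimiters are identical, the only reliable starting point is the top of the file.  A per-line continuation marker, or distinct begin and end sequences, would let the second scan correct itself.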

H4) Unclear escaping / newline / indentation semantics.  This has been under heavy discussion and I don't have much to add.

Potentially useful features:

F1) Interpolation.  There's less value if it's difficult to embed evaluated code.

F2) Raw embedding.  There's less value if it's difficult to construct literals from raw strings, because of the need for escaping / continuation characters etc.

Another imperfect proposal:

Support two forms of multiline strings:
* one with escaping and interpolation, in which newlines are embedded only with \n, delimited with \\\ and ///
* one with no escaping, in which newlines as written are included in the string, delimited with ``` and '''
Both forms would be subject to further restrictions to aid readability and use of indentation:
* The three-character begin or end delimiter must be on a line by itself, with only optional leading whitespace (spaces and/or tabs).
* Each line up to and including the end delimiter must exactly copy the leading whitespace used for the begin delimiter, and that whitespace is not included in the contents of the parsed string literal.

For example:
    var x =
        <?xml version="1.0">
        <?xml version=\"1.0\">\n
        <path>\n  \(path)\n</path>\n

    let f =
<paste in just about any file here>
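
To make the rules for the no-escaping form concrete, here is a minimal sketch of a checker that works on an array of source lines.  The names parseRawLiteral and RawLiteralError are hypothetical, and the handling of each error case is an assumption of this sketch, not part of the proposal:

```swift
enum RawLiteralError: Error, Equatable {
    case missingBeginDelimiter
    case missingEndDelimiter
    case badIndentation(line: Int)
}

// Parses the no-escaping form: the begin delimiter alone on the first line,
// the end delimiter alone on a later line, and every line in between
// starting with the same leading whitespace as the begin delimiter.
func parseRawLiteral(_ lines: [String]) throws -> String {
    guard let first = lines.first else { throw RawLiteralError.missingBeginDelimiter }
    // Leading whitespace (spaces and/or tabs) of the begin-delimiter line.
    let indent = String(first.prefix(while: { $0 == " " || $0 == "\t" }))
    guard first.dropFirst(indent.count) == "```" else {
        throw RawLiteralError.missingBeginDelimiter
    }
    var result = ""
    for (index, line) in lines.enumerated().dropFirst() {
        // The indentation must be copied exactly and is not part of the value.
        guard line.hasPrefix(indent) else {
            throw RawLiteralError.badIndentation(line: index)
        }
        let body = line.dropFirst(indent.count)
        if body == "'''" { return result }
        // Newlines as written are kept, including the final one.
        result.append(contentsOf: body)
        result.append("\n")
    }
    throw RawLiteralError.missingEndDelimiter
}

let source = [
    "    ```",
    "    <?xml version=\"1.0\">",
    "    <path>",
    "      \\(path)",
    "    </path>",
    "    '''",
]
let parsed = try parseRawLiteral(source)
print(parsed)
```

In this sketch the quotes and the \(path) text come through untouched, since the no-escaping form does no interpolation, and only the four spaces of shared indentation are stripped.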

Maybe there should be a way to omit the trailing newline from a ``` ''' string, but I don't have a specific proposal.

Peter Dillinger, Ph.D.
Software Engineering Manager, Coverity Analysis, Software Integrity Group | Synopsys
