[swift-evolution] multi-line string literals.

Vladimir.S svabox at gmail.com
Mon Apr 3 14:25:55 CDT 2017


On 03.04.2017 19:58, Peter Dillinger via swift-evolution wrote:
> Can we try to enumerate the potential hazards and potential useful features associated with multi-line strings?  Then perhaps you can judge various proposals based on them.
>
>
> Potential hazards:
>
> H1) Forgotten '+' (plus).  This affects the current idiom where you could end up with part of your intended string going missing (with a compiler warning) because you forgot to put + between string fragments on adjacent lines.  (Making this a compiler error was rejected in thread "Disallowing many expressions with unused result".)
>
> H2) Forgotten ',' (comma).  If adjacent string tokens (potentially on separate lines) are implicitly concatenated (as in C/C++), that makes it easy to mis-specify arrays of strings, as in
> var a = [
>   "a",
>   "b"
>   "c"]
> which would have only two elements.  This could also affect intended tuples, but with much higher likelihood of being caught by the compiler.
>
> H3) No recovery for tokenization / syntax highlighting.  IMHO, this is the big drawback of Python-style """ strings.  If you jump to an arbitrary point in the source code, you don't know whether you're inside a """, and AFAIK there's no reliable, automatic way to figure out if the next """ enters or exits a multi-line string.  As someone who has dealt with building syntax highlighters, the property of a predictable tokenizer state after newline (as in languages like Java or C# - either default state or multiline comment) is really nice.  Yes, multiline comments are kind of ugly, but they at least tend to be self-correcting because the entry and exit character sequences are different!  Requiring some kind of continuation character generally facilitates immediate recovery.
>
> H4) Unclear escaping / newline / indentation semantics.  This has been under heavy discussion and I don't have much to add.
>
>
> Potentially useful features:
>
> F1) Interpolation.  There's less value if it's difficult to embed evaluated code.
>
> F2) Raw embedding.  There's less value if it's difficult to construct literals from raw strings, because of the need for escaping / continuation characters etc.
>
>
> Another imperfect proposal:
>
> Support two forms of multiline strings:
> * one with escaping, interpolation, and embedded newlines only with \n, delimited with \\\ and ///

Why do we need it? It's the same as concatenating strings, which we can
already do.

> * one with no escaping that includes written newlines in the string, delimited with ``` and '''

IMO it is very useful to have string interpolation in such multi-line
strings. In that case we should be able to escape '\(', which means that
at least the backslash itself has to be escapable. I'm not sure about the
usefulness of an "as-is" multi-line string.
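
To illustrate the escaping point with today's single-line literals (just a
sketch; `pathValue` and its value are made up for the example):

```swift
// Hypothetical value, used only for illustration.
let pathValue = "C:/Foo"

// "\(" starts an interpolation, so the value is substituted.
let interpolated = "path: \(pathValue)"

// Escaping the backslash keeps the characters "\(" in the text.
let literal = "path: \\(pathValue)"

print(interpolated) // path: C:/Foo
print(literal)      // path: \(pathValue)
```

So as long as interpolation is supported, a backslash that should appear in
the text needs some way to be written literally.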

I think a multi-line string should:
* Allow string interpolation
* Require escaping only for the backslash itself (to support interpolation);
all other symbols are written without escaping
* Use \" as the begin/end marker (or probably "\ as the end marker)
(looks similar to /* which opens multi-line comments)
* Append '\n' for each *new* line in the text (if the text is a single line,
no newline symbol is added)
* Trim leading and trailing whitespace on each line
(for the opening marker, leading spaces after the marker and before the
actual text are preserved;
for the closing marker, trailing spaces after the text and before the marker
are preserved)

var str =
\"<?xml version="1.0">
         <path>
           \(pathValue)
         </path>
"\

And if you need to control whitespace:

var str =
\"<?xml version="1.0">"\ + "\n"+
\"        <path>"\  + "\n"+
\"          \(pathValue)"\ + "\n"+
\"        </path>"\
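
For comparison, roughly the same whitespace-controlled string written with
the literals Swift has today (a sketch only; `pathValue` and its value are
assumed for the example):

```swift
// Hypothetical value, used only for illustration.
let pathValue = "C:/Foo"

// Each fragment carries its own indentation and an explicit "\n".
// In an ordinary literal the inner double quotes must be escaped.
let str =
    "<?xml version=\"1.0\">" + "\n" +
    "        <path>" + "\n" +
    "          \(pathValue)" + "\n" +
    "        </path>"

print(str)
```

The difference from the proposed form is mostly the escaped inner quotes.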


sendText(
\"Text line1
Text \(value)
Text line3"\
)
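
One possible reading of the newline and trimming rules above, modelled as a
plain function over the raw source lines between the markers. This is only a
sketch: `assembleProposedLiteral` is a made-up helper, not an existing API,
and it assumes the trimming applies per line.

```swift
// Hypothetical helper: models the proposed rules on the raw source lines
// found between the opening and closing markers.
func assembleProposedLiteral(_ lines: [String]) -> String {
    let isSpace: (Character) -> Bool = { $0 == " " || $0 == "\t" }
    var parts: [String] = []
    for (index, line) in lines.enumerated() {
        var text = line[...]
        // Leading whitespace is trimmed, except right after the opening marker.
        if index > 0 {
            text = text.drop(while: isSpace)
        }
        // Trailing whitespace is trimmed, except right before the closing marker.
        if index < lines.count - 1 {
            while let last = text.last, isSpace(last) {
                text.removeLast()
            }
        }
        parts.append(String(text))
    }
    // One "\n" per line break; a one-line literal gets none.
    return parts.joined(separator: "\n")
}

let value = 42  // hypothetical value for the interpolation
let text = assembleProposedLiteral([
    "Text line1",
    "Text \(value)",
    "Text line3",
])
print(text == "Text line1\nText 42\nText line3") // true
```

Under this reading the sendText example produces three lines joined by two
newlines, with no trailing newline.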

> Both forms would be subject to further restrictions to aid readability and use of indentation:
> * The three-character begin or end delimiter must be on a line by itself, only with optional leading whitespace (spaces and/or tabs).
> * Each line up to and including the end delimiter must exactly copy the leading whitespace used for the begin delimiter, and that whitespace is not included in the contents of the parsed string literal.
>
> For example:
>     var x =
>         ```
>         <?xml version="1.0">
>         <path>
>           C:\Foo
>         </path>
>         '''
> Or:
>     send(
>         \\\
>         <?xml version=\"1.0\">\n
>         <path>\n  \(path)\n</path>\n
>         ///
>     )
>
> Or:
>     let f =
> ```
> <paste in just about any file here>
> '''
>
> Maybe there should be a way to omit the trailing newline from a ``` ''' string, but I don't have a specific proposal.
>
>

