[swift-evolution] multi-line string literals.

Tyler Fleming Cloutier cloutiertyler at aol.com
Thu May 5 01:35:17 CDT 2016


Comments inline.

> On May 4, 2016, at 7:57 PM, Brent Royal-Gordon <brent at architechies.com> wrote:
> They separate indentation from the string's contents. Traditional multiline strings usually include all of the content between the start and end delimiters, including leading whitespace. This means that it's usually impossible to indent a multiline string, so including one breaks up the flow of the surrounding code, making it less readable. Some languages apply heuristics, either at compile time or through runtime string manipulation functions, to try to remove indentation, but like all heuristics, these are mistake-prone and murky.
> 
> Continuation quotes neatly avoid this problem. Whitespace before the continuation quote is indentation used to format the source code; whitespace after the continuation quote is part of the string literal. The interpretation of the code is perfectly clear to both compiler and programmer.
> 
> 
Although I agree that runtime manipulation can be problematic, the Scala implementation of stripMargin does, in a way, solve the "n delimiter" problem by letting you specify which character should be used as the margin marker. For example,

val s = """
  |Line 1.
  |Line 2.
  |Line 3.""".stripMargin

Source: http://alvinalexander.com/scala/scala-multiline-strings-heredoc-syntax
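
As a minimal sketch of the custom margin marker mentioned above: the object and value names here are made up for illustration, while stripMargin(marginChar) is the standard library overload that takes the marker explicitly.

```scala
object CustomMarginDemo {
  // '#' is the margin marker here, so a literal '|' can appear in the text.
  val s: String =
    """#Line 1 | still has its pipe.
       #Line 2.""".stripMargin('#')

  def main(args: Array[String]): Unit =
    println(s)
}
```

Everything up to and including the first '#' on each line is dropped, and the '|' in the text is left alone.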

There are a few issues with this off the bat. First, if you are unfamiliar with stripMargin, it’s not clear whether the line break right after the opening delimiter counts as a newline in the string:

\n
Line 1.\n
Line 2.\n
Line 3.\n

vs

Line 1.\n
Line 2.\n
Line 3.\n

and what about:

val s = """
  |Line 1.

  |Line 2.
  |Line 3.""".stripMargin

or

val s = """|
  |Line 1.
  |Line 2.
  |Line 3.""".stripMargin

or

val s = """
  |Line 1.
  |Line 2.
  |Line 3.
  |""".stripMargin
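
One way to settle these questions is simply to run the variants. The sketch below (object and value names are only for illustration) relies on the documented behavior of stripMargin: on each line it drops leading whitespace up to and including the first '|', and it leaves lines without a margin marker untouched. It assumes the source file uses \n line endings.

```scala
object StripMarginDemo {
  // Line break right after the opening delimiter: kept as a leading "\n".
  val plain: String = """
    |Line 1.
    |Line 2.
    |Line 3.""".stripMargin

  // Empty line in the middle: it has no margin marker, so it stays an empty line.
  val withBlankLine: String = """
    |Line 1.

    |Line 2.
    |Line 3.""".stripMargin

  // Marker directly after the opening delimiter: only the '|' itself is stripped.
  val markerFirst: String = """|
    |Line 1.
    |Line 2.
    |Line 3.""".stripMargin

  // Marker alone on the last line: the string now ends with "\n".
  val markerLast: String = """
    |Line 1.
    |Line 2.
    |Line 3.
    |""".stripMargin

  def main(args: Array[String]): Unit = {
    assert(plain == "\nLine 1.\nLine 2.\nLine 3.")
    assert(withBlankLine == "\nLine 1.\n\nLine 2.\nLine 3.")
    assert(markerFirst == plain)
    assert(markerLast == plain + "\n")
    println("all variants behave as described")
  }
}
```

So the leading line break is kept in every variant, and there is a trailing newline only when the closing delimiter sits on its own marked line. None of that is obvious from reading the literal, which is the point.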


Second, because of the start and end delimiters, it doesn’t play as nicely with indentation as a single column of quote characters. It also takes a lot of massaging to get a pasted string right, especially because it’s difficult to have the editor auto-format it when the delimiters are really just part of the string.

Plus, this has the other issues you mentioned with syntax highlighting, errors, and finding delimiters. Still, I think it does solve some of the more major problems, and it has the nice property of requiring only a simple standard library change. I don’t think these benefits outweigh the drawbacks above and the points you brought up, but perhaps it should be included in alternatives considered, with the reasons for and against?

> Future directions for multiline string literals <https://gist.github.com/brentdax/c580bae68990b160645c030b2d0d1a8f#future-directions-for-multiline-string-literals>
> 
> We could permit comments before encountering a continuation quote to be counted as whitespace, and permit empty lines in the middle of string literals. This would allow you to comment out whole lines in the literal.
> 
> We could allow you to put a trailing backslash on a line to indicate that the newline isn't "real" and should be omitted from the literal's contents. Holdsworth's prototype includes this feature.
> 
> 
Alternately, you could just close the quote on that line, perhaps? As in the limerick below.

let limerick = "Here’s a multiline literal string
               "It’s a cool, kinda fun, sort of thing
               "It's got newlines galore
               "but not anymore "
               "'cause I've capped the above line with bling"

This could be implemented by concatenating adjacent strings together automatically. However, I’m far from a compiler hacker: would an implementation like that be too complicated for the type checker?
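
To make the intended result concrete, here is a rough model of that rule, written in Scala only to match the earlier examples. It is not Swift and not how a compiler would do it; Segment, closed, and assemble are hypothetical names. The assumed rule is that a line ending without a closing quote contributes its text plus a newline, while a line that closes its quote contributes only its text and is concatenated with whatever follows.

```scala
object ContinuationSketch {
  // Hypothetical model of one source line of the literal:
  // its text, and whether the line ended with a closing quote.
  final case class Segment(text: String, closed: Boolean)

  // An unclosed line keeps its newline; a closed line is joined directly
  // to the next segment (adjacent-string concatenation).
  def assemble(segments: Seq[Segment]): String =
    segments.map(s => if (s.closed) s.text else s.text + "\n").mkString

  val limerick: String = assemble(Seq(
    Segment("Here's a multiline literal string", closed = false),
    Segment("It's a cool, kinda fun, sort of thing", closed = false),
    Segment("It's got newlines galore", closed = false),
    Segment("but not anymore ", closed = true),
    Segment("'cause I've capped the above line with bling", closed = true)
  ))

  def main(args: Array[String]): Unit =
    println(limerick)
}
```

Under that assumption the limerick comes out as four lines, with the last two source lines joined into one.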


> A general mechanism: String literal modifiers <https://gist.github.com/brentdax/c580bae68990b160645c030b2d0d1a8f#a-general-mechanism-string-literal-modifiers>
> 
> We may introduce the concept of string literal modifiers to alter the interpretation of string literals. These would become the basis for many future string literal features.
> 
> A string literal modifier is a cluster of identifier characters which goes before a string literal and adjusts the way it is parsed. Modifiers only alter the interpretation of the text in the literal, not the type of data it produces; for instance, there will never be something like the UTF-8/UTF-16/UTF-32 literal modifiers in C++.
> 
> Modifiers can be attached to both single-line and multiline literals, and could also be attached to other literal syntaxes which might be introduced in the future. When used with multiline strings, only the starting quote needs to carry the modifiers, not the continuation quotes.
> 
> In one potential design, uppercase modifier characters enable a feature; lowercase characters disable a feature.
> 
> Our prototype also includes basic support for string modifiers, although the specific behavior of the modifiers in the prototype doesn't precisely match this sketch.
> 
> 
It seems like it would be easy to mistake a modified multiline string literal for an unmodified one, since the modifier appears only at the top. We could require the modifier on each line, which would allow more granular control but would be more difficult to edit.

With regard to the rest of the proposal, it’s awesome! It’s really quite thorough in its consideration of the tradeoffs. I did not expect someone to actually start running with the idea, but you’ve taken it really far!

Tyler
