[swift-evolution] multi-line string literals.

Brent Royal-Gordon brent at architechies.com
Sat Apr 30 02:43:26 CDT 2016


> Example 1. The \@ operator: 
> 
> 	 // 1.  multi-line string literal with data lines as is. 
>          // It loads each line (part) up to and including the source-file-line- end:
>          // you can use all available characters without problems, 
>          // even \\ and \@  thus allowing you to nest e.g. Swift statements...  
>         
>          let xml =                                                    
>                      \@<?xml version="1.0"?>
> 	             \@  <catalog>
> 	             \@    <book id="bk101" empty=“”>      // this is not regarded as a comment.
> 	             \@       <author>//¯\"_(ツ)_//</author>
> 	             \@    </book>
>                      \@  </catalog> 
>  
> 
>    Example 2, The \\ operator: 
>    // Multi-line string literal with data lines with \n \t etc. respected: 
> 
>          var str =
>                      \\This is line one.\nThis is line two, with a few \t\t\t tabs in it...
>                      \\                                            
>                      \\This is line three: there are \(cars)                 // this is a comment.
>                      \\ waiting in the garage. This is still line three

There are a lot of reasons why I don't like these.

The first is simply that I think they're ugly and don't look like they have anything to do with string literals, but that's solvable. For instance, we could modify my proposal so that, if you were using continuation quotes, you wouldn't have to specify an end quote:

         let xml =                                                    
                     "<?xml version="1.0"?>
	             "  <catalog>
	             "    <book id="bk101" empty=“”>      // this is not regarded as a comment.
	             "       <author>//¯\"_(ツ)_//</author>
	             "    </book>
                     "  </catalog> 

So let's set the bikeshed color aside and think about the deeper problem, which is that line-oriented constructs like these are a poor fit for string literals.

A string literal in Swift is an expression, and the defining feature of expressions is that they can be nested within other expressions. We've been using examples where we simply assign them to variables, but quite often you don't really want to do that—you want to pass it to a function, or use an operator, or do something else with it. With an ending delimiter, that's doable:

	let xmlData = 
                     "<?xml version="1.0"?>
	             "  <catalog>
	             "    <book id="bk101" empty=“”>      // this is not regarded as a comment.
	             "       <author>//¯\"_(ツ)_//</author>
	             "    </book>
                     "  </catalog>".encoded(as: UTF8)

But what if there isn't a delimiter? You wouldn't be able to write the rest of the expression on the same line. In a semicolon-based language, that would merely lead to ugly code:

	let xmlData = 
                     "<?xml version="1.0"?>
	             "  <catalog>
	             "    <book id="bk101" empty=“”>      // this is not regarded as a comment.
	             "       <author>//¯\"_(ツ)_//</author>
	             "    </book>
                     "  </catalog>
	             .encoded(as: UTF8);

But Swift uses newlines to end statements, so that isn't an option:

	let xmlData = 
                     "<?xml version="1.0"?>
	             "  <catalog>
	             "    <book id="bk101" empty=“”>      // this is not regarded as a comment.
	             "       <author>//¯\"_(ツ)_//</author>
	             "    </book>
                     "  </catalog>
	             .encoded(as: UTF8)		// This may be a different statement!

You end up having to artificially add parentheses or other constructs in order to convince Swift that, no, that really is part of the same statement. That's not a good thing.
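
For a runnable point of comparison, here is a minimal sketch assuming a Swift 4 or later toolchain, where multi-line literals are delimited by `"""` rather than by the continuation quotes debated in this thread. Because the literal has a closing delimiter, the rest of the expression attaches directly to it. The `.utf8` view stands in for the hypothetical `.encoded(as: UTF8)` call used above, and the `<author>` text is a placeholder.

```swift
// Sketch only: """ is the delimiter in Swift 4 and later,
// not the continuation-quote syntax discussed in this thread.
// The closing delimiter lets the expression continue on the same line.
let xmlBytes = Array("""
    <?xml version="1.0"?>
    <catalog>
      <book id="bk101" empty="">
        <author>placeholder</author>
      </book>
    </catalog>
    """.utf8)

// The literal was nested inside a larger expression, so the result
// is already an array of UTF-8 bytes rather than a String.
print(xmlBytes.count)
```

Nothing here needs extra parentheses: the member access follows the closing delimiter, so there is no question of where the statement ends.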

(This problem of fitting in well as an expression is why I favor Perl-style heredocs over Python-style `"""` multiline strings. Heredoc placeholders work really well even in complicated expressions, whereas `"""` multiline strings split expressions in half over potentially enormous amounts of code. This might seem at odds with my support for the proposal at hand, but I imagine this proposal being aimed at strings that are a few lines long, where a heredoc would be overkill. If you're going to have two different features which do similar things, you should at least make sure they have different strengths and weaknesses.)

But you could argue that it's simply a matter of style that, unfortunately, you'll usually have to assign long strings to constants. Fine. There's still a deeper problem with this design.

You propose a pair of multi-line-only string literals. One of them supports escapes, the other doesn't; both of them avoid the need to escape quotes.

Fine. Now what if you need to disable escapes or avoid escaping quotes in a single-line string? What if your string is, say, a regular expression like `"[^"\\]*(\\.[^"\\]*)*+"`—something very short, but full of backslashes and quotes?
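
To make the escaping cost concrete, here is a sketch of that regular expression written as an ordinary single-line Swift literal, next to the same pattern written with the `#"…"#` delimiters. The second form is an assumption about the toolchain (Swift 5.0 or later) and is not part of either design discussed in this thread; it is shown only to illustrate what "disabling escapes" and "unescaped quotes" look like on one line.

```swift
// The pattern as an ordinary literal: every quote and every
// backslash in the regular expression has to be escaped again.
let escapedPattern = "\"[^\"\\\\]*(\\\\.[^\"\\\\]*)*+\""

// The same pattern with escapes disabled and quotes left alone,
// using #"…"# delimiters (assumes Swift 5.0 or later).
let rawPattern = #""[^"\\]*(\\.[^"\\]*)*+""#

// Both constants are meant to hold the same characters.
print(escapedPattern == rawPattern)
print(rawPattern)
```

The escaped form roughly doubles the number of backslashes, which is the readability problem a single-line mechanism would need to solve.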

The constructs you propose are very poorly suited for that—remember, because they're line-oriented, they don't work well in the middle of a more complicated expression—and they aren't built on features which generalize to act on single-line strings. So now we have to invent some separate mechanism which does the same thing to single-line strings, but works in a different and incompatible way. That means we now have five ad-hoc features, each of which works differently, with no way to transport your knowledge from one of them to another:

* Single-line strings
* Disabling escapes for single-line strings
* Unescaped quotes for single-line strings
* Multi-line-only strings with unescaped quotes
* Multi-line-only strings with unescaped quotes and disabled escapes

My proposal and the other features I sketch, on the other hand, do the same things with only three features, which you can use in any combination:

* Single- or multi-line strings
* Disabling escapes for any string
* Unescaped quotes for any string

This kind of modular design, where a particular task is done in the same way throughout the language, is part of what makes a good language good.
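
As an illustration of features that combine freely, the sketch below uses the syntax of later Swift releases (an assumption: Swift 5.0 or later), not the syntax proposed in this thread. There, one marker (`#`) disables escapes, and it applies the same way to single-line and multi-line literals. The variable names are made up for the example.

```swift
let cars = 3

// Multi-line, with escapes and interpolation active.
let cookedLines = """
    Line one.\tTabbed.
    There are \(cars) cars.
    """

// Multi-line, with escapes disabled by the # delimiter.
let rawLines = #"""
    Line one.\tNot a tab.
    There are \(cars) cars, verbatim.
    """#

// Single-line, with escapes disabled by the same # delimiter.
let rawSingle = #"Line one.\tNot a tab."#

print(cookedLines)
print(rawLines)
print(rawSingle)
```

Knowing how `#` behaves on a one-line string is enough to predict how it behaves on a multi-line one, which is the kind of transferable knowledge the list above is arguing for.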

-- 
Brent Royal-Gordon
Architechies


