<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><div class=""><pre style="background-color: rgb(255, 255, 255);" class=""><div style="white-space: normal; font-family: Helvetica;" class="">[couple minutes read]</div><div style="white-space: normal; font-family: Helvetica;" class=""><br class=""></div><div style="white-space: normal; font-family: Helvetica;" class="">I read this thread with great attention, trying to see it from the implementation viewpoint (I know that the compiler structure should not drive the language features). I also revisited the how-to-contribute notes as well as the dev-process description. One of the ideas that stood out in my mind was that, when looking at an implementation, enabling changes should be separated from the bulk of the feature, so that reviews are easier.</div><div style="white-space: normal; font-family: Helvetica;" class=""><br class=""></div><div style="white-space: normal; font-family: Helvetica;" class="">So I tried to elevate this to the rank of a hidden mandatory requirement for anything related to this feature. 
It led me to a staged approach to this feature that would allow a lot of things to be done, OVER TIME.</div><div style="white-space: normal; font-family: Helvetica;" class=""><br class=""></div><div style="white-space: normal; font-family: Helvetica;" class="">When distilling this feature down to the smallest enabler that would have to be added to the compiler, I came to the following short list:</div><div style="white-space: normal; font-family: Helvetica;" class=""><br class=""></div><div class=""><ul class="MailOutline"><li style="font-family: Helvetica; white-space: normal;" class="">add a string_multiline_token to the lexer</li><ul style="font-family: Helvetica; white-space: normal;" class=""><li class="">I realize that the current lexer can be tweaked to work (as per John’s PR), but IMO adding a dedicated "hole" in the parsing code is what will give us something that works today (no difference from current compiler behavior) while allowing all future changes to be cleanly isolated from the surrounding code</li></ul><li class=""><font face="Helvetica" class=""><span style="white-space: normal;" class="">if one accepts the idea of a hole created by the token, then it stands to reason to have delimiters around it. 
Looking at the structure of the grammar, I came to the conclusion that </span></font><font color="#ff2600" class=""><font face="Helvetica" class=""><span style="white-space: normal;" class="">_" </span></font></font><font face="Helvetica" class=""><span style="white-space: normal;" class="">and </span><font color="#ff2600" style="white-space: normal;" class="">"_</font><span style="white-space: normal;" class=""> were an easy, unambiguous choice (I believe <font color="#ff2600" class="">""</font></span></font><font color="#ff2600" face="Helvetica" class=""><span style="white-space: normal;" class="">"</span></font><font face="Helvetica" class=""><span style="white-space: normal;" class=""> and <font color="#ff2600" class="">"""</font> would be an equally easy and unambiguous choice)</span></font></li><li class=""><font face="Helvetica" class=""><span style="white-space: normal;" class="">the next step should be the creation of a </span><font color="#ff2600" style="white-space: normal;" class="">lexStringMultilineLiteral()</font><span style="white-space: normal;" class=""> and a </span><font color="#ff2600" style="white-space: normal;" class="">lexMultilineCharacter()</font><span style="white-space: normal;" class=""> method in the Lexer. 
Again… bear with me, I do believe it is relevant to what everyone wants this feature to be… The latter method should contain only extensions specific to multiline literals, delegating common cases to <font color="#ff2600" class="">lexCharacter()</font></span></font></li></ul><div class=""><br class=""></div></div><div class=""><span style="font-family: Helvetica; white-space: normal;" class="">The main point of following this route (or any equivalent) is that </span></div><div class=""><ul class="MailOutline"><li class=""><span style="font-family: Helvetica; white-space: normal;" class="">it represents a very clear commitment to multiline string literals</span></li><li class=""><span style="font-family: Helvetica; white-space: normal;" class="">it ensures that there is no strong commitment to feature details, while allowing many future scenarios</span></li><li class=""><span style="font-family: Helvetica; white-space: normal;" class="">it will remain backward compatible with enhancements to the current string literal syntax (translation?)</span></li><li class=""><font face="Helvetica" class=""><span style="white-space: normal;" class="">external contributors will be able to prototype while making sure we stay within strict boundaries for integration with the compiler</span></font></li></ul><div class=""><font face="Helvetica" class=""><span style="white-space: normal;" class=""><br class=""></span></font></div></div><div class=""><font face="Helvetica" class=""><span style="white-space: normal;" class="">The next equally small step would be to describe the required minimal changes to the Parser, a step I do not want to take now if the compiler experts see no merit at all in the proposed staged approach.</span></font></div><div class=""><span style="font-family: Helvetica; white-space: normal;" class=""><br class=""></span></div><div class=""><span style="font-family: Helvetica; white-space: normal;" class=""><br class=""></span></div><div class=""><span 
style="font-family: Helvetica; white-space: normal;" class=""><br class=""></span></div><div class=""><span style="font-family: Helvetica; white-space: normal;" class="">A thought experiment pushing further down this path shows how the following would all be equally possible language features (with roughly equivalent implementation cost):</span></div><div class=""><br class=""></div><div class=""><span style="white-space: pre-wrap;" class="">let whyOwhy = """\</span></div></pre><pre style="white-space: pre-wrap; background-color: rgb(255, 255, 255);" class=""> !! Can't understand what improvements it truly delivers
!! It basically removes a handful of characters
!! It works today
!! But I don't see it as a likable foundation for adding future enhancements
!!\
!! I don't envy the people who will have to support it outside of Xcode
!! Or even in Xcode (considering how it currently struggles with indents/formatting)
!! As for elegance, beauty is in the eye of the beholder, they say.
"""</pre><pre style="white-space: pre-wrap; background-color: rgb(255, 255, 255);" class="">
var json1 = _"[json]\
!!{
!! "file" : "\(wishIhadPlaceholders)_000.md",
!! "desc" : "and why are all examples in xml, i thought it died a while ago ;-)",
!! "rationale" : [
!! "Here we go again",
!! "How will Xcode help make these workable"
!! ]
!!}
"_
</pre><pre style="white-space: pre-wrap; background-color: rgb(255, 255, 255);" class=""><pre style="white-space: pre-wrap;" class="">var json2 = _"[json]\
{
"file" : "\(wishIhadPlaceholders)_000.md",
"desc" : "and why are all examples in xml, i thought it died a while ago ;-)",
"rationale" : [
"Here we go again",
"How will Xcode help make these workable"
]
}
"_
</pre><div class=""><br class=""></div></pre><pre style="white-space: pre-wrap; background-color: rgb(255, 255, 255);" class=""> [<font color="#ff2600" class="">_"</font>] --> start string
[<font color="#ff2600" class="">_"\</font>] --> start string + ignore everything up to and including the eol (basically swallows \r\n)
[<font color="#ff2600" class="">!!\</font>] --> ignore everything until eol... basically the gap does not exist
[<font color="#ff2600" class="">"_</font>] --> terminate string
[<font color="#ff2600" class="">_"[</font><i class="">TYPEID</i><font color="#ff2600" class="">]\</font>] --> start string, tagged so that a verifier or a formatter (or a chain of them) that understands TYPEID can syntax-check it, format it, and so on
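
To make the table above a bit more concrete, here is a rough sketch in Swift of how such a literal could be decoded. It is an illustration only, not the Lexer code (which lives in C++). MultilineLiteral, the line-by-line approach and the exact margin rules are assumptions of mine: a leading !! plus one space is stripped, a line reading !!\ vanishes, and a line starting with "_ closes the literal. It borrows the lexStringMultilineLiteral name from the list above purely for readability, and it leaves \(...) placeholders untouched.

```swift
// Hypothetical result type: the optional TYPEID tag plus the decoded contents.
struct MultilineLiteral: Equatable {
    var tag: String?
    var value: String
}

// Sketch only: decodes one `_" ... "_` literal line by line.
// Returns nil if the opening or closing delimiter is missing.
func lexStringMultilineLiteral(_ source: String) -> MultilineLiteral? {
    let lines = source.split(separator: "\n", omittingEmptySubsequences: false)
    guard let first = lines.first, first.hasPrefix("_\"") else { return nil }

    // Opening line: `_"`, then an optional `[TYPEID]`, then an optional `\`.
    var opening = first.dropFirst(2)
    var tag: String? = nil
    if opening.hasPrefix("["), opening.contains("]") {
        let name = opening.dropFirst().prefix(while: { $0 != "]" })
        tag = String(name)
        opening = opening.dropFirst(name.count + 2)  // skip `[`, TYPEID and `]`
    }

    var pieces: [String] = []
    // A lone `\` swallows the rest of the opening line; anything else is content.
    if opening != "\\" { pieces.append(String(opening)) }

    for line in lines.dropFirst() {
        let trimmed = line.drop(while: { $0 == " " || $0 == "\t" })
        if trimmed.hasPrefix("\"_") {
            return MultilineLiteral(tag: tag, value: pieces.joined(separator: "\n"))
        }
        guard trimmed.hasPrefix("!!") else {
            pieces.append(String(line))  // no margin marker: keep the line verbatim
            continue
        }
        var content = trimmed.dropFirst(2)
        if content == "\\" { continue }  // `!!\`: the line vanishes entirely
        if content.hasPrefix(" ") { content = content.dropFirst() }
        pieces.append(String(content))
    }
    return nil  // ran out of input before seeing `"_`
}

// Usage: the source text is assembled from ordinary one-line strings.
let source = [
    "_\"[json]\\",
    "!! {",
    "!!   \"file\" : \"a.md\"",
    "!!\\",
    "!! }",
    "\"_",
].joined(separator: "\n")

if let literal = lexStringMultilineLiteral(source) {
    print(literal.tag ?? "untagged")
    print(literal.value)
}
```

The point of returning the tag separately is that a [json] verifier or formatter could be handed the decoded value without the lexer knowing anything about JSON.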
<br class=""></pre><pre style="white-space: pre-wrap; background-color: rgb(255, 255, 255);" class=""><br class=""></pre><pre style="background-color: rgb(255, 255, 255);" class=""><pre class=""><font face="Helvetica" class=""><span style="white-space: normal;" class="">IMO splitting these expressions from the current lexing/parsing has other long-term benefits when coupled with the aforementioned idea of content tagging:</span></font></pre><pre class=""><ul class="MailOutline"><li class=""><font face="Helvetica" class=""><span style="white-space: normal;" class="">allow dedicated external formatters to be created in any editor supporting Swift</span></font></li><li class=""><font face="Helvetica" class=""><span style="white-space: normal;" class="">allow external validators (including in the form of compiler plugins)</span></font></li><li class=""><font face="Helvetica" class=""><span style="white-space: normal;" class="">open a door to an equivalent of Scala's macros for contents marked as <font color="#ff2600" class="">[swift]</font></span></font></li></ul></pre><font face="Helvetica" class=""><pre style="background-color: rgb(255, 255, 255);" class=""><font face="Helvetica" class=""><span style="white-space: pre-wrap;" class=""><br class=""></span></font></pre><pre style="background-color: rgb(255, 255, 255);" class=""><font face="Helvetica" class=""><span style="white-space: pre-wrap;" class="">Once again, I fully appreciate that implementation should not drive language design, but considering the flurry of great ideas, I thought it might in this instance be useful to identify a minimal, noncommittal direction </span></font><span style="white-space: pre-wrap; font-family: Helvetica;" class="">common to many scenarios, such that a step can be taken that will neither favor nor prohibit any of the proposals, but simply enable them all.</span></pre><span style="white-space: pre-wrap;" class=""><pre style="background-color: rgb(255, 255, 255);" class=""><font 
face="Helvetica" class=""><span style="white-space: pre-wrap;" class=""><br class=""></span></font></pre>Thank you for your patience</span></font></pre><pre style="background-color: rgb(255, 255, 255);" class=""><font face="Helvetica" class=""><span style="white-space: pre-wrap;" class="">Regards
</span></font><span style="white-space: pre-wrap;" class="">
</span></pre><pre style="background-color: rgb(255, 255, 255);" class=""><pre class=""><span style="white-space: pre-wrap;" class=""><font face="Helvetica" class="">PS: I am working on a rudimentary implementation that I hope could help people test all the ideas floating in this list. </font></span></pre></pre></div><div class=""><br class=""></div><br class=""><div><blockquote type="cite" class=""><div class="">On Apr 26, 2016, at 8:04 AM, Chris Lattner via swift-evolution <<a href="mailto:swift-evolution@swift.org" class="">swift-evolution@swift.org</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div class="">On Apr 25, 2016, at 5:22 PM, Brent Royal-Gordon <<a href="mailto:brent@architechies.com" class="">brent@architechies.com</a>> wrote:<br class=""><blockquote type="cite" class=""><blockquote type="cite" class=""><blockquote type="cite" class="">3. It might be useful to make multiline `"` strings trim trailing whitespace and comments like Perl's `/x` regex modifier does.<br class=""></blockquote><br class="">If you have modifier characters already, it is easy to build a small zoo full of these useful beasts.<br class=""></blockquote><br class="">Modifiers are definitely a workable alternative, and can be quite flexible, particularly if a future macro system can let you create new modifiers.<br class=""></blockquote><br class="">Right. I consider modifiers to be highly precedented in other languages, and therefore proven to work. If we go this way, I greatly prefer prefix to postfix modifiers.<br class=""><br class=""><blockquote type="cite" class=""><blockquote type="cite" class=""><blockquote type="cite" class="">* Alternative delimiters: If a string literal starts with three, or five, or seven, or etc. quotes, that is the delimiter, and fewer quotes than that in a row are simply literal quote marks. Four, six, etc. 
quotes is a quote mark abutting the end of the literal.<br class=""><br class=""><span class="Apple-tab-span" style="white-space:pre">        </span>let xml: String = """<?xml version="1.0"?><br class=""><span class="Apple-tab-span" style="white-space:pre">        </span><span class="Apple-tab-span" style="white-space:pre">        </span><span class="Apple-tab-span" style="white-space:pre">        </span><span class="Apple-tab-span" style="white-space:pre">        </span>"""<catalog><br class=""><span class="Apple-tab-span" style="white-space:pre">        </span><span class="Apple-tab-span" style="white-space:pre">        </span><span class="Apple-tab-span" style="white-space:pre">        </span><span class="Apple-tab-span" style="white-space:pre">        </span>"""\t<book id="bk101" empty=""><br class=""><span class="Apple-tab-span" style="white-space:pre">        </span><span class="Apple-tab-span" style="white-space:pre">        </span><span class="Apple-tab-span" style="white-space:pre">        </span><span class="Apple-tab-span" style="white-space:pre">        </span>"""\t\t<author>\(author)</author><br class=""><span class="Apple-tab-span" style="white-space:pre">        </span><span class="Apple-tab-span" style="white-space:pre">        </span><span class="Apple-tab-span" style="white-space:pre">        </span><span class="Apple-tab-span" style="white-space:pre">        </span>"""\t</book><br class=""><span class="Apple-tab-span" style="white-space:pre">        </span><span class="Apple-tab-span" style="white-space:pre">        </span><span class="Apple-tab-span" style="white-space:pre">        </span><span class="Apple-tab-span" style="white-space:pre">        </span>"""</catalog>"""<br class=""><br class="">You can't use this syntax to express an empty string, or a string consisting entirely of quote marks, but `""` handles empty strings adequately, and escaping can help with quote marks. 
(An alternative would be to remove the abutting rule and permit `""""""` to mean "empty string", but abutting quotes seem more useful than long-delimiter empty strings.)<br class=""></blockquote><br class="">I agree that there is a need to support alternative delimiters, but subjectively, I find this to be pretty ugly. It is also a really unfortunate degenerate case for “I just want a large blob of XML” because you’d end up using “"” almost all the time, and you have to use it on every line.<br class=""></blockquote><br class="">On the other hand, the `"""` does form a much larger, more obvious continuation indicator. It is *extremely* obvious that the above line is not Swift code, but something else embedded in it. It's also extremely obvious what its extent is: when you stop seeing `"""`, you're back to normal Swift code.<br class=""></blockquote><br class="">Right, but it is also heavy weight and ugly. In your previous email you said about the single quote approach: "The quotation marks on the left end up forming a column that marks the lines as special”, so I don’t see a need for a triple quote syntax to solve this specific problem.<br class=""><br class=""><blockquote type="cite" class="">I *really* don't like the idea of our only alternatives being "one double-quote mark with backslashing" or "use an entire heredoc". Heredocs have their place, but they are a *very* heavyweight quoting mechanism, and relatively short strings with many double-quotes are pretty common. (Consider, for instance, strings containing unparsed JSON.) 
I think we need *some* alternative to double-quotes, either single-quotes (with the same semantics, just as an alternative) or this kind of quote-stacking.<br class=""></blockquote><br class="">I agree that this is a real problem that would be great to solve.<br class=""><br class="">If I step back and look at the string literal space we’re discussing, I feel like there are three options:<br class=""><br class="">1) single and simple multiline strings, using “<br class="">2) your triple quote sort of string, specifically tuned to avoid having to escape “ when it occurs once or twice in sequence.<br class="">3) heredoc, which is a very general (but also very heavy weight) solution to quoting problems.<br class=""><br class="">I’m trying to eliminate the middle one, so we only have to have "two things”. Here are some alternative ways to solve the problem, which might have less of an impact on the language:<br class=""><br class="">A) Introduce single quoted string literals to avoid double quote problems specifically, e.g.: ‘look “here” I say!’. This is another form of #2 which is less ugly. It also doesn’t help you if you have both “ and ‘ in your string.<br class=""><br class="">B) Introduce a modifier character that requires a more complex closing sequence to close off the string, see C++ raw string literals for prior art on this approach. Perhaps something like:<br class=""><br class=""><span class="Apple-tab-span" style="white-space:pre">        </span> Rxxx”look “ here “ I can use quotes “xxx<br class=""><br class="">That said, I still prefer C) "ignore this issue for now”. 
In other words, I wouldn’t want to block progress on improving the string literal situation overall on this issue, because anything we do here is an further extension to a proposal that doesn’t solve this problem.<br class=""><br class=""><blockquote type="cite" class=""><br class=""><blockquote type="cite" class="">For cases like this, I think it would be reasonable to have a “heredoc” like scheme, which does not allow leading indentation, and does work with all the same modifier characters above. I do not have a preference on a particular syntax, and haven’t given it any thought, but this would allow you to do things like:<br class=""><br class=""><span class="Apple-tab-span" style="white-space:pre">        </span>let str = <<EOF<br class=""><?xml version="1.0"?><br class=""><catalog><br class="">\t<book id="bk101" empty=""><br class="">\t\t<author>\(author)</author><br class="">\t</book><br class=""></catalog><br class="">EOF<br class=""><br class="">for example. You could then turn off escaping and other knobs using the modifier character (somehow, it would have to be incorporated into the syntax of course).<br class=""></blockquote><br class="">There are two questions and a suggestion I have whenever heredoc syntax comes up.<br class=""><br class="">Q1: Does the heredoc begin immediately, at the next line, or at the next valid place for a statement to start? 
Heredocs traditionally take the second approach.<br class=""><br class="">Q2: Do you permit heredocs to stack—that is, for a single line to specify multiple heredocs?<br class=""><br class="">S: During the Perl 6 redesign, they decided to use the delimiter's indentation to determine the base indentation for the heredoc:<br class=""><br class=""><span class="Apple-tab-span" style="white-space:pre">        </span>func x() -> String {<br class=""><span class="Apple-tab-span" style="white-space:pre">        </span><span class="Apple-tab-span" style="white-space:pre">        </span>return <<EOF<br class=""><span class="Apple-tab-span" style="white-space:pre">        </span><span class="Apple-tab-span" style="white-space:pre">        </span><?xml version="1.0"?><br class=""><span class="Apple-tab-span" style="white-space:pre">        </span><span class="Apple-tab-span" style="white-space:pre">        </span><catalog><br class=""><span class="Apple-tab-span" style="white-space:pre">        </span><span class="Apple-tab-span" style="white-space:pre">        </span>\t<book id="bk101" empty=""><br class=""><span class="Apple-tab-span" style="white-space:pre">        </span><span class="Apple-tab-span" style="white-space:pre">        </span>\t\t<author>\(author)</author><br class=""><span class="Apple-tab-span" style="white-space:pre">        </span><span class="Apple-tab-span" style="white-space:pre">        </span>\t</book><br class=""><span class="Apple-tab-span" style="white-space:pre">        </span><span class="Apple-tab-span" style="white-space:pre">        </span></catalog><br class=""><span class="Apple-tab-span" style="white-space:pre">        </span><span class="Apple-tab-span" style="white-space:pre">        </span>EOF<br class=""><span class="Apple-tab-span" style="white-space:pre">        </span>}<br class=""><br class="">Does that seem like a good approach?<br class=""></blockquote><br class="">I think that either approach could work, that you have a lot more 
experience on these topics than I do, and I would expect a vigorous community debate about these topics. :-)<br class=""><br class="">That said, if you look at what we’re discussing:<br class=""><br class="">1. “Continuation" string literals, to allow a multi-line string literal. You and I appear to completely agree about this.<br class="">2. Heredoc: You and I seem to agree that they are a good “fully general” solution to have, but there are the details you outline above to iron out.<br class="">3. Modifier characters: I’m in favor, but I don’t know where you stand. There is also still much to iron out here (such as the specific characters).<br class="">4. A way to avoid having to escape “ in a non-heredoc literal. I’m still unconvinced, and think that any solution to this problem will be orthogonal to the problems solved by 1-3 (and therefore can be added after getting experience with the other parts).<br class=""><br class="">If you agree that these are all orthogonal pieces, then treat them as such: I’d suggest that you provide a proposal that just tackles the continuation string literals. This seems simple, and possible to get in for Swift 3. After that, we can discuss heredoc and modifiers (if you think they’re a good solution) on their own threads. If those turn out to be uncontroversial, then perhaps they can get in too.<br class=""><br class="">On the heredoc aspects specifically, unless others chime in with strong opinions about the topics you brought up, I’d suggest that you craft a proposal for adding them with your preferred solution to these. 
You can mention the other answers (along with their tradeoffs and rationale for why you picked whatever you think is right) in the proposal, and we can help the community hash it out.<br class=""><br class="">What do you think?<br class=""><br class="">-Chris<br class=""><br class=""><br class=""><br class=""><br class=""><br class="">_______________________________________________<br class="">swift-evolution mailing list<br class=""><a href="mailto:swift-evolution@swift.org" class="">swift-evolution@swift.org</a><br class="">https://lists.swift.org/mailman/listinfo/swift-evolution<br class=""></div></div></blockquote></div><br class=""></body></html>