[swift-evolution] [Pitch] alternative multiline string literals

L Mihalkovic laurent.mihalkovic at gmail.com
Sun May 8 04:08:33 CDT 2016

Added details about the prototype. 

// This code sample and prototype implementation (implemented as a patch against the 3.0
// branch of the swift compiler) explores a possible syntax based on ideas discussed in
// the swift-evolution mailing list at
//      http://thread.gmane.org/gmane.comp.lang.swift.evolution/904/focus=15133
// The proposed syntax uses a combination of two characters to signal the start and end 
// of a multiline string in the source code:   _" contents "_ 
// Additionally, the syntax introduces a new @string_literal() attribute that can be used
// to semantically tag the contents of a multiline string literal. The hope is that this
// attribute would be used by IDEs for custom validation/formatting of the contents of 
// long string literals, as well as possibly be accessible at runtime via an extended
// mirror/reflection mechanism.

// Tagging literal contents
@string_literal("json")       let att1 = _"{"key1": "stringValue"}"_
@string_literal("text/xml")   let att2 = _"<catalog><book id="bk101" empty=""/></catalog>"_
@string_literal("swift_sil")  let att3 = _" embedded SIL contents?! "_

// The following alternatives for placement of the attribute have been looked into 
// and so far rejected for seemingly not fitting as closely with other attribute usage 
// patterns in the Swift grammar:
// let att2 : @string_literal("text/xml") String  = _" ... "_  // Conveys the impression that the type is annotated rather than the variable
// let att2 = @string_literal("text/xml") _" ... "_            // Appealing, but without any precedent in the Swift grammar
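
To illustrate how a tool could act on the tag, here is a minimal sketch of a validator lookup keyed by the attribute value. Nothing here exists in the prototype: `LiteralValidator`, `validators`, and `validate(_:tag:)` are hypothetical names, and the "json" check simply delegates to Foundation's JSONSerialization.

```swift
import Foundation

// Hypothetical: a check that a tool might run on the contents of a tagged literal.
typealias LiteralValidator = (String) -> Bool

// Hypothetical registry keyed by the @string_literal tag value.
let validators: [String: LiteralValidator] = [
    "json": { text in
        guard let data = text.data(using: .utf8) else { return false }
        // jsonObject(with:options:) throws on malformed input.
        return (try? JSONSerialization.jsonObject(with: data, options: [])) != nil
    }
]

// Contents with an unregistered tag are accepted as-is (an assumption of this sketch).
func validate(_ contents: String, tag: String) -> Bool {
    guard let check = validators[tag] else { return true }
    return check(contents)
}
```

With this sketch, the contents of att1 would pass the "json" check, while att2 would be accepted unchecked because no "text/xml" validator is registered.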

// checking that nothing is broken
let s0 = "s0"

// The default swift syntax requires that quotes be escaped
let s1 = "{\"key1\": \"stringValue\"}"

// The proposed syntax for multiline strings works for single line strings as well (maybe it 
// should not) and does not mandate that enclosed double-quote characters be escaped
let s2 = _"{"v2"}"_

// When dealing with long blocks of embedded text, it seems natural to want to describe them
// as close as possible to the contents. The proposed syntax supports inserting a comment
// just before the data it documents. This allows the comment indentation to match exactly
// that of the string.
let s3 =
    /* this is a template */
    _"{"key3": "stringValue"}"_

// --------------------------------------------------------------------------------
// The following section explores different ways to deal with leading spaces

let s4 =
/* this is (almost) the same template */
_"{
  "key4": "stringValue"
  , "key2": "stringValue"
}
"_

let equivS4 =
"{\n" +
"  \"key4\": \"stringValue\"\n" +
"  , \"key2\": \"stringValue\"\n" +
"}\n"

//TODO: fix the leading spaces
let s5 =
  /* this is exactly the same template as s4 */
    "key5": "stringValue"

//TODO: fix the leading spaces
let s6 =
  /* this is exactly the same template as s5 */
  |  "key6": "stringValue"
  |  , "key2": "stringValue"
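
The two leading-space strategies explored above (dropping the indentation of the first line, and using | as a margin) can be sketched as plain string post-processing. This is only an illustration of the intended result, not the prototype's lexer code; `stripIndentation` and `stripMargin` are hypothetical names, and the exact rules (tabs, blank lines, whether a space after | is also dropped) are assumptions.

```swift
// Hypothetical helper: removes from every line the leading whitespace
// found on the first non-blank line. Lines with less indentation are kept as-is.
func stripIndentation(_ text: String) -> String {
    let isBlank: (Character) -> Bool = { $0 == " " || $0 == "\t" }
    let lines = text.split(separator: "\n", omittingEmptySubsequences: false)
    guard let first = lines.first(where: { line in !line.allSatisfy(isBlank) }) else {
        return text
    }
    let indent = first.prefix(while: isBlank)
    return lines
        .map { line in
            line.starts(with: indent) ? String(line.dropFirst(indent.count)) : String(line)
        }
        .joined(separator: "\n")
}

// Hypothetical helper: drops the leading whitespace and the "|" margin character
// from each line that has one; everything after the "|" is kept verbatim.
func stripMargin(_ text: String) -> String {
    return text
        .split(separator: "\n", omittingEmptySubsequences: false)
        .map { line -> String in
            let trimmed = line.drop(while: { $0 == " " || $0 == "\t" })
            return trimmed.first == "|" ? String(trimmed.dropFirst()) : String(line)
        }
        .joined(separator: "\n")
}
```

Whether the compiler should apply such a rule while lexing, or leave it to a library call, is one of the open questions.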

I would appreciate any input on the feasibility and difficulty of pursuing something like the following (swift_sil being a built-in reserved attribute value):

@string_literal(swift_sil)  let att3 = _" embedded SIL contents?! "_

The train of thought is to try and identify a “simple” pathway toward something akin to Rust macros or Jai’s “do this during compilation” (a possible long-term replacement for .gyb?!)

> On May 7, 2016, at 8:20 PM, L Mihalkovic <laurent.mihalkovic at gmail.com> wrote:
> Please accept my apologies for the repeat… I seem to have more trouble with my emails than the brilliant codebase this team has produced.
> Best regards
> LM/
> ——————————
> Wanting to test the validity of some of the arguments I read on the main proposal, I worked on my own prototype. I think there is more freedom than seems to have been identified so far.
> The syntax I am exploring is visible here: https://gist.github.com/lmihalkovic/718d1b8f2ae6f7f6ba2ef8da07b64c1c
> There are still a couple of things that do not work:
> - serialization of the @string_literal attribute
> - type checker code for the @string_literal attribute
> - skipping leading spaces on each line, based on the indentation of the first line
> - removing some of the extra EOLs (rule to be defined)
> The following works:
> - comment before the literal data
> - @string_literal(“xxxx”). At the moment the attribute value is a string_literal; maybe an identifier would be better, and maybe it should be @string_literal(type: “xxxx”), so that other properties can be added. I persist in thinking that a lot of good can come from being able to tag the contents of a string literal (e.g. XML schema validation, custom syntax coloring, …)
> - the code is based on a string_multiline_literal tag to make this extension formally visible in the grammar
> - no need to prefix each line (although it will be possible to use | as a margin)

From a more technical standpoint, some of the choices were dictated by the desire to create a pathway inside the existing Lexer/Parser that would:
- accommodate today’s simple multiline string literal needs
- lay some clean foundations for future extensions
- keep the risks to a minimum
- open doors to simplify the work involved in creating other prototypes

As such, the prototype relies on the following alterations to the core logic of the parser/lexer:
- the introduction of a new string_multiline_literal
- the following change to the core logic of lexImpl():

  case 'o': case 'p': case 'q': case 'r': case 's': case 't': case 'u':
  case 'v': case 'w': case 'x': case 'y': case 'z':
  case '_':
+   if (CurPtr[-1] == '_' && CurPtr[0] == '"') {
+     return lexStringMultilineLiteral();
+   } else {
      return lexIdentifier();
+   }
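
To make the delimiter rule concrete outside of the compiler, here is a toy scanner in Swift that finds the first _" ... "_ pair in a piece of source text. It is only a sketch of the rule the patch keys on ('_' immediately followed by '"'); `firstMultilineLiteral(in:)` is a hypothetical name, and unlike the real lexer it does not check that the '_' starts a token rather than ending an identifier.

```swift
// Toy scanner, not the compiler's lexer: returns the contents of the first
// _" ... "_ literal found in `source`, or nil if there is no complete literal.
func firstMultilineLiteral(in source: String) -> String? {
    let chars = Array(source)
    var i = 0
    // Look for the opening pair: '_' immediately followed by '"'.
    while i + 1 < chars.count {
        if chars[i] == "_" && chars[i + 1] == "\"" {
            let start = i + 2
            var j = start
            // Look for the closing pair: '"' immediately followed by '_'.
            while j + 1 < chars.count {
                if chars[j] == "\"" && chars[j + 1] == "_" {
                    return String(chars[start..<j])
                }
                j += 1
            }
            return nil // opening delimiter without a matching close
        }
        i += 1
    }
    return nil
}
```

One consequence visible in the sketch: the contents can hold unescaped double quotes, but cannot contain the closing sequence "_ itself.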
