[swift-evolution] [Draft] Fix ExpressibleByStringInterpolation

Fri Mar 10 18:44:50 CST 2017

> On Mar 10, 2017, at 8:49 AM, Joe Groff <jgroff at apple.com> wrote:
> 
> I think there's a more powerful alternative design you should also consider. If the protocol looked like this:
> 
> protocol ExpressibleByStringInterpolation: ExpressibleByStringLiteral {
>   associatedtype LiteralSegment: ExpressibleByStringLiteral
>   associatedtype InterpolatedSegment
>   init(forStringInterpolation: Void)
> 
>   mutating func append(literalSegment: LiteralSegment)
>   mutating func append(interpolatedSegment: InterpolatedSegment)
> }
> 
> Then an interpolation expression like this in `Thingy` type context:
> 
> "foo \(bar) bas \(zim: 1, zang: 2)\n"
> 
> could desugar to something like:
> 
> {
>   var x = Thingy(forStringInterpolation: ())
>   // Literal segments get appended using append(literalSegment: "literal")
>   x.append(literalSegment: "foo ")
>   // \(...) segments are arguments to a InterpolatedSegment constructor
>   x.append(interpolatedSegment: Thingy.InterpolatedSegment(bar))
>   x.append(literalSegment: " bas ")
>   x.append(interpolatedSegment: Thingy.InterpolatedSegment(zim: 1, zang: 2))
> 
>   return x
> }()
> 
> This design should be more efficient, since there's no temporary array of segments that needs to be formed for a variadic argument, you don't need to homogenize everything to Self type up front, and the string can be built up in-place. It also provides means to address problems 3 and 4, since the InterpolatedSegment associated type can control what types it's initializable from, and can provide initializers with additional arguments for formatting or other purposes.

On the other hand, you end up with an `init(forStringInterpolation: ())` initializer which is explicitly intended to return an incompletely initialized instance. I don't enjoy imagining this. For instance, you might find yourself having to change certain properties from `let` to `var` so that the `append` methods can operate.
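
To make that concern concrete, here is a minimal sketch of a conforming type, written against a stand-in for the protocol quoted above. The names `AppendingInterpolation` and `Message` are hypothetical, chosen only so nothing collides with the real standard library protocol, and the desugaring is written out by hand rather than produced by the compiler.

```swift
// Hypothetical stand-in for the quoted protocol (not the standard library's).
protocol AppendingInterpolation {
    associatedtype LiteralSegment
    associatedtype InterpolatedSegment
    init(forStringInterpolation: Void)
    mutating func append(literalSegment: LiteralSegment)
    mutating func append(interpolatedSegment: InterpolatedSegment)
}

struct Message: AppendingInterpolation {
    // This would naturally be `let`, but it has to be `var`
    // so that the append methods can mutate it after init.
    private(set) var text: String

    init(forStringInterpolation: Void) {
        text = ""  // Deliberately incomplete until every segment is appended.
    }

    mutating func append(literalSegment: String) {
        text += literalSegment
    }

    mutating func append(interpolatedSegment: Int) {
        text += String(interpolatedSegment)
    }
}

// Hand-written desugaring of "count: \(3)!" in `Message` type context.
let message: Message = {
    var x = Message(forStringInterpolation: ())
    x.append(literalSegment: "count: ")
    x.append(interpolatedSegment: 3)
    x.append(literalSegment: "!")
    return x
}()
```

Between the `init` call and the last `append`, `x` is a valid `Message` that does not yet hold the value the literal describes, which is the part I find uncomfortable.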

If we *do* go this direction, though, I might suggest a slightly different design which uses fewer calls and makes the finalization explicit:

	protocol ExpressibleByStringLiteral {
		associatedtype StringLiteralSegment: ExpressibleByStringLiteral

		init(startingStringLiteral: ())
		mutating func endStringLiteral(with segment: StringLiteralSegment)
	}
	protocol ExpressibleByStringInterpolation: ExpressibleByStringLiteral {
		associatedtype StringInterpolationSegment

		mutating func continueStringLiteral(with literal: StringLiteralSegment, followedBy interpolation: StringInterpolationSegment)
	}

Your `"foo \(bar) bas \(zim: 1, zang: 2)\n"` example would then become:

	{
		var x = Thingy(startingStringLiteral: ())
		x.continueStringLiteral(with: "foo ", followedBy: .init(bar))
		x.continueStringLiteral(with: " bas ", followedBy: .init(zim: 1, zang: 2))
		x.endStringLiteral(with: "\n")
		return x
	}()

A plain old string literal would follow a more complicated pattern than it does today, but one whose semantics would be completely compatible with an interpolated string:

	{
		var x = Thingy(startingStringLiteral: ())
		x.endStringLiteral(with: "Hello, world!")
		return x
	}()
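
As a rough, runnable sketch of this start/continue/end shape, assuming the `continue` and `end` methods are declared `mutating` so they can build the value in place: the protocol names `StartEndLiteral` and `StartEndInterpolation` and the type `Shout` are hypothetical stand-ins, and the `isFinished` flag exists only to show where explicit finalization happens.

```swift
// Hypothetical stand-ins for the two protocols sketched above.
protocol StartEndLiteral {
    associatedtype StringLiteralSegment
    init(startingStringLiteral: Void)
    mutating func endStringLiteral(with segment: StringLiteralSegment)
}

protocol StartEndInterpolation: StartEndLiteral {
    associatedtype StringInterpolationSegment
    mutating func continueStringLiteral(
        with literal: StringLiteralSegment,
        followedBy interpolation: StringInterpolationSegment
    )
}

struct Shout: StartEndInterpolation {
    private(set) var text = ""
    private(set) var isFinished = false  // Set only by the explicit end call.

    init(startingStringLiteral: Void) {}

    mutating func continueStringLiteral(with literal: String, followedBy interpolation: Int) {
        text += literal + String(interpolation)
    }

    mutating func endStringLiteral(with segment: String) {
        text += segment
        isFinished = true
    }
}

// Hand-written expansion of "a\(1)b\(2)c": one call per interpolation, plus the end call.
let interpolated: Shout = {
    var x = Shout(startingStringLiteral: ())
    x.continueStringLiteral(with: "a", followedBy: 1)
    x.continueStringLiteral(with: "b", followedBy: 2)
    x.endStringLiteral(with: "c")
    return x
}()

// A plain literal goes through the same start/end pair.
let plain: Shout = {
    var x = Shout(startingStringLiteral: ())
    x.endStringLiteral(with: "Hello, world!")
    return x
}()
```

A literal with N interpolations costs N + 2 calls here, compared with 2N + 2 for separate literal and interpolation appends.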

* * *

Another possible design would separate the intermediate type from the final one. For instance, suppose we had:

	protocol ExpressibleByStringInterpolation: ExpressibleByStringLiteral {
		associatedtype StringInterpolationBuffer = Self
		associatedtype StringInterpolationType

		static func makeStringLiteralBuffer(startingWith firstLiteralSegment: StringLiteralType) -> StringInterpolationBuffer
		static func appendInterpolationSegment(_ expr: StringInterpolationType, to stringLiteralBuffer: inout StringInterpolationBuffer)
		static func appendLiteralSegment(_ string: StringLiteralType, to stringLiteralBuffer: inout StringInterpolationBuffer)

		init(stringInterpolation buffer: StringInterpolationBuffer)
	}
	// Automatically provide a parent protocol conformance
	extension ExpressibleByStringInterpolation {
		init(stringLiteral: StringLiteralType) {
			let buffer = Self.makeStringLiteralBuffer(startingWith: stringLiteral)
			self.init(stringInterpolation: buffer)
		}
	}

Then your example would be:

	{
		var buffer = Thingy.makeStringLiteralBuffer(startingWith: "foo ")
		Thingy.appendInterpolationSegment(Thingy.StringInterpolationType(bar), to: &buffer)
		Thingy.appendLiteralSegment(" bas ", to: &buffer)
		Thingy.appendInterpolationSegment(Thingy.StringInterpolationType(zim: 1, zang: 2), to: &buffer)
		Thingy.appendLiteralSegment("\n", to: &buffer)

		return Thingy(stringInterpolation: buffer)
	}()

For arbitrary string types, `StringInterpolationBuffer` would probably be `Self`, but if you had something which could only create an instance of itself once the entire literal was gathered together, it could use `String` or `Array` or whatever else it wanted.
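
Here is a small sketch of that second case: a type that can only be built once the whole literal is known, and so uses `String` as its buffer. `BufferedInterpolation` and `Path` are hypothetical names, the associated type names are shortened for the example, and the expansion is written by hand.

```swift
// Hypothetical stand-in for the buffer-based protocol sketched above.
protocol BufferedInterpolation {
    associatedtype Buffer
    associatedtype Literal
    associatedtype Interpolation

    static func makeBuffer(startingWith first: Literal) -> Buffer
    static func appendInterpolationSegment(_ expr: Interpolation, to buffer: inout Buffer)
    static func appendLiteralSegment(_ string: Literal, to buffer: inout Buffer)

    init(stringInterpolation buffer: Buffer)
}

struct Path: BufferedInterpolation {
    // Stays `let`: the value is created in one step from the finished buffer.
    let components: [String]

    static func makeBuffer(startingWith first: String) -> String {
        return first
    }

    static func appendInterpolationSegment(_ expr: Int, to buffer: inout String) {
        buffer += String(expr)
    }

    static func appendLiteralSegment(_ string: String, to buffer: inout String) {
        buffer += string
    }

    init(stringInterpolation buffer: String) {
        components = buffer.split(separator: "/").map { String($0) }
    }
}

// Hand-written expansion of "users/\(42)/posts" in `Path` type context.
let path: Path = {
    var buffer = Path.makeBuffer(startingWith: "users/")
    Path.appendInterpolationSegment(42, to: &buffer)
    Path.appendLiteralSegment("/posts", to: &buffer)
    return Path(stringInterpolation: buffer)
}()
```

No partially built `Path` ever exists; only the buffer is mutated.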

* * *

One more design possibility. Would it make sense to handle all the segments in a single initializer call, instead of having one call for each segment, plus a big call at the end? Suppose we did this:

	protocol ExpressibleByStringInterpolation: ExpressibleByStringLiteral {
		associatedtype StringInterpolationType

		init(stringInterpolation segments: StringInterpolationSegment...)
	}
	@fixed_layout enum StringInterpolationSegment<StringType: ExpressibleByStringInterpolation> {
		case literal(StringType.StringLiteralType)
		case interpolation(StringType.StringInterpolationType)
	}
	extension ExpressibleByStringInterpolation {
		typealias StringInterpolationSegment = Swift.StringInterpolationSegment<Self>

		init(stringLiteral: StringLiteralType) {
			self.init(stringInterpolation: .literal(stringLiteral))
		}
	}

The generated code would look like this:

	Thingy(stringInterpolation:
		.literal("foo "),
		.interpolation(.init(bar)),
		.literal(" bas "),
		.interpolation(.init(zim: 1, zang: 2)),
		.literal("\n")
	)

I suppose it just depends on whether the array or the extra calls are more costly. (Well, it also depends on whether we want to be expanding single expressions into big, complex, multi-statement messes like we discussed before.)
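
For comparison, a self-contained sketch of the single-call shape. To keep it short it uses a simplified two-parameter enum, `Segment<Literal, Interpolation>`, instead of the `StringType`-generic enum above; `Segment` and `Query` are hypothetical names, and the call is written out by hand.

```swift
// Simplified, hypothetical version of the segment enum.
enum Segment<Literal, Interpolation> {
    case literal(Literal)
    case interpolation(Interpolation)
}

struct Query {
    let sql: String
    let parameters: [Int]

    // Every segment arrives in one variadic call, so both properties can stay `let`.
    init(stringInterpolation segments: Segment<String, Int>...) {
        var sql = ""
        var parameters: [Int] = []
        for segment in segments {
            switch segment {
            case .literal(let text):
                sql += text
            case .interpolation(let value):
                sql += "?"  // Interpolated values become bound parameters.
                parameters.append(value)
            }
        }
        self.sql = sql
        self.parameters = parameters
    }
}

// Hand-written expansion of "SELECT * FROM t WHERE a = \(1) AND b = \(2)".
let query = Query(stringInterpolation:
    .literal("SELECT * FROM t WHERE a = "),
    .interpolation(1),
    .literal(" AND b = "),
    .interpolation(2)
)
```

The whole literal stays a single expression, at the price of materializing the segment array.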

(Actually, I realized after writing this that you mentioned a similar design downthread. Oops.)

* * *

As for validation, which is mentioned downthread: I think we will really want plain old string literal validation to happen at compile time. Doing that in a general way means macros, so that's just not in the cards yet.

However, once we *do* have that, I think we can actually handle runtime-failable interpolated literals pretty easily. For this example, I'll assume we adopt the `StringInterpolationSegment`-enum-based option, but any of them could be adapted in the same way:

	protocol ExpressibleByFailableStringInterpolation: ExpressibleByStringLiteral {
		associatedtype StringInterpolationType

		init?(stringInterpolation: StringInterpolationSegment...)
	}
	extension ExpressibleByFailableStringInterpolation {
		typealias StringInterpolationSegment = Swift.StringInterpolationSegment<Self?>

		init(stringLiteral: StringLiteralType) {
			// Plain literals are assumed to have been validated at compile time, so this cannot fail.
			self.init(stringInterpolation: .literal(stringLiteral))!
		}
	}
	extension Optional: ExpressibleByStringInterpolation where Wrapped: ExpressibleByFailableStringInterpolation {
		typealias StringLiteralType = Wrapped.StringLiteralType
		typealias StringInterpolationType = Wrapped.StringInterpolationType

		init(stringInterpolation segments: StringInterpolationSegment...) {
			self = Wrapped(stringInterpolation: segments)
		}
	}

If we think we'd rather support throwing inits instead of failable inits, that could be supported directly by `ExpressibleByStringInterpolation` if we get typed `throws` and support `Never` as a "doesn't throw" type.
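
Leaving the protocol plumbing aside, the run-time behaviour I have in mind for the failable case can be sketched with an ordinary failable initializer. `Piece` and `Identifier` are hypothetical names, the validation rule (letters and underscores only) is invented for the example, and the calls are written by hand where the compiler would eventually generate them.

```swift
// Hypothetical, simplified segment type: both cases carry a String.
enum Piece {
    case literal(String)
    case interpolation(String)
}

struct Identifier {
    let name: String

    // Fails at run time if the assembled text is not a valid identifier.
    init?(stringInterpolation pieces: Piece...) {
        var name = ""
        for piece in pieces {
            switch piece {
            case .literal(let text), .interpolation(let text):
                name += text
            }
        }
        guard !name.isEmpty, name.allSatisfy({ $0.isLetter || $0 == "_" }) else {
            return nil
        }
        self.name = name
    }
}

// Hand-written expansions; each result has type `Identifier?`.
let good = Identifier(stringInterpolation: .literal("user_"), .interpolation("name"))
let bad = Identifier(stringInterpolation: .literal("user "), .interpolation("name"))
```

Here `good` holds a value and `bad` is `nil`, because the space in its literal segment fails the check.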

* * *

Related question: Can the construction of the variadic parameter array be optimized? For instance, the arity of any given call site is known at compile time; can the array buffer be allocated on the stack and somehow marked so that attempting to retain it (while copying the `Array` instance) will copy it into the heap? (Are we doing that already?) I suspect that would make variadic calls a lot cheaper, perhaps enough so that we just don't need to worry about this problem at all.

-- 
Brent Royal-Gordon
Architechies