[swift-evolution] String update

Eneko Alonso eneko.alonso at gmail.com
Tue Jan 16 13:24:49 CST 2018


Thank you for the reply. The part I didn’t understand is whether giving names to the captured groups would be mandatory. Hopefully not.

Assuming the user does not need names, the groups could be captured in an unlabeled tuple.

Digits could always be inferred to be numeric (Int) and they should always be “exact” (to match "\d"):

let usPhoneNumber: Regex = (.digits(3) + "-").oneOrZero + .digits(3) + "-" + .digits(4)

Personally, I like `.optional` better than `.oneOrZero`:

let usPhoneNumber = Regex.optional(.digits(3) + "-") + .digits(3) + "-" + .digits(4)
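
To make the `.optional` spelling concrete, here is a minimal sketch of how such a builder could be layered on today's NSRegularExpression. The `Pattern` type and its `digits`, `optional`, and `captures(in:)` members are hypothetical names made up for illustration, not part of any proposal, and captures come back as an untyped `[String?]` rather than the inferred `Int` tuple discussed above.

```swift
import Foundation

// Hypothetical builder type; none of these names come from an actual proposal.
struct Pattern: ExpressibleByStringLiteral {
    // The regular-expression source this value stands for.
    let source: String

    init(source: String) { self.source = source }

    // A string literal matches itself, so its metacharacters are escaped.
    init(stringLiteral value: String) {
        self.source = NSRegularExpression.escapedPattern(for: value)
    }

    // Exactly `count` digits, captured as one group (the `\d{n}` case).
    static func digits(_ count: Int) -> Pattern {
        return Pattern(source: "(\\d{\(count)})")
    }

    // Zero or one occurrence of `inner`, wrapped in a non-capturing group.
    static func optional(_ inner: Pattern) -> Pattern {
        return Pattern(source: "(?:\(inner.source))?")
    }

    // Concatenation: match `lhs`, then `rhs`.
    static func + (lhs: Pattern, rhs: Pattern) -> Pattern {
        return Pattern(source: lhs.source + rhs.source)
    }

    // Captured groups for a whole-string match, or nil when there is no match.
    // A group that did not participate in the match is reported as nil.
    func captures(in text: String) -> [String?]? {
        guard let regex = try? NSRegularExpression(pattern: "^" + source + "$") else { return nil }
        let whole = NSRange(text.startIndex..., in: text)
        guard let match = regex.firstMatch(in: text, options: [], range: whole) else { return nil }
        return (1..<match.numberOfRanges).map { index in
            Range(match.range(at: index), in: text).map { String(text[$0]) }
        }
    }
}

// Spelled with explicit `Pattern.` prefixes to keep type checking simple.
let areaCode = Pattern.optional(Pattern.digits(3) + "-")
let usPhoneNumber = areaCode + Pattern.digits(3) + "-" + Pattern.digits(4)

let full = usPhoneNumber.captures(in: "415-555-1234")   // three captures, all present
let short = usPhoneNumber.captures(in: "555-1234")      // first capture is nil
let bad = usPhoneNumber.captures(in: "5551234")         // no match at all
```

Wrapping the optional part in a non-capturing group keeps the capture count equal to the number of `.digits` calls, which is what would let a typed version map captures onto tuple elements one to one.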

Would it be possible to support both condensed and extended syntax? 

let usPhoneNumber = / (\d{3} + "-")? + (\d{3}) + "-" + (\d{4}) /

Maybe only extended (verbose) syntax would support named groups?
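
For comparison, this is roughly what named groups look like with the existing Foundation API, where the name lives inside the pattern string and is looked up with `range(withName:)`. It is only a sketch: the `parsePhone` helper and the `phonePattern` constant are made-up names, and the labeled tuple it returns is assembled by hand rather than inferred from the pattern.

```swift
import Foundation

// Named groups keep each label next to the thing it captures.
let phonePattern = "^(?:(?<area>\\d{3})-)?(?<routing>\\d{3})-(?<local>\\d{4})$"

// Hypothetical helper: returns a labeled tuple, or nil when `text` does not match.
func parsePhone(_ text: String) -> (area: Int?, routing: Int, local: Int)? {
    guard let regex = try? NSRegularExpression(pattern: phonePattern),
          let match = regex.firstMatch(in: text, options: [],
                                       range: NSRange(text.startIndex..., in: text))
    else { return nil }

    // Look a group up by name and convert its digits to Int.
    // A group that did not participate yields nil.
    func number(_ name: String) -> Int? {
        guard let range = Range(match.range(withName: name), in: text) else { return nil }
        return Int(text[range])
    }

    guard let routing = number("routing"), let local = number("local") else { return nil }
    return (area: number("area"), routing: routing, local: local)
}

let withArea = parsePhone("415-555-1234")
let withoutArea = parsePhone("555-1234")
```

The string-to-Int conversion and the optionality of `area` both have to be written out manually here, which is the boilerplate a typed Regex would presumably remove.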

Eneko


> On Jan 16, 2018, at 10:01 AM, George Leontiev <georgeleontiev at gmail.com> wrote:
> 
> @Eneko While it sure seems possible to specify the type, I think this would go against the salient point “If something’s worth capturing, it’s worth giving it a name.” Putting the name further away seems like a step backward.
> 
> 
> I could imagine a slightly more succinct syntax where things like .numberFromDigits are replaced by protocol conformance of the bound type:
> ```
> extension Int: Regexable {
>     func baseRegex<T>() -> Regex<T, Int>
> }
> let usPhoneNumber = (/let area: Int/.exactDigits(3) + "-").oneOrZero +
>                     /let routing: Int/.exactDigits(3) + "-" +
>                     /let local: Int/.exactDigits(4)
> ```
> 
> In this model, the `//` syntax will only be used for initial binding and swifty transformations will build the final regex.
> 
> 
>> On Jan 16, 2018, at 9:20 AM, Eneko Alonso via swift-evolution <swift-evolution at swift.org> wrote:
>> 
>> Could it be possible to specify the regex type ahead avoiding having to specify the type of each captured group?
>> 
>> let usPhoneNumber: Regex<UnicodeScalar, (area: Int?, routing: Int, local: Int)> = /
>>   (\d{3}?) -
>>   (\d{3}) -
>>   (\d{4}) /
>> 
>> “Verbose” alternative:
>> 
>> let usPhoneNumber: Regex<UnicodeScalar, (area: Int?, routing: Int, local: Int)> = /
>>   .optional(.numberFromDigits(.exactly(3)) + "-") +
>>   .numberFromDigits(.exactly(3)) + "-" +
>>   .numberFromDigits(.exactly(4)) /
>> print(type(of: usPhoneNumber)) // => Regex<UnicodeScalar, (area: Int?, routing: Int, local: Int)>
>> 
>> 
>> Thanks,
>> Eneko
>> 
>> 
>>> On Jan 16, 2018, at 8:52 AM, George Leontiev via swift-evolution <swift-evolution at swift.org> wrote:
>>> 
>>> Thanks, Michael. This is very interesting!
>>> 
>>> I wonder if it is worth considering (for lack of a better word) *verbose* regular expressions for Swift.
>>> 
>>> For instance, your example:
>>> ```
>>> let usPhoneNumber = /
>>>   (let area: Int? <- \d{3}?) -
>>>   (let routing: Int <- \d{3}) -
>>>   (let local: Int <- \d{4}) /
>>> ```
>>> would become something like (strawman syntax):
>>> ```
>>> let usPhoneNumber = /let area: Int? <- .numberFromDigits(.exactly(3))/ + "-" +
>>>                     /let routing: Int <- .numberFromDigits(.exactly(3))/ + "-" +
>>>                     /let local: Int <- .numberFromDigits(.exactly(4))/
>>> ```
>>> With this format, I also noticed that your code wouldn't match "555-5555", only "-555-5555", so maybe it would end up being something like:
>>> ```
>>> let usPhoneNumber = .optional(/let area: Int <- .numberFromDigits(.exactly(3))/ + "-") +
>>>                     /let routing: Int <- .numberFromDigits(.exactly(3))/ + "-" +
>>>                     /let local: Int <- .numberFromDigits(.exactly(4))/
>>> ```
>>> Notice that `area` is initially a non-optional `Int`, but becomes optional when transformed by the `optional` directive.
>>> Other directives may be:
>>> ```
>>> let decimal = /let beforeDecimalPoint: Int <-- .numberFromDigits(.oneOrMore)/ +
>>>               .optional("." + /let afterDecimalPoint: Int <-- .numberFromDigits(.oneOrMore)/)
>>> ```
>>> 
>>> In this world, the `/<--/` format will only be used for explicit binding, and the rest will be inferred from generic `+` operators.
>>> 
>>> 
>>> I also think it would be helpful if `Regex` was generic over all sequence types.
>>> Going back to the phone example, this would look something like:
>>> ```
>>> let usPhoneNumber = .optional(/let area: Int <- .numberFromDigits(.exactly(3))/ + "-") +
>>>                     /let routing: Int <- .numberFromDigits(.exactly(3))/ + "-" +
>>>                     /let local: Int <- .numberFromDigits(.exactly(4))/
>>> print(type(of: usPhoneNumber)) // => Regex<UnicodeScalar, (area: Int?, routing: Int, local: Int)>
>>> ```
>>> Note the addition of `UnicodeScalar` to the signature of `Regex`. Other interesting signatures are `Regex<JSONToken, JSONEnumeration>` or `Regex<HTTPRequestHeaderToken, HTTPRequestHeader>`. Building parsers becomes fun!
>>> 
>>> - George
>>> 
>>>> On Jan 10, 2018, at 11:58 AM, Michael Ilseman via swift-evolution <swift-evolution at swift.org> wrote:
>>>> 
>>>> Hello, I just sent an email to swift-dev titled “State of String: ABI, Performance, Ergonomics, and You!” at https://lists.swift.org/pipermail/swift-dev/Week-of-Mon-20180108/006407.html, whose gist can be found at https://gist.github.com/milseman/bb39ef7f170641ae52c13600a512782f. I posted to swift-dev as much of the content is from an implementation perspective, but it also addresses many areas of potential evolution. Please refer to that email for details; here’s the recap from it:
>>>> 
>>>> ### Recap: Potential Additions for Swift 5
>>>> 
>>>> * Some form of unmanaged or unsafe Strings, and corresponding APIs
>>>> * Exposing performance flags, and some way to request a scan to populate them
>>>> * API gaps
>>>> * Character and UnicodeScalar properties, such as isNewline
>>>> * Generalizing, and optimizing, String interpolation
>>>> * Regex literals, Regex type, and generalized pattern match destructuring
>>>> * Substitution APIs, in conjunction with Regexes.
>>>> 
>>>> _______________________________________________
>>>> swift-evolution mailing list
>>>> swift-evolution at swift.org
>>>> https://lists.swift.org/mailman/listinfo/swift-evolution
