[swift-evolution] Empower String type with regular expression

Dany St-Amant dsa.mls at icloud.com
Sun Jan 31 19:56:27 CST 2016

> On Jan 31, 2016, at 4:46 PM, Patrick Gili <gili.patrick.r at gili-labs.com> wrote:
> Hi Dany,
> Please find my response inline below.
> Cheers,
> -Patrick
>> On Jan 31, 2016, at 3:46 PM, Dany St-Amant via swift-evolution <swift-evolution at swift.org> wrote:
>> This seems to be two proposals in one:
>> 1. Initialize NSRegularExpression with a single String which includes options
>> The ultimate goal, based on the earlier mail in the thread, seems to be to allow, in a future proposal, things like: string ~= replacePattern, if string =~ pattern, decoupled from the legacy Obj-C. Isn’t NSRegularExpression part of the legacy? The conversion of the literal string to a regular expression should probably be part of the proposal for these operators, as that is when we will know how we want the text to be interpreted.
> I don't see any evidence of NSRegularExpression becoming part of any legacy. Given SE-005, SE-006, and SE-023, the name is probably changing from NSRegularExpression to RegularExpression. However, I don't think the definition of the class will change, only the name.
> I would like to see a regular expression matching operator, like Ruby and Perl. I was trying to keep the proposal a minimal increment that would buy the biggest bang for the buck. We can already accomplish much of what other languages can do with regard to regular expressions. However, the notion of a regular expression isn't something we can work around with a custom library today. Can you suggest something additional that should be in the proposal?

Splitting a proposal into smaller ones has its advantages, but here I am just wondering whether we are sure that these future operations will use NSRegularExpression/RegularExpression. And does the currently selected syntax allow for future expansion? It would be bad to introduce something that needs to be torn out or changed in an incompatible way once we really start to use it in its final location.

The proposal is focused on search, but seems to skip substitution; I am unable to see in the proposal an option to replace all matches instead of only the first one. I, like many others, would expect regular expressions in a language to also support substitution.
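
To make the first-versus-all distinction concrete, here is a minimal sketch built on the existing NSRegularExpression API, written in current Swift syntax rather than the syntax of this thread. The function names replaceAll and replaceFirst are hypothetical and are not part of the proposal; the proposal itself defines no such option.

```swift
import Foundation

// Hypothetical helper: replace every match of `pattern` in `input`.
func replaceAll(in input: String, pattern: String, template: String) throws -> String {
    let regex = try NSRegularExpression(pattern: pattern, options: [])
    let range = NSRange(input.startIndex..., in: input)
    return regex.stringByReplacingMatches(in: input, options: [], range: range, withTemplate: template)
}

// Hypothetical helper: replace only the first match of `pattern` in `input`.
func replaceFirst(in input: String, pattern: String, template: String) throws -> String {
    let regex = try NSRegularExpression(pattern: pattern, options: [])
    let range = NSRange(input.startIndex..., in: input)
    guard let match = regex.firstMatch(in: input, options: [], range: range),
          let matchRange = Range(match.range, in: input) else {
        return input  // No match: return the input unchanged.
    }
    // Expand the template ($0, $1, ...) against this single match.
    let replacement = regex.replacementString(for: match, in: input, offset: 0, template: template)
    return input.replacingCharacters(in: matchRange, with: replacement)
}
```

A literal syntax with a sed-like "g" flag would presumably choose between these two behaviours; that mapping is my assumption, not something the proposal states.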

As for additions to the proposal, the processing of the string could support any character (within some limits) as the delimiter in place of the slash. With sed, when replacing a path component, one can do: echo $PWD | sed -e "s:^/usr/local/bin:/opt/share/bin:g", instead of escaping every single slash. This is really handy for making things easier to read.
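
As a rough sketch of what delimiter-agnostic parsing could look like, the following splits a sed-style command on whatever character follows the leading "s". The Substitution type, the parseSubstitution function, and the particular limits on the delimiter (no letters, digits, whitespace, or backslash) are all my assumptions for illustration; none of them comes from the proposal.

```swift
// Hypothetical result type for a parsed "s<d>pattern<d>replacement<d>flags" command.
struct Substitution {
    let pattern: String
    let replacement: String
    let flags: String
}

// Hypothetical parser: the character right after "s" is taken as the delimiter.
func parseSubstitution(_ command: String) -> Substitution? {
    let chars = Array(command)
    guard chars.count >= 2, chars[0] == "s" else { return nil }
    let delimiter = chars[1]
    // Assumed limits on the delimiter; the real rule would need to be decided.
    guard !delimiter.isLetter, !delimiter.isNumber,
          !delimiter.isWhitespace, delimiter != "\\" else { return nil }

    var fields: [String] = [""]
    var index = 2
    while index < chars.count {
        let c = chars[index]
        if c == "\\", index + 1 < chars.count, chars[index + 1] == delimiter {
            // A backslash-escaped delimiter is kept as a literal delimiter character.
            fields[fields.count - 1].append(delimiter)
            index += 2
        } else if c == delimiter {
            fields.append("")  // Start the next field.
            index += 1
        } else {
            fields[fields.count - 1].append(c)
            index += 1
        }
    }
    // Exactly three fields are expected: pattern, replacement, flags.
    guard fields.count == 3 else { return nil }
    return Substitution(pattern: fields[0], replacement: fields[1], flags: fields[2])
}
```

With ":" as the delimiter, the slashes in the sed example above need no escaping at all.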

Also, putting aside that I think \(scheme) should not be interpreted in the example, with a syntax allowing such interpretation the variable should be processed to generate proper escaping. If one uses \(filename), one gets "main.c", but one must use \(filename.escaped()) to get the proper "main\.c" and avoid matching "mainac". String.escaped() must produce a format compatible with the one used when converting the regular expression into NSRegularExpression (I am not sure the two syntaxes are the same; I think that at least the handling of / may differ).
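
The escaping problem can be shown with what Foundation already offers. In this sketch, NSRegularExpression.escapedPattern(for:) stands in for the String.escaped() I mention above, which does not exist; matchesWhole is a hypothetical helper, and the code uses current Swift syntax.

```swift
import Foundation

let filename = "main.c"

// Existing Foundation API, used here as a stand-in for the hypothetical String.escaped().
let escaped = NSRegularExpression.escapedPattern(for: filename)

// Hypothetical helper: true when `pattern` matches the whole of `input`.
func matchesWhole(_ input: String, pattern: String) -> Bool {
    return input.range(of: "^(?:\(pattern))$", options: .regularExpression) != nil
}

// Unescaped, the "." matches any character, so "mainac" is accepted by mistake.
let unescapedMatchesWrongName = matchesWhole("mainac", pattern: filename)
// Escaped, only the literal file name is accepted.
let escapedMatchesWrongName = matchesWhole("mainac", pattern: escaped)
let escapedMatchesRealName = matchesWhole("main.c", pattern: escaped)
```

Whether a literal syntax should apply this escaping automatically to interpolated values is exactly the open question.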

>> 2. Easily create a String without escaping (\n is not linefeed, but \ and n)
>> The ability to not interpret the backslash as an escape can be useful in scenarios other than creating an NSRegularExpression, like creating a Windows pathname, or creating regular expressions which are then given to an external tool. So this part of the proposal should probably be generalized.
> Generalize it for what? If you're thinking along the lines of raw strings, I agree that we need this capability, as well as multi-line string literals. However, I would just as soon have separate proposals for this.

My point/opinion here is that regular expressions are just Strings which are then interpreted, the same way "Good Morning", "Bonjour", or "Marhaba" (even when using the Arabic script) are just Strings when you assign them to a variable in Swift, and are then interpreted by the intended user. They are not String, frenchString, rightToLeftString. So I do not see why a regular expression should get privileged treatment and have its own language-level syntax. The only difference when writing a regular expression, a Windows pathname, or any String whose syntax makes heavy use of backslashes is that one may want to disable the special meaning of the backslashes to make things more readable.

On the subject of geeking out the String, there are four main parts IMHO:
- multi-line support
- a version with no backslash escaping (which should also not process the \(variable) format)
- inclusion of the String delimiter inside the String
- concatenation of the backslash/no-backslash versions. Bash example: echo 'echo "$BASH" shows '"$BASH"
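
For illustration only, here is how the four points above can be written with literal forms that arrived in later Swift releases (multi-line literals in Swift 4, raw literals in Swift 5). Neither existed at the time of this thread, and the sample values are hypothetical.

```swift
// Hypothetical sample values, chosen only for illustration.
let name = "Dany"
let shellPath = "/bin/bash"

// 1. Multi-line support: line breaks inside the delimiters are kept.
let multiLine = """
first line
second line
"""

// 2. No backslash escaping: in a raw literal, "\n" and "\(name)" stay as written.
let rawPath = #"C:\new\table\(name)"#

// 3. The String delimiter inside the String, without escaping it.
let quoted = #"She said "hi""#

// 4. Concatenation of a raw part and an interpolated part,
//    mirroring the Bash example: echo 'echo "$BASH" shows '"$BASH"
let mixed = #"echo "$BASH" shows "# + "\(shellPath)"
```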

I’m still trying to track down the mail thread crumbs on these topics, since before restarting the discussion on them, the previous one should be properly summarized, unless such a summary already exists.


>> Dany
>>> On Jan 31, 2016, at 12:18 PM, Patrick Gili via swift-evolution <swift-evolution at swift.org> wrote:
>>> Here is the link to the proposal on GitHub:
>>> https://github.com/gili-patrick-r/swift-evolution/blob/master/proposals/NNNN-regular-expression-literals.md
>>> Cheers,
>>> -Patrick
