[swift-evolution] Empower String type with regular expression

Howard Lovatt howard.lovatt at gmail.com
Tue Feb 2 16:55:48 CST 2016


The difference is that I am proposing to support both verbal expressions
and regex literals, with literals converted to verbals and all processing
done at the verbal level. The reason is that verbals are easy to handle
programmatically, whilst literals are great for quickly specifying a regex.
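
A minimal sketch of that literal-to-verbal conversion, written in current
Swift. The names Verbal, verbals(fromLiteral:) and literal(from:) are
hypothetical and not part of any proposal or library, only a tiny subset of
regex syntax is recognised, and no matching is implemented; the point is only
that the literal is parsed once and everything afterwards works on the
element array.

```swift
// Hypothetical sketch: none of these names come from a real library.
enum Verbal: Equatable {
    case startOfLine         // written "^" in a literal
    case endOfLine           // written "$" in a literal
    case anyCharacter        // written "." in a literal
    case literal(Character)  // any other character, matched as itself

    // Render one element back to regex-literal syntax.
    var pattern: String {
        switch self {
        case .startOfLine: return "^"
        case .endOfLine: return "$"
        case .anyCharacter: return "."
        case .literal(let c):
            // Escape characters that would otherwise be read as syntax.
            return "^$.\\".contains(c) ? "\\\(c)" : String(c)
        }
    }
}

// Convert the literal form to the verbal form, one element at a time.
// A backslash turns the following character into a plain literal.
func verbals(fromLiteral literal: String) -> [Verbal] {
    var result: [Verbal] = []
    var escaped = false
    for c in literal {
        if escaped {
            result.append(.literal(c))
            escaped = false
        } else if c == "\\" {
            escaped = true
        } else if c == "^" {
            result.append(.startOfLine)
        } else if c == "$" {
            result.append(.endOfLine)
        } else if c == "." {
            result.append(.anyCharacter)
        } else {
            result.append(.literal(c))
        }
    }
    // A trailing lone backslash is kept as a literal backslash.
    if escaped { result.append(.literal("\\")) }
    return result
}

// Convert the verbal form back to a literal string.
func literal(from verbals: [Verbal]) -> String {
    return verbals.map { $0.pattern }.joined()
}

// Quick specification as a literal, then programmatic manipulation.
var elements = verbals(fromLiteral: "^main\\.c")
elements.append(.endOfLine)
print(literal(from: elements)) // prints "^main\.c$"
```

Appending .endOfLine to the array is the kind of manipulation that is awkward
on the literal string but trivial on the verbal form.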

On Tuesday, 2 February 2016, Patrick Gili <gili.patrick.r at gili-labs.com>
wrote:

> Hi Howard,
>
> I don't see how this is very different from the Swift Verbal Expressions.
> It would suffer from the same disadvantages I have stated previously.
>
> Cheers,
> -Patrick
>
> On Feb 2, 2016, at 1:51 AM, Howard Lovatt via swift-evolution <swift-evolution at swift.org> wrote:
>
> Others have suggested a programmatic regex instead of a regex literal; how
> about doing both? Something like:
>
> enum RegexElement {
>     case capture(name: String, value: String)
>     case special(Special)
>     // ...
>     enum Special: String {
>         case startOfLine = "^"
>         // ...
>         case endOfLine = "$"
>     }
> }
>
> // Define a regexLiteral syntax that the compiler understands that is of
> // type Regex and consists of String representations of RegexElements, e.g.
> // using forward slash:
> //    /<RegexElements>*/
>
> // Compiled, immutable, thread safe, and bridged to NSRegularExpression
> struct Regex: CustomStringConvertible {
>     // ... internal compiled representation
>     let elements: [RegexElement]
>     var description: String {
>         // Example. Really returns all the elements converted back to a string
>         return RegexElement.Special.startOfLine.rawValue
>     }
>     init(_ elements: RegexElement...) {
>         // Example. Really also compiles the expression
>         self.elements = elements
>     }
>     // init(regexLiteral regex: Regex) {
>     // init(concatAll regexes: Regex...) {
>     // init(fromString string: String) {
>     // ... more inits
>     func map<T>(input: String, @noescape mapper: (element: RegexElement) throws -> T) rethrows -> [T] {
>         // Example. Really does the matching
>         return [try mapper(element: RegexElement.special(.startOfLine))]
>     }
>     // func flatMap<T>(input: String, @noescape mapper: (element: RegexElement) throws -> T?) rethrows -> [T] {
>     // func flatMap<S: SequenceType>(input: String, @noescape mapper: (element: RegexElement) throws -> S) rethrows -> [S.Generator.Element] {
>     // func forEach(input: String, @noescape eacher: (element: RegexElement) throws -> Void) rethrows {
>     // ... more funcs
> }
>
> let regex = Regex(RegexElement.special(.startOfLine)) // Normally a regex literal
> // Returns `["^"]` in example
> let asStringArray = regex.map("Example") { element -> String in
>     switch element {
>     case let .capture(_, v): return v
>     case let .special(s): return s.rawValue
>     }
> }
>
>
> The advantages are:
>
>    1.   We get a literal type for convenience.
>    2.   We get a programmatic type when we need to manipulate regexes.
>    3.   Breaking the regex matches into the enum-defined elements of the
>    regex works well with Swift pattern matching.
>
> (Above is a very rough sketch!)
>
>
> On 2 February 2016 at 16:44, Thorsten Seitz via swift-evolution <swift-evolution at swift.org> wrote:
>
>> Something like Scala's extractors or F#'s Active Patterns would be most
>> welcome to generalize pattern matching.
>>
>> http://docs.scala-lang.org/tutorials/tour/extractor-objects.html
>> https://en.m.wikibooks.org/wiki/F_Sharp_Programming/Active_Patterns
>>
>> -Thorsten
>>
>> On 1 Feb 2016, at 15:46, James Campbell via swift-evolution <swift-evolution at swift.org> wrote:
>>
>> It would be great if we could create a generic way of making this Swifty.
>> You might, say, want to implement a matching system for structures like
>> JSON or XML (e.g. XQuery).
>>
>> James
>> Lead Engineer, Sup
>> james at supmenow.com, supmenow.com
>> Runway East, 10 Finsbury Square, London EC2A 1AF
>>
>> On Mon, Feb 1, 2016 at 2:43 PM, Patrick Gili via swift-evolution <swift-evolution at swift.org> wrote:
>>
>>> Hi Dany,
>>>
>>> My response is inline below.
>>>
>>> Cheers,
>>> -Patrick
>>>
>>> On Jan 31, 2016, at 8:56 PM, Dany St-Amant <dsa.mls at icloud.com> wrote:
>>>
>>>
>>> On 31 Jan 2016, at 16:46, Patrick Gili <gili.patrick.r at gili-labs.com> wrote:
>>>
>>> Hi Dany,
>>>
>>> Please find my response inline below.
>>>
>>> Cheers,
>>> -Patrick
>>>
>>> On Jan 31, 2016, at 3:46 PM, Dany St-Amant via swift-evolution <swift-evolution at swift.org> wrote:
>>>
>>> This seems to be two proposals in one:
>>> 1. Initialize NSRegularExpression with a single String which includes
>>> options
>>>
>>> The ultimate goal, based on the earlier mail in the thread, seems to be to
>>> enable, in a future proposal, things like: string ~= replacePattern, if
>>> string =~ pattern, decoupled from the legacy Obj-C. Isn’t
>>> NSRegularExpression part of the legacy? The conversion of the literal
>>> string into a regular expression should probably be part of the proposal
>>> for these operators, as that is when we will know how we want the text to
>>> be interpreted.
>>>
>>>
>>> I don't see any evidence of NSRegularExpression becoming part of any
>>> legacy. Given SE-005, SE-006, and SE-023, the name is probably changing
>>> from NSRegularExpression to RegularExpression. However, I don't think the
>>> definition of the class will change, only the name.
>>>
>>> I would like to see a regular expression matching operator, as in Ruby
>>> and Perl. I was trying to keep the proposal a minimal increment
>>> that would buy the biggest bang for the buck. We can already accomplish
>>> much of what other languages can do with regard to regular expressions.
>>> However, the notion of a regular expression isn't something we can work
>>> around with a custom library today. Can you suggest something additional
>>> that should be in the proposal?
>>>
>>>
>>> Splitting a proposal into smaller ones has its advantages, but here I am
>>> just wondering whether we are sure that these future operations will use
>>> NSRegularExpression/RegularExpression. And does the currently selected
>>> syntax allow for future expansion? It would be bad to introduce something
>>> that needs to be torn out or changed in an incompatible way once we
>>> really start to use it in its final location.
>>>
>>> The proposal is focused on search, but seems to skip substitution; I am
>>> unable to see an option in the proposal to replace all matches instead
>>> of only the first one. I, like many others, would expect regular
>>> expressions in a language to also support substitution.
>>>
>>> As for additions to the proposal, the string processing could support
>>> any character (within some limits) in place of the slash delimiter. With
>>> sed, when replacing a path component, one can write: echo $PWD | sed -e
>>> "s:^/usr/local/bin:/opt/share/bin:g", instead of escaping every single
>>> slash, which is really handy for making things easier to read.
>>>
>>> Also, putting aside that I think \(scheme) should not be interpreted in
>>> the example, with a syntax allowing such interpretation the variable should
>>> be processed to generate proper escaping. If one uses \(filename), one
>>> gets "main.c", but one must use \(filename.escaped()) to get the proper
>>> "main\.c" and avoid matching "mainac". The output of String.escaped() must
>>> be compatible with the format used when converting the regular expression
>>> into an NSRegularExpression (I am not sure the two syntaxes are the same;
>>> I think that at least the handling of / may differ).
>>>
>>>
>>> I agree. Perhaps I went too far with keeping the proposal
>>> short and sweet, especially when you consider the rich syntax that Perl
>>> supports for substitution.
>>>
>>>
>>> 2. Easily create a String without escaping (\n is not linefeed, but \
>>> and n)
>>>
>>> The ability to not interpret the backslash as an escape can be useful in
>>> scenarios other than creating an NSRegularExpression, such as creating a
>>> Windows pathname, or creating regular expressions which are then given to
>>> an external tool. So this part of the proposal should probably be
>>> generalized.
>>>
>>>
>>> Generalize it for what? If you're thinking along the lines of raw
>>> strings, I agree that we need this capability, as well as multi-line string
>>> literals. However, I would just as soon have separate proposals for this.
>>>
>>>
>>> My point/opinion here is that regular expressions are just Strings
>>> which are then interpreted, the same way that "Good Morning", "Bonjour", or
>>> "Marhaba" (even when using the Arabic script) are just Strings when you
>>> assign them to a variable in Swift, and are then interpreted by the
>>> intended user. They are not String, frenchString, rightToLeftString. So I
>>> do not see why a regular expression should get privileged treatment and
>>> have its own language-level syntax. The only difference when writing a
>>> regular expression, a Windows pathname, or any String whose syntax makes
>>> heavy use of backslashes is that one may want to disable the special
>>> meaning of the backslash to make things more readable.
>>>
>>> On the subject of geeking up the String, there are four main parts IMHO:
>>> - multi-line support
>>> - a version without backslash escaping (which should also not process the
>>> \(variable) format)
>>> - inclusion of the String delimiter inside the String
>>> - concatenation of the backslash and no-backslash versions. Bash example:
>>> echo 'echo "$BASH" shows '"$BASH"
>>>
>>> I’m still trying to track down the earlier mail threads on these topics,
>>> since before restarting the discussion, the previous one should be
>>> properly summarized, unless such a summary already exists.
>>>
>>>
>>> I think supporting interpolation is important. Both Perl and Ruby
>>> support it, and I'm sure there are other languages that do. One thing I
>>> forgot to put into the proposal: an option to disable interpolation or
>>> limit it to a single pass.
>>>
>>> Looking ahead at the other responses, Chris Lattner has suggested that
>>> the proposal would have more traction if we can find a way to fold this
>>> into Swift's pattern matching. I can't say that I disagree, as this makes
>>> regular expressions more Swifty.
>>>
>>>
>>> Regards,
>>> Dany
>>>
>>> Dany
>>>
>>> On 31 Jan 2016, at 12:18, Patrick Gili via swift-evolution <swift-evolution at swift.org> wrote:
>>>
>>> Here is the link to the proposal on GitHub:
>>>
>>>
>>> https://github.com/gili-patrick-r/swift-evolution/blob/master/proposals/NNNN-regular-expression-literals.md
>>>
>>> Cheers,
>>> -Patrick
>>>
>>>
>>> _______________________________________________
>>> swift-evolution mailing list
>>> swift-evolution at swift.org
>>> https://lists.swift.org/mailman/listinfo/swift-evolution
>>>
>>>
>>>
>
>
> --
>   -- Howard.

-- 
  -- Howard.