[swift-evolution] String update

John Holdsworth mac at johnholdsworth.com
Thu Jan 11 14:12:57 CST 2018


Look no further: in a loop context the subscript yields an actual iterator, and it is lazy:

for groups in props["(\\w+)\\s*=\\s*(.*)"] {
    params[String(groups[1]!)] = String(groups[2]!)
}

If you break halfway through, the remaining matches will not have been performed.
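
To make "lazy" concrete, here is a minimal sketch of how such an iterator could be built on NSRegularExpression. It is not the SwiftRegex4 implementation; the type name LazyMatches and its offset property are made up for illustration. Each call to next() runs exactly one search, so breaking out of the loop leaves the rest of the string unsearched.

```swift
import Foundation

// Hypothetical type for illustration only; it is not part of SwiftRegex4.
struct LazyMatches: Sequence, IteratorProtocol {
    let regex: NSRegularExpression
    let text: String
    var offset = 0   // UTF-16 offset at which the next search starts

    init(pattern: String, in text: String) throws {
        self.regex = try NSRegularExpression(pattern: pattern)
        self.text = text
    }

    // Performs a single search per call and returns that match's groups.
    mutating func next() -> [Substring?]? {
        let length = text.utf16.count
        guard offset <= length else { return nil }
        let remaining = NSRange(location: offset, length: length - offset)
        guard let match = regex.firstMatch(in: text, options: [], range: remaining) else {
            offset = length + 1   // mark the sequence as exhausted
            return nil
        }
        // Step past this match; move at least one unit so an empty match cannot loop forever.
        offset = match.range.location + Swift.max(match.range.length, 1)
        return (0..<match.numberOfRanges).map { i -> Substring? in
            // A group that did not participate in the match becomes nil.
            Range(match.range(at: i), in: text).map { text[$0] }
        }
    }
}

let props = "name1 = value1\nname2 = value2\nname3 = value3"
var params = [String: String]()
// try! is acceptable here only because the pattern is a known-good literal.
for groups in try! LazyMatches(pattern: "(\\w+)\\s*=\\s*(.*)", in: props) {
    params[String(groups[1]!)] = String(groups[2]!)
    if params.count == 2 { break }   // the third line is never searched
}
```

The trade-off is that every step pays for a fresh firstMatch call, whereas an exhaustive search can hand the whole string to the regex engine once.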

This is as opposed to the following, which would be exhaustive:

if let allGroupsOfAllMatches: [[Substring?]] = props["(\\w+)\\s*=\\s*(.*)"] {
    for groups in allGroupsOfAllMatches {
        params[String(groups[1]!)] = String(groups[2]!)
    }
}
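
The exhaustive form above is selected purely by the type annotation on the receiving variable. As a rough sketch of that mechanism (again an assumption-laden illustration, not the SwiftRegex4 source; the helper allGroups and the three subscripts are hypothetical), subscripts that differ only in return type can coexist, and the type checker picks one from context:

```swift
import Foundation

extension String {
    // Hypothetical helper: eagerly runs every match and collects each match's groups.
    private func allGroups(_ pattern: String) -> [[Substring?]] {
        guard let regex = try? NSRegularExpression(pattern: pattern) else { return [] }
        let whole = NSRange(startIndex..., in: self)
        return regex.matches(in: self, options: [], range: whole).map { match in
            (0..<match.numberOfRanges).map { i -> Substring? in
                Range(match.range(at: i), in: self).map { self[$0] }
            }
        }
    }

    // Chosen when a Bool is expected, e.g. in an `if` condition.
    subscript(pattern: String) -> Bool {
        return !allGroups(pattern).isEmpty
    }

    // Chosen when the receiver is a Substring: the whole first match.
    subscript(pattern: String) -> Substring? {
        guard let first = allGroups(pattern).first else { return nil }
        return first[0]
    }

    // Chosen when the receiver is [[Substring?]]: all groups of all matches.
    subscript(pattern: String) -> [[Substring?]]? {
        let all = allGroups(pattern)
        return all.isEmpty ? nil : all
    }
}

let sentence = "Now is the time"
let matched: Bool = sentence["\\w+"]
let firstWord: Substring? = sentence["\\w+"]
let everything: [[Substring?]]? = sentence["(\\w)(\\w*)"]
```

Without an annotation (for example `let x = sentence["\\w+"]`) the call is ambiguous and will not compile, which is the price of overloading on the result type.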


> On 11 Jan 2018, at 18:15, C. Keith Ray <keithray at mac.com> wrote:
> 
> That looks great. One thing I would look for is iterating over multiple matches in a string. I'd want to see lazy and non-lazy sequences.
> 
>     let wordMatcher = Regex(":w*") // or whatever matches word-characters 
>     // separated by non-word character.
> 
>      for w in aString[allMatches: wordMatcher] { print(w) }
> 
>      for w in warAndPeaceNovel[allMatchesLazy: wordMatcher].prefix(50) { print(w) }
> 
> --
> C. Keith Ray
> 
> * https://leanpub.com/wepntk <- buy my book?
> * http://www.thirdfoundationsw.com/keith_ray_resume_2014_long.pdf
> * http://agilesolutionspace.blogspot.com/
> 
> On Jan 11, 2018, at 9:50 AM, John Holdsworth via swift-evolution <swift-evolution at swift.org> wrote:
> 
>> Hi Michael,
>> 
>> Thanks for sending this through. It’s an interesting read. One section gave me pause, however. I feel Swift should resist the siren call of combining Swift syntax with regex syntax, as it falls on the wrong side of Occam's razor. ISO Regex syntax is complex enough without trying to incorporate named capture, desirable as it may be. Also, if I were to go down that route, I’d move away from / as the delimiter, which is a carry-over from Perl, to something like e"I am a regex" to give the lexer more to go on; this could represent, say, a cached instance of NSRegularExpression.
>> 
>> And now for something completely different...
>> 
>> Common usage patterns for a regex fall into four categories: deconstruction, replacement, iteration and switch/case. Ideally the representation of a regex match would be the same for all four of these categories, and I’d like to argue that a set of expressive regex primitives can be created without building them into the language.
>> 
>> I’ve talked before about a regex match being coded as a string/regex subscripting into a string, and I’ve been able to move this forward since last year. While this seems like an arbitrary operator to use, it has some semantic sense in that you are addressing a sub-part of the string with a pattern, as you might with an index or a key. Subscripts also have some very interesting properties in Swift compared to other operators or functions: you don’t have to worry about precedence, they can be assigned to and used as an iterator, and I've learned since my last email on this topic that the Swift type checker will disambiguate multiple subscript overloads on the basis of the type of the variable being assigned to.
>> 
>> An extension to String can now realise the common use cases by judicious use of types:
>> 
>> var input = "Now is the time for all good men to come to the aid of the party"
>> 
>> if input["\\w+"] {
>>     print("match")
>> }
>> 
>> // receiving type controls data you get
>> if let firstMatch: Substring = input["\\w+"] {
>>     print("match: \(firstMatch)")
>> }
>> 
>> if let groupsOfFirstMatch: [Substring?] = input["(all) (\\w+)"] {
>>     print("groups: \(groupsOfFirstMatch)")
>> }
>> 
>> // "splat" out up to N groups of first match
>> if let (group1, group2): (String, String) = input["(all) (\\w+)"] {
>>     print("group1: \(group1), group2: \(group2)")
>> }
>> 
>> if let allGroupsOfAllMatches: [[Substring?]] = input["(\\w)(\\w*)"] {
>>     print("allGroups: \(allGroupsOfAllMatches)")
>> }
>> 
>> // regex replace by assignment
>> input["men"] = "folk"
>> print(input)
>> 
>> // parsing a properties file using regex as iterator
>> let props = """
>>     name1 = value1
>>     name2 = value2
>>     """
>> 
>> var params = [String: String]()
>> for groups in props["(\\w+)\\s*=\\s*(.*)"] {
>>     params[String(groups[1]!)] = String(groups[2]!)
>> }
>> print(params)
>> 
>> The switch case is slightly more opaque, in order to avoid executing the match twice, but it is viable.
>> 
>> let match = RegexMatch()
>> switch input {
>> case RegexPattern("(\\w)(\\w*)", capture: match):
>>     let (first, rest) = input[match]
>>     print("\(first) \(rest)")
>> default:
>>     break
>> }
>> 
>> This is explored in the attached playground (repo: https://github.com/johnno1962/SwiftRegex4)
>> <SwiftRegex4.playground.zip>
>> 
>> I’m not sure I really expect this to take off as an idea but I’d like to make sure it's out there as an option and it certainly qualifies as “out there”.
>> 
>> John
>> 
>>> On 10 Jan 2018, at 19:58, Michael Ilseman via swift-evolution <swift-evolution at swift.org> wrote:
>>> 
>>> Hello, I just sent an email to swift-dev titled "State of String: ABI, Performance, Ergonomics, and You!" at https://lists.swift.org/pipermail/swift-dev/Week-of-Mon-20180108/006407.html, whose gist can be found at https://gist.github.com/milseman/bb39ef7f170641ae52c13600a512782f. I posted to swift-dev as much of the content is from an implementation perspective, but it also addresses many areas of potential evolution. Please refer to that email for details; here’s the recap from it:
>>> 
>>> ### Recap: Potential Additions for Swift 5
>>> 
>>> * Some form of unmanaged or unsafe Strings, and corresponding APIs
>>> * Exposing performance flags, and some way to request a scan to populate them
>>> * API gaps
>>> * Character and UnicodeScalar properties, such as isNewline
>>> * Generalizing, and optimizing, String interpolation
>>> * Regex literals, Regex type, and generalized pattern match destructuring
>>> * Substitution APIs, in conjunction with Regexes.
>>> 
>>> _______________________________________________
>>> swift-evolution mailing list
>>> swift-evolution at swift.org
>>> https://lists.swift.org/mailman/listinfo/swift-evolution
>> 
