<html><head><meta http-equiv="Content-Type" content="text/html charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">Disclaimer: I am not well versed in the various complexities of regex implementations.<div class=""><br class=""></div><div class="">That being said, I would very much like to see better regex support in Swift. Preferably one that is easier to pick up than the NSRegularExpression of ObjC and possibly as easy to start using as python or ruby.<div class=""><br class=""></div><div class="">Just some things to consider for the implementation:</div><div class=""><br class=""></div><div class="">- Plain old searching for a match</div><div class="">- Multiple (named) capture groups</div><div class=""><span class="Apple-tab-span" style="white-space:pre">        </span>- I liked the suggestion of binding straight to variables (perhaps with closures)</div><div class="">- Replacements</div><div class="">- I’m also not a fan of using the backslash for instantiating a regex literal. 
Many other languages use the forward slash (off the top of my head, all I can think of is Ruby, but I’m sure there are others).</div><div class=""><br class=""></div><div class=""><br class=""><div><blockquote type="cite" class=""><div class="">On Aug 10, 2017, at 4:28 AM, Omar Charif via swift-evolution &lt;<a href="mailto:swift-evolution@swift.org" class="">swift-evolution@swift.org</a>&gt; wrote:</div><br class="Apple-interchange-newline"><div class=""><meta http-equiv="Content-Type" content="text/html charset=us-ascii" class=""><div style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><div class="">Hi Joshua,</div><div class=""><br class=""></div><div class="">I also feel a huge gap when it comes to string matching, whether it is&nbsp;implementation&nbsp;or performance.</div><div class="">I also wrote a library for string matching called StringMap, and it has been way more performant than regular expressions.</div><div class="">I recently proposed including it in Core Foundation, but it seems I was a bit late and the team was busy finishing their unimplemented functions, so they advised me to ship the code in SPM instead.</div><div class="">So I am also willing to work on this issue.</div><div class=""><br class=""></div><div class=""><br class=""></div><div class=""><br class=""></div><br class=""><div class=""><blockquote type="cite" class=""><div class="">On Aug 10, 2017, at 8:25 AM, Joshua Alvarado via swift-evolution &lt;<a href="mailto:swift-evolution@swift.org" class="">swift-evolution@swift.org</a>&gt; wrote:</div><br class="Apple-interchange-newline"><div class=""><meta http-equiv="Content-Type" content="text/html charset=us-ascii" class=""><div style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">Hey everyone,<div class=""><br class=""></div><div class="">I would like to pitch an implementation of Regex in Swift and gather all of your 
thoughts.</div><div class=""><br class=""></div><div class="">Motivation:</div><div class="">In the String Manifesto for Swift 4, addressing regular expressions was not in scope. Swift 5 would be a more fitting version to address the implementation of Regex in Swift. NSRegularExpression is a suitable solution for pattern matching, but its API is ill-suited to the future direction of Swift.</div><div class=""><br class=""></div><div class="">Implementation:</div><div class="">The regular expression API will be implemented by a Regex struct, which is a regular expression that you can apply to Unicode strings. The Regex struct will conform to RegexProtocol, a protocol for types that can represent a regular expression. ExpressibleByRegexLiteral will be used to initialize a regex from a literal, giving an easy-to-use syntax, and a Match struct will be used to represent a match found with a Regex.</div><div class=""><br class=""></div><div class="">Draft of implementation:</div><div class=""><br class=""></div><div class=""><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class=""><span style="font-variant-ligatures: no-common-ligatures; color: #36568a" class="">protocol</span><span style="font-variant-ligatures: no-common-ligatures" class=""> ExpressibleByRegexLiteral {</span></div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class=""><span style="font-variant-ligatures: no-common-ligatures" class="">&nbsp; &nbsp; </span><span style="font-variant-ligatures: no-common-ligatures; color: #36568a" class="">associatedtype</span><span style="font-variant-ligatures: no-common-ligatures" class=""> RegexLiteralType</span></div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo; min-height: 13px;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""></span><br class=""></div><div style="margin: 0px; font-size: 11px; line-height: normal; 
font-family: Menlo;" class=""><span style="font-variant-ligatures: no-common-ligatures" class="">&nbsp; &nbsp; </span><span style="font-variant-ligatures: no-common-ligatures; color: #36568a" class="">init</span><span style="font-variant-ligatures: no-common-ligatures" class="">(regexLiteral value: </span><span style="font-variant-ligatures: no-common-ligatures; color: #c35900" class="">Self</span><span style="font-variant-ligatures: no-common-ligatures" class="">.RegexLiteralType)</span></div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class=""><span style="font-variant-ligatures: no-common-ligatures" class="">}</span></div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo; min-height: 13px;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""></span><br class=""></div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo; color: rgb(207, 135, 36);" class=""><span style="font-variant-ligatures: no-common-ligatures" class="">// Structure of information about a match of regex on a string</span></div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class=""><span style="font-variant-ligatures: no-common-ligatures; color: #36568a" class="">struct</span><span style="font-variant-ligatures: no-common-ligatures" class=""> Match {</span></div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class=""><span style="font-variant-ligatures: no-common-ligatures" class="">&nbsp; &nbsp; </span><span style="font-variant-ligatures: no-common-ligatures; color: #36568a" class="">var</span><span style="font-variant-ligatures: no-common-ligatures" class=""> regex: </span><span style="font-variant-ligatures: no-common-ligatures; color: #c35900" class="">Regex</span></div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class=""><span 
style="font-variant-ligatures: no-common-ligatures" class="">&nbsp; &nbsp; </span><span style="font-variant-ligatures: no-common-ligatures; color: #36568a" class="">var</span><span style="font-variant-ligatures: no-common-ligatures" class=""> start: </span><span style="font-variant-ligatures: no-common-ligatures; color: #c35900" class="">String</span><span style="font-variant-ligatures: no-common-ligatures" class="">.</span><span style="font-variant-ligatures: no-common-ligatures; color: #c35900" class="">Index</span></div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class=""><span style="font-variant-ligatures: no-common-ligatures" class="">&nbsp; &nbsp; </span><span style="font-variant-ligatures: no-common-ligatures; color: #36568a" class="">var</span><span style="font-variant-ligatures: no-common-ligatures" class=""> end: </span><span style="font-variant-ligatures: no-common-ligatures; color: #c35900" class="">String</span><span style="font-variant-ligatures: no-common-ligatures" class="">.</span><span style="font-variant-ligatures: no-common-ligatures; color: #c35900" class="">Index</span></div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class=""><span style="font-variant-ligatures: no-common-ligatures" class="">}</span></div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo; min-height: 13px;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""></span><br class=""></div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class=""><span style="font-variant-ligatures: no-common-ligatures; color: #36568a" class="">protocol</span><span style="font-variant-ligatures: no-common-ligatures" class=""> RegexProtocol {</span></div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class=""><span style="font-variant-ligatures: no-common-ligatures" class="">&nbsp; &nbsp; 
</span><span style="font-variant-ligatures: no-common-ligatures; color: #36568a" class="">init</span><span style="font-variant-ligatures: no-common-ligatures" class="">(pattern: </span><span style="font-variant-ligatures: no-common-ligatures; color: #c35900" class="">String</span><span style="font-variant-ligatures: no-common-ligatures" class="">) </span><span style="font-variant-ligatures: no-common-ligatures; color: #36568a" class="">throws</span></div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo; min-height: 13px;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""></span><br class=""></div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo; color: rgb(207, 135, 36);" class=""><span style="font-variant-ligatures: no-common-ligatures;" class="">&nbsp; &nbsp; </span><span style="font-variant-ligatures: no-common-ligatures; color: #36568a" class="">var</span><span style="font-variant-ligatures: no-common-ligatures;" class=""> pattern: </span><span style="font-variant-ligatures: no-common-ligatures; color: #c35900" class="">String</span><span style="font-variant-ligatures: no-common-ligatures;" class=""> { </span><span style="font-variant-ligatures: no-common-ligatures; color: #36568a" class="">get</span><span style="font-variant-ligatures: no-common-ligatures;" class=""> } </span><span style="font-variant-ligatures: no-common-ligatures" class="">// string representation of the pattern</span></div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo; color: rgb(207, 135, 36);" class=""><span style="font-variant-ligatures: no-common-ligatures;" class="">&nbsp; &nbsp; </span><span style="font-variant-ligatures: no-common-ligatures; color: #36568a" class="">func</span><span style="font-variant-ligatures: no-common-ligatures;" class=""> search(string: </span><span style="font-variant-ligatures: no-common-ligatures; color: #c35900" 
class="">String</span><span style="font-variant-ligatures: no-common-ligatures;" class="">) -&gt; </span><span style="font-variant-ligatures: no-common-ligatures; color: #c35900" class="">Bool</span><span style="font-variant-ligatures: no-common-ligatures;" class=""> </span><span style="font-variant-ligatures: no-common-ligatures" class="">// used to check if a match is found at all in the string</span></div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo; color: rgb(207, 135, 36);" class=""><span style="font-variant-ligatures: no-common-ligatures;" class="">&nbsp; &nbsp; </span><span style="font-variant-ligatures: no-common-ligatures; color: #36568a" class="">func</span><span style="font-variant-ligatures: no-common-ligatures;" class=""> match(string: </span><span style="font-variant-ligatures: no-common-ligatures; color: #c35900" class="">String</span><span style="font-variant-ligatures: no-common-ligatures;" class="">) -&gt; [</span><span style="font-variant-ligatures: no-common-ligatures; color: #c35900" class="">Match</span><span style="font-variant-ligatures: no-common-ligatures;" class="">] </span><span style="font-variant-ligatures: no-common-ligatures" class="">// returns an array of all the matches</span></div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class=""><span style="font-variant-ligatures: no-common-ligatures" class="">&nbsp; &nbsp; </span><span style="font-variant-ligatures: no-common-ligatures; color: #36568a" class="">func</span><span style="font-variant-ligatures: no-common-ligatures" class=""> match(string: </span><span style="font-variant-ligatures: no-common-ligatures; color: #c35900" class="">String</span><span style="font-variant-ligatures: no-common-ligatures" class="">, using: ((</span><span style="font-variant-ligatures: no-common-ligatures; color: #c35900" class="">Match</span><span style="font-variant-ligatures: no-common-ligatures" class="">) -&gt; 
</span><span style="font-variant-ligatures: no-common-ligatures; color: #c35900" class="">Void</span><span style="font-variant-ligatures: no-common-ligatures" class="">)) </span><span style="font-variant-ligatures: no-common-ligatures; color: #cf8724" class="">// enumerate over matches</span></div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class=""><span style="font-variant-ligatures: no-common-ligatures" class="">}</span></div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo; min-height: 13px;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""></span><br class=""></div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo; color: rgb(195, 89, 0);" class=""><span style="font-variant-ligatures: no-common-ligatures; color: #36568a" class="">struct</span><span style="font-variant-ligatures: no-common-ligatures;" class=""> Regex: </span><span style="font-variant-ligatures: no-common-ligatures" class="">RegexProtocol</span><span style="font-variant-ligatures: no-common-ligatures;" class=""> {</span></div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class=""><span style="font-variant-ligatures: no-common-ligatures" class="">&nbsp; &nbsp; </span><span style="font-variant-ligatures: no-common-ligatures; color: #36568a" class="">init</span><span style="font-variant-ligatures: no-common-ligatures" class="">(pattern: </span><span style="font-variant-ligatures: no-common-ligatures; color: #c35900" class="">Regex</span><span style="font-variant-ligatures: no-common-ligatures" class="">, options: </span><span style="font-variant-ligatures: no-common-ligatures; color: #c35900" class="">Regex</span><span style="font-variant-ligatures: no-common-ligatures" class="">.</span><span style="font-variant-ligatures: no-common-ligatures; color: #c35900" class="">Options</span><span style="font-variant-ligatures: 
no-common-ligatures" class="">)</span></div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class=""><span style="font-variant-ligatures: no-common-ligatures" class="">&nbsp; &nbsp; </span><span style="font-variant-ligatures: no-common-ligatures; color: #36568a" class="">let</span><span style="font-variant-ligatures: no-common-ligatures" class=""> options: [</span><span style="font-variant-ligatures: no-common-ligatures; color: #c35900" class="">Regex</span><span style="font-variant-ligatures: no-common-ligatures" class="">.</span><span style="font-variant-ligatures: no-common-ligatures; color: #c35900" class="">Options</span><span style="font-variant-ligatures: no-common-ligatures" class="">]</span></div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class=""><span style="font-variant-ligatures: no-common-ligatures" class="">&nbsp; &nbsp; </span><span style="font-variant-ligatures: no-common-ligatures; color: #36568a" class="">static</span><span style="font-variant-ligatures: no-common-ligatures" class=""> </span><span style="font-variant-ligatures: no-common-ligatures; color: #36568a" class="">let</span><span style="font-variant-ligatures: no-common-ligatures" class=""> word: </span><span style="font-variant-ligatures: no-common-ligatures; color: #c35900" class="">Regex</span><span style="font-variant-ligatures: no-common-ligatures" class=""> </span><span style="font-variant-ligatures: no-common-ligatures; color: #cf8724" class="">// \w</span></div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo; color: rgb(207, 135, 36);" class=""><span style="font-variant-ligatures: no-common-ligatures;" class="">&nbsp; &nbsp; </span><span style="font-variant-ligatures: no-common-ligatures" class="">// other useful regexes can be added as well</span></div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class=""><span 
style="font-variant-ligatures: no-common-ligatures" class="">}</span></div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo; min-height: 13px;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""></span><br class=""></div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo; min-height: 13px;" class=""><div style="margin: 0px; line-height: normal; color: rgb(207, 135, 36);" class=""><span style="font-variant-ligatures: no-common-ligatures" class="">// Examples</span></div></div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo; min-height: 13px;" class=""><br class=""></div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class=""><span style="font-variant-ligatures: no-common-ligatures; color: #36568a" class="">let</span><span style="font-variant-ligatures: no-common-ligatures" class=""> regex = \[a-zA-Z]+\</span></div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo; color: rgb(232, 35, 0);" class=""><span style="font-variant-ligatures: no-common-ligatures; color: #36568a" class="">let</span><span style="font-variant-ligatures: no-common-ligatures;" class=""> matches = </span><span style="font-variant-ligatures: no-common-ligatures; color: #587ea8" class="">regex</span><span style="font-variant-ligatures: no-common-ligatures;" class="">.match(string: </span><span style="font-variant-ligatures: no-common-ligatures" class="">"Matching words in text."</span><span style="font-variant-ligatures: no-common-ligatures;" class="">)</span></div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo; min-height: 13px;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""></span><br class=""></div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class=""><span style="font-variant-ligatures: no-common-ligatures; 
color: #36568a" class="">for</span><span style="font-variant-ligatures: no-common-ligatures" class=""> match </span><span style="font-variant-ligatures: no-common-ligatures; color: #36568a" class="">in</span><span style="font-variant-ligatures: no-common-ligatures" class=""> matches {</span></div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo; color: rgb(232, 35, 0);" class=""><span style="font-variant-ligatures: no-common-ligatures;" class="">&nbsp; &nbsp; print(</span><span style="font-variant-ligatures: no-common-ligatures" class="">"Found a match in string at </span><span style="font-variant-ligatures: no-common-ligatures;" class="">\</span><span style="font-variant-ligatures: no-common-ligatures" class="">(</span><span style="font-variant-ligatures: no-common-ligatures;" class="">match.start</span><span style="font-variant-ligatures: no-common-ligatures" class="">) to </span><span style="font-variant-ligatures: no-common-ligatures;" class="">\</span><span style="font-variant-ligatures: no-common-ligatures" class="">(</span><span style="font-variant-ligatures: no-common-ligatures;" class="">match.end</span><span style="font-variant-ligatures: no-common-ligatures" class="">)"</span><span style="font-variant-ligatures: no-common-ligatures;" class="">)</span></div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class=""><span style="font-variant-ligatures: no-common-ligatures" class="">}</span></div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo; min-height: 13px;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""></span><br class=""></div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo; color: rgb(232, 35, 0);" class=""><span style="font-variant-ligatures: no-common-ligatures; color: #36568a" class="">let</span><span style="font-variant-ligatures: no-common-ligatures;" class=""> helloStr = 
</span><span style="font-variant-ligatures: no-common-ligatures" class="">"Hello world"</span></div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo; min-height: 13px;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""></span><br class=""></div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class=""><span style="font-variant-ligatures: no-common-ligatures; color: #c35900" class="">Regex</span><span style="font-variant-ligatures: no-common-ligatures" class="">.word.match(string: </span><span style="font-variant-ligatures: no-common-ligatures; color: #587ea8" class="">helloStr</span><span style="font-variant-ligatures: no-common-ligatures" class="">) { match </span><span style="font-variant-ligatures: no-common-ligatures; color: #36568a" class="">in</span></div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class=""><span style="font-variant-ligatures: no-common-ligatures" class="">&nbsp; &nbsp; print(</span><span style="font-variant-ligatures: no-common-ligatures; color: #e82300" class="">"Matched </span><span style="font-variant-ligatures: no-common-ligatures" class="">\</span><span style="font-variant-ligatures: no-common-ligatures; color: #e82300" class="">(</span><span style="font-variant-ligatures: no-common-ligatures; color: #587ea8" class="">helloStr</span><span style="font-variant-ligatures: no-common-ligatures" class="">[match.start..&lt;match.end]</span><span style="font-variant-ligatures: no-common-ligatures; color: #e82300" class="">)"</span><span style="font-variant-ligatures: no-common-ligatures" class="">)</span></div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class=""><span style="font-variant-ligatures: no-common-ligatures" class="">}</span></div></div><div class="">&nbsp;</div><div class="">Of course, this is only a scratch implementation, but it is meant to open discussion on the topic. 
I feel the Regex struct itself will need more methods and properties, for example for flags and the number of groups. Please provide feedback with improvements to the code, concerns on the topic, or just open up discussion. Thank you!</div><div class=""><br class=""><div class="">
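<div class="">As a rough sketch only (not part of the draft above), the Match and Regex types could be prototyped on top of NSRegularExpression. This assumes Foundation is available, leaves out the options, the static word regex, and the literal syntax, and the stored property name "backing" is hypothetical:</div><div class=""><br class=""></div><pre style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class="">import Foundation

// Hypothetical prototype of the drafted types; NSRegularExpression does the matching.
struct Match {
    var start: String.Index
    var end: String.Index
}

struct Regex {
    let pattern: String
    private let backing: NSRegularExpression  // assumed name, not from the draft

    init(pattern: String) throws {
        self.pattern = pattern
        self.backing = try NSRegularExpression(pattern: pattern, options: [])
    }

    // Returns true if the pattern matches anywhere in the string.
    func search(string: String) -&gt; Bool {
        let whole = NSRange(string.startIndex..., in: string)
        return backing.firstMatch(in: string, options: [], range: whole) != nil
    }

    // Returns every match, with NSRange converted to String.Index bounds.
    func match(string: String) -&gt; [Match] {
        let whole = NSRange(string.startIndex..., in: string)
        return backing.matches(in: string, options: [], range: whole).compactMap { result in
            guard let range = Range(result.range, in: string) else { return nil }
            return Match(start: range.lowerBound, end: range.upperBound)
        }
    }
}

// Usage: print each run of ASCII letters in the text.
let words = try Regex(pattern: "[a-zA-Z]+")
let text = "Matching words in text."
for found in words.match(string: text) {
    print(text[found.start..&lt;found.end])
}
</pre><div class=""><br class=""></div><div class="">Converting each NSRange back to String.Index bounds is what lets a Match be used directly to slice the original string, as in the examples above.</div><div class=""><br class=""></div>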
<div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class="">Joshua Alvarado</div><div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><a href="mailto:alvaradojoshua0@gmail.com" class="">alvaradojoshua0@gmail.com</a></div></div></div><div class=""><div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><br class=""></div></div></div>_______________________________________________<br class="">swift-evolution mailing list<br class=""><a href="mailto:swift-evolution@swift.org" class="">swift-evolution@swift.org</a><br class=""><a href="https://lists.swift.org/mailman/listinfo/swift-evolution" class="">https://lists.swift.org/mailman/listinfo/swift-evolution</a><br class=""></div></blockquote></div><br class=""></div></div></blockquote></div><br class=""></div></div></body></html>