[swift-evolution] Strings in Swift 4
Ted F.A. van Gaalen
tedvgiosdev at gmail.com
Sun Feb 12 12:17:39 CST 2017
> On 11 Feb 2017, at 18:33, Dave Abrahams <dabrahams at apple.com> wrote:
>
> All of these examples should be efficiently and expressively handled by the pattern matching API mentioned in the proposal. They definitely do not require random access or integer indexing.
>
Hi Dave,
then I am very interested to know how to unpack a String (e.g. one read from a file record, as in the previous example:
123534-09EVXD4568,991234,89ABCYELLOW12AGRAINESYTEMZ3453 )
without using direct subscripting like str[n1..<n2]?
(which, by the way, is for me the most straightforward and ideal method)
Conditions:
- The source string contains fields of known position (offset) and length, concatenated together
  without any separators (unlike in a CSV).
- The content of each field is unpredictable,
  which excludes the use of pattern matching.
- The source string needs to be unpacked into independent strings.
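To make these conditions concrete, here is a minimal sketch of such a layout, assuming Swift 5. `Field`, `unpack` and the sample widths are hypothetical names and values chosen for illustration, not taken from any real file format. Fields are cut purely by offset and length, never by content:

```swift
// Hypothetical fixed-width layout: each field is known only by offset and length.
struct Field {
    let name: String
    let offset: Int
    let length: Int
}

// Cuts `record` into independent Strings, one per field.
// Returns nil if the record is too short for the layout.
func unpack(_ record: String, using layout: [Field]) -> [String: String]? {
    var result: [String: String] = [:]
    for field in layout {
        guard
            let start = record.index(record.startIndex, offsetBy: field.offset,
                                     limitedBy: record.endIndex),
            let end = record.index(start, offsetBy: field.length,
                                   limitedBy: record.endIndex)
        else { return nil }
        // String(...) copies the slice, so each field is independent of `record`.
        result[field.name] = String(record[start..<end])
    }
    return result
}

// Assumed widths for a fragment of the record above (3 + 6 + 3 characters).
let layout = [
    Field(name: "code", offset: 0, length: 3),
    Field(name: "colour", offset: 3, length: 6),
    Field(name: "size", offset: 9, length: 3),
]
let fields = unpack("ABCYELLOW12A", using: layout)
print(fields?["colour"] ?? "record too short")
```

Note that the offsets count Characters, so this only lines up with the file's byte positions when every character occupies one byte, as in an ASCII record.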
I made this example: (the comments also stress my point)
//: Playground - noun: a place outside the mean and harsh production environment
// No presidents were harmed during the production of this example.
import Foundation
// The following String extension with "direct access" subscript
// functionality, included in almost each and every app I create,
// wouldn't be necessary if str[a..<b] was an integral part of Swift strings!
//
// However, if str[a..<b] were to be implemented in Swift,
// then, by all means, make sure the documentation, notably the
// Swift language manual, states that any string-element position and count
// does not necessarily correspond 1:1 with the positions and lengths
// on graphical presentation devices, e.g. displays and printers.
//
// Leave it to the programmer to decide whether or not to use
// direct subscripting like str[a..<b],
// as -in most cases- it only makes sense where fixed-length characters
// are used, like in the example below.
//
// Like in any other programming language, an important focus should be
// to make things as intuitively simple as possible, as this:
// - reduces and prevents errors caused by indirect programming
// - might also reduce the risk of the normally
//   very friendly but mostly stressed guys of the maintenance
//   department coming after you with dangerous intentions...
extension String
{
    subscript(i: Int) -> String
    {
        guard i >= 0 && i < characters.count else { return "" }
        return String(self[index(startIndex, offsetBy: i)])
    }

    subscript(range: Range<Int>) -> String
    {
        let lowerIndex = index(startIndex, offsetBy: max(0,range.lowerBound), limitedBy: endIndex) ?? endIndex
        return substring(with: lowerIndex..<(index(lowerIndex, offsetBy: range.upperBound - range.lowerBound, limitedBy: endIndex) ?? endIndex))
    }

    subscript(range: ClosedRange<Int>) -> String
    {
        let lowerIndex = index(startIndex, offsetBy: max(0,range.lowerBound), limitedBy: endIndex) ?? endIndex
        return substring(with: lowerIndex..<(index(lowerIndex, offsetBy: range.upperBound - range.lowerBound + 1, limitedBy: endIndex) ?? endIndex))
    }
}
// In the following example, the record's field positions and lengths are fixed format
// and will never change.
// Also, the record's contents have been validated completely by the sending application.
// Normally it is an input record, read from a storage medium,
// however for the purpose of this example it is defined here:
let record = "123A.534.CMCU3Arduino Due     Arm 32-bit Micro controller.  000000034100000005680000002250$"
// Define a product data structure:
struct Product
{
    var id: String // is key
    var group: String
    var name: String
    var description: String
    var inStock: Int
    var ordered: Int
    var price: Int // in cents: no Money data type available in Swift.
    var currency: String

    // of course one could use "set/get" properties here
    // which could validate the input to this structure.

    var priceFormatted: String // computed property.
    {
        get
        {
            let whole = price / 100
            let cents = price % 100
            // Zero-pad the cents so that 2205 prints as 22.05, not 22.5.
            let paddedCents = cents < 10 ? "0\(cents)" : "\(cents)"
            return currency + " \(whole).\(paddedCents)"
        }
    }

    // TODO: disable other default initializers.
    init(inputrecord: String)
    {
        id          = inputrecord[ 0..<10]
        group       = inputrecord[10..<14]
        name        = inputrecord[14..<30]
        description = inputrecord[30..<60]
        inStock     = Int(inputrecord[60..<70])!
        ordered     = Int(inputrecord[70..<80])!
        price       = Int(inputrecord[80..<90])!
        currency    = inputrecord[90]
    }

    // Add necessary business and DB logic for products here.
}
func test()
{
    let product = Product(inputrecord: record)
    print("=== Product data for the item with ID: \(product.id) ====")
    print("ID : \(product.id)")
    print("group : \(product.group)")
    print("name : \(product.name)")
    print("description : \(product.description)")
    print("items in stock : \(product.inStock)")
    print("items ordered : \(product.ordered)")
    print("price per item : \(product.priceFormatted)")
    print("=========================================================")
}

test()
Which emitted the following output:
=== Product data for the item with ID: 123A.534.C ====
ID : 123A.534.C
group : MCU3
name : Arduino Due
description : Arm 32-bit Micro controller.
items in stock : 341
items ordered : 568
price per item : $ 22.50
====================================================
Isn’t that an elegant solution or what?
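A side note for anyone trying this on a later toolchain: as far as I know, `characters` and `substring(with:)` are Swift 3 spellings that were deprecated from Swift 4 on. Below is a minimal sketch of the same clamping subscripts, assuming Swift 5. It is one possible port, not the extension used above; the record is rebuilt from its parts so that the padding (name to 16 characters, description to 30) is explicit:

```swift
extension String {
    // Integer-offset slice; bounds past the end are clamped instead of trapping.
    subscript(range: Range<Int>) -> String {
        // Swift.max is spelled out so the call cannot be confused with
        // the collection method String.max().
        let lower = index(startIndex, offsetBy: Swift.max(0, range.lowerBound),
                          limitedBy: endIndex) ?? endIndex
        let upper = index(lower, offsetBy: range.count,
                          limitedBy: endIndex) ?? endIndex
        return String(self[lower..<upper])
    }

    // Single character at an integer offset, or "" when out of range.
    subscript(i: Int) -> String {
        guard i >= 0,
              let position = index(startIndex, offsetBy: i, limitedBy: endIndex),
              position < endIndex
        else { return "" }
        return String(self[position])
    }
}

// The example record, assembled field by field.
let record = [
    "123A.534.C",                                                         // id, 10
    "MCU3",                                                               // group, 4
    "Arduino Due" + String(repeating: " ", count: 5),                     // name, 16
    "Arm 32-bit Micro controller." + String(repeating: " ", count: 2),    // description, 30
    "0000000341", "0000000568", "0000002250",                             // 3 x 10 digits
    "$",                                                                  // currency, 1
].joined()

print(record[10..<14])              // MCU3
print(Int(record[80..<90]) ?? -1)   // 2250
print(record[90])                   // $
```

Each call walks the string from its start, so every field access is linear in the offset. For a 91-character record that is negligible, but it is the cost that the standard library's index types make visible.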
I might start a very lengthy discussion here about the threshold of where and how
to protect the average programmer (like me :o) from falling into language pitfalls
and to what extent these affect working with a PL. One cannot make
a PL idiot-proof. Of course, I agree a lot of it makes sense, as does the “intelligence”
of the Swift compiler (sometimes it almost feels as if it sits next to me looking at
the screen and shaking its head from time to time). But hey, remember most of
us in our profession have a brain too.
(btw, if you know of a way to let Xcode respect in-between spaces when auto-formatting, please let me know, thanks)
@Ben Cohen:
Hi, you wrote:
"p.s. as someone who has worked in a bank with thousands of ancient file formats, no argument from me that COBOL rules :)"
Although most accounting software is still written in Cobol (mostly because it is too expensive
and risky to convert to newer technologies), I don’t think that Cobol rules; new apps definitely should
not be written in Cobol. I wouldn’t be doing Swift if I thought otherwise.
If I were to do a Cobol project again, it would be with the same enjoyment as, say,
a 2017 mechanical engineer working on the steam locomotive of a tourist railroad,
which I would do with dedication as well. However, never use this comparison
at the hiring interview..:o)
Kind Regards
TedvG
> Sent from my moss-covered three-handled family gradunza
>
> On Feb 9, 2017, at 5:09 PM, Ted F.A. van Gaalen <tedvgiosdev at gmail.com> wrote:
>
>>
>>> On 10 Feb 2017, at 00:11, Dave Abrahams <dabrahams at apple.com> wrote:
>>>
>>>
>>> on Thu Feb 09 2017, "Ted F.A. van Gaalen" <tedvgiosdev-AT-gmail.com> wrote:
>>>
>>>> Hello Shawn
>>>> Just google with any programming language name and “string manipulation”
>>>> and you have enough reading for a week or so :o)
>>>> TedvG
>>>
>>> That truly doesn't answer the question. It's not, “why do people index
>>> strings with integers when that's the only tool they are given for
>>> decomposing strings?” It's, “what do you have to do with strings that's
>>> hard in Swift *because* you can't index them with integers?”
>>
>> Hi Dave,
>> Ok. here are just a few examples:
>> Parsing and validating an ISBN code? Or a (freight) container ID? Or EAN13 perhaps?
>> Or many of the typical combined article codes and product IDs that many factories and shops use?
>>
>> or:
>>
>> E.g. processing legacy files from IBM mainframes:
>> extract fields from ancient data records read from very old sequential files,
>> say, a product data record like this from a file from 1978 you’d have to unpack and process:
>> 123534-09EVXD4568,991234,89ABCYELLOW12AGRAINESYTEMZ3453
>> into:
>> 123, 534, -09, EVXD45, 68,99, 1234,99, ABC, YELLOW, 12A, GRAIN, ESYSTEM, Z3453.
>> product category, pcs, discount code, product code, price Yen, price $, class code, etc…
>> in Cobol and PL/1 records are nearly always defined with a fixed field layout like this.:
>> (storage was limited and very, very expensive, e.g. XML would be regarded as a
>> "scandalous waste" even the commas in CSV files! )
>>
>> 01 MAILING-RECORD.
>> 05 COMPANY-NAME PIC X(30).
>> 05 CONTACTS.
>> 10 PRESIDENT.
>> 15 LAST-NAME PIC X(15).
>> 15 FIRST-NAME PIC X(8).
>> 10 VP-MARKETING.
>> 15 LAST-NAME PIC X(15).
>> 15 FIRST-NAME PIC X(8).
>> 10 ALTERNATE-CONTACT.
>> 15 TITLE PIC X(10).
>> 15 LAST-NAME PIC X(15).
>> 15 FIRST-NAME PIC X(8).
>> 05 ADDRESS PIC X(15).
>> 05 CITY PIC X(15).
>> 05 STATE PIC XX.
>> 05 ZIP PIC 9(5).
>>
>> These are all character data fields here, except for the numeric ZIP field , however in Cobol it can be treated like character data.
>> So here I am, having to get the data of these old Cobol production files
>> into a brand new Swift based accounting system of 2017, what can I do?
>>
>> How do I unpack these records and bring the data into a Swift structure or class?
>> (In Cobol I don’t have to because of the predefined fixed format record layout).
>>
>> AFAIK there are no similar record structures with fixed fields like this available in Swift?
>>
>> So, the only way I can think of right now is to do it like this:
>>
>> // mailingRecord is a Swift structure
>> struct MailingRecord
>> {
>> var companyName: String = "no Name"
>> var contacts: CompanyContacts
>> .
>> etc..
>> }
>>
>> // recordStr was read here with ASCII encoding
>>
>> // unpack data in to structure’s properties, in this case all are Strings
>> mailingRecord.companyName = recordStr[ 0..<30]
>> mailingRecord.contacts.president.lastName = recordStr[30..<45]
>> mailingRecord.contacts.president.firstName = recordStr[45..<53]
>>
>>
>> // and so on..
>>
>> Ever worked for e.g. a bank with thousands of these files, their formats unchanged for years?
>>
>> Are any alternative, more convenient and simpler methods present in Swift?
>>
>> Kind Regards
>> TedvG
>> ( example of the above Cobol record borrowed from here:
>> http://www.3480-3590-data-conversion.com/article-reading-cobol-layouts-1.html )
>>
>>
>>
>>
>>>
>>>>> On 9 Feb 2017, at 16:48, Shawn Erickson <shawnce at gmail.com> wrote:
>>>>>
>>>>> I also wonder what folks are actually doing that require indexing
>>>>> into strings. I would love to see some real world examples of what
>>>>> and why indexing into a string is needed. Who is the end consumer of
>>>>> that string, etc.
>>>>>
>>>>> Do folks have some examples?
>>>>>
>>>>> -Shawn
>>>>>
>>>>> On Thu, Feb 9, 2017 at 6:56 AM Ted F.A. van Gaalen via swift-evolution <swift-evolution at swift.org> wrote:
>>>>> Hello Hooman
>>>>> That invalidates my assumptions; thanks for evaluating them.
>>>>> It's more complex than I thought.
>>>>> Kind Regards
>>>>> Ted
>>>>>
>>>>>> On 8 Feb 2017, at 00:07, Hooman Mehr <hooman at mac.com> wrote:
>>>>>>
>>>>>>
>>>>>>> On Feb 7, 2017, at 12:19 PM, Ted F.A. van Gaalen via swift-evolution <swift-evolution at swift.org> wrote:
>>>>>>>
>>>>>>> I now assume that:
>>>>>>> 1. -= a “plain” Unicode character (codepoint?) can result in one glyph.=-
>>>>>>
>>>>>> What do you mean by “plain”? Characters in some Unicode scripts are
>>>>>> by no means “plain”. They can affect (and be affected by) the
>>>>>> characters around them, they can cause glyphs around them to
>>>>>> rearrange or combine (like ligatures) or their visual
>>>>>> representation (glyph) may float in the same space as an adjacent
>>>>>> glyph (and seem to be part of the “host” glyph), etc. So, the
>>>>>> general relationship of a character and its corresponding glyph (if
>>>>>> there is one) is complex and depends on context and surrounding
>>>>>> characters.
>>>>>>
>>>>>>> 2. -= a grapheme cluster always results in just a single glyph, true? =-
>>>>>>
>>>>>> False
>>>>>>
>>>>>>> 3. The only thing that I can see on screen or print are glyphs (“carvings”,visual elements that stand on their own )
>>>>>>
>>>>>> The visible effect might not be a visual shape. It may be for example, the way the surrounding shapes change or re-arrange.
>>>>>>
>>>>>>> 4. In this context, a glyph is a humanly recognisable visual form of a character,
>>>>>>
>>>>>> Not in a straightforward one to one fashion, not even in Latin / Roman script.
>>>>>>
>>>>>>> 5. On this level (the glyph, what I can see as a user) it is not relevant and also not detectable
>>>>>>> with how many Unicode scalars (codepoints ?), grapheme, or even on what kind
>>>>>>> of encoding the glyph was based upon.
>>>>>>
>>>>>> False
>>>>>>
>>>>>
>>>>
>>>
>>> --
>>> -Dave
>>