[swift-evolution] [Discussion] Swift for Data Science / ML / Big Data analytics

Chris Lattner clattner at nondot.org
Mon Oct 30 23:43:39 CDT 2017

JohnMC: question for you below.

On Oct 30, 2017, at 1:25 PM, Douglas Gregor <dgregor at apple.com> wrote:
>> Thinking about the Perl case makes it clear to me that this should not be built into the compiler as a monolithic thing.  Perl supports several different types (SV/AV/HV) which represent different concepts (scalars, arrays, hashes) so baking it all together into one thing would be the wrong way to map it.  In fact, the magic we need is pretty small, and seems generally useful for other things. Consider a design like this:
>> // not magic, things like Int, String and many other conform to this. 
>> protocol Pythonable {
>>  init?(_ : PythonObject)
>>  func toPython() -> PythonObject
>> }
> It’s not magic unless you expect the compiler or runtime to help with conversion between Int/String/etc. and PythonObject, as with _ObjectiveCBridgeable.

Right, as I said above “not magic”.  The conformances would be manually implemented in the Python overlay.  This provides a free implicit conversion from “T -> Pythonable” for the T’s we care about, and a failable init from Python back to Swift types.
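
As a rough sketch of what such manual conformances might look like: the PythonObject enum below is a hypothetical stand-in (a real overlay would wrap a PyObject pointer), and the makePythonList helper is invented purely for illustration.  Only the Pythonable protocol shape comes from the design above.

```swift
// Hypothetical stand-in for the overlay's PythonObject, so that this sketch
// runs without a Python interpreter. A real overlay would wrap a PyObject.
enum PythonObject: Equatable {
    case int(Int)
    case string(String)
    case none
}

// The protocol from the design above: not magic, just an ordinary protocol.
protocol Pythonable {
    init?(_ object: PythonObject)
    func toPython() -> PythonObject
}

// Manually written conformances, as they might appear in an overlay.
extension Int: Pythonable {
    init?(_ object: PythonObject) {
        guard case .int(let value) = object else { return nil }
        self = value
    }
    func toPython() -> PythonObject { return .int(self) }
}

extension String: Pythonable {
    init?(_ object: PythonObject) {
        guard case .string(let value) = object else { return nil }
        self = value
    }
    func toPython() -> PythonObject { return .string(self) }
}

// Illustrative helper: any mix of Pythonable values converts implicitly to
// the protocol type, which is the "free" T -> Pythonable conversion.
func makePythonList(_ items: Pythonable...) -> [PythonObject] {
    return items.map { $0.toPython() }
}

let py = 42.toPython()
let roundTripped: Int? = Int(py)        // succeeds: the object holds an int
let mismatched: String? = String(py)    // fails: the object is not a string
let mixed = makePythonList(1, "two")
```

The failable init is what lets a mismatched conversion come back as nil instead of trapping.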

>> // Not magic.
>> struct PythonObject : /*protocols below*/ {
>>   var state : UnsafePointer<PyObject>
>>   subscript(_ : Pythonable…) -> PythonObject {
>>     ...
>>   }
>> }
>> // Magic, must be on the struct definition.  
>> // Could alternatively allow custom copy/move/… ctors like C++.
>> protocol CustomValueWitnessTable {
>>  static func init(..)
>>  static func copy(..)
>>  static func move(..)
>>  static func destroy(..)
>> }
> Swift’s implementation model supports this. As a surface-level construct it’s going to be mired in UnsafeMutablePointers, and it’s not at all clear to me that we want this level of control there. 

There are two ways to implement it:
1) static funcs like the above, which are implemented in terms of UnsafePointers
2) Proper language syntax akin to the C++ “rule of 5”.

The pros and cons of the first approach:

pro) full explicit control over what happens
pro) no other new language features necessary to implement this.  The second approach would need something like ownership to be in place.
con) injects another avenue of unsafety (though it would be explicit, so it isn’t that bad).  It isn’t obvious to me that approach #2 can be safe, but I haven’t thought about it enough.
???) discourages people from using this protocol because of its explicit unsafety.
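
To make approach #1 concrete, here is a minimal sketch under several assumptions: the method names and pointer-based signatures are hypothetical (“init” is spelled “initialize” because “static func init” is not valid Swift), ForeignObject is a stand-in for a foreign refcounted object, and the operations are invoked by hand because no compiler calls them today.

```swift
// Hypothetical spelling of approach #1: every value operation is a static
// function over raw pointers. Nothing here is an existing Swift feature.
protocol CustomValueWitnessTable {
    static func initialize(_ dest: UnsafeMutablePointer<Self>)
    static func copy(from src: UnsafePointer<Self>, to dest: UnsafeMutablePointer<Self>)
    static func move(from src: UnsafeMutablePointer<Self>, to dest: UnsafeMutablePointer<Self>)
    static func destroy(_ value: UnsafeMutablePointer<Self>)
}

// Stand-in for a foreign object with its own manual reference count.
final class ForeignObject {
    var refCount = 1
}

struct ForeignHandle: CustomValueWitnessTable {
    let object: ForeignObject

    static func initialize(_ dest: UnsafeMutablePointer<ForeignHandle>) {
        // A new foreign object starts with one reference.
        dest.initialize(to: ForeignHandle(object: ForeignObject()))
    }

    static func copy(from src: UnsafePointer<ForeignHandle>,
                     to dest: UnsafeMutablePointer<ForeignHandle>) {
        // A copy creates a second owner, so retain.
        src.pointee.object.refCount += 1
        dest.initialize(to: src.pointee)
    }

    static func move(from src: UnsafeMutablePointer<ForeignHandle>,
                     to dest: UnsafeMutablePointer<ForeignHandle>) {
        // A move transfers ownership; the foreign count is unchanged.
        dest.initialize(to: src.move())
    }

    static func destroy(_ value: UnsafeMutablePointer<ForeignHandle>) {
        // Destroying an owner releases its reference.
        value.pointee.object.refCount -= 1
        value.deinitialize(count: 1)
    }
}

// Drive the operations by hand, as the compiler would in the proposal.
let storage = UnsafeMutablePointer<ForeignHandle>.allocate(capacity: 3)
ForeignHandle.initialize(storage)
let object = storage.pointee.object     // kept only to inspect the count

ForeignHandle.copy(from: UnsafePointer(storage), to: storage + 1)
let afterCopy = object.refCount

ForeignHandle.move(from: storage + 1, to: storage + 2)
let afterMove = object.refCount

ForeignHandle.destroy(storage)
ForeignHandle.destroy(storage + 2)
let afterDestroy = object.refCount
storage.deallocate()
```

Every operation is explicit and pointer-based, which shows both the “full control” pro and the unsafety con listed above.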

I can think of two things that could tip the scale of the discussion:

a) The big question is whether we *want* the ability to write custom rule-of-5 style behavior for structs, or if we want it to only be used in extreme cases (like bridging interop in this proposal).  If we *want* to support it someday, then adding proper “safe” support is best (if possible).  If we don’t *want* people to use it, then making it Unsafe and ugly is a reasonable way to go.

b) The ownership proposal is likely to add deinits to structs.  If it also adds explicit move initializers, then it is probably the right thing to add copy initializers also (and thus go with approach #2).  That said, I’m not sure how the move initializers will be spelled or if that is the likely direction.  If it won’t add these, then it is probably better to go with approach #1.  John, what do you think?

> Presumably, binding to Python is going to require some compiler effort—defining how it is that Python objects are initialized/copied/moved/destroyed seems like a reasonable part of that effort.

Actually no.  If we add these three proposals, there is no other Python (or Perl, etc.) specific support needed.  It is all implementable in the overlay.

>> // Magic, allows anyobject-like member lookup on a type when lookup otherwise fails.
>> protocol DynamicMemberLookupable {
>>   associatedtype MemberLookupResultType
>>   func dynamicMemberLookup(_ : String) -> MemberLookupResultType
>> }
> AnyObject lookup looks for an actual declaration on any type anywhere. One could extend that mechanism to, say, return all Python methods and assume that you can call any Python method with any PythonObject instance. AnyObject lookup is fairly unprincipled as a language feature, because there’s no natural scope in which to perform name lookup, and involves hacks at many levels that don’t always work (e.g., AnyObject lookup… sometimes… fails across multiple source files for hard-to-explain reasons). You’re taking on that brokenness if you expand AnyObject lookup to another ecosystem.

Yeah, sorry, that’s not what I meant:

> Although it doesn’t really seem like AnyObject lookup is the thing you’re asking for here. It seems more like you want dynamicMemberLookup(_:) to capture “self” and the method name, and then be a callable thing as below…

That’s what I meant :-).

A type that implements this magic protocol would never fail name lookup: “foo.bar” would always fall back to calling: foo.dynamicMemberLookup(“bar”)

It’s simpler and more predictable than AnyObject lookup, and it also matches what dynamic languages like Python need.
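
A small sketch of the fallback, assuming a hypothetical DynamicObject type backed by a dictionary.  Since the compiler rewrite does not exist, the call that “foo.bar” would turn into is written out explicitly.

```swift
// The protocol as proposed above; the compiler support is the "magic" part
// and is not modelled here.
protocol DynamicMemberLookupable {
    associatedtype MemberLookupResultType
    func dynamicMemberLookup(_ name: String) -> MemberLookupResultType
}

// Hypothetical stand-in for a dynamic object: attributes live in a dictionary
// and are only resolved at run time.
struct DynamicObject: DynamicMemberLookupable {
    var attributes: [String: String]

    // MemberLookupResultType is inferred as String? from this signature.
    func dynamicMemberLookup(_ name: String) -> String? {
        return attributes[name]
    }
}

let foo = DynamicObject(attributes: ["bar": "hello"])

// Under the proposal, `foo.bar` would be rewritten to this call whenever
// ordinary name lookup finds no member named `bar`.
let bar = foo.dynamicMemberLookup("bar")

// Name lookup itself never fails; a missing attribute surfaces at run time,
// here as nil.
let missing = foo.dynamicMemberLookup("baz")
```

The result type is up to the conforming type; an optional is used here only so the missing case is visible.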

>> // Magic, allows “overloaded/sugared postfix ()”.
>> protocol CustomCallable {
>>  func call( …)
>> }
>> The only tricky thing about this is the call part of things.  At least in the case of python, we want something like this:
>>   foo.bar(1, 2, a: x, b: y)
>> to turn into:
>>  foo.dynamicMemberLookup(“bar”).call(1, 2, kwargs: [“a”:x, “b”:y])
>> We don’t want this to be a memberlookup of a value that has “bar” as a basename and “a:” and “b:” as parameter labels.
> Well, I think the MemberLookupResult is going to get the name “bar”, argument labels “_:_:a:b:”, and arguments “1”, “2”, “x”, “y”, because that’s the Swift model of argument labels. It can then reshuffle them however it needs to for the underlying interaction with the Python interpreter.
> There are definite design trade-offs here. With AnyObject lookup, it’s a known-broken feature but because it depends on synthesized Swift method declarations, it’ll behave mostly the same way as other Swift method declarations—static overloading, known (albeit weak) type signatures, etc. But, it might require more of Python’s model to be grafted onto those method declarations. With dynamic member lookup, you’re throwing away all type safety (even for a motivated Python developer who might be willing to annotate APIs with types) and creating a general language mechanism for doing that.

Right, something like this could definitely work, but keep in mind that the Swift compiler knows nothing about Python declarations.  

Perhaps the most straightforward thing would be to support:

protocol CustomCallable {
 func call(…arg list as array and kw args...)
 func callMember(_ : String, …otherstuffabove...)
}

Given this, the compiler could map:

pythonThing(42)    -> pythonThing.call([42])
pythonThing.method(a: 42)   -> pythonThing.callMember(“method”, kwargs: [“a”: 42])

This is the simplest way to map the Swift semantics (where kw args are part of compound lookups) into the Python world.
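
Here is a sketch of how that mapping could look, written out by hand.  The concrete signatures (an [Any] array plus a [String: Any] dictionary, returning a String that merely describes the call) are assumptions for illustration, as is the PythonThing type; a real overlay would dispatch to the interpreter instead.

```swift
// Assumed concrete signatures for the elided parameter lists above.
protocol CustomCallable {
    func call(_ args: [Any], kwargs: [String: Any]) -> String
    func callMember(_ name: String, _ args: [Any], kwargs: [String: Any]) -> String
}

// Hypothetical conforming type that just renders the call it received.
struct PythonThing: CustomCallable {
    let name: String

    // Renders positional arguments first, then keyword arguments sorted by
    // key so that the output is deterministic.
    private func render(_ args: [Any], _ kwargs: [String: Any]) -> String {
        let positional = args.map { "\($0)" }
        let keyword = kwargs.keys.sorted().map { "\($0)=\(kwargs[$0]!)" }
        return (positional + keyword).joined(separator: ", ")
    }

    func call(_ args: [Any], kwargs: [String: Any]) -> String {
        let rendered = render(args, kwargs)
        return "\(name)(\(rendered))"
    }

    func callMember(_ member: String, _ args: [Any], kwargs: [String: Any]) -> String {
        let rendered = render(args, kwargs)
        return "\(name).\(member)(\(rendered))"
    }
}

let pythonThing = PythonThing(name: "pythonThing")

// What `pythonThing(42)` would be rewritten to:
let direct = pythonThing.call([42], kwargs: [:])

// What `pythonThing.method(a: 42)` would be rewritten to:
let member = pythonThing.callMember("method", [], kwargs: ["a": 42])
```

In this sketch the keyword labels travel as run-time data in kwargs and never become part of a compound Swift name.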


