[swift-evolution] [Discussion] Swift for Data Science / ML / Big Data analytics

Douglas Gregor dgregor at apple.com
Tue Oct 31 00:47:00 CDT 2017



> On Oct 30, 2017, at 9:43 PM, Chris Lattner <clattner at nondot.org> wrote:
> 
> JohnMC: question for you below.
> 
> On Oct 30, 2017, at 1:25 PM, Douglas Gregor <dgregor at apple.com> wrote:
>>> 
>>> Thinking about the Perl case makes it clear to me that this should not be built into the compiler as a monolithic thing.  Perl supports several different types (SV/AV/HV) which represent different concepts (scalars, arrays, hashes) so baking it all together into one thing would be the wrong way to map it.  In fact, the magic we need is pretty small, and seems generally useful for other things. Consider a design like this:
>>> 
>>> 
>>> // Not magic: things like Int, String, and many others conform to this.
>>> protocol Pythonable {
>>>  init?(_ : PythonObject)
>>>  func toPython() -> PythonObject
>>> }
>> 
>> It’s not magic unless you expect the compiler or runtime to help with conversion between Int/String/etc. and PythonObject, as with _ObjectiveCBridgeable.
> 
> Right, as I said above “not magic”.  The conformances would be manually implemented in the Python overlay.  This provides a free implicit conversion from “T -> Pythonable” for the T’s we care about, and a failable init from Python back to Swift types.

Note that, under this scheme,

	let p: Pythonable = 17
	let i: Int = p as! Int

will work if Int is Pythonable, but not when p comes back from Python. 
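
A minimal sketch of why, using a stand-in PythonObject (the enum below is an assumption for illustration, not the real overlay type). A dynamic cast inspects the Swift type stored in the existential, so it succeeds for a boxed Int but fails for a boxed PythonObject, which has to go through the failable initializer instead:

```swift
// Stand-in for a real Python object handle; hypothetical, for illustration only.
enum PythonObject {
    case integer(Int)
    case string(String)
}

protocol Pythonable {
    init?(_ object: PythonObject)
    func toPython() -> PythonObject
}

extension Int: Pythonable {
    init?(_ object: PythonObject) {
        guard case .integer(let value) = object else { return nil }
        self = value
    }
    func toPython() -> PythonObject { return .integer(self) }
}

extension PythonObject: Pythonable {
    init?(_ object: PythonObject) { self = object }
    func toPython() -> PythonObject { return self }
}

// A value that started life in Swift: the existential holds an Int.
let fromSwift: Pythonable = 17
let a = fromSwift as? Int            // succeeds

// A value that "came back from Python": the existential holds a PythonObject.
let fromPython: Pythonable = PythonObject.integer(17)
let b = fromPython as? Int           // nil: the dynamic type is PythonObject, not Int
let c = Int(fromPython.toPython())   // the failable init does the real conversion
```

The sketch uses `as?` rather than `as!` so that the failing case yields nil instead of trapping.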

> 
>>> // Not magic.
>>> struct PythonObject : /*protocols below*/ {
>>>   var state : UnsafePointer<PyObject>
>>> 
>>>   subscript(_ : Pythonable…) -> PythonObject {
>>>     ...
>>>   }
>>> }
>>> 
>>> // Magic, must be on the struct definition.  
>>> // Could alternatively allow custom copy/move/… ctors like C++.
>>> protocol CustomValueWitnessTable {
>>>  static func init(..)
>>>  static func copy(..)
>>>  static func move(..)
>>>  static func destroy(..)
>>> }
>> 
>> Swift’s implementation model supports this. As a surface-level construct it’s going to be mired in UnsafeMutablePointers, and it’s not at all clear to me that we want this level of control there. 
> 
> There are two ways to implement it:
> 1) static funcs like the above, which are implemented using UnsafePointers
> 2) Proper language syntax akin to the C++ “rule of 5”.
> 
> The pros and cons of the first approach:
> 
> pro) full explicit control over what happens
> pro) no other new language features necessary to implement this.  The second approach would need something like ownership to be in place.
> con) injects another avenue of unsafety (though it would be explicit, so it isn’t that bad).  It isn’t obvious to me that approach #2 can be safe, but I haven’t thought about it enough.
> ???) discourages people from using this protocol because of its explicit unsafety.

con) much of the UnsafePointer interface is based on the value witness table, so one has to step lightly or work with something like OpaquePointer/UnsafeRawPointer (the types formerly spelled COpaquePointer/UnsafePointer<Void>).

> 
> I can think of two things that could tip the scale of the discussion:
> 
> a) The big question is whether we *want* the ability to write custom rule-of-5 style behavior for structs, or if we want it to only be used in extreme cases (like bridging interop in this proposal).  If we *want* to support it someday, then adding proper “safe” support is best (if possible).  If we don’t *want* people to use it, then making it Unsafe and ugly is a reasonable way to go.
> 
> b) The ownership proposal is likely to add deinit's to structs.  If it also adds explicit move initializers, then it is probably the right thing to add copy initializers also (and thus go with approach #2).  That said,  I’m not sure how the move initializers will be spelled or if that is the likely direction.  If it won’t add these, then it is probably better to go with approach #1.  John, what do you think?
> 
>> Presumably, binding to Python is going to require some compiler effort—defining how it is that Python objects are initialized/copied/moved/destroyed seems like a reasonable part of that effort.
> 
> Actually no.  If we add these three proposals, there is no other python (or perl, etc…) specific support needed.  It is all implementable in the overlay.

Support for working with Python objects would be implementable in the overlay, but the result isn’t necessarily ergonomic (e.g., my “as!” case from a Python-generated integer object to Int, shown above). That might be fine! More comments on this below.

> 
>>> // Magic, allows anyobject-like member lookup on a type when lookup otherwise fails.
>>> protocol DynamicMemberLookupable {
>>>   associatedtype MemberLookupResultType
>>>   func dynamicMemberLookup(_ : String) -> MemberLookupResultType
>>> }
>> 
>> AnyObject lookup looks for an actual declaration on any type anywhere. One could extend that mechanism to, say, return all Python methods and assume that you can call any Python method with any PythonObject instance. AnyObject lookup is fairly unprincipled as a language feature, because there’s no natural scope in which to perform name lookup, and involves hacks at many levels that don’t always work (e.g., AnyObject lookup… sometimes… fails across multiple source files for hard-to-explain reasons). You’re taking on that brokenness if you expand AnyObject lookup to another ecosystem.
> 
> Yeah, sorry, that’s not what I meant:

(Good)

> 
>> Although it doesn’t really seem like AnyObject lookup is the thing you’re asking for here. It seems more like you want dynamicMemberLookup(_:) to capture “self” and the method name, and then be a callable thing as below…
> 
> That’s what I meant :-).
> 
> A type that implements this magic protocol would never fail name lookup: “foo.bar” would always fall back to calling: foo.dynamicMemberLookup("bar")
> 
> It’s simpler and more predictable than AnyObject lookup, and it also matches what dynamic languages like Python need.
> 
>>> // Magic, allows “overloaded/sugared postfix ()”.
>>> protocol CustomCallable {
>>>  func call( …)
>>> }
>>> 
>>> The only tricky thing about this is the call part of things.  At least in the case of python, we want something like this:
>>> 
>>>   foo.bar(1, 2, a: x, b: y)
>>> 
>>> to turn into:
>>>  foo.dynamicMemberLookup("bar").call(1, 2, kwargs: ["a": x, "b": y])
>>> 
>>> We don’t want this to be a memberlookup of a value that has “bar” as a basename and “a:” and “b:” as parameter labels.
>> 
>> Well, I think the MemberLookupResult is going to get the name “bar”, argument labels “_:_:a:b:”, and arguments “1”, “2”, “x”, “y”, because that’s the Swift model of argument labels. It can then reshuffle them however it needs to for the underlying interaction with the Python interpreter.
>> 
>> There are definite design trade-offs here. With AnyObject lookup, it’s a known-broken feature but because it depends on synthesized Swift method declarations, it’ll behave mostly the same way as other Swift method declarations—static overloading, known (albeit weak) type signatures, etc. But, it might require more of Python’s model to be grafted onto those method declarations. With dynamic member lookup, you’re throwing away all type safety (even for a motivated Python developer who might be willing to annotate APIs with types) and creating a general language mechanism for doing that.
> 
> Right, something like this could definitely work, but keep in mind that the Swift compiler knows nothing about Python declarations.  
> 
> Perhaps the most straight-forward thing would be to support:
> 
> protocol CustomCallable {
>  func call(…arg list as array and kw args...)
>  func callMember(_ : String, …otherstuffabove...)
> }
> 
> Given this, the compiler could map:
> 
> pythonThing(42)    -> pythonThing.call([42])
> pythonThing.method(a: 42)   -> pythonThing.callMember("method", kwargs: ["a": 42])
> 
> This is the simplest way to map the Swift semantics (where kw args are part of compound lookups) into the Python world.

Okay, I agree that this gets Swift syntax into a call to the Python interpreter fairly quickly. Over-architecting for the sake of discussion:

protocol CustomCallable {
  associatedtype CustomArgument
  associatedtype CustomNominal
  associatedtype CustomResult

  func callMember(self: CustomNominal, functionName: String, arguments: [(String, CustomArgument)]) throws -> CustomResult
  // something for class/static members
}
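
As a rough sketch of what the compiler would be lowering to, here is the desugared form written out by hand against a toy in-memory "interpreter". Everything in it (DynamicallyCallable, ToyObject, ToyError, the closure table) is hypothetical and only stands in for the real Python runtime; it is a simplified cousin of the protocol above, and positional arguments are modeled with an empty label.

```swift
// Hypothetical, simplified cousin of the protocol above.
protocol DynamicallyCallable {
    associatedtype Argument
    associatedtype Result
    func callMember(_ functionName: String,
                    arguments: [(label: String, value: Argument)]) throws -> Result
}

enum ToyError: Error { case noSuchMember(String) }

// A toy object whose "methods" live in a dictionary, looked up by name at run time.
struct ToyObject: DynamicallyCallable {
    var methods: [String: ([(label: String, value: Int)]) -> Int]

    func callMember(_ functionName: String,
                    arguments: [(label: String, value: Int)]) throws -> Int {
        guard let method = methods[functionName] else {
            throw ToyError.noSuchMember(functionName)
        }
        return method(arguments)
    }
}

let thing = ToyObject(methods: [
    // Sums every argument, ignoring labels.
    "sum": { args in args.reduce(0) { $0 + $1.value } },
    // Returns the argument labeled "a", or -1 if there is none.
    "pickA": { args in args.first(where: { $0.label == "a" })?.value ?? -1 },
])

// What `thing.sum(1, 2, a: 3)` would be rewritten into:
let total = try thing.callMember("sum", arguments: [("", 1), ("", 2), ("a", 3)])
// What `thing.pickA(a: 42)` would be rewritten into:
let picked = try thing.callMember("pickA", arguments: [("a", 42)])
// A misspelled member name is only caught at run time:
let missing = try? thing.callMember("sume", arguments: [])
```

Note that nothing here is checked until the call actually runs: the misspelled name compiles fine and only fails in the dictionary lookup.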

But this is *all* dynamic, even when one could map much of Python’s type information into Swift. For example, let’s take this:

class Dog:

    def __init__(self, name):
        self.name = name
        self.tricks = []    # creates a new empty list for each dog

    def add_trick(self, trick):
        self.tricks.append(trick)

With your don’t-modify-the-compiler approach, how can I create a Dog instance and add a trick? I probably need to look up the class by name, call __init__ manually, etc.

  let dogClass = python_getClassByName("Dog") // implemented in the Python "overlay", I guess
  let dog = python_createInstance(dogClass)  // implemented in the Python "overlay", I guess
  dog.__init__("Brianna")      // uses CustomCallable’s callMember
  dog.add_trick("Roll over")  // uses CustomCallable’s callMember

With compiler integration, the same class could instead be imported as a Swift declaration:

	class Dog : PythonObject {
	  init(_ name: Pythonable)
	  func add_trick(_ trick: Pythonable)
	}

One could possibly bridge the gap with a code generator of some sort that (for example) maps Dog’s __init__ to a global function:

	func Dog(_ name: Pythonable) -> Pythonable {
	    let dogClass = python_getClassByName("Dog") // implemented in the Python "overlay", I guess
	    let dog = python_createInstance(dogClass)  // implemented in the Python "overlay", I guess
	    dog.__init__(name)      // uses CustomCallable’s callMember
	    return dog
	}

and maybe turns all Python methods into extensions on Pythonable:

	extension Pythonable {
	  func add_trick(_ trick: Pythonable)
	}

With either the true “Python importer” solution or this code-generation solution, you at least get some level of code completion and basic sanity checking “for free”. In other words, you get some of the benefits of having a statically-type-checked language while still working on dynamic Pythonable types.
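
To make the code-generation idea a bit more concrete, here is a hand-written version of what such a generator might emit, run against a toy stand-in for the interpreter. DynamicObject and its string-only arguments are assumptions made for the sketch, not part of any real overlay; the point is only that the typed shell forwards to a by-name entry point.

```swift
// Hypothetical dynamic object: attributes and methods are found by name at run time.
final class DynamicObject {
    var attributes: [String: [String]] = [:]

    func callMember(_ name: String, _ arguments: [String]) {
        switch name {
        case "__init__":
            attributes["name"] = [arguments[0]]
            attributes["tricks"] = []
        case "add_trick":
            attributes["tricks", default: []].append(arguments[0])
        default:
            fatalError("no member named \(name)")
        }
    }
}

// What a generator might emit for the Dog class: a thin, statically typed shell
// whose members forward to the dynamic entry point.
struct Dog {
    let object: DynamicObject

    init(_ name: String) {
        object = DynamicObject()
        object.callMember("__init__", [name])
    }

    func add_trick(_ trick: String) {
        object.callMember("add_trick", [trick])
    }

    var tricks: [String] { return object.attributes["tricks"] ?? [] }
}

let dog = Dog("Brianna")
dog.add_trick("Roll over")
dog.add_trick("Play dead")
// dog.add_trik("Sit")   // would be rejected at compile time, unlike a dynamic lookup
```

The generated shell is what gives code completion something to work with; the calls underneath are still fully dynamic.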

	- Doug

