[swift-evolution] [Discussion] Swift for Data Science / ML / Big Data analytics
maxim at vekslers.org
Sat Oct 28 11:44:51 CDT 2017
The big data and machine learning world is dominated by Python, Scala and R.
I'm a Swifter at heart, but not so much by tools of the trade.
I'd appreciate a constructive discussion on how that could be changed.
While R is a non-goal for obvious reasons, I'd argue that since both Scala
and Python are general-purpose languages, taking them on head to head might be
low-hanging fruit.
To make the claim I'd like to reference projects such as:
- Hadoop, Spark and Hive, all huge ecosystems which are entirely JVM based.
- Apache Parquet, a highly efficient column-based storage format for big
data analytics, which was implemented in Java and C++.
- Apache Arrow, a physical memory spec that big data systems can use to
allow zero transformations on data transferred between systems, and which (for
obvious reasons) focuses on JVM to C interoperability.
- Python's Buffer Protocol, which ensures its predominance (for the time
being) as a prime candidate for data science related projects.
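
To illustrate the last point, here is a minimal sketch of what the Buffer
Protocol gives Python: several objects can look at the same block of memory
without copying it. It uses only the standard library (array and memoryview);
the names column, view and tail are made up for this example and do not come
from any of the projects above.

```python
import array

# A contiguous block of 64-bit floats, standing in for one column of data.
column = array.array("d", [1.0, 2.0, 3.0, 4.0])

# memoryview asks the array for its buffer; no bytes are copied.
view = memoryview(column)

# Slicing a memoryview yields another view over the same memory.
tail = view[2:]

# A write through the view is visible in the original array,
# which shows that both names refer to the same storage.
tail[0] = 30.0

result = column.tolist()
print(result)
```

Libraries such as NumPy build on the same mechanism to exchange large arrays
without transformations, which is the kind of behaviour I'd like to see Swift
support as a first-class use case.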
While Swift's Memory Ownership manifesto touches similar turf, discussing
copy on write and optimizing memory access overhead, it IMHO takes a system
level perspective, targeting projects such as kernel code. I'd suggest that
viewing the problem from an efficient CPU/GPU data crunching machine
perspective might shed a different light on the requirements and use cases.
I'd be happy to learn more, and have a constructive discussion on the
puıɯ ʎɯ ɯoɹɟ ʇuǝs