[swift-evolution] [Discussion] Swift for Data Science / ML / Big Data analytics
Maxim Veksler
maxim at vekslers.org
Sat Oct 28 11:44:51 CDT 2017
Hey Guys,
The big data and machine learning world is dominated by Python, Scala and R.
I'm a Swifter at heart, but not so much by tools of the trade.
I'd appreciate a constructive discussion on how that could be changed.
While R is a non-goal for obvious reasons, I'd argue that since both Scala
and Python are general-purpose languages, taking them on head to head might
be low-hanging fruit.
To support the claim, I'd like to point to projects such as:
- Hadoop, Spark and Hive, which are all huge ecosystems that are entirely
JVM based.
- Apache Parquet, a highly efficient column-based storage format for big
data analytics, which was implemented in Java and C++.
- Apache Arrow, a physical memory spec that big data systems can use to
avoid transforming data transferred between systems, and which (for
obvious reasons) focused on JVM-to-C interoperability.
There is also Python's buffer protocol, which ensures its predominance (for
the time being) as a prime candidate for data science related projects:
https://jeffknupp.com/blog/2017/09/15/python-is-the-fastest-growing-programming-language-due-to-a-feature-youve-never-heard-of/
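For anyone who hasn't run into the buffer protocol, here is a minimal
sketch of the idea using only the Python standard library. The variable
names are mine and purely illustrative; nothing here is taken from the
linked article. It shows two objects sharing one block of memory without
a copy, which is the property that lets native libraries exchange large
arrays cheaply.

```python
import array

# array.array stores its elements in one contiguous, typed block of memory
# and implements the buffer protocol.
samples = array.array("d", [1.0, 2.0, 3.0, 4.0])

# memoryview exposes that same memory without copying it.
view = memoryview(samples)

# Slicing a memoryview yields another view over the same memory, not a copy.
tail = view[2:]

# Writing through the view changes the original array.
tail[0] = 30.0

print(samples.tolist())
print(view.format, view.itemsize, view.nbytes)
```

Because `tail` is a view rather than a copy, the write is visible through
`samples`. Any Swift equivalent would need a similarly cheap, safe way to
hand out typed views over memory it does not own.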
While Swift's Memory Ownership manifesto touches similar turf, discussing
copy-on-write and optimizing memory access overhead, it IMHO takes a
system-level perspective, targeting projects such as kernel code. I'd
suggest that viewing the problem from the perspective of an efficient
CPU/GPU data crunching machine might shed a different light on the
requirements and use cases.
I'd be happy to learn more, and have a constructive discussion on the
subject.
Thank you,
Max.
--
puıɯ ʎɯ ɯoɹɟ ʇuǝs