The Python programming language is a hit for data science and machine learning projects on high-powered hardware, but one of its weaknesses is speed.
Anaconda, a company that provides a leading distribution of Python for data science, wants to change that by supporting Pyston — a new implementation of Python that sheds debugging features for speed.
Pyston, created by Kevin Modzelewski, was open-sourced in May with the promise of a 30% acceleration in Python code. Modzelewski was an engineer at Dropbox, which was a major user of Python and hired the language’s creator Guido van Rossum for five years from 2013 to improve its code.
Anaconda has now hired Modzelewski and fellow lead developer Marius Wachtler, who have been tasked with building the project’s community of users, contributors, and maintainers to ensure its long-term sustainability.
“Support from Anaconda will enable us to put Pyston into the hands of more users faster than ever before,” said Modzelewski in a statement. Anaconda claims to have more than 25 million users.
Pyston executes programs on average 20% to 50% faster than standard Python, according to Anaconda.
The Python implementation was developed at Dropbox between 2014 and 2017. It was launched in 2020 as a new project, Pyston v2.
Pyston, which is derived from the official CPython from the Python Software Foundation, will remain an open-source project. With Anaconda, the project will focus on two goals: improving compatibility with the legion of Python packages that have helped make the language dominant in data science and machine learning, and bringing Pyston to more hardware.
“The new Pyston 2.x series is a complete rewrite of the codebase from scratch, starting from a fork of CPython 3.8,” Anaconda says in a blog post detailing plans for Pyston to become a general-purpose accelerator of all Python applications.
Anaconda co-founder Peter Wang told ZDNet recently that it was “incredibly awkward to use Python to build and distribute any applications that have actual graphical user interfaces.”
“On desktops, Python is never the first-class language of the operating system, and it must resort to third-party frameworks like Qt or wxPython,” he said.
Besides data science, Python’s strengths are in tying together backend systems.
Van Rossum, who is now employed by Microsoft, is trying to make Python twice as fast in version 3.11, one of three Python branches planned for 2022. The latest stable version of Python is version 3.9.7.
Anaconda has already been involved in Python optimization, scalability, and performance projects.
“One of Anaconda’s oldest open-source projects is the Numba compiler, an LLVM-based JIT compiler for numerical Python functions running on the CPU or GPU. As a result, we’ve been thinking about Python compilers for a long time, and we see the potential for Pyston to quickly bring faster Python to a mainstream audience.
“Numba addresses many numerical use cases very well but cannot optimize entire programs, and it does not address the wider world of Python use cases. Pyston comes at the Python compilation problem from a different direction. Still, common ancestry with the CPython interpreter means that Numba ‘just works’ with it, and the two systems can be used in tandem within the same program. Numba can speed up individual functions by 2-10x (or more), and Pyston can improve the performance of everything else.”
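To make that division of labor concrete, here is a minimal sketch of how the two might sit side by side in one program. It is an illustration, not code from Anaconda or the Pyston project: the function names `sum_of_squares` and `report` are hypothetical, and the script assumes Numba is installed (it falls back to a do-nothing decorator if it is not, in which case there is no speedup). The numerical loop is handed to Numba's `njit` decorator, while the surrounding ordinary Python is left to whichever interpreter runs the script, whether CPython or Pyston.

```python
try:
    # Numba is a third-party package; assumed to be installed.
    from numba import njit
except ImportError:
    def njit(func):
        # Fallback so the sketch still runs without Numba (no speedup).
        return func


@njit
def sum_of_squares(n):
    # Tight numerical loop: the kind of function Numba compiles.
    total = 0.0
    for i in range(n):
        total += i * i
    return total


def report(limits):
    # Ordinary Python (dict building, iteration) that Numba does not
    # touch; its speed depends on the interpreter running the script.
    return {n: sum_of_squares(n) for n in limits}


if __name__ == "__main__":
    for n, value in report([10, 1000]).items():
        print(f"sum of squares below {n}: {value}")
```

Any actual speedup depends on the workload and on the interpreter used, so the sketch makes no performance claim of its own.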
Anaconda also reckons Pyston improvements can be upstreamed to CPython and dovetail with van Rossum’s plans at Microsoft to significantly speed up Python.