When Kevin Modzelewski and his colleagues at Dropbox set out to create Pyston in 2014, they had a very simple objective: to lower the costs of running Python code on Dropbox’s servers, by making the code itself faster.
“We were growing exponentially, so our server cost was growing exponentially,” Modzelewski tells TechRepublic. “If we could get Python running faster, we would spend less money running Python.”
He had realized while working on the language that there was a strong demand for faster Python among the developer community, and while there were plenty of tools around for improving the performance in smaller applications, there were none designed for big, business logic-type applications such as Dropbox.
“There’s a lot of tools out there for helping you run Python faster, but there weren’t any that were a good fit for Dropbox’s use case,” says Modzelewski.
“This was an area of the Python market where a lot of money was being spent, but not very many tools were being developed for helping. It was underserved.”
Fast forward to today, and Pyston is at version 2.2 and has been open-sourced, with Modzelewski and fellow developer Marius Wachtler now leading the project as co-founders. The latest implementation promises a 30% performance improvement over Python 3.8.8, with a key benefit being that developers can simply drop their Python applications into Pyston and get going, without having to rewrite their code. It’s also a “completely separate thing” from what Modzelewski and fellow developers built for Dropbox some seven years ago.
“We very much want you to be able to just drop in Pyston instead of the normal Python, and not have to do a single other thing,” says Modzelewski.
“At the time we started, Dropbox’s code base was millions of lines of code. And you can’t really reasonably rewrite that into another language or annotate the whole thing.”
The goal for Pyston at the outset was to create a Python implementation that could push the language’s performance to a level comparable to that of traditional systems languages like C++.
Yet so many features had been added to Python over the years that it was hard to tell what was important and what could be ditched, says Modzelewski: “I consider myself pretty knowledgeable about Python, and there were several features I didn’t know about until actually having to implement them myself. I was like, ‘I’ve never heard about this, I’ve never read about this, I’ve never heard of anyone using it’.”
The only way to figure out which features were useful to developers and which were not was to simply begin removing them and then wait for feedback. “We wanted to start removing some of them so that we could start getting feedback from people saying, ‘Hey, actually I use that feature and you removed it’,” Modzelewski says.
This aggressive approach to optimization has allowed Pyston to gain significant performance improvements over CPython. While 30% is the figure officially touted, Modzelewski suggests that this is a conservative estimate, because Pyston uses more realistic performance benchmarks that better reflect how developers might actually experience it.
“We measure performance pretty differently than other projects do, so our 30% isn’t directly comparable. If we were to measure performance the same way other people did our number would be higher,” he says.
“Sometimes we think, should we do that? It sounds better to be ‘more percent’ faster. But we try to measure performance similarly to how our users might actually experience it, and so that ends up giving us a lower number, and that’s 30% for us.”
Pyston, meanwhile, is targeting web applications specifically. While there are a lot of tools for speeding up Python code – such as ‘rival’ Python implementation PyPy – Modzelewski explains that these don’t tend to work well for web applications, owing to the number of dependencies.
It’s this gap in the market that Pyston is targeting. “We’re looking to become the go-to way of speeding up Python web applications, by which I mean the components that run on a company’s servers,” he adds.
A faster, more aggressive Python
Python creator Guido van Rossum has pledged to address Python’s performance issues in upcoming versions, with the bold aim of doubling the language’s speed in Python 3.11, one of three Python branches slated for a pre-alpha release in 2022.
Modzelewski says there are similarities in what Python and Pyston are trying to achieve, but notes that any speed improvements in newer versions of Python won’t translate over to Pyston; while the team has back-ported some of the newer Python code, it will continue to target Python 3.8.
Regardless, Modzelewski believes that Pyston can continue to be competitive. “We’ve been a lot more aggressive [than Python] with the types of things we’ll do,” he says, noting that Pyston’s JIT compiler offers an edge in terms of the tools it has at its disposal.
“They’re kind of restricting themselves in what they will put together.”
There’s currently no way of knowing for sure how many developers are using Pyston in the field. The project clearly has a following of enthusiastic and dedicated users, many of whom chime in on the Pyston Discord channel to report bugs or just let the team know how they’re getting on with it.
Modzelewski says the primary goal of the project is to build something that people will get value from. “I think in this space it’s really easy to build something incredibly technically interesting, something intellectually very interesting, but isn’t a useful product to users… We’re very much taking a product approach to this where we want to build something useful, and I think we’re doing that.”
Pyston is still very much in the growing phase. Later down the line, the team hopes it will be able to address Python’s issues with multithreading – a technique by which a program’s work is split across several threads that can run on separate processing cores – though Modzelewski notes that this isn’t on the project’s immediate roadmap.
“I think long term, we really want to be something where the trade-offs are easy enough to accept that a lot of people would use us. There’s a lot of specialized Python optimization tools out there that most people wouldn’t typically use,” he says.
“My dream for the project is that it would just be sort of standard advice: ‘Hey, you might as well use Pyston.’”