MIT-Created Compiler Speeds up Python Code (2024)

Python is a popular, beginner-friendly language. It’s also an interpreted language, which makes it easy to use but slower than a compiled language such as C or C++. At large scale, that becomes a problem, as Ariya Shajii, an MIT CSAIL Ph.D. graduate, and his colleague Ibrahim Numanagić noticed when working in genomics, which involves large data sequences.

They realized that previous efforts to create faster versions of Python relied on a top-down approach: start with the traditional implementation, then try to make it faster with just-in-time compilation, which compiles the code as the program runs, Shajii said.

“The clear advantage of that is you can get a lot of backwards compatibility, but you’re really limited in the types of things you can do,” Shajii told The New Stack. “For example, Python has this thing called a global interpreter lock, which basically prevents you from doing parallel or multithreaded applications. And that’s a big problem if you really want high performance.”

Instead, Shajii and Numanagić took a bottom-up approach, implementing everything from the ground up, independent of the standard Python implementation, he said. That led them to an unusual approach: compiling Python with Codon, a tool they created with an MIT team.

“It gives you a lot more flexibility to do interesting things and generate optimized code, and things like that,” Shajii said. “That’s why we’re able to get such a better performance than some of these other compilation approaches, which maybe get 2 to 4 times, for example, but with Codon it’s usually like 10 to 100 times.”

The MIT team tested Codon on approximately ten commonly used genomics applications written in Python, compiling each with Codon. The team achieved speed-ups of five to ten times over the original hand-optimized implementations.

Codon’s Origin Story

Originally, Shajii and Numanagić planned to build a domain-specific language for genomics, since that was their background. What they found, however, is that people didn’t want to learn a new and specialized language — they like Python.

“That’s why we just made everything as Pythonic as possible,” he said. “Then over time, we just closed the gaps farther and farther to the point where we had sort of a general Python, sort of Python replacement pretty much.”

The team then refactored their tool into the Codon compiler by moving all of its genomics-specific libraries, data structures, and sequence-handling methods into an extension. This approach allows Codon to support other domain-specific languages, which are programming languages with higher-level abstractions for a specific class of problems, all wrapped in a comfortable Python-like environment.

“The whole system is extensible with plugins, so you can write a plugin that has new libraries, new compiler optimizations; you can even add new keywords to the language if you want it to, or new syntax,” Shajii said. “But from the user standpoint, they’re still writing very high-level Pythonic code.”

One of the first puzzles the team had to solve was how to feed the compiler Python code. The compiler’s first step is to perform “type checking,” a process in which it works out the data type (string, integer, floating-point number, and so on) of each variable and function. In regular Python, that information is resolved as the program runs, which is one of the reasons Python is slow. Codon does this type checking before the program runs. Doing so allows the compiler to convert the code to native machine code, avoiding the overhead of dealing with data types at runtime.
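To make the runtime cost concrete, here is a minimal sketch in plain Python. The function names are hypothetical and nothing here is Codon-specific. The first function can mean three different things depending on what it is handed, so the interpreter has to inspect the operands on every call. The second has fixed types throughout, which is the kind of code a type checker could plausibly resolve before the program runs.

```python
def combine(a, b):
    # The interpreter decides what "+" means for these operands at call time.
    return a + b


int_result = combine(2, 3)          # integer addition
float_result = combine(2.5, 0.5)    # floating-point addition
str_result = combine("AC", "GT")    # string concatenation


def gc_count(seq: str) -> int:
    # Types are fixed here (str in, int out), so each operation in the loop
    # has a single meaning that can be determined without running the code.
    total: int = 0
    for base in seq:
        if base in "GC":
            total += 1
    return total


print(int_result, float_result, str_result, gc_count("GATTACA"))
```

The annotations on `gc_count` are only there for the reader; the article does not say whether Codon requires them.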

They then focused on optimizations in the compiler.

“If you’re working with the genomics plugin, for example, that will do its own set of optimizations that are specific to that computing domain, which involves working with genomic sequences and other biological data, for example. The result? An executable file that runs at the speed of C or C++, or even faster once domain-specific optimizations are applied,” MIT stated.

Shajii and the team published a paper detailing how Codon works.

Compiling Python Caveats

There are a few caveats with compiling Python, however. Codon does not support dynamically changing data types at runtime, for instance.
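The article does not list the exact constructs involved, so the following is only an assumption about what “dynamically changing data types” covers. It is ordinary Python showing two common patterns: a name that is rebound to a value of a different type, and a list that mixes types. The helper `describe` is hypothetical.

```python
def describe(value) -> str:
    # Report the runtime type of whatever object was passed in.
    return type(value).__name__


# Pattern 1: one name, two types over its lifetime.
record = 42
first = describe(record)
record = "forty-two"
second = describe(record)

# Pattern 2: a single collection holding values of several types.
mixed = [1, "two", 3.0]
mixed_types = [describe(item) for item in mixed]

print(first, second, mixed_types)
```

Both patterns are legal in standard Python because types are attached to values, not to names. A compiler that fixes types ahead of time would have to reject or restrict code like this.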

“We said, okay, we’re targeting scientific applications, and it’s rare to do stuff like that, so let’s just like shift our focus to statically analyzable things,” Shajii explained. “So some of those dynamic features we don’t support.”

Some of these omitted features are on Codon’s roadmap and some aren’t. For instance, many standard library modules aren’t supported yet, but the MIT team is working on them.

“It’s a huge, huge library, but we’ve tried to implement the main ones that we typically see used […] in the kinds of applications that we’re targeting,” he said.

There are also data type differences. For example, integers in Codon are 64-bit, whereas in Python they’re “arbitrarily long,” he said.
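The practical consequence can be sketched in plain Python. The helpers `fits_int64` and `wrap_int64` are hypothetical, and the wraparound shown is the usual two’s-complement behavior of fixed-width integers. The article does not say what Codon does on overflow, so treat that part as an assumption.

```python
INT64_MIN = -(2 ** 63)
INT64_MAX = 2 ** 63 - 1


def fits_int64(value: int) -> bool:
    # True when the value can be stored in a signed 64-bit integer.
    return INT64_MIN <= value <= INT64_MAX


def wrap_int64(value: int) -> int:
    # Reduce an arbitrary Python int to the signed 64-bit range using
    # two's-complement wraparound (assumed behavior, for illustration only).
    return (value + 2 ** 63) % 2 ** 64 - 2 ** 63


# Python ints simply keep growing past the 64-bit limit.
big = INT64_MAX + 1

print(fits_int64(INT64_MAX), fits_int64(big), wrap_int64(big))
```

Code that relies on very large integers, such as factorials or some hashing and cryptographic arithmetic, is where this difference is most likely to matter.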

Also, while Codon is designed to help projects scale up, don’t expect a seamless experience yet.

“Larger code bases, you’ll probably end up having some [of the] incompatibilities that I mentioned. So, you know, oftentimes we give you error messages: that you need to go and change this, or [we] don’t support this yet,” he said.

There are other ways to use Codon in larger Python applications, he said, noting that there is a decorator that lets developers compile one particular function (say, a bottleneck) while everything else stays in Python.
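A minimal sketch of that pattern follows. It assumes the `codon` Python package is installed and exposes a `jit` decorator, as the project’s documentation describes. The article itself does not name the decorator, so treat the import as an assumption. If the package is missing, the sketch falls back to a hypothetical no-op decorator so the code still runs as ordinary Python. The function `hamming_distance` is an illustrative stand-in for a bottleneck.

```python
try:
    import codon  # assumption: provided by the Codon project's Python package

    compile_hot = codon.jit
except ImportError:
    def compile_hot(func):
        # Fallback: leave the function as regular interpreted Python.
        return func


@compile_hot
def hamming_distance(a: str, b: str) -> int:
    # Count positions at which two equal-length sequences differ.
    total = 0
    for x, y in zip(a, b):
        if x != y:
            total += 1
    return total


# The rest of the application stays in regular Python.
print(hamming_distance("GATTACA", "GACTATA"))
```

Only the decorated function is a candidate for compilation, so a large code base can adopt the tool one hot spot at a time.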

“That’s to address this problem of an all-or-nothing approach,” he said. “Often, if you have some Python application, what people would typically do is they would write the really performance-critical pieces of that in C; or Cython, for example, is another tool that’s used for that. So we’re releasing something pretty soon that lets you do that same thing in Codon, so you never have to leave the Python environment, which, again, is sort of the underlying theme of all this.”

Codon’s Coming Soon: WebAssembly and More

Codon was released in December and is at version 0.15. It’s available free of charge for academic or personal use.

The team wants to incorporate several dynamic features and expand its Python library coverage. There’s one planned feature, however, that may appeal to frontend and web developers: support for compiling to WebAssembly.

“We use LLVM as a backend. LLVM is a very common sort of compiler infrastructure/framework that a lot of compilers use, and LLVM has support for WebAssembly,” he said. “So one of the things that we plan to add support for is WebAssembly for Codon, so [that] you can take a Python program and compile it to WebAssembly.”

Loraine Lawson is a veteran technology reporter who has covered technology issues from data integration to security for 25 years. Before joining The New Stack, she served as the editor of the banking technology site Bank Automation News.