Available Groups

Like individual benchmarks (see “Available benchmarks” below), benchmark groups are allowed after the -b option. Use python3 -m performance list_groups to list groups and their benchmarks.

Available benchmark groups:

  • 2n3: Benchmarks compatible with both Python 2 and Python 3
  • all: Group including all benchmarks
  • apps: “High-level” applicative benchmarks (2to3, Chameleon, Tornado HTTP)
  • default: Group of benchmarks run by default by the run command
  • math: Floating-point and integer benchmarks
  • regex: Collection of regular expression benchmarks
  • serialize: Benchmarks on pickle and json modules
  • startup: Collection of microbenchmarks focused on Python interpreter start-up time
  • template: Templating libraries


There is also a disabled threading group: a collection of microbenchmarks for Python’s threading support. These benchmarks come in pairs: an iterative version (iterative_foo) and a multithreaded version (threaded_foo).

Available Benchmarks

In performance 0.5.5, the following microbenchmarks were removed because they are too short, not representative of real applications, and too unstable.

  • call_method_slots
  • call_method_unknown
  • call_method
  • call_simple
  • logging_silent
  • pybench


Run the 2to3 tool on the performance/benchmarks/data/2to3/ directory, which contains a copy of the django/core/*.py files of Django 1.1.4 (9 files).

Run the python -m lib2to3 -f all <files> command where python is sys.executable. So the test does not only measure the performance of Python itself, but also the performance of the lib2to3 module, which can change depending on the Python version.


Files are named .py.txt instead of .py so that PEP 8 checks are not run on them and, more generally, so that they are not modified.


Render a template using the chameleon module to create an HTML table of 500 rows and 10 columns.

See the chameleon.PageTemplate class.


Create chaosgame-like fractals. Command line options:

--thickness THICKNESS
                      Thickness (default: 0.25)
--width WIDTH         Image width (default: 256)
--height HEIGHT       Image height (default: 256)
--iterations ITERATIONS
                      Number of iterations (default: 5000)
--filename FILENAME.PPM
                      Output filename of the PPM picture
--rng-seed RNG_SEED   Random number generator seed (default: 1234)

When the --filename option is used, the timing includes the time to create the PPM file.

Copyright (C) 2005 Carl Friedrich Bolz

Chaos game, bm_chaos benchmark

Image generated by bm_chaos (took 3 sec on CPython 3.5) with the command:

python3 performance/benchmarks/ --worker -l1 -w0 -n1 --filename chaos.ppm --width=512 --height=512 --iterations 50000


Benchmark a pure-Python implementation of the AES block cipher in CTR mode using the pyaes module.

The benchmark is slower on CPython 3 than on CPython 2.7, because CPython 3 no longer has a “small int” type (int). The CPython 3 int type always has an arbitrary size, like the CPython 2.7 long type.

See pyaes: A pure-Python implementation of the AES block cipher algorithm and the common modes of operation (CBC, CFB, CTR, ECB and OFB).


DeltaBlue benchmark

Ported for the PyPy project. Contributed by Daniel Lindsley

This implementation of the DeltaBlue benchmark was directly ported from V8’s source code, which was in turn derived from the Smalltalk implementation by John Maloney and Mario Wolczko. The original JavaScript implementation was licensed under the GPL.

It’s been updated in places to be more idiomatic to Python (for loops over collections, a couple magic methods, OrderedCollection being a list & things altering those collections changed to the builtin methods) but largely retains the layout & logic from the original. (Ugh.)


Use the Django template system to build a 150x150-cell HTML table.

Use Context and Template classes of the django.template module.


Iterate over the commits of the asyncio Git repository using the Dulwich module. The benchmark uses the performance/benchmarks/data/asyncio.git/ repository.

Pseudo-code of the benchmark:

import dulwich.repo
repo = dulwich.repo.Repo(repo_path)
head = repo.head()
for entry in repo.get_walker(head):
    pass  # no work per commit

See the Dulwich project.


The Computer Language Benchmarks Game:

Contributed by Sokolov Yura, modified by Tupteq.


Artificial, floating point-heavy benchmark originally used by Factor.

Create 100,000 point objects which compute math.cos(), math.sin() and math.sqrt().

Changed in version 0.5.5: Use __slots__ on the Point class to focus the benchmark on float rather than testing performance of class attributes.
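
As a rough illustration of the __slots__ change, the sketch below defines a hypothetical Point class. The attribute names, the formulas and the norm() helper are assumptions made for this example; they are not taken from the benchmark's source.

```python
import math


class Point:
    # __slots__ removes the per-instance __dict__, so attribute access
    # is cheaper and the float math takes a larger share of the time.
    __slots__ = ("x", "y", "z")

    def __init__(self, i):
        # Hypothetical formulas; they only exercise sin() and cos().
        self.x = math.sin(i)
        self.y = math.cos(i) * 3
        self.z = (self.x * self.x) / 2

    def norm(self):
        # Euclidean length of the point, exercising sqrt().
        return math.sqrt(self.x ** 2 + self.y ** 2 + self.z ** 2)


points = [Point(i) for i in range(1000)]
total = sum(p.norm() for p in points)
```

Because the class declares __slots__, instances have no __dict__, which is what shifts the measured cost away from attribute handling.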


Render a template using Genshi (genshi.template module):

  • genshi_text: Render an HTML template using the NewTextTemplate class
  • genshi_xml: Render an XML template using the MarkupTemplate class

See the Genshi project.


Artificial intelligence playing the Go board game. Use Zobrist hashing.
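
For readers unfamiliar with Zobrist hashing, here is a minimal sketch of the idea. The board size, the seed and the toggle() helper are assumptions for illustration and do not come from the benchmark.

```python
import random

SIZE = 9              # assumed board size for this sketch
BLACK, WHITE = 1, 2

rng = random.Random(1234)   # fixed seed so the key table is reproducible
# One random 64-bit key per (square, stone colour) pair.
KEYS = [[rng.getrandbits(64) for _ in (BLACK, WHITE)]
        for _ in range(SIZE * SIZE)]


def toggle(board_hash, square, colour):
    """Add or remove a stone; XOR is its own inverse."""
    return board_hash ^ KEYS[square][colour - 1]


h = 0
h = toggle(h, 40, BLACK)                  # place a black stone
h = toggle(h, 41, WHITE)                  # place a white stone
h_after_capture = toggle(h, 41, WHITE)    # remove the white stone again
```

The hash is updated incrementally with one XOR per move, so a position can be compared with earlier ones without rescanning the board.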


Solver for the Hexiom board game (level 25 by default). Command line option:

--level {2,10,20,25,30,36}   Hexiom board level (default: 25)


Get Mercurial’s help screen.

Measure the performance of the python path/to/hg help command using perf.Runner.bench_command(), where python is sys.executable and path/to/hg is the Mercurial program installed in a virtual environment.

bench_command() redirects stdout and stderr to /dev/null.

See the Mercurial project.


Parse the performance/benchmarks/data/w3_tr_html5.html HTML file (132 KB) using html5lib. The file is the HTML 5 specification, but truncated to parse the file in less than 1 second (around 250 ms).

On CPython, after 3 warmups, the benchmark enters a cycle of 5 values: every 5th value is 10% slower. Plot of 1 run of 50 values (the warmup is not rendered):

html5lib values

See the html5lib project.

json_dumps, json_loads

Benchmark the dumps() and loads() functions of the json module. Command line option:

--cases CASES         Comma separated list of cases. Available cases: EMPTY,
                      SIMPLE, NESTED, HUGE. By default, run all cases.
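
The sketch below shows the kind of dumps()/loads() round trip being measured. The payloads are hypothetical stand-ins; the benchmark's real EMPTY, SIMPLE, NESTED and HUGE cases are not reproduced here.

```python
import json

# Hypothetical payloads, named after three of the documented cases.
CASES = {
    "EMPTY": {},
    "SIMPLE": {"key1": 0, "key2": True, "key3": "value", "key4": 3.5},
    "NESTED": {"outer": {"inner": [1, 2, {"deep": None}]}},
}


def round_trip(obj):
    """Serialize with dumps() and parse the result back with loads()."""
    return json.loads(json.dumps(obj))


results = {name: round_trip(obj) for name, obj in CASES.items()}
```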


Benchmarks on the logging module:

  • logging_format: Benchmark logger.warn(fmt, str)
  • logging_simple: Benchmark logger.warn(msg)

Script command line option:


See the logging module.
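
A minimal sketch of the two call shapes, assuming an in-memory stream as the log destination (the benchmark's own handler setup is not shown in this document). It uses logger.warning(), since warn() is a deprecated alias for it.

```python
import io
import logging

stream = io.StringIO()                      # in-memory log destination
logger = logging.getLogger("bench_sketch")  # hypothetical logger name
logger.propagate = False                    # keep output out of the root logger
logger.setLevel(logging.WARNING)
logger.addHandler(logging.StreamHandler(stream))

logger.warning("simple message")            # logging_simple style: msg only
logger.warning("formatted: %s", "value")    # logging_format style: fmt + arg
output = stream.getvalue()
```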


Use the Mako template system to build a 150x150-cell HTML table. Includes:

  • two template inheritances
  • HTML escaping, XML escaping, URL escaping, whitespace trimming
  • function definitions and calls
  • for loops

See the Mako project.


Battle with damages and topological sorting of nodes in a graph.

See Topological sorting.


Solver for Meteor Puzzle board.

Meteor Puzzle board:

The Computer Language Benchmarks Game:

Contributed by Daniel Nanz, 2008-08-21.


N-body benchmark from the Computer Language Benchmarks Game. Microbenchmark on floating point operations.

This is intended to support Unladen Swallow’s benchmark runner. Accordingly, it has been modified from the Shootout version:

  • Accept standard Unladen Swallow benchmark options.
  • Run report_energy()/advance() in a loop.
  • Reimplement itertools.combinations() to work with older Python versions.

Pulled from:

Contributed by Kevin Carson. Modified by Tupteq, Fredrik Johansson, and Daniel Nanz.

python_startup, python_startup_nosite

  • python_startup: Measure the Python startup time, run python -c pass where python is sys.executable
  • python_startup_nosite: Measure the Python startup time without importing the site module, run python -S -c pass where python is sys.executable

Run the benchmark with perf.Runner.bench_command().
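
As a rough stand-in for what such a command benchmark measures, the sketch below times the same two commands with subprocess and time.perf_counter(). It is not perf.Runner.bench_command(); the time_command() helper and its repeat parameter are assumptions for this example.

```python
import subprocess
import sys
import time


def time_command(args, repeat=3):
    """Return the fastest wall-clock time of `args` over `repeat` runs."""
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        # Discard the command's output so it does not affect the timing.
        subprocess.run(args, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=True)
        best = min(best, time.perf_counter() - start)
    return best


startup = time_command([sys.executable, "-c", "pass"])
startup_nosite = time_command([sys.executable, "-S", "-c", "pass"])
```

A real benchmark runner adds warmups, multiple processes and statistics on top of this, so the numbers from this sketch are only indicative.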


Simple, brute-force N-Queens solver.

See Eight queens puzzle.
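
One possible brute-force formulation, given here as a sketch rather than as the benchmark's exact code, enumerates row permutations and keeps those with no shared diagonal. The n_queens() name is chosen for this example.

```python
from itertools import permutations


def n_queens(n):
    """Yield every solution as a tuple: index = column, value = row."""
    cols = range(n)
    for rows in permutations(cols):
        # A permutation already rules out shared rows and columns;
        # the two set sizes check both diagonal directions.
        if (n == len({rows[c] + c for c in cols})
                == len({rows[c] - c for c in cols})):
            yield rows


solutions = list(n_queens(8))
print(len(solutions))  # 92
```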


Test the performance of operations of the pathlib module.

This benchmark stresses the creation of small objects, globbing, and system calls.

On Python 3, use pathlib of the standard library. On Python 2, use the third-party pathlib2 module.

See the Python 3 pathlib module.
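
The sketch below touches the three things the benchmark is said to stress: small Path objects, globbing and system calls. The file names and the temporary directory are assumptions for illustration.

```python
import pathlib
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    base = pathlib.Path(tmp)
    for i in range(5):
        # Each "/" creates a new small Path object; write_text() hits the OS.
        (base / ("file%d.py" % i)).write_text("pass\n")
    (base / "notes.txt").write_text("x")
    py_files = sorted(p.name for p in base.glob("*.py"))   # globbing
    sizes = [p.stat().st_size for p in base.iterdir()]     # one stat() per entry
```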


pickle benchmarks (serialize):

  • pickle: use the cPickle module to pickle a variety of datasets.
  • pickle_dict: microbenchmark; use the cPickle module to pickle a lot of dicts.
  • pickle_list: microbenchmark; use the cPickle module to pickle a lot of lists.
  • pickle_pure_python: use the pure-Python pickle module to pickle a variety of datasets.

unpickle benchmarks (deserialize):

  • unpickle: use the cPickle module to unpickle a variety of datasets.
  • unpickle_list
  • unpickle_pure_python: use the pure-Python pickle module to unpickle a variety of datasets.
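
A minimal pickle/unpickle round trip looks like the sketch below. The dataset is hypothetical. On Python 3 there is no separate cPickle module: the pickle module uses its C accelerator automatically when it is available.

```python
import pickle

# Hypothetical dataset; the benchmark's own data is not reproduced here.
data = {"ints": list(range(10)), "text": "spam", "nested": {"pi": 3.14}}

blob = pickle.dumps(data, protocol=pickle.HIGHEST_PROTOCOL)  # serialize
restored = pickle.loads(blob)                                # deserialize
```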


Calculate 2,000 digits of π. This benchmark stresses big integer arithmetic.

Command line option:

--digits DIGITS     Number of computed pi digits (default: 2000)
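
To show why this kind of computation stresses big integers, here is a sketch of Gibbons' unbounded spigot algorithm. It is a well-known digit generator chosen for illustration and is not necessarily the algorithm the benchmark uses; the state variables grow without bound as more digits are produced.

```python
from itertools import islice


def pi_digits():
    """Yield the decimal digits of pi one at a time (unbounded spigot)."""
    q, r, t, k, n, l = 1, 0, 1, 1, 3, 3
    while True:
        if 4 * q + r - t < n * t:
            # The next digit is settled: emit it and rescale the state.
            yield n
            q, r, n = (10 * q, 10 * (r - n * t),
                       (10 * (3 * q + r)) // t - 10 * n)
        else:
            # Not enough information yet: consume one more series term.
            q, r, t, k, n, l = (q * k, (2 * q + r) * l, t * l, k + 1,
                                (q * (7 * k + 2) + r * l) // (t * l), l + 2)


first_ten = list(islice(pi_digits(), 10))
```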

Adapted from code on:


Benchmark of a pure-Python bzip2 decompressor: decompress the performance/benchmarks/data/interpreter.tar.bz2 file in memory.

Copyright 2006–2007-01-21 Paul Sladen:

You may use and distribute this code under any DFSG-compatible license (eg. BSD, GNU GPLv2).

Stand-alone pure-Python DEFLATE (gzip) and bzip2 decoder/decompressor. This is probably most useful for research purposes/index building; there is certainly some room for improvement in the Huffman bit-matcher.

With the as-written implementation, there was a known bug in BWT decoding to do with repeated strings. This has been worked around; see ‘bwt_reverse()’. Correct output is produced in all test cases but ideally the problem would be found...


Simple raytracer.

Command line options:

--width WIDTH             Image width (default: 100)
--height HEIGHT           Image height (default: 100)
--filename FILENAME.PPM   Output filename of the PPM picture

This file contains definitions for a simple raytracer. Copyright Callum and Tony Garnock-Jones, 2008.

This file may be freely redistributed under the MIT license,


Pure Python raytracer

Image generated by the command (took 68.4 sec on CPython 3.5):

python3 performance/benchmarks/ --worker --filename=raytrace.ppm  -l1 -w0 -n1 -v --width=800 --height=600


Stress the performance of Python’s regex compiler, rather than the regex execution speed.

Benchmark how quickly Python’s regex implementation can compile regexes.

We bring in all the regexes used by the other regex benchmarks, capture them by stubbing out the re module, then compile those regexes repeatedly. We muck with the re module’s caching to force it to recompile every regex we give it.
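
One way to force recompilation, sketched below, is to call re.purge() before each compile. The benchmark's own cache manipulation may differ, and the patterns here are hypothetical.

```python
import re

PATTERNS = [r"\d+", r"[a-z]+@[a-z]+\.com", r"^\s*#.*$"]  # hypothetical


def compile_all(patterns):
    """Compile each pattern from scratch, bypassing the re cache."""
    compiled = []
    for pattern in patterns:
        re.purge()  # clear the module-level cache of compiled patterns
        compiled.append(re.compile(pattern))
    return compiled


regexes = compile_all(PATTERNS)
```

Without the purge, repeated re.compile() calls on the same pattern would mostly measure a cache lookup rather than the compiler.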


regex DNA benchmark using “fasta” to generate the test case.

The Computer Language Benchmarks Game

regex-dna Python 3 #5 program: contributed by Dominique Wahli, 2to3, modified by Justin Peel.

fasta Python 3 #3 program: modified by Ian Osgood, modified again by Heinrich Acker, modified by Justin Peel, modified by Christopher Sean Forgeron.


Some of the original benchmarks used to tune mainline Python’s current regex engine.


Python port of V8’s regex benchmark.

Automatically generated on 2009-01-30.

This benchmark is generated by loading 50 of the most popular pages on the web and logging all regexp operations performed. Each operation is given a weight that is calculated from an estimate of the popularity of the pages where it occurs and the number of times it is executed while loading each page. Finally the literal letters in the data are encoded using ROT13 in a way that does not affect how the regexps match their input.

Ported to Python for Unladen Swallow. The original JS version can be found at, r1243.


The classic Python Richards benchmark.

Based on a Java version.

Based on original version written in BCPL by Dr Martin Richards in 1981 at Cambridge University Computer Laboratory, England and a C++ version derived from a Smalltalk version written by L Peter Deutsch.

Java version: Copyright (C) 1995 Sun Microsystems, Inc. Translation from C++, Mario Wolczko. Outer loop added by Alex Jacoby.



Run a canned mailbox through a SpamBayes ham/spam classifier.

Data files from performance/benchmarks/data directory:

  • spambayes_mailbox: Mailbox file which contains 64 emails
  • spambayes_hammie.pkl: Ham data (serialized by pickle)

See the SpamBayes project.

Status as of 2017-04-29, from Skip Montanaro: while the last commit was pushed in 2011 (svn r3273), the project is not dead: it is still actively used on Windows via the installer and also runs for Python mailing lists. Sadly, it doesn’t support Python 3.


MathWorld: “Hundred-Dollar, Hundred-Digit Challenge Problems”, Challenge #3.

The Computer Language Benchmarks Game

Contributed by Sebastien Loisel. Fixed by Isaac Gouy. Sped up by Josh Goldfoot. Dirtily sped up by Simon Descarpentries. Concurrency by Jason Stitt.

sqlalchemy_declarative, sqlalchemy_imperative

  • sqlalchemy_declarative: SQLAlchemy Declarative benchmark using SQLite
  • sqlalchemy_imperative: SQLAlchemy Imperative benchmark using SQLite

See the SQLAlchemy project.


Benchmark Python aggregate for SQLite.

The goal of the benchmark (written for PyPy) is to test CFFI performance and frequent round trips between SQLite and Python. The queries themselves are therefore really simple.

See the SQLite project and the Python sqlite3 module (stdlib).
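
The sketch below shows a Python aggregate registered with sqlite3, where SQLite calls back into Python once per row. The SumOfSquares class, the table and the query are assumptions for illustration, not the benchmark's code.

```python
import sqlite3


class SumOfSquares:
    """Hypothetical aggregate: step() runs in Python for every row."""

    def __init__(self):
        self.total = 0

    def step(self, value):
        self.total += value * value

    def finalize(self):
        return self.total


conn = sqlite3.connect(":memory:")
conn.create_aggregate("sum_squares", 1, SumOfSquares)
conn.execute("CREATE TABLE t (x INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(1, 11)])
(result,) = conn.execute("SELECT sum_squares(x) FROM t").fetchone()
conn.close()
```

The query is trivial on purpose: the cost is dominated by the per-row transitions between SQLite and the Python step() method.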


Benchmark on the sympy module:

  • sympy_expand: Benchmark sympy.expand()
  • sympy_integrate: Benchmark sympy.integrate()
  • sympy_str: Benchmark str(sympy.expand())
  • sympy_sum: Benchmark sympy.summation()

On CPython, some sympy_sum values are 5%-10% slower:

$ python3 -m perf dump sympy_sum.json
Run 1: 1 warmup, 50 values, 1 loop
- warmup 1: 404 ms (+63%)
- value 1: 244 ms
- value 2: 245 ms
- value 3: 258 ms <----
- value 4: 245 ms
- value 5: 245 ms
- value 6: 279 ms (+12%) <----
- value 7: 246 ms
- value 8: 244 ms
- value 9: 245 ms
- value 10: 255 ms <----
- value 11: 245 ms
- value 12: 245 ms
- value 13: 256 ms <----
- value 14: 248 ms
- value 15: 245 ms
- value 16: 245 ms

Plot of 1 run of 50 values (the warmup is not rendered):

sympy_sum values

See the sympy project.


Telco Benchmark for measuring the performance of decimal calculations:

  • A call type indicator, c, is set from the bottom (least significant) bit of the duration (hence c is 0 or 1).
  • A rate, r, is determined from the call type. Those calls with c=0 have a low r: 0.0013; the remainder (‘distance calls’) have a ‘premium’ r: 0.00894. (The rates are, very roughly, in Euros or dollars per second.)
  • A price, p, for the call is then calculated (p=r*n, where n is the call duration). This is rounded to exactly 2 fractional digits using round-half-even (Banker’s round to nearest).
  • A basic tax, b, is calculated: b=p*0.0675 (6.75%). This is truncated to exactly 2 fractional digits (round-down), and the total basic tax variable is then incremented (sumB=sumB+b).
  • For distance calls: a distance tax, d, is calculated: d=p*0.0341 (3.41%). This is truncated to exactly 2 fractional digits (round-down), and then the total distance tax variable is incremented (sumD=sumD+d).
  • The total price, t, is calculated (t=p+b, and, if a distance call, t=t+d).
  • The total prices variable is incremented (sumT=sumT+t).
  • The total price, t, is converted to a string, s.

The Python benchmark is implemented with the decimal module.
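
The steps above can be sketched for a single call as follows. The price_call() helper is hypothetical, n is assumed to be the call duration in seconds, and the running totals (sumB, sumD, sumT) are left out for brevity.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_EVEN

CENT = Decimal("0.01")   # quantum for exactly 2 fractional digits


def price_call(n):
    """Return (p, b, d, t) for a call lasting n seconds."""
    distance = bool(n & 1)                       # call type c: lowest bit
    r = Decimal("0.00894") if distance else Decimal("0.0013")
    p = (r * n).quantize(CENT, rounding=ROUND_HALF_EVEN)
    b = (p * Decimal("0.0675")).quantize(CENT, rounding=ROUND_DOWN)
    d = Decimal(0)
    if distance:
        d = (p * Decimal("0.0341")).quantize(CENT, rounding=ROUND_DOWN)
    t = p + b + d                                # total price
    return p, b, d, t


print(str(price_call(308)[3]))  # 0.42
print(str(price_call(309)[3]))  # 3.03
```

For n=308 the call type is 0, so p=0.40, b=0.02 and no distance tax applies; for n=309 the premium rate gives p=2.76, b=0.18 and d=0.09.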

See the Python decimal module (stdlib).


Benchmark the HTTP server of the tornado module.

See the Tornado project.


Microbenchmark for unpacking lists and tuples.


a, b, c, d, e, f, g, h, i, j = to_unpack

where to_unpack is tuple(range(10)) or list(range(10)).
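
A quick way to time this statement yourself is sketched below with the standard timeit module. The time_unpack() helper and the repetition count are assumptions for this example; the benchmark itself uses its own runner.

```python
import timeit


def time_unpack(to_unpack, number=100_000):
    """Time `number` repetitions of a 10-way unpacking assignment."""
    return timeit.timeit(
        "a, b, c, d, e, f, g, h, i, j = to_unpack",
        globals={"to_unpack": to_unpack},
        number=number,
    )


tuple_seconds = time_unpack(tuple(range(10)))
list_seconds = time_unpack(list(range(10)))
```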


Benchmark the ElementTree API of the xml.etree module:

  • xml_etree_generate: Create an XML document
  • xml_etree_iterparse: Benchmark etree.iterparse()
  • xml_etree_parse: Benchmark etree.parse()
  • xml_etree_process: Process an XML document

See the Python xml.etree.ElementTree module (stdlib).
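
The sketch below walks through generating, parsing and stream-parsing a small document with the ElementTree API. The document structure is hypothetical and much smaller than anything a benchmark would use.

```python
import io
import xml.etree.ElementTree as etree

# Generate: build a small document in memory.
root = etree.Element("root")
for i in range(3):
    etree.SubElement(root, "item", id=str(i)).text = "value %d" % i
xml_bytes = etree.tostring(root)

# Parse: read the whole document at once.
parsed_root = etree.parse(io.BytesIO(xml_bytes)).getroot()

# Iterparse: stream elements as their end tags are seen.
streamed_ids = [elem.get("id")
                for _, elem in etree.iterparse(io.BytesIO(xml_bytes))
                if elem.tag == "item"]
```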