Python libraries for finance: Six of the best
Python is a popular language in finance, but there is only so much you can do with the core language alone. To help you out, Python ships with a standard library of around 200 built-in modules. For example, if you wanted to calculate a discount curve you would need the exponential and logarithmic functions, which can be found in the built-in ‘math’ module. Still, this is well short of what a financial quant would need in order to do data analysis, backtesting, option pricing, or machine learning. For serious work you’re going to need to import some of the thousands of open source third-party libraries that are available for Python. Here are half a dozen of the most popular.
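As a minimal sketch of the discount curve example, using nothing but the ‘math’ module: this assumes a flat, continuously compounded zero rate, and the 3% rate and the helper names discount_factor and zero_rate are purely illustrative.

```python
import math


def discount_factor(rate, years):
    """Discount factor for a continuously compounded zero rate."""
    return math.exp(-rate * years)


def zero_rate(df, years):
    """Invert a discount factor back into a continuously compounded rate."""
    return -math.log(df) / years


# Hypothetical flat 3% curve, sampled at a few maturities (in years).
curve = {t: discount_factor(0.03, t) for t in (0.5, 1, 2, 5, 10)}

for t, df in curve.items():
    print(f"{t:>4} years: discount factor {df:.4f}")
```

This is fine for a handful of points, but looping over thousands of cash flows in pure Python quickly becomes slow, which is where the libraries below come in.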
NumPy is the starting point for financial Pythonistas, and you will struggle to find a Python installation that doesn’t have it. KDnuggets says it was the 7th most popular library in 2018. The NumPy library allows you to manipulate arrays and matrices, and also implements functions for random number generation, which is used by ensemble machine learning techniques such as boosting and bagging. Importantly, much of the core code is written in C, making up for the relative slowness of Python.
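A rough sketch of what this looks like in practice, assuming simulated daily returns for three made-up assets (all figures are invented for illustration). The final step is a bootstrap resample, the kind of random sampling with replacement that bagging relies on.

```python
import numpy as np

rng = np.random.default_rng(seed=42)  # seeded so the run is repeatable

# Simulated daily returns for 3 hypothetical assets over 250 trading days.
returns = rng.normal(loc=0.0005, scale=0.01, size=(250, 3))

# Vectorised statistics: no explicit Python loops needed.
mean_returns = returns.mean(axis=0)
cov_matrix = np.cov(returns, rowvar=False)

# Daily volatility of an equally weighted portfolio via matrix multiplication.
weights = np.full(3, 1 / 3)
portfolio_vol = np.sqrt(weights @ cov_matrix @ weights)

# Bootstrap resample of the days (sampling rows with replacement).
sample_idx = rng.integers(0, len(returns), size=len(returns))
bootstrap_sample = returns[sample_idx]

print("Mean daily returns:", mean_returns)
print("Portfolio daily volatility:", portfolio_vol)
```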
SciPy builds on the basic functions provided by NumPy. It has a wide variety of functions which are vitally important for working with financial data, covering techniques such as linear algebra, signal processing, statistics, interpolation, and optimisation.
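To give a flavour of two of these techniques, here is a small sketch that interpolates a zero curve and then solves for a bond’s yield to maturity. The rates, the bond terms and the helper name price_error are all assumptions made up for the example.

```python
import numpy as np
from scipy.interpolate import CubicSpline
from scipy.optimize import brentq

# Hypothetical zero rates observed at a few maturities (in years).
maturities = np.array([1.0, 2.0, 5.0, 10.0])
zero_rates = np.array([0.020, 0.023, 0.028, 0.031])

# Interpolation: estimate the 3 year rate from the observed points.
curve = CubicSpline(maturities, zero_rates)
rate_3y = float(curve(3.0))


def price_error(y, price=102.0, coupon=4.0, face=100.0, years=5):
    """Present value of an annual coupon bond at yield y, minus its market price."""
    cash_flows = [coupon] * (years - 1) + [coupon + face]
    pv = sum(cf / (1 + y) ** t for t, cf in enumerate(cash_flows, start=1))
    return pv - price


# Root finding: the yield at which the pricing error is zero,
# searched for between 0% and 100%.
ytm = brentq(price_error, 0.0, 1.0)

print(f"Interpolated 3y rate: {rate_3y:.4%}")
print(f"Yield to maturity:    {ytm:.4%}")
```

Because the made-up bond trades above its face value, the solved yield comes out below the 4% coupon rate.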
Once you have your data, you’re probably going to want to look at it. This is where Matplotlib comes in. It’s a visualisation module, allowing you to plot almost any conceivable graph or chart, in 2 or 3 dimensions. Be warned - it takes a while to get used to Matplotlib’s API. The interface deliberately mimics MATLAB functionality, making it very irritating for native Python programmers. To address this Matplotlib actually has two APIs, but this just ends up being confusing. Nevertheless, it’s worth making the effort to learn how to use this powerful package.
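The two APIs are easier to tell apart side by side. The sketch below draws the same made-up price path twice, once with the MATLAB-style pyplot functions and once with the object-oriented interface. The ‘Agg’ backend and the output file names are assumptions chosen so the example runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is needed

import matplotlib.pyplot as plt
import numpy as np

# Made-up price path: 250 days of random returns compounded from 100.
rng = np.random.default_rng(seed=1)
prices = 100 * np.cumprod(1 + rng.normal(0, 0.01, size=250))

# 1. MATLAB-style pyplot interface: each call acts on an implicit
#    "current" figure that Matplotlib tracks behind the scenes.
plt.figure()
plt.plot(prices)
plt.title("pyplot interface")
plt.savefig("prices_pyplot.png")
plt.close()

# 2. Object-oriented interface: you hold the figure and axes yourself
#    and call methods on them directly.
fig, ax = plt.subplots()
ax.plot(prices)
ax.set_title("Object-oriented interface")
ax.set_xlabel("Trading day")
ax.set_ylabel("Price")
fig.savefig("prices_oo.png")
plt.close(fig)
```

Mixing the two styles in one script is where most of the confusion comes from, so picking one and sticking to it helps.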
Pandas builds on NumPy and is a widely used library for data manipulation and analysis. In particular it’s perfect for manipulating time series data, so it is absolutely essential for analysing the movements of prices in financial markets. Indeed, pandas was originally created by developers inside the giant quant fund AQR, before being released as an open source project.
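A short sketch of the kind of time series work this makes easy, assuming a made-up series of daily closing prices (the dates, the 20 day window and the 252 day annualisation factor are illustrative choices):

```python
import numpy as np
import pandas as pd

# Made-up daily closing prices indexed by business day.
dates = pd.bdate_range("2019-01-01", periods=120)
rng = np.random.default_rng(seed=7)
close = pd.Series(
    100 * np.cumprod(1 + rng.normal(0, 0.01, size=120)),
    index=dates,
    name="close",
)

# Daily percentage returns; the first value is undefined, so drop it.
daily_returns = close.pct_change().dropna()

# Rolling 20 day volatility, annualised assuming 252 trading days a year.
rolling_vol = daily_returns.rolling(window=20).std() * np.sqrt(252)

# Resample to the last close of each calendar week.
weekly_close = close.resample("W").last()

print(rolling_vol.tail())
print(weekly_close.head())
```

Everything stays aligned on the date index, so there is no manual bookkeeping of which return belongs to which day.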
Together NumPy, SciPy, Matplotlib, and pandas form part of the ‘NumPy / SciPy stack’. Installing this stack can be a little fiddly, due to the many dependencies. One solution is the Anaconda distribution, which installs these packages (and many more) and lets you use them in a neat virtual environment. It also includes the popular Jupyter notebook, which allows you to run Python examples inside your web browser.
However, Anaconda installs over 200 libraries, so it’s not exactly lightweight. This would make deploying a Python-based system for trading or risk management on a cloud computer or cluster an expensive business. An alternative is the more minimalist ‘Miniconda’.
Another module that comes in the Anaconda distribution is scikit-learn, which was the most popular Python machine learning package in 2018. It is built on top of SciPy and NumPy, with visualisation driven by Matplotlib. It handles most common machine learning techniques, including classification and clustering.
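As a hedged illustration of both techniques, the sketch below fits a classifier and a clustering model to synthetic data. The data stands in for, say, features that might predict an up or down day; it is generated at random and says nothing about real markets. The choice of a random forest and of three clusters is arbitrary.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic two-class data set: 500 samples, 6 features, 3 of them informative.
X, y = make_classification(
    n_samples=500, n_features=6, n_informative=3, random_state=0
)

# Hold back a quarter of the data to test on.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Classification: a random forest, which is itself a bagging-style ensemble.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)

# Clustering: group the same observations into 3 clusters, ignoring the labels.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)

print(f"Out-of-sample accuracy: {accuracy:.2%}")
print("First ten cluster labels:", labels[:10])
```

Most scikit-learn models share the same fit, predict and score pattern, so swapping in a different technique is usually a one line change.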
A large number of quant finance professionals still work in structuring and valuation. Fortunately for them there is a Python-wrapped version of the widely used QuantLib C++ library for valuing and calculating the risk of financial derivatives. Because it has C++ at its core it is very fast. But because QuantLib isn’t a native Python library, and there is no Python-specific documentation, there is a steep learning curve to get it working. However, a pure Python derivative pricing library would be far too slow for on-demand pricing on a hectic trading desk.
Robert Carver is the former head of fixed income at quantitative hedge fund AHL, a former exotic options trader at Barclays investment bank, and the author of ‘Systematic Trading’ and ‘Smart Portfolios’. Robert currently trades his own capital, using a fully automated Python-based system that he developed himself.