Introduction
|
NumPy consists mainly of compiled, explicitly typed code, which allows optimisations that can’t be done for Python’s interpreted, implicitly typed code
Avoid Python loops and use whole-array functions to get better performance
Compiled code will frequently outperform NumPy. It can be written in a compiled language, or produced by using Numba on Python code.
|
Whole-array operations
|
NumPy applies operations to every element of an array, and has many functions that implement commonly performed tasks across whole arrays
Conditionals can be recast as array masks, which are arrays of true/false values
Use numpy.einsum for more complex array multiplications and reductions, but only where the performance boost is worth the reduced readability
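
The points above can be illustrated with a minimal sketch. The array names and values here are made up for illustration; only the NumPy calls themselves are standard.

```python
import numpy as np

values = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])

# Whole-array arithmetic: every element is squared without a Python loop
squared = values ** 2

# A conditional recast as a mask: an array of True/False values
mask = values > 0
positives = values[mask]               # keep only elements where mask is True
clipped = np.where(mask, values, 0.0)  # replace the other elements with 0.0

# einsum spelling of a matrix product: sum over the shared index j
a = np.arange(6.0).reshape(2, 3)
b = np.arange(12.0).reshape(3, 4)
product = np.einsum("ij,jk->ik", a, b)
```

For a plain matrix product like this one, `a @ b` is easier to read; einsum is more likely to pay off for contractions that have no simple built-in equivalent.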
|
Broadcasting
|
NumPy automatically expands smaller arrays to match the shape of larger ones
Axes are compared from right to left, and each pair must either be the same size or include an axis of size 1
Where one array has more dimensions, the array with fewer dimensions is treated as having size 1 on the additional (leading) axes
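
As a rough sketch of these rules, with shapes chosen purely for illustration:

```python
import numpy as np

grid = np.zeros((4, 3))                 # shape (4, 3)
row = np.array([1.0, 2.0, 3.0])         # shape (3,), treated as (1, 3)
column = np.arange(4.0).reshape(4, 1)   # shape (4, 1)

# Rightmost axes match (3 and 3); row gains a leading axis of size 1
grid_plus_row = grid + row              # result shape (4, 3)

# (4, 1) with (3,): the size-1 axes are stretched to give (4, 3)
outer_sum = column + row

# Rightmost axes are 3 and 4: neither equal nor 1, so broadcasting fails
try:
    grid + np.ones(4)
    compatible = True
except ValueError:
    compatible = False
```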
|
Custom ufuncs
|
|
Generalised ufuncs
|
Use the @guvectorize decorator to turn elemental functions into generalised ufuncs
Both a signature (showing datatypes and dimensionalities) and a layout (showing relationships between indices) are required
Explicitly initialise the output array within your generalised ufunc where possible
|
Compiling regular functions with Numba
|
Use the @jit decorator to just-in-time compile plain Python functions (whether or not they operate on NumPy arrays)
Pass nopython=True to @jit so that an error is raised if something can’t be compiled, which tells you what to fix to get maximum speed
|