Profiling your Numba code

  • I still remember discovering and using Numba for the first time in university. At the time I didn't know any programming language other than Python, and wasn't super great at Python either. We had to write and run molecular simulations for our numerical methods and simulations class, and people writing their code in C/C++ were blazing through it. I remember finding out about Numba through the High Performance Python O'Reilly book, and just adding a single @jit decorator made my simulations dramatically faster.

    Amazing stuff, honestly felt like black magic at the time.
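
    For anyone curious what that looks like in practice, here's a minimal sketch of that "single decorator" speedup; the toy pairwise-energy function and sizes are made up for illustration, not taken from the book:

      import numpy as np
      from numba import njit

      @njit  # the "single decorator": njit is shorthand for jit(nopython=True)
      def pairwise_energy(pos):
          # O(n^2) Lennard-Jones-style double loop: painfully slow in pure
          # Python, compiled to machine code by Numba on first call
          n = pos.shape[0]
          e = 0.0
          for i in range(n):
              for j in range(i + 1, n):
                  dx = pos[i, 0] - pos[j, 0]
                  dy = pos[i, 1] - pos[j, 1]
                  dz = pos[i, 2] - pos[j, 2]
                  r2 = dx * dx + dy * dy + dz * dz
                  e += 1.0 / r2**6 - 1.0 / r2**3
          return e

      pos = np.random.rand(500, 3)
      pairwise_energy(pos)  # first call includes JIT compilation; later calls run at compiled speed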

  • Do people use Numba outside of scientific computing? I feel like for most "normal" uses you'd probably be fine writing your code in NumPy or pandas, so I'd love to hear what people generally use Numba for beyond that.

  • I didn't realize Numba supported CUDA:

    https://numba.readthedocs.io/en/stable/cuda/index.html
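
    For reference, the API is pretty lean. Here's a minimal sketch in the style of that docs page (the kernel name, array sizes, and block dimensions are just illustrative); it needs a CUDA-capable GPU and the CUDA toolkit installed:

      import numpy as np
      from numba import cuda

      @cuda.jit
      def add_kernel(x, y, out):
          # one GPU thread per element
          i = cuda.grid(1)
          if i < out.shape[0]:
              out[i] = x[i] + y[i]

      n = 1_000_000
      x = np.arange(n, dtype=np.float32)
      y = 2 * x
      out = np.empty_like(x)

      threads_per_block = 256
      blocks = (n + threads_per_block - 1) // threads_per_block
      # Numba copies the NumPy arrays to the device and back automatically
      add_kernel[blocks, threads_per_block](x, y, out)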

  • This addresses a real pain point: Numba code isn't easy to profile (especially compared to plain Python with line-profiler), so I'll definitely be trying this out!
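
    For context, the plain-Python workflow being compared against looks like this (standard line_profiler usage, not the tool from the article); it reports per-line timings for interpreted Python but can't see inside Numba-compiled functions:

      # pip install line_profiler
      # run with:  kernprof -l -v script.py
      @profile  # the `profile` decorator is injected by kernprof at runtime
      def slow_python(xs):
          total = 0.0
          for x in xs:            # per-line timings show up here...
              total += x ** 0.5   # ...but only for interpreted Python code
          return total

      if __name__ == "__main__":
          slow_python(list(range(100_000)))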

  • I tend to disagree with this sentence from the article:

    > One hypothesis is instruction-level parallelism

    This is Python code, and there's a massive gap between what it says and the actual CPU instructions that get executed. The result of the experiment feels more like something related to the memory cache.
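
    One way to poke at the cache hypothesis (my own sketch, not from the article) is to run the same compiled loop with different access strides: the arithmetic is identical either way, so if the strided versions get disproportionately slower, memory access rather than instruction-level parallelism is the likely bottleneck:

      import time
      import numpy as np
      from numba import njit

      @njit
      def sum_strided(a, stride):
          # every element is visited exactly once regardless of stride;
          # only the memory access pattern changes
          total = 0.0
          for start in range(stride):
              for i in range(start, a.size, stride):
                  total += a[i]
          return total

      a = np.random.rand(20_000_000)
      sum_strided(a, 1)  # warm-up call to trigger compilation

      for stride in (1, 16, 4096):
          t0 = time.perf_counter()
          sum_strided(a, stride)
          print(stride, time.perf_counter() - t0)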