Faster Parallel Python Without Python Multiprocessing

  • I am a terrible Python scriptwriter. Still, I use my own scripts, which I have written through trial and error and by reading a lot of Stack Overflow. I process about 500 images per day, and each one takes about 30 seconds. Adrian Rosebrock has been a real lifesaver. I have a machine dedicated to this. I tried using multiprocessing once and could not get it to work. Being able to process images in parallel would be a game changer for me.

    The beauty of Python for someone like me is that I can get my job done without actually having to do it. I free up my own time to leverage more of my creativity and have a multiplicative effect on my productivity.

    The reason I bring all of this up is that so many of the examples I see for advanced libraries are geared towards seasoned software engineers. The examples use made-up arrays of data so that you can "just see" how to use the library. I don't think anyone realizes how confusing this is to the guy who is a restaurant manager, or the gal who is a researcher and just needs to know how to make this work for them.

    If it's an image, how about putting image = cv2.imread(r"C:\image.jpg") or whatever?

    Anyway, a bit of a rant but there are people who are very thankful that smart people in this world like yourself write libraries we can use to make our daily lives better. Including example code that is stupid simple would make me so much happier.
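
    In the spirit of the "stupid simple" example requested above, here is a minimal sketch of running one function over a folder of image files in parallel. It uses the standard library's concurrent.futures rather than Ray, so it is not the article's method. The folder name "images", the worker count, and the body of process_image are all assumptions for illustration: the placeholder only reads each file and reports its size, and would be replaced by a cv2.imread call plus the real processing.

```python
from concurrent.futures import ProcessPoolExecutor
from pathlib import Path


def process_image(path):
    """Stand-in for the real per-image job (hypothetical placeholder).

    Replace the body with your own work, for example
    image = cv2.imread(str(path)) followed by your processing.
    Here we only read the file and return its size in bytes.
    """
    data = Path(path).read_bytes()
    return str(path), len(data)


def process_all(paths, workers=4, executor_cls=ProcessPoolExecutor):
    """Run process_image on every path, several at a time."""
    with executor_cls(max_workers=workers) as pool:
        # map() hands each path to a free worker and
        # returns the results in the same order as the input.
        return list(pool.map(process_image, paths))


if __name__ == "__main__":
    # Hypothetical folder; point this at wherever your images live.
    image_paths = sorted(Path("images").glob("*.jpg"))
    for name, size in process_all(image_paths):
        print(name, size)
```

    The if __name__ == "__main__": guard matters on Windows, where each worker process re-imports the script; leaving it out is a common reason multiprocessing-style code fails to start there. As rough arithmetic, 500 images at 30 seconds each is about 4.2 hours of serial work, so four workers could bring that close to an hour, assuming the job is CPU-bound and the machine has at least four free cores.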

  • While I appreciate the authors' efforts and believe in the long-term mission, they do not seem to mention some key shortcomings of Ray anywhere, while marketing it pretty hard (e.g., see the paper).

    I used Ray a year ago in one of the advertised basic applications: parallelising the environments for RL. It was unusable back then, as it kept clogging up the memory.

    The Plasma store, which is the backend for Arrow, was never cleaned up, so the computation stopped after 3 hours.

    Here’s the issue:

    https://github.com/ray-project/ray/issues/2128

    Or perhaps this has been fixed already?

  • Every time I see a Python performance issue or solution, I wonder why no company maintains a distribution with a high-performance Python JIT compiler and patched, GIL-free packages. Given the prevalence of Python and what has succeeded on the JVM, it seems like a fruitful business.

  • I love Python, but the best way to do parallel Python is to use Go.