Onyx: fault tolerant data processing for Clojure

  • This looks very interesting. I'm doing some log file processing with Apache Spark from Clojure. Spark is written in Scala but has a Java API, which Flambo wraps. It looks and feels entirely like Clojure.

    The semantics look very similar indeed. Does anyone have a comparison between Onyx and Spark?

  • Hi folks! I'm Michael Drogalis - the primary author. I'm happy to answer any questions.

  • Check out the original video introducing Onyx: http://youtu.be/vG47Gui3hYE

  • If this interests you, then you should also check out the post where Michael Drogalis first introduced this:

    http://michaeldrogalis.tumblr.com/post/98143185776/onyx-dist...

  • Re: Onyx's architecture. I wonder about performance when keeping a shared log in ZooKeeper. Why not use something like Kafka? It is designed for high-volume, immutable logging. ZK works best for less frequently changing data, such as node connection information or snapshots. I could be wrong. I'd like to hear your thoughts and experience.

  • Looks superficially similar to https://github.com/aphyr/tesser. Does anyone know both and can give a comparison?

    From a brief look, Tesser seems a lot simpler (probably because it encodes most of the folding using various monoids). Does Onyx have a similar abstraction model that I missed?
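
For readers unfamiliar with the idea of folding with monoids mentioned in the last comment, here is a minimal, generic sketch in Python. It is not Tesser's or Onyx's API; the names `Monoid`, `fold_chunk`, and `fold_chunked` are hypothetical and chosen only for illustration. The point is that an associative combine function with an identity value lets each chunk be reduced independently and the partial results merged afterwards.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Callable, Generic, Iterable, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class Monoid(Generic[T]):
    """Hypothetical helper: an identity value plus an associative combine."""
    identity: T
    combine: Callable[[T, T], T]


def fold_chunk(m: Monoid, chunk: Iterable):
    # Reduce one chunk, starting from the monoid's identity.
    return reduce(m.combine, chunk, m.identity)


def fold_chunked(m: Monoid, chunks: Iterable[Iterable]):
    # Fold each chunk on its own (this step could run in parallel),
    # then merge the partial results with the same monoid.
    partials = [fold_chunk(m, c) for c in chunks]
    return fold_chunk(m, partials)


SUM = Monoid(0, lambda a, b: a + b)
MAX = Monoid(float("-inf"), max)

chunks = [[3, 1, 4], [1, 5], [9, 2, 6]]
total = fold_chunked(SUM, chunks)
largest = fold_chunked(MAX, chunks)
```

Because the combine step is associative, how the input is split into chunks does not change the result, which is what makes this style of fold easy to distribute.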