The Achilles heel of the BEAM is that if it crashes in native code, it has no way to recover, and its much-vaunted robustness goes out the window. Writing the native functions (NIFs) in Rust makes it a bit harder to crash the whole VM.
On the plus side, the BEAM makes IPC pretty straightforward, so you can move the processes that need the NIFs to a separate VM if you're feeling paranoid.
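To make the "a bit harder to crash" part concrete, here is a minimal sketch in plain Rust (standard library only, no NIF bindings) of containing a panic at the boundary so it comes back as an error value instead of unwinding into the host. As far as I know, Rustler does something along these lines for you, but treat that as an assumption. `parse_len` and `guarded_parse_len` are hypothetical names made up for the illustration.

```rust
use std::panic;

/// Hypothetical stand-in for native work that may panic on bad input.
fn parse_len(input: &[u8]) -> usize {
    if input.is_empty() {
        panic!("empty input");
    }
    input.len()
}

/// Runs the work behind `catch_unwind`, so a panic becomes an `Err`
/// instead of unwinding across the boundary into the caller.
fn guarded_parse_len(input: &[u8]) -> Result<usize, String> {
    panic::catch_unwind(|| parse_len(input)).map_err(|payload| {
        // Panic payloads are usually a &str or a String.
        if let Some(s) = payload.downcast_ref::<&str>() {
            s.to_string()
        } else if let Some(s) = payload.downcast_ref::<String>() {
            s.clone()
        } else {
            "unknown panic".to_string()
        }
    })
}

fn main() {
    println!("{:?}", guarded_parse_len(b"%PDF"));
    println!("{:?}", guarded_parse_len(b""));
}
```

This only covers unwinding panics. A segfault in `unsafe` code, an abort, or a build with `panic = "abort"` still takes the whole VM down, which is why the separate-VM option remains worth considering.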
FYI: the preview image referenced in your HTML header meta tag is broken.
I've been thinking a lot about how to accomplish various RAG things in Elixir (for LLM applications). PDF is one of the missing pieces, so I'm glad to see work here. The really tricky part is not just parsing out the text (you can just call the pdftotext Unix command-line utility for that), but accurately pulling out things like complex tables in a way that can be chunked and post-processed usefully. I'd love to see something like Unstructured or Marker, but written in Rust (i.e., fast), that Elixir could call through a NIF. And maybe some kind of hybrid system that uses open LLMs with vision capabilities. Ref:
- https://github.com/Unstructured-IO/unstructured
- https://github.com/VikParuchuri/marker
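
For the easy half of that (plain text, then chunking), a rough sketch in Rust might look like the following. It assumes `pdftotext` from poppler-utils is on the PATH and relies on `pdftotext <file> -` writing the text to stdout. The function names and the blank-line paragraph splitting are my own assumptions, and it does nothing for the table problem.

```rust
use std::io;
use std::process::Command;

/// Shells out to `pdftotext <path> -`, which writes the extracted text to stdout.
fn pdf_to_text(path: &str) -> io::Result<String> {
    let out = Command::new("pdftotext").arg(path).arg("-").output()?;
    if !out.status.success() {
        let msg = String::from_utf8_lossy(&out.stderr).into_owned();
        return Err(io::Error::new(io::ErrorKind::Other, msg));
    }
    Ok(String::from_utf8_lossy(&out.stdout).into_owned())
}

/// Packs blank-line-separated paragraphs into chunks of at most `max_bytes`.
/// A single paragraph longer than `max_bytes` is kept whole in its own chunk.
fn chunk_paragraphs(text: &str, max_bytes: usize) -> Vec<String> {
    let mut chunks = Vec::new();
    let mut current = String::new();
    for para in text.split("\n\n").map(str::trim).filter(|p| !p.is_empty()) {
        // Start a new chunk if adding this paragraph would exceed the limit.
        if !current.is_empty() && current.len() + 2 + para.len() > max_bytes {
            chunks.push(std::mem::take(&mut current));
        }
        if !current.is_empty() {
            current.push_str("\n\n");
        }
        current.push_str(para);
    }
    if !current.is_empty() {
        chunks.push(current);
    }
    chunks
}

fn main() -> io::Result<()> {
    match std::env::args().nth(1) {
        Some(path) => {
            let text = pdf_to_text(&path)?;
            for (i, chunk) in chunk_paragraphs(&text, 1000).iter().enumerate() {
                println!("--- chunk {i} ({} bytes) ---\n{chunk}", chunk.len());
            }
        }
        None => eprintln!("usage: chunker <file.pdf>"),
    }
    Ok(())
}
```

Splitting on paragraph boundaries keeps sentences intact, but a table flattened by pdftotext will still be cut arbitrarily, which is exactly the gap tools like Unstructured and Marker try to fill.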