Hi everyone, I put together this survey of tools for the LLM Stack in 2024. I've linked the friend-link for the Medium article in the URL. I'd love feedback from you guys about any tools I've missed.
If you're a Medium member and want to support my writing, feel free to use the regular link - https://medium.com/plain-simple-software/the-llm-app-stack-2...
This is great! Out of curiosity, what's the difference between choosing a dedicated vector database vs. a traditional database with vector indices (e.g. pgvector with Postgres)?
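For context, the "traditional database with vector indices" route I mean is roughly the sketch below: embeddings live in an ordinary table next to the rest of your data, and similarity search is just SQL. This assumes psycopg 3 and the pgvector extension; the connection string, table name, and embedding dimension are placeholders.

    import psycopg  # assumes psycopg 3; the pgvector extension must be installed in Postgres

    query_embedding = [0.0] * 1536  # placeholder; would come from your embedding model

    with psycopg.connect("dbname=app user=app") as conn, conn.cursor() as cur:
        cur.execute("CREATE EXTENSION IF NOT EXISTS vector")
        cur.execute("""
            CREATE TABLE IF NOT EXISTS docs (
                id bigserial PRIMARY KEY,
                body text,
                embedding vector(1536)  -- dimension of your embedding model
            )
        """)
        # Approximate nearest-neighbor index so queries don't scan the whole table
        cur.execute(
            "CREATE INDEX IF NOT EXISTS docs_embedding_idx "
            "ON docs USING hnsw (embedding vector_l2_ops)"
        )
        # pgvector accepts the '[x,y,z]' text format; <-> is L2 distance
        vec_literal = "[" + ",".join(str(x) for x in query_embedding) + "]"
        cur.execute(
            "SELECT id, body FROM docs ORDER BY embedding <-> %s::vector LIMIT 5",
            (vec_literal,),
        )
        print(cur.fetchall())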
I think one of the biggest struggles small startups and practitioners face is the lack of a good option between "I wonder if this works" and "ready for prime time." Running locally on consumer hardware is an option, but it's cost-prohibitive for a team. Cloud providers are full of complications and hidden costs. Tools like Friendli and Bento are good but ambiguous on costs, and they get difficult to price end-to-end once you need the full stack. Hugging Face Inference Endpoints and similar tools, along with cloud DBs like Zilliz, still seem like the best option around.
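For example, once an endpoint is deployed, the call itself is just a few lines (a rough sketch using the huggingface_hub client; the endpoint URL, token, and generation parameters are placeholders):

    from huggingface_hub import InferenceClient  # pip install huggingface_hub

    # Point the client at a deployed Inference Endpoint (URL and token are placeholders).
    client = InferenceClient(
        model="https://your-endpoint.endpoints.huggingface.cloud",
        token="hf_...",
    )

    # Text generation against the hosted model; parameters are illustrative.
    output = client.text_generation(
        "Summarize the LLM app stack in one sentence.",
        max_new_tokens=100,
        temperature=0.7,
    )
    print(output)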
That said, it's no wonder people just pay extra for the simplicity of a slightly smarter endpoint like OpenAI. Sure, over time the costs are insane and you lack any flexibility to create a truly targeted solution, but it feels like an all-in-one easy fix.
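To be fair about the simplicity part, the whole "stack" on that route collapses into something like this (a sketch with the openai Python SDK; the model name is a placeholder and the key is read from the environment):

    from openai import OpenAI  # pip install openai; reads OPENAI_API_KEY from the environment

    client = OpenAI()
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": "Summarize the LLM app stack in one sentence."}],
    )
    print(response.choices[0].message.content)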