Ask HN: What is the right way to fuse my enterprise data with an LLM?

  • I think an important baseline question is how much you care if a user of the LLM forces the model to divulge sensitive details or large chunks of that data.

    Imagine that the LLM portion is running as a client-side JavaScript application in the browser: nothing in the training data or prompt is reliably secret, and a determined user can get it to emit almost anything they like to whatever is downstream.
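
    One consequence of that threat model is that anything placed in the prompt should already be something the asking user is allowed to read. Here is a minimal Python sketch of that idea, assuming each document carries a set of permitted groups. The `Document` class, `allowed_groups` field, and `build_prompt` function are hypothetical names for illustration, not part of any particular product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    text: str
    allowed_groups: frozenset  # hypothetical: groups permitted to read this text


def build_prompt(question, user_groups, documents):
    """Build a prompt from only the documents this user may already read."""
    groups = set(user_groups)
    # Filter BEFORE the text reaches the model; the model is not a gatekeeper.
    visible = [d.text for d in documents if d.allowed_groups & groups]
    context = "\n".join(visible)
    return f"Context:\n{context}\n\nQuestion: {question}"


docs = [
    Document("Office hours are 9 to 5.", frozenset({"staff", "finance"})),
    Document("Q3 payroll totals: (confidential)", frozenset({"finance"})),
]
staff_prompt = build_prompt("When is the office open?", {"staff"}, docs)
finance_prompt = build_prompt("When is the office open?", {"finance"}, docs)
```

    With this shape, the worst a determined user can extract from the prompt is data they could have read anyway. It does nothing for data baked into the model weights by training or fine-tuning.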

  • Unless you can define and measure expected business value and risk, walk away and let someone else spend money and take the risk on a mostly hype- and FOMO-driven fad. If LLMs ever actually deliver real business value, we'll all have time to integrate them.

  • Prompt stuffing: Quick and dirty. Like cramming for an exam, it works until it doesn’t.

    Fine-tuning: Great if you have static data and deep pockets for compute.

    RAG with a vector DB: The gold standard. Think of it as having your data whisper the answers to the LLM.

    With AI Squared, you can keep your data fresh, dynamic, and external because nobody wants to retrain a model every time the boss changes their mind. :D
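
    For anyone unfamiliar with the RAG option, the retrieval step can be sketched roughly as follows. This is a toy under stated assumptions: `embed` is a bag-of-words stand-in for a real embedding model, the in-memory list stands in for a vector DB, and all function names are hypothetical. It does not reflect how AI Squared or any specific product works.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query (the 'vector DB' lookup)."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def rag_prompt(query, chunks, k=2):
    """Put only the retrieved chunks into the prompt, not the whole corpus."""
    context = "\n".join(f"- {c}" for c in retrieve(query, chunks, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


chunks = [
    "Refunds are issued within 14 days of purchase.",
    "The cafeteria serves lunch from noon until two.",
    "Support tickets are answered within one business day.",
]
prompt = rag_prompt("How many days do I have to get a refund?", chunks, k=1)
```

    Because the data lives outside the model, updating it means editing `chunks` (or the real index), not retraining anything. That is the contrast with fine-tuning drawn above.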