Who is building LLM Chatbots, and what issues are you running into?

  • Chain of thought is underutilized. It almost never makes sense to show the user the "bare" response of the LLM. It's so easy to have the model self-critique, reason about user intent, etc., and doing so drastically improves the final output (rough sketch below).
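
A minimal sketch of that draft → critique → revise loop, using the OpenAI Python client. The model name, prompts, and function names are placeholders, not anyone's production setup:

```python
# Minimal draft -> critique -> revise loop: only the final revision is
# shown to the user. Model name and prompts are placeholders.
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4o-mini"  # assumption: any chat-completion model works here


def ask(system: str, user: str) -> str:
    """Single chat-completion call; returns the assistant's text."""
    resp = client.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": system},
            {"role": "user", "content": user},
        ],
    )
    return resp.choices[0].message.content


def answer(question: str) -> str:
    # 1. Draft: the "bare" response the user would otherwise see.
    draft = ask("You are a helpful assistant.", question)

    # 2. Critique: check intent, gaps, and factual slips in the draft.
    critique = ask(
        "You review draft answers. Does the draft address the user's "
        "actual intent? Is anything wrong or missing? Be terse.",
        f"Question: {question}\n\nDraft answer: {draft}",
    )

    # 3. Revise: only this final output is surfaced in the chat UI.
    return ask(
        "Rewrite the draft answer, fixing the issues raised in the "
        "critique. Return only the improved answer.",
        f"Question: {question}\n\nDraft: {draft}\n\nCritique: {critique}",
    )
```

The obvious cost is latency and tokens: three model calls instead of one, with only the last one shown to the user.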

  • For example, in our case we are building an LLM chatbot that pulls in a technical book publisher's data: 20 years of technical books and 20 years of videotaped conference talks.

    Hard:

    - We're using LangChain, which isn't always great

    - The data pipeline was trickier than I had initially thought

    - Indexing embeddings (in Postgres) is just hard and requires tons of RAM (see the sketch at the end of this comment)

    But the hardest part has been conversation quality. We've started using LangSmith, which came out fairly recently and has been a godsend for tracing and observability. It's not perfect, though, and I wish there were better tools out there.
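
On the Postgres indexing point above: most of the RAM pain is in building the ANN index itself. A rough sketch, assuming pgvector and psycopg2 (the comment only says Postgres, so treat the extension, DSN, table, and column names as placeholders):

```python
# Sketch of building an ANN index over embeddings already loaded into
# Postgres. Assumes the pgvector extension is installed; the DSN, table,
# and column names below are made up for illustration.
import psycopg2

conn = psycopg2.connect("dbname=books user=postgres")  # placeholder DSN
conn.autocommit = True
cur = conn.cursor()

# Index builds happen in maintenance_work_mem; if the HNSW graph doesn't
# fit there, the build spills to disk and slows down dramatically, which
# is where the "tons of RAM" requirement comes from.
cur.execute("SET maintenance_work_mem = '4GB';")
cur.execute("SET max_parallel_maintenance_workers = 4;")

# HNSW index on a pgvector column (assumed 1536-dim embeddings).
cur.execute("""
    CREATE INDEX IF NOT EXISTS chunks_embedding_idx
    ON book_chunks
    USING hnsw (embedding vector_cosine_ops)
    WITH (m = 16, ef_construction = 64);
""")

# At query time, hnsw.ef_search trades recall for speed.
cur.execute("SET hnsw.ef_search = 40;")
query_vec = "[" + ",".join(["0.1"] * 1536) + "]"  # placeholder query vector
cur.execute(
    "SELECT id, title FROM book_chunks ORDER BY embedding <=> %s::vector LIMIT 5;",
    (query_vec,),
)
print(cur.fetchall())
```

An ivfflat index builds faster and with less memory than HNSW, at the cost of query speed and recall, so it can be a reasonable fallback when the build won't fit in maintenance_work_mem.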