At my company, we developed an open source library that measures whether the context the model received is accurate. It's not exactly what you're asking for, but in theory you could use it to detect when an LLM's answer deviates from the provided context, and then use that signal to tweak the LLM so it doesn't always rely on that context.
Shameless plug for the library: https://github.com/TonicAI/tvalmetrics
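
To make the "measure when the answer deviates from the context" idea a bit more concrete, here is a rough Python sketch. To be clear, this is not the tvalmetrics API and not how the library scores anything. It is a crude token-overlap heuristic used purely as a stand-in, and the function names (`context_support`, `deviates`) and the 0.5 threshold are made up for illustration.

```python
import re


def tokenize(text):
    """Lowercase the text and return its distinct alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def context_support(answer, context):
    """Fraction of distinct answer tokens that also appear in the context.

    Hypothetical stand-in metric: 1.0 means every answer token is found in
    the context, 0.0 means none are. An empty answer is treated as fully
    supported, since it makes no claims.
    """
    answer_tokens = tokenize(answer)
    if not answer_tokens:
        return 1.0
    return len(answer_tokens & tokenize(context)) / len(answer_tokens)


def deviates(answer, context, threshold=0.5):
    """Flag answers whose support score falls below an assumed threshold."""
    return context_support(answer, context) < threshold


if __name__ == "__main__":
    context = "The Eiffel Tower is in Paris."
    for answer in ("The Eiffel Tower is in Paris.", "Bananas are yellow."):
        print(answer, context_support(answer, context), deviates(answer, context))
```

You would log a score like this per response, look at where the model drifts from the context, and decide from that whether the drift is what you want. A real evaluation would need something much smarter than token overlap, since paraphrases score low and copied but wrong text scores high.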