Description
Question
Hello,
I'm implementing a chatbot which relies on tool calling to retrieve data when the user asks a relevant question. My challenge is that a chat conversation can be many turns long, so the tool call outputs kept in the message history add up really quickly.
E.g.
1. User: "Tell me what invoices are due this week"
2. [Tool call retrieves raw data of invoices, stripped to essential info]
3. [LLM converts the tool call output into an answer]
4. User: "How about this month?"
5. [Repeat tool call / re-structuring]
6. User: "How about last 3 months?"
(etc.)
So initially I tried a naive method of removing stale tool call information. E.g. in the example above, by the time I'm at step 6 (the "last 3 months" question), the raw tool call data from step 2 (the first tool call) is no longer required, so I can purge it from the message history. But that requires some extra "re-processing" of the pydantic output (e.g. if I remove the tool call data, I also need to remove the matching tool call ID reference, restructure the ModelMessage, etc.)
Is that the right way to go about it? It feels like I'm fighting "against" pydantic.
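To make the pruning described above concrete, here is a minimal sketch of the idea. It uses simplified stand-in dataclasses (`Part`, `Message`) rather than the real `ModelMessage` types, and the function name `prune_stale_tool_data` and its `keep_last` parameter are hypothetical, so treat it as an illustration of the bookkeeping involved, not as integration code.

```python
from dataclasses import dataclass, replace
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Part:
    # Stand-in for a message part; NOT the library's real type.
    kind: str  # "user", "text", "tool_call", or "tool_return"
    content: str = ""
    tool_call_id: Optional[str] = None


@dataclass(frozen=True)
class Message:
    # Stand-in for one entry in the message history.
    role: str  # "request" or "response"
    parts: Tuple[Part, ...]


def prune_stale_tool_data(history: List[Message], keep_last: int = 1) -> List[Message]:
    """Drop all but the newest `keep_last` tool call/return pairs."""
    # Tool call IDs in the order their results appear in the history.
    ids = [p.tool_call_id for m in history for p in m.parts if p.kind == "tool_return"]
    stale = set(ids[:-keep_last]) if keep_last > 0 else set(ids)

    pruned = []
    for message in history:
        # Remove both the call and its return so no ID is left dangling.
        kept = tuple(
            p for p in message.parts
            if not (p.kind in ("tool_call", "tool_return") and p.tool_call_id in stale)
        )
        if kept:  # skip messages that became empty after pruning
            pruned.append(replace(message, parts=kept))
    return pruned


history = [
    Message("request", (Part("user", "Tell me what invoices are due this week"),)),
    Message("response", (Part("tool_call", "invoices for this week", "call_1"),)),
    Message("request", (Part("tool_return", "<raw invoice data>", "call_1"),)),
    Message("response", (Part("text", "<answer for this week>"),)),
    Message("request", (Part("user", "How about this month?"),)),
    Message("response", (Part("tool_call", "invoices for this month", "call_2"),)),
    Message("request", (Part("tool_return", "<raw invoice data>", "call_2"),)),
    Message("response", (Part("text", "<answer for this month>"),)),
]

pruned = prune_stale_tool_data(history)
```

With `keep_last=1`, the `call_1` call and its return are removed together while the user questions, the LLM answers, and the latest `call_2` pair stay in place. Even in this toy version the call and its return have to be removed as a pair and emptied messages dropped, which is the extra re-processing I'd like to avoid if there is a built-in way.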