Best practices for managing tokens in context with tool calling #1495

Open
@jpmasud

Description

Question

Hello,

I'm implementing a chatbot that relies on tool calling to retrieve data when the user asks a relevant question. My challenge is that since a chat conversation can span multiple turns, the tool outputs add up really quickly.

E.g.

  1. User: "Tell me what invoices are due this week"
  2. [Tool call retrieves raw data of invoices, stripped to essential info]
  3. [LLM converts the tool call output into an answer]
  4. User: "How about this month?"
  5. [Repeat tool call / re-structuring]
  6. User: "How about last 3 months?"
    (etc.)

So initially I tried a naive approach of removing stale tool call information. E.g. in the example above, by step 6 the raw tool call data from step 2 is no longer required, so I can purge it from the message history. But this requires some extra "re-processing" of the Pydantic output (e.g. if I remove the tool call data, I also need to remove the tool call ID reference, restructure the ModelMessage, etc.).
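Setting Pydantic AI's message classes aside, the bookkeeping involved (dropping a stale tool result together with every reference to its tool call ID, so no dangling references remain) can be sketched framework-agnostically. A minimal sketch, assuming OpenAI-style message dicts; `prune_stale_tool_data`, the message shapes, and the sample history are illustrative, not part of any library:

```python
def prune_stale_tool_data(messages, keep_last_n=1):
    """Drop tool-result messages except the most recent keep_last_n,
    also removing the matching assistant tool_calls entries so no
    dangling tool_call_id references remain."""
    # Tool call IDs in conversation order; everything but the last
    # keep_last_n is considered stale.
    tool_ids = [m["tool_call_id"] for m in messages if m["role"] == "tool"]
    stale = set(tool_ids[:-keep_last_n]) if keep_last_n else set(tool_ids)

    pruned = []
    for m in messages:
        if m["role"] == "tool" and m["tool_call_id"] in stale:
            continue  # drop the stale raw tool output entirely
        if m["role"] == "assistant" and m.get("tool_calls"):
            kept = [c for c in m["tool_calls"] if c["id"] not in stale]
            if kept:
                m = {**m, "tool_calls": kept}
            else:
                m = {k: v for k, v in m.items() if k != "tool_calls"}
                if not m.get("content"):
                    continue  # assistant turn carried only stale tool calls
        pruned.append(m)
    return pruned


# Illustrative two-round history mirroring the invoice example above.
history = [
    {"role": "user", "content": "Tell me what invoices are due this week"},
    {"role": "assistant", "content": None,
     "tool_calls": [{"id": "call_1", "type": "function"}]},
    {"role": "tool", "tool_call_id": "call_1", "content": "raw invoice data..."},
    {"role": "assistant", "content": "Two invoices are due this week."},
    {"role": "user", "content": "How about this month?"},
    {"role": "assistant", "content": None,
     "tool_calls": [{"id": "call_2", "type": "function"}]},
    {"role": "tool", "tool_call_id": "call_2", "content": "more raw data..."},
    {"role": "assistant", "content": "Five invoices are due this month."},
]
pruned = prune_stale_tool_data(history, keep_last_n=1)
```

The key design point is that the tool result and the assistant message that requested it are pruned as a pair, which is exactly the "restructuring" step described above.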

But I'm not sure if that's the right way to go about it, since it feels like I'm fighting "against" Pydantic?

Additional Context

No response


Labels: question (further information is requested)
