Fix Assistant OpenAI adapter to handle message content structure returned by to_hash method #952

Open
wants to merge 3 commits into base: main

Conversation

@IMhide IMhide commented Apr 12, 2025

While using the gem for the first time, I was discovering Langchain::Assistant and experimenting with it.
Using it naively, I came across an issue:

@assistant = Langchain::Assistant.new(
  llm: Langchain::LLM::OpenAI.new(
    api_key: ENV['OPENAI_API_KEY'],
    default_options: {
      temperature: 0.7,
      chat_model: 'gpt-4o-mini'
    }
  ),
  instructions: "FOO BAR ZOO",
)
@assistant.add_message(content: "Foo bar Zoo") 
messages_hash = @assistant.messages.map(&:to_hash)

Here, messages_hash equals:

[{role: "system", content: [{type: "text", text: "FOO BAR ZOO"}]}, {role: "user", content: [{type: "text", text: "Foo bar Zoo"}]}]

From there I thought: great, I can just persist messages_hash somewhere and then use @assistant.add_messages to resume the conversation.
But this happened:

@assistant.clear_messages!
@assistant.add_messages(messages: messages_hash)
resumed_hash = @assistant.messages.map(&:to_hash)

resumed_hash equals this:

[{role: "system", content: [{type: "text", text: "[{type: \"text\", text: \"FOO BAR ZOO\"}]"}]},
 {role: "user", content: [{type: "text", text: "[{type: \"text\", text: \"Foo bar Zoo\"}]"}]}]

The content from messages_hash gets stringified and nested into a new content hash.

After some investigation, I found that this behaviour comes from the build_message method at /lib/langchain/assistant/llm/adapters/openai.rb:42:

def build_message(role:, content: nil, image_url: nil, tool_calls: [], tool_call_id: nil)
  Messages::OpenAIMessage.new(role: role, content: content, image_url: image_url, tool_calls: tool_calls, tool_call_id: tool_call_id)
end

This ignores the fact that to_hash, on the same message object, transforms content into an array of hashes that merges the text message and the image URL.
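
For illustration, the stringified output above is what you get when that content array is treated as if it were a plain string and re-wrapped as a new text part. This is only a sketch of the observed effect, not the gem's actual internals (output shown for Ruby 3.4, where Hash#inspect prints symbol keys as type: "..."):

# Hypothetical sketch of the observed round-trip behaviour
content = [{type: "text", text: "FOO BAR ZOO"}]            # what #to_hash produced
rewrapped = {role: "system", content: [{type: "text", text: content.to_s}]}
rewrapped[:content].first[:text]
# => "[{type: \"text\", text: \"FOO BAR ZOO\"}]"           # matches the broken output above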

This PR is extracted from a monkey patch I made in my project:

module Langchain
  class Assistant
    module LLM
      module Adapters
        class OpenAI < Base
          def build_message(role:, content: nil, image_url: nil, tool_calls: [], tool_call_id: nil)
            if content.is_a?(Array)
              content.each do |c|
                content = c[:text] if c[:type] == 'text'
                image_url = c[:image_url][:url] if c[:type] == 'image_url'
              end
            end
            Messages::OpenAIMessage.new(
              role: role,
              content: content,
              image_url: image_url,
              tool_calls: tool_calls,
              tool_call_id: tool_call_id
            )
          end
        end
      end
    end
  end
end
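
With this change, the round trip from the example above is preserved: re-adding the persisted hashes restores the original structure instead of a stringified copy. Expected result, assuming the same assistant setup as above:

@assistant.clear_messages!
@assistant.add_messages(messages: messages_hash)
@assistant.messages.map(&:to_hash)
# => [{role: "system", content: [{type: "text", text: "FOO BAR ZOO"}]},
#     {role: "user", content: [{type: "text", text: "Foo bar Zoo"}]}]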

I also added tests for #build_message that cover both the previous behavior and the newly added one.
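
For reference, a minimal sketch of the kind of spec this adds (setup and the attribute readers on OpenAIMessage are assumed here; the actual spec in the PR may differ):

RSpec.describe Langchain::Assistant::LLM::Adapters::OpenAI do
  subject(:adapter) { described_class.new }

  describe "#build_message" do
    context "when content is a string" do
      it "keeps the plain text content" do
        message = adapter.build_message(role: "user", content: "Foo bar Zoo")

        expect(message.content).to eq("Foo bar Zoo")
      end
    end

    context "when content is an array, as returned by #to_hash" do
      it "extracts the text and image_url entries" do
        message = adapter.build_message(
          role: "user",
          content: [
            {type: "text", text: "Foo bar Zoo"},
            {type: "image_url", image_url: {url: "https://example.com/image.png"}}
          ]
        )

        expect(message.content).to eq("Foo bar Zoo")
        expect(message.image_url).to eq("https://example.com/image.png")
      end
    end
  end
end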

… order to accept .to_hash content array

Adding test case
@IMhide IMhide changed the title from "Fix Assistant OpenAI adapter to handle message content structure returned by to_hash message method" to "Fix Assistant OpenAI adapter to handle message content structure returned by to_hash method" on Apr 12, 2025
@andreibondarev andreibondarev requested a review from Copilot April 21, 2025 17:01

@Copilot Copilot AI left a comment

Pull Request Overview

This PR fixes an issue with the OpenAI adapter in Langchain where messages returned by the to_hash method were being stringified when re-added to the assistant. The changes update the build_message method to handle an array of content hashes and add tests to verify both string and array content scenarios.

  • Updated build_message to process content arrays by extracting text and image URL values.
  • Extended the unit tests to cover both string and array-based message content.

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File: spec/langchain/assistant/llm/adapters/openai_spec.rb
Description: Added tests for build_message with string and array content

File: lib/langchain/assistant/llm/adapters/openai.rb
Description: Updated build_message to process array content as expected

Comments suppressed due to low confidence (2)

lib/langchain/assistant/llm/adapters/openai.rb:43

  • Consider adding explicit handling or documentation for scenarios where the content array contains multiple 'text' or 'image_url' elements, as the current implementation will override earlier values in favor of later ones.
if content.is_a?(Array)

spec/langchain/assistant/llm/adapters/openai_spec.rb:50

  • [nitpick] Consider adding tests to cover scenarios where the array contains multiple entries for 'text' or 'image_url' to ensure the intended behavior is maintained.
context "when content is an array" do
