AI Agent - a system that observes its environment, thinks, acts, and learns/remembers over time
When you use the run method in Runner, you pass in a starting agent and input. The input can either be a string (which is treated as a user message) or a list of input items, i.e. the item types in the OpenAI Responses API.
The runner then runs a loop:
- We call the LLM for the current agent, with the current input.
- The LLM produces its output.
- If the LLM returns a final_output, the loop ends and we return the result.
- If the LLM does a handoff, we update the current agent and input, and re-run the loop.
- If the LLM produces tool calls, we run those tool calls, append the results, and re-run the loop.
- If we exceed the max_turns passed, we raise a MaxTurnsExceeded exception.
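The loop above can be sketched as a toy state machine. This is not the SDK's actual implementation; FakeLLM behavior, the dict shapes, and run_loop itself are all illustrative, with only the exception name borrowed from the SDK:

```python
class MaxTurnsExceeded(Exception):
    """Mirrors the SDK's exception name: raised past max_turns."""

def run_loop(agent, input_items, llm, max_turns=10):
    """Toy runner loop: final output ends it, a handoff swaps the
    current agent, tool calls append results and loop again."""
    current_agent, items = agent, list(input_items)
    for _ in range(max_turns):
        step = llm(current_agent, items)          # one model call per turn
        if step["type"] == "final_output":
            return step["output"]                 # loop ends, return result
        if step["type"] == "handoff":
            current_agent = step["target"]        # switch agents, keep looping
        elif step["type"] == "tool_calls":
            for call in step["calls"]:            # run tools, append results
                items.append({"role": "tool", "content": call["fn"](*call["args"])})
    raise MaxTurnsExceeded(f"exceeded {max_turns} turns")

# fake LLM: issues a tool call on turn 1, finishes on turn 2
def fake_llm(agent, items):
    if not any(m.get("role") == "tool" for m in items):
        return {"type": "tool_calls", "calls": [{"fn": lambda x: x * 2, "args": (21,)}]}
    return {"type": "final_output", "output": items[-1]["content"]}

print(run_loop("triage", [{"role": "user", "content": "double 21"}], fake_llm))  # → 42
```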
Streaming - emits additional streaming events as the LLM runs
Run-Config - run-level configuration for the agents
ie model, model temperature, tracing, workflow info
turns - logical steps in a chat conversation, delimited by handoffs either to the user or to another agent
result.to_input_list() - turns a run's output into an input list for the next turn, enabling agent loops
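The idea behind result.to_input_list() can be sketched with a toy result object (ToyResult and its fields are hypothetical; the real SDK builds the list from the run's items):

```python
class ToyResult:
    """Minimal stand-in for a run result: original input + items the run produced."""
    def __init__(self, original_input, new_items):
        self.original_input = original_input
        self.new_items = new_items

    def to_input_list(self):
        # next turn's input = everything so far, so the agent sees full history
        return list(self.original_input) + list(self.new_items)

turn1 = ToyResult(
    [{"role": "user", "content": "What's the capital of France?"}],
    [{"role": "assistant", "content": "Paris."}],
)
# feed the whole history back in, plus the new user message
next_input = turn1.to_input_list() + [{"role": "user", "content": "Population?"}]
print(len(next_input))  # → 3
```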
RunItem - wraps a raw item generated by the LLM:
- MessageOutputItem - a message from the LLM; the raw item is the message generated.
- HandoffCallItem - the LLM called the handoff tool; the raw item is the tool call item from the LLM.
- HandoffOutputItem - a handoff occurred; the raw item is the tool response to the handoff tool call. You can also access the source/target agents from the item.
- ToolCallItem - the LLM invoked a tool.
- ToolCallOutputItem - a tool was called; the raw item is the tool response. You can also access the tool output from the item.
- ReasoningItem - a reasoning item from the LLM; the raw item is the reasoning generated.
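Consumers of a run typically dispatch on the item type. A toy sketch with stand-in dataclasses (the real classes live in the agents package and carry richer raw items):

```python
from dataclasses import dataclass

@dataclass
class MessageOutputItem:
    raw: str          # the message the LLM generated

@dataclass
class ToolCallItem:
    raw: str          # the tool call from the LLM

@dataclass
class ToolCallOutputItem:
    raw: str          # the tool response
    output: str       # convenience accessor for the tool's output

def describe(item):
    """Dispatch on item type, like iterating a run's new items."""
    if isinstance(item, MessageOutputItem):
        return f"message: {item.raw}"
    if isinstance(item, ToolCallItem):
        return f"tool call: {item.raw}"
    if isinstance(item, ToolCallOutputItem):
        return f"tool returned: {item.output}"
    return "unknown item"

print(describe(ToolCallOutputItem(raw="resp", output="42")))  # → tool returned: 42
```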
Streaming
allows you to get agent updates as the run progresses
raw response events can be streamed to the user as soon as they happen
run item stream events - more significant events, e.g. when the LLM finishes generating an output item
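The two event granularities can be sketched with a toy generator (event dict shapes are illustrative, not the SDK's event classes):

```python
def stream_events(tokens):
    """Toy stream: yields a raw event per token as it 'arrives', then one
    higher-level run-item event once the whole message is complete."""
    buffer = []
    for tok in tokens:
        buffer.append(tok)
        yield {"type": "raw_response_event", "delta": tok}     # forward immediately
    yield {"type": "run_item_event", "item": "".join(buffer)}  # message finished

events = list(stream_events(["Hel", "lo", "!"]))
print(events[-1])  # → {'type': 'run_item_event', 'item': 'Hello!'}
```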
Tools
Tools let agents take actions: things like fetching data, running code, calling external APIs, and even using a computer. There are three classes of tools in the Agents SDK:
Hosted tools: these run on LLM servers alongside the AI models. OpenAI offers retrieval, web search and computer use as hosted tools.
Function calling: these allow you to use any Python function as a tool.
Agents as tools: this lets you use an agent as a tool, so agents can call other agents without handing off to them.
there are quite a few handy built-in tools, such as the web search tool, file search tool, code interpreter, etc
the hosted MCP tool lets MCP tooling be integrated
any Python function can be used as a tool as well
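The function-as-tool idea can be sketched with a toy decorator that derives a tool schema from the function's signature. This is only similar in spirit to the SDK's function_tool (which also parses docstrings and uses Pydantic); none of this is the SDK's real code:

```python
import inspect

def function_tool(fn):
    """Toy decorator: build a tool schema from the function's signature."""
    sig = inspect.signature(fn)
    fn.tool_schema = {
        "name": fn.__name__,
        "description": (fn.__doc__ or "").strip(),
        "parameters": {name: p.annotation.__name__
                       for name, p in sig.parameters.items()},
    }
    return fn

@function_tool
def get_weather(city: str) -> str:
    """Fetch the weather for a city."""
    return f"Sunny in {city}"

print(get_weather.tool_schema["parameters"])  # → {'city': 'str'}
```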
MCP Configuration
three types of MCP servers
stdio - runs as a subprocess of your application
HTTP over SSE - remote servers you connect to by URL
Streamable HTTP - remote servers using the Streamable HTTP transport
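The three transports differ mainly in how you point at the server. A hedged sketch of what the configs look like as plain dicts (the key names here are illustrative, not the SDK's exact parameters):

```python
# stdio: spawn the MCP server as a subprocess of your app
stdio_server = {
    "transport": "stdio",
    "command": "npx",
    "args": ["-y", "@modelcontextprotocol/server-filesystem", "."],
}

# HTTP over SSE: connect to a remote server by URL
sse_server = {"transport": "sse", "url": "https://example.com/sse"}

# Streamable HTTP: remote server using the Streamable HTTP transport
streamable_server = {"transport": "streamable_http", "url": "https://example.com/mcp"}

print(sorted(s["transport"] for s in (stdio_server, sse_server, streamable_server)))
```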
Tracing - captures MCP operations and MCP-related info on function calls
Handoffs
allow agents to hand tasks off to other agents
each handoff is represented as a tool to the LLM
can also send data along with the handoff
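Since a handoff is exposed to the model as just another tool, it can be sketched as a tool factory. All names here (make_handoff_tool, the returned dict shape) are hypothetical:

```python
def make_handoff_tool(target_agent):
    """Toy handoff: exposed to the LLM as a tool named transfer_to_<agent>;
    calling it returns a marker telling the run loop to switch agents."""
    def handoff(data=None):
        return {"handoff_to": target_agent, "data": data}
    handoff.__name__ = f"transfer_to_{target_agent}"
    return handoff

to_billing = make_handoff_tool("billing_agent")
result = to_billing({"customer_id": 42})   # data can ride along with the handoff
print(result["handoff_to"])  # → billing_agent
```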
Context Management
local context - define your own object and pass it into the run; tool functions receive it wrapped in a generic wrapper type and access it via wrapper.context
used locally in tools and your own code only; it is never sent to the LLM
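A toy version of the wrapper.context pattern (RunContextWrapper mirrors the SDK's name, but this stripped-down dataclass and the tool function are illustrative):

```python
from dataclasses import dataclass

@dataclass
class UserInfo:
    name: str
    uid: int

@dataclass
class RunContextWrapper:
    context: UserInfo   # your own object, wrapped in a generic container

def fetch_user_name(wrapper: RunContextWrapper) -> str:
    # tools read your object via wrapper.context; none of this reaches the LLM
    return wrapper.context.name

wrapper = RunContextWrapper(context=UserInfo(name="Ada", uid=7))
print(fetch_user_name(wrapper))  # → Ada
```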
LLM Context -
the agent can only get info from the chat history
one way to include context is through the system prompt
another is to add it to the input in Runner.run
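The two ways of getting context in front of the model both amount to putting it in the message list; a toy sketch (build_input is a hypothetical helper):

```python
def build_input(history, user_msg, system_context=None, extra_context=None):
    """Toy: the only ways to give the LLM context are the system prompt
    and the input list itself."""
    messages = []
    if system_context:
        messages.append({"role": "system", "content": system_context})
    messages += history
    if extra_context:  # context appended to the input, like adding to Runner.run input
        messages.append({"role": "user", "content": extra_context})
    messages.append({"role": "user", "content": user_msg})
    return messages

msgs = build_input([], "What are my orders?",
                   system_context="The user's name is Ada.",
                   extra_context="Recent orders: #1001, #1002")
print([m["role"] for m in msgs])  # → ['system', 'user', 'user']
```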
Guardrails
run alongside an agent (in parallel), doing checks and validations
if a guardrail detects bad usage, it can stop the model from wasting compute
guardrails run on input and output
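The short-circuit behavior can be sketched as a toy input guardrail. Only the exception name mirrors the SDK; the check itself and run_with_guardrail are illustrative:

```python
class InputGuardrailTripwireTriggered(Exception):
    """Mirrors the SDK's exception name; raised when a guardrail trips."""

def input_guardrail(text):
    # cheap check that runs before/alongside the expensive agent
    flagged = "do my homework" in text.lower()
    return {"tripwire_triggered": flagged}

def run_with_guardrail(text, agent_fn):
    result = input_guardrail(text)
    if result["tripwire_triggered"]:
        raise InputGuardrailTripwireTriggered("bad usage, skipping the model call")
    return agent_fn(text)   # only pay for the model when the input is clean

print(run_with_guardrail("hi there", lambda t: "ok"))  # → ok
```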
Multi-Agent Orchestration
2 ways of orchestrating agents:
let the LLM make decisions
determine the flow of agents within code
an agent is an LLM equipped with tools and instructions
things that matter for LLM decisions
good prompt engineering
monitoring
self-critique
specialized agents
things that matter for code orchestration
use structured outputs
chain multiple agents in a row
run agents in a loop, with an evaluator agent judging a performer agent, until the output is good quality
multiple agents in parallel
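Two of the code-orchestration patterns above, chaining and the evaluator/performer loop, can be sketched with plain functions standing in for agents (everything here is a toy, not SDK API):

```python
def chain(agents, text):
    """Pattern 1: chain agents in a row, each transforming the previous output."""
    for agent in agents:
        text = agent(text)
    return text

def evaluate_loop(performer, evaluator, task, max_rounds=5):
    """Pattern 2: performer produces, evaluator critiques, repeat until good."""
    feedback = None
    for _ in range(max_rounds):
        draft = performer(task, feedback)
        ok, feedback = evaluator(draft)
        if ok:
            return draft
    return draft   # give up after max_rounds, return best effort

# chain: two 'agents' (string transforms standing in for LLM calls)
outline = chain([str.strip, str.title], "  write a blog post  ")
print(outline)  # → Write A Blog Post

# loop: performer adds emphasis only after the evaluator asks for it
draft = evaluate_loop(
    performer=lambda task, fb: task + ("!" if fb else ""),
    evaluator=lambda d: (d.endswith("!"), "add emphasis"),
    task="ship it",
)
print(draft)  # → ship it!
```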
the Agents SDK supports model calls with both the Responses API and the Chat Completions API
can use LiteLLM for non-OpenAI models