AI Agent - a system that observes its environment, thinks, acts, and learns/remembers over time
When you use the run method in Runner, you pass in a starting agent and input. The input can either be a string (which is treated as a user message) or a list of input items, i.e. the item types in the OpenAI Responses API.
The runner then runs a loop:
- We call the LLM for the current agent, with the current input.
- The LLM produces its output.
- If the LLM returns a final_output, the loop ends and we return the result.
- If the LLM does a handoff, we update the current agent and input, and re-run the loop.
- If the LLM produces tool calls, we run those tool calls, append the results, and re-run the loop.
- If we exceed the max_turns passed, we raise a MaxTurnsExceeded exception.
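The loop above can be sketched as a toy state machine. This is not the SDK's actual implementation; FakeLLM behavior, the dict shapes, and run_loop itself are all illustrative, with only the exception name borrowed from the SDK:

```python
class MaxTurnsExceeded(Exception):
    """Mirrors the SDK's exception name: raised past max_turns."""

def run_loop(agent, input_items, llm, max_turns=10):
    """Toy runner loop: final output ends it, a handoff swaps the
    current agent, tool calls append results and loop again."""
    current_agent, items = agent, list(input_items)
    for _ in range(max_turns):
        step = llm(current_agent, items)          # one model call per turn
        if step["type"] == "final_output":
            return step["output"]                 # loop ends, return result
        if step["type"] == "handoff":
            current_agent = step["target"]        # switch agents, keep looping
        elif step["type"] == "tool_calls":
            for call in step["calls"]:            # run tools, append results
                items.append({"role": "tool", "content": call["fn"](*call["args"])})
    raise MaxTurnsExceeded(f"exceeded {max_turns} turns")

# fake LLM: issues a tool call on turn 1, finishes on turn 2
def fake_llm(agent, items):
    if not any(m.get("role") == "tool" for m in items):
        return {"type": "tool_calls", "calls": [{"fn": lambda x: x * 2, "args": (21,)}]}
    return {"type": "final_output", "output": items[-1]["content"]}

print(run_loop("triage", [{"role": "user", "content": "double 21"}], fake_llm))  # → 42
```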
Streaming - emits additional streaming events as the LLM runs
Run-Config - run-level configuration for the agents
ie model, model temperature, tracing, workflow info
turns - logical steps in a chat conversation, delimited by handoffs either to the user or to another agent
result.to_input_list() - turns a run's output into an input list for the next turn, enabling agent loops
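The idea behind result.to_input_list() can be sketched with a toy result object (ToyResult and its fields are hypothetical; the real SDK builds the list from the run's items):

```python
class ToyResult:
    """Minimal stand-in for a run result: original input + items the run produced."""
    def __init__(self, original_input, new_items):
        self.original_input = original_input
        self.new_items = new_items

    def to_input_list(self):
        # next turn's input = everything so far, so the agent sees full history
        return list(self.original_input) + list(self.new_items)

turn1 = ToyResult(
    [{"role": "user", "content": "What's the capital of France?"}],
    [{"role": "assistant", "content": "Paris."}],
)
# feed the whole history back in, plus the new user message
next_input = turn1.to_input_list() + [{"role": "user", "content": "Population?"}]
print(len(next_input))  # → 3
```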
RunItem - wraps a raw item generated by the LLM:
- MessageOutputItem - a message from the LLM; the raw item is the message generated.
- HandoffCallItem - the LLM called the handoff tool; the raw item is the tool call item from the LLM.
- HandoffOutputItem - a handoff occurred; the raw item is the tool response to the handoff tool call. You can also access the source/target agents from the item.
- ToolCallItem - the LLM invoked a tool.
- ToolCallOutputItem - a tool was called; the raw item is the tool response. You can also access the tool output from the item.
- ReasoningItem - a reasoning item from the LLM; the raw item is the reasoning generated.
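Consumers of a run typically dispatch on the item type. A toy sketch with stand-in dataclasses (the real classes live in the agents package and carry richer raw items):

```python
from dataclasses import dataclass

@dataclass
class MessageOutputItem:
    raw: str          # the message the LLM generated

@dataclass
class ToolCallItem:
    raw: str          # the tool call from the LLM

@dataclass
class ToolCallOutputItem:
    raw: str          # the tool response
    output: str       # convenience accessor for the tool's output

def describe(item):
    """Dispatch on item type, like iterating a run's new items."""
    if isinstance(item, MessageOutputItem):
        return f"message: {item.raw}"
    if isinstance(item, ToolCallItem):
        return f"tool call: {item.raw}"
    if isinstance(item, ToolCallOutputItem):
        return f"tool returned: {item.output}"
    return "unknown item"

print(describe(ToolCallOutputItem(raw="resp", output="42")))  # → tool returned: 42
```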
Streaming
allows you to get agent updates as the run progresses
raw response events can be streamed to the user as soon as they happen
run item stream events - more significant events, e.g. when the LLM finishes generating an output item
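The two event granularities can be sketched with a toy generator (event dict shapes are illustrative, not the SDK's event classes):

```python
def stream_events(tokens):
    """Toy stream: yields a raw event per token as it 'arrives', then one
    higher-level run-item event once the whole message is complete."""
    buffer = []
    for tok in tokens:
        buffer.append(tok)
        yield {"type": "raw_response_event", "delta": tok}     # forward immediately
    yield {"type": "run_item_event", "item": "".join(buffer)}  # message finished

events = list(stream_events(["Hel", "lo", "!"]))
print(events[-1])  # → {'type': 'run_item_event', 'item': 'Hello!'}
```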
Tools
Tools let agents take actions: things like fetching data, running code, calling external APIs, and even using a computer. There are three classes of tools in the Agents SDK:
Hosted tools: these run on LLM servers alongside the AI models. OpenAI offers retrieval, web search and computer use as hosted tools.
Function calling: these allow you to use any Python function as a tool.
Agents as tools: this lets you use an agent as a tool, so agents can call other agents without handing off to them.
there are quite a few handy built-in tools, such as the web search tool, file search tool, code interpreter, etc
the hosted MCP tool lets MCP tooling be integrated
any Python function can be used as a tool as well
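The function-as-tool idea can be sketched with a toy decorator that derives a tool schema from the function's signature. This is only similar in spirit to the SDK's function_tool (which also parses docstrings and uses Pydantic); none of this is the SDK's real code:

```python
import inspect

def function_tool(fn):
    """Toy decorator: build a tool schema from the function's signature."""
    sig = inspect.signature(fn)
    fn.tool_schema = {
        "name": fn.__name__,
        "description": (fn.__doc__ or "").strip(),
        "parameters": {name: p.annotation.__name__
                       for name, p in sig.parameters.items()},
    }
    return fn

@function_tool
def get_weather(city: str) -> str:
    """Fetch the weather for a city."""
    return f"Sunny in {city}"

print(get_weather.tool_schema["parameters"])  # → {'city': 'str'}
```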
MCP Configuration
three types of MCP servers
stdio - runs as a subprocess of your application
HTTP over SSE - remote servers you connect to by URL
Streamable HTTP - remote servers using the Streamable HTTP transport
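The three transports differ mainly in how you point at the server. A hedged sketch of what the configs look like as plain dicts (the key names here are illustrative, not the SDK's exact parameters):

```python
# stdio: spawn the MCP server as a subprocess of your app
stdio_server = {
    "transport": "stdio",
    "command": "npx",
    "args": ["-y", "@modelcontextprotocol/server-filesystem", "."],
}

# HTTP over SSE: connect to a remote server by URL
sse_server = {"transport": "sse", "url": "https://example.com/sse"}

# Streamable HTTP: remote server using the Streamable HTTP transport
streamable_server = {"transport": "streamable_http", "url": "https://example.com/mcp"}

print(sorted(s["transport"] for s in (stdio_server, sse_server, streamable_server)))
```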
Tracing - captures MCP operations and MCP-related info on function calls
Handoffs
allow agents to hand tasks off to other agents
each handoff is represented as a tool to the LLM
can also send data along with the handoff
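Since a handoff is exposed to the model as just another tool, it can be sketched as a tool factory. All names here (make_handoff_tool, the returned dict shape) are hypothetical:

```python
def make_handoff_tool(target_agent):
    """Toy handoff: exposed to the LLM as a tool named transfer_to_<agent>;
    calling it returns a marker telling the run loop to switch agents."""
    def handoff(data=None):
        return {"handoff_to": target_agent, "data": data}
    handoff.__name__ = f"transfer_to_{target_agent}"
    return handoff

to_billing = make_handoff_tool("billing_agent")
result = to_billing({"customer_id": 42})   # data can ride along with the handoff
print(result["handoff_to"])  # → billing_agent
```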
Context Management
local context - define your own object and pass it into the run; tool functions receive it wrapped in a generic wrapper type and access it via wrapper.context
used locally in tools and your own code only; it is never sent to the LLM
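A toy version of the wrapper.context pattern (RunContextWrapper mirrors the SDK's name, but this stripped-down dataclass and the tool function are illustrative):

```python
from dataclasses import dataclass

@dataclass
class UserInfo:
    name: str
    uid: int

@dataclass
class RunContextWrapper:
    context: UserInfo   # your own object, wrapped in a generic container

def fetch_user_name(wrapper: RunContextWrapper) -> str:
    # tools read your object via wrapper.context; none of this reaches the LLM
    return wrapper.context.name

wrapper = RunContextWrapper(context=UserInfo(name="Ada", uid=7))
print(fetch_user_name(wrapper))  # → Ada
```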
LLM Context -
the agent can only get info from the chat history
one way to include context is through the system prompt
another is to add it to the input in Runner.run
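The two ways of getting context in front of the model both amount to putting it in the message list; a toy sketch (build_input is a hypothetical helper):

```python
def build_input(history, user_msg, system_context=None, extra_context=None):
    """Toy: the only ways to give the LLM context are the system prompt
    and the input list itself."""
    messages = []
    if system_context:
        messages.append({"role": "system", "content": system_context})
    messages += history
    if extra_context:  # context appended to the input, like adding to Runner.run input
        messages.append({"role": "user", "content": extra_context})
    messages.append({"role": "user", "content": user_msg})
    return messages

msgs = build_input([], "What are my orders?",
                   system_context="The user's name is Ada.",
                   extra_context="Recent orders: #1001, #1002")
print([m["role"] for m in msgs])  # → ['system', 'user', 'user']
```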
Guardrails
run alongside an agent (in parallel), doing checks and validations
if a guardrail detects bad usage, it can stop the model from wasting compute
guardrails run on input and output
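The short-circuit behavior can be sketched as a toy input guardrail. Only the exception name mirrors the SDK; the check itself and run_with_guardrail are illustrative:

```python
class InputGuardrailTripwireTriggered(Exception):
    """Mirrors the SDK's exception name; raised when a guardrail trips."""

def input_guardrail(text):
    # cheap check that runs before/alongside the expensive agent
    flagged = "do my homework" in text.lower()
    return {"tripwire_triggered": flagged}

def run_with_guardrail(text, agent_fn):
    result = input_guardrail(text)
    if result["tripwire_triggered"]:
        raise InputGuardrailTripwireTriggered("bad usage, skipping the model call")
    return agent_fn(text)   # only pay for the model when the input is clean

print(run_with_guardrail("hi there", lambda t: "ok"))  # → ok
```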
Multi-Agent Orchestration
2 ways of orchestrating agents:
let the LLM make decisions
determine the flow of agents within code
an agent is an LLM equipped with tools and instructions
things that matter for LLM decisions
good prompt engineering
monitoring
self-critique
specialized agents
things that matter for code orchestration
use structured outputs
chain multiple agents in a row
run agents in a loop, with an evaluator agent judging a performer agent, until the output is good quality
multiple agents in parallel
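Two of the code-orchestration patterns above, chaining and the evaluator/performer loop, can be sketched with plain functions standing in for agents (everything here is a toy, not SDK API):

```python
def chain(agents, text):
    """Pattern 1: chain agents in a row, each transforming the previous output."""
    for agent in agents:
        text = agent(text)
    return text

def evaluate_loop(performer, evaluator, task, max_rounds=5):
    """Pattern 2: performer produces, evaluator critiques, repeat until good."""
    feedback = None
    for _ in range(max_rounds):
        draft = performer(task, feedback)
        ok, feedback = evaluator(draft)
        if ok:
            return draft
    return draft   # give up after max_rounds, return best effort

# chain: two 'agents' (string transforms standing in for LLM calls)
outline = chain([str.strip, str.title], "  write a blog post  ")
print(outline)  # → Write A Blog Post

# loop: performer adds emphasis only after the evaluator asks for it
draft = evaluate_loop(
    performer=lambda task, fb: task + ("!" if fb else ""),
    evaluator=lambda d: (d.endswith("!"), "add emphasis"),
    task="ship it",
)
print(draft)  # → ship it!
```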
the Agents SDK supports model calls with both the Responses API and the Chat Completions API
can use LiteLLM for non-OpenAI models