Instrument AI Agents
Learn how to manually instrument your code to use Sentry's Agents module.
With Sentry AI Agent Monitoring, you can monitor and debug your AI systems with full-stack context. You'll be able to track key insights like token usage, latency, tool usage, and error rates. AI Agent Monitoring data will be fully connected to your other Sentry data like logs, errors, and traces.
As a prerequisite to setting up AI Agent Monitoring with Python, you'll need to first set up tracing. Once this is done, the Python SDK will automatically instrument AI agents created with supported libraries. If that doesn't fit your use case, you can use custom instrumentation described below.
The Python SDK supports automatic instrumentation for some AI libraries. We recommend adding their integrations to your Sentry configuration to automatically capture spans for AI agents.
For your AI agents data to show up in the Sentry AI Agents Insights, two spans must be created and have well-defined names and data attributes. See below.
Describes AI agent invocation.
- The spans
op
MUST be"gen_ai.invoke_agent"
. - The span
name
SHOULD be"invoke_agent {gen_ai.agent.name}"
. - The
gen_ai.operation.name
attribute MUST be"invoke_agent"
. - The
gen_ai.agent.name
attribute SHOULD be set to the agent's name. (e.g."Weather Agent"
) - All Common Span Attributes SHOULD be set (all
required
common attributes MUST be set).
Additional attributes on the span:
Data Attribute | Type | Requirement Level | Description | Example |
---|---|---|---|---|
gen_ai.request.available_tools | string | optional | List of objects describing the available tools. [0] | "[{\"name\": \"random_number\", \"description\": \"...\"}, {\"name\": \"query_db\", \"description\": \"...\"}]" |
gen_ai.request.frequency_penalty | float | optional | Model configuration parameter. | 0.5 |
gen_ai.request.max_tokens | int | optional | Model configuration parameter. | 500 |
gen_ai.request.messages | string | optional | List of objects describing the messages (prompts) sent to the LLM [0], [1] | "[{\"role\": \"system\", \"content\": [{...}]}, {\"role\": \"system\", \"content\": [{...}]}]" |
gen_ai.request.presence_penalty | float | optional | Model configuration parameter. | 0.5 |
gen_ai.request.temperature | float | optional | Model configuration parameter. | 0.1 |
gen_ai.request.top_p | float | optional | Model configuration parameter. | 0.7 |
gen_ai.response.tool_calls | string | optional | The tool calls in the model’s response. [0] | "[{\"name\": \"random_number\", \"type\": \"function_call\", \"arguments\": \"...\"}]" |
gen_ai.response.text | string | optional | The text representation of the model’s responses. [0] | "[\"The weather in Paris is rainy\", \"The weather in London is sunny\"]" |
gen_ai.usage.input_tokens.cached | int | optional | The number of cached tokens used in the AI input (prompt) | 50 |
gen_ai.usage.input_tokens | int | optional | The number of tokens used in the AI input (prompt). | 10 |
gen_ai.usage.output_tokens.reasoning | int | optional | The number of tokens used for reasoning. | 30 |
gen_ai.usage.output_tokens | int | optional | The number of tokens used in the AI response. | 100 |
gen_ai.usage.total_tokens | int | optional | The total number of tokens used to process the prompt. (input and output) | 190 |
- [0]: Span attributes only allow primitive data types (like
int
,float
,boolean
,string
). This means you need to use a stringified version of a list of dictionaries. Do NOT set[{"foo": "bar"}]
but rather the string"[{\"foo\": \"bar\"}]"
. - [1]: Each message item uses the format
{role:"", content:""}
. Therole
can be"user"
,"assistant"
, or"system"
. Thecontent
can be either a string or a list of dictionaries.
import sentry_sdk
sentry_sdk.init(...)
# some example agent implementation for demonstration
my_agent = MyAgent(
name="Weather Agent",
model_provider="openai",
model="o3-mini",
)
with sentry_sdk.start_span(
op="gen_ai.invoke_agent",
name=f"invoke_agent {my_agent.name}",
) as span:
# set data about LLM and agent
span.set_data("gen_ai.operation.name", "invoke_agent")
span.set_data("gen_ai.system", my_agent.model_provider)
span.set_data("gen_ai.request.model", my_agent.model)
span.set_data("gen_ai.agent.name", my_agent.name)
# run the agent
result = my_agent.run()
# set agent response
# we assume result.output is a dictionary
# type of `gen_ai.response.text` needs to be a string
span.set_data("gen_ai.response.text", json.dumps([result.output]))
# set token usage
# we assume the result includes the tokens used
span.set_data("gen_ai.usage.input_tokens", result.usage.input_tokens)
span.set_data("gen_ai.usage.output_tokens", result.usage.output_tokens)
This span represents a request to an AI model or service that generates a response or requests a tool call based on the input prompt.
- The span
op
MUST be"gen_ai.{gen_ai.operation.name}"
. (e.g."gen_ai.chat"
) - The span
name
SHOULD be{gen_ai.operation.name} {gen_ai.request.model}"
. (e.g."chat o3-mini"
) - All Common Span Attributes SHOULD be set (all
required
common attributes MUST be set).
Additional attributes on the span:
Data Attribute | Type | Requirement Level | Description | Example |
---|---|---|---|---|
gen_ai.request.available_tools | string | optional | List of objects describing the available tools. [0] | "[{\"name\": \"random_number\", \"description\": \"...\"}, {\"name\": \"query_db\", \"description\": \"...\"}]" |
gen_ai.request.frequency_penalty | float | optional | Model configuration parameter. | 0.5 |
gen_ai.request.max_tokens | int | optional | Model configuration parameter. | 500 |
gen_ai.request.messages | string | optional | List of objects describing the messages (prompts) sent to the LLM [0], [1] | "[{\"role\": \"system\", \"content\": [{...}]}, {\"role\": \"system\", \"content\": [{...}]}]" |
gen_ai.request.presence_penalty | float | optional | Model configuration parameter. | 0.5 |
gen_ai.request.temperature | float | optional | Model configuration parameter. | 0.1 |
gen_ai.request.top_p | float | optional | Model configuration parameter. | 0.7 |
gen_ai.response.tool_calls | string | optional | The tool calls in the model's response. [0] | "[{\"name\": \"random_number\", \"type\": \"function_call\", \"arguments\": \"...\"}]" |
gen_ai.response.text | string | optional | The text representation of the model's responses. [0] | "[\"The weather in Paris is rainy\", \"The weather in London is sunny\"]" |
gen_ai.usage.input_tokens.cached | int | optional | The number of cached tokens used in the AI input (prompt) | 50 |
gen_ai.usage.input_tokens | int | optional | The number of tokens used in the AI input (prompt). | 10 |
gen_ai.usage.output_tokens.reasoning | int | optional | The number of tokens used for reasoning. | 30 |
gen_ai.usage.output_tokens | int | optional | The number of tokens used in the AI response. | 100 |
gen_ai.usage.total_tokens | int | optional | The total number of tokens used to process the prompt. (input and output) | 190 |
- [0]: Span attributes only allow primitive data types. This means you need to use a stringified version of a list of dictionaries. Do NOT set
[{"foo": "bar"}]
but rather the string"[{\"foo\": \"bar\"}]"
. - [1]: Each message item uses the format
{role:"", content:""}
. Therole
can be"user"
,"assistant"
, or"system"
. Thecontent
can be either a string or a list of dictionaries.
import sentry_sdk
sentry_sdk.init(...)
# some example implementation for demonstration
my_ai = MyAI(
model_provider="openai",
model="o3-mini",
model_config={
"temperature": 0.1,
"presence_penalty": 0.5,
}
)
with sentry_sdk.start_span(
op="gen_ai.chat",
name=f"chat {my_ai.model}",
) as span:
# set data about LLM
span.set_data("gen_ai.operation.name", "chat")
span.set_data("gen_ai.system", my_ai.model_provider)
span.set_data("gen_ai.request.model", my_ai.model)
# set up messages for LLM
max_tokens = 1024
prompt = "Tell me a joke"
messages = [
{"role": "user", "content": prompt},
]
# set chat request data
span.set_data("gen_ai.request.message", json.dumbs(messages))
span.set_data("gen_ai.request.max_tokens", max_tokens)
span.set_data(
"gen_ai.request.temperature",
my_ai.model_config.get("temperature"),
)
span.set_data(
"gen_ai.request.presence_penalty",
my_ai.model_config.get("presence_penalty"),
)
# ask the LLM
result = my_ai.messages.create(
messages=messages,
max_tokens=max_tokens,
)
# set response
# we assume result.output is a dictionary
# type of `gen_ai.response.text` needs to be a string
span.set_data("gen_ai.response.text", json.dumps([result.output]))
# set token usage
# we assume the result includes the tokens used
span.set_data("gen_ai.usage.input_tokens", result.usage.input_tokens)
span.set_data("gen_ai.usage.output_tokens", result.usage.output_tokens)
Describes a tool execution.
- The span
op
MUST be"gen_ai.execute_tool"
. - The span
name
SHOULD be"gen_ai.execute_tool {gen_ai.tool.name}"
. (e.g."gen_ai.execute_tool query_database"
) - The
gen_ai.tool.name
attribute SHOULD be set to the name of the tool. (e.g."query_database"
) - All Common Span Attributes SHOULD be set (all
required
common attributes MUST be set).
Additional attributes on the span:
Data Attribute | Type | Requirement Level | Description | Example |
---|---|---|---|---|
gen_ai.tool.description | string | optional | Description of the tool executed. | "Tool returning a random number" |
gen_ai.tool.input | string | optional | Input that was given to the executed tool as string. | "{\"max\":10}" |
gen_ai.tool.name | string | optional | Name of the tool executed. | "random_number" |
gen_ai.tool.output | string | optional | The output from the tool. | "7" |
gen_ai.tool.type | string | optional | The type of the tools. | "function" ; "extension" ; "datastore" |
import sentry_sdk
sentry_sdk.init(...)
# some example implementation for demonstration
my_ai = MyAI(
model_provider="openai",
model="o3-mini",
)
with sentry_sdk.start_span(op="gen_ai.chat", ...):
# .. some code ..
result = my_ai.messages.create(
messages=messages,
max_tokens=1024,
)
# .. some code ..
# we assume the llm tells us to call a tool in the result
if my_should_call_tool(result):
# we parse the result to know which tool to call
tool = my_get_tool_to_call(result)
with sentry_sdk.start_span(
op="gen_ai.execute_tool",
name=f"gen_ai.execute_tool {tool.name}"
) as span:
# set data about LLM and tool
span.set_data("gen_ai.system", my_ai.model_provider)
span.set_data("gen_ai.request.model", my_ai.model)
span.set_data("gen_ai.tool.type", "function")
span.set_data("gen_ai.tool.name", tool.name)
span.set_data("gen_ai.tool.description", tool.description)
# run tool
tool_result = my_run_tool(tool)
# set tool result
span.set_data("gen_ai.tool.output", json.dumps(tool_result))
A span that describes the handoff from one agent to another.
- The spans
op
MUST be"gen_ai.handoff"
. - The spans
name
SHOULD be"handoff from {from_agent} to {to_agent}"
. - All Common Span Attributes SHOULD be set.
import sentry_sdk
sentry_sdk.init(...)
# some example agent implementation for demonstration
my_agent = MyAgent(
name="Weather Agent",
model_provider="openai",
model="o3-mini",
)
with sentry_sdk.start_span(op="gen_ai.invoke_agent", ...):
# .. some code ..
result = my_agent.run()
# .. some code ..
# we assume the llm tells us to handoff to another agent
if my_should_handoff(result):
# we parse the result to know which agent to handoff to
other_agent = my_get_agent(result)
with sentry_sdk.start_span(
op="gen_ai.handoff",
name=f"handoff from {my_agent.name} to {other_agent.name}",
):
# the handoff span just marks the handoff
pass
with sentry_sdk.start_span(op="gen_ai.invoke_agent", ...):
# .. some code ..
other_agent.run()
# .. some code ..
Some attributes are common to all AI Agents spans:
Data Attribute | Type | Requirement Level | Description | Example |
---|---|---|---|---|
gen_ai.system | string | required | The Generative AI product as identified by the client or server instrumentation. [0] | "openai" |
gen_ai.request.model | string | required | The name of the AI model a request is being made to. | "o3-mini" |
gen_ai.operation.name | string | optional | The name of the operation being performed. [1] | "chat" |
gen_ai.agent.name | string | optional | The name of the agent this span belongs to. | "Weather Agent" |
[0] Well-defined values for data attribute gen_ai.system
:
Value | Description |
---|---|
"anthropic" | Anthropic |
"aws.bedrock" | AWS Bedrock |
"az.ai.inference" | Azure AI Inference |
"az.ai.openai" | Azure OpenAI |
"cohere" | Cohere |
"deepseek" | DeepSeek |
"gcp.gemini" | Gemini |
"gcp.gen_ai" | Any Google generative AI endpoint |
"gcp.vertex_ai" | Vertex AI |
"groq" | Groq |
"ibm.watsonx.ai" | IBM Watsonx AI |
"mistral_ai" | Mistral AI |
"openai" | OpenAI |
"perplexity" | Perplexity |
"xai" | xAI |
[1] Well-defined values for data attribute gen_ai.operation.name
:
Value | Description |
---|---|
"chat" | Chat completion operation such as OpenAI Chat API |
"create_agent" | Create GenAI agent |
"embeddings" | Embeddings operation such as OpenAI Create embeddings API |
"execute_tool" | Execute a tool |
"generate_content" | Multimodal content generation operation such as Gemini Generate Content |
"invoke_agent" | Invoke GenAI agent |
Our documentation is open source and available on GitHub. Your contributions are welcome, whether fixing a typo (drat!) or suggesting an update ("yeah, this would be better").