Traces API
Traces give EvalGate visibility into LLM calls and multi-step AI workflows. Each trace belongs to the organization associated with your API key. Each span captures one operation inside the trace, such as an LLM call, retrieval step, tool call, or agent step.
POST /api/collector
Use the collector when you want to ingest one trace and all of its spans in a single request. The trace and spans are inserted transactionally.
Response
Sampling controls whether an ingested trace is queued for failure analysis. Errors and thumbs-down feedback are always analyzed. Successful traces are sampled for analysis at the server default rate of 10%.
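The sampling rule above can be illustrated with a small client-side sketch. This is not EvalGate's server code: the function name, its parameters, and the injectable random source are hypothetical, and only the always-analyze cases and the 10% default come from the text. The text does not say how non-error, non-success traces are treated, so the sketch samples them like successful ones.

```python
import random
from typing import Callable

DEFAULT_SAMPLE_RATE = 0.10  # server default rate stated in the text


def should_analyze(
    status: str,
    thumbs_down: bool = False,
    sample_rate: float = DEFAULT_SAMPLE_RATE,
    rng: Callable[[], float] = random.random,
) -> bool:
    """Illustrate whether a trace would be queued for failure analysis.

    Hypothetical helper: mirrors the documented rule, not the real server.
    """
    # Errors and thumbs-down feedback are always analyzed.
    if status == "error" or thumbs_down:
        return True
    # ASSUMPTION: every other trace is sampled like a successful one.
    return rng() < sample_rate
```

Passing a fixed `rng` makes the decision deterministic, which is convenient when estimating how many traces a workload would send to analysis.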
POST /api/collector/batch
Use the batch collector for up to 100 traces per request. Each trace is ingested independently, so one failed trace does not fail the whole batch.
Response
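On the request side, the 100-trace limit means a larger set of traces has to be split before sending. The sketch below shows one way to do that. The request body shape is not given in this section, so the top-level `traces` key and both helper names are assumptions made for illustration.

```python
import json
from typing import Any, Dict, Iterator, List

MAX_BATCH_SIZE = 100  # per-request limit stated in the text


def chunk_traces(
    traces: List[Dict[str, Any]], size: int = MAX_BATCH_SIZE
) -> Iterator[List[Dict[str, Any]]]:
    """Yield consecutive slices of at most `size` traces."""
    if not 1 <= size <= MAX_BATCH_SIZE:
        raise ValueError(f"size must be between 1 and {MAX_BATCH_SIZE}")
    for start in range(0, len(traces), size):
        yield traces[start:start + size]


def batch_bodies(traces: List[Dict[str, Any]]) -> List[str]:
    """Serialize one JSON body per batch request.

    ASSUMPTION: the "traces" wrapper key is hypothetical; check the real
    request schema before using this.
    """
    return [json.dumps({"traces": chunk}) for chunk in chunk_traces(traces)]
```

Each resulting body would be sent as a separate POST to /api/collector/batch. Because traces are ingested independently, a client would still need to check the outcome of each trace rather than only the HTTP status.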
GET /api/traces
Returns traces for the authenticated organization. Supports filtering and pagination.
Maximum number of traces to return. Defaults to 50, maximum 100.
Number of results to skip for pagination. Defaults to 0.
Filter by trace status: pending, success, or error.
Filter by trace name using a partial match.
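As a rough illustration of how these filters and pagination settings combine, the sketch below builds a list URL. The parameter names did not survive in this section, so `limit`, `offset`, `status`, and `name` are assumed names, and the base URL is a placeholder. Only the defaults, the maximum of 100, and the status values come from the text.

```python
from typing import Optional
from urllib.parse import urlencode

VALID_STATUSES = {"pending", "success", "error"}


def list_traces_url(
    base_url: str,
    limit: int = 50,
    offset: int = 0,
    status: Optional[str] = None,
    name: Optional[str] = None,
) -> str:
    """Build a GET /api/traces URL (query parameter names are assumptions)."""
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    if offset < 0:
        raise ValueError("offset must be 0 or greater")
    if status is not None and status not in VALID_STATUSES:
        raise ValueError(f"status must be one of {sorted(VALID_STATUSES)}")

    params = {"limit": limit, "offset": offset}
    if status is not None:
        params["status"] = status
    if name:
        params["name"] = name  # partial match is applied by the server
    return f"{base_url.rstrip('/')}/api/traces?{urlencode(params)}"
```

To page through results, a client would increase `offset` by `limit` on each request until fewer than `limit` traces come back.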
POST /api/traces
Creates a single trace record directly. Use this for low-volume programmatic workflows. Use /api/collector when you want to create the trace and spans together.
Display name for this trace.
Unique identifier string you assign. If omitted, EvalGate generates one.
Initial status: pending, success, or error. Defaults to pending.
End-to-end duration of the traced operation in milliseconds.
Arbitrary JSON object for model name, user ID, session ID, feature flags, or other context.
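The fields above could be assembled into a request body along these lines. The field names are not shown in this section, so the keys `name`, `traceId`, `status`, `durationMs`, and `metadata` are hypothetical stand-ins. Only the allowed status values and the `pending` default come from the text.

```python
from typing import Any, Dict, Optional

VALID_STATUSES = {"pending", "success", "error"}


def build_trace_body(
    name: str,
    trace_id: Optional[str] = None,
    status: str = "pending",
    duration_ms: Optional[int] = None,
    metadata: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Assemble a POST /api/traces body (all key names are assumptions)."""
    if status not in VALID_STATUSES:
        raise ValueError(f"status must be one of {sorted(VALID_STATUSES)}")
    if duration_ms is not None and duration_ms < 0:
        raise ValueError("duration_ms must not be negative")

    body: Dict[str, Any] = {"name": name, "status": status}
    if trace_id is not None:
        body["traceId"] = trace_id  # omitted: EvalGate generates one
    if duration_ms is not None:
        body["durationMs"] = duration_ms
    if metadata is not None:
        body["metadata"] = metadata
    return body
```

Leaving optional keys out entirely, rather than sending nulls, lets the server apply its documented defaults.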
GET /api/traces/
Returns a single trace and its spans.
Numeric database ID of the trace to retrieve.
POST /api/traces//spans
Adds a span to an existing trace.
Numeric database ID of the parent trace.
Display name for this span.
Unique identifier string you assign to this span. If omitted, EvalGate generates one.
Span type, such as llm, tool, or retrieval.
The input to this operation.
The output from this operation.
Additional context such as token counts, latency, model name, or tool name.
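A span body built from the fields above might look like the following sketch. The key names `name`, `spanId`, `type`, `input`, `output`, and `metadata` are assumptions, since the field names are not shown in this section. The sketch covers only the JSON body and leaves the request path as given above.

```python
from typing import Any, Dict, Optional


def build_span_body(
    name: str,
    span_type: str,
    input_data: Any = None,
    output_data: Any = None,
    span_id: Optional[str] = None,
    metadata: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Assemble a span body for an existing trace (key names are assumptions)."""
    # The text lists llm, tool, and retrieval as examples ("such as"),
    # so other span types are not rejected here.
    body: Dict[str, Any] = {"name": name, "type": span_type}
    if span_id is not None:
        body["spanId"] = span_id  # omitted: EvalGate generates one
    if input_data is not None:
        body["input"] = input_data
    if output_data is not None:
        body["output"] = output_data
    if metadata is not None:
        body["metadata"] = metadata
    return body


# Example: an LLM call span carrying token counts in its metadata.
llm_span = build_span_body(
    "generate-answer",
    "llm",
    input_data={"prompt": "Summarize the ticket."},
    output_data="The customer reports a billing error.",
    metadata={"model": "example-model", "total_tokens": 42},
)
```

Token counts, latency, and model name fit in the metadata object, so they stay separate from the span's input and output.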