Annotations API — human labeling and review

Create annotation tasks, assign traces for human review, and submit labels to build the golden dataset used for measuring LLM judge credibility.

Human annotations are the foundation of judge credibility in EvalGate. When you label a set of traces as pass or fail, those labels become the ground truth that the LLM Judge alignment endpoint compares against automated judge scores. A judge with high alignment against a well-labeled dataset is one you can trust to gate your CI pipeline.

GET /api/annotations/tasks — list annotation tasks

Returns annotation tasks for the authenticated organization.

curl https://evalgate.com/api/annotations/tasks \
  -H "Authorization: Bearer YOUR_API_KEY"

Response

{
  "tasks": [
    {
      "id": 12,
      "name": "Support quality review — March",
      "status": "in_progress",
      "itemCount": 120,
      "completedCount": 87,
      "organizationId": "00000000-0000-4000-8000-000000000001",
      "createdAt": "2026-03-01T09:00:00.000Z"
    }
  ]
}

tasks

array

id

integer

Unique task ID.

name

string

Display name for the task.

status

string

Task status: draft, in_progress, or completed.

itemCount

integer

Total number of items (traces) assigned to this task.

completedCount

integer

Number of items that have received a label.

organizationId

string

Owning organization UUID.

createdAt

string

ISO 8601 creation timestamp.
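
As a rough illustration of how itemCount and completedCount relate, the sketch below computes per-task labeling progress from a list response shaped like the example above. The helper name task_progress is hypothetical and not part of EvalGate; it only processes JSON that has already been fetched.

```python
import json


def task_progress(list_response: dict) -> dict:
    """Map each task ID to the fraction of its items that have a label."""
    progress = {}
    for task in list_response.get("tasks", []):
        total = task["itemCount"]
        done = task["completedCount"]
        # Guard against tasks with no items to avoid dividing by zero.
        progress[task["id"]] = done / total if total else 0.0
    return progress


# Trimmed copy of the example response shown above.
sample = json.loads("""
{
  "tasks": [
    {"id": 12, "status": "in_progress", "itemCount": 120, "completedCount": 87}
  ]
}
""")

print(task_progress(sample))  # {12: 0.725}
```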

POST /api/annotations/tasks — create an annotation task

Creates a new task and assigns a set of traces for labeling.

curl https://evalgate.com/api/annotations/tasks \
  -X POST \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Support quality review — March",
    "traceIds": [42, 43, 44, 45]
  }'

Request body

name

string

required

Display name for this annotation task.

traceIds

array

required

Array of numeric trace IDs to include in this task. Each trace will become one annotation item.

instructions

string

Optional guidance text shown to annotators when they open the task.

labelOptions

array

Optional array of label strings annotators can choose from. Defaults to ["pass", "fail"] when not specified.

Response (201)

{
  "id": 13,
  "name": "Support quality review — March",
  "status": "draft",
  "itemCount": 4,
  "completedCount": 0,
  "organizationId": "00000000-0000-4000-8000-000000000001",
  "createdAt": "2026-03-15T11:30:00.000Z"
}
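
The same request can be assembled from Python. The following is a minimal sketch using only the standard library; the helper name build_create_task_request is hypothetical, the base URL is copied from the curl examples, and the client-side rejection of an empty traceIds list is an assumption rather than documented API behavior. The request is built but not sent.

```python
import json
import urllib.request

API_BASE = "https://evalgate.com/api"  # taken from the curl examples above


def build_create_task_request(api_key, name, trace_ids,
                              instructions=None, label_options=None):
    """Build (but do not send) a POST request that creates an annotation task."""
    if not name:
        raise ValueError("name is required")
    if not trace_ids:
        # Assumption: an empty list is treated as a client-side error here.
        raise ValueError("traceIds is required")
    body = {"name": name, "traceIds": list(trace_ids)}
    # Optional fields are only included when the caller provides them.
    if instructions is not None:
        body["instructions"] = instructions
    if label_options is not None:
        body["labelOptions"] = list(label_options)
    return urllib.request.Request(
        f"{API_BASE}/annotations/tasks",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_create_task_request(
    "YOUR_API_KEY", "Support quality review", [42, 43, 44, 45]
)
print(req.get_method(), req.full_url)
```

Passing the result to urllib.request.urlopen(req) would send it; that step is left out here so the sketch runs without network access or a real API key.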

GET /api/annotations/tasks/ — get task details

Returns a single annotation task with its items.

curl https://evalgate.com/api/annotations/tasks/12 \
  -H "Authorization: Bearer YOUR_API_KEY"

Path parameters

integer

required

Numeric ID of the annotation task.

Response

{
  "id": 12,
  "name": "Support quality review — March",
  "status": "in_progress",
  "items": [
    {
      "id": 201,
      "traceId": 42,
      "label": "pass",
      "notes": "Clear and complete response",
      "labeledAt": "2026-03-10T14:22:00.000Z",
      "labeledBy": "user@example.com"
    },
    {
      "id": 202,
      "traceId": 43,
      "label": null,
      "notes": null,
      "labeledAt": null,
      "labeledBy": null
    }
  ]
}

items

array

id

integer

Unique item ID.

traceId

integer

ID of the trace this item references.

label

string | null

The label assigned by the annotator. null if not yet labeled.

notes

string | null

Optional free-text notes from the annotator.

labeledAt

string | null

ISO 8601 timestamp when the label was submitted. null if not yet labeled.

labeledBy

string | null

Email or identifier of the annotator who submitted the label. null if not yet labeled.
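
Because unlabeled items carry a null label, the remaining work in a task can be found by filtering on that field. A minimal sketch, assuming a task details response shaped like the example above; the helper name unlabeled_item_ids is hypothetical.

```python
def unlabeled_item_ids(task: dict) -> list:
    """Return the IDs of items in a task details response that have no label yet."""
    return [
        item["id"]
        for item in task.get("items", [])
        if item.get("label") is None  # JSON null is parsed as None
    ]


# Trimmed copy of the example response shown above.
task = {
    "id": 12,
    "items": [
        {"id": 201, "traceId": 42, "label": "pass"},
        {"id": 202, "traceId": 43, "label": None},
    ],
}

print(unlabeled_item_ids(task))  # [202]
```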

POST /api/annotations/tasks//items — submit an annotation

Submits a label for a single annotation item within a task.

curl https://evalgate.com/api/annotations/tasks/12/items \
  -X POST \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "itemId": 202,
    "label": "fail",
    "notes": "Response did not address the user'\''s core question"
  }'

Path parameters

integer

required

Numeric ID of the annotation task.

Request body

itemId

integer

required

Numeric ID of the annotation item to label.

label

string

required

The label to assign. Must be one of the task’s configured labelOptions; if the task defines none, the defaults pass and fail apply.

notes

string

Optional free-text notes explaining the label decision. These are stored alongside the label for audit and inter-rater review.

Response

{
  "id": 202,
  "traceId": 43,
  "label": "fail",
  "notes": "Response did not address the user's core question",
  "labeledAt": "2026-03-15T12:05:00.000Z",
  "labeledBy": "user@example.com"
}
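
Checking the label locally before submitting can catch typos early. The sketch below builds the request body and validates the label against the task's options; the helper name build_annotation_body is hypothetical, and how the server responds to an unknown label is not described here, so the local check is only a convenience.

```python
import json

DEFAULT_LABEL_OPTIONS = ("pass", "fail")  # documented default when labelOptions is unset


def build_annotation_body(item_id, label, notes=None,
                          label_options=DEFAULT_LABEL_OPTIONS):
    """Build the JSON body for submitting a label to one annotation item."""
    if label not in label_options:
        raise ValueError(f"label {label!r} is not one of {list(label_options)}")
    body = {"itemId": item_id, "label": label}
    # notes is optional, so it is only sent when provided.
    if notes is not None:
        body["notes"] = notes
    return body


body = build_annotation_body(
    202, "fail", notes="Response did not address the user's core question"
)
print(json.dumps(body))
```

Serializing with json.dumps also avoids the shell quoting needed for the apostrophe in the curl example.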

Once a task has enough labels, run the LLM Judge alignment check to measure how well your automated judge agrees with your team’s ground truth. A high-alignment judge is safe to use as an automated CI gate.
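
To make the idea of alignment concrete, here is a small sketch that computes a plain agreement rate between human labels and judge verdicts. This is an illustration only: the metric the LLM Judge alignment endpoint actually uses is not described on this page, and both the helper name agreement_rate and the judge_verdicts mapping (trace ID to verdict) are hypothetical.

```python
def agreement_rate(items, judge_verdicts):
    """Fraction of labeled items whose human label matches the judge verdict.

    Items without a human label or without a judge verdict are skipped.
    Returns None when there is nothing to compare.
    """
    compared = 0
    agreed = 0
    for item in items:
        human = item.get("label")
        judge = judge_verdicts.get(item["traceId"])
        if human is None or judge is None:
            continue
        compared += 1
        if human == judge:
            agreed += 1
    return agreed / compared if compared else None


items = [
    {"id": 201, "traceId": 42, "label": "pass"},
    {"id": 202, "traceId": 43, "label": "fail"},
    {"id": 203, "traceId": 44, "label": None},  # not yet labeled, skipped
]
judge_verdicts = {42: "pass", 43: "pass", 44: "fail"}  # hypothetical judge output

print(agreement_rate(items, judge_verdicts))  # 0.5
```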
