Get started with EvalGate in 5 minutes
Start with the smallest useful version of EvalGate: one local gate that blocks test and eval regressions in CI. No account is required for that first path. Add the platform when you need dashboard traces, historical eval runs, LLM judge scoring, and review workflows.
Zero-config quick start
No account is required for local regression gating. Run two commands to create a baseline and add the CI gate.
npx @evalgate/sdk init detects your package manager, runs your existing test script to capture a baseline, and scaffolds evals/baseline.json, evalgate.config.json, and .github/workflows/evalgate-gate.yml. When you push and open a PR, the installed CI workflow runs the same test script, compares test health against the baseline, and fails the build if test health regresses.
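The baseline comparison described above can be pictured with a small sketch. Everything here is an assumption made for illustration: the TestHealth shape, the compareToBaseline function, and the rule that more failures or fewer passes count as a regression are hypothetical, and do not reflect the real schema of evals/baseline.json or the SDK's internal logic.

```typescript
// Illustrative sketch only. These types and names are hypothetical and are
// not the EvalGate SDK API or the actual evals/baseline.json schema.
interface TestHealth {
  passed: number;
  failed: number;
}

interface GateResult {
  regressed: boolean;
  reasons: string[];
}

// Compare the current test run against the stored baseline.
// Assumed rule: more failures or fewer passes than the baseline is a regression.
function compareToBaseline(baseline: TestHealth, current: TestHealth): GateResult {
  const reasons: string[] = [];
  if (current.failed > baseline.failed) {
    reasons.push(`failing tests rose from ${baseline.failed} to ${current.failed}`);
  }
  if (current.passed < baseline.passed) {
    reasons.push(`passing tests fell from ${baseline.passed} to ${current.passed}`);
  }
  return { regressed: reasons.length > 0, reasons };
}

const storedBaseline: TestHealth = { passed: 42, failed: 0 };
const currentRun: TestHealth = { passed: 41, failed: 1 };
const gate = compareToBaseline(storedBaseline, currentRun);
console.log(gate.regressed ? "gate failed" : "gate passed", gate.reasons);
```

A real gate would read the baseline from disk and exit non-zero on regression; the sketch keeps the comparison pure so the rule itself is easy to see.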
No API key or EvalGate account is needed for local regression gating. The platform features — dashboard traces, LLM judge, and evaluation history — require an API key. See the manual setup section below.
Manual setup with the platform
If you want dashboard traces, historical evaluation runs, and the LLM judge, create an account and follow these steps.
Create an API key
Sign in to your EvalGate account and navigate to the Developer Dashboard. Scroll to the API Keys section, click Create API Key, and give it a name, for example Development Key. Select the scopes you need (start with all scopes for initial testing), then click Create Key.
You’ll also see your Organization ID in the key creation dialog. Save that value alongside the key; you’ll need both.
Configure environment variables
Create a .env file in your project root and add your credentials. Add .env to your .gitignore immediately to avoid committing secrets.
The SDK reads both variables automatically; no additional configuration is required.
Initialize the client
Import and initialize the SDK in your application code. Calling AIEvalClient.init() with no arguments auto-loads EVALGATE_API_KEY and EVALGATE_ORGANIZATION_ID from the environment.
Create your first trace
A trace represents a single LLM interaction. Spans within the trace capture the individual steps: the model call, tool use, retrieval, or any sub-operation you want to observe.
After you run your tracing code, the trace appears in your EvalGate dashboard under Traces.
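To make the relationship between a trace and its spans concrete, here is a minimal data-model sketch. It does not use the SDK and sends nothing to the dashboard; the Trace and Span types, the SpanKind values, and traceDurationMs are all hypothetical names chosen for illustration, not the SDK's actual types.

```typescript
// Hypothetical data model for illustration; these are not the EvalGate SDK types.
type SpanKind = "model_call" | "tool_use" | "retrieval" | "other";

interface Span {
  name: string;
  kind: SpanKind;
  startMs: number; // offset from the start of the interaction
  endMs: number;
}

interface Trace {
  name: string; // one trace per LLM interaction
  spans: Span[];
}

// Wall-clock time covered by the trace: earliest span start to latest span end.
function traceDurationMs(trace: Trace): number {
  if (trace.spans.length === 0) return 0;
  const start = Math.min(...trace.spans.map((s) => s.startMs));
  const end = Math.max(...trace.spans.map((s) => s.endMs));
  return end - start;
}

const exampleTrace: Trace = {
  name: "answer-question",
  spans: [
    { name: "fetch-docs", kind: "retrieval", startMs: 0, endMs: 120 },
    { name: "generate", kind: "model_call", startMs: 120, endMs: 900 },
  ],
};

console.log(traceDurationMs(exampleTrace)); // 900
```

Modeling each step as its own span is what lets you see where time goes inside a single interaction, for example retrieval versus generation.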
Write your first eval
An eval suite defines test cases with inputs and assertions that verify your LLM’s output for correctness, safety, and quality. The suite runner handles execution, parallelism, and reporting.
EvalGate includes 20+ built-in assertions covering text content, safety and compliance, JSON structure, quality, and numeric thresholds. Each assertion in a failing case surfaces a precise failure reason in run artifacts, the dashboard, and GitHub annotations when you use the platform check --format github path.
Add a CI regression gate
Once your evals are in place, add one step to your CI workflow to block regressions on every PR.
.github/workflows/evalgate.yml
.evalgate/, and compares results against the base branch when --base is provided. Add --impacted-only to run only specs affected by the current diff. With --format github, the command writes a GitHub step summary and emits annotations for failed or regressed specs. Exit codes: 0 for clean, 1 for regressions, 2 for a configuration issue.
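If you wrap the gate in your own script, the exit codes listed above can be mapped to outcomes as in the sketch below. Only the three code values (0, 1, 2) come from this page; the GateOutcome type and the describeExitCode and shouldBlockMerge functions are hypothetical helpers, not part of the EvalGate CLI or SDK.

```typescript
// Exit code meanings are taken from the text above; all names here are hypothetical.
type GateOutcome = "clean" | "regression" | "config_issue" | "unknown";

function describeExitCode(code: number): GateOutcome {
  switch (code) {
    case 0:
      return "clean";
    case 1:
      return "regression";
    case 2:
      return "config_issue";
    default:
      return "unknown"; // anything outside the documented codes
  }
}

// Assumed policy: block the merge on anything other than a clean run.
function shouldBlockMerge(code: number): boolean {
  return describeExitCode(code) !== "clean";
}

for (const code of [0, 1, 2]) {
  console.log(code, describeExitCode(code), shouldBlockMerge(code));
}
```

Treating a configuration issue as blocking is a design choice in this sketch: a misconfigured gate that silently passes would defeat the purpose of running it on every PR.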
What’s next
TypeScript SDK reference
Full API for traces, assertions, test suites, judge configuration, and CLI commands.
Python SDK reference
Python parity for all core workflows: traces, evals, gate, CI, and the assertion library.
CI/CD integration guide
Advanced CI configuration — custom base branches, JSON output, impact analysis, and GitLab CI.
Authentication
How to create and manage API keys, configure environment variables, and secure your credentials.