What are Tests?
Tests are automated regression checks that validate your agent’s behavior. Unlike Evaluations (which run after every real call), Tests run on-demand in the Advanced Editor to verify your agent responds correctly to specific scenarios before you deploy changes. Run tests after modifying prompts, knowledge base, or actions to catch regressions before they affect real customers.
Accessing Tests
Tests are located in the right panel of the Advanced Editor. Click the Tests tab (clipboard icon) to access them.
Tests vs Evaluations
| Feature | Tests | Evaluations |
|---|---|---|
| When they run | On-demand in the editor | After every real call |
| Purpose | Validate specific scenarios before deploy | Measure call quality over time |
| Setup | Define conversation flow + success criteria | Define yes/no questions |
| Output | Pass/fail per test | Pass/fail per evaluation per call |
Test Types
Scenario Tests
A Scenario test defines a fixed conversation flow. You specify the exact messages (user and agent) and a success condition, and the system evaluates whether the agent’s response meets your criteria.
| Property | Description | Example |
|---|---|---|
| Chat history | Alternating user/agent messages | User: “Hola” → Agent: “Buenos días…” |
| Success condition | What the agent should achieve | “El agente responde de manera profesional y útil” |
| Success examples | Sample responses that should pass | “Buenos días, ¿en qué puedo ayudarle?” |
| Failure examples | Sample responses that should fail | “No tengo idea” |
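As a mental model, these properties bundle into a single record per test. The sketch below is illustrative only; the dict shape and field names are assumptions, not the platform’s actual schema:

```python
# Illustrative only: field names and structure are assumptions,
# not the platform's actual test schema.
scenario_test = {
    "name": "Greeting is professional",
    # Alternating user/agent messages that set up the scenario
    "chat_history": [
        {"role": "user", "content": "Hola"},
    ],
    # What the agent's next response should achieve
    "success_condition": "El agente responde de manera profesional y útil",
    # Examples that calibrate the evaluator
    "success_examples": ["Buenos días, ¿en qué puedo ayudarle?"],
    "failure_examples": ["No tengo idea"],
}
```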
Simulation Tests
A Simulation test uses an AI to simulate a user. You define a persona, a goal, and a first message; the system runs a full conversation and evaluates the outcome.
| Property | Description | Example |
|---|---|---|
| First message | How the simulated user starts | “Hola, buenos días” |
| Persona | Simulated user’s profile | “Cliente interesado en información” |
| Goal | What the simulated user wants | “Obtener información sobre el servicio” |
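By analogy, a Simulation test reduces to these three pieces of configuration plus a name. Again, this is a hypothetical sketch, not the actual schema:

```python
# Illustrative only: a hypothetical shape for a Simulation test.
simulation_test = {
    "name": "Customer asks for service info",
    "first_message": "Hola, buenos días",            # how the simulated user starts
    "persona": "Cliente interesado en información",  # the simulated user's profile
    "goal": "Obtener información sobre el servicio", # what the simulated user wants
}
```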
Creating a Test
Step 1 - Type
Choose the test type: Scenario (scripted conversation) or Simulation (AI-simulated user).
Step 2 - Conversation
For Scenario: add the conversation flow (alternating user/agent messages). For Simulation: set the first message, persona, and goal.
Step 3 - Criteria
Define the success condition and add success/failure examples to guide the evaluation.
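Conceptually, the evaluator is an LLM judging the agent’s response against your condition, with your examples as calibration. Below is a minimal sketch of how such a judge prompt could be assembled; `build_judge_prompt` is hypothetical, and the platform’s real evaluator prompt is internal and may differ:

```python
def build_judge_prompt(response: str, condition: str,
                       good: list, bad: list) -> str:
    """Assemble an LLM-judge prompt from a test's criteria (conceptual sketch)."""
    good_block = "\n".join(f"- {g}" for g in good)
    bad_block = "\n".join(f"- {b}" for b in bad)
    return (
        f"Success condition: {condition}\n"
        f"Example responses that should PASS:\n{good_block}\n"
        f"Example responses that should FAIL:\n{bad_block}\n"
        f"Agent response to judge:\n{response}\n"
        "Answer PASS or FAIL, with a one-sentence reason."
    )
```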
Context Variables
Tests can use dynamic variables that replace placeholders in your agent’s prompts during test execution. Common use cases:
| Variable | Purpose | Example |
|---|---|---|
| fecha_y_hora_actual | Current date/time in Spanish | “Hoy es Miércoles 12 de Febrero de 2026 a las 14:30” |
| Input parameters | Agent-specific variables | Customer name, order number, etc. |
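The substitution itself is plain template replacement: each placeholder in the prompt is swapped for the value you provide at test time. A minimal sketch, assuming `{{variable}}`-style placeholders (check your agent’s prompts for the actual convention):

```python
import re

def inject_variables(prompt: str, variables: dict) -> str:
    """Replace {{name}} placeholders with test-time values.

    The {{...}} syntax is an assumption for illustration.
    """
    return re.sub(
        r"\{\{(\w+)\}\}",
        lambda m: variables.get(m.group(1), m.group(0)),  # leave unknown names intact
        prompt,
    )

prompt = "Saluda al cliente. {{fecha_y_hora_actual}}"
print(inject_variables(prompt, {
    "fecha_y_hora_actual": "Hoy es Miércoles 12 de Febrero de 2026 a las 14:30",
}))
```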
Setting Context Variables
- Click the Context variables icon (code brackets) in the Tests header
- Enter values for each variable
- Click the refresh icon next to fecha_y_hora_actual to update the current time
- Run tests - variables are injected into the agent’s context
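If you ever need to reproduce the refreshed value outside the editor, it is just a formatted timestamp. A sketch that emits the same Spanish format as the example above, with hand-rolled day/month names so no Spanish locale is required:

```python
from datetime import datetime
from typing import Optional

DIAS = ["Lunes", "Martes", "Miércoles", "Jueves", "Viernes", "Sábado", "Domingo"]
MESES = ["Enero", "Febrero", "Marzo", "Abril", "Mayo", "Junio", "Julio",
         "Agosto", "Septiembre", "Octubre", "Noviembre", "Diciembre"]

def fecha_y_hora_actual(now: Optional[datetime] = None) -> str:
    """Format a moment like 'Hoy es Miércoles 12 de Febrero de 2026 a las 14:30'."""
    now = now or datetime.now()
    return (f"Hoy es {DIAS[now.weekday()]} {now.day} de "
            f"{MESES[now.month - 1]} de {now.year} a las {now:%H:%M}")

print(fecha_y_hora_actual())  # e.g. "Hoy es Viernes 3 de Julio de 2026 a las 09:15"
```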
Running Tests
Run All Tests
Click Run All to execute every test. Results appear in the History tab.
Run Selected Tests
- Select tests using the checkboxes
- Click Run All (when any tests are selected, it runs only those) or the play button on individual test cards
Run a Single Test
Click the play icon (▶) on any test card to run just that test.
History Tab
The History tab shows past test runs:
| Column | Description |
|---|---|
| Date | When the run was executed |
| Status | Pass count, fail count, or running |
| Actions | View details, retry failed, retry all |
Viewing Run Details
- Switch to the History tab
- Click a run to open the detail modal
- See individual test results with agent responses and evaluation reasons
- Use Retry failed to re-run only failed tests
- Use Retry all to re-run the entire batch
Simulate Conversation
The Simulate feature lets you run a one-off simulation without creating a test:
- Click the chat bubble icon in the Tests header
- Configure first message, persona, and goal
- Set turn limit (default 10)
- Click to start, then watch the simulated conversation unfold in real time
Simulate is useful for quick exploratory testing. Use Scenario/Simulation tests when you need repeatable, automated validation.
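Under the hood, a simulation is essentially a bounded loop that alternates between the simulated user and the agent until the goal is reached or the turn limit runs out. The sketch below is conceptual; `agent_reply` and `simulated_user_reply` are stand-ins for the real model calls, which are not documented here:

```python
from typing import Optional

def agent_reply(transcript: list) -> str:
    # Stand-in for the real agent; replace with an actual call.
    return "Claro, con gusto le doy la información."

def simulated_user_reply(transcript: list, persona: str, goal: str) -> Optional[str]:
    # Stand-in for the AI user; returning None ends the conversation.
    return None

def run_simulation(first_message: str, persona: str, goal: str,
                   turn_limit: int = 10) -> list:
    """Conceptual sketch of the simulation loop (default turn limit of 10)."""
    transcript = [{"role": "user", "content": first_message}]
    for _ in range(turn_limit):
        transcript.append({"role": "agent", "content": agent_reply(transcript)})
        user_msg = simulated_user_reply(transcript, persona, goal)
        if user_msg is None:  # goal met or simulated user gives up
            break
        transcript.append({"role": "user", "content": user_msg})
    return transcript

print(run_simulation("Hola, buenos días",
                     "Cliente interesado en información",
                     "Obtener información sobre el servicio"))
```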
Managing Tests
Edit a Test
Click the menu (⋮) on a test card → Edit. Modify the conversation, criteria, or examples.
Clone a Test
Click the menu (⋮) → Clone to create a copy. Useful for creating variations (e.g., a different first message with the same criteria).
Delete a Test
Click the menu (⋮) → Delete and confirm.
Filter Tests
Use the filter dropdown to show:
- All Tests - Every test
- Passing - Only tests that passed last run
- Failing - Only tests that failed last run
Test Results
Each test card shows its last result:
| Icon | Status | Meaning |
|---|---|---|
| ✓ | Pass | Agent response met success criteria |
| ✗ | Fail | Agent response did not meet criteria |
| ⟳ | Running | Test is currently executing |
| ○ | Pending | Not run yet |
Best Practices
Test Critical Paths
Create tests for your most important flows: greetings, main use case, objections, and compliance.
Use Success/Failure Examples
Provide clear examples so the LLM evaluator understands what “good” and “bad” responses look like.
Set Context Variables
If your agent uses fecha_y_hora_actual or input parameters, set them in the context panel before running tests.
Run Before Publish
Run your test suite before publishing changes to catch regressions early.
Combine with Copilot
After Copilot suggests changes, create tests to validate those improvements.

