
What are Tests?

Tests are automated regression checks that validate your agent’s behavior. Unlike Evaluations (which run after every real call), Tests run on demand in the Advanced Editor to verify that your agent responds correctly to specific scenarios before you deploy changes.
Run tests after modifying prompts, knowledge base, or actions to catch regressions before they affect real customers.

Accessing Tests

Tests are located in the right panel of the Advanced Editor. Click the Tests tab (clipboard icon) to access them.

Tests vs Evaluations

Feature       | Tests                                        | Evaluations
When they run | On-demand in the editor                      | After every real call
Purpose       | Validate specific scenarios before deploy    | Measure call quality over time
Setup         | Define conversation flow + success criteria  | Define yes/no questions
Output        | Pass/fail per test                           | Pass/fail per evaluation per call

Test Types

Scenario Tests

A Scenario test defines a fixed conversation flow. You specify the exact messages (user and agent) and a success condition. The system evaluates whether the agent’s response meets your criteria.
Property          | Description                        | Example
Chat history      | Alternating user/agent messages    | User: “Hola” → Agent: “Buenos días…”
Success condition | What the agent should achieve      | “El agente responde de manera profesional y útil”
Success examples  | Sample responses that should pass  | “Buenos días, ¿en qué puedo ayudarle?”
Failure examples  | Sample responses that should fail  | “No tengo idea”
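
For intuition, a Scenario test amounts to a small record: the fixed chat history plus the evaluation criteria. The TypeScript sketch below is purely illustrative; the type and field names are assumptions, not the product’s actual schema.

```typescript
// Hypothetical shape of a Scenario test -- field names are illustrative only.
type ChatMessage = {
  role: "user" | "agent";
  text: string;
};

type ScenarioTest = {
  name: string;
  chatHistory: ChatMessage[]; // fixed, alternating user/agent messages
  successCondition: string;   // what the agent should achieve
  successExamples: string[];  // responses that should pass
  failureExamples: string[];  // responses that should fail
};

// Example instance mirroring the table above.
const greetingTest: ScenarioTest = {
  name: "Professional greeting",
  chatHistory: [{ role: "user", text: "Hola" }],
  successCondition: "El agente responde de manera profesional y útil",
  successExamples: ["Buenos días, ¿en qué puedo ayudarle?"],
  failureExamples: ["No tengo idea"],
};
```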

Simulation Tests

A Simulation test uses an AI to simulate a user. You define a persona, goal, and first message. The system runs a full conversation and evaluates the outcome.
Property      | Description                    | Example
First message | How the simulated user starts  | “Hola, buenos días”
Persona       | Simulated user’s profile       | “Cliente interesado en información”
Goal          | What the simulated user wants  | “Obtener información sobre el servicio”
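
Likewise, a Simulation test can be pictured as a record of the three properties above. Again, this TypeScript sketch is illustrative only; the type and field names are assumptions.

```typescript
// Hypothetical shape of a Simulation test -- illustrative field names only.
type SimulationTest = {
  name: string;
  firstMessage: string; // how the simulated user opens the conversation
  persona: string;      // profile of the simulated user
  goal: string;         // what the simulated user is trying to achieve
};

const infoRequestTest: SimulationTest = {
  name: "Service information request",
  firstMessage: "Hola, buenos días",
  persona: "Cliente interesado en información",
  goal: "Obtener información sobre el servicio",
};
```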

Creating a Test

  1. Open Tests Panel: Click the Tests tab in the right panel.
  2. Click New: Click the “New” button to start the creation wizard.
  3. Step 1 - Name & Type: Enter a descriptive name and choose Scenario or Simulation.
  4. Step 2 - Conversation: For Scenario, add the conversation flow (user/agent messages). For Simulation, set the first message, persona, and goal.
  5. Step 3 - Criteria: Define the success condition and add success/failure examples to guide the evaluation (see the sketch after these steps).
  6. Save: The test is created and appears in the list.
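
To make the criteria step concrete, here is a hypothetical sketch of how a success condition and its examples could be combined with an agent response for an LLM evaluator to judge. The platform’s actual evaluation prompt is not documented here; the function and its wording are assumptions.

```typescript
// Hypothetical evaluator prompt assembly -- not the platform's actual prompt.
function buildEvaluatorPrompt(
  agentResponse: string,
  successCondition: string,
  successExamples: string[],
  failureExamples: string[],
): string {
  return [
    `Success condition: ${successCondition}`,
    `Responses that should PASS: ${successExamples.join(" | ")}`,
    `Responses that should FAIL: ${failureExamples.join(" | ")}`,
    `Agent response to judge: ${agentResponse}`,
    `Answer PASS or FAIL.`,
  ].join("\n");
}
```

The clearer the success/failure examples, the less ambiguous a prompt like this becomes, which is why providing both kinds of examples matters.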

Context Variables

Tests can use dynamic variables that replace placeholders in your agent’s prompts during test execution. Common use cases:
Variable            | Purpose                       | Example
fecha_y_hora_actual | Current date/time in Spanish  | “Hoy es Miércoles 12 de Febrero de 2026 a las 14:30”
Input parameters    | Agent-specific variables      | Customer name, order number, etc.
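
As a rough illustration of how these variables could be injected, the sketch below substitutes values into placeholders in a prompt template. The {{variable}} placeholder syntax and the function itself are assumptions, not the platform’s documented behavior.

```typescript
// Hypothetical placeholder substitution -- the {{variable}} syntax is an assumption.
function injectContextVariables(
  promptTemplate: string,
  variables: Record<string, string>,
): string {
  // Replace each {{name}} with its value; leave unknown placeholders untouched.
  return promptTemplate.replace(/\{\{(\w+)\}\}/g, (match, name) =>
    name in variables ? variables[name] : match,
  );
}

const template = "Saluda según la hora. {{fecha_y_hora_actual}}";
console.log(
  injectContextVariables(template, {
    fecha_y_hora_actual: "Hoy es Miércoles 12 de Febrero de 2026 a las 14:30",
  }),
);
```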

Setting Context Variables

  1. Click the Context variables icon (code brackets) in the Tests header
  2. Enter values for each variable
  3. Click the refresh icon next to fecha_y_hora_actual to update the current time
  4. Run tests - variables are injected into the agent’s context
Use fecha_y_hora_actual when your agent greets with “Buenos días” or references the current date. Tests run with the values you set, not the live time.

Running Tests

Run All Tests

Click Run All to execute every test. Results appear in the History tab.

Run Selected Tests

  1. Select tests using the checkboxes
  2. Click Run All (when any tests are selected, only those run) or use the play button on individual test cards

Run a Single Test

Click the play icon (▶) on any test card to run just that test.

History Tab

The History tab shows past test runs:
Column  | Description
Date    | When the run was executed
Status  | Pass count, fail count, or running
Actions | View details, retry failed, retry all

Viewing Run Details

  1. Switch to the History tab
  2. Click a run to open the detail modal
  3. See individual test results with agent responses and evaluation reasons
  4. Use Retry failed to re-run only failed tests
  5. Use Retry all to re-run the entire batch

Simulate Conversation

The Simulate feature lets you run a one-off simulation without creating a test:
  1. Click the chat bubble icon in the Tests header
  2. Configure first message, persona, and goal
  3. Set turn limit (default 10)
  4. Click to start - watch the simulated conversation unfold in real-time
Simulate is useful for quick exploratory testing; a rough sketch of the underlying turn loop follows below. Use Scenario/Simulation tests when you need repeatable, automated validation.
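
For intuition, the turn-limited loop behind Simulate can be sketched roughly as follows. The two reply functions are hypothetical placeholders standing in for the agent and the AI-simulated user; this is not the platform’s API.

```typescript
// Rough sketch of a turn-limited simulation loop -- the reply functions are
// hypothetical placeholders, not the platform's API.
type Turn = { speaker: "user" | "agent"; text: string };

type SimulationConfig = {
  firstMessage: string;
  persona: string;
  goal: string;
  turnLimit: number; // defaults to 10 in the UI
};

async function simulateConversation(
  config: SimulationConfig,
  agentReply: (history: Turn[]) => Promise<string>,
  simulatedUserReply: (config: SimulationConfig, history: Turn[]) => Promise<string>,
): Promise<Turn[]> {
  // The simulated user opens with the configured first message.
  const history: Turn[] = [{ speaker: "user", text: config.firstMessage }];
  for (let turn = 0; turn < config.turnLimit; turn++) {
    // Agent answers the latest simulated-user message.
    history.push({ speaker: "agent", text: await agentReply(history) });
    // Simulated user, guided by persona and goal, responds in character.
    history.push({ speaker: "user", text: await simulatedUserReply(config, history) });
  }
  return history;
}
```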

Managing Tests

Edit a Test

Click the menu (⋮) on a test card → Edit. Modify the conversation, criteria, or examples.

Clone a Test

Click the menu (⋮) → Clone to create a copy. Useful for creating variations (e.g., different first message, same criteria).

Delete a Test

Click the menu (⋮) → Delete and confirm.

Filter Tests

Use the filter dropdown to show:
  • All Tests - Every test
  • Passing - Only tests that passed last run
  • Failing - Only tests that failed last run

Test Results

Each test card shows its last result:
Status  | Meaning
Pass    | Agent response met the success criteria
Fail    | Agent response did not meet the criteria
Running | Test is currently executing
Pending | Not run yet
Click a test card to see inline details: agent response, evaluation reason, and full conversation when available.
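
As a mental model, the result shown on a test card can be described by a small record. The TypeScript below is illustrative only; the field names are assumptions, not the product’s schema.

```typescript
// Illustrative model of a single test result -- not the product's actual schema.
type TestStatus = "pass" | "fail" | "running" | "pending";

type TestResult = {
  testName: string;
  status: TestStatus;
  agentResponse?: string;    // the response that was evaluated
  evaluationReason?: string; // why the evaluator judged pass/fail
  conversation?: { role: "user" | "agent"; text: string }[]; // when available
};
```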

Best Practices

  • Create tests for your most important flows: greetings, main use case, objections, and compliance.
  • Provide clear examples so the LLM evaluator understands what “good” and “bad” responses look like.
  • If your agent uses fecha_y_hora_actual or input parameters, set them in the context panel before running tests.
  • Run your test suite before publishing changes to catch regressions early.
  • After Copilot suggests changes, create tests to validate those improvements.

Next Steps