Datasets

This topic explains how to create, manage, and use datasets in offline evaluations. Datasets define the inputs AgentControl uses to evaluate model behavior before release. Datasets are used only in offline evaluations, never in online evaluations.

Offline evaluations run config variations or LLM inputs against datasets that you upload. They score the outputs of those evaluations using criteria such as built-in scorers or judges defined as AgentControl configs. You can reuse the same dataset across multiple evaluation runs. This lets you compare variations and validate changes before you roll them out.

Prepare your dataset

A dataset is a file in CSV or JSONL format. Uploaded dataset files must be smaller than 32 MB and have fewer than 10,000 rows. Each row represents a single evaluation task that LaunchDarkly evaluates independently during a run.
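Before you upload, you can check a file against these limits locally. The following is a minimal sketch, not LaunchDarkly's validation logic. The helper names are hypothetical, and it assumes that 32 MB means 32 * 1024 * 1024 bytes and that the first record of a CSV file is a header rather than a data row.

```python
import csv
import os

MAX_BYTES = 32 * 1024 * 1024  # assumption: "32 MB" is interpreted as 32 MiB
MAX_ROWS = 10_000             # documented limit: fewer than 10,000 rows


def count_rows(path: str) -> int:
    """Count data rows in a .csv or .jsonl file."""
    lower = path.lower()
    if lower.endswith(".jsonl"):
        with open(path, encoding="utf-8") as f:
            # Each non-blank line is one JSON object, which is one row.
            return sum(1 for line in f if line.strip())
    if lower.endswith(".csv"):
        with open(path, newline="", encoding="utf-8") as f:
            records = sum(1 for row in csv.reader(f) if row)
        # Assumption: the first CSV record is a header, not a data row.
        return max(records - 1, 0)
    raise ValueError("expected a .csv or .jsonl file")


def check_limits(path: str) -> list[str]:
    """Return a list of problems. An empty list means the file is within limits."""
    problems = []
    if os.path.getsize(path) >= MAX_BYTES:
        problems.append("file is not smaller than 32 MB")
    if count_rows(path) >= MAX_ROWS:
        problems.append("file does not have fewer than 10,000 rows")
    return problems
```

Calling check_limits("dataset.jsonl") returns an empty list when the file passes both checks.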

Each row can include the following fields:

input: The prompt or request sent to the model.
expected_output: The ideal or target output for the associated input. After you complete an evaluation, you can use this field to compare what you expected against what the model actually returns.
context: Supporting information provided alongside the input, such as retrieved documents or tool responses.
variables: Named values that populate placeholders in your config prompt templates at runtime.

LaunchDarkly generates one model output per row and evaluates it using the criteria you configure.
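The fields above can be assembled into a JSONL file with a short script. This is a sketch under stated assumptions: the second row's values are made up for illustration, and the shapes used for context (a plain string) and variables (an object that maps placeholder names to values) are assumptions, not a documented schema. The function names are hypothetical.

```python
import json

rows = [
    {
        "input": "What is the price of the iPhone 15?",
        "expected_output": "$799",
    },
    {
        # Illustrative values only.
        "input": "Summarize the return policy.",
        "expected_output": "Returns are accepted within 30 days.",
        # Assumption: context is a plain string of supporting text.
        "context": "Policy excerpt: items may be returned within 30 days of delivery.",
        # Assumption: variables maps placeholder names to values.
        "variables": {"tone": "concise"},
    },
]


def to_jsonl(rows: list[dict]) -> str:
    """Serialize each row as one JSON object per line."""
    return "".join(json.dumps(row, ensure_ascii=False) + "\n" for row in rows)


def write_dataset(path: str, rows: list[dict]) -> None:
    """Write the rows to a JSONL file at the given path."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(to_jsonl(rows))
```

Calling write_dataset("dataset.jsonl", rows) produces a file with one row per line that you can then upload.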

Example dataset row

{ "input": "What is the price of the iPhone 15?", "expected_output": "$799" }

Use this structure to compare generated outputs against expected results for known scenarios.
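One simple way to compare an expected result against a generated output is a normalized exact match. The sketch below is illustrative only. It is not one of the built-in scorers, and the function names are hypothetical.

```python
def exact_match(expected: str, actual: str) -> bool:
    """Compare two strings after trimming whitespace and ignoring case."""
    return expected.strip().casefold() == actual.strip().casefold()


def match_rate(pairs: list[tuple[str, str]]) -> float:
    """Return the fraction of (expected, actual) pairs that match exactly."""
    if not pairs:
        return 0.0
    return sum(exact_match(expected, actual) for expected, actual in pairs) / len(pairs)
```

Exact matching is strict: an output such as "The price is $799" does not match an expected output of "$799". That strictness is one reason to use a judge for free-form answers.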

Upload datasets

To use a dataset in an offline evaluation, upload a CSV or JSONL file.

In the left sidebar, click Agents. The AgentControl menu appears.
Click Library.
Select the Datasets tab.
Click Upload dataset. The “Upload dataset” panel opens.
(Optional) Enter a Name for the dataset. If you don’t specify a name, the dataset will use the same name as the file you upload.
Drag and drop or click to select your dataset file.
Click Save dataset.

After you upload a dataset, LaunchDarkly validates and processes the file for use in evaluation runs. This includes validating the file format, detecting the dataset schema, and enforcing row and size limits. LaunchDarkly also computes a dataset hash for deduplication and stores dataset metadata.
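To catch problems before you upload, you can approximate some of these checks locally for a JSONL file. This sketch is not LaunchDarkly's implementation. The function names are hypothetical, and SHA-256 is an assumption chosen for illustration because the hash algorithm LaunchDarkly uses is not stated here.

```python
import hashlib
import json

KNOWN_FIELDS = ("input", "expected_output", "context", "variables")


def detect_schema(jsonl_text: str) -> list[str]:
    """Return the documented fields that appear in at least one row."""
    seen = set()
    for number, line in enumerate(jsonl_text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        row = json.loads(line)  # raises json.JSONDecodeError on malformed JSON
        if not isinstance(row, dict):
            raise ValueError(f"line {number}: expected a JSON object")
        seen.update(row)
    return [name for name in KNOWN_FIELDS if name in seen]


def content_hash(data: bytes) -> str:
    """Hash the raw file bytes. Assumption: SHA-256, for illustration only."""
    return hashlib.sha256(data).hexdigest()
```

Two files with identical bytes produce the same hash, which is the property that makes a content hash useful for spotting duplicate uploads.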

When validation completes, the dataset “Status” field in the Datasets tab updates to “ready.” The dataset is now available for evaluation runs.

If validation fails, an error appears. Correct the error in your file, then upload the dataset again.

Datasets in evaluations

When you configure an offline evaluation, you select a dataset to use as input. To view evaluations that have used a dataset:

Navigate to the library and click into the Datasets tab.
Find the dataset whose evaluations you want to view.
Click the three-dot overflow menu and choose View evaluations. The “Evaluations” page opens.

Manage datasets

After you upload a dataset, it appears in the AgentControl library. You can now use it in evaluation runs.

As you use the dataset, the “# of Evaluations” column in the Datasets tab updates to show how many evaluations have used that dataset.

Delete datasets

You can delete a dataset if you no longer need it. Here’s how:

Navigate to the library and click into the Datasets tab.
Find the dataset you want to delete.
Click the three-dot overflow menu and choose Delete. A confirmation message appears.
Verify that you wish to delete the dataset and click Delete dataset.