Datasets


Overview

This topic explains how to create, manage, and use datasets for offline evaluations. Datasets define the inputs AgentControl uses to evaluate model behavior before release.

Offline evaluations run config variations or LLM inputs against uploaded datasets. They score the outputs of those evaluations using criteria such as built-in scorers or judges defined as AgentControl configs. Datasets support repeatable evaluation workflows. You can reuse the same dataset across multiple evaluation runs to compare variations and validate changes before rollout.

Prepare your dataset

A dataset is a file in CSV or JSONL format. Each row represents a single evaluation task that LaunchDarkly evaluates independently during a run.

Each row can include the following fields:

  • input: The prompt or request sent to the model.
  • expected_output: The ideal or target output for the associated input. After you complete an evaluation, you can use this field to compare what you expected against what the model actually returns.
  • context: Supporting information provided alongside the input, such as retrieved documents or tool responses.
  • variables: Named values that populate placeholders in your config prompt templates at runtime.

LaunchDarkly generates one model output per row and evaluates it using the criteria you configure.
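As a rough illustration of the fields above, the following Python sketch builds two rows and serializes them as JSONL. The field names come from this topic. Everything else is an assumption made for the example: the helper names `make_row` and `to_jsonl`, treating `input` as the only mandatory field, representing `context` as a string and `variables` as an object, the sample return-policy text, and the `dataset.jsonl` file name.

```python
import json


# Field names come from this topic; the value types are assumptions.
def make_row(input_text, expected_output=None, context=None, variables=None):
    """Build one dataset row, omitting fields that were not provided."""
    row = {"input": input_text}
    if expected_output is not None:
        row["expected_output"] = expected_output
    if context is not None:
        row["context"] = context
    if variables is not None:
        row["variables"] = variables
    return row


def to_jsonl(rows):
    """Serialize rows as JSONL: one JSON object per line."""
    return "".join(json.dumps(row, ensure_ascii=False) + "\n" for row in rows)


rows = [
    make_row("What is the price of the iPhone 15?", expected_output="$799"),
    make_row(
        "Summarize the return policy.",  # made-up sample data
        context="Returns are accepted within 30 days of purchase.",
        variables={"tone": "friendly"},
    ),
]

if __name__ == "__main__":
    # "dataset.jsonl" is a hypothetical file name.
    with open("dataset.jsonl", "w", encoding="utf-8") as f:
        f.write(to_jsonl(rows))
```

Omitting unset fields keeps each row compact. Whether LaunchDarkly requires any particular field is not stated in this topic, so check validation results after upload.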

Example dataset row

{ "input": "What is the price of the iPhone 15?", "expected_output": "$799" }

Use this structure to compare generated outputs against expected results for known scenarios.
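To make the comparison concrete, here is a minimal Python sketch of an exact-match check between a generated output and `expected_output`. It is a simple stand-in for illustration, not one of LaunchDarkly's built-in scorers or judges. The `normalize` and `exact_match` helpers and the hard-coded model outputs are hypothetical.

```python
def normalize(text):
    """Lowercase and collapse whitespace so trivial differences do not count."""
    return " ".join(text.lower().split())


def exact_match(generated, expected):
    """Return True when the two strings match after normalization."""
    return normalize(generated) == normalize(expected)


row = {"input": "What is the price of the iPhone 15?", "expected_output": "$799"}

# Hypothetical model outputs, hard-coded for illustration.
print(exact_match("$799", row["expected_output"]))            # True
print(exact_match("It costs $799.", row["expected_output"]))  # False
```

The second result shows why exact matching is strict: a correct but differently worded answer fails, which is one reason to consider a judge-style criterion for free-form outputs.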

Upload datasets

To use a dataset in an offline evaluation, upload a CSV or JSONL file.

  1. In the left navigation, click Agents. The AgentControl menu appears.
  2. Click Library.
  3. Select the Datasets tab.
  4. Click Upload dataset. The “Upload dataset” panel opens.
  5. (Optional) Enter a Name for the dataset. If you don’t specify a name, the dataset will use the same name as the file you upload.
  6. Drag and drop or click to select your dataset file.
  7. Click Save dataset.

After you upload a dataset, LaunchDarkly validates and processes the file for use in evaluation runs. This includes validating the file format, detecting the dataset schema, and enforcing row and size limits. LaunchDarkly also computes a dataset hash for deduplication and stores dataset metadata.
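The following Python sketch shows what schema detection and content hashing could look like for a JSONL file. It is not LaunchDarkly's implementation. The function names are hypothetical, and SHA-256 is an assumption, because this topic does not say which hash LaunchDarkly uses.

```python
import hashlib
import json


def detect_schema(jsonl_text):
    """Return the sorted field names used across all non-empty rows."""
    fields = set()
    for line in jsonl_text.splitlines():
        if line.strip():
            fields.update(json.loads(line).keys())
    return sorted(fields)


def content_hash(data: bytes):
    """Hash the raw file bytes; identical files produce identical digests."""
    return hashlib.sha256(data).hexdigest()


text = '{"input": "What is the price of the iPhone 15?", "expected_output": "$799"}\n'
print(detect_schema(text))  # ['expected_output', 'input']

# Uploading the same bytes twice yields the same digest, which is what
# makes a content hash usable for deduplication.
print(content_hash(text.encode("utf-8")) == content_hash(text.encode("utf-8")))  # True
```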

When validation completes, the dataset “Status” field in the Datasets tab updates to “ready.” The dataset is now available for evaluation runs.

If validation fails, an error appears. Correct the problem in the file, then upload the dataset again.
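To catch common problems before uploading, you could run a local check like the Python sketch below. It is a hedged example, not a mirror of LaunchDarkly's validation. The `check_jsonl` function is hypothetical, the `MAX_ROWS` value is a placeholder because the actual row and size limits are not stated in this topic, and flagging unknown fields is a local convention that LaunchDarkly may not enforce.

```python
import json

KNOWN_FIELDS = {"input", "expected_output", "context", "variables"}
MAX_ROWS = 1000  # hypothetical limit; the real limits are not stated in this topic


def check_jsonl(text, max_rows=MAX_ROWS):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    row_count = 0
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        row_count += 1
        try:
            row = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append(f"line {number}: invalid JSON ({exc.msg})")
            continue
        if not isinstance(row, dict):
            problems.append(f"line {number}: expected a JSON object")
            continue
        unknown = sorted(set(row) - KNOWN_FIELDS)
        if unknown:
            problems.append(f"line {number}: unknown fields {unknown}")
    if row_count == 0:
        problems.append("file contains no rows")
    if row_count > max_rows:
        problems.append(f"{row_count} rows exceeds the limit of {max_rows}")
    return problems
```

Reporting line numbers makes it easier to find and fix the offending row before you upload the file again.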

Datasets in evaluations

When you configure an offline evaluation, you select a dataset to use as input. To view evaluations that have used a dataset:

  1. In the AgentControl library, select the Datasets tab.
  2. Find the dataset whose evaluations you want to view.
  3. Click the three-dot overflow menu and choose View evaluations. The “Evaluations” page opens.

Manage datasets

After you upload a dataset, it appears in the AgentControl library. You can now use it in evaluation runs.

As you use the dataset, the “# of Evaluations” column in the Datasets tab updates to show how many evaluations have used that dataset.

Delete datasets

You can delete a dataset if you no longer need it. Here’s how:

  1. In the AgentControl library, select the Datasets tab.
  2. Find the dataset you want to delete.
  3. Click the three-dot overflow menu and choose Delete. A confirmation message appears.
  4. Verify that you wish to delete the dataset and click Delete dataset.