Creating warehouse native experiments
Overview
This topic explains how to set up and configure an experiment in LaunchDarkly that uses metric events from your own data warehouse.
Before you create a warehouse native experiment, you must enable Snowflake Data Export and configure the Warehouse Native Experimentation app in Snowflake for your LaunchDarkly account.
Configuring a warehouse native experiment requires several steps:
- Creating the flag and its variations,
- Creating one or more metrics,
- Building the experiment,
- Turning on the feature flag, and
- Starting an iteration.
These steps are explained in detail below.
Prerequisites
Before you build a warehouse native experiment, you must:
- enable Snowflake Data Export in your LaunchDarkly account
- configure the Warehouse Native Experimentation app in Snowflake
- understand randomization units
Create flags
Before you begin an experiment, create a flag with the variations whose performance you plan to test. You do not need to toggle on the flag before you create an experiment, but you do have to toggle on the flag before you start an experiment iteration.
You cannot run a warehouse native experiment on a flag if:
- the flag has an active guarded rollout
- the flag has an active progressive rollout
- the flag is already in a running experiment
- the flag is a migration flag
You can build multiple warehouse native experiments on the same flag, but you can run only one of those experiments at a time.
To learn more, read Creating new flags and Creating flag variations.
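If you prefer to create flags programmatically, the sketch below shows one way to do it with the LaunchDarkly REST API. It is a minimal, non-authoritative example: the project key, flag key, name, and variations are placeholder values, and the API token is assumed to come from an environment variable.

```typescript
// Minimal sketch: create a boolean flag with two variations using the
// LaunchDarkly REST API. Requires Node 18+ for the built-in fetch API.
// LD_API_TOKEN, the project key, and the flag details are placeholders.
const apiToken = process.env.LD_API_TOKEN ?? "";
const projectKey = "my-project"; // hypothetical project key

async function createFlag(): Promise<void> {
  const response = await fetch(
    `https://app.launchdarkly.com/api/v2/flags/${projectKey}`,
    {
      method: "POST",
      headers: {
        Authorization: apiToken,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        key: "checkout-redesign", // hypothetical flag key
        name: "Checkout redesign",
        description: "Flag for the warehouse native checkout experiment",
        variations: [{ value: true }, { value: false }],
      }),
    },
  );

  if (!response.ok) {
    throw new Error(`Flag creation failed: ${response.status} ${await response.text()}`);
  }
  console.log("Created flag:", await response.json());
}

createFlag().catch(console.error);
```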
Create metrics
Warehouse native experiments do not work with all metric types
Warehouse native experiments are compatible with custom conversion binary, custom conversion count, and custom numeric metrics that use the “average” analysis method. You cannot use clicked or tapped metrics, page viewed metrics, metric groups, or metrics that use a percentile analysis method with warehouse native experiments.
Any metrics you use in a warehouse native experiment must be receiving metric events from your external warehouse. You can create new custom metrics for this, or you can use any existing metrics that already measure metric events from your external warehouse.
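As a rough illustration, here is one way you might create a compatible custom numeric metric with the REST API. The keys, event name, and unit are placeholders, and the exact payload fields for the analysis method should be confirmed against the Metrics API reference.

```typescript
// Minimal sketch: create a custom numeric metric with the LaunchDarkly REST API.
// The metric listens for a custom event key that your warehouse export emits;
// all identifiers below are placeholders.
const apiToken = process.env.LD_API_TOKEN ?? "";
const projectKey = "my-project"; // hypothetical project key

async function createMetric(): Promise<void> {
  const response = await fetch(
    `https://app.launchdarkly.com/api/v2/metrics/${projectKey}`,
    {
      method: "POST",
      headers: {
        Authorization: apiToken,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        key: "order-total",          // hypothetical metric key
        name: "Order total",
        kind: "custom",              // custom metric, not click or page view
        isNumeric: true,
        unit: "USD",
        eventKey: "order-completed", // hypothetical event key sent from your warehouse
        successCriteria: "HigherThanBaseline",
        // The metric must use the "average" analysis method; check the exact
        // field for this setting in the Metrics API reference.
      }),
    },
  );

  if (!response.ok) {
    throw new Error(`Metric creation failed: ${response.status} ${await response.text()}`);
  }
  console.log("Created metric:", await response.json());
}

createMetric().catch(console.error);
```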
Build experiments
To build an experiment:
- Click Create and choose Experiment. The “Create experiment” dialog appears.
- Enter an experiment Name.
- Enter a Hypothesis.
- Click Create experiment. The experiment Design tab appears.
- Select the Snowflake native experiment type.
- Choose a context kind to Randomize by.
- Select one or more Metrics. The metrics you select must use the “average” analysis method and must include custom events coming from Snowflake.
- (Optional) If you have added multiple metrics and want to change the primary metric, hover over the metric name and click primary.
- Click Create to create and use a new metric.
- Choose a Flag to use in the experiment.
- Click Create flag to create and use a new flag.
- Choose a targeting rule for the Experiment audience.
- If you want to restrict your experiment audience to only contexts with certain attributes, create a targeting rule on the flag you include in the experiment and run the experiment on that rule.
- If you don’t want to restrict the audience for your experiment, run the experiment on the default rule. If the flag doesn’t have any targeting rules, the default rule will be the only option.
- (Optional) If you want to exclude contexts in this experiment from certain other experiments, click Add experiment to exclusion layer and select a layer.
Layer options
A layer is a set of experiments that cannot share traffic with each other. All of the experiments within a layer are mutually exclusive, which means that if a context is included in one experiment, LaunchDarkly will exclude it from any other experiments in the same layer.
To add the experiment to an existing layer:
- Click Select layer.
- Search for and choose the layer you want to add the experiment to.
- Enter a Reservation amount. This is the percentage of the contexts within this layer you want LaunchDarkly to include in this experiment.
- Click Save layer.
If you need to create a new layer:
- Click Create layer.
- Add a Name and Description.
- Click Create layer.
- Enter a Reservation amount. This is the percentage of the contexts within this layer you want LaunchDarkly to include in this experiment.
- Click Save layer.
- Choose the Variation served to users outside this experiment. Contexts that match the selected targeting rule but are not in the experiment will receive this variation.
- Select the Sample size for the experiment. This is the percentage of all of the contexts that match the experiment’s targeting rule that you want to include in the experiment.
- (Optional) Click Advanced to edit variation reassignment. For most experiments, we recommend leaving this option on its default setting. To learn more, read Carryover bias and variation reassignment.
- (Optional) Click Edit to update the variation split for contexts that are in the experiment.
- You can Split equally between variations, or assign a higher percentage of contexts to some variations than others.
- Click Save audience split.
- Select a variation to serve as the Control.
- Select a Statistical approach of Bayesian or frequentist.
- If you selected a statistical approach of Bayesian, select a preset or Custom success threshold.
- If you selected a statistical approach of frequentist, select:
- a Significance level.
- a one-sided or two-sided Direction of hypothesis test.
Statistical approach options
You can select a statistical approach of Bayesian or frequentist. Each approach includes one or more analysis options.
We recommend Bayesian when you have a small sample size of less than a thousand contexts, and we recommend frequentist when you have a larger sample size of a thousand or more.
The Bayesian options include:
- Threshold:
- 90% probability to beat control is the standard success threshold, but you can raise the threshold to 95% or 99% if you want to be more confident in your experiment results.
- You can lower the threshold to less than 90% using the Custom option. We recommend a lower threshold only when you are experimenting on non-critical parts of your app and are less concerned with determining a clear winning variation.
The frequentist options include:
- Significance level:
- 0.05 p-value is the standard significance level, but you can lower the level to 0.01 or raise the level to 0.10, depending on whether you need to be more or less confident in your results. A lower significance level means that you can be more confident in your winning variation.
- You can raise the significance level to more than 0.10 using the Custom option. We recommend a higher significance level only when you are experimenting on non-critical parts of your app and are less concerned with determining a clear winning variation.
- Direction of hypothesis test:
- Two-sided: We recommend two-sided when you’re in doubt about whether the difference between the control and the treatment variations will be negative or positive, and want to look for indications of statistical significance in both directions.
- One-sided: We recommend one-sided when you feel confident that the difference between the control and treatment variations will be either negative or positive, and want to look for indications of statistical significance only in one direction.
To learn more, read Bayesian versus frequentist statistics.
- (Optional) If you want to include the experiment in a holdout, click Advanced, then select a Holdout name.
Experiments cannot be in a holdout and in a layer at the same time
Experiments can either be in a holdout or in a layer, but not both. If you added the experiment to a layer, you will not see the option to add it to a holdout.
- (Optional) If you want to be able to filter your experiment results by attribute, click Advanced, then select up to five context attributes to filter results by.
- Scroll to the top of the page and click Save.
If needed, you can save your in-progress experiment design to finish later. To save your design, click Save at the top of the creation screen. Your in-progress experiment design is saved and appears on the Experiments list. To finish building the experiment, click on the experiment’s name and continue editing.
After you have created your experiment, the next step is to toggle on the flag. Then, you can start an iteration.
You can also use the REST API: Create experiment
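For reference, a request to that endpoint might look roughly like the sketch below. The project, environment, and experiment details are placeholders, and the iteration object is abbreviated: the full design (flag, metrics, treatments, and randomization unit) follows the schema in the Create experiment API reference.

```typescript
// Minimal sketch: create an experiment with the LaunchDarkly REST API.
// Project key, environment key, and all experiment details are placeholders;
// the iteration object is abbreviated and must follow the API reference schema.
const apiToken = process.env.LD_API_TOKEN ?? "";
const projectKey = "my-project";
const environmentKey = "production";

async function createExperiment(): Promise<void> {
  const response = await fetch(
    `https://app.launchdarkly.com/api/v2/projects/${projectKey}/environments/${environmentKey}/experiments`,
    {
      method: "POST",
      headers: {
        Authorization: apiToken,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        name: "Checkout redesign experiment", // hypothetical experiment name
        key: "checkout-redesign-experiment",
        iteration: {
          hypothesis: "The redesigned checkout increases average order total.",
          // Add the flag, metrics, treatments, and randomization unit here,
          // following the schema in the Create experiment API reference.
        },
      }),
    },
  );

  if (!response.ok) {
    throw new Error(`Experiment creation failed: ${response.status} ${await response.text()}`);
  }
  console.log("Created experiment:", await response.json());
}

createExperiment().catch(console.error);
```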
Turn on feature flags
For an experiment to begin recording data, the flag used in the experiment must be on. To learn how, read Turning flags on and off.
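If you manage flags through the REST API instead of the UI, you can turn a flag on with a semantic patch request. This is a sketch with placeholder keys; the semantic patch content type and the turnFlagOn instruction are described in the flags API reference.

```typescript
// Minimal sketch: turn a flag on in one environment using the REST API's
// semantic patch format. Project, flag, and environment keys are placeholders.
const apiToken = process.env.LD_API_TOKEN ?? "";
const projectKey = "my-project";
const flagKey = "checkout-redesign";

async function turnFlagOn(): Promise<void> {
  const response = await fetch(
    `https://app.launchdarkly.com/api/v2/flags/${projectKey}/${flagKey}`,
    {
      method: "PATCH",
      headers: {
        Authorization: apiToken,
        "Content-Type": "application/json; domain-model=launchdarkly.semanticpatch",
      },
      body: JSON.stringify({
        environmentKey: "production",
        instructions: [{ kind: "turnFlagOn" }],
      }),
    },
  );

  if (!response.ok) {
    throw new Error(`Flag update failed: ${response.status} ${await response.text()}`);
  }
}

turnFlagOn().catch(console.error);
```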
Start experiment iterations
After you create an experiment and toggle on the flag, you can start an experiment iteration in one or more environments.
To start an experiment iteration:
- Navigate to the Experiments list.
- Click on the environment section containing the experiment you want to start.
- If the environment you need isn’t visible, click the + next to the list of environment sections. Search for the environment you want, and select it from the list.
- Click on the name of the experiment you want to start an iteration for. The Design tab appears.
- Click Start.
- Repeat steps 1-4 for each environment you want to start an iteration in.
Experiment iterations allow you to record experiments in individual blocks of time. To ensure accurate experiment results, when you make changes that impact an experiment, LaunchDarkly starts a new iteration of the experiment.
To learn more about starting and stopping iterations, read Starting and stopping experiment iterations.
You can also use the REST API: Create iteration
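A request to that endpoint might look roughly like this sketch. All keys are placeholders, and the body is abbreviated because the iteration design follows the schema in the Create iteration API reference.

```typescript
// Minimal sketch: create a new iteration for an existing experiment via the
// REST API. All keys are placeholders; the request body is abbreviated and
// must follow the Create iteration API reference schema.
const apiToken = process.env.LD_API_TOKEN ?? "";
const url =
  "https://app.launchdarkly.com/api/v2/projects/my-project" +
  "/environments/production/experiments/checkout-redesign-experiment/iterations";

async function createIteration(): Promise<void> {
  const response = await fetch(url, {
    method: "POST",
    headers: { Authorization: apiToken, "Content-Type": "application/json" },
    body: JSON.stringify({
      hypothesis: "The redesigned checkout increases average order total.",
      // Include the flag, metrics, and treatments for the new iteration here,
      // following the Create iteration API reference.
    }),
  });

  if (!response.ok) {
    throw new Error(`Iteration creation failed: ${response.status} ${await response.text()}`);
  }
}

createIteration().catch(console.error);
```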