Metric winsorization

This topic describes how to configure metric winsorization to limit the impact of extreme values in experiment results.

Restricted to Snowflake data sources

You can configure winsorization only for metrics created from Snowflake data sources, for use with warehouse native Experimentation. You cannot configure winsorization for metrics created from LaunchDarkly hosted events or from other warehouse data sources. To learn more, read Metric event sources.

Limiting the impact of extreme values on experiments

Winsorization is a statistical technique that replaces outlying values with values at a configured percentile. Using metric winsorization helps you limit the impact of extreme values in LaunchDarkly experiment results without introducing bias by selectively removing values.

For example, consider a metric that generates the following (sorted) latency values for a single randomization unit during the course of an experiment:

5, 120, 135, 140, 145, 150, 155, 160, 170, 950

The value 950 is an extreme outlier in this sample and would significantly affect the mean latency. The P90 value for the data is 170, so configuring winsorization for the metric with an upper bound of P90 replaces all values higher than 170 with the value 170. This yields the modified data set:

5, 120, 135, 140, 145, 150, 155, 160, 170, 170

LaunchDarkly supports one-sided or two-sided winsorization, so you can limit the impact of extremely high values, extremely low values, or both. For example, winsorizing the example metric with both a lower bound of P10 and an upper bound of P90 also replaces the low outlying value 5 with 120, yielding the modified data set:

120, 120, 135, 140, 145, 150, 155, 160, 170, 170
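The replacement step in this example can be sketched in Python. This is a minimal illustration, not LaunchDarkly's implementation: the `winsorize` function and its parameters are hypothetical names, and the bound values 120 and 170 are taken directly from the example above rather than computed from a percentile formula.

```python
def winsorize(values, lower=None, upper=None):
    """Clamp each value into [lower, upper]; a bound of None leaves that side unchanged."""
    result = []
    for v in values:
        if lower is not None and v < lower:
            v = lower  # replace a low outlier with the lower bound value
        if upper is not None and v > upper:
            v = upper  # replace a high outlier with the upper bound value
        result.append(v)
    return result


def mean(values):
    return sum(values) / len(values)


latencies = [5, 120, 135, 140, 145, 150, 155, 160, 170, 950]

# One-sided: upper bound only (the P90 value from the example).
one_sided = winsorize(latencies, upper=170)

# Two-sided: lower and upper bounds (the P10 and P90 values from the example).
two_sided = winsorize(latencies, lower=120, upper=170)

print(mean(latencies))  # 213.0
print(mean(one_sided))  # 135.0
print(mean(two_sided))  # 146.5
```

The single value 950 moves the mean from 135.0 to 213.0, which is why a long upper tail can dominate a mean-based metric. No values are dropped, so the sample size stays at 10 in every case.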

Common use cases

Winsorization is most commonly used with metrics that produce a long-tailed distribution of values, such as metrics that measure revenue, latency, or session duration. Using winsorization with an upper bound of P90 or higher limits the impact of extremely high values that would otherwise distort experiment results.

How LaunchDarkly computes winsorization percentiles

LaunchDarkly computes the percentile values for winsorization using all metric values collected for the randomization unit across all arms of a warehouse native experiment. For metrics that use a metric measurement window, LaunchDarkly uses only the values collected within the configured window to determine the percentile values. If no window is configured, the measurement duration corresponds to the length of the experiment itself.

If you choose Include units and set the value to 0 for a numeric metric, LaunchDarkly does not include assigned zero values when it computes winsorization percentiles. To learn more, read Units without events.
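The following Python sketch illustrates the two points above: percentile values come from values pooled across all arms, and units without events do not contribute to them. It rests on several assumptions. The source does not state which percentile formula LaunchDarkly uses, so the nearest-rank method here is only a stand-in. The names `nearest_rank` and `pooled_bounds`, the sample data, and the use of `None` to mark a unit without events are all hypothetical.

```python
import math


def nearest_rank(sorted_values, percentile):
    """Nearest-rank percentile of an ascending list (an assumed method)."""
    if not sorted_values:
        raise ValueError("no observed values")
    rank = max(1, math.ceil(percentile * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def pooled_bounds(values_by_arm, lower_pct, upper_pct):
    """Compute winsorization bounds from observed values pooled across all arms."""
    pooled = sorted(
        v
        for values in values_by_arm.values()
        for v in values
        if v is not None  # units without events do not affect the percentiles
    )
    return nearest_rank(pooled, lower_pct), nearest_rank(pooled, upper_pct)


# Hypothetical data: None marks a unit that would be assigned 0 for analysis.
values_by_arm = {
    "control": [100, 110, 120, 130, 140, None],
    "treatment": [105, 115, 125, 135, 900, None],
}

lower, upper = pooled_bounds(values_by_arm, 10, 90)
print(lower, upper)  # 100 140
```

In this sketch, counting the two units without events as zeros would pull the P10 value down to 0. Excluding them keeps the lower bound tied to values that were actually observed.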

Prerequisites and limitations

You can configure metric winsorization for any warehouse native metric created from a Snowflake data source. To learn more, read Snowflake native Experimentation.

You cannot configure winsorization for metrics created from LaunchDarkly hosted events or other warehouse data sources.

Configuring winsorization

You configure optional winsorization properties after you specify the analysis method for a warehouse native metric.

Configuring metric winsorization.

To configure winsorization:

  1. Open the Data section and navigate to the Metrics list.

  2. Click Create metric. The “Create metric” dialog appears.

  3. Select Warehouse native from the “Event source” drop-down menu.

  4. Select an available Snowflake data source from the “Metric data source” drop-down menu, or create a new data source. To learn more, read Metric data sources.

  5. Search for or enter an Event key to use for the metric.

  6. Choose the metric aggregation from the “Metric definition” section. The window populates a full metric definition using default values.

  7. Change options in the “Metric definition” drop-down menus as needed to change the analysis units or other metric analysis options. To learn more, read Components of a metric.

  8. (Optional) Choose Enable custom measurement window if you want to configure a metric measurement window. To learn more, read Metric measurement window.

  9. Choose Enable winsorization.

  10. Enter percentile values in the Lower bound and Upper bound fields as needed to specify the percentile value(s) used to winsorize extreme values. If you are configuring two-sided winsorization, the Upper bound value must be greater than the Lower bound value.

    Set Lower bound to 0, or Upper bound to 100, to disable winsorization for that bound.

  11. Enter a Metric name and an optional Description.

  12. Click Create.
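The bound rules in step 10 can be summarized with a short Python sketch. The `active_bounds` function is a hypothetical helper for illustration only, not part of any LaunchDarkly SDK or API. It assumes that percentile values range from 0 to 100.

```python
def active_bounds(lower=0, upper=100):
    """Return (lower_active, upper_active) for a pair of percentile bounds."""
    if not (0 <= lower <= 100 and 0 <= upper <= 100):
        raise ValueError("percentile bounds must be between 0 and 100")
    lower_active = lower > 0    # a Lower bound of 0 disables the lower side
    upper_active = upper < 100  # an Upper bound of 100 disables the upper side
    if lower_active and upper_active and upper <= lower:
        raise ValueError("Upper bound must be greater than Lower bound")
    return lower_active, upper_active


print(active_bounds(0, 90))   # (False, True): one-sided, upper bound only
print(active_bounds(10, 90))  # (True, True): two-sided
```

Under these assumptions, a call such as `active_bounds(90, 10)` raises a `ValueError`, which mirrors the requirement that the Upper bound exceed the Lower bound for two-sided winsorization.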