OpenTelemetry metrics

Overview

This topic describes metrics that LaunchDarkly autogenerates from OpenTelemetry events.

LaunchDarkly’s SDKs support instrumentation for OpenTelemetry traces. Traces provide an overview of how your application handles requests. For example, traces may show that a particular feature flag was evaluated for a particular context as part of a given HTTP request. When LaunchDarkly receives OpenTelemetry trace data, it processes and converts this data into events that LaunchDarkly metrics track over time.

There are two types of events that LaunchDarkly creates from OpenTelemetry traces: route-specific events and global events. Route-specific events are useful when you are experimenting with a change that is known to impact a small subset of your server’s HTTP routes. Global events are useful when you believe your change may impact all routes, or when you are not sure of the impact of your change.

To learn more, read OpenTelemetry for server-side SDKs.

OpenTelemetry events are prefixed with otel. LaunchDarkly automatically creates the following metrics from the events that it produces from your OpenTelemetry trace data. This trace data includes the feature flag and the context for which you evaluated the flag. You can also create these metrics manually.

The following sections describe the metrics that LaunchDarkly autogenerates from OpenTelemetry traces:

Metric kind: Custom conversion binary

Suggested randomization unit: User

Definition:

  • Measurement method: Occurrence
  • Unit aggregation method: Average
  • Analysis method: Average
  • Success criterion: Lower is better
  • Units without events: Include units that did not send any events and set their value to 0

Description: Measures the percentage of users that encountered an error inside HTTP spans at least once, as reported by OpenTelemetry. Useful when running a guarded rollout.
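As a rough illustration of the definition above, the following Python sketch computes an occurrence-style metric by hand. It is not LaunchDarkly's implementation. The function name `occurrence_rate`, the user keys, and the event tuples are hypothetical and exist only for this example. Each unit counts as 1 if it sent at least one event and as 0 otherwise, and units without events stay in the denominator.

```python
def occurrence_rate(all_units, events):
    """Return the fraction of units that sent at least one event.

    all_units: iterable of unit keys that were exposed (for example, user keys).
    events: iterable of (unit_key, event_name) pairs (hypothetical shape).
    """
    units = set(all_units)
    if not units:
        raise ValueError("at least one unit is required")
    # Occurrence: a unit's value is 1 if it sent any event, however many.
    units_with_event = {unit for unit, _ in events if unit in units}
    # Units without events are included with a value of 0.
    values = [1 if unit in units_with_event else 0 for unit in units]
    # Analysis method "Average": the mean of the per-unit values.
    return sum(values) / len(values)


users = ["user-a", "user-b", "user-c", "user-d"]
error_events = [("user-a", "error"), ("user-a", "error"), ("user-c", "error")]
rate = occurrence_rate(users, error_events)
print(f"{rate:.0%}")  # prints "50%": 2 of 4 users hit an error at least once
```

Note that `user-a` sent two events but still contributes a value of 1, which is what distinguishes an occurrence measurement from a count.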

Metric kind: Custom conversion binary

Suggested randomization unit: Request

Definition:

  • Measurement method: Occurrence
  • Unit aggregation method: Average
  • Analysis method: Average
  • Success criterion: Lower is better
  • Units without events: Include units that did not send any events and set their value to 0

Examples:

  • http.error;method=GET;route=/api/v2/flags
  • http.error;method=PATCH;route=/api/v2/flags/{id}
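The example keys above appear to follow a `name;attribute=value;attribute=value` layout. The following Python sketch splits a key of that shape into its parts. The layout is inferred only from the examples shown here, and `parse_route_event_key` is a hypothetical helper, not part of any LaunchDarkly SDK.

```python
def parse_route_event_key(key):
    """Split a key such as 'http.error;method=GET;route=/api/v2/flags'.

    Returns a (name, attributes) tuple. The ';' and '=' separators are
    assumptions based on the example keys, not a documented grammar.
    """
    name, *pairs = key.split(";")
    attributes = {}
    for pair in pairs:
        # Split on the first '=' only, so a value may itself contain '='.
        attr, sep, value = pair.partition("=")
        if not sep:
            raise ValueError(f"malformed attribute: {pair!r}")
        attributes[attr] = value
    return name, attributes


name, attrs = parse_route_event_key("http.error;method=PATCH;route=/api/v2/flags/{id}")
print(name)   # prints "http.error"
print(attrs)  # prints "{'method': 'PATCH', 'route': '/api/v2/flags/{id}'}"
```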

Metric kind: Custom conversion binary

Suggested randomization unit: User

Definition:

  • Measurement method: Occurrence
  • Unit aggregation method: Average
  • Analysis method: Average
  • Success criterion: Lower is better
  • Units without events: Include units that did not send any events and set their value to 0

Description: Measures the percentage of users that encountered an HTTP 5XX response at least once, as reported by OpenTelemetry. Useful when running a guarded rollout.

Metric kind: Custom conversion binary

Suggested randomization unit: Request

Definition:

  • Measurement method: Occurrence
  • Unit aggregation method: Average
  • Analysis method: Average
  • Success criterion: Lower is better
  • Units without events: Include units that did not send any events and set their value to 0

Examples:

  • http.5XX;method=GET;route=/api/v2/flags
  • http.5XX;method=PATCH;route=/api/v2/flags/{id}
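To make the shape of these keys concrete, here is a minimal Python sketch of how a key like the examples above could be derived from a single HTTP response. This is an assumption for illustration only: the source does not describe how LaunchDarkly builds these keys, and `route_5xx_event_key` is a hypothetical function name.

```python
def route_5xx_event_key(method, route, status_code):
    """Return a route-specific 5XX key, or None for non-5XX responses.

    The key layout mirrors the example keys; it is not a documented format.
    """
    # HTTP 5XX covers server error status codes 500 through 599.
    if 500 <= status_code <= 599:
        return f"http.5XX;method={method.upper()};route={route}"
    return None


print(route_5xx_event_key("get", "/api/v2/flags", 503))
# prints "http.5XX;method=GET;route=/api/v2/flags"
print(route_5xx_event_key("get", "/api/v2/flags", 404))
# prints "None"
```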

Metric kind: Custom conversion binary

Suggested randomization unit: User

Definition:

  • Measurement method: Occurrence
  • Unit aggregation method: Average
  • Analysis method: Average
  • Success criterion: Lower is better
  • Units without events: Include units that did not send any events and set their value to 0

Description: Measures the percentage of users that encountered an exception outside of HTTP spans at least once, as reported by OpenTelemetry. Useful when running a guarded rollout.

Metric kind: Custom numeric

Suggested randomization unit: Request

Definition:

  • Measurement method: Value/size
  • Unit aggregation method: Average
  • Analysis method: Average
  • Success criterion: Lower is better
  • Units without events: Exclude units that did not send any events

Description: Measures the average request latency, as reported by OpenTelemetry. Useful when running a guarded rollout. For best results, use a ‘request’ randomization unit and send ‘request’ contexts.
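The following Python sketch shows one way to read the definition above: average the values within each unit, then average across units, leaving out units that sent no events. It is an illustration under those assumptions rather than LaunchDarkly's implementation. The name `average_latency`, the request keys, and the latency values are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def average_latency(latency_events):
    """Average latency across units that sent at least one event.

    latency_events: iterable of (unit_key, latency) pairs (hypothetical shape).
    """
    per_unit = defaultdict(list)
    for unit, latency in latency_events:
        per_unit[unit].append(latency)
    if not per_unit:
        raise ValueError("no latency events to analyze")
    # Unit aggregation "Average": one value per unit.
    unit_values = [mean(values) for values in per_unit.values()]
    # Analysis method "Average": mean across units. Units with no events
    # never appear in per_unit, so they are excluded rather than set to 0.
    return mean(unit_values)


events = [("req-1", 100.0), ("req-2", 300.0), ("req-2", 500.0)]
print(average_latency(events))  # prints "250.0": mean of 100.0 and 400.0
```

Excluding units without events matters for latency: treating a missing measurement as 0 would pull the average down and make the service look faster than it is.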

Metric kind: Custom numeric

Suggested randomization unit: Request

Definition:

  • Measurement method: Value/size
  • Unit aggregation method: Average
  • Analysis method: P95
  • Success criterion: Lower is better
  • Units without events: Exclude units that did not send any events

Description: Measures the 95th percentile request latency, as reported by OpenTelemetry. For many applications, this represents the experience for most requests. You can adjust the percentile to fit your application’s needs. Useful when running a guarded rollout. For best results, use a ‘request’ randomization unit and send ‘request’ contexts.

Metric kind: Custom numeric

Suggested randomization unit: Request

Definition:

  • Measurement method: Value/size
  • Unit aggregation method: Average
  • Analysis method: P99
  • Success criterion: Lower is better
  • Units without events: Exclude units that did not send any events

Description: Measures the 99th percentile request latency, as reported by OpenTelemetry. For many applications, this represents the worst-case experiences. You can adjust the percentile to fit your application’s needs. Useful when running a guarded rollout. For best results, use a ‘request’ randomization unit and send ‘request’ contexts.
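For readers who want to see what a percentile analysis computes, the following Python sketch uses the nearest-rank method. The source does not state which percentile estimation method LaunchDarkly uses, so nearest-rank is an assumption chosen for simplicity. The `percentile` helper and the sample latencies are hypothetical.

```python
import math


def percentile(values, p):
    """Return the p-th percentile of values using the nearest-rank method."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(values)
    # Nearest rank: the smallest value with at least p% of the data at or below it.
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


latencies = list(range(1, 101))  # hypothetical per-request latencies: 1, 2, ..., 100
print(percentile(latencies, 95))  # prints "95"
print(percentile(latencies, 99))  # prints "99"
```

Changing the second argument corresponds to adjusting the percentile to fit your application's needs, as the description above mentions.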

Metric kind: Custom numeric

Suggested randomization unit: Request

Definition:

  • Measurement method: Value/size
  • Unit aggregation method: Average
  • Analysis method: Average
  • Success criterion: Lower is better
  • Units without events: Exclude units that did not send any events

Examples:

  • http.latency;method=GET;route=/api/v2/flags
  • http.latency;method=PATCH;route=/api/v2/flags/{id}