Autogenerated metrics

Overview

This topic explains the metrics LaunchDarkly automatically generates from SDK events and how you can use them to monitor the health of your applications.

Metric events

An “event” happens when someone takes an action in your app, such as clicking a button, or when a system takes an action, such as loading a page. Your SDKs send these metric events to LaunchDarkly, which, for certain event kinds, can automatically create metrics from them. You can use these metrics with experiments and guarded rollouts to track how your flag changes affect your customers’ behavior.

LaunchDarkly autogenerates metrics from AI SDK events and from telemetry integration events.

Autogenerated metrics are marked on the Metrics list with an autogenerated tag. You can view the events that autogenerated these metrics from the Metrics list by clicking View, then Events.
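To picture how events become metrics, here is a minimal sketch. The dictionaries below are illustrative only, not LaunchDarkly’s actual event schema: a “custom” event marks that something happened, while a “numeric” event also carries a value. LaunchDarkly groups events by key to build a metric from each one.

```python
# Illustrative sketch only: these dictionaries are not LaunchDarkly's real
# event schema, just a way to picture what an SDK reports.
events = [
    {"kind": "custom", "key": "signup-completed", "context": "user-1"},
    {"kind": "numeric", "key": "page-load-ms", "context": "user-1", "value": 240},
    {"kind": "numeric", "key": "page-load-ms", "context": "user-2", "value": 310},
]

# Grouping by event key is the basis of each autogenerated metric;
# here we average a numeric event's values across its occurrences.
page_loads = [e["value"] for e in events if e["key"] == "page-load-ms"]
print(sum(page_loads) / len(page_loads))  # 275.0
```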

Randomization units for autogenerated metrics

LaunchDarkly sets the randomization unit for autogenerated metrics to your account’s default context kind for experiments. For most accounts, the default context kind for experiments is user. However, you may have updated your default context kind to account, device, or some other context kind you use in experiments most often. To learn how to change the default context kind for experiments, read Map randomization units to context kinds.

All autogenerated metrics are designed to work with a randomization unit of either user or request. Depending on your account’s default context kind for experiments, you may need to manually update the randomization unit for some autogenerated metrics. The recommended randomization units for each autogenerated metric are listed in the tables below. To learn how to manually update the randomization unit for a metric, read Edit metrics.

Metrics autogenerated from AI SDK events

An AI config is a resource that you create in LaunchDarkly and then use to customize, test, and roll out new large language models (LLMs) within your generative AI applications.

As soon as you start using AI configs in your application, your AI SDKs begin sending events to LaunchDarkly. These events are prefixed with $ld:ai, and LaunchDarkly automatically generates metrics from them so you can track how your AI model generation is performing.

The following metrics are autogenerated from AI SDK events. Each entry lists the event kind, event key, metric definition, randomization unit, and metric name with example usage.

Event kind: Custom
Event key: $ld:ai:feedback:user:positive
Measurement method: Count
Unit aggregation method: Sum
Analysis method: Average
Success criterion: Higher is better
Units without events: N/A
Randomization unit: User
Name: Average number of positive feedback ratings per user
Example usage: Running an experiment to find out which variation causes more users to click “thumbs up”

Event kind: Custom
Event key: $ld:ai:feedback:user:positive
Measurement method: Occurrence
Unit aggregation method: Average
Analysis method: Average
Success criterion: Higher is better
Units without events: N/A
Randomization unit: Request
Name: Positive feedback ratio
Example usage: Running a guarded rollout to make sure there is a positive feedback ratio throughout the rollout

Event kind: Custom
Event key: $ld:ai:feedback:user:negative
Measurement method: Count
Unit aggregation method: Sum
Analysis method: Average
Success criterion: Lower is better
Units without events: N/A
Randomization unit: User
Name: Average number of negative feedback ratings per user
Example usage: Running an experiment to find out which variation causes more users to click “thumbs down”

Event kind: Numeric
Event key: $ld:ai:tokens:input
Measurement method: Value/size
Unit aggregation method: Average
Analysis method: Average
Success criterion: Lower is better
Units without events: Exclude units that did not send any events
Randomization unit: Request
Name: Average size of input per request
Example usage: Running an experiment to find out which variation results in fewer input tokens, reducing cost

Event kind: Numeric
Event key: $ld:ai:tokens:output
Measurement method: Value/size
Unit aggregation method: Average
Analysis method: Average
Success criterion: Lower is better
Units without events: Exclude units that did not send any events
Randomization unit: Request
Name: Average size of output per request
Example usage: Running an experiment to find out which variation results in fewer output tokens, reducing cost

Event kind: Numeric
Event key: $ld:ai:tokens:total
Measurement method: Value/size
Unit aggregation method: Average
Analysis method: Average
Success criterion: Lower is better
Units without events: Exclude units that did not send any events
Randomization unit: Request
Name: Average tokens per request
Example usage: Running an experiment to find out which variation results in fewer total tokens, reducing cost

Event kind: Numeric
Event key: $ld:ai:duration:total
Measurement method: Value/size
Unit aggregation method: Average
Analysis method: Average
Success criterion: Lower is better
Units without events: Exclude units that did not send any events
Randomization unit: Request
Name: Average duration per request
Example usage: Running an experiment to find out which variation results in faster completions, improving engagement

Event kind: Custom
Event key: $ld:ai:generation:success
Measurement method: Count
Unit aggregation method: Sum
Analysis method: Average
Success criterion: Higher is better
Units without events: N/A
Randomization unit: User
Name: Average number of successful generations per user
Example usage: Running an experiment to find out which variation results in more user completion requests (“chattiness”), improving engagement

Event kind: Custom
Event key: $ld:ai:generation:error
Measurement method: Occurrence
Unit aggregation method: Average
Analysis method: Average
Success criterion: Lower is better
Units without events: N/A
Randomization unit: Request
Name: Error rate (% of requests with an error)
Example usage: Running a guarded rollout to make sure the change doesn’t result in a higher error rate

Event kind: Custom
Event key: $ld:ai:generation:error
Measurement method: Occurrence
Unit aggregation method: Average
Analysis method: Average
Success criterion: Lower is better
Units without events: N/A
Randomization unit: User
Name: Error rate (% of users that encountered an error)
Example usage: Running a guarded rollout to make sure the change doesn’t result in a higher error rate

Event kind: Custom
Event key: $ld:ai:generation:error
Measurement method: Count
Unit aggregation method: Sum
Analysis method: Average
Success criterion: Lower is better
Units without events: N/A
Randomization unit: User
Name: Average number of errors each user encountered
Example usage: Running a guarded rollout to make sure the change doesn’t result in a higher number of errors

Event kind: Custom
Event key: $ld:ai:generation
Measurement method: Count
Unit aggregation method: Sum
Analysis method: Average
Success criterion: Higher is better
Units without events: N/A
Randomization unit: User
Name: Average number of generations per user
Example usage: Running an experiment to find out which variation results in more user completion requests (“chattiness”), improving engagement
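To make the metric definitions concrete, here is a minimal sketch of two of them using invented sample data. The real computation happens inside LaunchDarkly; this only illustrates what the measurement and aggregation methods mean.

```python
# Minimal sketch with invented sample data; real metric computation
# happens inside LaunchDarkly.

# One entry per request: True if the request received positive feedback.
requests = [True, False, True, True, False]

# "Occurrence" + "Average": the share of requests with at least one event.
positive_feedback_ratio = sum(requests) / len(requests)
print(positive_feedback_ratio)  # 0.6

# "Value/size" + "Average": average token count across requests that
# reported a value; requests without events are excluded.
tokens_per_request = [120, 95, None, 160]  # None: request sent no event
reported = [t for t in tokens_per_request if t is not None]
avg_tokens = sum(reported) / len(reported)
print(avg_tokens)  # 125.0
```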

Example: Average number of positive feedback ratings per user

The first autogenerated metric listed above tracks the average number of positive feedback ratings per user.

Here is what the metric setup looks like in the LaunchDarkly user interface:

An autogenerated metric.
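The same definition can be sketched in code, using invented sample events: count each user’s positive-feedback events (unit aggregation: sum), then average those per-user counts across everyone in the experiment (analysis method: average).

```python
from collections import defaultdict

# Invented sample events; field names are illustrative, not LaunchDarkly's
# real event schema.
events = [
    {"key": "$ld:ai:feedback:user:positive", "user": "user-1"},
    {"key": "$ld:ai:feedback:user:positive", "user": "user-1"},
    {"key": "$ld:ai:feedback:user:positive", "user": "user-2"},
]
exposed_users = ["user-1", "user-2", "user-3"]  # all users in the experiment

# Unit aggregation (sum): total positive feedback events per user.
per_user = defaultdict(int)
for event in events:
    per_user[event["user"]] += 1

# Analysis method (average): users with no events contribute a count of zero.
avg = sum(per_user[u] for u in exposed_users) / len(exposed_users)
print(avg)  # 1.0
```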

Metrics autogenerated from telemetry integration events

The LaunchDarkly telemetry integrations provide error monitoring and metric collection. Each telemetry integration is a separate package, which you install in addition to the LaunchDarkly SDK. After you initialize the telemetry integration, you register the LaunchDarkly SDK client with the telemetry instance. The instance collects and sends telemetry data to LaunchDarkly, where you can review metrics, events, and errors from your application.

The following metric is autogenerated from events recorded by the telemetry integration for LaunchDarkly browser SDKs:

Event kind: Custom
Event key: $ld:telemetry:error
Measurement method: Occurrence
Unit aggregation method: Average
Analysis method: Average
Success criterion: Lower is better
Units without events: N/A
Randomization unit: User
Name: Percentage of user contexts that experienced an error (SDK)
Example usage: Running a guarded rollout to make sure the change doesn’t result in a higher error rate
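A quick sketch of this definition with invented sample data: “occurrence” asks, per user context, whether that user hit any error at all, and the per-user results are then averaged into a percentage.

```python
# Invented sample data: number of $ld:telemetry:error events per user.
errors_by_user = {
    "user-1": 3,
    "user-2": 0,
    "user-3": 0,
    "user-4": 1,
}

# Occurrence per user: did this user context experience at least one error?
users_with_error = sum(1 for count in errors_by_user.values() if count > 0)

# Average across users gives the percentage of user contexts with an error.
error_rate = users_with_error / len(errors_by_user)
print(f"{error_rate:.0%}")  # 50%
```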
