Experimentation in LaunchDarkly: Feature Roundup

Key Takeaways

Unified Feature Management and Experimentation: LaunchDarkly merges feature management with experimentation to enhance the impact understanding of each release.
Robust Experimentation Tools: The platform offers a comprehensive set of tools for data visualization, multivariate testing, and funnel experiments.
Dynamic Experiment Control: Precise control over experiment variables is achieved through flexible randomization and advanced audience bucketing.
Reusable Components and Integration: LaunchDarkly enhances efficiency by enabling the reuse of components and integration with business intelligence tools.
Security and Collaboration: Strong security features and integration with enterprise tools facilitate secure and collaborative experiment management.
Future Enhancements: LaunchDarkly is advancing with AI-assisted designs and improved statistical methods for more efficient and accurate experiments.

Pairing feature management and release with experimentation is a natural fit for building exceptional user experiences. This combination allows you to understand the business impact of every release, from major features to minor bug fixes. You will no longer be rolling the dice and hoping for the best—experimentation allows you to measure, analyze, and fine-tune your product based on user data.

Having feature management and experimentation built into the same tooling and processes reduces the potential technical debt or miscommunication that can occur when bouncing between tools.

Let’s dive into LaunchDarkly’s experimentation feature set and see what is possible.

LaunchDarkly offers robust feature management tools that safely enable the continuous release of new features and provide developers and product managers with a unified platform for conducting experiments on every feature (tool bloat, be gone!) From multivariate testing to metric groups, funnel experiments, and new ways to visualize and explore your data, LaunchDarkly has you covered when measuring the impact of every feature.

Experiment interface

LaunchDarkly's experiment interface offers an intuitive way for experienced and novice testers to monitor and manage features, configurations, and release statuses across various environments. This comprehensive dashboard is fully customizable, enabling users to prioritize and easily switch between environments that matter most to their projects, thus facilitating smooth cross-system workflows.

Experiment Design Workflow

The first step to running an experiment is to set up all the design parameters. What is the intended goal of the experiment — specifically, which metrics are you looking to improve? Is this an A/B test, or are you testing multiple variants simultaneously?
LaunchDarkly’s experiment Design Workflow makes setting up and running experiments directly from any feature flag super simple. This user-friendly system caters to advanced experimenters and those new to the field. It guides you through creating, managing, and iterating on experiments, from specifying the type of experiment to creating and assigning metrics within metric groups.

Before launching an experiment, you can make a draft for your team to review and initiate the test with just a click. The workflow also provides a complete view of the history of experiments, including past designs, changes, and results, offering valuable insights at a glance. Thanks to its seamless integration with feature flags, LaunchDarkly enables the easy rollout of successful variations without needing code changes, streamlining the process from design to deployment.

Key metric groups

LaunchDarkly’s flexible metrics definition engine allows teams to easily define and track key metrics (for example, conversion rates, error rates, or latency metrics). These metrics help track everything from URL access frequency to page load times, aligning with your business goals to gauge the impact of flag variations over time.

This functionality is supported directly in LaunchDarkly's SDKs, enabling robust tracking within customer applications. We support data ingestion via the open standard, OpenTelemetry, or error tracking through integrations like Sentry for broader compatibility.

Metrics can be defined as standalone metrics or a larger metric group that can be used to create multiple metrics in an experiment. LaunchDarkly facilitates easy access to experiment iterations, historical data, and results, allowing for the seamless implementation of winning variations without needing code changes or redeployments.

Our metrics configuration allows you to identify further and analyze metrics based on the metric analysis method (either by mean or percentile), the unit aggregation method (by average or by sum), or even to track units without events (excludes or sets to zero).

Advanced metric controls

LaunchDarkly's advanced metric controls streamline your experimentation with handy tools for seamless setup. Let’s dive into these advanced metric controls, which allow you to save time while keeping your experiments clean and effective.

Metric ‘recipe’ groups

Struggling to get started with defining what your standard metric should be? "Metric recipes" can help you create standard or common metric types that fit your use case. Users can filter results by context attributes for enhanced analysis. From breaking down common examples like e-commerce workflows calculating average purchase price to tracking latency, you can see how to create metrics and follow along to understand better how they can be used to identify areas of improvement to drive business value.

Flexible randomization units

In addition to audience allocation tools, LaunchDarkly offers flexible randomization units. An experiment’s randomization unit is the context used to assign traffic to each variation. If you choose the user context kind as an experiment’s randomization unit, the experiment will divide contexts into the experiment’s variations by individual users.

Advance audience bucketing and reassignment controls

When you build your experiment inside LaunchDarkly, you can allocate all or a percentage of the traffic that encounters a flag. Audience allocation gives you flexibility when selecting your experiment audience and ensures accurate results. Don’t worry, LaunchDarkly only analyzes your chosen contexts—targeting FTW!

Sample ratio mismatch tests

A sample ratio mismatch (SRM) is a mismatch between the proportion of users expected in your experimentation treatments and the actual number of users in each treatment.

An SRM often indicates an error in the experiment implementation, and results may not be valid. Fortunately, LaunchDarkly proactively alerts you when a sample ratio mismatch has occurred when the posterior odds favoring a mismatch are larger than 99%.

Collaboration features

Additionally, with reusable metric groups and the ability to share audience segments across experiments or teams, you can save time and energy and reduce the risk of human error. Already have metrics available to you from other business intelligence tools? Use our metric import integrations to ingest existing event streams from internal sources of truth.

Traffic allocation rules

Experiments are essential for guiding product decisions, but they must be correctly set up with the proper context and audience targeting to be effective. One of the main challenges is ensuring an experiment's audience is appropriately targeted; missteps here can result in unreliable data and misdirected product strategies.

LaunchDarkly addresses this with its advanced experiment traffic allocation tool, which enables precise definition and management of experiment audiences. Users can assign audience members to specific variations and decide exactly what portion of the audience should be included in each aspect of the analysis. This includes ensuring the audience is appropriately randomized within the experiment to avoid bias.

Sample size estimation

The number of contexts you include in an experiment is your sample size. The larger the sample size, the more confident you can be in its outcome. Your sample size depends on how confident you want to be in the outcome and how large the credible intervals are for your metrics. Metrics with large, credible intervals are sometimes called "noisy" metrics.

LaunchDarkly created a handy sample size estimation tool to help you determine the appropriate sample size for your audience. In experiments built in LaunchDarkly with two variations, you will notice a sample size estimator that estimates how much more traffic needs to encounter your experiment before reaching your chosen probability best.

Funnel Experiments

Thanks to LaunchDarkly’s Funnel Experiments and metric groupings, you can easily optimize key customer funnels. Measure different steps in your product workflows, categorize metrics, and track their performance across a customer journey to truly understand the impact of product changes on the user flow and your bottom line.

By tracking changes and metrics at each step in the customer journey, you can see where someone is keeping pace or falling off the expected path and iterate accordingly. Running funnel experiments within LaunchDarkly allows you to maximize the long-term impact of product changes and gain valuable insights and data about your users' preferences.

Multivariate (A/B/n) experimentation

Multivariate experimentation allows you to test and experiment across several feature configurations. While most folks are typically familiar with the standard A/B experiment, you can create multiple variations to suit your needs.

LaunchDarkly helps engineering, product, and data teams execute successful experiments together in a single place. Our SDKs support A/B/n testing, experimentation, and metrics tracking on top of our core feature management capabilities, making it easy to improve your product consistently.

Data export

Data export provides a real-time export of raw analytics data, including feature flag requests, analytics events, custom events, experimentation events, and more. By exporting your data to a location of your choice, you can use your data warehouse and tools to analyze event data. Data export allows you to send data to one of our supported destinations.

LaunchDarkly currently supports the following destinations: Amazon Kinesis, Azure Event Hubs, Google Cloud Pub/Sub, mParticle, Segment, and Snowflake. These integrations facilitate deeper feature usage and experimentation data analysis, enhancing the decision-making process.

Experiment results

Whether you’re a seasoned data scientist or just starting to dabble with data, we’ve designed experiment results to be easily interpreted. From distribution plots that demonstrate entire experiment results, time series graphs breaking down credible interval histories, and more — rest easy knowing that you don’t need a master's degree in statistics to make the most of our experimentation platform. LaunchDarkly provides easily interpretable results and detailed data export — both novice experimenters and resident data geeks can rejoice in data happiness.

Additionally, LaunchDarkly offers the following monitoring and metrics tools to ensure you’re delivering trustworthy outcomes:

Metric status health indicator, which lets you know if an experiment is falling off course
Automatic sample-ratio mismatch detection
Credible interval history chart allowing for time series analysis
Remaining sample size estimator for A/B experiments

These tools, in combination with our advanced visualizations designed with data science teams in mind, including time series of metric values, probability density plots of distributions, and a highly differentiated 'probability to be best' statistic, help simplify reliable decision-making under various conditions, setting LaunchDarkly apart from the competition when it comes to ease of use. Results can be accessible to the entire team, with further advanced detailed statistics and data exports for data science teams.

Data security

Rest easy knowing LaunchDarkly has your back with solid security across your features and experiments, keeping your team safe from threats. Our security-first approach includes tailored SDKs with authentication measures and secure modes for JavaScript SDKs to prevent impersonation. You can set properties to private, maintaining data security while still targeting effectively. Our platform is locked down with TOTP MFA, SAML, RBAC, and comprehensive audit logs for fine-grained access control and compliance with standards like FedRamp, SOC 2, and GDPR, even accommodating HIPAA requirements through agreements. In short, we're all about keeping your data safe and your experiments seamless.

Integrations

At LaunchDarkly, we believe that collaboration is vital to innovation. Our platform enables teams to design, review, and launch experiments seamlessly. It provides easy access to iterations, historical designs, and results, facilitating smooth handoffs and ongoing adjustments. LaunchDarkly's tailored permission controls and roles support fast-moving, cross-functional teams, ensuring everyone has the proper access.

LaunchDarkly integrates with essential enterprise tools like Jira, Slack, Teams, GitHub, and GitLab to further support team alignment and streamline operations. This allows teams to stay connected, manage feature flag settings, and handle approvals directly from their preferred platforms. Additionally, our integrations with APM tools like Datadog, Dynatrace, and New Relic bring feature management closer to daily engineering activities, enhancing visibility and control.

Managing tech debt

Are you worried about another tool clogging your workflow and adding to the backlog of tech debt you already have? Worry no more. LaunchDarkly provides capabilities that safely enable feature and experiment management across all stages of their lifecycle, including visibility, automation, and archival.

In addition to our existing Engineering Insights Hub, which provides a view into project and flag health, Code References, which provides a code-level view of feature flag usage in a code base, and our flag insights tool, LaunchDarkly’s experimentation supports the archival of existing experiments while still retaining their iteration results for future analysis.

Partner ecosystem

Significant features aren’t built in a silo. That’s why LaunchDarky has invested in building an ever-growing partner ecosystem with great relationships and integrations with top-tier technology partners for our experimentation platform. As of April 2024, LaunchDarkly supports 84 integrations, of which 1/3 are partner-built and maintained.

Running a high-powered experimentation team requires sophisticated tooling to serve data scientists, product managers, and researchers across the organization. This enables you to move faster and develop your experimentation practice with confidence Our flagship experimentation technology partners, Tealium, Snowplow, Census, and Sprig, allow you to take your experimentation capabilities to the next level.

Experimentation design AI assistant

Help design and run experiments, with tailored guidance, using our experimentation design AI assistant built on AWS Bedrock. Using Bedrock models like Titan combined with customer-specific, encoded data in LaunchDarkly, the models provide responses that incorporate the experimenter’s business-specific needs and domain knowledge. Users can define new metrics or select from existing metrics, including groups of standardized metrics their team measures across their organization. Audiences can be custom-defined or used to create pre-defined audience segments a team uses across every new launch.

This new capability eliminates the complexities of experimentation, using LaunchDarkly to solve problems in product with a few lines of code and unlocking the ability to measure the business impact of every feature release.

Mutual exclusion/layers

Use mutual exclusion experimentation as well as experiment layers to streamline multiple experiments running simultaneously. This feature allows you to create distinct layers, ensuring data integrity by keeping experiments isolated from one another. Run concurrent experiments on the same user base without risking data contamination and move with improved accuracy and speed as you ship.

Experiment holdouts

LaunchDarkly leverages experiment holdouts as your built-in control group by intentionally excluding users. This provides a clear comparison point for analyzing the long-term impact of changes across the product. This method accurately assesses how multiple or significant product updates influence the broader customer base.

LaunchDarkly is continuing to prioritize capabilities that accelerate how developers are building the future of software. To that end - we're looking at ways to keep the pace forward for engineering teams and remove friction points in the development process.

CUPED

Our experimentation platform is working to adopt CUPED, a specific statistical method for variance reduction that can provide more power to results with less data. For users, this means you’ll be able to identify the best feature to implement within an experiment even quicker than before.

Past-release impact monitoring

Soon you will be able to identify the impact of a past release on performance by zooming in on features that may have altered key metrics. With each improvement, teams can know with increased certainty the business impact or performance of each release by default.

Ready to step into the lab, errr… software platform designed for experimentation? Check out our free trial of LaunchDarkly and get right into the thick of it with our powerful tools designed for effective feature management and robust experimentation. It’s straightforward to get started, and you'll be up and running quickly, making more intelligent, data-driven decisions to help you build the software platforms of the future.

Like what you read?

Get a demo

Experimentation in LaunchDarkly: Feature Roundup

Like what you read?

Key Takeaways

Experiment interface

Experiment Design Workflow

Key metric groups

Advanced metric controls

Metric ‘recipe’ groups

Flexible randomization units

Advance audience bucketing and reassignment controls

Sample ratio mismatch tests

Collaboration features

Traffic allocation rules

Sample size estimation

Funnel Experiments

Multivariate (A/B/n) experimentation

Data export

Experiment results

Data security

Integrations

Managing tech debt

Partner ecosystem

Experimentation design AI assistant

Mutual exclusion/layers

Experiment holdouts

CUPED

Past-release impact monitoring

Like what you read?

More about Product Updates

Like what you read?