Setting up product analytics in Databricks
Overview
This topic explains how to set up LaunchDarkly product analytics in Databricks.
Databricks is a cloud-based data processing and analysis platform that lets you work with large sets of data.
Databricks does not support all product analytics features
Not all of the features available in LaunchDarkly product analytics are available in Databricks. User activity, some cohorts features, and some funnel conversion criteria are not supported. To learn more, read Using product analytics charts.
Prerequisites
Before completing this procedure, you must have the following:
- Access to a Databricks workspace, including a cluster and application ID.
Find Databricks configuration information
To connect LaunchDarkly product analytics to Databricks, you must provide information about your Databricks workspace to LaunchDarkly. To do this, create a service principal and SQL warehouse in Databricks.
Here’s how to create a new service principal:
- Log into the Databricks workspace you want to connect to LaunchDarkly.
- Navigate to the workspace’s Settings, then Identity and access.
- Find the “Service principals” section, then click Manage. The “Service principals” page opens.
- Click Add service principal. The “Add service principal” dialog opens.
- Click Add new, then enter a Service principal name in the text field. Click Add. The new service principal appears in the “Service principals” page.
- Click the new service principal’s name to open its details.
- In the “Configurations” tab, verify that the service principal has the Databricks SQL access and Workspace access entitlements. Then click into the “Secrets” tab.
- In the “Secrets” tab, click Generate secret. The “Generate OAuth secret” menu opens.
- Specify a duration for the secret, then click Generate. The secret and client ID appear. Copy both of them and save them somewhere safe.
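If you plan to script any later verification steps, keep the client ID and secret out of your code. Here is a minimal sketch, assuming you store them in environment variables; the DATABRICKS_CLIENT_ID and DATABRICKS_CLIENT_SECRET names are illustrative conventions, not something LaunchDarkly or Databricks requires:

```python
import os

def load_databricks_credentials() -> dict:
    """Read the service principal's client ID and OAuth secret from the
    environment so they never need to be hard-coded. The variable names
    are a convention for this sketch only."""
    creds = {
        "client_id": os.environ.get("DATABRICKS_CLIENT_ID"),
        "client_secret": os.environ.get("DATABRICKS_CLIENT_SECRET"),
    }
    missing = [name for name, value in creds.items() if not value]
    if missing:
        raise RuntimeError(f"Missing Databricks credentials: {', '.join(missing)}")
    return creds
```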
Now, create a new serverless SQL warehouse and connect it to the service principal. Here’s how:
- In Databricks, navigate to the SQL Warehouses page.
- Click Create SQL warehouse. The “New serverless SQL warehouse” dialog opens.
Configure a serverless SQL warehouse
You must use a serverless SQL warehouse with LaunchDarkly product analytics.
- Enter a Name for the warehouse.
- Specify a Cluster size for the warehouse. We recommend choosing the “medium” size.
- Click Create. The new warehouse appears in the SQL Warehouses page.
- Click the warehouse’s name to open its details.
- Click Permissions. The “Manage permissions” dialog opens.
- Search for the service principal you created earlier, then select it.
- Choose Can monitor from the permissions dropdown, then click Add.
- Find and save the warehouse connection information. You will need it in a later step.
  a. Navigate to the “SQL Warehouses” page and click into the “Connection details” tab.
  b. Copy the “Server hostname” and “HTTP path” values and save them somewhere safe.
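Before entering the connection values in LaunchDarkly, you can optionally sanity-check them from a script. This is a sketch, not part of the required setup: the format checks reflect the usual shape of Databricks connection values, and the optional live check assumes the databricks-sql-connector package and a personal access token for ad-hoc testing (LaunchDarkly itself authenticates with the service principal’s client ID and secret instead).

```python
import re

def build_connection_params(server_hostname: str, http_path: str) -> dict:
    """Light sanity checks on the values copied from the warehouse's
    "Connection details" tab before entering them in LaunchDarkly."""
    # The hostname field should not include a scheme.
    if server_hostname.startswith("https://"):
        server_hostname = server_hostname[len("https://"):]
    # Warehouse HTTP paths look like /sql/1.0/warehouses/<id>
    # (older workspaces may show /sql/1.0/endpoints/<id>).
    if not re.match(r"^/sql/1\.0/(warehouses|endpoints)/\w+$", http_path):
        raise ValueError(f"Unexpected warehouse HTTP path: {http_path!r}")
    return {"server_hostname": server_hostname, "http_path": http_path}

def verify_connection(server_hostname: str, http_path: str, access_token: str) -> None:
    """Optionally open a real connection and run SELECT 1.
    Requires: pip install databricks-sql-connector."""
    from databricks import sql  # imported lazily; the rest of the sketch is stdlib-only
    params = build_connection_params(server_hostname, http_path)
    with sql.connect(access_token=access_token, **params) as conn:
        with conn.cursor() as cur:
            cur.execute("SELECT 1")
            assert cur.fetchone()[0] == 1
```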
Now you have a SQL warehouse and a service principal that can access it. You also need an allEvents table in the warehouse. Here’s how to set it up:
- In Databricks, navigate to the Catalog page.
- Find your organization’s database and the schema within that database. Their names and locations are unique to your organization’s Databricks workspace. The database schema contains the allEvents table.
Use the correct naming schema
Name these items correctly. The catalog should be named ld_product_analytics_<project_key>__<environment_key> and the schema product_analytics_<project_key>__<environment_key>.
- Click into the database, then into its “Permissions” tab.
- Click Grant. The “Grant on…” dialog opens.
- In the “Principals” field, type to find the name of the service principal you created earlier, then click to select it.
- In the “Privilege presets” field, type to find the Data Editor privilege, then click to select it. Click Grant.
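The naming schema above is easy to mistype, particularly the double underscore between the project and environment keys. A small helper can derive the expected names from your keys; this is just an illustrative sketch of the format strings from the callout above:

```python
def expected_names(project_key: str, environment_key: str) -> dict:
    """Derive the catalog and schema names LaunchDarkly expects.
    Note the double underscore separating the two keys."""
    return {
        "catalog": f"ld_product_analytics_{project_key}__{environment_key}",
        "schema": f"product_analytics_{project_key}__{environment_key}",
    }
```

For example, a project key of mobile_app and an environment key of production yield the catalog ld_product_analytics_mobile_app__production.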
Set up product analytics with Databricks
To use LaunchDarkly product analytics, you must first enable LaunchDarkly’s Databricks Native product analytics integration. Here’s how:
- Click Product analytics in the left navigation, or find it by searching “Databricks native product analytics” on the Integrations page.
- Click Configure. The “Configure Databricks Native Product Analytics” menu opens. If the integration is already configured, click Manage integration instead.
- Choose an environment to set up the integration in. Click Next step.
- Select “Use CDP/Custom SDK” as an event tracking method. Event tracking with a LaunchDarkly SDK is not supported. Click Next step.
- Give your Databricks warehouse a human-readable Name.
- Enter the server hostname of your Databricks workspace in the Host field.
- Enter the Databricks SQL Warehouse HTTP path in the Cluster Path field.
- Enter the Databricks client ID in the Client ID field.
- Enter the client secret in the Client Secret field.
- Read and click to acknowledge the Integration Terms and Conditions. Click Save configuration.
On the Product analytics screen in LaunchDarkly, the landing page will update to show a “Waiting for data…” status. Events and other information will begin to populate the screen within 15 minutes. Events from the last 30 days will be available within an hour. Load time varies based on the volume of data you’re importing from Databricks.
To verify that data is loading, refresh the page. The Dashboards tab will not have any information in it until you create a dashboard, but you can confirm that setup was successful by checking the Events and Attributes tabs. After the import completes, both of those tabs display pre-populated data.
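If you prefer to confirm from the Databricks side that events are landing, you can count rows in the allEvents table directly. Here is a minimal sketch: only the catalog, schema, and table names come from the steps above; the live query assumes the databricks-sql-connector package and uses a personal access token for this ad-hoc check.

```python
def count_events_sql(catalog: str, schema: str) -> str:
    """Build a row-count query against the allEvents table. Backticks
    guard against special characters in the identifiers."""
    return f"SELECT COUNT(*) FROM `{catalog}`.`{schema}`.`allEvents`"

def count_events(server_hostname: str, http_path: str, access_token: str,
                 catalog: str, schema: str) -> int:
    """Run the count against the warehouse.
    Requires: pip install databricks-sql-connector."""
    from databricks import sql  # imported lazily; the rest of the sketch is stdlib-only
    with sql.connect(server_hostname=server_hostname, http_path=http_path,
                     access_token=access_token) as conn:
        with conn.cursor() as cur:
            cur.execute(count_events_sql(catalog, schema))
            return cur.fetchone()[0]
```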
After the event data appears, you can explore it throughout the product analytics UI.