One of the critical elements of a future-proof martech stack is an enterprise data warehouse. Whether you use BigQuery, Databricks, Snowflake, Redshift, or another data cloud, the most accurate and effective marketing campaigns are built from a single source of truth in the cloud.
With a composable architecture, you can use an orchestration tool like GrowthLoop to pull data directly from the cloud and build highly targeted audiences and personalized customer journeys.
In this video tutorial, Sales Engineer Alex Sotis walks through the quick process of connecting your data cloud to GrowthLoop. He covers:
- How to connect GrowthLoop to BigQuery, Snowflake, Redshift, Azure SQL, or Databricks
- Best practices for mapping your customer, transaction, and event tables
- How to build audience segments
- Tips for optimizing your data modeling for better performance
- How to ingest external marketing data from tools like Braze, Salesforce Marketing Cloud, Facebook Ads, and more
Video transcript
Welcome everybody. Today we're going to be discussing GrowthLoop, how it connects to your cloud data warehouse, as well as some data modeling best practices.
Just want to quickly start here in the Audience Builder to show the purpose of us connecting to your data warehouse. So here we have an audience segment, California coupon users. You can see here I have a customer's table with a filter on a state.
And then I'm also joining in this coupons table and I just have a filter on that the coupon use date was the last 30 days. Up in the top right, I can see my audience count, which is being queried live.
And now I have the ability to activate this audience to send it for paid media campaigns, email, SMS, and push notifications.
But let's take a step back. How are we able to surface this data in the first place?
Connecting GrowthLoop to your cloud data warehouse
The first step is to establish the source connection from GrowthLoop to the cloud data warehouse. And to let us know what is your first customer's table, because this is really what we're going to need to build audiences.
So here in our documentation, this is the screen that you'll be shown when you create a new organization. You'll type your organization name, your domain, and industry.
The next step, this is really where we're establishing the connection. So here in our docs at docs.growthloop.com, you will find documentation on how to generate the service account for any of the cloud data warehouses that we support. Data warehouses we support are BigQuery, Snowflake, Redshift, Azure SQL Server, and Databricks.
So here's where you will add your service account key, and we will test the connection.
Once that's established, the next thing that you'll do is show where does that customer table live. So that main customers table that was shown on the audience builder essentially we need the location of where that lives in the data warehouse.
And that's it. Fairly simple steps, and now we are all connected to the data warehouse, and we can start segmenting audiences.
Mapping tables
So now we've established connection from GrowthLoop to your data warehouse, and we have mapped this main customers table. The next step are going to be to map the transaction and event tables associated with customer behavior. So for example, here we have coupon usage, abandoned carts, purchase orders, and mall visits. So in order to do that, I'm going to come down here to datasets.
So the first step is to select your dataset type. This will really not have much effect on how this table is used in Audience Builder. It's more so just some information for the marketers such that we have a self-documenting label. So let's say we're going to add our transactions table to join to the customer table that we included during setup. I'm going to skip over this primary data set checkbox for now, but we will come back to that and explain what that means.
So I'm going to establish my source connection, choose my source connection. This is going to be essentially the service account that was created during the onboarding steps, but GrowthLoop can support more than one service account. Then we'll select Table or View. Then we will select our BigQuery project and data set.
Now we've selected our table name, which is on transactions.
Next step is to select a unique key. So we have transaction ID, which should be unique, and we've confirmed that it is. One quick note about this step. It is a data modeling best practice to have a unique key on each table. Of course, this is not enforced in modern cloud data warehouses. So if for some reason you do not have a unique key, we still can use this table, we would just need to enable duplicate resolution. So what duplicate resolution will do is it's going to de-duplicate the rows that have a duplicate key for the field that we selected as the unique key. We're going to do so by first checking, let's say we have two rows with customer ID 123. We're going to check, does one of those rows have fewer null values?
And if so, we're going to grab that one. Otherwise, we will randomly select one of the rows to bring into audience builder, but we will do so deterministically such that every time you go to select that audience, we will be grabbing that same row.
But for this example, we will choose Transaction ID, which is in fact unique.
Here we have the option to add a filter. So if there are certain rows that you want to restrict from marketers ever being able to activate, for example, maybe we have a opt-in flag, we're able to add a filter here.
And now as a marketer, I will never be able to select rows where this filter is applied.
All right, the next step, which is also optional, is to give any of these columns a field alias. So if you have some machine readable prefix or suffix on this column, it can be a good idea to add a alias for marketers to be able to see a self-documenting column. Then we also have these options here, show by default, PII, and exclude from personalization.
So any fields that contain PII such as email, first name, last name, date of birth, we want to select these and this ensures that marketers will never be able to use the PII column for filtering in the audience builder. It'll also make sure that we never show any actual values of this column in the application.
This will allow us to still use this column for filtering when we're building audiences, but we will never be able to export this data outside the warehouse.
Show by default, these are just to select for certain columns that you know you're going to be filtering on often, such that they appear in the audience builder when that data set is selected.
Now the last step. So in this case, we're joining a transactions table. And so we want to be able to join that to our customer table. So we're going to select our dataset group. We haven't gone over that yet, but we'll be showing that in just a moment. And now we need to join the transaction table to our customers table.
We're going to select the cardinality. So for most transaction event tables, it's going to be many to one, which means that there may be multiple records on our transactions table for each record in the customer table.
And now we just give the joint criteria. So we select customer ID on both sides and we can create. And now we have just successfully added our transaction table. Just a quick note here that we are able to support multiple joint keys if needed.
Explaining datasets
All right, so back on the the data sets page, want to explain what a data set group is. So essentially a dataset group are going to be a universe of tables that join together.
Reasons that you may have multiple dataset groups would be, let's say, for example, you have one table that are all of your known customers and then another table, which are all of your prospects. So those are two different tables that are going to have differing columns and attributes as well as differing additional tables that join in. So in this case you would set up one customer's dataset group and one prospect dataset group.
The second reason that you may have multiple data set groups are to resolve audiences to different levels of granularity. So for example, you may want to create an audience that is at the individual level. And then you also may want to be able to create an audience at the household level. So in this case, you would have one dataset group be households. And now when you go into audience builder and you see the audience size in the top right corner, that'll actually be the number of households that meet certain criteria.
So for any dataset group, you're going to have one data set, which is going to be the primary dataset. So here, our users table is our primary data set. All that means is essentially when you come to Audience Builder and you select your dataset group, whatever the primary data et is for that dataset group is what's going to show here under customers. And now, all of the available datasets that exist on the bottom portion, those are going to be your non-primary datasets.
Ingesting data back into your cloud
All right, so GrowthLoop also has the ability to accelerate ingesting data. So if you have important data from your email service provider or maybe your paid media tool, you're going to want to ingest that data back into your warehouse so that you can use it for further segmentation and reporting.
So to add a new ingest connection is just a few clicks. Click new ingest connection. Now here we have all of the popular marketing tools that your organization may be using. And you would simply need the credentials to authenticate our connection to this tool. And that's it. Now we will be ingesting data from Braze, Salesforce Marketing Cloud, Facebook Ads, Google Ads, or whatever it is, on a daily basis.
Data modeling best practices
All right, so next we're going to talk about how to model your data to be best used for GrowthLoop. So the first thing is that we actually have a very flexible schema mapper such that we're going to connect to your schema as it exists and we do not need it in any particular column order or with column names. So this is an advantage compared to monolithic customer data platforms, which may require a very rigid schema.
With that said, there are some best practices in order for marketers to get the best user experience when segmenting audiences and building journeys, but also to reduce the compute of your cloud data warehouse.
So here on the screen, I just have a few simple tables. I have my users, my events, my transactions. The point here is that you should be opting for wide tables in order to reduce joins instead of using a more traditional star schema approach, which is going to have more tables and more joins. This is generally better for cloud data warehouses besides using for GrowthLoop. Since they are columnar storage, joins are less performant than traditional on-prem row-based databases.
So a typical way to think about what tables you need are to spell out a certain business flow in a sentence. For example, my customers attend certain in-person events and they may or may not make a number of transactions. So this is a pretty vague process. You can get more specific when focusing on the process for your organization, but essentially what you want to do is take that sentence and then extract the nouns. So my nouns there were customers, events, and transactions. And so now I know that those are three entities that I need modeled such that marketers can use that for segmentation.
Connecting your data warehouse to GrowthLoop takes only a few minutes
All right, so today we've shown how to go from a brand new GrowthLoop account to establishing a source connection, defining the customer tables, as well as the event and transaction tables that will map in and the ability to build audiences based on those mapped tables. All this was done in a matter of minutes and adding new tables in when you use the data sets tab takes about 30 seconds to a minute per table.
So even as there are new tables that maybe the data engineering team or data science team are writing to your warehouse, because of GrowthLoop's zero copy architecture and flexible schema mapper, we can add that table in for marketers to use in under a minute.
If you're curious to explore GrowthLoop and see if it's a good fit for your organization, feel free to reach out and we'd be happy to give you a demo.