Working with Data in Customer Journey Analytics
In this video, you will learn how dataset schemas are translated into variables in Customer Journey Analytics (CJA), as well as how CJA handles very high cardinality.
Transcript
Hello and welcome to this training: Working with Data in Customer Journey Analytics.
In this training you will learn to explain how dataset schemas are translated into dimensions and metrics, describe how CJA handles very high cardinality, and explain how AEP arrays are used in CJA.
The data we analyze in CJA is built on top of AEP’s data structure. So it’s important that we know a little bit about how this data is structured. Now there are many things in AEP we really don’t need to know. Well, unless you’re the data engineer. But if we’re the ones creating data views in CJA, we should understand the AEP schema that our variables are based on. So let’s go through a few basics of schema data structure in AEP.
A schema is basically a blueprint for data that’s onboarded. In AEP, when we create a new schema, we choose from prebuilt classes with the option of additional mix-ins that extend the fields that are available to us.
Now when we choose a new field over here on the right, we can give it a name, and when we come down here we see the type. There are many different data types we can choose from, but the ones we’re most commonly concerned with are the string and the integer.
Now these two are important because when a new schema is populated with data and it’s pulled into CJA, the string data fields are translated to dimensions and the number fields become metrics. Let’s take a look at what this looks like in the interface.
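The string-to-dimension and number-to-metric rule described above can be pictured with a small sketch. This is only an illustration under stated assumptions: the field names, the `NUMERIC_TYPES` set, and the `default_component` helper are hypothetical and are not part of any AEP or CJA API.

```python
# Illustrative sketch only: a toy schema as a dict of field name -> declared type.
# The field names and the helper below are hypothetical, not a CJA or AEP API.
SCHEMA_FIELDS = {
    "page_name": "string",
    "campaign_code": "string",
    "order_count": "integer",
    "revenue": "double",
}

# Assumption: these type names stand in for "number fields" in this sketch.
NUMERIC_TYPES = {"integer", "long", "short", "byte", "double"}


def default_component(field_type: str) -> str:
    """Return the component type a field would default to under the rule above."""
    if field_type == "string":
        return "dimension"
    if field_type in NUMERIC_TYPES:
        return "metric"
    return "other"  # dates, booleans, objects, etc. are outside this sketch


components = {name: default_component(t) for name, t in SCHEMA_FIELDS.items()}
print(components)
```

In the data view itself, we still choose which of these components to include, as shown in the interface.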
Here we can see on the left we’ve got the dimensions and metrics, and we choose which ones of those we want for this particular data view. Now if we’re bringing in analytics data through the analytics data connector, the props and the eVars come across as dimensions and the events come across as metrics. And this would be as you would expect. Speaking of props and eVars, a big advantage of CJA is that we have unlimited variables. If you’re familiar with standard Adobe Analytics, you know that there’s a specific number of props, eVars and events. In CJA these limits go away. In fact, there are no props or eVars, there are just unlimited variables, which are the dimensions and the metrics.
Now you may be asking: what about eVar allocation and expiration? And how about the non-persistent traffic variables? Well, let’s see what happens in the interface here. When we set up these data views, we can choose the attribution and persistence for each dimension. I can just pick a specific dimension. Over here on the right I can choose from the different attribution settings, as well as the different expiration settings. And this can be done for any or all of our dimensions as well as our metrics.
Now one last point to understand about schemas is that once we’ve created the schema and it’s been loaded with data, the schema is locked. This is so the schema structure is consistent for all new data that comes in.
In CJA we may build a freeform table that includes a dimension with very high cardinality, say millions of unique values. When this happens, CJA can’t return all of the values because the browser would time out before the data loaded. In these cases, when the full dataset can’t be loaded, we’ll see a line item titled Long Tail to indicate that some truncation has occurred.
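As a rough mental model of the Long Tail line item, the sketch below keeps the most frequent values and rolls everything else into one row. How CJA actually decides what to truncate is not described in this training, so the `top_items_with_long_tail` function and its `limit` parameter are assumptions made purely for illustration.

```python
from collections import Counter


def top_items_with_long_tail(values, limit):
    """Keep the `limit` most frequent values and group the rest into one row.

    Hypothetical helper: it mimics the idea of a "Long Tail" line item,
    not the actual truncation logic used by CJA.
    """
    ranked = Counter(values).most_common()  # sorted by count, descending
    rows = list(ranked[:limit])
    tail_total = sum(count for _, count in ranked[limit:])
    if tail_total:
        rows.append(("Long Tail", tail_total))
    return rows


hits = ["a"] * 5 + ["b"] * 3 + ["c"] * 2 + ["d"]
rows = top_items_with_long_tail(hits, limit=2)
print(rows)
```

Note that the counts in the grouped row are still part of the total, which matches the point that truncation in the report does not mean the underlying data is lost.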
This sounds similar to Uniques Exceeded in traditional Adobe Analytics. But the difference is that in CJA there’s no data loss. If you see Long Tail as one of the line items in the report, you can easily narrow down the data by doing a breakdown, adding a classification, running a search, or applying a filter. All of these will function properly and bring in the Long Tail data that’s appropriate for that function. There may be times where you really want to see all of the values, I mean every single value. Well, Query Services is the tool to use for this purpose. It’s similar to Data Warehouse, but it has a lot more capability. And if you want to learn more about Query Services, just search our AEP documentation.
The last thing we want to discuss in this training is how AEP handles fields with multiple values.
String arrays are a specific type of schema field. Unlike a regular string, which holds only a single value, a string array can hold an unlimited number of values in a single field. When a string array is pulled into CJA, it’s automatically translated into a list variable, meaning each value is placed on a separate row in the report. And like any other dimension, we can apply attribution and persistence settings to these array values.
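To picture what "each value is placed on a separate row" means, here is a minimal sketch that expands an array field into one row per value. The event records, the `search_filters` field name, and the `explode` helper are all hypothetical and only illustrate the idea.

```python
# Hypothetical event records; "search_filters" plays the role of a string array field.
events = [
    {"event_id": 1, "search_filters": ["red", "cotton", "sale"]},
    {"event_id": 2, "search_filters": ["blue"]},
    {"event_id": 3, "search_filters": []},
]


def explode(records, array_field):
    """Return one row per array value, keeping the id of the originating event."""
    rows = []
    for record in records:
        for value in record.get(array_field, []):
            rows.append({"event_id": record["event_id"], array_field: value})
    return rows


list_rows = explode(events, "search_filters")
for row in list_rows:
    print(row)
```

An event with an empty array simply contributes no rows for that dimension in this sketch.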
There’s another type of array called array of objects. An object type in a schema is like a nested hierarchy of fields. The specific use of this array is to maintain the relationship between all of the different types of product data.
If you have coded the product variable in traditional Adobe Analytics, you know that the product variable is the most detailed variable, allowing multiple products in the same variable, with each product having six fields and the ability to have multiple strings and metrics inside some of those fields. AEP easily handles all of this with the array of objects.
In AEP we capture all standard product data, as we can see here in this list, the SKU, the ID, the name and so forth. When onboarding traditional data from the products variable it is automatically populated into this product list items object with all of the sub-data flowing into the appropriate sub-object field.
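The relationship between a traditional products string and an array of objects can be sketched roughly as follows. The sketch assumes the usual products syntax of six semicolon-delimited fields per product (category, product, quantity, price, events, merchandising eVars), with products separated by commas and multiple events or eVars separated by pipes. The output key names and both helper functions are hypothetical and do not reproduce the real field names of the product list items object or the actual onboarding logic.

```python
def parse_pairs(segment):
    """Parse a pipe-delimited 'key=value|key=value' segment into a dict."""
    pairs = {}
    for item in filter(None, segment.split("|")):
        key, _, value = item.partition("=")
        pairs[key] = value
    return pairs


def parse_products(products):
    """Turn a products string into a list of objects, one per product.

    Key names are illustrative placeholders, not the actual schema field names.
    """
    items = []
    for raw in filter(None, products.split(",")):
        # Pad to six fields so trailing fields may be omitted.
        fields = (raw.split(";") + [""] * 6)[:6]
        category, name, quantity, price, events, evars = fields
        items.append({
            "category": category,
            "name": name,
            "quantity": int(quantity) if quantity else 0,
            "price_total": float(price) if price else 0.0,
            "events": parse_pairs(events),
            "merchandising": parse_pairs(evars),
        })
    return items


product_list_items = parse_products(
    "Shoes;Runner 2;1;89.99;event1=5|event2=1;eVar3=summer,Hats;Cap;2;30.00"
)
for item in product_list_items:
    print(item)
```

Each product keeps its own quantity, price, events, and merchandising values together in one object, which is the relationship the array of objects is meant to preserve.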
If we wanted to upload our own product data into AEP, we would just make sure that we map it to the appropriate fields and sub-fields inside the product list items object. And we could add as many new objects as we wanted to underneath the array.
But as we mentioned earlier, we would need to look ahead and build our schema with all the additional data fields and do that upfront before we started onboarding any data into the schema. And lastly, AEP currently supports only one array of objects which is the product list items.
For more information about Customer Journey Analytics, visit the documentation.