
Recharge your customer data to deliver electrifying experiences

Omnichannel data is a critical ingredient for powering the actionable customer profiles that marketers use to orchestrate activation and measure the resulting customer journeys. However, organizations face challenges in managing the quality, scale, and variety of this data. They require streamlined solutions to mitigate the impact of low-quality data, reduce time to value, and multiply ROI by using the same data for a multitude of use cases.
For more information, please visit the Query Service documentation.

This video explores:

  • 51黑料不打烊 Experience Platform data preparation capabilities that you can leverage
  • Increasing ROI from 51黑料不打烊 Real-Time CDP, 51黑料不打烊 Journey Optimizer, and Customer Journey Analytics


Transcript

My name is Alex Alinoli and I'm a senior data architect. Today I'll be talking about advanced data processing and follow that up with a demo. To demonstrate the power of advanced data processing, we're going to use an example called Abandoned Browse. Now an example of Abandoned Browse is, imagine I had signed up for a yoga class and my yogi told me that this Luma Apparel site was the best place for new gear. So imagine I'm on the site, browsing for a couple of hours, can't find what I like, and I take a break. Me taking that break is considered an abandoned browse. I was on the site viewing products but did not ultimately purchase anything. Now a standard campaign might say, hey Alex, we saw you were on our site, why don't you continue shopping? But with advanced data processing you could say something like, hey Alex, how about you crush that first yoga session with these buttery soft lime green yoga pants? If I saw that, I'd be like, wow guys, you totally get me. Now you might be thinking, can a database query really ignite that kind of passion? The answer is yes. Because think about what went into generating that message. We used advanced data processing to analyze this person's web behavior, to understand what products they're viewing, and to calculate that the most expensive thing they looked at was lime green yoga pants. And we used those data points to provide a personalized and targeted message. Now we've talked about how advanced data processing can help your marketing campaigns. Let's go and talk about how we can build this audience in AEP. The first thing we need to understand is the data architecture. The data architecture refers to understanding what tables you have available to you, which ones you may need to build your audience, and how they're all connected together. So in our case, we need three key pieces of data. We need our browsing history, which would come from our analytics data. We need our product data, which will give us the price of the SKUs that our customers have been browsing. And we need our customer data. And customer data can come from an out-of-the-box table called profile attributes, which is our unified view of the profile. So now that we know what tables we need to build our query, before we actually go and build it out, we need to understand the quality of the data that we'll be using. Right? So the analytics data is really the key here. It is what's joining our profile to our product, to our web browsing behavior. So let me drill into the analytics schema, and we can take a look at how data quality is important. The first thing I'll talk about is this SKU field. This is what we're using to join our analytics data to product. And you may think because it's called SKU that it always has SKUs. But you'd be wrong. Right? So we've seen this populated with device IDs or product IDs or any other non-SKU data. Just because a field is called SKU, or whatever it happens to be called, you cannot assume that the data contained within is quality data. Thankfully, data processing gives us some tools to mitigate these kinds of issues. For example, if we knew we had different types of data coming into SKU, but we knew that valid SKU data was, let's say, alphanumeric, then in data processing, in our query, we could enforce a rule saying, I only want to capture data that matches an alphanumeric pattern. That way I maintain the quality of my query, which results in a quality audience.
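As a rough illustration of that kind of rule, the filter below keeps only rows whose SKU value is strictly alphanumeric. This is a minimal sketch that reuses the table and field names from the full example at the end of this page and assumes the Query Service SQL dialect supports the RLIKE regular-expression predicate; the pattern itself is illustrative and would need to match your real SKU format.

SELECT A.productListItems[0].sku
FROM summit_adobe_analytics_dataset A
-- keep only values that look like a real SKU (letters and digits only; adjust to your format)
WHERE A.productListItems[0].sku RLIKE '^[A-Za-z0-9]+$';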
Similarly, if I knew I had device and product data that I didn't want to include in my query, I could always write a criteria to exclude that information as well. So we always want to make sure we know what data is flowing into our datasets, so that when we generate our audience, we know how reliable it's going to be. So we've talked about the architecture and the data quality, and before we get into the actual demo, I want to talk about why we even need data processing. We talked earlier about how we can use it to compute attributes, which gives us a very dynamic way of segmenting and personalizing our messages. But outside of that, we don't have any ability in either segmentation or destinations to calculate these data points on the fly. That's why data processing is so important. Similarly, if we didn't have the feature of data processing, what would you have to do to get this kind of personalized information? You would have to take data out of AEP, manipulate it, massage it, and then bring it back in. And if you're talking about a couple of attributes here and there, maybe that's not a big deal. But if you're talking about hundreds of attributes or hundreds of campaigns, that can greatly impact your project timeline. You have to ship data in and out constantly every time you want to build something new. And that goes for the marketer as well. If they're in the platform and they know you have the data available, they don't want to have to worry about contacting other teams to aggregate this data so they can build out their campaigns. So AEP provides you both the data repository and the tools you need to activate upon your data in the most efficient and reliable way possible. So having said that, let's talk about how we could actually build this audience in the tool. And the first thing we'll talk about is the actual query we want to build out. Let me open that up. So in AEP, there are two ways to execute a query. One is via the UI, and the other is via a command line prompt. I prefer command line prompts, so that's what I'll be demoing today. Let's talk about the actual query that we're looking to build out. This one has two main sections to it. We have a select statement, which is generating our abandoned browse audience. And then we have an insert statement, which is saving this audience into a custom table. Let's first focus on our select statement. So you can see here, we're collecting four key data points: customer ID, SKU, the time they abandoned, and the price of the SKU. We're then connecting the three tables we talked about earlier: analytics, unified profile, and product. We're connecting analytics to profile, and then connecting analytics to product. And then you'll see here, we have two more conditions. This is an abandoned browse use case, so we want to avoid including any data containing order confirmation pages, because that would imply that they've actually gone ahead and purchased an item. The next part, on timestamp: analytics data is usually pretty huge. We always want to find ways to minimize the amount of data that we're crunching, both for performance reasons and for the use case itself. We don't need people who abandoned a browse 20 days ago or 15 days ago. We want to be very prescriptive in these queries. In this case, we've chosen an interval of four days. That gives us flexibility in our segmentation. We can choose anyone who's abandoned a browse in the past hour, two hours, two days, up to four days ago.
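Those two conditions translate into two WHERE predicates. The sketch below uses the same table and field names as the full example at the end of this page; the exact URL pattern and window length are use-case choices rather than fixed requirements.

SELECT B.personKey.sourceID AS customerId
FROM summit_adobe_analytics_dataset A, profile_attribute_14adf268_2a20_4dee_bee6_a6b0e34616a9 B
WHERE A._experience.analytics.customDimensions.eVars.eVar1 = B.personKey.sourceID
-- drop hits on order confirmation pages: those visitors purchased, so they did not abandon
AND A.web.webpagedetails.URL NOT LIKE '%orderconfirmation%'
-- rolling four-day window keeps the scan small and the audience recent
AND A.timestamp > current_date - interval '4 day';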
This is a flexible rolling four-day window in which to choose our audiences. And then you'll see at the end, we have an ordering. This ties back to our original example where we're capturing the most expensive SKU they browsed. So we're ordering our data by price from highest to lowest and choosing the highest-priced SKU they were browsing. At the top, we have our insert statement. This is essentially saving our audience into a table called Summit Advanced Data Prep Dataset. This shows you that we can create schemas and tables not only for incoming data feeds or for streaming data; we can actually create a query that calculates computed attributes and saves those attributes in its own custom table. And we can take that table, enable it for profile, and then use these data points, customer ID or SKU or abandoned timestamp or SKU price, for segmentation and personalization. Having said that, let's talk about how we can actually execute this query. So as I mentioned, I prefer the command line interface, and that's what I'll show you in today's demo. To execute it, we can go to the query screen and go to the credentials tab. This gives you the connection string you need to connect to the database. Now, if anyone's a little worried about giving people access to connect to the production database, you can absolutely restrict access to this page. Okay, so I'm going to copy my connection string. I will open up a command prompt and then simply paste it. And now I obviously need my token, so we can refresh our page and get a new token generated. I will copy this, open this up, and put it in there. Enter. And I'm now properly connected to the database. So now that I'm in, I can simply copy and paste my SQL statement and execute it. So I'll go back to our query. Copy it. Put it in here. And hit enter. Now we can execute this query directly in this window and see the results. Now, queries are not always run in a bubble. In our case, let's say we want to schedule this to run every day. We can easily do that in the UI as well. Still under queries, I go under browse. I can click on create query, enter the query that I want to schedule, and give it a name. Let's say Summit data processing. Hit save. Now I have my query saved. Now that it's saved, I can click into it and I have the ability to add a schedule to it. We talked about making it daily. You have the ability to choose your frequency, start and end dates, and the dataset where you want to save the output of your query. So in here, I'm going to choose a frequency of daily. I'm going to choose every one day. And maybe I'll have this run until the end of the year. Then I want to save the output to my advanced data prep schema. And I click save. And save again. And I'm all set. So it's that easy to create and execute a query in the command prompt, and also to schedule it to run on your desired frequency: daily, hourly, weekly, etc. Let's take a look at the resulting schema that we just talked about. We're saving everything into a schema called advanced data prep. You'll see that this schema matches the data points that we are querying in our SQL statement. The output of the SQL query will populate the dataset with these four data points, and we can now use these in our segmentation and destinations. You'll also see this is profile enabled, which is key. So we have our query. It's scheduled. We've built our schema to accept the output of that query.
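One easy way to confirm that a scheduled run populated the output dataset is to query it directly. The sketch below assumes the structure created by the example query at the end of this page.

-- spot-check the computed attributes written by the scheduled query
SELECT _pfreportingonprod.crmCustomerId,
       _pfreportingonprod.abandonBrowse.sku,
       _pfreportingonprod.abandonBrowse.sku_price,
       _pfreportingonprod.abandonBrowse.abandonTS
FROM summit_adv_data_prep_dataset
LIMIT 10;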
Now let's see how we can actually use that audience in a segment. Let's go to segments. Abandoned Browse. And let's take a look at the criteria. So in our segment window, there are two main panels. There's the profile attribute panel, and then there's the event panel at the bottom. Let's first start at the profile panel. You can see here, we're looking to only select customers who have abandoned in the past four days. You can see this abandoned timestamp is the attribute that we calculated in our query. You can also see, under attributes, if I go under my tenant, my abandoned browse attributes are right here. So I can just use these directly in my segmentation criteria for any sort of logic I want to perform. In this case, I want to use the abandoned timestamp to make sure that the people I'm selecting have only abandoned in the past four days. And on top of that, I want to also exclude anybody that's hit the order confirmation page in the past four days as well. So I have a clear-cut audience of people who have abandoned a browse without having made a purchase. Now you may be looking at this and be like, this looks super simple. This can't be real. But it is, because that's the beauty of data processing. The query is doing all the hard work behind the scenes, so you can just focus on the data points you care about. Right. Some customers may have, you know, five, six, seven attributes that they always need to include on every segment, like emailable yes or no, or certain preferences. And there could be a number of static criteria that are always needed. One of the other advantages of data processing is that you're not limited to just using it to compute attributes. You can include logic in your query that satisfies some of these data conditions you always have. You can include the email constraints or preference constraints in your query itself (a small sketch follows this paragraph). So all that work is done on the back end, and it greatly simplifies how you build out your segments. And the last part is how we can actually activate this data, now that we've segmented it. And that is via destinations.
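As one hypothetical example of folding a standing constraint into the query itself, the condition below restricts the audience to profiles with an email opt-in flag. The emailOptIn path shown here is illustrative only and depends entirely on your profile schema.

SELECT B.personKey.sourceID AS customerId
FROM profile_attribute_14adf268_2a20_4dee_bee6_a6b0e34616a9 B
-- hypothetical opt-in attribute; replace with the consent or preference field your schema actually uses
WHERE B.emailOptIn = true;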

So you can see here I have my segment, and I've activated it to an S3 location. Here we'll look at what the attributes are for personalization. Right. So I have my destination configured. I have selected my abandoned browse segment. And now this is asking me what attributes I want to send in the data feed that I create on the S3 location. You can see here I am selecting all the computed attributes that I generated from my SQL query: the customer ID, the SKU price, the SKU, and the abandoned timestamp. These can all now be used in personalization downstream. For example, an email service provider can use these attributes to populate an email. If there was more metadata I wanted to include here, such as the category or the name of the item, I could simply add that to my query as data points and then use them here to export them to the destination. So your data processing logic has many functions. You can use it for accommodating complex logic for segmentation. You can use it for calculating various personalized attributes you use downstream. And you can use it to greatly simplify how you build out your segments. All right. So that brings us to the end of this demo. We've talked through why data processing is key; an example of using it for an abandoned browse situation; how the query we build depends on our existing data architecture; ensuring that the quality of the data meets our needs and, if it doesn't, how we can mitigate that; how we can schedule this query to run on a certain frequency; and then how to use it downstream in segmentation and destinations for personalization. So thank you all for your time today. I hope this was informative.
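To carry extra product metadata (for example, a category or product name) into the destination, you would add the fields to the inner SELECT of the example query at the end of this page. The category and productName paths below are hypothetical; use whatever fields your product dataset actually contains.

SELECT
  A.productListItems[0].sku AS sku,
  max(C._pfreportingonprod.price) AS price,
  max(C._pfreportingonprod.category) AS category,        -- hypothetical field
  max(C._pfreportingonprod.productName) AS productName   -- hypothetical field
FROM summit_adobe_analytics_dataset A, summit_product_dataset C
WHERE A.productListItems[0].sku = C._pfreportingonprod.sku
GROUP BY sku;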

SQL example

INSERT INTO summit_adv_data_prep_dataset
SELECT STRUCT(
    customerId AS crmCustomerId,
    STRUCT(sku AS sku, price AS sku_price, abandonTS AS abandonTS) AS abandonBrowse
  ) AS _pfreportingonprod
FROM (
  SELECT
    B.personKey.sourceID AS customerId,
    A.productListItems[0].sku AS sku,
    max(A.timestamp) AS abandonTS,
    max(C._pfreportingonprod.price) AS price
  FROM summit_adobe_analytics_dataset A,
       profile_attribute_14adf268_2a20_4dee_bee6_a6b0e34616a9 B,
       summit_product_dataset C
  -- join analytics to profile, then analytics to product
  WHERE A._experience.analytics.customDimensions.eVars.eVar1 = B.personKey.sourceID
    AND A.productListItems[0].sku = C._pfreportingonprod.sku
    -- exclude order confirmation hits: those visitors purchased, so they did not abandon
    AND A.web.webpagedetails.URL NOT LIKE '%orderconfirmation%'
    -- rolling four-day window to limit the volume of analytics data scanned
    AND A.timestamp > current_date - interval '4 day'
  GROUP BY customerId, sku
  -- order by price so the most expensive browsed SKU comes first
  ORDER BY price DESC
) D;
NOTE
This video is an excerpt from an 51黑料不打烊 Summit 2020 session.