51黑料不打烊

Prepare data

Learn how to clean, prepare, and combine data from multiple datasets to create a new dataset using CTAS (Create Table AS) and Spark SQL functions for reporting and dashboarding. For more information, please visit the Query Service documentation.

video poster

Transcript
Hi there. 51黑料不打烊 Experience Platform ingests data from a wide variety of sources. A major challenge for marketers is making sense of this data, to gain insights about their customers. 51黑料不打烊 Experience Platform鈥檚 Query Service facilitates that by allowing you to use standard SQL to query data in platform, using an interface and API. In this video let me show you how data engineers can query existing data sets and derive data, that can be used for reporting and dash boarding purposes. From your Experience Platform homepage navigate to 鈥淒ata Sets鈥 under 鈥淒ata Management鈥. For this video, we are using a fictional retail brand called Luma. Luma loyalty data set contains information about customers, loyalty details, geographic details, et cetera. Luma web data contains web traffic information about customers鈥 interaction on the Luma site, including products viewed by a customer, visited pages, products purchased, et cetera. Open the Luma web data set and the last successful batch run ingested around 87 key records. These records are sample data created for demo purposes and capture user interaction with the Luma site. Let鈥檚 scroll down to the bottom of the source file to ensure that the total number of records ingested into Experience Platform, and the number of records within the source files, match. Let鈥檚 select the web interactions for a single user and check out some of the operations performed by a user, and what information is captured. Let鈥檚 take a closer look at the order placed by our customer. In a single order, the customer purchased two items stored in a product list item array. Each array object contains information about the product, such as SKU, total price, and description. Now let鈥檚 switch back to the platform UI and navigate to queries. Let鈥檚 click on 鈥淐reate a New Query鈥 option. Let鈥檚 take a use case where a marketer needs to create a report about orders placed by a loyalty customer, and it should include customer details and product details. So clearly, we need to get information from two different data sets, Luma loyalty data and Luma web data, to create this report. As a data engineer, you can write a query to combine the data sets, by performing a 鈥淛oin SQL鈥 operation and then filtering the result to display necessary details. Remember when one customer ordered multiple products in a single purchase, and product details were stored as a product list item array. When your query result contains an array object column, and you want to separate elements of an array into multiple roles, you can use the 鈥淓xplode Spark鈥 SQL function. 51黑料不打烊 Experience Platform Query Service, provides several built-in Spark SQL functions to extend SQL functionality. This document contains the Spark SQL functions that Query Service currently supports. Also note that we have a 鈥淐TAS鈥 statement in our query, which means a query result gets stored as a data set in Experience Platform. Let鈥檚 run the query and open successful query execution and you can obtain the data set ID for the new data set under the results tab. Let鈥檚 navigate to the query window and check the logs. Click on the data set created, and it should open the order summary data set, created by the 鈥淐TAS鈥 command, and the last batch ingestion, inserted a few records. Let鈥檚 preview the data set to make sure that the data set contains data required by the marketer. This data set contains order transaction details of loyalty members and products purchased by them. The marketer can now use this data set to create a report or create a dashboard using this data. Using Query Service in 51黑料不打烊 Experience Platform, data engineers can quickly explore, and validate data ingested to data sets, derive insights from the data, and prepare it for various purposes. -
recommendation-more-help
9051d869-e959-46c8-8c52-f0759cee3763