
Stream data using Source Connectors

Learn how to stream data in real time from a cloud storage source to Platform and use the data in real time for customer engagement.


Transcript
Hi there. In this video, let me show you how to stream data in real time from a cloud storage source to Platform and use this data in real time for customer engagement. Data ingestion is a fundamental step to getting your data into Experience Platform so you can use it to build 360-degree real-time customer profiles and use them to provide meaningful experiences. 51黑料不打烊 Experience Platform allows data to be ingested from various external sources while giving you the ability to structure, label, and enhance incoming data using Platform services.

When you log in to Platform, you will see Sources in the left navigation. Clicking Sources takes you to the source catalog screen, where you can see all of the source connectors currently available in Platform. Note that there are source connectors for 51黑料不打烊 applications, CRM solutions, cloud storage providers, and more. Let鈥檚 look at the cloud storage source category. Currently, to stream real-time data to Platform from external cloud storage, you can use either Amazon Kinesis or Azure Event Hubs. For this video, let me choose Amazon Kinesis and show you how to set up the source connector.

When setting up a source connector for the very first time, you are given an option to configure it. For an already configured source connector, you are given an option to add data. Let鈥檚 use the configure option. Since this is our first time creating a Kinesis account, let鈥檚 choose the option to create a new account and provide the source connection details. Complete the required fields for account authentication and then initiate a source connection request. If the connection is successful, click Next to proceed to data selection. In this step, we select the Kinesis data stream from which we want to stream data into Platform. Let鈥檚 select the Luma customer events data stream. Let鈥檚 proceed to the next step to assign a target dataset for the incoming streaming data. You can choose an existing dataset or create a new one. Let鈥檚 choose the new dataset option and provide a dataset name and description. To create a dataset, you need an associated schema. Using the schema finder, assign a schema to this dataset. We can preview the schema structure later in this video. For now, let鈥檚 move to the dataflow details step and provide a dataflow name and description. Let鈥檚 review the source configuration details and then save the changes. Upon successfully saving the configuration, you are redirected to the dataflow screen. At this point, we have successfully configured the source connector for streaming data from a cloud storage solution.

From the left navigation, let鈥檚 click Schemas, browse through the schema list, and open the schema that we chose when configuring the source connector. The selected schema consists of fields that collect information about a user鈥檚 profile details, like membership ID, loyalty number, contact details, et cetera. Let鈥檚 navigate back to the schema UI, download a sample file for our schema, and open it using a text editor. The sample file provides a reference for structuring your data when ingesting into datasets that employ the schema.

In the next step, let鈥檚 see how to send data to the Amazon Kinesis data stream. Based on the schema template, I have created JSON sample data for a customer with some dummy values. Before we use this data, it is important to make sure that its format and structure comply with an existing Experience Data Model (XDM) schema.
Now, let鈥檚 copy the XDM entity data and embed it directly under the request body. Note that the header and body elements contain a reference to the schema into which the data will be ingested. We now have sample data that is ready to be sent to the Amazon Kinesis data stream.

Let鈥檚 switch screens to the Amazon Kinesis homepage and open the data stream that鈥檚 already set up. A producer is an application that writes data to an Amazon Kinesis data stream. We can build producers for Kinesis data streams using the AWS SDK for Java and the Kinesis Producer Library. There are several ways in which you can put records into a data stream. In this video, we will be using the AWS CLI to write data to a data stream. If you would like to explore the other options, please refer to the Amazon Kinesis documentation.

Open a terminal window and run a command to obtain the list of data streams in your instance. Let鈥檚 use the Luma customer events data stream. Make a note of the stream name, as we will need it in the next step. There are two different operations in the Kinesis Data Streams API that add data to a stream: PutRecords and PutRecord. The PutRecords operation sends multiple records to your stream per HTTP request, and the singular PutRecord operation sends records to your stream one at a time. You should prefer PutRecords for most applications because it achieves higher throughput per data producer. Since we only have one record, let鈥檚 use the PutRecord option. Let鈥檚 quickly obtain the syntax for the put-record command. To write data to a data stream, you need the stream name, a partition key, and a data blob. Scrolling down, we can see that the data blob needs to be in a specific format: our sample data must be Base64 encoded before we write it to the data stream. Let鈥檚 use an online tool to convert our JSON-formatted data into Base64 and copy the result to the clipboard. Now, let鈥檚 run the put-record command to write data to Amazon Kinesis. Let鈥檚 provide the data stream name and add the data from the clipboard in single quotes (a command-line sketch of these steps follows the transcript). A successful record write to the data stream returns a sequence number and a shard ID value. Let鈥檚 switch to the Kinesis monitoring dashboard and verify that the put-record call was successful. The record count graph shows a successful data write.

It鈥檚 time to verify that the data written to the Amazon Kinesis data stream was ingested into Platform using the source connector configuration we set up at the beginning of this video. Let鈥檚 open the Platform UI, navigate to Sources, and select Dataflows. Open the dataset associated with the dataflow. Under the dataset activity, you can see a quick summary of ingested batches and failed batches during a specific time window. Scroll down to view the ingested batch ID. Note that we have a successful batch that ingested one record into our dataset. Open the batch ID to get an overview. If record ingestion fails for any reason, you can obtain the error message and error code from the batch overview page. Let鈥檚 quickly preview the dataset to ensure that data ingestion was successful and our fields are populated, then close the preview window.

With Real-Time Customer Profile, you can see a holistic view of each customer that combines data from multiple channels, including online, offline, CRM, and third-party data. To use this data for real-time customer engagement, let鈥檚 enable the dataset for Real-Time Customer Profile.
I hope I was able to show you how to stream data in real time from a cloud storage source to Platform and use this data in real time for customer engagement.
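For reference, the Kinesis steps described in the transcript might look roughly like the sketch below when run entirely from the AWS CLI instead of an online Base64 converter. The file name sample-record.json, the stream name luma-customer-events, and the partition key are hypothetical; substitute your own values, and check the Amazon Kinesis CLI reference for your CLI version.

  # List the data streams in this account and region, and note the stream name
  aws kinesis list-streams

  # Base64-encode the JSON payload (-w 0 disables line wrapping in GNU coreutils;
  # the flag and invocation may differ on macOS/BSD)
  ENCODED=$(base64 -w 0 sample-record.json)

  # Write a single record to the stream. The partition key is arbitrary here;
  # it determines which shard the record lands on. AWS CLI v2 expects the --data
  # value to already be Base64 encoded (cli_binary_format defaults to base64).
  aws kinesis put-record \
    --stream-name luma-customer-events \
    --partition-key customer-1 \
    --data "$ENCODED"

  # A successful call returns a ShardId and a SequenceNumber

put-record is used here because only a single record is being written; as noted in the video, put-records batches multiple records per HTTP request and is the better choice when you need higher throughput per producer.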

