Optimize Your Commerce Intelligence Data Warehouse
Using the Data Warehouse Manager, you can manage table and column sync settings, drill down into a table's schema, and create calculated columns to use in reports.
In this webinar:
- Introduction to the Data Warehouse
- Update cycle and optimization
- Replication and rechecks
- Data Warehouse views
- Questions and answers
Video content
Welcome everybody and thanks for joining us today for the Optimize Your MBI Data Warehouse webinar. I'm Webster Love and I'll be your host for today. Joining me is your presenter for today's session, Sonika Viramnani. We're from the Product Analyst team for Magento Business Intelligence at 51黑料不打烊. Before we dive in, we have a couple of technical notes. Audio is through 51黑料不打烊 Connect: your speakers or headphones must be unmuted, and make sure the speaker icon is on at the top of the Connect room. We don't currently have an option for calling in via phone. 51黑料不打烊 Connect doesn't always work well with Bluetooth headphones, so if you're using those and having issues, you can try switching to wired headphones. You can also try logging out of the room and back in, or switching to the desktop app instead of the browser version. Throughout the webinar, if you face any technical issues with the audio or the screen share, or if you have questions about MBI functionality or the content being presented, please put those questions in the Q&A pod at the left side of your screen, as we'll be monitoring it while the recording of today's content plays. We'll try to answer those questions either right away through chat or during the Q&A segments in the middle and at the end of the webinar. We'll also send out a recording of this webinar afterward. Now I'll hand it over to Sonika.
Thank you, and thank you everyone for joining us today. I hope all of you are doing well and staying safe. Our agenda for this session is split into two parts. In part one, we will start with a brief introduction to the MBI Data Warehouse. Then we will talk about the update cycle in MBI, where we will cover the various steps involved and discuss ways to optimize your update cycle time. We will then move on to replication methods and rechecks. Here, we will discuss the various replication methods available in MBI and see how to set or change the existing replication method on your tracked tables. We will then look at the purpose of rechecks and go through the process of manually enabling them in your MBI account.
With that, we will wrap up part one of the webinar and move on to the first Q&A segment of the session.
In part two, we will discuss data warehouse views. Here, we will first discuss the purpose and power of data warehouse views. Then we will show where to navigate in your MBI account to create a data warehouse view and how to check the existing views in your account. We will then have another Q&A segment to answer your questions and wrap up the session.
As we have mentioned already, we will try to answer your questions in the Q&A pod or in the live Q&A segments. If you have a question that doesn't get answered during the session, or if you think of a question after the webinar is done, feel free to send us an email at the address showing on the screen right now. For the next seven days, we will monitor the questions sent to this email and get back to you with answers.
Alright, now let's talk about the MBI data warehouse. As you all know, MBI is a business intelligence platform that includes both a data warehousing component and a visualization layer for the automation of reporting. MBI replicates the selected data from your connected data sources and stores it in its own data warehouse. So, when you're running queries or creating reports in your account, data gets pulled from MBI's data warehouse, not from your source database. MBI's data warehouse enables you to consolidate data from multiple data sources, including 51黑料不打烊 Commerce and other transactional databases, as well as third-party API integrations like Google AdWords and Salesforce. If you'd like to learn more about MBI's data warehouse functionality, refer to our Managing Data Sets in MBI webinar in the Resources Library. As I mentioned, MBI replicates data from your connected data sources to MBI's data warehouse at regular intervals. This replication happens in a process called the update cycle. In general, the update cycle in MBI can be broken down into two main phases: data synchronization and chart caching. In the data synchronization phase, MBI checks for data in your connected data sources, replicates new rows, and updates any changed values in existing rows of your tracked tables since your previous successful update cycle. Data synchronization itself can be divided into two phases. The first is synchronization of new rows, where MBI checks for and replicates any new rows in your tracked tables that were created in all connected data sources since the previous update. The second phase is called re-sync of existing rows. Here, MBI checks for changes to existing rows in your tracked tables and replicates those changes for all the connected data sources. Re-sync is executed based on either the replication method set on your tables or on the fields that have rechecks enabled on them. We will discuss replication methods and rechecks in the next section.
Coming to the second phase in the update process, which is chart caching. In this phase, MBI aggregates and caches all the charts that are displayed on a dashboard and have been viewed by a user in the last 120 days. This timing cutoff is configurable, and 120 days is the default.
Now, let's switch over to MBI to explore and understand the information that can be found related to your updates.
As you can see, we are now on the dashboards page in our demo account. Here, if you click on the notifications icon, you will see some details on the data status. It shows you the date and exact time when the last successful update in your account began, and it also says whether there are any integration errors in your account. Here, all our integrations seem to be connected successfully, as it says no integration errors. You will also be able to find more details on updates in your account within the Data Warehouse Manager section. This sits in the Manage Data section of MBI here. So, let's go to Manage Data and select Data Warehouse. Please note that you need admin privileges to access the Data Warehouse Manager in your account. Within the Data Warehouse section, you can find information on when the last successful update began, and you will also see whether an update is currently in progress. If an update is currently in progress, you will see an option to ask MBI to send an email notification when the current update finishes. If there is no update in progress at the moment, as we are seeing now in our demo account, you will see an option to run a force update. This can be very useful in cases where you know that some interesting data came into your source database after MBI's previous update ran and you would like to populate your MBI reports with that data. In such cases, you can initiate a new update right away by clicking force update. A few things to keep in mind here: when you run a force update, it will kill an existing update that is not a force update and start a new one. A force update will also not run if the time you initiate it falls within the blackout hours in your account. For those of you who are unfamiliar, blackout hours are hours you set in your account during which MBI will not access data from your connected data sources. And here's where you can adjust that: go to your account dropdown and select Account Settings.
Here you will see a setting called blackout hours for database integrations. We do not have any blackout hours set in our demo account, so we don't see anything, but you can set your hours as desired and hit Save Customizations. If you notice here, you will also see an option to schedule force updates. When you set the desired times there, MBI will automatically run force updates at those times.
Alright, now let's talk about some possible ways you can optimize your update cycle, and for that let's switch back to our slides.
You can optimize your update cycle in MBI by taking the following steps. The first is to deactivate any inactive users in your account. An admin user will have to do this for users that are no longer actively using MBI. Please be cautious while deactivating users in your account, as this will delete the reports they may have created. So make sure to transfer their reports and dashboards over to an existing active user before deactivating any user in your account. The second step is to clean up your account by deleting any dashboards and reports that are no longer needed. To do this, it's recommended that each end user that owns reports cycle through their reports and dashboards and remove those that are no longer useful. If there are any reports on dashboards that are not being used at the moment but might be good to have for future reference, we suggest removing such reports from dashboards but not deleting them. That way they will not go through the caching process in every update cycle, which will reduce your overall update time, and you can still add them back to dashboards whenever you need. The third step in optimizing your update cycle is to ensure that the tables in your account are set to suitable replication methods and column recheck frequencies. This definitely helps your update cycle run in an optimized manner. Now let's see where you can implement these in your MBI account. For that, let's switch over to MBI.
To deactivate inactive users in your account, go to the Manage Users section under the Account dropdown. Here you will see the list of users that currently have access to your account. You can search for the required user and deactivate them by clicking Deactivate User. After you do that, make sure to hit Update Settings to capture the changes. To delete the reports and dashboards that are no longer needed, you can go to the Reports and Dashboards pages in the Manage Data section here and delete them as needed. For example, to delete a report, you would select the report and then hit Delete. You can also delete reports and dashboards directly from the Dashboards page. For more information about how to do that, you can refer to our webinar recording on Getting Started with MBI. As for adjusting the replication methods and rechecks on your tables, you can do that within the Data Warehouse section here. Let's say you want to look at the Sales Order table; let's give it a minute to load. Okay, here it is. If you notice, the Sales Order table is currently set to the replication method of single auto-incrementing primary key. You can edit this by clicking the edit icon here, and it shows you the available options. Overall, if you need any help with optimizing your update cycle, you can open a support ticket and the team will be able to assist you. Alright, since we are looking at replication methods here, let's learn more about replication methods and rechecks. Replication methods and column rechecks control how MBI identifies new or updated data in your database tables. A synced table's replication method defines how MBI brings new data into that table, and a replication method is set on a table-by-table basis. Setting a recheck on a column instructs MBI to check that column for changed values at a specified frequency, such as daily or weekly, and to update the value for that column if a change is found. Configuring replication methods and rechecks correctly is crucial to ensuring both data accuracy and optimized update times in your account. First, let's talk about the various replication methods that are available in MBI and how they work.
Replication methods fall into three categories. The first is called incremental replication. As you can see here, with this method, in every replication attempt MBI will only replicate new or updated data from your connected data sources to your tables in the MBI Data Warehouse. As these methods greatly reduce latency, we highly recommend using an incremental type of replication wherever possible. The second category is called full table replication. With this type of replication method, MBI will replicate the entire contents of a table on every replication attempt. Because of the potentially large amount of data to be replicated, these methods can increase latency and update times too. So if a table contains a timestamped field or a datetime column, we recommend using an incremental method wherever possible. The third category of replication method is called paused, and you can see it here. When this replication method is selected, it indicates that replication for the table is stopped or paused. For as long as this replication method is set on a table, MBI will not check for new or updated data in that table during an update cycle, and the table data remains stale until the replication method is changed. This means that no data will be replicated from a source table that has paused as its replication method. One thing to keep in mind here is that not all replication methods are available for all tables; their availability varies based on the configuration or structure of the table. Within the incremental type of replication, we have three methods available. One is modified at, which we are not seeing on this table because, as I said, not all replication methods are available on every table. The other one is single auto-incrementing primary key, and the third type of replication method in the incremental category is added at. Let me go to a table that has modified at as a possible replication method so I can talk about it in more detail. Let me go to the sales flat order address table.
Alright, let me click on the edit icon for the replication method here. As you can see, modified at is one of the possible replication methods for this table. So first let's talk about modified at. The modified at replication method uses a datetime column that is populated when a row is created and then gets updated when data changes. That datetime column determines how MBI finds updated data to replicate. The modified at replication method is designed to work with tables that meet these criteria: the table has a datetime column that is initially populated when a row is created and is updated whenever the row is modified, that datetime column is never null, and rows in the table are never deleted. To choose this replication method for your selected table, you would choose modified at in the dropdown here and select the replication key. Incremental replication methods require you to set a replication key. The replication key is simply that datetime column which is initially populated when the row is created and updated whenever the row is modified. So you would select the suitable replication key for the table and hit save. I don't want to make any changes, so I'm not going to hit save. With the modified at replication method, whenever an update runs in your MBI account, new or updated data is identified by searching for rows that have a value in the datetime column that occurred after the most recent update. When new rows are discovered in that process, they get replicated into your data warehouse, and if any rows already exist in your data warehouse, they are overwritten with the current database values. To understand this process better, let's take an example. Consider that you have a table set to the modified at replication method, and it has a datetime column called updated at that indicates the last time the data was changed. If the most recent update in your account ran on July 28th at 3 p.m., then the next update in your MBI account will search for all the rows that have an updated at value greater than July 28th at 3 p.m. Any rows that were created or modified since July 28th at 3 p.m. will be replicated to the data warehouse. Modified at is not only the most ideal replication method, it is also the fastest of all. This method not only produces noticeable speed gains with large datasets, it also doesn't require configuring a recheck option. Other methods need to iterate through an entire table to identify changes, even if only a small subset of data has changed; modified at iterates only through that small subset. So whenever your data structure allows it, we recommend using the modified at replication method. For the data sources that are connected using MBI's pre-built third-party integrations, only modified at should be used as the replication method, with updated at selected as the replication key. That is because, for those data sources, we have optimized the modified at replication method for speed and data integrity when used with the updated at field as the replication key.
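To make that discovery step concrete, here is a minimal sketch of the kind of query modified at replication implies, assuming a hypothetical sales_flat_order_address table with updated_at as the replication key (the actual queries MBI runs are internal to the platform):

```sql
-- Sketch of modified at replication: pull rows created or changed since the
-- last successful update (assumed here to be 2020-07-28 15:00:00).
-- Table and column names are illustrative, not MBI internals.
SELECT *
FROM sales_flat_order_address
WHERE updated_at > '2020-07-28 15:00:00';
```

Rows returned by a query like this are written into the warehouse, and any that already exist there are overwritten with the current values.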
Alright, now let's talk about single auto-incrementing primary key. Auto-incrementing is a behavior that sequentially assigns primary keys to rows in your tables. If a table is of the auto-incrementing type and the highest primary key in the table is currently 10,000, then the next primary key value will be 10,001 or higher. A table that does not use auto-incrementing behavior may assign a primary key value that is less than 10,000 or jump to a much higher number. The single auto-incrementing primary key replication method is designed to replicate new data from tables that meet these criteria: the table has a single-column primary key, the primary key has an integer data type, and the primary key values are auto-incrementing. When a table is using single auto-incrementing primary key as the replication method, new data is discovered by searching for primary key values that are higher than the current highest value in your MBI data warehouse. For example, if the current highest primary key value in your MBI data warehouse for a given table is 700, then when the next update runs, MBI will search for rows with primary key values of 701 or higher and replicate them into your data warehouse. This replication method by itself will not check for updated values in your existing rows. To handle that, you will have to enable rechecks on the required columns, and we will talk about rechecks in a bit.
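As a rough sketch of that discovery step, assuming a hypothetical sales_flat_order table whose single auto-incrementing primary key is entity_id:

```sql
-- Sketch of single auto-incrementing primary key replication: fetch rows whose
-- primary key exceeds the highest value already in the warehouse (700 here).
-- Changed values in rows 1-700 are NOT picked up; that requires rechecks.
SELECT *
FROM sales_flat_order
WHERE entity_id > 700;
```

The added at method described next works the same way, except the comparison is against the latest timestamp synced rather than the highest integer key.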
Now let's talk about the added at type of replication method. Let me go to a table that has added at as one of the possible replication methods. Let me give it a minute to load. Alright, it has come up; let's click on the dropdown for the incremental type, and we see added at as a replication method that's available here. The added at replication method functions similarly to the single auto-incrementing primary key method. The difference between the two is that instead of using an integer for the table's primary key, the added at method uses a timestamped column to check for new rows. This means that when a table uses the added at replication method, new data in the source table is discovered by searching for timestamp values that are greater than the latest date synced to your table in the MBI data warehouse. So if the last update in your account ran on July 26th at 10 a.m., any rows in your source database with a timestamp greater than that will be marked as new data and replicated into your data warehouse. One thing to note is that, unlike the modified at replication method, added at will not check existing rows for updated information; it will only check for new rows and replicate them. Alright, so far we have discussed the incremental replication types. Now let's talk about full table replications. As I mentioned earlier, this type of replication will replicate all the contents of a table in every update cycle, and we have two methods in this category: full table and primary key batch monitoring. With full table replication, whenever new rows are detected in the source table, the entire table in MBI gets refreshed. Among all the replication methods available, this is by far the least efficient one, due to the fact that all the data must be processed during every update, assuming there are new rows in your source table. New rows are detected by querying your database and counting the number of rows at the start of the synchronization process in an update cycle. If your local database contains more rows than your table in MBI, then the entire table is refreshed, whereas if the row counts are identical, or if the table in MBI contains more rows than your local database, then the table is skipped in the update process. This makes full table replication incompatible with cases where more rows are deleted than created in your local database table between subsequent update cycles, or where column values in the table are changed but no additional rows are created. In either of those cases, full table replication will not detect any changes in your source table, and thus the data in your MBI table will become stale. Because of the inefficiency of this method, full table replication is only recommended as a last resort. With the primary key batch monitoring type of replication method, new data is discovered by counting rows inside ranges, or batches, of primary key values. To understand it better, let's say an update runs in your MBI account and performs a row count on the source table in your local database for the range of primary keys from 300 to 400. In this update, the system finds and logs 54 rows. In the next update, a row count is performed again on the 300 to 400 range of primary keys, and it finds 67 rows.
Because there is a difference in the number of rows compared to the last update, the system will inspect that range, or batch, in more detail. This replication method is designed to work with tables that have either a single-column primary key with integer values of a non-incrementing type, a single-column primary key with non-integer values, or a composite primary key, which means the primary key is composed of multiple columns. One thing to note here is that the columns used in a composite primary key can never have null values. This is not an ideal replication method, since it's incredibly slow due to the amount of processing that must occur to examine batches and find changes. We do not recommend this method unless it is not possible to make the modifications required to support any of the other replication methods. So now that we have an understanding of all the available replication methods, you can adjust them as needed on your tables. Once you select the desired replication method, be it an incremental type or a full table replication type, make sure to hit save to capture the changes. As I said, I do not want to make changes here, so I'm going to hit cancel.
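Before moving on, here is a minimal sketch of what one batch check in primary key batch monitoring conceptually looks like, again with hypothetical table and column names:

```sql
-- Sketch of primary key batch monitoring: count rows in one batch of primary
-- key values and compare the result against the count logged in the previous
-- update. A changed count (54 -> 67 in the example above) flags the batch for
-- detailed inspection. Repeating this across every batch is why the method is
-- slow on large tables.
SELECT COUNT(*)
FROM sales_flat_order
WHERE entity_id BETWEEN 300 AND 400;
```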
Alright, that wraps up our discussion of replication methods, so now let's talk about rechecks. As I mentioned earlier, setting a recheck on a column instructs MBI to check that column for changed values at a specified frequency, such as daily or weekly, and MBI will update the value of the column if a change is found. So when a column has a recheck configured, MBI checks for changes to that column's previously replicated values, and whenever a change is found, it updates the column with the new values. Rechecks are very important when you have columns in your tables with changeable values. The processing of rechecks varies based on the chosen replication method: some replication methods automatically check for updated values in existing rows, and tables on such replication methods do not need rechecks enabled. But tables on the other replication methods, meaning the ones that do not handle updated values as part of the replication process, require rechecks to be configured on the columns that may have frequent changes in their data values. For example, one common column we see in a general commerce schema is the status field from the sales order table. When an order is initially written to the database, the status column might contain the value pending. The order will then be replicated into your data warehouse with the pending value for the status field. But order status can change, right? It will not always stay pending; eventually it could become complete or cancelled. To ensure that your data warehouse syncs this change, it's important to check whether the table's replication method handles updated values. If it does not, then you will have to enable rechecks on the column for the data values to be updated. There are two ways a column can be flagged for rechecking. The first is an auditing process that runs as part of the update process and flags the columns that have to be rechecked. Note that the auditor relies on a sampling process, so changing columns may not be caught immediately. The second way is manually enabling a recheck configuration on any required column. Manual rechecks can be enabled on a table only when that table's replication method is set to either single auto-incrementing primary key or primary key batch monitoring. And one thing to note here is that modified at is the best choice for processing changing values, as rechecks don't have to be configured for it to efficiently check all the columns for updated values. Here is how you can manually enable rechecks on your desired columns. Within the data warehouse itself, go to the table that has the required field and select the field. Let's say, for example, we want to enable rechecks on the status field from the sales order table. So we are in the sales order table; let's go to the status field. To enable rechecks, we select the field and click on Set Recheck Frequency. When you're flagging a column for rechecking, you can set how often you want the recheck to be done. If a particular column changes less often, choosing a less frequent recheck can optimize your update cycle. Once we click on the Set Recheck Frequency dropdown, we can see the available recheck frequencies. We can see that it has options for never, always, daily, weekly, monthly, and once.
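Conceptually, a recheck on the status column boils down to periodically re-reading rows that were already replicated and comparing them against the warehoused copies. A hedged sketch, with hypothetical table and column names:

```sql
-- Sketch of a recheck on sales_flat_order.status: at the chosen frequency,
-- re-read the rechecked column for rows already in the warehouse (primary
-- keys up to 700 here), then rewrite any values that differ from the
-- warehoused copies.
SELECT entity_id, status
FROM sales_flat_order
WHERE entity_id <= 700;
```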
As update times are correlated with how much data needs to be synced, the best practice here is to recheck as infrequently as possible. If you have a scenario where you would like to accommodate row deletions, meaning that if a row is deleted remotely in your source table you want MBI to delete the corresponding row locally in its data warehouse, the replication method set on the table will have to be either single auto-incrementing primary key or primary key batch monitoring. When one of those is set, like on this table, you can enable MBI to handle row deletions by setting a recheck on the primary key. The process is similar to how we enable rechecks on any other field: we just select the field and choose the required recheck frequency, and that will ensure that MBI handles row deletions as well. Alright, this wraps up part one of the webinar, so now let's jump into the first Q&A segment for this session. We will answer some of the questions that we have received so far.
Alright, thanks for that great look at lots of ways to optimize your updates and your data warehouse. Quick reminder: if you have questions about the content so far, please submit those in the Q&A pod on the left side of your screen. We'll also have a second Q&A section at the end of today's webinar. Alright, we have a few that have come in already. First question: what if new rows appear in my database table while MBI is in the middle of an update cycle? Will those rows be replicated in the current update or in the next one? That's a great question. It depends on what step the update cycle is on by the time the new rows appear in your source database table. If the rows appear before MBI performs the synchronization step on that table, the new rows will be replicated in the current update. However, row sync is the first step in the MBI update cycle, so in general you can expect that rows appearing in your database after an update has started will be reflected in MBI following the next update.
Alright, thanks. Next: there's an error occurring in my updates. What can I do about it? If you have an error occurring in your updates, please submit a support ticket. Our support team has access to extra details about your account's update cycle, and they can help you diagnose and fix any errors.
Alright, the next one says: it doesn't look like I have the data warehouse as an option in my left-hand panel. Is this something that's part of my package, or do I need to access it differently? To be able to access the data warehouse, your MBI account needs to be on the Pro version and you need to have admin privileges. If you meet both requirements and still cannot access the data warehouse, please submit a support ticket and the team will be able to assist you. Alright, next: if our account has very long update times, meaning more than 24 hours, does it make sense to configure a daily force update? In general, that's not recommended, because setting up a daily force update when your update takes longer than a day will result in fewer total updates being run. For example, if you have a force update scheduled to start at noon on Monday and the update takes 26 hours, it will finish at 2 p.m. on Tuesday. The next regular update will automatically start an hour or so later. That regular update will still be running at noon on Wednesday, but the scheduled force update stops it and initiates a new update. So the Wednesday force update finishes Thursday afternoon, whereas if the regular update had been allowed to finish, MBI would have displayed updated data almost a whole day earlier. So in cases where your update times run very long, meaning greater than 24 hours or so, it's not usually recommended to schedule a force update to run on a daily basis. Alright, next question: if I set a column to have a weekly recheck, does that recheck then happen each week at the time and day when I originally set the recheck? For example, if you're in MBI on a Tuesday and you configure the recheck at noon, is that recheck going to happen every single Tuesday at noon? That's a really great question, and that's not actually the case. If you configure a weekly recheck on Tuesday at noon, that doesn't mean the recheck will always be performed every Tuesday at noon. There's a standard time at which each recheck frequency runs. All the daily rechecks run during the first full update each day. For the weekly rechecks, the first update that starts after 9 p.m. Friday will do the rechecks. The monthly rechecks work the same way: the first update that runs after 9 p.m. Friday will do the recheck, once every four weeks. Alright, I think we have time for about one more before we move on to the second section of the webinar. I've done the suggested update cycle optimization steps, but my updates are still taking longer than I want them to. How else can I reduce update times? Is there any other way? Good question again. If you have done all the suggested optimization steps and you still see long updates in your account, we suggest you reach out to the support team, because the 51黑料不打烊 Commerce support team can help you with optimizing your update cycle. First of all, if you're unsure about any of the optimization steps that we have explained in our session so far, they can help you get more clarity on those. And if you determine you need to transfer assets from an inactive user before deleting their account, support can perform that transfer for you.
They can also help you identify inactive users in your account by their last login date. And finally, they can provide details on the areas with the greatest opportunity for improvement. That could be changing replication methods on specific tables, or looking at calculated columns and charts that consistently take a long time to go through the update cycle process. That can definitely help with your optimization efforts. Okay, great. Let's wrap up our first Q&A section here and get back to part 2 of our webinar. As a reminder, we'll have a second Q&A section after this part 2, so feel free to keep your questions coming. I'm going to hand it back over to you, Sonika. Alright, now let's go to part 2 of the webinar and talk about Data Warehouse Views. The Data Warehouse Views feature is a method of creating new warehoused tables by modifying an existing table, or by consolidating multiple tables via joins, using SQL queries. Once a Data Warehouse View has been created and processed by an update cycle, it will populate in your Data Warehouse as a new table under the Data Warehouse dropdown in your MBI account. The most common use case of Data Warehouse Views in MBI is consolidating multiple similar but disparate tables together such that all the reporting can be built on a single new table. A few common examples include consolidating the tables from a legacy database and a live database to combine both historical and current data, or combining multiple ad sources like Facebook and AdWords into a single consolidated ad spend table. Once a Data Warehouse View is created in your account, it works like any other table: data gets updated in every update cycle, and you can create calculated columns in it or build metrics and reports on top of it. You can also use the new table in a SQL report. One thing to note here is that you cannot reference one Data Warehouse View in another Data Warehouse View. Now let's switch over to MBI to see where and how you can create Data Warehouse Views. As I said, you can find all the created Data Warehouse Views in your account under the Data Warehouse dropdown here, and you will also have a tab for Data Warehouse Views within the Manage Data section in your account. Alright, here you can see a list of views that are already created in your account. As you can see, we have one pre-created sample view in this demo account. You can see the contents of the view by clicking on the edit icon here. I do not want to make any changes to our sample view here, so I'm going to hit cancel. You can also create a new view in your account by clicking Create Data Warehouse View. Within the interface to create a new Data Warehouse View, you can give your view a name and write your query here using standard SQL syntax. When you're finished, click save to save your view. Keep in mind that your view will temporarily have a pending status until it is processed by the next full update cycle, at which point the status will change to active. After being processed by an update, your view will be ready to use in reports. One thing to note here is that once you save the view, the underlying query used to generate a Data Warehouse View cannot be edited. So if for some reason you need to adjust the structure of an already created Data Warehouse View, you will need to create another new view and manually migrate any calculated columns, metrics, or reports from the original view to the new view.
When the migration is complete, you can safely delete the original view. And because Data Warehouse Views are not editable, we strongly recommend that you test the output of your query using the SQL Report Builder before saving your query as a Data Warehouse View.
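As a hedged illustration of the legacy-plus-live use case mentioned above, a view's query might look like the following. The table and column names are hypothetical, and columns are listed explicitly because views do not accept SELECT * (a restriction covered in the Q&A below):

```sql
-- Hypothetical Data Warehouse View combining historical and current orders.
-- Written in Postgres syntax; columns must be listed explicitly (no SELECT *).
SELECT entity_id, customer_email, grand_total, created_at, 'legacy' AS source
FROM legacy_sales_order
UNION ALL
SELECT entity_id, customer_email, grand_total, created_at, 'live' AS source
FROM live_sales_order
```

Reporting can then be built on the resulting single table, just as with any other warehoused table.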
Alright, this wraps up part 2 and the overall content portion of our webinar. Thank you very much, everyone, for your time today. With that, let's jump into the final Q&A segment, and I'll hand it back over to the host now.
Alright, thanks for some good info there on a powerful MBI feature. Now let's move to our second Q&A portion. Again, please submit any additional questions about either section of today's webinar in the Q&A pod on the left side of your screen. Also, please submit your feedback on the poll questions that now show on your screen; these are really helpful for us in improving our content in the future. Just a couple of admin notes: if you have questions after the webinar, or if we don't get to your question during this Q&A, please do send us an email at mbi-webinar at adobe.com. We'll receive questions sent there for the next week or so. And one last reminder: you'll get a follow-up email in the next day or two with a link to a recording of this webinar and to the resources library that we mentioned, where all our webinar recordings get posted. It will also contain the email address that I just mentioned. Alright, a couple more questions have come in. The first one is: do you have any recommendations on the best replication methods for Magento or other common data sources? Yes. Wherever possible, we always recommend using an incremental type of replication, because that definitely helps with optimizing your update cycles. For the 51黑料不打烊 Commerce schema in particular, most of the tables have primary keys in an auto-incrementing format, so in cases like that our recommendation would be to use the single auto-incrementing primary key type of replication method. And as we mentioned earlier, for the third-party integrations that are pre-built in MBI, the modified at replication method would be the best choice, using the updated at key, because for all the third-party integrations we have optimized the modified at replication method for data accuracy and speed. Great. Here's a question about warehouse views: do data warehouse views accept all SQL syntax? I would say yes, nearly. Data warehouse views do not allow SELECT *; you must explicitly list all the columns that need to be included in the view. But otherwise, standard SQL syntax applies. To be specific, most of our clients are on Postgres, so we recommend using Postgres syntax. Again, the best practice is to test your query in the SQL Report Builder to check that it works before saving your data warehouse view. Alright, great. Next question: in the update cycle, when one update finishes, does the next one start immediately? Not quite. The next update does not start immediately, because there is a queuing process between updates that takes about an hour or so. That's good to keep in mind when looking at the update times listed in the Data Warehouse Manager section of your account; factoring in this hour of queuing gives you a better idea of your update durations. Alright, thanks. I think we have time for about one more here. Let's try: can I reference a calculated column I have created on one of my tables in a data warehouse view? Great question. You cannot reference a calculated column in a data warehouse view, because data warehouse views can only access native columns from your tables. Alright, we're about out of time for this webinar. I want to make sure that we wrap up and give people just a couple of minutes before they have to go to their next thing. So once again, everybody, thank you very much for joining us today.
We hope you got some really useful information out of the session. Have a great rest of your day.
Useful resources
Optimizing Your Database for Analysis
Recommended Data Dimensions for Segmentation and Filtering