The hybrid dataset is part of the Real-time streaming in Power BI that helps you stream data and update dashboards in real time. One of the most popular uses of the Hybrid dataset is to connect MS forms to Power BI with Power Automate, so a hybrid dataset in Power BI might be something you have used before. One way to restart this type of dataset is to turn off the "Historic data analysis" option in the settings, and then turn it back on again to delete all records.
Nevertheless, how can it be scheduled or how can it be restarted with just one click?
The purpose of this blog post is to answer this question using Power Automate and API, but before we get into the details about the solution let's look at what a hybrid dataset is.
A real-time dataset can be classified into different types like the Push dataset, Streaming dataset, PubNub streaming dataset, and Hybrid dataset which is the combined version of the Push dataset and Streaming dataset. To understand the Hybrid dataset, we first need to understand Push and Streaming datasets:
Push dataset:
Push dataset is a feature in Power BI that allows data to be loaded into the service by pushing it from an external source.
Once the dataset is created, Power BI automatically creates a database to store the data, so you can use Power BI's report-building features, such as Power BI visuals, data alerts, and pinned dashboard tiles, to create reports with this dataset data.
Click here for more information.
Streaming dataset:
With a streaming dataset, data is also pushed into the Power BI service, but with a key difference: The data is stored only in a temporary cache, which soon expires. This temporary cache is used only to display visuals with transient histories, such as line charts with a one-hour time period.
Due to the lack of an underlying database, streaming datasets cannot be used to create report visuals. As a consequence, you cannot use report features such as filtering, Power BI visuals, and others.
Streaming datasets can only be visualized by adding tiles and using them as custom streaming data sources.
Click here for more information.
Hybrid dataset:
The name Hybrid dataset did not appear in any MS docs (based on my search), but I found it in one of RADACAD's blog posts, so I found it cool then I used it here as well.
This dataset in question is a combination of Push and Streaming datasets, featuring a temporary cache for incoming and outgoing data, as well as a database for storage. To create this dataset, the "Historic data analysis" option must be selected during its creation.
When Historic data analysis is disabled, as it is by default, you create a streaming dataset as described earlier. When Historic data analysis is enabled, the dataset you create (Hybrid) becomes both a streaming dataset and a push dataset.
This setting is equivalent to using the Power BI REST APIs to create a dataset with its defaultMode set to pushStreaming.
I mentioned earlier that disabling and enabling historic data analysis could restart a hybrid dataset, but in this blog post, I want to show you how to schedule the restart using Power Automate and APIs.
Let's look at this scenario to have a better understanding of the purpose of this blog: You have established a flow that links an MS Form to a Power BI Hybrid dataset. When a user submits the form, the data is automatically pushed and saved into the hybrid dataset which is connected to a report that visualizes the results.
But your client asked you to restart the dataset every week because they want to see the current week's data and old data are not valuable to them. The current week's data can be visualized using filters and formulas, but the old data have no value, so keeping them is pointless and we can restart the dataset every week.
By using this method, you can create a Microsoft Power Automate flow that automatically restarts the Power BI dataset every Monday (Or any other day) at a specific time rather than manually.
Before going through the Power Automate flow set-up, please read these articles about how to use HTTP actions to run a Power BI API:
Create the flow on Power Automate
Step 1: Run flows on a schedule
You can create a flow that performs a task(s):
Once a day, an hour, or a minute.
On a date that you specify.
After the number of days, hours, or minutes that you specify.
Once you've signed in to Power Automate, select My flows > New flow > Scheduled cloud flow.
In the fields next to Starting, specify the date and time when your flow should start.
In the fields next to Repeat every, specify the flow's recurrence.
Select Create.
Following the above actions, a new flow with a Recurrence action will be created.
Click on it, and select "Show advanced options".
You will see the following items to configure in the advanced section: **These options change based on the values to which Interval and Frequency are set. If your screen doesn't match the graphic below, make sure that the Interval and Frequency are set to the same values that the graphic shows.
In the Time zone field, select a time zone from the dropdown list to specify whether the Start time reflects a local time zone, Coordinated Universal Time (UTC), or another time zone.
In the Start time field, enter a start time in this format: YYYY-MM-DDTHH:MM:SSZ
If you specified Day under Frequency, select the time of day when the flow should run in the At these hours and At these minutes dropdown lists.
If you specified Week under Frequency, do the following.
In the On these days and At these hours dropdown lists, select the day or days of the week on which the flow should run and the time or times of day when the flow should run.
In the At these minutes field, enter the minute values (from 0 to 59) separated by a comma.
In our scenario, the flow must be run every Monday at 8:00 AM, so the Recurrence must be configured like this image:
Learn more about the advanced options and how to configure them.
Now we can add the first action which is the HTTP to run a Power BI API.
Step 2: Set Up the HTTP Action to call Power BI Rest APIs
There is one table in the Hyper dataset called "RealTimeData," and all rows in that table need to be deleted to restart the dataset. The "Push Datasets - Datasets DeleteRowsInGroup" API must be utilized in the HTTP action to delete all rows in that table.
HTTP |
DELETE https://api.powerbi.com/v1.0/myorg/groups/{groupId}/datasets/{datasetId}/tables/{tableName}/rows |
The Power BI API URL must be completed with the inclusion of the {groupId} (WorkspaceId), {datasetId} (Dataset ID), and the table name {tableName}, which is "RealTimeData".
The Hyper Dataset ID and Workspace ID can be found on the URL of the setting page.
Select the Dataset and click on the More Option (... / ellipsis)
Select the setting on the list
The Dataset ID can be found in the URL after /dataset/ on the settings page:
The Workspace ID can be found in the URL after /groups/ on the settings page:
So the API URL for this example will be: https://api.powerbi.com/v1.0/myorg/groups/0e8aee6d-e630-438e-964b-cb15c0c6e06c/datasets/7c7d9291-be87-4164-9ba9-362da727343b/tables/RealTimeData/rows
The Advanced Options of the HTTP action has multiple fields, so I recommend reading my blog post "Call Power BI Rest APIs with Power Automate" to understand how to properly configure it to call a Power BI API in detail.
Your Flow is ready now!
Note:
Additional actions can be added after the HTTP action, such as sending an email or a text message on Teams, to notify users that the hyper dataset has been restarted. Alternatively, an action can be added before the HTTP action to save the previous data in an Excel file or move it to another dataset for historical record keeping.
コメント