Working with DateID in Direct Query and Incremental Refresh
Scenarios when using DateTime to limit data and maintain query folding can be done with DateIDs! Two scenarios are (1) rolling x days of data for direct query or (2) setting up incremental refresh on a fact table with DateID.
First, DateID's are typically used in data warehouses, and come in the form YYYYMMDD. So today, 30th January 2021, would be 20210131, and an int (integer). Integer's are great indexes and make for faster querying and use less space in SQL. These typically join to a Date table in the data warehouse that has the actual date and other date related fields, just like you have in a date table in Power BI.
These scenarios assume you have connected to a fact table, and quite possibly a very large one, that you want to filter via date to avoid bringing in all the rows. You quickly encounter the query folding term, and I'm not an expert on that topic, but I know it means that when the data is refreshed it won't first try to bring back all the rows before limiting it.
The key is to take the datetime value and convert it to a DateID in the filtering step. One step only!
Scenario 1: Direct Query to a table and bring back only the last 10 days of data
1. Open Power BI Desktop and connect to your table using direct query
2. In the "Transform Data" view, you can see your data!
3. I'm a bit lazy, so first I will go and filter the DateID to greater than or equal than a few days back, to get the next step mostly written for me.
4. Now I will go to the advanced editor and make it less manual.
5. On the home ribbon choose "Advanced Editor", and you should see something similar to what I have below. I am connecting to an Azure SQL Database, but there are other connectors could have used. The important part is that last line before in should be similar (it references the line before, called "dbo_MockDataDateID", so yours will most like be different).
6. Add a row before "Source = ...." to create a variable to hold the date that is 10 days back from today, called x. That end comma and the capitals are very important when writing m script.
x = Date.AddDays(DateTime.Date(DateTime.LocalNow()),-10),
Unpacking this, the DateTime.LocalNow give the current datetime. DateTime.Date gives us back the date part only (we don't care if it's 1am or 5pm!). Date.AddDays now uses that date, which is today's date, and tells us what the date was 10 days ago.
So, x is a Date data type and is holding the value for 20 January 2021, as today is 30 January 2021.
7. We now use this date instead of the manually entered one by replacing the 20210115 with this Date.Year(x)*10000 + Date.Month(x)*100 + Date.Day(x). The whole line becomes:
#"Filtered Rows" = Table.SelectRows(dbo_MockDataDateID, each [DateID] >= Date.Year(x)*10000 + Date.Month(x)*100 + Date.Day(x))
And that is really all you need to have it now be a constant 10 day rolling window of data. If you want to be able to adjust that time frame once it's published, you can also make that -10 a parameter.
Go to Home ribbon > Manage Parameters > New Parameter to create a your parameter.
Let's call it DaysBack, giving it a nice description for when you forget what it's for later, make it a number (decimal), and default it to -10.
And finally, lets use it in our query! Instead of -10 change it with DaysBack.
Now, you can go ahead and publish your report to your workspace. When you open it up the first time you can follow the prompts to enter your credentials to get it connected and you will see in that scheduled refresh page a section for the parameters. You can adjust the DaysBack here to your desired days back going forward!
Scenario 2: Incremental refresh set up on a table with DateID
What is interesting about this scenario is that the documentation for setting up incremental refresh does have this tip, but when I first encountered it I could not figure out how to implement it! I read that little green tip box many times! This is what they provide:
(x as datetime) => Date.Year(x)*10000 + Date.Month(x)*100 + Date.Day(x)
I assume you have already followed along the documentation for setting up incremental refresh and already created your RangeStart and RangeEnd parameters, both of which are DateTimes. (Careful to put that RangeEnd far into the future or your data will stop at that date.)
1. Filter the DateID manually like you would with the RangeStart and RangeEnd, making sure to only have one end with "or equal to".
2. Click OK and take a look at the advanced editor to see that row we created manually:
3. Your line 4 should be similar (it references the line before, called "dbo_MockDataDateID", so yours will most like be different). We are going to modify this line to use the helpful tip to convert the datetime to DateID format of YYYYMMDD.
each [DateID] > Date.Year(RangeStart)*10000 + Date.Month(RangeStart)*100 + Date.Day(RangeStart) and [DateID] <= Date.Year(RangeEnd)*10000 + Date.Month(RangeEnd)*100 + Date.Day(RangeEnd)
And that's it! Now you can close and apply, then go to model view to set up your incremental refresh on that table as usual.
Let me know if you find these tips helpful!