Data Analysis Expressions (DAX) is a powerful language used in Power BI to create complex calculations and data analysis. One of the key features of DAX is the ability to use variables, which can significantly enhance the efficiency and readability of your code. This article explores the benefits of using variables in DAX, provides examples, and discusses their impact on performance.
Variables in DAX offers you to break your logic into smaller steps by storing specific parts of your logic in variables. These variables can then be referenced in multiple areas within your DAX.
Using DAX variables has many benefits. Essentially, a DAX variable enables you to store DAX logic that can be used later in your DAX code. This makes it possible to break down your code into smaller parts or steps. This, in turn, allows for code-reusability, cleaner and more structured code .Not only can this result in increased performance, following this practice will also greatly simplify future maintenance as well as help with documentation of your code logic.
Benefits of DAX variables
When it comes to software development, reusing code has lots of benefits. As such, variables in DAX offer numerous advantages, from improved performance to easier code maintenance. Let’s jump into these areas in detail.
Performance
Reusing code in DAX can improve the performance of your DAX. Using DAX variables allows you to define DAX statements once that can be reused in multiple parts of your code.
Let’s take an example and say you want to calculate your profit margin. This is done by subtracting cost from your sales amount and dividing the result by your sales amount.
The formula could look like this:
DIVIDE(
SUM
([Sales amount] - [Cost]) / [Sales amount])
As you can see, the reference to measure [Sales amount] is defined twice. Which could result in that measure being computed twice. Instead, you could store the reference to measure [Sales amount] only inside a DAX variable.
By doing so, the measure is computed only once.
VAR _SalesAmount = [Sales amount]
Another example could be situations where you want to check a measure for a specific value using multiple SWITCH or IF statements. By using a variable you can define the value of the measure once, and then reference the variable inside your SWITCH/IF statements instead.
The performance aspect is not the only important aspect of using DAX variables, some might say it’s not even the most important one. However, as your Power BI solution scales, reusing code can have a large positive effect. If you are using DirectQuery semantic models, it might have an even bigger impact.
Reducing number of errors
One other aspect of code reusability is that it allows you to minimize both bugs and errors in your code.
By breaking down your code into smaller steps, you will minimize the risk of mistakes. You can also define parts of you logic once and then reuse it in multiple places, which will reduce risks for multiple variations of the same logic.
Maintenance and code readability
Breaking down your code into smaller parts and steps will make your code easier to read and make it easier to document. Having your logic broken down in smaller steps will also make it easier to debug and troubleshoot. Carefully designing and defining your DAX variables in steps will make your code easier to follow and understand. This will not only help yourself, but especially for other developers who needs to work with your code. And trust me, there will be others.
Furthermore, using DAX variables will make it easier to adjust logic or add new features to existing code.
Showing DAX Variables by example
Let’s start with setting up some typical measures. In this example we are using three measures. Sales amount, Sales amount PY and Sales amount YoY.
Sales amount =
CALCULATE (
SUMX (
'Sales',
'Sales'[Net Price] * 'Sales'[Quantity]
)
)
Sales amount PY =
CALCULATE (
[Sales amount],
CALCULATETABLE (
DATEADD (
'Date'[Date],
-1,
YEAR
)
)
)
Sales amount YoY =
VAR _SalesAmountCurrentYear = [Sales amount] //Sales amount value current period
VAR _SalesAmountPreviousYear = [Sales amount PY] //Sales amount value previous year
VAR CheckForBlanks =
NOT ISBLANK ( _SalesAmountCurrentYear )
&& NOT ISBLANK ( _SalesAmountPreviousYear ) //Check if blank values does not exists
VAR Result =
IF (
CheckForBlanks,
_SalesAmountCurrentYear - _SalesAmountPreviousYear //Calculate YoY Value
)
RETURN
Result
As you can see from the DAX code above, we are using DAX variables in measure Sales amount YoY. This allows us to define Sales amount and Sales amount PY only once, and refer to that definition multiple times in the code.
The DAX variable _SalesAmountCurrentYear holds the computed values for Sales amount and DAX variable _SalesAmountPreviousYear holds the computed values for Sales amount PY.
Furthermore, there is a variable called CheckForBlanks that checks for existing blank values in any of the two previous variables.
Lastly, we do a simple check with an IF statement before calculating the year over year value by subtracting the previous year sales amount from the current year sales amount.
Working with variables this way enables us to clearly define separate parts of logic that is only defined once in the code. As mentioned earlier, this has many benefits. For example, if you change the measure for Sales amount PY, you would only need to change the reference once in Sales amount YoY.
Testing and debugging the code is also easier. For example, instead of returning variable Result, you can choose to temporarily return any other variable to check what values it returns as output. Without variables this would be a much more tedious task to do.
Investigating performance
As mentioned previously, using DAX variables and reusing code generally increases the performance of you semantic model. Depending on your data model and complexity of your DAX code, this might not have a huge impact on import mode semantic models. However, when using DirectQuery semantic models, the impact can be much greater. Let’s take a look at an example.
Performance using DAX Variables
By using DAX Studio, we can examine the execution times of different variations of the measure Sales amount YoY.
Let’s start by returning values for Sales amount YoY for each year, by using the DAX variables as described earlier in this article.
Let us analyze the results.
The total time of exectution is 604 ms. Out of those 604 ms, 475 ms was spent on the storage engine exeuction, which in this case is SQL Server executing the query at my underlying data source.
We can also see that 5 Storage engine SQL queries was generated. Which means that 5 SQL queries was sent to my underlying SQL server, where some of those has been sent sequentially.
Each of these SE queries has been generated to accommodate the need for different parts of the logic. In this case, we are not only including one measure Sales amount YoY. We are also referring to two other measures that in turn has specific logic which needs certain data to be computed. For example, the first query returns a list of all dates, most likely to be able to calculate the Sales amount PY measure, which has a DATEADD function. The next SE query returns a list of all the years, which is included in the DAX query.
Performance without DAX Variables
Now, let’s alter the measure Sales amount YoY. This time, we are not going to use reusable variables for measures Sales amount and Sales amount PY. Instead, we are going to reference those measures directly in all parts of the code.
Lets analyze the results.
The total execution time is now 1198 ms which is an increase of 594 ms or 98% compared to the precious execution.
SE execution time is 972 ms which is an increase of 497 ms or 104% compared to the previous execution.
Furthermore, we now have 8 SE queries instead of the previous 5. An increase of 3 SE queries.
The reason for the additional SE queries and the increase in execution times is that DAX now needs to compute the measures Sales amount and Sales amount PY multiple times. Instead of only once in the previous version. Using a DirectQuery semantic model means additional underlying queries are being generated.
Last words about DAX Variables
Even though the name variable implies that the values can change, this is not the case. A DAX variable is not a traditional variable in that sense. It is more of a constant that gets its value assigned once and only once.
DAX variables are computed in the context of the definition of the variable. If you call a variable from within a CALCULATE function that has a filter context, the value of the variable will not be affected.
DAX variables can contain both scalar and table values. This means that you can also reuse whole tables within e.g. a measure.
Summary
As you can see, using variables in DAX has many advantages. Make sure to utilize these in order to make your model cleaner, faster and easier to maintain. Furthermore, using DAX variables is considered a standard among experienced Power BI and DAX developers.
Furthermore, it’s important to understand the effect on DirectQuery semantic models. Since using DAX variables might have an even bigger impact on the performance of such models. Make sure that you go through your DAX code. Test and benchmark accordingly to speed up the query time of your semantic model, and also your Power BI reports.
Curious about how you can utilize DAX Studio to troubleshoot your Power BI DAX queries? Read my two-part article on how to level up your Power BI semantic model.
Curious about other performance tuning tips? Read my Top tips to speed up your Power BI DirectQuery semantic models or go deeper by reading my articles about mastering DirectQuery performance in Power BI – start here
Thank you for reading!
Did you enjoy this? Or do you have any other thoughts on using DAX variables? Please let me know in the comments
Make sure to subscribe to get notified on new content!