First, I want to clarify that DAX data types are not something exclusive to Power BI. Since the DAX language is common for data models (*data models*) for Power BI, for Power Pivot for Excel, and for SSAS Tabular. Therefore, everything explained here is applicable to all three products, since they share the same storage engine: Vertipaq.

We will classify them into various groups with common characteristics to facilitate understanding:

## DAX data types: numeric

They all have in common the occupation of their storage space **8 bytes**.

**Decimal number**: They are floating point numbers. It can store values of up to 15 digits, positive and negative, where the decimal point can be placed anywhere in the number**Fixed decimal number**– The decimal point has a fixed position. It can store values of up to 19 digits, “always” storing 4 decimal digits**Whole number**: Allows up to 19 digits in the integer part, but does not allow decimals

It is clear that if we do not have to memorize the decimals we will use the Whole Number type. Now, where we need to evaluate in detail the type of data to use is when we need to store decimals and not get carried away by the wizards and values that are assigned by default when importing data from the source because they can cause problems. We should always do a thorough review of the data type assigned to each column of all tables in our data model.

Whenever we store financial and management transaction values that need to store numbers with decimals, the best option is the fixed decimal number (integer), whose only exception is that we need to store more than 4 decimal places, in which case we do not we have no other alternative than to use the decimal number.

And why this decision? Because when working with data stored in floating point, small variations in calculations may occur due to precision, which although for a few tens of rows the consequences are negligible, we usually have thousands or millions of rows, so when we use aggregation it works ( which happens in practically all our data models), these variations occur, those imbalances that we cannot admit when we present the results and which, in this case, are not due to an error in the DAX formulas, but rather to the error committed by not assigning the appropriate data type to the columns involved in the calculations.

In summary, based on the above, this is my recommendation regarding the numeric data types to use in the columns of your model tables:

- If there are no decimals -> Integer
- If there are up to 4 decimal places -> Fixed decimal number. For most financial transactions 2 decimal places are sufficient, so it will not be common for an overflow to occur due to the habitual use of this type of data.
- If there are 5 decimal places or more -> Decimal number. It’s not that it seems like the most appropriate option, but since we don’t have a data type that allows us to store more than 4 decimal places without using floating point, it’s the only viable alternative.

Finally, I leave you this link to an article in English that addresses this topic in more depth: Choosing numeric data types in DAX.

## DAX data types: date and time

An important detail to keep in mind is that, although they are not visible, internally they are stored as a sort of decimal number, which stores the time difference between the indicated date and 1/1/1900. Stores the date in the integer part of the internal number and the time in the decimal part.

We have three types of data to store dates and times:

**Appointment**– stores date and time values**Date**: stores only the date part, with time values always equal to zero (decimal part)**Time**: stores only the hourly part, with the date part always composed of zeros (integer part, which is therefore equivalent to 1/1/1900)

Of course, it is not necessary to know these numerical storage details that I talked about, but at any given moment, if something does not give the expected result, understanding these internal details can be very useful in solving specific problems.

## Other DAX data types (text, boolean, binary)

To store text values, we have a single data type: **Text (Text)**. Stores Unicode character strings up to 512 MB. Therefore, without going into more technical details, you can store very long character strings :), up to 256 million Unicode characters. It is case insensitive, i.e. hello = Hello = HELLO.

We also have a data type called boolean **True False**which only allows these two values, the “false” value is stored internally with a 0 (zero) and the “true” value with a 1.

Finally, we have the type **Binary (Binary)**, which allows us to store various objects and is mainly used to store photos. You will think, why do I want a photo in an analytical model? Well, mainly to improve the user interface and usability. You can use it, for example, to filter by photo instead of a description, or to view those photos in a report or dashboard.

I leave you a link to the official Power BI website where you can get more information: Data types in Power BI Desktop.

## DAX data types and table relationships

It is very important to understand all the characteristics and behaviors of each of the DAX data types, since in addition to applying the ideal data type to each column and avoiding the possible problems mentioned above, it is necessary to establish a relationship between two tables, the columns involved in said relationship must be of the same data type.

## Data types and formats

Data types affect internal storage and calculation results. It is essential, as we have mentioned, to use the appropriate types of data to obtain the desired and correct results. The format only affects the display.

It is also very important to use the appropriate format, which is also consistent with the type of data used and the number of decimal places obtained in the calculations. For example, if you use the integer data type, it makes no sense to put 4 decimal places in the format. Just because it doesn’t make sense doesn’t mean it can’t be done, obviously it can be done, what will happen is that five zeros will always appear in the decimal part.

If, however, we enter a format with fewer decimals than those stored by that type of data, only those will be shown, performing the implicit rounding. For example, if we have a column with fixed integers and a two-decimal format, implicit rounding will be performed. Although in this case it is advisable to do an explicit rounding at the time of loading using the Query calculation functions and store it already rounded to the second decimal place.

## Conclusions

I hope that from this moment on you give all the importance you have to the use of appropriate data types, to explicit rounding calculations in the process of extracting, transforming and loading the source data into the model, and that you avoid those therefore “mismatches” unwanted in the analytical results????

If you want to delve deeper into this and many other fundamental aspects of Power BI, data analysis, applying appropriate methods and good practices, I recommend you read carefully…

## Leave a Reply