We've all been there: you open Power BI, drag a messy Excel sheet onto the canvas, and start dropping charts until something looks "right." It's easy, it's intuitive, and honestly, that's why Power BI is one of my favourite tools for data visualisation.
But as the world of data shifts toward end-to-end solutions like Microsoft Fabric, "just making it work" isn't enough anymore. Large organisations need models that are performant, secure, and scalable.
I've decided to challenge myself by taking the PL-300: Microsoft Data Analyst Associate exam. But instead of just grinding through practice exams or memorising definitions, I'm going into "Practical Mode." If I'm going to get certified, I want to prove I can solve the problems real businesses actually face.
The Project: The Enterprise-Grade Financial Suite
For my first project, I'm tackling the Executive Financial Health Suite.
Why finance? Because in the enterprise world, it's the ultimate test of your Data Modeling and DAX skills. Most "generic" tutorials use a single, flat table. But in a real company, data is fragmented. You have "Actuals" (what happened) sitting in one place and "Budgets" (the goal) sitting in another, usually at different levels of detail.
In this project, I'm going to document how I:
- Deconstruct a "Flat Table" into a clean, professional Star Schema.
- Handle the "Grain" Problem (comparing daily sales vs. monthly budgets).
- Master DAX for high-stakes metrics like Year-to-Date (YTD) and Variance %.
I'm sharing my journey in public so that if you're also preparing for the PL-300, you can follow along, build these solutions with me, and understand the why behind the architecture, not just the how.
For this project, we're using the Microsoft Financial Sample. It's the perfect "blank canvas" because it comes as a flat, "messy" table that we have to re-engineer professionally.
How to get it: In Power BI Desktop, go to Home > Sample Dataset > Load Sample Data. Select the financials table.
Let's get our hands dirty in Power Query.
Phase 1: Data Transformation (Power Query)
Before touching DAX or visuals, I slowed myself down and spent real time in Power Query. This is the part I used to rush through. Now I treat it as the foundation of everything that follows.
If the data model is shaky, no amount of clever DAX will save you.
Step 1: Data Profiling (a quick reality check)
Once I loaded the Microsoft Financial Sample dataset, the first thing I did was turn on column profiling:
- Column quality
- Column distribution
- Column profile
When I turn on Column quality, distribution, and profile, I'm not trying to be thorough for the sake of it. I'm scanning for model-breaking issues before they turn into DAX headaches.
Column profiling immediately tells you:
- Where nulls are hiding
- Which columns are pretending to be dimensions
- Which fields look numeric but behave like text
1. Nulls & Data Type Mismatches
I think we're good. Empty values are 0% throughout, valid values are 100%, and errors are 0%. Data types are all fine, too. Probably because we're using the sample financials dataset, there shouldn't be any issues.
2. Cardinality: What Wants to Be a Dimension
Cardinality is simply how many unique values a column has. Power BI surfaces this directly in Column distribution, and once you start paying attention to it, modeling decisions get much easier.
Here's my rule of thumb:
- Low cardinality (values repeat a lot) → likely a dimension
- High cardinality (values are mostly unique) → fact-level detail
When I turn on Column distribution, I'm asking two questions:
- How many distinct values does this column have?
- Do these values repeat enough to be useful for filtering or grouping?
If a column looks categorical but has thousands of distinct values, that's a red flag.
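If you'd rather see those signals in one table instead of reading the little column headers one by one, Power Query's built-in Table.Profile does the job. A minimal sketch, assuming the fact query is named financials (adjust the name if yours differs):

let
    Source = financials,
    // one row per column, including Count, NullCount and DistinctCount
    Profile = Table.Profile(Source)
in
    Profile

DistinctCount is exactly the cardinality number I care about here, and NullCount backs up the "0% empty" reading from Step 1.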
Once I turned on Column distribution, the dataset started sorting itself for me.
Some columns immediately showed low cardinality: they repeated often and behaved like true categories:
- Segment
- Country
- Product
- Discount Band
- Manufacturing Price
- Sale Price
- Date attributes (Year, Month Number, Month Name)
These columns had relatively few distinct values and clear repetition across rows. That's a strong signal: these want to be used for grouping, slicing, and relationships. In other words, they naturally belong on the dimension side of a star schema.
Then there were the columns at the other end of the spectrum.
Measures like:
- Units Sold
- Gross Sales
- Discounts
- Sales
- COGS
- Profit
…showed very high cardinality. Many values were unique or nearly unique per row, with wide numeric ranges. That's exactly what I expect from fact-level metrics: they're meant to be aggregated, not filtered on.
That insight directly informed my next step: using Reference in Power Query to spin off Dim_Product and Dim_Geography, instead of guessing or forcing the structure.
Step 2: Spinning Dimensions with Reference (Not Duplicate)
This is the point where I stopped treating the dataset as a report-ready table and started treating it as a model-in-progress.
In Power Query, it's tempting to right-click a table and hit Duplicate. I used to do that all the time. It works, but it quietly creates problems you only feel later.
Instead, I used Reference.
Why Reference instead of Duplicate, you may ask?
When you create a referenced table:
- It inherits all upstream transformations
- It stays logically tied to the source
- Any fix in the fact table automatically flows downstream
From a real-world perspective, it's just… safer.
Here's how I created Dim_Product & Dim_Geography.
Starting from the main financial table:
- I right-clicked the query and selected Reference
- Renamed the new query to Dim_Product
- Kept only product-related columns (Product, Segment, Discount Band)
- Removed duplicates
- Ensured clean data types and naming
What I ended up with was a small, stable table with low cardinality, perfect for slicing and grouping.
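Under the hood, those clicks generate M along these lines. This is just a sketch, assuming the fact query is named financials; the Reference itself is nothing more than the first step pointing at that query:

let
    // Reference: start from the output of the fact query
    Source = financials,
    // keep only the product-related attributes
    KeepProductColumns = Table.SelectColumns(Source, {"Product", "Segment", "Discount Band"}),
    // one row per unique combination
    Dim_Product = Table.Distinct(KeepProductColumns)
in
    Dim_Product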
I repeated the same approach for geography:
- Reference the fact table
- Keep the Country column
- Remove duplicates
- Clean text values
P.S. In this dataset, geography is represented only at the country level. Rather than forcing a region or city hierarchy that doesn't exist, I modeled Country as a lean, single-column dimension.
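For completeness, the geography query is the same pattern, just narrower (again a sketch against the financials query):

let
    Source = financials,
    // a single-column dimension: one row per country
    Dim_Geography = Table.Distinct(Table.SelectColumns(Source, {"Country"}))
in
    Dim_Geography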
Step 3: Create a Dynamic Date Table
Here's where I see a lot of Power BI models quietly fail PL-300 standards.
- I didn't import a static calendar.
- I didn't manually generate dates.
- I built a dynamic date table in Power Query based on the data itself.
Why this matters:
- It ensures no missing dates
- It automatically adjusts when new data arrives
- It aligns perfectly with Microsoft's modeling best practices
To create a dynamic date table, just click Load -> Blank Query -> Advanced Editor and paste the code in.
Below is the exact M code I used:
let
    // reference the fact query (rename this if your fact query isn't called "financials")
    Source = financials,
    MinDate = Date.From(List.Min(Source[Date])),
    MaxDate = Date.From(List.Max(Source[Date])),
    DateList = List.Dates(
        MinDate,
        Duration.Days(MaxDate - MinDate) + 1,
        #duration(1, 0, 0, 0)
    ),
    DateTable = Table.FromList(DateList, Splitter.SplitByNothing(), {"Date"}),
    // Table.FromList leaves the column as type any, so set it to date explicitly
    TypedDates = Table.TransformColumnTypes(DateTable, {{"Date", type date}}),
    AddYear = Table.AddColumn(TypedDates, "Year", each Date.Year([Date]), Int64.Type),
    AddMonthNum = Table.AddColumn(AddYear, "Month Number", each Date.Month([Date]), Int64.Type),
    AddMonthName = Table.AddColumn(AddMonthNum, "Month Name", each Date.MonthName([Date]), type text),
    AddQuarter = Table.AddColumn(AddMonthName, "Quarter", each "Q" & Number.ToText(Date.QuarterOfYear([Date])), type text),
    AddDay = Table.AddColumn(AddQuarter, "Day", each Date.Day([Date]), Int64.Type)
in
    AddDay
This calendar:
- Covers every date in the dataset
- Scales automatically
- Is ready for time intelligence the moment it hits the model
Once loaded, I marked it as a Date Table in the Model view, which is non-negotiable for PL-300.
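To convince myself the calendar really is gap-free, I like a throwaway sanity-check measure. This one is just an illustration (not part of the final model) and should return 0 when every day between the first and last date is present:

Date Gap Check :=
VAR FirstDate = MIN ( Dim_Date[Date] )
VAR LastDate = MAX ( Dim_Date[Date] )
RETURN
    // days the calendar should contain minus the rows it actually has
    ( DATEDIFF ( FirstDate, LastDate, DAY ) + 1 ) - COUNTROWS ( Dim_Date )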
By the end of Phase 1, I had:
- A clean fact table
- Proper dimension tables created via Reference
- A dynamic, gap-free date table
- Transformations I could actually explain to another analyst
Nothing flashy yet, but this is the phase that makes everything after it easier, faster, and more reliable.
In the next section, I'll move into data modeling and relationships, where this structure really starts paying dividends.
Phase 2: Data Modeling (From Tables to a Star Schema)
This is the phase where Power BI starts behaving like a semantic model.
By the time I switched to the Model view, I already had:
- A clean fact table
- Lean dimensions created via Reference
- A dynamic, gap-free date table
Now the goal was simple: connect everything cleanly and deliberately.
Step 1: Establishing the Star Schema
I aimed for a classic star schema:
- One central fact table (financial metrics)
- Surrounding dimension tables (Dim_Date, Dim_Product, Dim_Geography)
Every relationship needed to answer three questions:
- Which table is the "one" side?
- Which table is the "many" side?
- Does this relationship make sense at the grain of the data?
You may notice that I didn't introduce surrogate keys for the fact or dimension tables. In this dataset, the natural keys (Country, Product, and Date) are stable, low-cardinality, and unambiguous. For this model, adding artificial IDs would increase complexity without improving clarity or performance.
Here's how the overall model looks:
Step 2: Relationship Direction (Single, on Purpose)
All relationships were set to:
- Many-to-one
- Single direction, flowing from dimension → fact
For PL-300 and real-world models alike, single-direction filters are the default until there's a strong reason not to use them.
Step 3: Date Table as the Anchor
The dynamic date table I created earlier now became the backbone of the model.
I:
- Related Dim_Date[Date] to the fact table's date column
- Marked Dim_Date as the official Date Table
- Hid the raw date column in the fact table
This does three important things:
- Enables time intelligence
- Prevents accidental use of the wrong date field
- Forces consistency across measures
From here on out, every time-based calculation flows through this table, no exceptions.
Step 4: Hiding What Users Don't Need
This is a small step with an outsized impact. PL-300 explicitly tests the idea that the model shouldn't just be correct; it should be usable.
I hid:
- Foreign keys (Date, Product, Country). If a column exists only to create relationships, it doesn't need to appear in the Fields pane.
- Raw numeric columns that should only be used through measures. After creating my DAX measures (e.g. Total Sales, Total Profit), I can go ahead and hide the raw numeric columns (Units Sold, Gross Sales, Discounts, Sales, COGS, Profit) in the fact table. This nudges users toward correct and consistent aggregations.
- Duplicate date attributes in the fact table (Year, Month Number, Month Name). These already exist in the date table.
Step 5: Validating the Model (Before Writing DAX)
Before touching any measures, I did a quick sanity check:
- Does slicing by Country behave correctly?
- Do Product and Segment filter as expected?
- Do dates aggregate cleanly by Year and Month?
If something breaks here, it's a modeling issue, not a DAX issue.
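One quick way to run that kind of check, before building any visuals, is a throwaway query in DAX query view. Here's a sketch of the Profit-by-Year test (using the raw column directly, since the measures only arrive in Phase 3):

// Profit by Year: the totals should match the source data
EVALUATE
SUMMARIZECOLUMNS (
    Dim_Date[Year],
    "Total Profit", SUM ( financials[Profit] )
)
ORDER BY Dim_Date[Year]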
To test this, I created a quick visual checking the Sum of Profit by Year. Here's how it turned out:

So far so good! Now we can move on to creating our DAX measures.
Phase 3: DAX Measures & Variance Analysis (Where the Model Starts to Shine)
This is the phase where the work I'd done in Power Query and the model really started paying off. Honestly, it's the first time in a while that writing DAX didn't feel like fighting the table. The star schema made everything… predictable.
Step 1: Base Measures, the foundation of sanity
I resisted my old instinct to drag raw columns into visuals. Instead, I created explicit measures for everything I cared about:
Total Sales :=
SUM ( financials[Sales] )
Total Profit :=
SUM ( financials[Profit] )
Total Units Sold :=
SUM ( financials[Units Sold] )
Total COGS :=
SUM ( financials[COGS] )
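A nice side effect of explicit base measures is that derived metrics compose cleanly on top of them. For example (my own addition here, not part of the core set), a margin measure becomes a one-liner:

Profit Margin % :=
DIVIDE ( [Total Profit], [Total Sales] )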
Step 2: Time intelligence without surprises
Because I already had a complete, properly marked date table, things like year-to-date or prior-year comparisons were easy.
Sales Year-to-Date
Sales YTD :=
TOTALYTD (
    [Total Sales],
    'Dim_Date'[Date]
)
Sales Prior Year
Sales PY :=
CALCULATE (
    [Total Sales],
    SAMEPERIODLASTYEAR ( 'Dim_Date'[Date] )
)
Step 3: Variance measures, turning numbers into insight
Once I had Actual vs Prior Period, I could calculate variance with almost no extra effort:
Sales YoY Variance :=
[Total Sales] - [Sales PY]
Sales YoY % :=
DIVIDE (
    [Sales YoY Variance],
    [Sales PY]
)
Same approach for month-over-month:
Sales PM :=
CALCULATE (
    [Total Sales],
    DATEADD ( 'Dim_Date'[Date], -1, MONTH )
)
Sales MoM Variance :=
[Total Sales] - [Sales PM]
Sales MoM % :=
DIVIDE (
    [Sales MoM Variance],
    [Sales PM]
)
Step 4: Why this feels "easy"
Here's the honest part: writing DAX didn't feel like the hardest thing. The hard part was everything that came before:
- Cleaning the data
- Profiling columns
- Spinning out dimensions with Reference
- Building a solid date table
By the time I got here, the DAX was just adding value instead of patching holes.
Good DAX isn't clever; it's predictable, trustworthy, and easy to explain.
Conclusion
The magic wasn’t in any single DAX method — it was in how the mannequin got here collectively. By profiling the information early, understanding cardinality, and spinning dimensions with Reference, I constructed a construction that simply works. A dynamic date desk and clear relationships meant time intelligence measures and variance calculations ran effortlessly.
Hiding pointless fields and grouping measures thoughtfully made the mannequin approachable, even for another person exploring it for the primary time. By the point I wrote the DAX for Precise vs. Prior Interval or Month-over-Month variance, all the things felt predictable and reliable.
If you wish to see the complete semantic mannequin in motion, together with all of the tables, relationships, and measures, you may obtain it here and discover the way it ties collectively. There’s no higher strategy to be taught than seeing a working mannequin in Energy BI and experimenting with it your self.
Wanna connect? Feel free to say hi on any of the platforms below.
