From 8de88bb43ccd6f574211be66d8fbbf6165ef4c6e Mon Sep 17 00:00:00 2001
From: Jon
Date: Sat, 18 Apr 2026 13:56:39 +0200
Subject: [PATCH] Clear all posts; seed 6 placeholders for authoring
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Removes the 5 AI-drafted posts plus the two earlier DAX posts. In their
place, 6 short placeholder stubs — each with a title, category, tags,
excerpt, and a "Draft — outline only" body — so the homepage, sidebar,
and feed render meaningfully while the content is authored.

Placeholder spread

- Power BI     2026-04-10  model-review-playbook
- DAX          2026-03-25  filter-context-mental-model
- Power Query  2026-03-12  m-patterns-for-messy-sources
- AI           2026-02-28  ai-on-analytics-work
- Tooling      2026-02-15  tabular-editor-workflows
- Craft        2026-02-02  notes-on-analytics-engineering

Six categories, three months (Feb, Mar, Apr), so the Categories panel
and the Archive panel both have variety to render.

Also updates posts.json, feed.xml, and sitemap.xml to match.

Co-Authored-By: Claude Opus 4.6
---
 content/posts.json                            |  91 +++++------
 .../ai-for-dax-without-losing-control.md      | 124 ---------------
 content/posts/ai-on-analytics-work.md         |  18 +++
 .../all-allexcept-allselected-mental-model.md | 144 -----------------
 content/posts/claude-code-for-pbi-refactor.md | 148 ------------------
 .../posts/dax-measure-definition-standards.md |  79 ----------
 .../posts/dax-readability-formatting-guide.md | 131 ----------------
 .../display-folders-vs-naming-conventions.md  | 121 --------------
 content/posts/filter-context-mental-model.md  |  18 +++
 content/posts/m-patterns-for-messy-sources.md |  18 +++
 content/posts/model-review-playbook.md        |  18 +++
 .../posts/notes-on-analytics-engineering.md   |  18 +++
 .../reviewing-a-data-model-you-didnt-build.md | 127 ---------------
 content/posts/tabular-editor-workflows.md     |  18 +++
 feed.xml                                      |  83 +++++-----
 sitemap.xml                                   |  29 ++--
 16 files changed, 197 insertions(+), 988 deletions(-)
 delete mode 100644 content/posts/ai-for-dax-without-losing-control.md
 create mode 100644 content/posts/ai-on-analytics-work.md
 delete mode 100644 content/posts/all-allexcept-allselected-mental-model.md
 delete mode 100644 content/posts/claude-code-for-pbi-refactor.md
 delete mode 100644 content/posts/dax-measure-definition-standards.md
 delete mode 100644 content/posts/dax-readability-formatting-guide.md
 delete mode 100644 content/posts/display-folders-vs-naming-conventions.md
 create mode 100644 content/posts/filter-context-mental-model.md
 create mode 100644 content/posts/m-patterns-for-messy-sources.md
 create mode 100644 content/posts/model-review-playbook.md
 create mode 100644 content/posts/notes-on-analytics-engineering.md
 delete mode 100644 content/posts/reviewing-a-data-model-you-didnt-build.md
 create mode 100644 content/posts/tabular-editor-workflows.md

diff --git a/content/posts.json b/content/posts.json
index d8cc3a1..f8c717c 100644
--- a/content/posts.json
+++ b/content/posts.json
@@ -1,75 +1,64 @@
 {
   "posts": [
     {
-      "slug": "reviewing-a-data-model-you-didnt-build",
-      "title": "Reviewing a data model you didn't build",
-      "date": "2026-04-18",
-      "category": "Craft",
-      "tags": ["Review", "Power BI", "Modeling"],
-      "excerpt": "A structured approach to picking up an unfamiliar Power BI or tabular model — what to read first, what to measure, and which smells to treat as red flags.",
-      "readTime": 7,
-      "author": { "name": "Jonathan Pap" }
-    },
-    {
-      "slug": "claude-code-for-pbi-refactor",
-      "title": "Using Claude Code to refactor a Power BI project",
-      
"date": "2026-04-17", - "category": "AI", - "tags": ["Claude", "Power BI", "Refactoring"], - "excerpt": "A practical note on using an agentic coding tool for tabular model work — where it helps, where it makes things worse, and how to keep a human in the loop.", - "readTime": 7, + "slug": "model-review-playbook", + "title": "Model review playbook", + "date": "2026-04-10", + "category": "Power BI", + "tags": ["Power BI", "Review", "Governance"], + "excerpt": "Draft — a working checklist for reviewing a Power BI model, whether you built it or inherited it.", + "readTime": 5, "author": { "name": "Jonathan Pap" } }, { - "slug": "ai-for-dax-without-losing-control", - "title": "Using AI to write DAX without losing control of your model", - "date": "2026-04-16", - "category": "AI", - "tags": ["DAX", "AI", "Workflow"], - "excerpt": "A workflow for getting high-quality DAX out of an LLM without letting it silently reshape your semantic model.", + "slug": "filter-context-mental-model", + "title": "A working mental model for filter context", + "date": "2026-03-25", + "category": "DAX", + "tags": ["DAX", "Filter Context"], + "excerpt": "Draft — the explanation of filter context I wish I'd had early on, with worked examples.", "readTime": 6, "author": { "name": "Jonathan Pap" } }, { - "slug": "all-allexcept-allselected-mental-model", - "title": "ALL, ALLEXCEPT, ALLSELECTED: a mental model for filter context removal", - "date": "2026-04-15", - "category": "DAX", - "tags": ["DAX", "Filter Context", "Reference"], - "excerpt": "A one-page mental model for DAX's three filter-removal functions — when each one applies and why picking the wrong one silently breaks totals.", - "readTime": 6, + "slug": "m-patterns-for-messy-sources", + "title": "M patterns for messy source data", + "date": "2026-03-12", + "category": "Power Query", + "tags": ["Power Query", "M", "Patterns"], + "excerpt": "Draft — Power Query patterns I reach for when the source was built for something other than analytics.", + "readTime": 5, "author": { "name": "Jonathan Pap" } }, { - "slug": "display-folders-vs-naming-conventions", - "title": "Display folders vs. naming conventions: organizing a growing measure model", - "date": "2026-04-14", - "category": "Power BI", - "tags": ["Power BI", "Governance", "Standards"], - "excerpt": "A mid-size Power BI model needs both display folders and naming conventions. 
Here's how each earns its keep, and where they tend to collide.", + "slug": "ai-on-analytics-work", + "title": "Using AI on analytics work", + "date": "2026-02-28", + "category": "AI", + "tags": ["AI", "Workflow"], + "excerpt": "Draft — where LLMs genuinely help on analytics tasks, where they quietly hurt, and the workflow changes that tilt the ratio.", "readTime": 5, "author": { "name": "Jonathan Pap" } }, { - "slug": "dax-measure-definition-standards", - "title": "DAX Measure Definition Standards", - "date": "2026-02-14", - "category": "Power BI", - "tags": ["DAX", "Standards", "Modeling"], - "excerpt": "A practical standard for naming, formatting, and organizing DAX measures so teams can keep models clear and maintainable.", + "slug": "tabular-editor-workflows", + "title": "Tabular Editor workflows worth knowing", + "date": "2026-02-15", + "category": "Tooling", + "tags": ["Tabular Editor", "Tooling", "Power BI"], + "excerpt": "Draft — the Tabular Editor scripts and habits that earn back their learning cost.", "readTime": 4, "author": { "name": "Jonathan Pap" } }, { - "slug": "dax-readability-formatting-guide", - "title": "Better DAX Readability: Syntax, Formatting, and Tools", - "date": "2026-02-09", - "category": "Power BI", - "tags": ["DAX", "Formatting", "Tabular Editor"], - "excerpt": "A practical guide to writing readable DAX with consistent syntax, short-line formatting, and tool-based workflows using Bravo and Tabular Editor.", - "readTime": 8, - "author": { "name": "Jonathan Pap" }, - "image": "./assets/img/posts/dax-readability-formatting-guide.png" + "slug": "notes-on-analytics-engineering", + "title": "Notes on analytics engineering", + "date": "2026-02-02", + "category": "Craft", + "tags": ["Craft", "Career"], + "excerpt": "Draft — short reflections on what the role actually involves once the initial tooling decisions are behind you.", + "readTime": 5, + "author": { "name": "Jonathan Pap" } } ] } diff --git a/content/posts/ai-for-dax-without-losing-control.md b/content/posts/ai-for-dax-without-losing-control.md deleted file mode 100644 index 6a46811..0000000 --- a/content/posts/ai-for-dax-without-losing-control.md +++ /dev/null @@ -1,124 +0,0 @@ -# Using AI to write DAX without losing control of your model - -LLMs write passable DAX. They write *bad* DAX at the same rate, confidently, -with the same tone of voice. If you're using an AI assistant to speed up -measure authoring, the risk isn't that it refuses to help — it's that it -gives you something plausible that silently violates your model's semantics. - -This is the workflow I've settled on. It's not about prompting tricks. -It's about keeping the human decision points where they belong. - -## The core problem - -DAX is unusual among languages because its correctness depends on **context -you can't see in the code**. The same `SUM( 'Sales'[Amount] )` will return -different numbers depending on: - -- The current filter context (slicers, page filters, row headers) -- Whether there's an active relationship on `Date` -- Whether the current filter context came from the fact or the dimension -- Whether cross-filtering is set to "both" on a relationship - -An LLM sees the DAX. It doesn't see your model. It will happily produce code -that's syntactically valid, looks well-formatted, passes all the obvious -sniff tests — and computes the wrong number because the relationship between -`Sales` and `Date` is inactive. - -## What works - -### 1. 
Give the model the schema, not just the task - -Before asking for a measure, paste: - -- The relevant tables and their key columns. -- Which relationships exist, their cardinality, and which are **active**. -- The existing measures it might depend on (names + one-line purpose). - -A 15-line schema block cuts hallucination rate dramatically. The LLM stops -inventing `[Customer Key]` columns that don't exist because it can see the -ones that do. - -### 2. Ask for the explanation first, code second - -Good prompt shape: - -> "I want a measure that shows X. Before writing DAX, tell me what filter -> context it needs to run in, which columns it will reference, and whether -> it needs time intelligence. Then write the measure." - -When the model has to articulate the semantics in English first, two things -happen: it catches its own misunderstandings, and you get a decision point -before you have code to evaluate. - -### 3. Review the generated DAX against a checklist, every time - -A short checklist catches most of the common failure modes: - -- [ ] Does it use the right filter-removal function? (ALL vs ALLEXCEPT vs ALLSELECTED — see the dedicated post.) -- [ ] Does it reference columns that actually exist in the schema you provided? -- [ ] If there's a time dimension, does it use the marked date table, not `Sales[Date]`? -- [ ] Does it handle the empty / BLANK case explicitly? -- [ ] At the total row of a typical visual, will it return what the name promises? - -The last one is the big one. If the measure is named `Sales % of Total` and -at the total row it reads 8%, either the name or the code is wrong. - -### 4. Test on real data before trusting it - -An LLM-written measure that works on sample data can break on production data -because of things the model never saw: duplicate keys, inactive relationships -being used elsewhere, role-playing dimensions, security filters. - -Minimum viable test: - -- One known value (total of a year you've reconciled elsewhere) -- One edge case (filter to a date range with no sales) -- One total row (does it sum to 100% / the grand total / etc.) - -If all three match, the measure is probably fine. If one doesn't, don't -ship it. - -## What doesn't work - -**"Write me 30 measures for a sales model."** -You'll get 30 measures. Three will be right. Six will be subtly wrong. -The rest will be variations on a theme that's already wrong. You'll spend -longer vetting them than writing them. - -**Iterating without running the code.** -Don't refine the measure by asking the LLM if it looks right. It will always -say yes. Run it, see the number, then iterate. - -**Trusting generated time intelligence blindly.** -Time intelligence functions (`SAMEPERIODLASTYEAR`, `DATESYTD`, etc.) depend -on a properly marked date table with contiguous dates. LLMs assume this; -many models don't have it. Verify once per model that `FIRSTDATE('Date'[Date])` -returns what you expect with no filters, then you can trust subsequent time -intelligence. - -**Letting it name your measures.** -Naming is a model-design decision. The LLM doesn't know your team's -conventions. If you let it pick names, you'll end up with `Total Sales`, -`Sum of Sales`, `Sales Amount`, and `Sales Revenue` all meaning the same -thing across your model. - -## A workable loop - -1. Describe the question in plain English. -2. Paste schema + relevant existing measures. -3. Ask for explanation + code. -4. Rename the measure to fit your convention. -5. Run it against a known value. -6. Check the total row. -7. Commit. 
- -Five of those seven steps are you. The AI saves you the typing in step 3. -That's the whole value proposition — and it's enough, if you don't let the -typing saved in step 3 convince you to skip steps 5 and 6. - -## Closing note - -The mistake isn't using AI for DAX. It's treating the AI's output as a -finished artifact instead of a draft. A draft needs review; a finished -artifact doesn't. Getting that boundary right is the difference between -shipping faster and shipping wrong. diff --git a/content/posts/ai-on-analytics-work.md b/content/posts/ai-on-analytics-work.md new file mode 100644 index 0000000..4b63bc0 --- /dev/null +++ b/content/posts/ai-on-analytics-work.md @@ -0,0 +1,18 @@ +# Using AI on analytics work + +> **Draft — outline only.** A working note on where LLMs genuinely help on analytics tasks, where they quietly hurt, and the small workflow changes that tilt the ratio. + +## What this post will cover + +- What "review-first, generate-second" looks like in practice +- Prompts that pay rent; prompts that waste time +- Why schema-in-prompt beats schema-in-head +- The failure modes I've hit more than once + +## Why it matters + +_Draft pending._ + +## Closing note + +_Draft pending._ diff --git a/content/posts/all-allexcept-allselected-mental-model.md b/content/posts/all-allexcept-allselected-mental-model.md deleted file mode 100644 index 7c2f021..0000000 --- a/content/posts/all-allexcept-allselected-mental-model.md +++ /dev/null @@ -1,144 +0,0 @@ -# ALL, ALLEXCEPT, ALLSELECTED: a mental model for filter context removal - -Most DAX bugs I've debugged in other people's models come down to picking the -wrong filter-removal function. The three that trip people up look similar in -the function list but do genuinely different things — and the difference only -shows up at the total row, which is exactly where reports get read. - -Here's the one-page mental model I've ended up with. - -## The one-sentence version - -- **ALL** — "pretend no filters exist on this." -- **ALLEXCEPT** — "pretend no filters exist on this *except these*." -- **ALLSELECTED** — "pretend no filters from *inside the visual* exist, but - keep the filters the user applied *outside* it." - -If you only remember one thing: **ALL ignores slicers, ALLSELECTED respects -them.** That single distinction explains about 80% of the "my total doesn't -match my rows" bugs I've seen. - -## A worked example - -Say you have: - -- A `Sales` fact table -- A `Date` dimension -- A page-level slicer on `Date[Year]` set to 2025 -- A matrix visual with `Date[Month]` on rows and `[Sales]` in values - -You want a measure that gives "sales as a % of total for the visible period." - -### With ALL - -```dax -Sales % of Total (ALL) := -DIVIDE( - [Sales], - CALCULATE( [Sales], ALL( 'Date' ) ) -) -``` - -The denominator removes **every** filter on `Date`, including the year slicer. -Each month now shows its share of all sales *ever*, not all sales in 2025. -The total row reads `~8%` because 2025 is one year out of many. This is -almost never what a dashboard user expects. - -### With ALLEXCEPT - -```dax -Sales % of Total (ALLEXCEPT) := -DIVIDE( - [Sales], - CALCULATE( [Sales], ALLEXCEPT( 'Date', 'Date'[Year] ) ) -) -``` - -Removes every filter on `Date` **except** `Year`. The year slicer survives. -Monthly rows sum to 100% at the total. Usually what you want when the base -is "the year the user picked." 
- -### With ALLSELECTED - -```dax -Sales % of Total (ALLSELECTED) := -DIVIDE( - [Sales], - CALCULATE( [Sales], ALLSELECTED( 'Date' ) ) -) -``` - -Removes only filters that came from **inside the matrix** (the `Month` row -context) while keeping filters from **outside** (the year slicer, any page -filters, any visual-level filters). - -The effect: each month shows its share of *the year the user picked*, and -the total row reads 100%. This is what most users intuitively mean by -"% of total" on a filtered report. - -## The mental model - -Think of filters as coming from two places: - -1. **User filters** — slicers, page filters, report filters. These are - "what the user asked to look at." -2. **Visual filters** — row / column headers inside the matrix, axis values - on a chart. These are "what the current cell represents." - -Then: - -| Function | User filters | Visual filters | -|---|---|---| -| `ALL(table)` | Removed | Removed | -| `ALLEXCEPT(table, cols)` | Removed except listed cols | Removed except listed cols | -| `ALLSELECTED(table)` | **Kept** | Removed | - -The "user filters vs. visual filters" distinction is the whole game. -`ALLSELECTED` is the only one that splits them; the other two treat -filters as a single pile. - -## When to pick which - -**Use ALL when** the denominator is genuinely a grand total — a KPI card -that says "this year, we did X% of lifetime sales." There's no user context -to preserve; you *want* every filter gone. - -**Use ALLEXCEPT when** you have a specific filter you need to keep and -you're willing to hard-code it. Fragile to model changes (if the column -gets renamed, the measure silently starts aggregating at the wrong level) -but unambiguous. - -**Use ALLSELECTED when** the measure runs inside a visual and you want -"% of what the user is looking at." This is the common case for dashboards. - -## A gotcha that bites everyone - -`ALLSELECTED` behaves differently at the total row than on data rows. -At data rows it removes the row context. At the total row, since there -*is* no row context to remove, it effectively equals `ALL` of the user-visible -selection. - -Concretely: in the matrix above, the row-level `Sales % of Total (ALLSELECTED)` -for March is `March sales / 2025 sales`. The total row is -`2025 sales / 2025 sales = 100%`. That's correct and expected — but it also -means you can't directly use the same measure as a KPI tile outside the visual, -because outside there's no visual to pull "selected" from. - -## A cheap test - -Put all three versions of the measure on the same matrix, add a total row, -and eyeball whether the total reads 100%, ~8%, or something else. The total -is where the semantics show up. If your measure is labeled "% of total" and -the total row isn't 100%, the measure and the label disagree — and the label -will win in the user's head every time. - -## Closing note - -None of these functions is wrong. They answer different questions. - -The failure mode is picking by muscle memory — reaching for `ALL` because -it's the first one you learned — and shipping a dashboard where the total -row quietly means something different from what the column header promises. - -Pick by the question the user is asking, not by the function you typed -last week. 
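A small guard makes the gotcha above concrete. As a minimal sketch, assuming the worked example's `Sales`/`Date` model and its base `[Sales]` measure (the `ISINSCOPE` check and the month grain are illustrative choices, not the only option):

```dax
Sales % of Total (guarded) :=
VAR _Denominator =
    CALCULATE( [Sales], ALLSELECTED( 'Date' ) )
RETURN
    -- Only report a share while 'Date'[Month] is an active grouping level.
    -- Anywhere else (a KPI card, the total row) the "selected" base equals
    -- the whole selection and the share is a trivial 100%, so blank it.
    IF(
        ISINSCOPE( 'Date'[Month] ),
        DIVIDE( [Sales], _Denominator ),
        BLANK()
    )
```

The trade-off is deliberate: the matrix total also goes blank, so keep the unguarded version wherever a 100% total row is the behavior you want.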
diff --git a/content/posts/claude-code-for-pbi-refactor.md b/content/posts/claude-code-for-pbi-refactor.md deleted file mode 100644 index 12afe2b..0000000 --- a/content/posts/claude-code-for-pbi-refactor.md +++ /dev/null @@ -1,148 +0,0 @@ -# Using Claude Code to refactor a Power BI project - -Agentic coding tools are mostly pitched at web developers. They can do more -than that. I've been using Claude Code on Power BI projects — specifically -on the `.pbip` format and the TMDL / BIM files inside it — and while it's -not the silver bullet some vendors imply, it's genuinely useful for a -specific shape of task. - -Here's what works, what doesn't, and the guardrails I've ended up with. - -## Why it works at all - -Modern Power BI saves models as text. If you enable the Power BI project -(`.pbip`) format, you get a folder tree with: - -- `*.SemanticModel/` — the model as TMDL (tables, measures, relationships) -- `*.Report/` — the report as JSON (pages, visuals, bookmarks) -- `definition/` — various config - -All text. All diffable. All something an LLM can read, reason about, and -edit without screen-scraping the Power BI Desktop UI. - -That's the unlock. Pre-`.pbip`, AI assistance meant copy-paste roundtrips -out of the Desktop app. With `.pbip`, it's the same workflow as editing -any other codebase. - -## What it's genuinely good at - -### Bulk measure refactoring - -"Rename every measure starting with `Sum of ` to remove the prefix." -"Add descriptions to all measures in the `Financial Ratios` folder based -on their DAX." -"Convert all DIVIDE( a, b ) calls with a literal 0 fallback to use BLANK()." - -These are find/replace tasks that are tedious by hand because they need -semantic understanding of DAX, but they're near-trivial with an agent that -can read the TMDL, write a plan, and apply edits across dozens of files. - -### Drafting starter measures from a spec - -Give it a schema and a list of business questions, get back a TMDL file -with draft measures. Don't trust any of them without review (see the -previous post on AI + DAX), but as a scaffolding step it saves the -typing work. - -### Governance and consistency sweeps - -"Audit this model for measures that don't have descriptions." -"Find measures whose names don't match our convention (`[Entity] [Metric] -[Qualifier]`)." -"List all measures that directly reference columns instead of going -through existing measures." - -Sweeps like this are the kind of thing you'd do with a Tabular Editor -script, except the LLM can also read the surrounding context and explain -why each flagged item is flagged. - -### Documentation generation - -"Produce a Markdown document describing every table, its business purpose -inferred from the measure names that use it, and which measures depend on -it." You have to review the output — "inferred from" does a lot of work -— but as a starting point for a model dictionary it's faster than writing -it from scratch. - -## What it's not good at - -### Model design decisions - -"Should this be a star schema or a snowflake?" is not a prompt. It's a -conversation with someone who knows the business. An LLM will happily tell -you "star schema" because that's the statistically common answer, not -because it understood your problem. - -### Performance tuning - -DAX query plans depend on the storage engine's actual behavior — how the -relationships are materialized, what's in the vertipaq cache, column -cardinality. None of which is visible to an LLM looking at TMDL text. 
-It can spot obvious anti-patterns ("you're iterating a table of 40M rows -inside a CALCULATE"), but for real tuning you still need DAX Studio and -the actual query timings. - -### Relationship changes - -Changing cardinality or cross-filter direction has ripple effects across -every measure that relies on filter propagation. LLMs don't reliably -predict those effects. I don't let them edit relationships unattended — -too easy to introduce silent breakage. - -### The report layer - -Report JSON is editable in principle, but in practice the schema is -verbose, version-sensitive, and undocumented. Bulk operations on visuals -work for simple things (rename a visual, change a color) and fall over -on anything complex. This is improving; it isn't there yet. - -## Guardrails I've ended up with - -**Always work in a branch.** Not optional. The combination of an agent -making confident bulk edits and Power BI's "Save" being somewhat opaque -means you want `git diff` between you and committing. - -**Review every DAX change.** Bulk rename? Trust it. Bulk description -generation? Review each one — "plausible-looking fiction" is the failure -mode. - -**Don't let it open Power BI Desktop.** Keep the loop text-only: the agent -edits TMDL, you reload in Desktop, you eyeball the result, you iterate. -The moment you let it take screenshots or drive the UI, you've taken on -a different set of reliability problems. - -**Keep the agent scoped.** "Refactor the measures folder" is a reasonable -task. "Refactor the model" is not. Scope control is the single biggest -predictor of whether a session produces clean output or tangled output. - -## A sample session shape - -1. Branch off main: `git checkout -b refactor/measure-descriptions` -2. Prompt: "Read `SemanticModel/tables/Sales.tmdl`. For every measure - without a description, suggest a one-line description based on the - DAX and the measure name. Produce a diff; don't write yet." -3. Review the diff. Edit the prompt based on what you see. -4. "Apply." -5. `git diff` — scan for anything weird. -6. Reload the `.pbip` in Power BI Desktop. Spot-check three measures. -7. Commit. -8. Repeat for the next table. - -The whole loop is 10–20 minutes per table. Doing the same work by hand in -Desktop is maybe 45 minutes per table if you're fast. The speed-up is -real but not transformative. Where it becomes transformative is the -**consistency** — an agent applies the same rule to every measure, where -a human gets bored on measure 40 and skips the last 10. - -## Closing note - -Claude Code (and similar tools) slots into the Power BI workflow in -exactly the places you'd expect it to: bulk, text-level, rule-based -changes that don't require understanding the business. For those, it's -a genuine productivity win. For model design, it's a thoughtful pair -programmer on its good days and a confident novice on its bad ones — -same as everywhere else LLMs show up. - -The right frame isn't "can AI do my job." It's "which parts of my job -are typing that I shouldn't have to do." Those are the parts worth -automating. diff --git a/content/posts/dax-measure-definition-standards.md b/content/posts/dax-measure-definition-standards.md deleted file mode 100644 index 2b38b97..0000000 --- a/content/posts/dax-measure-definition-standards.md +++ /dev/null @@ -1,79 +0,0 @@ -# DAX Measure Definition Standards - -Most DAX models become hard to maintain for one simple reason: measures are created quickly but not defined consistently. 
- -When naming, formatting, and grouping rules are clear, teams spend less time decoding logic and more time validating business outcomes. - -## Why This Matters - -- Faster onboarding for new developers and analysts. -- Fewer misunderstandings in self-service reporting. -- Easier model reviews before release. -- Better long-term maintainability as the model grows. - -## The 5 Standards Every Measure Should Follow - -### 1. Clear Business Meaning - -Descriptions must convey the functional purpose and business logic of a measure to ensure quick understanding for self-service users and developers. - -Use names and descriptions that answer: "What business question does this measure answer?" - -```dax --- Good: business intent is explicit -Sales Revenue YoY Growth % = ... -``` - -### 2. Result-Oriented Naming - -Measure names should explicitly describe the result they produce rather than the technical process used to calculate them. - -Prefer business outcomes over implementation terms: - -- Better: `Gross Margin %` -- Avoid: `Margin Divide Calc` - -### 3. Standardized Acronyms - -Use widely recognized acronyms as suffixes only to ensure they do not hinder the searchability or clarity of the measure name. - -If an acronym is not common across the team, write the full phrase. - -- Better: `Customer Churn %` -- Acceptable suffix: `YoY` -- Avoid: unclear custom shortcuts - -### 4. Precise Data Formatting - -Ensure every measure is assigned the correct format, specifically defining whether it is a Percent, Whole Number, or Decimal. - -Formatting is not cosmetic; it changes how users interpret business meaning. - -- `%` for rates and shares -- `Whole Number` for counts -- `Decimal` for ratios not presented as percentages -- Currency format for monetary values - -### 5. Logical Grouping - -Organize the model by grouping measures into specific categories such as "Key Measures" (base metrics) and "Time Related" calculations. - -A simple structure makes navigation predictable: - -- `Key Measures` -- `Time Related` -- `Financial Ratios` -- `Operational KPIs` - -## Quick Quality Check (Before Publish) - -- [ ] Name expresses a business result. -- [ ] Description explains business purpose and rule context. -- [ ] Acronyms are standard and easy to understand. -- [ ] Format matches the metric type (Percent, Whole Number, Decimal, Currency). -- [ ] Measure is in the correct display folder/group. - -## Closing Note - -Readable, standardized measures are a model quality control practice, not just a formatting preference. -If every new measure follows these five rules, your DAX layer stays scalable and audit-friendly. diff --git a/content/posts/dax-readability-formatting-guide.md b/content/posts/dax-readability-formatting-guide.md deleted file mode 100644 index 51d3fd4..0000000 --- a/content/posts/dax-readability-formatting-guide.md +++ /dev/null @@ -1,131 +0,0 @@ -# Better DAX Readability: Syntax, Formatting, and Tools - -Readable DAX is easier to review, debug, and maintain. In most teams, performance issues and logic bugs are often found faster when measures follow a clear syntax standard. - -This guide focuses on practical readability rules, formatting style, and tooling with Bravo and Tabular Editor. - -## Executive Summary - -To improve DAX readability quickly: - -- Use consistent syntax (`VAR`, `RETURN`, spacing, comments). -- Use tools to enforce style instead of formatting manually. -- Keep a shared standard for all measures in the model. - -## 1. 
Syntax Rules That Improve Readability
-
-### Use variables for intermediate steps
-
-Variables make business logic explicit and reduce repeated expressions.
-
-```dax
-Gross Margin % =
-VAR _SalesAmount = [Sales Amount]
-VAR _GrossMargin = [Gross Margin]
-VAR _Result =
-    DIVIDE ( _GrossMargin, _SalesAmount, 0 )
-RETURN
-    _Result
-```
-
-### Keep intent obvious with naming
-
-- Use clear and meaningful names.
-- Prefix technical helper measures with `_`.
-- Avoid cryptic abbreviations that only one developer understands.
-
-### Use comments for business logic
-
-Use short comments to explain *why* logic exists, not what each obvious token does.
-Do not comment each line.
-
-```dax
-Net Sales Qualified =
-// Exclude internal transactions based on business rule.
-CALCULATE (
-    [Net Sales],
-    'Sales'[Channel] <> "Internal"
-)
-```
-
-## 2. Long-Line vs Short-Line Formatting
-
-### Long-line formatting
-
-```dax
-Total Sales YoY Growth % =
-VAR _TotalSales =
-    SUM ( Orders[Revenue] )
-VAR _TotalSalesPP =
-    CALCULATE (
-        SUM ( Orders[Revenue] ),
-        PARALLELPERIOD ( 'Calendar'[Date], -12, MONTH )
-    )
-VAR _Result =
-    DIVIDE ( _TotalSales - _TotalSalesPP, _TotalSalesPP )
-RETURN
-    _Result
-```
-
-### Short-line formatting
-
-```dax
-Total Sales YoY Growth % =
-VAR _TotalSales =
-    SUM ( Orders[Revenue] )
-VAR _TotalSalesPP =
-    CALCULATE (
-        SUM ( Orders[Revenue] ),
-        PARALLELPERIOD (
-            'Calendar'[Date],
-            -12,
-            MONTH
-        )
-    )
-VAR _Result =
-    DIVIDE (
-        _TotalSales - _TotalSalesPP,
-        _TotalSalesPP
-    )
-RETURN
-    _Result
-```
-
-## 3. Tooling: Bravo and Tabular Editor
-
-### Bravo (quick formatting and review)
-
-Bravo is excellent for ad-hoc DAX cleanup and readability checks:
-
-- Format measures quickly.
-- Validate style consistency before publishing.
-- Use it as a lightweight review tool for developers and analysts.
-
-### Tabular Editor (model-wide consistency)
-
-Tabular Editor is the best place to standardize formatting across many measures:
-
-- Apply scripted formatting patterns at scale.
-- Enforce naming conventions and technical prefixes.
-- Keep scripts in source control so the rules are repeatable.
-
-A practical workflow:
-
-1. Draft or update measure.
-2. Format in Bravo for quick readability check.
-3. Apply model-wide standards in Tabular Editor scripts.
-4. Publish only after formatting + naming checklist passes.
-
-## 4. Team Checklist
-
-- [ ] `VAR`/`RETURN` structure used where appropriate.
-- [ ] Variables are prefixed with `_` consistently.
-- [ ] Spacing and line breaks follow team standard.
-- [ ] Measure names are business-readable.
-- [ ] Technical helpers are clearly marked.
-- [ ] Comments explain intent, not trivial syntax.
-
-## Closing Recommendation
-
-Readable DAX is a quality standard, not a style preference.
-Short lines, clear syntax, and tool-assisted formatting produce faster reviews and more reliable models.
diff --git a/content/posts/display-folders-vs-naming-conventions.md b/content/posts/display-folders-vs-naming-conventions.md
deleted file mode 100644
index b60b334..0000000
--- a/content/posts/display-folders-vs-naming-conventions.md
+++ /dev/null
@@ -1,121 +0,0 @@
-# Display folders vs. naming conventions: organizing a growing measure model
-
-When a Power BI model has 20 measures, no one cares how they're organized.
-When it has 200, every five-minute question becomes "where did we put that one?"
- -Two tools address this, and teams usually pick one and neglect the other: -**display folders** (the folder tree in the Fields pane) and **naming conventions** -(the text of the measure name itself). They're not interchangeable. They solve -different problems, and they're strongest when you use both. - -## What each one is actually for - -**Display folders** answer: *where do I go to find a measure?* - -They're a browsing affordance. A folder tree works the way an IDE's project tree -works — you don't have to memorize filenames, you scan a shape. - -**Naming conventions** answer: *what does this measure mean once I'm looking at it?* - -A name is a contract. A measure called `Sales Revenue YoY %` promises a specific -result type (percentage), a specific comparison (year-over-year), and a specific -base (sales revenue). If any of those are wrong in the DAX, that's a bug. - -If your folders are organized but the names inside them are `Measure 7`, you've -built a filing cabinet with blank labels. If the names are perfect but there are -200 of them dumped at the root, you've built a glossary without a table of contents. - -## A folder structure that scales - -The structure that holds up best on mid-size models groups **by audience**, -not by calculation type: - -``` -Key Measures/ - Sales - Margin - Units -Time Related/ - YoY - YTD - MoM -Financial Ratios/ - Profitability - Liquidity -Diagnostics/ - Counts - Tests -``` - -Three principles behind that shape: - -1. **Top level = what a business user would pick from.** `Key Measures` is the - default. `Diagnostics` is intentionally last — it's for the modeler, not the - consumer. -2. **Second level = one axis of variation.** Inside `Time Related` the axis is - the comparison period. Inside `Key Measures` it's the business entity. -3. **No folder has fewer than 3 measures or more than ~12.** Fewer and it's not - pulling its weight; more and you need a third level. - -## A naming convention that matches - -Three parts, read left-to-right as a sentence: - -``` -[Entity] [Metric] [Qualifier]? -``` - -- `Sales Revenue` -- `Sales Revenue YoY %` -- `Customer Churn %` -- `Order Count` -- `Inventory Days` - -The qualifier is optional and strictly for time or statistical modifiers -(`YoY`, `YTD`, `MoM`, `Avg`, `Median`). It goes at the end so the main -measure sorts next to its variants alphabetically. - -Two rules that do a lot of work: - -- **No implementation words in names.** Not `Calc`, not `Divide`, not `Result`. - The name describes the output. -- **Units live in the name.** `%`, `$`, `Days`, `Hours`. If someone copies a - measure into a table without a header, they should still know what they're - looking at. - -## Where they collide - -The main tension is **redundancy**. If the folder is already called `Time Related` -and inside it is a measure called `Sales Revenue YoY %`, the `YoY` feels repetitive. - -Resolve it by treating the name as the source of truth. Folders move; names -propagate into reports, bookmarks, and Excel files. Repetition in the name is a -feature, not a bug — it lets the measure survive being lifted out of the folder -and dropped into a card visual on a dashboard. - -The other collision: **measures that belong in two folders.** "Is `Gross Margin %` -a Key Measure or a Financial Ratio?" Answer: **one folder, chosen by who uses it -most.** Copying a measure into two folders via hidden measures is clever and it -bites you every time the definition changes. 
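To see the convention end to end before auditing it, here is a minimal sketch of one metric family. The `'Sales'[Revenue]` column and the marked `'Date'` table are assumed for illustration; the measure bodies are placeholders, the names are the point:

```dax
-- [Entity] [Metric]: the base measure
Sales Revenue :=
SUM( 'Sales'[Revenue] )

-- [Qualifier] last, so variants sort next to their base measure
Sales Revenue YTD :=
CALCULATE( [Sales Revenue], DATESYTD( 'Date'[Date] ) )

-- Units stay in the name: the % suffix survives copy-paste into visuals
Sales Revenue YoY % :=
VAR _Prior =
    CALCULATE( [Sales Revenue], SAMEPERIODLASTYEAR( 'Date'[Date] ) )
RETURN
    DIVIDE( [Sales Revenue] - _Prior, _Prior )
```

Alphabetical sorting then does the grouping work on its own: the three names land side by side in any measure list, whatever folder they sit in.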
- -## The 20-minute audit - -Every quarter, or when the measure count crosses 50 / 100 / 200: - -- [ ] Sort the full measure list alphabetically. Read it once. Flag anything - ambiguous. -- [ ] Open the Fields pane. Every folder should have 3–12 entries. -- [ ] Spot-check 5 measures. Ask: "Could a new analyst find this in under 10 - seconds?" If no, the folder is wrong, the name is wrong, or both. -- [ ] Check for near-duplicates — `Sales` vs `Total Sales` vs `Sales Total` - usually means one canonical measure plus two forgotten experiments. - -## Closing note - -Folders and names solve different halves of the same problem. A team that treats -them as interchangeable always ends up with a model that's either well-browsed -but inscrutable, or well-labeled but unfindable. - -The fix is cheap — it's mostly renaming — but the decision to invest in it is -the hard part. Do it at 50 measures and it takes an afternoon. Do it at 500 and -it's a project. diff --git a/content/posts/filter-context-mental-model.md b/content/posts/filter-context-mental-model.md new file mode 100644 index 0000000..f5429a0 --- /dev/null +++ b/content/posts/filter-context-mental-model.md @@ -0,0 +1,18 @@ +# A working mental model for filter context + +> **Draft — outline only.** DAX's filter context is the single concept that takes the longest to internalize. This post will be the explanation I wish I'd had early on. + +## What this post will cover + +- Row context vs. filter context: the minimum distinction that matters +- How filters propagate through relationships (and when they don't) +- `CALCULATE` as filter context modifier +- The user-filter vs. visual-filter split that explains `ALLSELECTED` + +## Why it matters + +_Draft pending — tie this back to a specific debugging story to make it concrete._ + +## Closing note + +_Draft pending._ diff --git a/content/posts/m-patterns-for-messy-sources.md b/content/posts/m-patterns-for-messy-sources.md new file mode 100644 index 0000000..b7d1dc9 --- /dev/null +++ b/content/posts/m-patterns-for-messy-sources.md @@ -0,0 +1,18 @@ +# M patterns for messy source data + +> **Draft — outline only.** A short collection of Power Query patterns I reach for when the source system was built for something other than analytics. + +## What this post will cover + +- Handling inconsistent date formats without `try ... otherwise` noise +- Splitting one-to-many columns (comma-separated values, JSON blobs) +- Stable key generation when the source has no real key +- When to push a transform upstream vs. keep it in M + +## Why it matters + +_Draft pending._ + +## Closing note + +_Draft pending._ diff --git a/content/posts/model-review-playbook.md b/content/posts/model-review-playbook.md new file mode 100644 index 0000000..6bcfb5a --- /dev/null +++ b/content/posts/model-review-playbook.md @@ -0,0 +1,18 @@ +# Model review playbook + +> **Draft — outline only.** This post will capture my working checklist for reviewing a Power BI model, whether I built it or inherited it. 
+ +## What this post will cover + +- First-hour read-before-edit rule +- Structural smells: inactive relationships, calculated columns, orphan measures +- Structural wins: folder discipline, naming consistency, diagnostics folder +- A one-page review template that produces a decision, not a document + +## Why it matters + +_Draft pending — write when I have two or three concrete review sessions to draw from._ + +## Closing note + +_Draft pending._ diff --git a/content/posts/notes-on-analytics-engineering.md b/content/posts/notes-on-analytics-engineering.md new file mode 100644 index 0000000..70db0ad --- /dev/null +++ b/content/posts/notes-on-analytics-engineering.md @@ -0,0 +1,18 @@ +# Notes on analytics engineering + +> **Draft — outline only.** Short reflections on what the role actually involves once the initial tooling decisions are behind you. + +## What this post will cover + +- The split between data-modeling work and report work +- Where governance overhead earns back its cost +- Signs a "quick MVP" is about to become production +- Habits that survive contact with a real deadline + +## Why it matters + +_Draft pending._ + +## Closing note + +_Draft pending._ diff --git a/content/posts/reviewing-a-data-model-you-didnt-build.md b/content/posts/reviewing-a-data-model-you-didnt-build.md deleted file mode 100644 index 533c9d3..0000000 --- a/content/posts/reviewing-a-data-model-you-didnt-build.md +++ /dev/null @@ -1,127 +0,0 @@ -# Reviewing a data model you didn't build - -At some point, every analytics engineer inherits someone else's model. -Maybe a contractor rotated off. Maybe a colleague moved teams. Maybe the -"quick MVP" from 18 months ago is suddenly the production source of truth. - -The first hour with an unfamiliar Power BI or tabular model is the most -important hour. It sets your expectation for everything that follows. - -Here's the order I go in, and the things I look for. - -## 1. Read before you edit - -Do nothing for the first 20 minutes except read. - -- Open the Fields pane. Scan every table name. Say each one out loud — - if a name doesn't describe what's in the table, that's signal. -- Open the Model view. Look at the relationships. Count the fact tables. - Count the dimensions. Note any snowflakes. -- Expand one table and one folder of measures. Read 10 measure names. - -You're building a model of the model. Don't edit anything yet — if you -edit before you've read, you'll be explaining ghost decisions to yourself -for the next week. - -## 2. Measure the obvious proxies - -Some things give away maturity fast: - -- **Number of measures vs. number of tables.** Healthy ratio is roughly - 5–15 measures per fact table. Below that, the model is under-expressed - (too much logic lives in visuals). Above it, measures are probably - duplicated or the model is doing someone else's job. -- **Measures at the table root vs. in folders.** 0 at the root is a very - good sign. 200 at the root means no one has organized in a while. -- **Count of inactive relationships.** Zero is fine. One or two with - comments is fine. Seven is a model that's been patched and never - refactored. -- **Tables with `dim_` / `fact_` / `_temp` / `Table2` / `Query1` prefixes.** - Naming hygiene is a proxy for overall hygiene. You'll almost never find - a model with neat names and terrible DAX, or vice versa. - -## 3. The five smells I treat as red flags - -**Columns used directly in visuals instead of measures.** Drag-and-drop -from column to chart works fine until the aggregation rule needs to -change. 
Then you have 40 visuals to update. - -**Calculated columns doing measure work.** A calculated column that -aggregates other columns ("order total = SUMX( related lines )") is -usually a measure written by someone who didn't know measures existed -yet. Performance and model size both suffer. - -**Bidirectional cross-filtering outside of specific known patterns.** -Many-to-many with a bridge table? Fine. Bidirectional on a normal -dimension? Almost always wrong — it works until it doesn't, and when -it breaks the bug is subtle and wide-ranging. - -**A date table that isn't marked as a date table.** Time intelligence -will silently produce wrong numbers. This is the single easiest fix to -make early and the single easiest bug to miss late. - -**Measures named with implementation words.** `Sum_Revenue_Calc_Final_v2` -is a scream for help. Expect more of the same downstream. - -## 4. The five green flags I trust - -**Descriptions on every measure.** Not universal, but when present, -signals that someone cared. Someone who cares will also have named -things well and not left landmines. - -**A `Diagnostics` or `Test` folder of measures.** Means the author has -been burned before and brought tools. - -**Consistent naming across tables.** If every dimension has `ID` as its -key and every fact has `_FK` suffix, someone had a standard and enforced -it. That person also probably knew what they were doing elsewhere. - -**Relationship cardinalities explicitly noted, not default.** Defaults -in Power BI are usually right. Explicit overrides are a sign the author -understood the model well enough to deviate from defaults on purpose. - -**A one-page README somewhere — in the About page of the report, in a -`/docs` folder, in the model description.** Existence of documentation -at all is a lagging indicator of a thoughtful model. - -## 5. What to do before you change anything - -Before editing, do three things: - -1. **Put it in source control.** If the file is `.pbix`, convert it to - `.pbip` first so you have something diffable. -2. **Screenshot the current report.** Pages, key cards, the first page - of any dashboard. You're going to change things and want a reference - for "did I break this?" -3. **Snapshot the known-good numbers.** Pick 3–5 summary values you can - reconcile (last quarter revenue, top customer, whatever). Write them - down. These are your "did I silently break totals" sentinels. - -That's 30 minutes of work and it saves you from the single worst failure -mode in inherited models: making a change, deploying it, and only finding -out three days later that a downstream report now reads 8% lower because -of a filter-context bug you introduced. - -## 6. The review report I write - -If I'm reviewing for a team rather than just inheriting it, I write up -findings using this shape: - -- **Verdict** — one sentence. "Ship it," "fixable in a day," "needs a - real refactor," "start over." -- **Top 3 risks** — specific measures or relationships I don't trust. -- **Top 3 wins** — things I'd keep and copy into other models. -- **Effort estimate** — hours to fix the risks, in a range. - -Not a 40-page audit. A one-pager someone can act on. The goal of the -review is a decision, not a document. - -## Closing note - -The best signal about a model you didn't build is **how confident you -feel after an hour of reading it.** If you could explain the schema and -three key measures to a teammate from memory, it's a decent model. -If your notes are "I think this one does... something with dates?" 
-after 60 minutes, the model isn't clear enough to be trusted as a source -of truth, and the first investment isn't improvements — it's -comprehension. diff --git a/content/posts/tabular-editor-workflows.md b/content/posts/tabular-editor-workflows.md new file mode 100644 index 0000000..07af324 --- /dev/null +++ b/content/posts/tabular-editor-workflows.md @@ -0,0 +1,18 @@ +# Tabular Editor workflows worth knowing + +> **Draft — outline only.** A collection of the Tabular Editor scripts and habits that earn back their learning cost. + +## What this post will cover + +- Best Practice Analyzer rules I run on every model +- C# scripts for bulk renames, description generation, folder reshuffles +- Deployment patterns: external tools vs. the standalone app vs. CI +- When TE is the wrong tool and you should just use Desktop + +## Why it matters + +_Draft pending._ + +## Closing note + +_Draft pending._ diff --git a/feed.xml b/feed.xml index 287aa7a..23e0b98 100644 --- a/feed.xml +++ b/feed.xml @@ -4,72 +4,63 @@ Insights, tutorials, and patterns — blue to purple. - 2026-04-18T00:00:00Z + 2026-04-10T00:00:00Z https://jonathan-pap.github.io/ Jonathan Papworth - Reviewing a data model you didn't build - - https://jonathan-pap.github.io/post.html?slug=reviewing-a-data-model-you-didnt-build - 2026-04-18T00:00:00Z - A structured approach to picking up an unfamiliar Power BI or tabular model — what to read first, what to measure, and which smells to treat as red flags. - - - - - Using Claude Code to refactor a Power BI project - - https://jonathan-pap.github.io/post.html?slug=claude-code-for-pbi-refactor - 2026-04-17T00:00:00Z - A practical note on using an agentic coding tool for tabular model work — where it helps, where it makes things worse, and how to keep a human in the loop. - + Model review playbook + + https://jonathan-pap.github.io/post.html?slug=model-review-playbook + 2026-04-10T00:00:00Z + Draft — a working checklist for reviewing a Power BI model, whether you built it or inherited it. + - Using AI to write DAX without losing control of your model - - https://jonathan-pap.github.io/post.html?slug=ai-for-dax-without-losing-control - 2026-04-16T00:00:00Z - A workflow for getting high-quality DAX out of an LLM without letting it silently reshape your semantic model. - + A working mental model for filter context + + https://jonathan-pap.github.io/post.html?slug=filter-context-mental-model + 2026-03-25T00:00:00Z + Draft — the explanation of filter context I wish I'd had early on, with worked examples. + - ALL, ALLEXCEPT, ALLSELECTED: a mental model for filter context removal - - https://jonathan-pap.github.io/post.html?slug=all-allexcept-allselected-mental-model - 2026-04-15T00:00:00Z - A one-page mental model for DAX's three filter-removal functions — when each one applies and why picking the wrong one silently breaks totals. - + M patterns for messy source data + + https://jonathan-pap.github.io/post.html?slug=m-patterns-for-messy-sources + 2026-03-12T00:00:00Z + Draft — Power Query patterns I reach for when the source was built for something other than analytics. + - Display folders vs. naming conventions: organizing a growing measure model - - https://jonathan-pap.github.io/post.html?slug=display-folders-vs-naming-conventions - 2026-04-14T00:00:00Z - A mid-size Power BI model needs both display folders and naming conventions. Here's how each earns its keep, and where they tend to collide. 
- + Using AI on analytics work + + https://jonathan-pap.github.io/post.html?slug=ai-on-analytics-work + 2026-02-28T00:00:00Z + Draft — where LLMs genuinely help on analytics tasks, where they quietly hurt. + - DAX Measure Definition Standards - - https://jonathan-pap.github.io/post.html?slug=dax-measure-definition-standards - 2026-02-14T00:00:00Z - A practical standard for naming, formatting, and organizing DAX measures so teams can keep models clear and maintainable. - + Tabular Editor workflows worth knowing + + https://jonathan-pap.github.io/post.html?slug=tabular-editor-workflows + 2026-02-15T00:00:00Z + Draft — the Tabular Editor scripts and habits that earn back their learning cost. + - Better DAX Readability: Syntax, Formatting, and Tools - - https://jonathan-pap.github.io/post.html?slug=dax-readability-formatting-guide - 2026-02-09T00:00:00Z - A practical guide to writing readable DAX with consistent syntax, short-line formatting, and tool-based workflows using Bravo and Tabular Editor. - + Notes on analytics engineering + + https://jonathan-pap.github.io/post.html?slug=notes-on-analytics-engineering + 2026-02-02T00:00:00Z + Draft — short reflections on what the role actually involves once the initial tooling decisions are behind you. + diff --git a/sitemap.xml b/sitemap.xml index 7880938..f811569 100644 --- a/sitemap.xml +++ b/sitemap.xml @@ -11,38 +11,33 @@ 0.5 - https://jonathan-pap.github.io/post.html?slug=reviewing-a-data-model-you-didnt-build - 2026-04-18 + https://jonathan-pap.github.io/post.html?slug=model-review-playbook + 2026-04-10 0.8 - https://jonathan-pap.github.io/post.html?slug=claude-code-for-pbi-refactor - 2026-04-17 + https://jonathan-pap.github.io/post.html?slug=filter-context-mental-model + 2026-03-25 0.8 - https://jonathan-pap.github.io/post.html?slug=ai-for-dax-without-losing-control - 2026-04-16 + https://jonathan-pap.github.io/post.html?slug=m-patterns-for-messy-sources + 2026-03-12 0.8 - https://jonathan-pap.github.io/post.html?slug=all-allexcept-allselected-mental-model - 2026-04-15 + https://jonathan-pap.github.io/post.html?slug=ai-on-analytics-work + 2026-02-28 0.8 - https://jonathan-pap.github.io/post.html?slug=display-folders-vs-naming-conventions - 2026-04-14 + https://jonathan-pap.github.io/post.html?slug=tabular-editor-workflows + 2026-02-15 0.8 - https://jonathan-pap.github.io/post.html?slug=dax-measure-definition-standards - 2026-02-14 - 0.8 - - - https://jonathan-pap.github.io/post.html?slug=dax-readability-formatting-guide - 2026-02-09 + https://jonathan-pap.github.io/post.html?slug=notes-on-analytics-engineering + 2026-02-02 0.8