Wrangling Data: Why I’m Moving Transformations Out of Power BI

I’ve been working on a project recently that involves building out reports and dashboards in Power BI. The current setup pulls all of the data from a class management system directly into Power BI. At first, this seemed like the best approach because it was simple and kept costs down; however, I soon realized that although Power BI’s DAX language is powerful, it feels clunky for the kind of data transformations I’ve been doing.

Pulling data straight from the class management warehouse into Power BI is efficient on the surface, but it comes at a cost. Power BI struggles when data is raw and unshaped.

  • Reusability is low because transformations often live inside measures or calculated columns, as opposed to something like dbt models or “CREATE OR REPLACE” tables in a SQL warehouse (see the sketch after this list).
  • Performance suffers when complex logic is embedded into visual-level filters and calculations. It’s better to automate scheduled transformation runs outside of the dashboard.
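
To make that concrete, here’s a rough sketch of what a reusable, warehouse-level transformation could look like, written in BigQuery-style standard SQL. The dataset, table, and column names are made up for illustration; they aren’t taken from the actual class management data.

```sql
-- Hypothetical example: rebuild a cleaned enrollment table on each scheduled run.
-- Dataset, table, and column names are illustrative, not the real schema.
CREATE OR REPLACE TABLE reporting.class_enrollments AS
SELECT
  e.enrollment_id,
  e.student_id,
  LOWER(TRIM(e.student_email)) AS student_email,  -- basic cleanup happens once, upstream
  c.class_name,
  DATE(e.enrolled_at)          AS enrollment_date
FROM raw.enrollments AS e
JOIN raw.classes     AS c
  ON c.class_id = e.class_id
WHERE e.status = 'active';
```

Because this lives in the warehouse, the same cleaned table can feed multiple reports, and the script can be versioned and re-run on a schedule instead of being recreated as measures inside each dashboard.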

Overall, I’ve felt as though I’m spending too much time debugging nested formulas and chasing measures down a rabbit hole instead of working with simple, clean CTEs.
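
For comparison, this is the kind of small, readable CTE I’d rather be debugging. Again, the table and column names are hypothetical, and the syntax is generic/BigQuery-flavored SQL.

```sql
-- Hypothetical example: attendance rate per class, written as plain CTEs
-- instead of a chain of nested DAX measures.
WITH sessions_per_class AS (
  SELECT class_id, COUNT(*) AS total_sessions
  FROM reporting.class_sessions
  GROUP BY class_id
),
attended_per_class AS (
  SELECT class_id, COUNT(*) AS attended_sessions
  FROM reporting.attendance
  WHERE attended = TRUE
  GROUP BY class_id
)
SELECT
  s.class_id,
  a.attended_sessions / s.total_sessions AS attendance_rate
FROM sessions_per_class AS s
LEFT JOIN attended_per_class AS a
  ON a.class_id = s.class_id;
```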

Enjoy this image of what AI thinks I look like when I’m data wrangling in Power BI.

Instead of loading the data straight from the software company’s data warehouse into Power BI, my goal is now to replicate the data into my own data warehouse first, either BigQuery or a Microsoft SQL warehouse. There I can clean, structure, and join tables using SQL before loading into Power BI (a rough sketch of what that could look like follows the list below).

  • I can version and reuse transformations
  • I can automate the ETL pipeline
  • Power BI will be used for visualization and not for data prep
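
As a rough sketch, if I go the Microsoft route this could be a curated view in SQL Server that Power BI simply reads; if I go with BigQuery, the equivalent would be a view or scheduled query there. The schema, view, and column names below are all hypothetical, and the scheduling note is an assumption about how I’d automate it, not something already set up.

```sql
-- Hypothetical example (T-SQL): a curated view that Power BI connects to directly.
-- The transformation logic lives here, is versioned as a .sql script, and the
-- underlying tables could be refreshed on a schedule (e.g., a SQL Server Agent job).
CREATE OR ALTER VIEW reporting.v_class_summary AS
SELECT
  c.class_id,
  c.class_name,
  COUNT(e.enrollment_id)                                AS enrolled_students,
  SUM(CASE WHEN e.status = 'active' THEN 1 ELSE 0 END)  AS active_students
FROM dbo.classes AS c
LEFT JOIN dbo.enrollments AS e
  ON e.class_id = c.class_id
GROUP BY c.class_id, c.class_name;
```

Power BI would then just import (or DirectQuery) reporting.v_class_summary, and any change to the logic happens in one SQL script rather than across a dozen measures.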

I’ve spent a little bit of time looking into both BigQuery and Microsoft SQL. Here is what I’ve gathered.

BigQuery – Pros:

  • Serverless, easy to scale.
  • Great for large datasets with complex queries.
  • Plays nicely with modern cloud data tooling (like dbt and Fivetran).
  • No infrastructure setup.
  • I’ve used it along with dbt for work in the past, so I’m comfortable with it.

Cons:

  • Requires a GCP account (can be overkill for small orgs or contract work like I’m doing).
  • Costs are query-based, so inefficient queries = $$$ (makes me nervous when budget is limited).
  • Needs a connector for Power BI, which may not be as seamless as native Microsoft tools.

Microsoft SQL Warehouse – Pros:

  • Native integration with Power BI.
  • Familiar SQL Server syntax.
  • Easier to manage if you’re already in the Microsoft ecosystem.
  • Good documentation and support.

Cons:

  • Requires more infrastructure setup than BigQuery.
  • Less flexible for cloud-native ETL.
  • Can be slower to scale or update if you’re running on limited hardware.
  • I’m less familiar with it.

It should be noted that there are cases where moving transformations upstream is unnecessary and just adds complexity.

When not to move upstream:

  • If your transformations are minimal.
  • If you are less technical and/or prefer working in Power BI.
  • If you don’t have control over a data warehouse and must do all data prep with what is available in Power BI.

For my use case, I’m going to move my data wrangling upstream. After looking into both BigQuery and Microsoft SQL, I have decided to go with Microsoft SQL. I’ll have some learning and setup to do upfront, but it will integrate more smoothly with Power BI and will save me money and headaches in the long run.

Have you tried moving your transformations upstream? What did you learn?
