Working with business data in Power Query

Admin

Admin

Instructor: Benny Ifeanyi (LinkedIn)

Introduction

Power Query is a data connection technology and ETL (Extract, Transform, and Load) tool that is used to simplify the process of connecting to, shaping, transforming, and combining data from various sources. With Power Query, you can connect to various data sources, including databases, web services, and text files. Once connected, you can use the intuitive user interface to shape and clean the data, such as filtering and removing duplicates, and then load the data into your workbook or data model.

Power Query also offers advanced features such as merging and aggregating data from multiple sources, creating calculated columns, and transforming data using formulas. It is also a data cleaning tool that help to handle dirty data by removing missing key elements, extra characters, poorly compiled and formatted data, poor data entries, duplicates or miss-spelling, data type error and invalid data and transforming it into an organized data for further analysis.

Additionally, you can save your data transformation steps as a query, making it easy to reuse the steps in the future. This can in turn help to save your time and effort in working with data, especially when dealing with large, complex data sets from multiple sources.

Central to this tutorial, I will walk you through the following:

  1. Key functions/operations performed in Power Query
  2. How to access Power Query Application
  3. Power Query in Microsoft Excel
  4. Steps on getting data from web into Power Query in Microsoft Excel
  5. Steps on combining data from folder in Power Query when using Power BI

Key functions performed in Power Query

  1. Data cleaning and transformation engine room
  2. ETL (Extract, Transform, and Load) Application
  3. Connection to various data sources

How to access Power Query Application

It is worthy of note that Power Query was first introduced as a feature in Microsoft Excel 2010 and can now be found in the following application for usage:

  1. Power Query Desktop (Microsoft Excel and Microsoft Power BI)
  2. Power Query Online (Power BI dataflows, Microsoft power platform dataflows, Azure data factory wrangling dataflows )
  3. Other Microsoft products (Power Apps, Microsoft Dynamics 365 & Microsoft Power Automate)

For the purpose of this project, the dataset used is found here

Power Query in Microsoft Excel

  • Open Microsoft Excel on your personal computer, select the data tab. The highlighted area in red labelled Get and Transform Data is Power Query

  • You can get data into Power Query from various data sources as shown in the picture below. But central to this tutorial, we will get our data from an external excel spreadsheet

  • Import the dataset from your local computer into Power Query.

  • The data is then imported into the Power Query Navigator, then click transform to get the data into Power Query Editor

Central to this tutorial, I will be cleaning this dataset and transforming it to the clean data showed in the image below.

Dirty Data

  • This will be out final outcome from the cleaning and transforming of the dirty data above.

Clean Data

To clean and transform the data and get the clean format above, we will go through the following process

  • Transpose the data

In Power Query, transposing data refers to the process of rotating a table or range of data so that the rows become columns and the columns become rows. This can be useful when the original data is not in the desired orientation for analysis or visualization.

To transpose this data, click the transform tab and then the circled transpose

  • Fill down the Segment Column

There is a pattern in column 1 with respect to the different segment in the dataset. So I will fill down to get the values in place.

The segment column (column 1) is then filled downward to get this:

Note: Unlike other tools were you can undo easily, power query as an applied steps tab on the bottom right corner where you can undo specific steps carried out on a data. You can also move a step back to see previous changes with this feature.

  • Use first row as Headers

Click the Home tab, then click on the Use first row as Headers

This automatically replace the existing column header with the actual column header as shown in the clean data

  • Remove the Totals in the Segment column by using the drop down option to filter them out manually or with a filter option with text not containing the word total.

This Filter menu pops up which allows me to filter with text that doesn’t contain total.

  • Unpivot other columns in the data

Note: In Power Query, unpivoting other columns means converting multiple columns in a table from a wide format to a long format, where the column headers become rows in a single column, and their corresponding values are stacked in another column. This transformation is useful when you have a table with multiple columns containing similar types of data, and you want to combine them into a single column for further analysis.

So I will select the segment and Ship mode column and unpivot other columns

Note: Then the column other columns are converted into long format as all the order_ID are transformed into a single column named "attribute" (order_ID) and the corresponding values in another column named “values” (Cost)

  • Rename the columns from the code displayed or by double clicking the column name

Note: Power Query uses a functional programming language called "M" (also known as "Power Query Formula Language") to perform data transformations. M is used to write custom functions, create queries, and manipulate data within Power Query.

  • You can then close and load the data into Microsoft Excel

Creating a function in Power Query

Creating a function in Power Query involves defining a reusable set of data transformations that can be applied to similar data structures or data sources. This allows you to write a set of transformations once and then apply them to multiple tables or files with similar data structures.

To create a function in Power Query, you can use the "view" tab and select "Advanced Editor", you can define the function using the M language or take existing function from a previous transformation. The function can take input parameters, such as the table or file to be transformed, and return a transformed table.

To create this function, follow the following steps:

1. Remove the first two lines of code after the let function. This is because this code is used to import the dataset from a folder and can be limiting for data from a different source.

2. Change the data source name and note that Power Query Language is case sensitive. I will call mine (Cleaned) for the purpose of this tutorial.

The final code will look like this.

You will get a view that looks like this:

Once the function is defined, you can apply it to similar data structures by calling it from other queries or using it in the "Invoke Custom Function" feature. This can save time and effort when working with large datasets or multiple data sources with similar structures.

  • Change the name of the data to “monthly transformation”
  • Select that on function query and select the actual data for new column (cleaned)

  • This is then effected as the new column named cleaned. The next step is to go ahead and remove other columns aside from the name and the cleaned column that is selected.

  • I will then use this button to expand the data
  • Uncheck the box and press “Ok”

  • Then you can load into Microsoft Excel

Steps on getting Data from Web into Power Query in Excel

You can get data from GitHub into Power Query by using the "From Web" feature in Power Query. Here are the steps:

This is the link to the GitHub Repository I will be getting the data from here

Note: This is a view of the data from GitHub that I will be transforming in Power Query

Note: Click on Raw and copy the link which will be used to get the data from web.

To start to this follow the steps below from your Microsoft Excel Sofware:

1. Open Power Query by clicking on the "Data" tab in Excel and then selecting "Get Data" > "From Other Sources" > "From Web".

2. In the "From Web" dialog box, enter the URL of the GitHub repository you want to get data from.

3. . Click OK and wait for the "Navigator" dialog box to appear.

4. In the "Navigator" dialog box, select the table or file you want to import and click "Transform" to import the data into Power Query.

5. So this is the dataset successfully imported into Power Query

Cleaning & Transform the data from Web in Power Query

  • To start the cleaning, first - move the first row as Header. Click on “Use First Row as Header”

  • Remove the last two empty columns. Select the two columns and click on “Remove Column”

  • Duplicate the first name, last name and date column to ensure ease of transposing the data.

  • Select the three (3) new or duplicate columns and then move to the beginning for ease of splitting respective columns.

  • Go to “Transform” the “Split Column” and use “Delimiter”

Note: Select “Space as Delimiter” and split at “Right-most Delimiter”. Perform the same operation for the other split columns but for date, split at “Left-most Delimiter”. Then rename the columns appropriately.

  • Select the three (3) new columns and fill down to remove the “null values”

  • Next, for the other columns remove the blanks and first name features since it has the columns of it own.

  • After this, the data is read to be loaded into the Excel engine since it is well formatted and ready for further analysis.

Steps on combining data from folder in Power Query when using Power BI

I will be combining the data in this folder with Power Query in Power BI in a bid to use them for further analysis and visualization.

Note: This can be manually done by opening and appending the excel files at the end of one another but I will walk us through this can be done using Power Query in Power BI.

Let’s get started…

To combine data from multiple files in a folder using Power Query in Power BI, follow these steps:

  • Open your Power BI, then click on “Get Data from another source”

  • Select "Folder" from the "Common" section and click "Connect"

  • In the "Folder" dialog box, navigate to the folder containing the files you want to combine and click "OK"

Note: Instead of just load and transform button in the navigator, you will get an additional button to combine since you have multiple data. Then click on Combine and transform.

  • Then click on “Apply Changes” to update the queries running the navigator background.

  • Click on the data source dropdown, then select load more and the other month data comes up also.

Note: This dataset doesn’t need any form of transformation, so will just load into my Power BI Desktop

  • Once you have finished transforming the data (if need be with your own sample data), click on "Close & Apply" to apply the changes and return to the Power BI report.

Conclusion

Power Query is a powerful data transformation tool that is available in Microsoft Excel, Power BI, and other Microsoft applications. It allows users to easily extract, transform, and load data from various sources, including databases, files, and web pages.

Power Query is a valuable tool for data analysts, business users, and anyone who needs to work with data on a regular basis. Its ease of use, flexibility, and powerful features make it a must-have tool for anyone who needs to perform data transformations and analysis.

If you want to get started with data analytics and looking to improving your skills, you can check out our Learning Track

resa logo

Empowering individuals and businesses with the tools to harness data, drive innovation, and achieve excellence in a digital world.

Copyright 2025Resagratia. All Rights Reserved.