Creating the automation function and combining files
In the second article in this series, we covered how to extract the budget PDF document into manageable chunks. This is necessary because we want to leverage the combine files option in Power Query editor. We also explained how we aim to decipher the “sharing formula” if there is indeed such a revenue allocation formula.
In this tutorial, we will create a function to extract a template for the expenditure items and another function for the capital projects breakdown. If you need to catch up on this aspect and the terminologies, the first article in this series is a necessary resource.
We will be using data from the Ministry of Power for our templates. By now, you should have a folder named “Ministry of Power” containing the summary page extract and its 8 sub-MDAs in PDF.
If you haven’t extracted the Ministry of Power MDA into a folder, don’t worry, here’s your plug.
Open Power BI Desktop and click “Get data“. Then select the Folder option.
Enter the folder path into the dialog box that shows up. The folder path can be obtained by clicking into the folder address bar. Copy this path and paste it into the dialog box in Power BI.
Click OK. You should now see a list of the folder content. Ensure to click on “Transform Data” or “Edit” if you’re using an older version of Power BI Desktop.
This will transfer the content directly into the Power Query editor. Here, you will notice the Ministry of Power contents are shown as binary files. This file type allows us to perform transformations on an entire file as if it were a single variable. We will now perform these series of steps in order to create the functions.
- Identify and create a sample file (binary)
- Parametise the sample file as binary
- Create a reference to the parameter to serve as the function file
- Apply appropriate transformations to the function file
- Create a function fx from the function file
- Invoke the custom function fx on other binary files
- Expand and combine documents into one
- Close & apply to import the structured data into Power BI
Create an expenditure function and extract summary page
We will begin by going through steps 1 to 3, to create the expenditure function file.
In the next video, we will examine how to apply appropriate transformations to the function file to extract 3 sets of data for each MDA, the structure of which is better explained in the first tutorial.
- The summary page
- The expenditure items
- The capital projects breakdown
If you got this far with no errors when running your functions, well done! The actual steps taken in creating the function files were developed through an iterative process. Several errors were encountered along with way, however, these were essential to creating a more robust function. So if you encounter an error, go back to the applied step in which the error was detected and modify the transformation. We will cover more on error handling in a subsequent tutorial.
Now that the functions have been developed, the next step is to extract another MDA from the PDF document into a folder as described in tutorial 2. Connect to the folder in Power BI and run your functions on the binary files to extract another set of the summary page, expenditure items, and capital projects breakdown. Hopefully no errors!
Developing an open data analytics platform for Nigeria’s budget structure
In the spirit of open data, we want to make available a body of work that explains Nigeria from a new and objective perspective. We will break down the little details of the budget details to define a standard schema of expenditure items pervasive to the 48 ministries, departments and agencies (MDAs) in Nigeria. We believe that this project will improve information dissemination on governance in Nigeria, which in turn helps produce greater knowledge and societal progress.
If you examine the Ministry of Education budget details in the PDF document, you will notice that it has 214 sub-MDAs, all of which have to be split into separate PDFs. This can be a tedious and time-consuming process if one person had to do the entire 48 MDAs.
We’re encouraging collaboration as a way to learn Power BI and Business Intelligence, and to contribute to building, possibly, the largest set of open analytics data on Nigeria’s budget.
If you’re interested in contributing to this project, either as a writer, data engineer, or data analyst, kindly leave your details here. You will also be added to a WhatsApp group to follow our progress and to get immediate answers to any questions on these tutorials.
What does the “sharing formula” for revenue allocation mean? Can we decipher a pattern contained within the 2020 executive budget proposal? Is there a hierarchy for all expenditure items?
Next article – Open Analytics: Nigeria’s Proposed 2020 Budget
The previous article – Breaking down the FGN executive budget – Part 2
To find out more, join the tribe, subscribe to our latest insights and watch this space!