Automating reporting at the Load stage of ETL is an important step in data processing, especially when the results of the analysis need to be presented in a format that is easy to communicate and understand. In the construction industry, this typically applies to progress reports, project data statistics, quality assurance reports and financial documentation.
One of the most convenient tools for such tasks is FPDF, an open-source library available for both Python and PHP.
FPDF provides a flexible way to generate documents through code, allowing you to add headers, text, tables and images. Using code instead of manual editing reduces errors and speeds up the preparation of reports in PDF format.
One of the key stages of creating a PDF document is adding headings and main text in the form of comments or descriptions. When creating a report, however, it is important not only to add text but also to structure it properly: headings, indents and line spacing all affect the readability of the document. With FPDF, you can set formatting parameters, control the arrangement of elements and customize the style of the document.
FPDF is similar in principle to HTML. Those who are already familiar with HTML can readily generate PDF documents of any complexity with FPDF, because the code is structured much like HTML markup: headers, text, images and tables are added in a similar way. Those who are not familiar with HTML need not worry: an LLM can help you compose the code that generates the desired document layout.
- The following example demonstrates how to generate a report with a header and body text. Running this code in any IDE with Python support creates a PDF file containing the specified header and text:
from fpdf import FPDF  # Import the FPDF library
pdf = FPDF()  # Create a PDF document
pdf.add_page()  # Add a page
pdf.set_font("Arial", style="B", size=16)  # Set font: Arial, bold, size 16
pdf.cell(200, 10, "Project Report", ln=True, align="C")  # Create the title and center it
pdf.set_font("Arial", size=12)  # Switch to regular Arial, size 12
pdf.multi_cell(0, 10, "This document contains data on the results of project file verification...")  # Add multi-line text
pdf.output(r"C:\reports\report.pdf")  # Save the PDF file

When preparing reports, keep in mind that the data from which the document is built is rarely static. Headers and text blocks (Fig. 7.2-14) are often formed dynamically, receiving their values at the Transform stage of the ETL process.
Generating the document in code lets you include up-to-date information: the project name, the report generation date, and information about participants or current status. Variables in the code insert this data automatically in the required places in the report, removing the need for manual editing before sending.
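As a minimal sketch of this idea, the snippet below fills a report header from values that would normally come from the Transform stage. The `report_meta` dictionary, its keys and sample values, and the helper names `build_header_lines` and `write_report` are hypothetical and exist only for illustration; the FPDF calls are the same ones used in the example above.

```python
from datetime import date


def build_header_lines(meta):
    """Turn report metadata into the text lines placed under the title."""
    return [
        f"Project: {meta['project']}",
        f"Report date: {meta['report_date'].isoformat()}",
        f"Status: {meta['status']}",
        "Participants: " + ", ".join(meta["participants"]),
    ]


def write_report(meta, path):
    """Render the title and the dynamic header lines into a PDF file."""
    # Imported here so that build_header_lines works without fpdf installed
    from fpdf import FPDF

    pdf = FPDF()
    pdf.add_page()
    pdf.set_font("Arial", style="B", size=16)
    pdf.cell(0, 10, f"{meta['project']}: Progress Report", ln=True, align="C")
    pdf.set_font("Arial", size=12)
    for line in build_header_lines(meta):
        pdf.cell(0, 8, line, ln=True)  # One full-width line per metadata item
    pdf.output(path)


# Hypothetical values; in a pipeline they would be produced at the Transform stage
report_meta = {
    "project": "Example Building A",
    "report_date": date(2024, 1, 31),
    "status": "In progress",
    "participants": ["Design team", "General contractor"],
}
header_lines = build_header_lines(report_meta)
```

Calling `write_report(report_meta, "report.pdf")` would then write the file; the path is a placeholder. Keeping the text assembly separate from the rendering makes it possible to check the generated lines without producing a PDF.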
In addition to simple text and headings, tables occupy a special place in project documentation. Almost every document contains structured data, from object descriptions to inspection results. Generating tables automatically from the data of the Transform stage not only speeds up document preparation but also minimizes errors when transferring information. FPDF lets you insert tables into PDF files (as text or pictures), setting cell borders, column sizes and fonts (Fig. 7.2-15). This is especially convenient when working with dynamic data, where the number of rows and columns can vary depending on the purpose of the document.
- The following example shows how to automate the creation of tables, e.g. a list of materials, estimates or parameter test results:
from fpdf import FPDF  # Import the FPDF library
data = [
    ["Item", "Quantity", "Price"],  # Column headings
    ["Concrete", "10 m³", "$500"],  # First row
    ["Rebar", "2 tons", "$600"],  # Second row
    ["Brick", "5,000 pieces", "$750"],  # Third row
]
pdf = FPDF()  # Create a PDF document
pdf.add_page()  # Add a page
pdf.set_font("Arial", size=12)  # Set the font
for row in data:  # Iterate over the table rows
    for item in row:  # Go through the cells in the row
        pdf.cell(60, 10, item, border=1)  # Create a bordered cell, width 60 and height 10
    pdf.ln()  # Move to the next line
pdf.output(r"C:\reports\table.pdf")  # Save the PDF file
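The fixed cell width of 60 fits three columns on an A4 page, but a table with more columns would run past the right margin. One possible approach, sketched here under the assumption of an A4 portrait page (210 mm wide) with 10 mm left and right margins, is to divide the usable width evenly between the columns. The function name `column_widths` and its default values are illustrative and not part of FPDF.

```python
def column_widths(n_columns, page_width=210.0, margin=10.0):
    """Split the usable page width (in mm) evenly between the columns."""
    if n_columns < 1:
        raise ValueError("a table needs at least one column")
    usable = page_width - 2 * margin  # Width left between the two margins
    return [usable / n_columns] * n_columns


widths = column_widths(4)  # Four equal columns across 190 mm
```

Inside the drawing loop, each cell would then use its own width, for example `pdf.cell(widths[i], 10, item, border=1)`. With a live `FPDF` object, the usable width can also be read from the document itself as `pdf.w - pdf.l_margin - pdf.r_margin`.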

In real reporting scenarios, tables are usually generated dynamically from information obtained at the data transformation stage. In the example shown (Fig. 7.2-15), the table is inserted into the PDF document in static form: the data was placed in the data list at the top of the code. In real conditions, this variable is filled automatically, e.g. after grouping a dataframe.
In practice, such tables are often built from structured data coming from various dynamic sources: databases, Excel files, APIs or the results of analytical calculations. Most often, at the Transform stage of ETL, data is aggregated, grouped or filtered, and only then turned into totals in the form of graphs or two-dimensional tables displayed in reports. This means that the table content can change depending on the selected parameters, analysis period, project filters or user settings.
The use of dynamic dataframes and datasets in the Transform stage makes the reporting process in the Load stage as flexible, scalable and easily repeatable as possible without the need for manual intervention.
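To illustrate, the sketch below converts aggregated records into the list-of-rows form used by the table example. It assumes the Transform stage hands over a list of dictionaries, which is the shape that pandas `DataFrame.to_dict("records")` returns; the field names, the sample values and the helper `records_to_rows` are hypothetical.

```python
def records_to_rows(records, columns):
    """Build table rows (header first) from a list of dict records.

    `columns` maps each record key to the heading shown in the report.
    """
    header = list(columns.values())
    body = [[str(record[key]) for key in columns] for record in records]
    return [header] + body


# Hypothetical output of a grouping step, e.g. total cost per material
records = [
    {"material": "Concrete", "total_cost": 500},
    {"material": "Rebar", "total_cost": 600},
]
data = records_to_rows(records, {"material": "Item", "total_cost": "Cost, $"})
```

The resulting `data` can be passed to the same nested loop that draws the cells, so the table grows or shrinks with the input. Values are converted to strings because the cells are drawn as text.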
Besides tables and text, FPDF also supports adding images, which lets you embed graphs generated with Matplotlib or the other visualization libraries considered above into the report. Any graphs, charts and diagrams can be added to the document through code.
- Using the Python library FPDF, let's add a graph pre-generated with Matplotlib to the PDF document:
import matplotlib.pyplot as plt  # Import Matplotlib to create plots
from fpdf import FPDF  # Import the FPDF library
fig, ax = plt.subplots()  # Create the figure and axes of the chart
categories = ["Concrete", "Rebar", "Brick"]  # Category names
values = [50000, 60000, 75000]  # Category values
ax.bar(categories, values)  # Create a bar chart
plt.ylabel("Value, $")  # Label the Y axis
plt.title("Cost Distribution")  # Add a title
plt.savefig(r"C:\reports\chart\chart\chart.png")  # Save the chart as an image
pdf = FPDF()  # Create a PDF document
pdf.add_page()  # Add a page
pdf.set_font("Arial", size=12)  # Set the font
pdf.cell(200, 10, "Cost Chart", ln=True, align="C")  # Add a header
pdf.image(r"C:\reports\chart\chart\chart.png", x=10, y=30, w=100)  # Insert the image into the PDF (x, y: coordinates, w: width)
pdf.output(r"C:\reports\chart_report.pdf")  # Save the PDF file

FPDF makes document preparation and its underlying logic transparent, fast and convenient. Templates built into the code generate documents with up-to-date data, eliminating the need for manual filling.
With ETL automation, professionals can focus on analyzing data and making decisions instead of spending time on manual reporting or on choosing the right user-interface tool for a particular data silo.
Thus, the FPDF library provides a flexible tool for the automated creation of documents of any complexity, from short technical reports to complex analytical summaries with tables and charts. This not only speeds up document flow but also significantly reduces the probability of errors associated with manual data entry and formatting.