Block diagrams of processes and the effectiveness of conceptual frameworks
19 February 2024
The majority of data (80%) in companies is created in unstructured formats, which slows its flow between systems or often blocks it entirely. For that reason we convert textual, unstructured and semi-structured data into structured form to make data processing more efficient.
Just as specialists often do not know how to translate multi-format data into structured formats, they often do not know how to structure their own requirements and wishes, and so leave them in text form throughout the process.
In the same way that we converted data from unstructured text into structured form, in our requirements process we will convert textual requirements into a structured format at the "logical and physical level".
In tabular form, we will describe the requirements for each system in the form of attributes and their boundary values.
In our example, let's take a closer look at the needs of a quality control engineer who uses a construction quality management system (CQMS) to ensure that the standards and requirements of a building product, in this case window systems, are met.
As an example, consider some important requirements for attributes of window system type entities in CQMS: energy efficiency, acoustic performance and warranty period. Each category includes certain standards and specifications that must be considered when designing and installing window systems.
The data requirements that the QA engineer sets up in the form of a table have the following boundary values:
- The window energy efficiency class attribute ranges from "A++", denoting the highest efficiency, to "B", considered the minimum acceptable level; the allowed classes are given by the list ["A++", "A+", "A", "B"].
- The acoustic insulation attribute, measured in decibels and showing a window's ability to reduce street noise, is defined by the regular expression \d{2}dB.
- The warranty period attribute of the window type entity starts at five years, the minimum allowed when selecting a product; its values are given as a list, e.g. ["5 years", "10 years", etc.].
With these attributes in place, energy efficiency classes below "B", such as "C" or "D", will not pass inspection when new data arrives. In the data or documents that reach the quality control engineer, the acoustic insulation of windows must be given as a two-digit number followed by the suffix "dB", such as "35dB" or "40dB"; values outside this format, such as "9dB" or "100dB", are not acceptable. The warranty period must be at least "5 years"; shorter periods such as "3 years" or "4 years" do not meet the requirements that the quality engineer has described in the table.
To check incoming values against the boundary values from the requirements during validation, we use regular expressions (as for the acoustic insulation of windows), which verify data consistency and integrity against predefined rules.
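The three checks described above can be sketched in Python as follows. This is a minimal illustration, not the CQMS implementation: the function and constant names are hypothetical, and the "N years" text format for the warranty period is an assumption based on the example values.

```python
import re

# Boundary values taken from the requirements above; all names are
# illustrative assumptions, not part of any real CQMS API.
ALLOWED_ENERGY_CLASSES = ["A++", "A+", "A", "B"]
ACOUSTIC_PATTERN = re.compile(r"\d{2}dB")
WARRANTY_PATTERN = re.compile(r"(\d+) years")  # assumed text format
MIN_WARRANTY_YEARS = 5


def check_energy_class(value: str) -> bool:
    """Accept only classes from the allowed list."""
    return value in ALLOWED_ENERGY_CLASSES


def check_acoustic_insulation(value: str) -> bool:
    """Accept exactly two digits followed by 'dB'."""
    # fullmatch requires the whole string to match, so "100dB" is rejected.
    return ACOUSTIC_PATTERN.fullmatch(value) is not None


def check_warranty_period(value: str) -> bool:
    """Accept 'N years' where N is at least the minimum."""
    match = WARRANTY_PATTERN.fullmatch(value)
    return match is not None and int(match.group(1)) >= MIN_WARRANTY_YEARS
```

Using fullmatch rather than search matters here: a plain search for \d{2}dB would find "00dB" inside "100dB" and wrongly accept it.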
Regular expressions (RegEx) are used in programming languages, including Python (the re module), to find and manipulate strings. RegEx is like a detective in the world of strings, able to identify patterns in text with precision.
In regular expressions, letters are described directly using the corresponding characters of the alphabet, while numbers can be represented using the special character \d, which corresponds to any digit from 0 to 9. Square brackets are used to indicate a range of letters or digits, e.g., [a-z] for any lowercase letter of the Latin alphabet or [0-9], which is equivalent to \d. \D matches any non-digit character, and \W matches any non-word character, that is, anything other than a letter, a digit or an underscore.
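A short sketch of these character classes using Python's re module. The sample strings are invented for illustration.

```python
import re

sample = "Window W-12: 35dB, class A+"  # made-up sample text

# \d matches a single digit; [0-9] is equivalent for ASCII text.
digits = re.findall(r"\d", sample)
same_digits = re.findall(r"[0-9]", sample)

# [a-z] matches one lowercase Latin letter.
lowercase = re.findall(r"[a-z]", sample)

# \D matches a non-digit; the + groups consecutive non-digits into one run.
non_digits = re.findall(r"\D+", "35dB")

# \W matches anything that is not a letter, digit or underscore.
non_word = re.findall(r"\W", "A+ / 35dB")

print(digits, lowercase, non_digits, non_word)
```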
Popular RegEx use cases:
- Checking an email address: To check if the string is a valid email address, the template ^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$ can be used.
- Date Extraction: The \b\d{2}\.\d{2}\.\d{4}\b template can be used to extract a date from the text in DD.MM.YYYY format.
- Phone Number Verification: To verify phone numbers in the format +49(000)000-0000, the pattern will look like \+\d{2}\(\d{3}\)\d{3}-\d{4}.
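The three use cases can be tried out with a few lines of Python. This is a sketch only: the helper names and sample inputs are invented, and the date pattern used is the plain DD.MM.YYYY form \d{2}\.\d{2}\.\d{4} wrapped in word boundaries.

```python
import re

EMAIL = re.compile(r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$")
DATE = re.compile(r"\b\d{2}\.\d{2}\.\d{4}\b")
PHONE = re.compile(r"\+\d{2}\(\d{3}\)\d{3}-\d{4}")


def is_email(text: str) -> bool:
    """True if the whole string looks like an email address."""
    return EMAIL.match(text) is not None


def extract_dates(text: str) -> list:
    """Return every DD.MM.YYYY date found in the text."""
    return DATE.findall(text)


def is_phone(text: str) -> bool:
    """True if the whole string has the form +49(000)000-0000."""
    return PHONE.fullmatch(text) is not None


print(is_email("qa.engineer@example.com"))
print(extract_dates("Installed on 19.02.2024, inspected 05.03.2024."))
print(is_phone("+49(000)000-0000"))
```

Note that these patterns check only the shape of a value: the date pattern would also accept "99.99.2024", so a calendar check is still needed where real dates matter.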
By translating the QA engineer's requirements into the format of attributes and their boundary values, we have transformed them from their original text format into an organised and structured table, thus facilitating future validation and analysis of incoming data. Once requirements exist, data that has not been validated is no longer accepted, while validated data can be automatically transferred to systems for further processing.
Now we will convert the requirements of all the specialists in our new window installation process into organised lists of attributes and add these lists of required attributes to our flowchart for each specialist.
By adding all attributes to one common process table, we transform the information previously presented in the form of text and dialogue into a structured and systematized form of tables linked by means of a block diagram.
The data requirements are now clearly structured and the next step is to collect the data and prepare it for the validation process. The requirements for each system should be communicated to the specialists who create the data for those specific systems. Only when the requirements are in hand can the data creation phase begin.
Checking the presence of attributes and their values in the created data lets us confirm that the information provided has reached the required level of quality and is ready for the use cases that matter to the professionals at a particular stage of the process of adding a new window entity to the project. If the data meets the requirements, it gets a green light: we can automatically route it to the people and systems for which it was intended.
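One way to picture this presence-and-value check is a requirements table held as data, with one regular expression per attribute. The sketch below is hypothetical throughout: the attribute names, the record layout, the "N years" warranty format and the routing labels are assumptions for illustration, not part of any particular system.

```python
import re

# Hypothetical requirements table: attribute name -> regular expression.
REQUIREMENTS = {
    "energy_efficiency_class": r"A\+{0,2}|B",      # A, A+, A++ or B
    "acoustic_insulation": r"\d{2}dB",              # two digits + "dB"
    "warranty_period": r"([5-9]|[1-9]\d+) years",   # 5 years or longer
}


def validate_record(record: dict) -> list:
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    for attribute, pattern in REQUIREMENTS.items():
        if attribute not in record:
            problems.append(f"missing attribute: {attribute}")
        elif re.fullmatch(pattern, str(record[attribute])) is None:
            problems.append(
                f"invalid value for {attribute}: {record[attribute]!r}"
            )
    return problems


def route(record: dict) -> str:
    """Forward valid records; send the rest back for correction."""
    return "forward" if not validate_record(record) else "return_to_author"


window = {
    "energy_efficiency_class": "A+",
    "acoustic_insulation": "35dB",
    "warranty_period": "10 years",
}
print(route(window))
```

Keeping the rules in a table rather than in code means each specialist's requirements can be added or changed without touching the validation logic.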