Data quality (DQ) is calculated as the percentage of data points that are valid:
\[ DQ = \left( \frac{VDP}{TDP} \right) \times 100 \]
Where VDP is the number of valid data points and TDP is the total number of data points.
Data quality refers to the condition of a set of values of qualitative or quantitative variables. High data quality means that the data is accurate, complete, reliable, and relevant. It is crucial for effective decision-making, operational efficiency, and accurate reporting. Poor data quality can lead to incorrect conclusions, wasted resources, and missed opportunities.
Let's say the number of valid data points (VDP) is 80, and the total number of data points (TDP) is 100. Using the formula:
\[ DQ = \left( \frac{80}{100} \right) \times 100 = 80 \% \]
So, the data quality (DQ) is 80%.
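The calculation above can be sketched in Python. This is a minimal illustration, not a standard implementation: the function name `data_quality`, the `is_valid` predicate, and the sample readings are all assumptions made for the example, and "valid" is taken here to mean simply "not missing".

```python
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def data_quality(records: Iterable[T], is_valid: Callable[[T], bool]) -> float:
    """Return DQ = (VDP / TDP) * 100 for the given records."""
    total = 0  # TDP: total number of data points
    valid = 0  # VDP: number of valid data points
    for record in records:
        total += 1
        if is_valid(record):
            valid += 1
    if total == 0:
        # The ratio is undefined when there are no data points at all.
        raise ValueError("DQ is undefined for an empty dataset")
    return 100 * valid / total


# Hypothetical sample: 100 readings, 20 of which are missing (None).
readings = [21.5] * 80 + [None] * 20
dq = data_quality(readings, lambda r: r is not None)
print(f"DQ = {dq:.0f}%")  # DQ = 80%
```

In practice the validity rule would be whatever the dataset requires (range checks, format checks, and so on); only the predicate changes, not the ratio.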
Definition: The data quality score formula calculates the percentage of passed data quality checks out of all executed checks.
Formula: \( \text{Data Quality Score} = \frac{\text{Passed Checks}}{\text{Total Checks}} \times 100 \)
Example: \( \text{Data Quality Score} = \frac{90}{100} \times 100 = 90\% \)
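As a rough sketch of the same score computed from check results, the snippet below assumes each executed check is recorded as a name mapped to a pass/fail boolean. The check names and the `quality_score` helper are hypothetical and do not come from any particular data quality tool.

```python
from typing import Mapping


def quality_score(check_results: Mapping[str, bool]) -> float:
    """Return the percentage of executed checks that passed."""
    total = len(check_results)
    if total == 0:
        raise ValueError("no checks were executed")
    passed = sum(check_results.values())  # True counts as 1, False as 0
    return 100 * passed / total


# Hypothetical run: 10 checks executed, 9 passed.
results = {f"check_{i}": True for i in range(9)}
results["no_duplicate_ids"] = False
print(f"Data Quality Score = {quality_score(results):.0f}%")  # Data Quality Score = 90%
```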
Definition: The cost of quality (COQ) quantifies the total cost of quality-related efforts and issues.
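Formula: \( \text{COQ} = \text{CoGQ} + \text{CoPQ} \), where CoGQ is the cost of good quality (prevention and appraisal costs) and CoPQ is the cost of poor quality (internal and external failure costs).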
Example: \( \text{COQ} = 5000 + 2000 = 7000 \)
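The sum can be sketched as below, assuming the common breakdown in which the cost of good quality (CoGQ) covers prevention and appraisal, and the cost of poor quality (CoPQ) covers internal and external failures. The function name and the way the 5000 and 2000 are split into components are illustrative assumptions; the source gives only the two totals.

```python
def cost_of_quality(
    prevention: float,
    appraisal: float,
    internal_failure: float,
    external_failure: float,
) -> float:
    """Return COQ = CoGQ + CoPQ."""
    cogq = prevention + appraisal               # cost of good quality
    copq = internal_failure + external_failure  # cost of poor quality
    return cogq + copq


# Hypothetical split of the example totals: CoGQ = 5000, CoPQ = 2000.
coq = cost_of_quality(
    prevention=3000,
    appraisal=2000,
    internal_failure=1500,
    external_failure=500,
)
print(coq)  # 7000
```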
Definition: Data quality is measured along several dimensions such as accuracy, completeness, consistency, timeliness, validity, and uniqueness.
Formula: \( \text{Data Quality} = \frac{\text{Sum of Dimension Scores}}{\text{Number of Dimensions}} \)
Example: \( \text{Data Quality} = \frac{85 + 90 + 80 + 95 + 88 + 92}{6} = \frac{530}{6} \approx 88.33 \)
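A short sketch of this unweighted average follows. Pairing each example score with a dimension in the order listed is an assumption made for illustration, as is the helper name `overall_quality`; the source does not say which score belongs to which dimension.

```python
from typing import Mapping


def overall_quality(dimension_scores: Mapping[str, float]) -> float:
    """Return the unweighted mean of the per-dimension scores."""
    if not dimension_scores:
        raise ValueError("at least one dimension score is required")
    return sum(dimension_scores.values()) / len(dimension_scores)


# Assumed pairing of the example scores with the six dimensions.
scores = {
    "accuracy": 85,
    "completeness": 90,
    "consistency": 80,
    "timeliness": 95,
    "validity": 88,
    "uniqueness": 92,
}
print(f"Data Quality = {overall_quality(scores):.2f}")  # Data Quality = 88.33
```

Because every dimension carries equal weight here, a low score on one dimension can be masked by high scores on the others; a weighted mean would be a natural variation if some dimensions matter more.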