Quality Rules

Quality Rules help ensure that your collected data is accurate, consistent and reliable. By applying quality rules, you can automatically detect errors, maintain high-quality data and make better decisions based on validated information.

You have two options for creating and applying rules:

  • Generate AI-suggested rules

  • Create custom rules

When you choose AI-suggested rules, the system sends 10 random records from the selected run to the AI as sample data. The AI analyzes the sample, generates rules and applies them to subsequent runs, so that similar patterns and issues are automatically monitored and validated in future data collection.
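The sampling step can be sketched as below; `sample_records_for_ai` and the record structure are illustrative assumptions, not the platform's actual API.

```python
import random

def sample_records_for_ai(records, sample_size=10, seed=None):
    """Pick up to `sample_size` random records from a run to use as AI
    sample data (hypothetical helper; the platform's internals may differ)."""
    rng = random.Random(seed)
    if len(records) <= sample_size:
        return list(records)
    return rng.sample(records, sample_size)

# Records from a run, each a dict of column name -> value.
run_records = [{"user_id": i, "country": "USA"} for i in range(100)]
sample = sample_records_for_ai(run_records)
print(len(sample))  # 10
```

Runs with 10 or fewer records would simply be sent whole under this sketch.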

Types of Rules

Generate Suggested Rule
  - Leverages AI to analyze data patterns and suggest rules automatically.
  - Helps identify potential issues that may not be obvious manually.
  - You can choose to apply a specific rule or select all suggested rules to ensure comprehensive coverage.

Custom Rule
  - Provides complete control over your data validation logic.
  - Allows you to manually create and configure rules to meet your specific needs.
  - Ideal when you have known criteria or business rules that must be enforced consistently.

Quality Rule Steps

  • Step 1: Navigate to the Datasets section.
  • Step 2: Select the run for which you want to set a rule.
  • Step 3: Click on Set QA Rule.
  • Step 4: You will be directed to the Generate Rule section, where you can select the page to which you want to apply the rule.
  • Step 5: Either create custom rules or ask AI to generate rules.
  • Step 6: Once the rule is applied, view the quality status for the run.

Note

  • You can preview and test both AI-suggested and custom rules to ensure they work correctly before applying them to runs.

Managing Quality Rules

All applied rules can be accessed and managed in the Quality Rules section of the platform. Here, you can:

  • View a list of all active and inactive rules.

  • View the columns and pages that you have selected for each rule.

  • Activate or deactivate rules depending on project needs.

  • Delete rules that are no longer relevant.

By actively managing quality rules, you can ensure that your data remains accurate, consistent and aligned with the business objectives. This feature provides flexibility, whether you prefer full manual control or want to leverage AI for smarter rule suggestions.

Note

  • Users with Manager and Trusted roles can create, test, delete and manually validate data quality rules. Users with the Viewer role can only view these rules and cannot perform any of these actions.

Additional detail

  • Configure your alert settings via the notification button to monitor the status of your data health. Once enabled, you will receive regular email updates tailored to your specific monitoring requirements.

Types of Quality Status

Not Started: Displayed when the run has not yet started and is still preparing to begin processing.

Processing: Displayed when the run is actively validating the data and processing is in progress.

Skipped: Displayed when no rule is set for the run, when the run is merged or imported or when it contains JSON-formatted data. Also used as the default status when no other status applies.

Success: Displayed when all applied rules have been validated successfully.

Failed: Displayed when any one of the applied rules fails, even if the others have passed.
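The status conditions above can be sketched as a small resolver; the function signature, flag names and exact precedence are assumptions for illustration, not the platform's implementation.

```python
def quality_status(rule_results, *, has_rules=True, merged_or_imported=False,
                   json_data=False, started=True, in_progress=False):
    """Resolve a run's quality status from the documented conditions.
    (Illustrative sketch; argument names and precedence are assumptions.)"""
    if not has_rules or merged_or_imported or json_data:
        return "Skipped"      # no rule set, merged/imported run, or JSON data
    if not started:
        return "Not Started"
    if in_progress:
        return "Processing"
    # Failed if any one rule fails, even when all the others pass.
    return "Success" if all(rule_results) else "Failed"

print(quality_status([True, True]))   # Success
print(quality_status([True, False]))  # Failed
```

Note how a single failing rule is enough to mark the whole run as Failed.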

Limitations

  • Quality Rules can be applied only to standard runs. Runs that are imported, merged or use JSON-formatted data do not support rule creation.
  • Runs larger than 4GB can have rules applied, but only if the QUALITY_RULE_USE_SAMPLE report parameter is set. Once set, QA validation will be performed only on the specified number of rows.
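The sampling behavior of QUALITY_RULE_USE_SAMPLE can be modeled roughly as follows; reading it from an environment variable and the helper name are assumptions for illustration, since the source only describes it as a report parameter.

```python
import os

FOUR_GB = 4 * 1024 ** 3  # size threshold above which full validation is unavailable

def rows_to_validate(total_rows, run_size_bytes):
    """Return how many rows QA validation should cover.
    QUALITY_RULE_USE_SAMPLE is modeled as an environment variable here;
    in the platform it is a report parameter."""
    sample = os.environ.get("QUALITY_RULE_USE_SAMPLE")
    if run_size_bytes <= FOUR_GB:
        return total_rows                   # small runs: validate everything
    if sample is None:
        raise ValueError("runs larger than 4GB require QUALITY_RULE_USE_SAMPLE")
    return min(int(sample), total_rows)     # large runs: validate a sample
```

Under this sketch, small runs are always validated in full, while oversized runs validate only the configured number of rows.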

Categories of Rules

Value Set Comparisons: Ensures that column values conform to a specific set, either by being included in the set, containing all set values or matching the set exactly. Example: a “Country” column should only contain valid country names from a predefined list (e.g., USA, Canada, UK).

Value Checks: Checks individual column values for type, uniqueness, presence or absence (null or not null), inclusion in a set or falling within a defined range. Example: an “Age” column should contain numbers between 0–120; “Email” should not be null; each “User ID” should be unique.

String Operations: Validates string lengths and ensures values match or do not match specified patterns. Example: “Phone Number” should be exactly 10 digits; “Postal Code” should match a regex pattern for valid codes.

Multi-Column Expectations: Ensures relationships and uniqueness across multiple columns, such as sums, pair comparisons or unique combinations. Example: “Start Date” must be before “End Date”; the combination of “First Name” + “Last Name” should be unique in the table.

Table-Level Expectations: Checks structural aspects of the table, including column existence, column count, column order and row count. Example: a table must have “Name,” “Email,” and “Phone” columns; the total number of rows should match the expected batch size.

Statistical Measures: Validates statistical properties of column data, such as min, max, median, standard deviation, sum and distribution divergence. Example: the “Salary” column should have a median between $30,000–$100,000; the standard deviation of ratings should stay within an expected bound.

Value Distribution: Ensures the distribution of values meets certain criteria, including the proportion of non-null or unique values, the most common values and quantiles. Example: at least 95% of “Order Status” values should be non-null; top-selling products should appear in the most-common-values set.

Statistical Outliers: Detects outliers using Z-scores or other statistical measures to ensure values fall within acceptable ranges. Example: a “Transaction Amount” that is extremely high compared to the rest of the dataset is flagged as an outlier.
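Several of these categories can be sketched as simple checks in plain Python; the helper names below are illustrative, not the platform's rule API.

```python
import re
import statistics

def check_value_set(values, allowed):
    """Value Set Comparison: every value belongs to the allowed set."""
    return all(v in allowed for v in values)

def check_range(values, lo, hi):
    """Value Check: every numeric value falls within [lo, hi]."""
    return all(lo <= v <= hi for v in values)

def check_pattern(values, pattern):
    """String Operation: every value fully matches the regex pattern."""
    rx = re.compile(pattern)
    return all(rx.fullmatch(v) is not None for v in values)

def check_before(starts, ends):
    """Multi-Column Expectation: each start precedes its paired end."""
    return all(s < e for s, e in zip(starts, ends))

def zscore_outliers(values, threshold=3.0):
    """Statistical Outlier: values whose Z-score magnitude exceeds threshold."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return []
    return [v for v in values if abs(v - mean) / sd > threshold]

print(check_value_set(["USA", "Canada"], {"USA", "Canada", "UK"}))  # True
print(check_range([25, 40, 119], 0, 120))                           # True
print(check_pattern(["0123456789"], r"\d{10}"))                     # True
print(zscore_outliers([10] * 9 + [1000], threshold=2.5))            # [1000]
```

Each check returns a boolean (or a list of flagged values), mirroring how a rule either passes or fails for a run.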