The Dashboard page provides visibility into key quality metrics for a dataset. Metrics are plotted as a time series so that you can spot trends and judge whether a sudden jump or dip is expected or a sign of a problem. To view the dashboard, open a specific dataset and select the Dashboard tab, as shown in the image below:
Fig: Dashboard tab to view quality metrics
The dashboard presents the following quality metrics in a time series graph:
- Requests: Total number of HTTP requests made to the server to populate the full dataset. A request is either:
  - Successful: the server served the requested content.
  - Failed: the server returned an error and could not serve the requested content. A sudden jump in failed requests may indicate server issues or changes in the underlying webpage from which the data is sourced.
- Rows: Total number of rows in the dataset. Each record (line) in a dataset is referred to as a row.
- Accuracy: A percentage score that measures whether the sourced data complies with the expected data format. Compliance is validated using rules assigned to column headers in the dataset. If no rules are assigned to any column header, the accuracy is null.
- Fill Rate: A percentage score that measures data density. An empty cell has a Fill Rate of 0 (or 0%), and a cell with data has a Fill Rate of 1 (or 100%). The aggregated Fill Rate score for the entire dataset is the average across all cells.
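As a rough illustration of how the two percentage scores above could be computed, here is a minimal Python sketch. It is not the product's implementation. The function names, the row representation (a list of dicts), and the definition of an "empty" cell (None or whitespace-only text) are assumptions made for this example. The accuracy calculation in particular is an assumption: it takes the share of non-empty cells in rule-bearing columns that pass their rule, which the description above does not spell out.

```python
from typing import Callable, Optional

# Hypothetical representation: one dict per row, keyed by column header.
Row = dict[str, Optional[str]]
Rule = Callable[[str], bool]


def is_filled(cell: Optional[str]) -> bool:
    """Assumption: a cell is empty if it is None or whitespace-only."""
    return cell is not None and cell.strip() != ""


def fill_rate(rows: list[Row], columns: list[str]) -> Optional[float]:
    """Average of per-cell fill (0 or 1) across all cells, as a percentage."""
    cells = [row.get(col) for row in rows for col in columns]
    if not cells:
        return None  # no cells to measure
    return 100.0 * sum(is_filled(c) for c in cells) / len(cells)


def accuracy(rows: list[Row], rules: dict[str, Rule]) -> Optional[float]:
    """Percentage of checked cells that satisfy the rule for their column.

    Returns None (null) when no rules are assigned, as described above.
    Assumption: empty cells are skipped rather than counted as failures.
    """
    if not rules:
        return None
    results = [
        rule(row[col])
        for row in rows
        for col, rule in rules.items()
        if is_filled(row.get(col))
    ]
    if not results:
        return None  # nothing to validate
    return 100.0 * sum(results) / len(results)


# Made-up sample data for illustration only.
sample_rows: list[Row] = [
    {"name": "Ada", "price": "10"},
    {"name": "", "price": "abc"},
]
sample_fill = fill_rate(sample_rows, ["name", "price"])        # 3 of 4 cells filled
sample_accuracy = accuracy(sample_rows, {"price": str.isdigit})  # 1 of 2 prices valid
sample_no_rules = accuracy(sample_rows, {})                     # no rules assigned
```

In this sample, three of the four cells contain data, so the fill rate is 75%. One of the two price values is numeric, so the accuracy under the single hypothetical rule is 50%.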
Fig: Quality dashboard
The top widgets in the quality dashboard show different metrics for the selected dataset. For example, the Rows widget shows the total number of rows in the dataset. This total can be compared against the average of all datasets that are loaded in the time series graph beneath. By default, the comparison is made against other datasets generated on the same day. However, you can switch to “Last 7 days” or “Last 30 days” using the dropdown above the graph.
Here, a day means the local calendar day in the logged-in user's timezone. The graph and the underlying numbers may therefore look different for two viewers in different timezones.
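The timezone effect can be illustrated with a short Python sketch. The function name `local_day` and the sample timestamp are hypothetical, and fixed UTC offsets stand in for real timezones to keep the example self-contained; this is not how the dashboard is known to compute days internally.

```python
from datetime import date, datetime, timedelta, timezone, tzinfo


def local_day(generated_at: datetime, viewer_tz: tzinfo) -> date:
    """Return the calendar day of a timestamp as seen in the viewer's timezone."""
    return generated_at.astimezone(viewer_tz).date()


# Hypothetical dataset generated late in the evening, UTC.
generated_at = datetime(2024, 3, 1, 23, 30, tzinfo=timezone.utc)

# Fixed offsets used as stand-ins for two viewers' timezones.
viewer_east = timezone(timedelta(hours=9))   # UTC+09:00
viewer_west = timezone(timedelta(hours=-8))  # UTC-08:00

day_east = local_day(generated_at, viewer_east)  # falls on March 2
day_west = local_day(generated_at, viewer_west)  # falls on March 1
```

The same dataset lands on different days for the two viewers, so a "same day" comparison would group it with different datasets for each of them.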
The graph shows metrics per dataset. To see daily trends across all datasets generated in a day, change the filter from “Dataset” to “Days”, which aggregates the value across all datasets for that day.
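Conceptually, the “Days” view groups datasets by day and combines their values. The sketch below shows one way to express that in Python. The record layout, the function name `aggregate_by_day`, and the choice of combining function (sum for row counts, mean for percentage scores) are assumptions, since the exact aggregation used by the dashboard is not specified here.

```python
from collections import defaultdict
from datetime import date
from statistics import mean
from typing import Callable


def aggregate_by_day(
    datasets: list[dict],
    metric: str,
    combine: Callable[[list[float]], float],
) -> dict[date, float]:
    """Group per-dataset metric values by day and combine each group."""
    by_day: dict[date, list[float]] = defaultdict(list)
    for dataset in datasets:
        by_day[dataset["day"]].append(dataset[metric])
    # Sort by day so the result reads like a time series.
    return {day: combine(values) for day, values in sorted(by_day.items())}


# Made-up per-dataset values for illustration only.
sample_datasets = [
    {"day": date(2024, 3, 1), "rows": 100, "fill_rate": 80.0},
    {"day": date(2024, 3, 1), "rows": 300, "fill_rate": 90.0},
    {"day": date(2024, 3, 2), "rows": 50, "fill_rate": 70.0},
]

rows_per_day = aggregate_by_day(sample_datasets, "rows", sum)
fill_rate_per_day = aggregate_by_day(sample_datasets, "fill_rate", mean)
```

Passing the combining function as a parameter keeps the sketch neutral about whether a given metric should be summed or averaged per day.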
Topics in this section: