Checks Lifecycle: Deep Dive
In this section we will explain how a background check unfolds over time, detailing which circumstances affect the check's execution.
Check Statuses: Overview
Every check has a status that indicates which part of the lifecycle it is in. We will explain each part in turn, with its corresponding status, and then recap at the end of this section.
Queues and Priorities
First of all, when a check is created it can be enqueued and remain enqueued for some time. While the check is enqueued, its status is not_started. This queue system lets us monitor in real time how our data sources are faring with volume, and prevents us from exceeding their capacity.
We decide how the queue system handles a check based on the check’s priority. More specifically, the priority determines whether the check is enqueued and how long it remains enqueued.
There are 3 priorities:
High: The fastest way to execute a check. It skips all queues and is limited only by how many other high-priority checks are currently running.
Medium: The default priority when none is specified. These checks are enqueued, and there is no limit on how many checks can be enqueued at any given time.
Low: Similar to medium, but these checks are enqueued in a lower-priority queue, so they remain enqueued longer than those created with medium priority.
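The three priorities can be pictured with a small in-memory model. This is only a sketch: the class and method names are hypothetical, the limit on concurrent high-priority checks is left out, and draining the medium queue strictly before the low queue is an assumption (the text only says low-priority checks wait longer).

```python
from collections import deque


class CheckScheduler:
    """Toy model of the priority rules; not the real queue system."""

    def __init__(self):
        # One queue per enqueued priority. High priority has no queue.
        self.queues = {"medium": deque(), "low": deque()}
        self.running = []

    def submit(self, check_id, priority="medium"):
        """Register a check and return its status right after creation."""
        if priority == "high":
            # High-priority checks skip the queues and start immediately.
            self.running.append(check_id)
            return "in_progress"
        if priority not in self.queues:
            raise ValueError(f"unknown priority: {priority!r}")
        self.queues[priority].append(check_id)
        return "not_started"

    def start_next(self):
        """Start the next enqueued check, or return None if all queues are empty."""
        # Assumption: medium is always drained before low.
        for name in ("medium", "low"):
            if self.queues[name]:
                check_id = self.queues[name].popleft()
                self.running.append(check_id)
                return check_id
        return None
```

In this model a check submitted without a priority is treated as medium, which mirrors the default described above.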
Here is a diagram detailing the information explained above:
Check execution and data sources
When a check starts, we begin collecting data from the data sources specified by the datasets present on the check type. At this point, the check changes its status to in_progress. The check remains in_progress as long as at least one data source is still collecting results. When the last data source yields results, the check finishes and moves to its final status.
Note that data sources also have statuses, so if you only need information from specific data sources, you can watch the status of those data sources instead of waiting for the whole check to finish.
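If only a few data sources matter to you, the idea of watching their individual statuses could look like the sketch below. The shape of the input (a mapping from data source name to status), the data source names in the usage example, and the set of status names treated as "still collecting" are all assumptions made for illustration.

```python
# Assumed names for data source statuses that mean "not finished yet".
PENDING_STATUSES = {"in_progress", "delayed"}


def pending_sources(source_statuses, needed):
    """Return the needed data sources that have not finished yet, sorted by name."""
    unknown = [name for name in needed if name not in source_statuses]
    if unknown:
        raise KeyError(f"data sources not present on the check: {unknown}")
    return sorted(
        name for name in needed if source_statuses[name] in PENDING_STATUSES
    )
```

For example, with hypothetical data sources `criminal_records` and `traffic_fines`, an empty result means every data source you care about has finished, even if the check as a whole is still running.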
This diagram depicts this process:
Things can go wrong
Unfortunately, a check’s processing isn’t always as smooth as shown above. This is due to the volatile nature of the public data sources we query data from. For example, some data sources could be down or slow at the moment when the check is started. It is worth noting that Truora is not in control of the data sources we query, so any setback with any data source is out of our reach.
When a data source is down, we retry collecting from it up to a limit. If this retry limit is reached, we mark the status of that data source as error. In addition, when a check finishes and more than 30% of its data sources end up in status error, the general check status changes to error. Checks that end in error status are not charged.
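The 30% rule can be written down as a short function. This is a sketch rather than the actual implementation: the function names are hypothetical, and `completed` is an assumed name for the status of a check that finishes without crossing the error threshold.

```python
def final_check_status(source_statuses, error_threshold_percent=30):
    """Derive the final check status from the final data source statuses."""
    total = len(source_statuses)
    if total == 0:
        raise ValueError("a check needs at least one data source")
    errors = sum(1 for status in source_statuses if status == "error")
    # Integer arithmetic avoids floating-point surprises at exactly 30%.
    if errors * 100 > error_threshold_percent * total:
        return "error"
    return "completed"  # assumed name for the non-error final status


def is_charged(check_status):
    """Checks that end in error status are not charged."""
    return check_status != "error"
```

Note that the comparison is strict: a check with exactly 30% of its data sources in error does not end in error, because the rule applies only to more than 30%.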
Similarly, when a data source is slow, its status changes to delayed. Depending on the priority of the check, there is a time frame after which, if not all data sources have finished, the general check status changes to delayed.
This time frame duration is as follows, and starts at the moment the check begins collecting data:
A delayed check status means that the slow data sources probably won’t yield any results. In these cases, instead of waiting for the check to complete, it might be appropriate to make a decision with the information the check currently has, or to create a new check later if the information from that particular data source is crucial to you.
When a check enters the delayed status, it can take up to 3 days to finish, depending on the country. When this timeout occurs, the check is forced to finish, and we count delayed data sources as errors and set the final score accordingly.
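The forced finish can be sketched as two small steps: deciding that the timeout has elapsed, and converting delayed data sources into errors. The names below are hypothetical, the 3 days are the upper bound mentioned above (the real value depends on the country), and measuring the timeout from the moment the check became delayed is an assumption.

```python
from datetime import datetime, timedelta

# Upper bound from the text; the actual timeout varies by country.
MAX_DELAY = timedelta(days=3)


def should_force_finish(delayed_since, now, max_delay=MAX_DELAY):
    """Return True once the check has been delayed for at least max_delay."""
    return now - delayed_since >= max_delay


def force_finish(source_statuses):
    """Count every delayed data source as an error; leave the others untouched."""
    return {
        name: ("error" if status == "delayed" else status)
        for name, status in source_statuses.items()
    }
```

The result of `force_finish` contains only final data source statuses, so the overall check status and score can then be computed from it.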
The worst case scenario, where one data source never yields any information, is described by the following diagram:
The complete lifecycle
Finally, here is a diagram detailing the whole check lifecycle:
In addition, the next diagram represents all the possible statuses on a check and how they transition to one another:
To sum up the whole lifecycle, these are some brief descriptions of the check and data source statuses:
|not_started|The check is enqueued and the data collection has not started yet.|
|in_progress|Data is being collected, but some data sources may have finished already.|
|delayed|One or more data sources are taking a long time to query the data. Most data sources will have already finished collecting data.|
||The check finished and 70% or more of the data sources did not end in status error.|
|error|The check finished and more than 30% of the data sources ended in status error.|
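The check statuses and the moves between them can be summarized as a small transition table. This is inferred from the descriptions in this section rather than copied from the transition diagram, and `completed` is an assumed name for the successful final status.

```python
# Allowed moves between check statuses, as inferred from the text.
ALLOWED_TRANSITIONS = {
    "not_started": {"in_progress"},
    "in_progress": {"delayed", "completed", "error"},
    "delayed": {"completed", "error"},
    "completed": set(),  # assumed name; final status
    "error": set(),  # final status
}


def can_transition(current, new):
    """Return True if a check may move from status current to status new."""
    return new in ALLOWED_TRANSITIONS.get(current, set())


def is_final(status):
    """Return True for known statuses that a check never leaves."""
    return status in ALLOWED_TRANSITIONS and not ALLOWED_TRANSITIONS[status]
```

A client that polls a check can stop as soon as `is_final` returns True for the reported status.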
Data source statuses
||Data collection for the data source was triggered.|
||The data source does not fetch any data because it does not have the required inputs to do so.|
|delayed|The data source is taking a long time to query the data.|
||The data source data was fetched successfully and is present on the check details.|
|error|We could not fetch data from that data source.|