Entity Resolution

Data for a given real-world entity can be fragmented across multiple data sources, making it difficult to obtain an accurate and complete representation of that entity within the system. To create and maintain master data about an entity, the system must identify these fragments and assign a unique identity that can be used system-wide to refer to a single real-world entity.

Entity Resolution is a technique for identifying data records, within a single data source or across multiple data sources, that refer to the same real-world entity, and for linking those records together.

How does it work?

In the Clinia platform, entity resolution is driven by resource collections and their corresponding comparison rules. These resource collections are established at the MDM level and serve to consolidate data source profiles into a unified resource type.

For example, a Clinic created under data source 1 might also exist as an Organization in data source 2 and as a Doctor's Office in data source 3. A resource collection tells the system that source records created under any of these profiles can represent the same real-world entity.
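As a rough illustration of the grouping described above, a resource collection can be pictured as a set of (data source, profile) pairs that all map to one unified resource type. This is only a sketch: the class names, the `covers` helper, and the `data-source-1` style keys are hypothetical and are not the platform's actual configuration format.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class SourceProfile:
    """A profile as it exists in one data source (names are hypothetical)."""
    data_source: str
    profile: str


@dataclass(frozen=True)
class ResourceCollection:
    """Groups source profiles that can describe the same kind of real-world entity."""
    resource_type: str
    profiles: frozenset[SourceProfile]

    def covers(self, data_source: str, profile: str) -> bool:
        # True when records of this profile are candidates for resolution
        # within this collection.
        return SourceProfile(data_source, profile) in self.profiles


# The three profiles from the example in the text, grouped in one collection.
clinics = ResourceCollection(
    resource_type="clinic",
    profiles=frozenset({
        SourceProfile("data-source-1", "Clinic"),
        SourceProfile("data-source-2", "Organization"),
        SourceProfile("data-source-3", "Doctor's Office"),
    }),
)
```

With this shape, `clinics.covers("data-source-2", "Organization")` is true, while a profile that was never added to the collection is not considered for resolution.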

To reliably and consistently resolve source records and link them together, resource collections use Unified Records. A unified record uniquely represents a real-world entity across the whole system. As such, it has a unique identifier that downstream applications can use to refer to it. A unified record also contains pointers and metadata about its association with every source record linked to it. This information can be leveraged by the system and its users to build, maintain, and reconcile the master data for this entity.
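The shape of a unified record can be sketched as follows. This is a minimal model under stated assumptions: the field names (`id`, `links`, `linked_at`) and the use of a UUID as the identifier are illustrative choices, not the platform's schema. It only captures what the text states: a unique identifier plus pointers and metadata for each linked source record, and no entity properties of its own.

```python
from __future__ import annotations

import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class SourceRecordLink:
    """Pointer to one source record, plus metadata about the association."""
    data_source: str
    profile: str
    record_id: str
    linked_at: datetime  # hypothetical metadata field


@dataclass
class UnifiedRecord:
    """Holds an identifier and links only; it carries no entity properties."""
    resource_type: str
    id: str = field(default_factory=lambda: str(uuid.uuid4()))
    links: list[SourceRecordLink] = field(default_factory=list)

    def link(self, data_source: str, profile: str, record_id: str) -> None:
        # Record the association together with the time it was made.
        self.links.append(
            SourceRecordLink(data_source, profile, record_id,
                             datetime.now(timezone.utc))
        )


unified = UnifiedRecord(resource_type="clinic")
unified.link("data-source-1", "Clinic", "c-001")
unified.link("data-source-2", "Organization", "org-42")
```

Downstream applications would refer to `unified.id`, while `unified.links` tells the system which source records contribute to the master data for this entity.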

How are records resolved?

To identify when similar source records are created within or across data sources, the system uses comparison rules, referred to as Resolution Rules, to compare records against one another. Resolution Rules define:

  1. Resolution Properties, the properties used for comparison, and their mappings to the source profiles;
  2. Matchers, which define the parameters of each individual property comparison; and
  3. Resolvers, which determine the outcome of the entire comparison process based on the outcomes of the individual property comparisons.
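The three parts of a resolution rule can be sketched in a few lines. Everything here is hypothetical and meant only to show how the pieces relate: the class names, the two matcher functions, the "all properties must match" resolver, and the source field names (`clinicName`, `tel`, `title`, `phoneNumber`) are assumptions, not the platform's rule syntax.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Optional

Matcher = Callable[[Optional[str], Optional[str]], bool]
Resolver = Callable[[dict[str, bool]], bool]


@dataclass
class ResolutionProperty:
    """A comparable property and where to find it in each source profile."""
    name: str
    mappings: dict[tuple[str, str], str]  # (data source, profile) -> source field


def exact(a: Optional[str], b: Optional[str]) -> bool:
    return a is not None and a == b


def case_insensitive(a: Optional[str], b: Optional[str]) -> bool:
    return a is not None and b is not None and a.strip().lower() == b.strip().lower()


def all_match(outcomes: dict[str, bool]) -> bool:
    # Hypothetical resolver: every property comparison must succeed.
    return all(outcomes.values())


@dataclass
class ResolutionRule:
    properties: list[ResolutionProperty]
    matchers: dict[str, Matcher]  # property name -> matcher
    resolver: Resolver

    def extract(self, data_source: str, profile: str, record: dict) -> dict:
        # Map a raw source record onto the rule's resolution properties.
        out = {}
        for prop in self.properties:
            source_field = prop.mappings.get((data_source, profile))
            out[prop.name] = record.get(source_field) if source_field else None
        return out

    def resolve(self, left: dict, right: dict) -> bool:
        # Run each matcher, then let the resolver decide the overall outcome.
        outcomes = {name: match(left.get(name), right.get(name))
                    for name, match in self.matchers.items()}
        return self.resolver(outcomes)


rule = ResolutionRule(
    properties=[
        ResolutionProperty("name", {("data-source-1", "Clinic"): "clinicName",
                                    ("data-source-2", "Organization"): "title"}),
        ResolutionProperty("phone", {("data-source-1", "Clinic"): "tel",
                                     ("data-source-2", "Organization"): "phoneNumber"}),
    ],
    matchers={"name": case_insensitive, "phone": exact},
    resolver=all_match,
)

a = rule.extract("data-source-1", "Clinic",
                 {"clinicName": "Maple Clinic", "tel": "555-0100"})
b = rule.extract("data-source-2", "Organization",
                 {"title": " maple clinic ", "phoneNumber": "555-0100"})
is_same_entity = rule.resolve(a, b)
```

Keeping matchers and the resolver as separate, swappable functions mirrors the split in the list above: a matcher judges one property, and the resolver turns the per-property outcomes into a single decision.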

📘 Configurability

Users configure resolution rules based on their business logic, adapting and optimizing for their data.

Because a unified record contains only identifiers, metadata, and links to its source records, and no actual properties, a special representation must be created to enable comparison. This representation is built by aggregating the properties of the unified record's linked source records. This means that when we compare an incoming source record to what is in the system, we try to match it against the aggregate representation of a unified record.
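The aggregation described above can be illustrated with a short sketch. It assumes, purely for illustration, that the aggregate keeps the set of distinct values seen for each property across the linked source records, and that a property matches when the incoming value is exactly one of those values. The function names and sample values are hypothetical.

```python
from __future__ import annotations

from collections import defaultdict


def build_aggregate(linked_records: list[dict[str, str]]) -> dict[str, set[str]]:
    """Collect every distinct value seen per property across linked source records."""
    aggregate: dict[str, set[str]] = defaultdict(set)
    for record in linked_records:
        for prop, value in record.items():
            if value is not None:
                aggregate[prop].add(value)
    return dict(aggregate)


def exact_matches(incoming: dict[str, str],
                  aggregate: dict[str, set[str]]) -> dict[str, bool]:
    """For each incoming property, report whether its value exists in the aggregate."""
    return {prop: value in aggregate.get(prop, set())
            for prop, value in incoming.items()}


# Source records already linked to one unified record.
linked = [
    {"name": "Maple Clinic", "phone": "555-0100"},
    {"name": "Maple Medical Clinic", "phone": "555-0100"},
]
aggregate = build_aggregate(linked)

# An incoming source record is compared against the aggregate, not against
# each linked record individually.
incoming = {"name": "Maple Medical Clinic", "phone": "555-0199"}
outcome = exact_matches(incoming, aggregate)
```

Here `outcome` reports a match on `name` and no match on `phone`; per-property outcomes like these are what a resolver would use to decide whether the incoming record belongs to the unified record.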

[Figure: Source Record's properties are used against the Aggregate]

[Figure: Properties are compared to see which ones are an exact match in the Aggregate]

[Figure: Individual Source Records]