
Patient Matching Example

Summary

This section walks through an example patient matching project and the steps taken to implement its patient matching strategy.

Project profile: DISI MVP

The DISI MVP is a reference project that implements an HIE solution with a Client Registry, a Shared Health Record, an Interoperability Layer, and Data Analysis and Visualisation. The data will be captured from OHRI systems located at various health facilities. OHRI is an OpenMRS EMR that implements the HIV CBS use case.

The goal of this example is to follow the patient matching process in order to configure the Client Registry to match patients successfully.

Profile

The profile is as follows:

Use cases required:

CR (HIE)
Evaluation

Scale: low
Algorithm required:

Low-quality data and the skills available => Fellegi-Sunter & EM (Expectation-Maximisation)
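To make the algorithm choice more concrete, the following is a minimal sketch of how Fellegi-Sunter turns field agreements into a match weight. The m- and u-probabilities below are hypothetical values chosen for illustration. In practice the EM step estimates them from the data, and that estimation is not shown here. The function and field names are also assumptions, not part of OpenCR or Fastlink.

```python
import math

# Hypothetical probabilities, for illustration only (EM would estimate these).
# m = P(field agrees | the two records belong to the same patient)
# u = P(field agrees | the two records belong to different patients)
M_U = {
    "given_name": (0.90, 0.01),
    "family_name": (0.90, 0.02),
    "date_of_birth": (0.95, 0.001),
}


def field_weight(m, u, agrees):
    """Log2 likelihood ratio contributed by a single field."""
    if agrees:
        return math.log2(m / u)
    return math.log2((1 - m) / (1 - u))


def match_weight(agreement):
    """Sum the field weights over an agreement pattern {field: bool}."""
    return sum(
        field_weight(*M_U[field], agrees) for field, agrees in agreement.items()
    )


all_agree = match_weight(
    {"given_name": True, "family_name": True, "date_of_birth": True}
)
all_disagree = match_weight(
    {"given_name": False, "family_name": False, "date_of_birth": False}
)
print(all_agree, all_disagree)
```

Agreement on a field that rarely agrees by chance (a low u) adds a large positive weight, while disagreement on a field that is usually captured correctly (a high m) subtracts weight. A record pair is then classified by comparing the total weight against chosen thresholds.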

Selecting patient matching tools

From the project profile identified, the tools selected for this project are:

Production tool: OpenCR

Evaluation tool: Fastlink R Notebook

Fastlink configuration

Data analysis

There is no production data for this project. The structure of the data can be found in the Minimum Data Set.

As part of this data analysis we want to determine the following:

What identifiers to use from all the captured data

Person unique identifiers: national ID and phone number
Other demographic identifiers: given name, family name, date of birth, gender and city
Source system ID: OHRI patient identifier
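The identifiers listed above can be sketched as a simple record type, together with a per-field comparison. The field names, the ISO date format, and the choice to exclude the source system ID from matching are assumptions made for this illustration. They are not taken from the Minimum Data Set or from OpenCR.

```python
from dataclasses import dataclass, fields
from typing import Dict, Optional


@dataclass
class PatientRecord:
    # Field names are illustrative, not the Minimum Data Set names.
    source_id: str                 # OHRI patient identifier (source system ID)
    national_id: Optional[str]     # person unique identifier, may be missing
    phone_number: Optional[str]    # person unique identifier, may be missing
    given_name: str
    family_name: str
    date_of_birth: str             # ISO 8601 (YYYY-MM-DD) assumed
    gender: str
    city: str


# Assumption: the source system ID identifies the record, so it is not compared.
MATCHING_FIELDS = [f.name for f in fields(PatientRecord) if f.name != "source_id"]


def agreement_vector(a: PatientRecord, b: PatientRecord) -> Dict[str, Optional[bool]]:
    """Exact-agreement flag per matching field; None when either value is missing."""
    result = {}
    for name in MATCHING_FIELDS:
        va, vb = getattr(a, name), getattr(b, name)
        result[name] = None if va is None or vb is None else va == vb
    return result


rec_a = PatientRecord("ohri-1", "ID123", None, "Thandi", "Moyo", "1990-04-02", "F", "Harare")
rec_b = PatientRecord("ohri-2", "ID123", "0771234567", "Thandi", "Moyo", "1990-04-20", "F", "Harare")
print(agreement_vector(rec_a, rec_b))
```

Treating a missing value as "unknown" rather than as a disagreement keeps sparsely captured identifiers, such as phone number, from counting against a pair of records.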

What pre-processing is required before all records, from different data sources, can be compared

For this project, we assume that no pre-processing is required.

What are the characteristics and quality of those identifiers and what type of errors are present when capturing the data

The Scenario 3: Low data quality configuration will be used.

Generate a dataset

The generated dataset can be downloaded here. It was created with Jembi's Data Generator Google Colab Notebook, using the configuration below. More information on data generation can be found here.

Test and choose the optimal configuration

We used the Fastlink R Google Colab Notebook created by Jembi.

The criteria used for the test were:

Jaro-Winkler as the string similarity algorithm
A similarity threshold of 0.92 was used
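To illustrate what these criteria mean, here is a pure-Python sketch of Jaro-Winkler similarity with the 0.92 threshold applied. It is a textbook version written for this example, not the implementation used by Fastlink or OpenCR. The function names and the prefix scaling factor of 0.1 are assumptions.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity between two strings, in the range 0.0 to 1.0."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters only count as matching if they are within this distance.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order / 2
    return (
        matches / len1 + matches / len2 + (matches - transpositions) / matches
    ) / 3


def jaro_winkler(s1: str, s2: str, prefix_scale: float = 0.1) -> float:
    """Jaro similarity boosted for a shared prefix of up to four characters."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return base + prefix * prefix_scale * (1 - base)


THRESHOLD = 0.92  # similarity threshold used in the test


def names_agree(a: str, b: str) -> bool:
    """Treat two names as agreeing when their similarity reaches the threshold."""
    return jaro_winkler(a.upper(), b.upper()) >= THRESHOLD


print(names_agree("Martha", "Marhta"))   # transposed letters
print(names_agree("Dixon", "Dicksonx"))  # larger spelling difference
```

With this threshold, a single transposition in a short name still counts as agreement, while names that differ by several characters do not.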

Test in production tool

We applied the obtained configuration in OpenCR.
