Experian Match concepts

Experian Match is powerful and highly configurable. To make the most of the product, we recommend you familiarise yourself with the core concepts.

The definitions below give a high-level overview of Experian Match and how it works. For details on how to configure and tune your implementation, see the user guide.

Matching records

As the name suggests, matching records is the core principle behind Experian Match. Matching is the process of identifying which records are similar enough to be considered the same entity. It involves standardising the input, blocking together similar records and comparing them using configurable matching rules. The result is a collection of clusters containing matched records.

Match levels

Each match between two records can have one of four match levels. The default matching rules define the match levels as follows:

Exact

Each individual field that makes up the record matches exactly.

Close

Records might have some fields that match exactly and some fields that are very similar.

Probable

Records might have some fields that match exactly, some fields that are very similar and some fields that differ a little more.

Possible

The majority of fields in the records have a number of similarities but do not match exactly.
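
To make the four levels concrete, here is a minimal Python sketch of how a pair of records could be graded. It is not the default rule set, which is written in the rule DSL; the thresholds, the function names and the use of Python's difflib similarity ratio are all assumptions made for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical thresholds, chosen for illustration only.
VERY_SIMILAR = 0.85
SOMEWHAT_SIMILAR = 0.6


def field_similarity(a, b):
    """Return a 0..1 similarity ratio for two field values, ignoring case and outer spaces."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def match_level(rec_a, rec_b, fields):
    """Grade a pair of records as Exact, Close, Probable, Possible or None."""
    scores = [field_similarity(rec_a[f], rec_b[f]) for f in fields]
    if all(s == 1.0 for s in scores):
        return "Exact"      # every field matches exactly
    if all(s >= VERY_SIMILAR for s in scores):
        return "Close"      # exact or very similar fields only
    if all(s >= SOMEWHAT_SIMILAR for s in scores):
        return "Probable"   # some fields differ a little more
    similar = sum(1 for s in scores if s >= SOMEWHAT_SIMILAR)
    if similar > len(scores) / 2:
        return "Possible"   # only the majority of fields are similar
    return "None"


a = {"name": "Jonathan Smith", "postcode": "AB1 2CD"}
b = {"name": "Jonathon Smith", "postcode": "AB1 2CD"}
print(match_level(a, b, ["name", "postcode"]))  # Close
```

In a real session the equivalent decisions are made by your configured rule set, so treat the numbers above as placeholders rather than recommended values.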

Data configuration

Experian Match can read from and output to multiple data sources and data types. You will need to configure these connections as part of setting up your session.

The available data types are:

  • JDBC
  • Mongo
  • Flatfile (e.g. CSV)

Data source

In order for Experian Match to start work on your data, you need to configure your data sources. Experian Match will need to know how to connect to your data source (including authentication information) and which data from the source to match on.

Match output

You will need to configure your data output if you want to output the results of your matching jobs. Output configuration requires the connection information for your database, along with the fields you wish to output.

Experian Match provides four result fields that show the results of a matching job. These can be output alongside the source data:

Field            Field description

$CLUSTER_ID      The Cluster ID generated for the record by Matching
$MATCH_STATUS    The score associating the record to the cluster, i.e. Exact, Close, Probable, Possible, None
$SOURCE_ID       The configured Source ID from which a record originated
$RECORD_ID       The unique Record ID of the record as used in matching
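
Once the result fields have been output, downstream code can group records by cluster. The Python sketch below assumes the output was written to a flat file whose column headers are the result field names; the sample rows, the extra "name" column and the helper names are hypothetical and only illustrate the idea.

```python
import csv
import io
from collections import defaultdict

# Illustrative rows only; real output depends on your output configuration.
SAMPLE_OUTPUT = """\
$CLUSTER_ID,$MATCH_STATUS,$SOURCE_ID,$RECORD_ID,name
1,Exact,crm,101,Ann Lee
1,Exact,billing,7,Ann Lee
2,None,crm,102,Bob Ray
"""


def records_by_cluster(csv_text):
    """Group output rows by their $CLUSTER_ID value."""
    clusters = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        clusters[row["$CLUSTER_ID"]].append(row)
    return dict(clusters)


def duplicate_clusters(clusters):
    """Keep only the clusters that contain more than one record."""
    return {cid: rows for cid, rows in clusters.items() if len(rows) > 1}


clusters = records_by_cluster(SAMPLE_OUTPUT)
for cid, rows in duplicate_clusters(clusters).items():
    print(cid, [(r["$SOURCE_ID"], r["$RECORD_ID"]) for r in rows])
```

Pairing $SOURCE_ID with $RECORD_ID, as the loop does, is one way to trace each clustered record back to the system it came from.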

Matching logic

Experian Match provides complete control over the matching logic and the way records are matched.

This can be configured using blocking keys and rules:

Blocking keys

Experian Match creates blocks of similar records to assist with the generation of suitable candidate record pairs for scoring. Blocks are created from records that have the same Blocking Key values. Blocking Keys are created for each input record from combinations of the record’s elements that have been keyed. Keying is the process of encoding individual elements to the same representation so that they can be matched despite minor differences in spelling.
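
The relationship between keying, blocking keys and blocks can be sketched in a few lines of Python. The keying functions here are deliberately simple stand-ins (they are not the keying algorithms shipped with the product), and the record layout and all names are assumptions for illustration.

```python
import re
from collections import defaultdict
from itertools import combinations


def key_name(value):
    """Toy keying: keep the first letter, drop later vowels, collapse repeated letters."""
    letters = re.sub(r"[^A-Z]", "", value.upper())
    if not letters:
        return ""
    keyed = letters[0] + re.sub(r"[AEIOUY]", "", letters[1:])
    return re.sub(r"(.)\1+", r"\1", keyed)


def key_postcode(value):
    """Toy keying: uppercase the postcode and remove whitespace."""
    return re.sub(r"\s+", "", value.upper())


# A hypothetical blocking key: a combination of keyed elements.
BLOCKING_KEY = [("surname", key_name), ("postcode", key_postcode)]


def blocking_key(record, definition=BLOCKING_KEY):
    """Build the blocking key value for one record."""
    return "|".join(keyer(record[element]) for element, keyer in definition)


def build_blocks(records):
    """Group record ids that share the same blocking key value."""
    blocks = defaultdict(list)
    for record in records:
        blocks[blocking_key(record)].append(record["id"])
    return blocks


def candidate_pairs(blocks):
    """Only records in the same block become candidate pairs for scoring."""
    pairs = []
    for ids in blocks.values():
        pairs.extend(combinations(ids, 2))
    return pairs


records = [
    {"id": 1, "surname": "Smith", "postcode": "AB1 2CD"},
    {"id": 2, "surname": "Smyth", "postcode": "ab1 2cd"},
    {"id": 3, "surname": "Jones", "postcode": "ZZ9 9ZZ"},
]
print(candidate_pairs(build_blocks(records)))  # [(1, 2)]
```

Because "Smith" and "Smyth" are keyed to the same representation, records 1 and 2 land in the same block and are compared, while record 3 is never paired with them. This is what keeps the number of comparisons manageable on large inputs.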

A default set of best practice blocking keys is provided with the software for name and address data. These can be retrieved with a call to the REST API. To use them, you must modify them to match your input data and requirements, then submit them via the REST API.

Rules

You must configure a rule set before setting up your match session. A rule set is a set of logical expressions (rules), written in our own Domain Specific Language (DSL), which control how records are compared and how match levels are decided. We have designed the rule DSL to give you complete control over how records match.

A default best practice rule set is provided with the software for name and address data. This can be adjusted for optimal matching depending on your data and requirements.

Match store

The match store is created when a matching job is run. The location of the match store is set when configuring your session. The store contains the newly created clusters of records. Performing an output request will output the cluster IDs from the match store to your desired location.

Clusters

A cluster is a collection of records that have been identified as representing the same entity using the rules that you have provided.
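
One common way to turn matched pairs into clusters is to treat matches as links and collect the connected groups. The Python sketch below uses a union-find structure to do that. Whether Experian Match forms clusters in exactly this way is an assumption; the function name and the sample data are hypothetical.

```python
def build_clusters(record_ids, matched_pairs):
    """Group record ids into clusters, assuming matches link records transitively."""
    parent = {rid: rid for rid in record_ids}

    def find(rid):
        # Walk up to the cluster representative, shortening the path as we go.
        while parent[rid] != rid:
            parent[rid] = parent[parent[rid]]
            rid = parent[rid]
        return rid

    for a, b in matched_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two clusters

    clusters = {}
    for rid in record_ids:
        clusters.setdefault(find(rid), []).append(rid)
    return sorted(clusters.values())


# Records 1 and 3 are never compared directly but share a cluster through record 2.
print(build_clusters([1, 2, 3, 4, 5], [(1, 2), (2, 3)]))  # [[1, 2, 3], [4], [5]]
```

Under this assumption, a record with no matches still forms a cluster of its own, containing a single record.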

Copyright ©, 2014-2017. All rights reserved.