Managing Data - Using the Duplicate Checker

Set up configurable match criteria to identify existing duplicates and auto-merge them from the tool, and use the widget to avoid entering new duplicates.



Data quality is a crucial factor in managing customer relationships. A single, clean record for each customer, showing your teams’ previous interactions and future plans, is the key to successful collaboration and helps to ensure long-lasting relationships and high customer satisfaction. ⤵

Some of the key reasons for ensuring data quality are:

  • Accurate customer insights: CRM systems rely on accurate and complete customer data to provide valuable insights into customer behaviour, preferences, and needs. Poor data quality can lead to incorrect analysis and decision-making. Adding AI into the mix makes this even more critical - you simply cannot rely on AI-generated insights and suggestions based on poor quality data!

  • Personalised customer experiences: high-quality data enables you to personalise your marketing and sales efforts, leading to improved customer satisfaction and loyalty.

  • Efficient business processes: clean and consistent data allows you to increase your operational efficiency by using Automatizer to streamline your business processes.

  • Better decision-making: reliable data is essential for making informed decisions about marketing, forecasting, product development, and resource allocation.

  • Compliance and risk management: accurate customer data helps businesses comply with data protection regulations and mitigates the risk of data breaches and security threats.

  • Cost savings: poor data quality can result in wasted resources, such as ineffective marketing campaigns or missed sales opportunities, leading to increased costs.

Having said that, it’s not easy to ensure high data quality. Users are often in a hurry to enter data and don’t check for existing records. You might do your best when importing, but 100% matching is usually unattainable if your data comes from different sources.

Pipeliner already has tools to help, and you may already have established data quality standards: use Auto Profiling to enrich your existing data, standardise data formats using validated field types such as dropdowns, apply field-level permissions or even automate data entry using our API. Our new Duplicate Checker, however, is the game changer!

You can:

  • Enable duplication checks for a particular entity and specify which fields should be used for duplication checks from within the Admin Module

  • Open the Duplicate Checker tool from the Tools menu in the app to immediately see all duplicates per entity (depending on User Role permissions)

  • Mark duplicates and merge them from the Merge menu button

  • Use the duplicate checking widget to see possible duplicates of the current open record and take action to merge them if needed

  • Use the duplicate checking widget to avoid creating duplicates during new record creation by seeing if there are possible duplicates in the system ⤵

Do we have access to the Duplicate Checker?

Duplicate Checker is enabled for our Unlimited Tier customers and is also available as an add-on chargeable option for customers on Enterprise, Business or Starter Tiers.

We want to do our best to help you protect your data, so we will not be offering trials of Duplicate Checker. If you want to see how it works before adding it to your subscription, please contact us and we’ll arrange a short meeting to demonstrate the feature on a demo system where there is no risk to real-life data.

If you then decide to move forward, we’ll also be offering a paid-for option whereby we can create a copy of your live Pipeliner database as a temporary Sandbox so you can test out your chosen criteria against your actual data before running Duplicate Checker on your live Pipeliner space.


There is NO undo feature for merged records.

How is Duplicate Checker enabled?

Duplicate Checker is enabled from the Automation Hub in the Pipeliner Admin Module. ⤵

Once approved and running, your Admins will see a new Deduplication tab for each main entity (deduplication does not apply to Tasks, Appointments, Products or Product Line Items). ⤵

Global Deduplication Settings

From the Deduplication tab, your Admin can set up the global choices for deduplication for each entity.

Switching Deduplication On or Off

First, you can select if deduplication is On or Off for that entity. ⤵

If Deduplication is switched Off:

  • The entity will not be available in the Duplicate Checker window

  • All duplication checker widgets will be hidden for this entity

  • Merging is still available as a main menu action

Match Settings

If Deduplication is switched On, you next need to define the Match Settings that will be used by the Duplicate Checker tool and also by the widget within an individual open record. ⤵

  • Matching Similarity level: this setting allows you to select the similarity level that will be applied across records - the higher the similarity level, the fewer records will be flagged as duplicates, but those that are flagged will be closer matches

  • Select fields to match duplicates: the search engine will look for duplicates based on the values in your chosen fields
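
To illustrate how the two Match Settings interact, here is a minimal Python sketch. It is not Pipeliner's implementation: the scoring uses `difflib.SequenceMatcher` from the Python standard library as a stand-in, and the record layout, field names, function names and level values are all assumptions made for this example.

```python
from difflib import SequenceMatcher

def similarity(a, b, match_fields):
    """Average string similarity (0.0 to 1.0) over the chosen match fields."""
    scores = []
    for field in match_fields:
        left = str(a.get(field, "")).strip().lower()
        right = str(b.get(field, "")).strip().lower()
        scores.append(SequenceMatcher(None, left, right).ratio())
    return sum(scores) / len(scores)

def find_duplicate_pairs(records, match_fields, level):
    """Return index pairs whose similarity meets or exceeds the chosen level."""
    pairs = []
    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            if similarity(records[i], records[j], match_fields) >= level:
                pairs.append((i, j))
    return pairs

# Hypothetical sample data: two identical records, one near match, one unrelated.
accounts = [
    {"name": "Acme Corporation", "city": "Boston"},
    {"name": "Acme Corporation", "city": "Boston"},
    {"name": "Acme Corp", "city": "Boston"},
    {"name": "Zenith Bikes", "city": "Denver"},
]

loose = find_duplicate_pairs(accounts, ["name", "city"], level=0.7)
strict = find_duplicate_pairs(accounts, ["name", "city"], level=1.0)
```

With this sample data, the looser level pairs all three Acme records with each other, while the strictest level pairs only the two identical ones. That is the trade-off described above: a higher level returns fewer suggestions, and those suggestions are closer matches.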

Merge Settings

The option you choose in Merge Settings governs which record will be chosen as the Master record when duplicates are identified and merged together. This setting operates wherever a merge function is selected within the web app - from the Duplicate Checker tool, from the widget or from the menu options. ⤵

Choose the best option for your data from:

  • Most Populated Record - of the identified duplicates, the record with most fields filled in will be the master

  • Latest Created Record - of the identified duplicates, the latest record created will be the master

  • Oldest Created Record - of the identified duplicates, the oldest record created will be the master

  • Custom Condition - users can specify exactly how the master will be identified by using a filter ⤵

NOTE: if the Duplicate Checker cannot accurately identify the master, the user is forced to select one.
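
The master record options above can be sketched in Python as follows. This is an illustration under stated assumptions rather than Pipeliner's actual logic: the record layout, the rule names and the `pick_master` function are hypothetical, and returning `None` on a tie is simply one way to model the case where the user has to choose the master manually.

```python
from datetime import date

def populated_count(record):
    """Count the fields that hold a non-empty value."""
    return sum(1 for value in record["fields"].values() if value not in (None, ""))

def pick_master(group, rule, condition=None):
    """Return the master record for a duplicate group, or None if it is ambiguous."""
    if rule == "most_populated":
        best = max(populated_count(r) for r in group)
        winners = [r for r in group if populated_count(r) == best]
    elif rule == "latest_created":
        best = max(r["created"] for r in group)
        winners = [r for r in group if r["created"] == best]
    elif rule == "oldest_created":
        best = min(r["created"] for r in group)
        winners = [r for r in group if r["created"] == best]
    elif rule == "custom_condition":
        # The filter is modelled as a function that returns True for the master.
        winners = [r for r in group if condition(r)]
    else:
        raise ValueError(f"unknown rule: {rule}")
    # Exactly one winner means the master is clear; otherwise the user must choose.
    return winners[0] if len(winners) == 1 else None

# Hypothetical duplicate group.
group = [
    {"id": 1, "created": date(2021, 3, 1),
     "fields": {"name": "Acme", "email": "", "city": "Boston"}},
    {"id": 2, "created": date(2023, 6, 15),
     "fields": {"name": "Acme", "email": "info@acme.example", "city": "Boston"}},
    {"id": 3, "created": date(2022, 1, 10),
     "fields": {"name": "Acme", "email": "", "city": ""}},
]

master = pick_master(group, "most_populated")
ambiguous = pick_master(group, "custom_condition",
                        condition=lambda r: r["fields"]["city"] == "Boston")
```

In this sample, record 2 has the most populated fields and becomes the master, while the custom condition matches two records and therefore leaves the choice to the user.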

For each entity, the Deduplication Settings have default values enabled. For the Matching fields these are:

Account & Contact - Name, Primary Email, Street address, City, Zip, Country

Other entities (Lead, Opportunity etc) - Name

For the Merge settings, the master record choice is set to Most Populated record for all entities.
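
For reference, the defaults listed above can be written down as a small Python lookup. This is only a way of restating the defaults as data: the structure and the `default_settings` function are hypothetical and do not correspond to a Pipeliner configuration format or API. No default similarity level is included because none is stated here.

```python
# Default match fields per entity, as listed above.
ADDRESS_MATCH_FIELDS = ["Name", "Primary Email", "Street address", "City", "Zip", "Country"]
DEFAULT_MATCH_FIELDS = {
    "Account": ADDRESS_MATCH_FIELDS,
    "Contact": ADDRESS_MATCH_FIELDS,
}
# Default master record choice for all entities.
DEFAULT_MASTER_RECORD = "Most Populated Record"

def default_settings(entity):
    """Return the default deduplication settings for an entity."""
    return {
        # Entities other than Account and Contact match on Name only.
        "match_fields": list(DEFAULT_MATCH_FIELDS.get(entity, ["Name"])),
        "master_record": DEFAULT_MASTER_RECORD,
    }

lead_defaults = default_settings("Lead")
account_defaults = default_settings("Account")
```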

Using the Duplicate Checker Tool

The Duplicate Checker tool is a powerful aid to maintaining the quality of your CRM data on an ongoing basis. It provides a real-time scan of selected entities, identifying all possible duplicates according to the default criteria and presenting options to Quick (auto) Merge or Manual Merge.

Duplicate Checker is accessed from the Tools menu and is available for users based on their User Role permissions. ⤵

A real time scan of all Pipeliner data runs and displays all the potential duplicates by entity (for all entities where deduplication is enabled in the Admin Module) using the default criteria set up by your Pipeliner Admins. ⤵

Click on an Entity in the left sidebar to see all suggested duplicates batched together in groups. ⤵

Selecting one or more groups gives access to the following actions on the toolbar.

Quick Merge - a bulk (auto) merge of the selected group(s) using the default sensitivity level, match fields and master record. ⤵

Post merge, you can click straight through to the resulting record. ⤵

Manual Merge - a manual merge of one selected group, showing all records side by side so you can compare data and select the master record. ⤵

By default, you’ll see corresponding values from your match fields but you can select additional fields to be displayed to help you decide which values should be retained and which record should be the master record. ⤵

You can click through to each record to view even more detail if necessary and can search for a specific field displayed in your selected field list. Once you’re happy with your choices, click on Merge to merge data to your master record. ⤵

Show Records displays all records of the selected group or of all groups in a drill-down list view. ⤵

Add additional fields to the List View to better examine the records then click on Close before selecting Quick Merge or Manual Merge.

Refresh reloads the grid and resets any selections you’ve already made. ⤵

Settings allows the user to “override” the global settings from administration for the selected entity by changing the sensitivity level and master record selection. ⤵

Filter gives access to the standard Filter options, which allow you to filter the duplicate records. This is extremely useful when the grid displays records that are suggested as duplicates because the match fields are empty/blank. The records in the group shown below (having clicked on Show Records) are suggested as possible duplicates because the email address is the same (albeit blank/empty) on all of them, but we would not want to merge these together. ⤵

Adding a filter to exclude records without a value in Primary Email address will solve this issue. ⤵
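
The effect of such a filter can be sketched in Python. This is a simplified illustration, not how the Duplicate Checker groups records internally: it groups on an exact, case-insensitive match of a single field, and the field name `primary_email` and both function names are assumptions made for the example.

```python
from collections import defaultdict

def group_by_field(records, field):
    """Group records whose normalised field value is identical; keep groups of 2+."""
    groups = defaultdict(list)
    for record in records:
        key = (record.get(field) or "").strip().lower()
        groups[key].append(record)
    return [members for members in groups.values() if len(members) > 1]

def exclude_blank(records, field):
    """Drop records that have no value in the given field."""
    return [r for r in records if (r.get(field) or "").strip()]

# Hypothetical contacts: one genuine duplicate pair and three records with no email.
contacts = [
    {"name": "Ann Lee", "primary_email": "ann@example.com"},
    {"name": "Ann B. Lee", "primary_email": "Ann@Example.com"},
    {"name": "Bob Ray", "primary_email": ""},
    {"name": "Cara Diaz", "primary_email": None},
    {"name": "Dev Patel", "primary_email": "  "},
]

unfiltered = group_by_field(contacts, "primary_email")
filtered = group_by_field(exclude_blank(contacts, "primary_email"), "primary_email")
```

Without the filter, the three contacts with no email address form a misleading group because their blank values are identical. With the filter applied, only the genuine pair remains.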

NOTE: the records visible to a user in the duplicates grid depend on their user role and sales unit access settings - they are not able to see records that they do not have permissions for. This is another reason to ensure that your User Role permissions are set correctly.

You can also select individual records to be excluded from the merge. ⤵

And re-include them if you change your mind. ⤵

You can also select a group and choose Quick Merge or Manual Merge from the buttons on the right-hand side.

Once a group has been merged (or if it fails, for example due to invalid data or permissions), the status (and how many records were merged) will be displayed. ⤵

User Role Permissions to use the Duplicate Checker Tool

You will want to ensure that only users with specific User Roles have the rights to use the Duplicate Checker tool from the Tools menu, as it runs across your whole Pipeliner space and makes irreversible changes to your data.

Make sure to update all your User Roles to enable or disable access to the tool. ⤵

NOTE: users whose User Role does not give them permission to run the Duplicate Checker tool will still be able to use the duplicate checking widget and merge selected records together using the Merge menu button.

Merging selected records from the menu toolbar

Users with update and deletion rights to records will be able to select duplicates and use the Merge button from the toolbar to merge them together. ⤵

The subsequent screen offers the same options as Manual Merge: users can add any additional fields they wish to display to help them judge which record should be the master, select the master and then click Merge. ⤵

Using the Duplicate Checker Widget for existing records

When you open a record, the Duplicate Checker widget in the right-hand pane checks for possible duplicates and displays them. Click on the Actions button to open all suggested records in a drill-down list view using Show All, or click on Merge to merge the records. You can also click directly on the name of a suggested record to open it in a separate tab for review. ⤵

Using the Duplicate Checker Widget for new records

The widget makes it really easy to avoid adding duplicate records. When you enter a new record, the Duplicate Checker widget will display possible duplicates (based on the selected matching fields from the Admin Module) in the right-hand panel. ⤵

Click on a suggested record to open it in a new tab so you can check whether it is the same record as the one you were planning to create. If so, just cancel and work on the existing record instead of adding duplicates to your database.

Frequently Asked Questions

I made a mistake and incorrectly merged records, what can I do?

If you merge data incorrectly, you’ll need to request a database restore from our Engineering team. This is a chargeable service. Immediately stop all your users from making updates in Pipeliner, and remember to disable any data integrations while the restore is being organised. Then email us asking for a database restore, giving a date and time just before you ran the merge. We will confirm the cost (approximately $150.00, depending on the size of your database and the number of records affected) and pass the information to our Engineering team, who will restore your database to the backup set nearest to the date and time you gave us. We’ll let you know as soon as the restore is complete so that you can check the data and then let your users start entering information again.

How does this amazing tool work “under the hood”?

We are using a technique named Locality-Sensitive Hashing (LSH). LSH is a technique for indexing data points such that similar points have a high probability of mapping to the same bucket in a hash table. This allows for efficient retrieval of approximate nearest “neighbours”.
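
To make the idea of "similar points landing in the same bucket" concrete, here is a small Python sketch of one common LSH scheme: MinHash signatures split into bands. The article only names LSH, so the choice of MinHash, the character shingles, the parameter values and every function name below are assumptions made for illustration and do not describe Pipeliner's implementation.

```python
import hashlib
from collections import defaultdict
from itertools import combinations

def shingles(text, k=3):
    """Split normalised text into a set of overlapping k-character pieces."""
    cleaned = " ".join(text.lower().split())
    if len(cleaned) <= k:
        return {cleaned}
    return {cleaned[i:i + k] for i in range(len(cleaned) - k + 1)}

def stable_hash(seed, piece):
    """Deterministic 64-bit hash of a shingle, salted by the seed."""
    digest = hashlib.blake2b(f"{seed}:{piece}".encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")

def minhash_signature(text, num_hashes=32):
    """For each salted hash function, keep the smallest hash over all shingles."""
    pieces = shingles(text)
    return [min(stable_hash(seed, p) for p in pieces) for seed in range(num_hashes)]

def candidate_pairs(texts, num_hashes=32, bands=16):
    """Bucket each band of the signature; records sharing a bucket are candidates."""
    rows = num_hashes // bands
    buckets = defaultdict(set)
    for index, text in enumerate(texts):
        signature = minhash_signature(text, num_hashes)
        for band in range(bands):
            chunk = tuple(signature[band * rows:(band + 1) * rows])
            buckets[(band, chunk)].add(index)
    pairs = set()
    for members in buckets.values():
        pairs.update(combinations(sorted(members), 2))
    return pairs

# Hypothetical account names: the first two differ only in case and spacing.
names = ["Acme Corporation", "acme   corporation", "Zenith Bikes"]
pairs = candidate_pairs(names)
```

Here the first two names normalise to the same text, so they share every bucket and are returned as a candidate pair, while the unrelated third name is not paired with either. Names that are similar but not identical share a bucket with a probability that rises with their overlap, which is why LSH retrieves approximate rather than exact neighbours, and only the candidates in a shared bucket need a detailed comparison.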
