How to Automatically Delete Duplicate Records in Airtable (Part 1)

Duplicate records can lead to errors and inefficiencies in data analysis. Learn how to find and manage Airtable duplicates automatically.
by Julianne Youngberg · February 2024

Contents

    Detecting and managing duplicate records in Airtable is crucial for maintaining an accurate database. Duplicates can lead to confusion, errors, and inefficiencies in data analysis—and whether it’s for personal or professional use, maintaining data integrity will help ensure you have a reliable source of information.

    There are several methods used to identify duplicate records in Airtable including extensions, third-party tools, and custom scripts. However, each has its drawbacks in terms of cost, technical complexity, and manual effort. If you prefer a solution that…

    • Doesn’t require a paid extension
    • Is code-free
    • Can be fully automated

    …then this method might be the one for you!

    In this tutorial, you’ll learn how to set your Airtable base up to detect duplicates. A follow-up article will teach you how to integrate your base with Zapier so your data can be automatically scraped periodically.

    Let’s dive in!

    What You Will Create

    Duplicate records can be the result of human error, botched data import, poor synchronization, overlapping sources, and more. Implementing a deduplication process cleans your dataset, ensuring you maintain an accurate database.

    We’ll use a series of linked fields to cross-reference each record with the others, marking duplicates so they can be removed.

    The only tool you will use for this process is:

    • Airtable : To store data and identify duplicate records

    The next article will take the process one step further by connecting with Zapier and Browserbear to automate the deduplication process.

    At the end of the tutorial, you should have a system of linked fields and a column that indicates whether a record is a duplicate:

    GIF of duplicate column formula

    Let's begin!

    How to Create a Deduplication Process in Airtable

    There are several ways to create a deduplication process in Airtable, but many require paid extensions or technical knowledge you may not have. The method you’ll learn in this article doesn’t require extensions or writing code.

    Let’s start with a simple Airtable base containing your data. Over this tutorial, you’ll combine relevant data to create a composite key, add a lookup table that links all records, link your lookup record to your data table, then add a formula that calculates whether each record is a duplicate.

    ❗️ Note : One limitation of this method is that it involves combining all search data into one record, making it unsuitable for large datasets. However, it is a reliable solution for relatively light, ongoing use cases.

    Step 1 - Create Composite Keys to Use As Search Values

    Starting off from your DATA table, review your fields and decide what you want to use as search values. This can be as many fields as you want, but keep in mind that the more you add to the lookup, the longer it might take to calculate duplicates.

    We'll start from this sample dataset:

    Screenshot of Airtable base with sample data

    Add a formula field named Composite Key. Write a formula to concatenate the fields you want to search for uniqueness. In this tutorial, we used CONCATENATE(Title, " - ",Date).

    Screenshot of Airtable composite key formula setup

    Now, add another formula field titled Composite Key + Record ID. Enter CONCATENATE({Composite Key}," ","{",Record_ID(),"}") as the formula.

    Screenshot of Airtable composite key + record ID setup

    This operation combines the composite key with a unique ID, ensuring that the first-created record matching a search is preserved as the original while newer records are flagged as duplicates.

    Punctuation is optional and mostly for ease of viewing.

    Bear Tip 🐻: If you're confident with formulas, combine the Composite Key and Composite Key + Record ID concatenation steps into one.

    Create a LOOKUP table in your Airtable base. The table will only be used to consolidate all composite keys and IDs into a single record, which will be used to cross-check each record for duplicates.

    Delete any unnecessary fields and records that may come up as presets on an empty table.

    Rename the primary field column as Link All Records. Add a single record, and insert the same text–Link All Records–into the space.

    Screenshot of Airtable lookup table with link all records outlined in red

    This will be the only record on this table.

    Returning to your DATA table, add a formula field titled Lookup Data. Enter ”Link All Records" as the formula.

    Screenshot of Airtable lookup data setup

    Now, change the field to a link to another record type, linking to the LOOKUP table. This will change all of the field’s single line text into links.

    Screenshot of Airtable lookup data linking setup

    You should see a field appear in the LOOKUP table that links to DATA. Opening it up will reveal links to all of your data records.

    Screenshot of Airtable all data records linked

    Remain on your LOOKUP table and add a lookup field for Composite Key + Record ID. Title it All Composite Key + Record ID.

    Screenshot of Airtable all composite key + record ID linking setup

    Opening up the record will reveal all of your consolidated search data as long text.

    Screenshot of Airtable all composite key + record ID data

    Return to your DATA table and add a lookup field for All Composite Key + Record ID.

    Screenshot of Airtable all composite key + record ID lookup field in data table

    This simply makes your consolidated search data available on the table.

    Step 4 - Add Formula Field to Identify Duplicates

    Add one final formula field, titled Duplicate?. Enter the following formula (or update as needed to match any fields titled differently):

    IF({LOOKUP DATA}, 
      IF( 
        FIND({Composite Key + Record ID}, 
          {All Composite Key + Record ID (from Lookup Data)} & ""
        ) 
        = 
        FIND({Composite Key}, 
          {All Composite Key + Record ID (from Lookup Data)} & ""
        ), 
        BLANK(), 
        "Duplicate" 
      ), 
      "Record needs to be linked" 
    )
    

    This operation will cross-check each record against the consolidated search data, returning Duplicate for any records flagged as unoriginal.

    Screenshot of Airtable duplicate? formula field setup

    If two or more records are found to be identical, the first-created record will be preserved as the original. Any newer entries will be flagged.

    If a record isn’t linked to the consolidated search data, it will return a warning.

    Screenshot of Airtable data table with records that need to be linked outlined in red

    🤝 Special thanks to Airtable community member kuovonne, whose original deduplication method was invaluable to this tutorial.

    Step 5 - Customize and Repeat As Needed

    Now that any extra records are flagged in the duplicates column, it’s easy to sort and delete them all at once. One easy way to do this is to add a view to your DATA table with one filter: Where Duplicate? contains Duplicate.

    Screenshot of Airtable duplicate view with filter outlined in red

    You can then select all of the duplicated records and delete them in mass.

    This deduplication method only works as long as each record is linked to the consolidated search value. So, you will need to manually repeat the process when new data is added. This is suitable for infrequent data additions or non-exact duplicates, but it can be time-consuming otherwise.

    If you want to take your process to the next step, you can use a third-party tool like Make or Zapier to link records from the moment they are added to the table. A separate automated workflow can also delete duplicates periodically, eliminating the need for you to manually sort through and remove them.

    Bear Tip 🐻: If you’re on a paid plan, Airtable’s Dedupe extension can help you find and manage duplicate records. You can search for exact, similar, or fuzzy matches and merge fields to primary records. However, you will have to manually review and delete dupes which may not suit every use case.

    Automatic Deduplication with Airtable

    Finding and managing duplicates is a relatively common need for Airtable users, but it can be tricky if you don’t want to pay for extensions and prefer an automated process. Fortunately, with some time, you can set up a way to identify extra records quite easily.

    Read on to Part 2 of How to Automatically Delete Duplicate Records in Airtable to learn how to connect your Airtable base with Zapier to automatically link records the moment they are added to the table, then delete duplicates periodically.

    About the authorJulianne Youngberg@paradoxicaljul
    Julianne is a technical content specialist fascinated with digital tools and how they can optimize our lives. She enjoys bridging product-user gaps using the power of words.

    Automate & Scale
    Your Web Scraping

    Browserbear helps you get the data you need to run your business, with our nocode task builder and integrations

    How to Automatically Delete Duplicate Records in Airtable (Part 1)
    How to Automatically Delete Duplicate Records in Airtable (Part 1)