How To Automate Your Google Sheets Deduplication in Only 5 Minutes

Deduplication increases the accuracy of your spreadsheet while saving time and streamlining your workflow. Learn how to automate it with Google Apps Script.
by Julianne Youngberg · February 2024

Contents

    An essential part of storing and managing data on Google Sheets is dealing with duplicates. Whether it’s due to timing, overlapping sources, or human error, extra records can affect the effectiveness of your spreadsheet.

    Fortunately, you don’t need to search for and delete duplicates manually. Google Sheets allows you to set your spreadsheet up for automated deduplication, which ensures your database is actively maintained.

    There are a few major reasons to deal with duplicates automatically:

    • It saves time. When you want to set up a new workflow or analyze data, you don’t have to review your information to make sure it’s ready for use.
    • It increases accuracy. Deleting duplicates by hand isn’t impossible, but it has a greater chance of human error. Computed decision-making is much more reliable.
    • It conserves resources. Removing unnecessary data frees up storage space and makes your spreadsheet more manageable.

    All in all, putting deduplication measures in place is well worth a few minutes of your time.

    In this tutorial, you’ll learn how to set up a custom function on Google Apps Script, and then trigger it to run in the background. This may seem intimidating, but the entire process is actually fairly easy and can be done in just a few minutes.

    What You Will Create

    An automated deduplication process ensures your data remains accurate and ready for use at all times. Since it neither requires a human to find nor delete these extra rows, it saves time and resources while minimizing human error.

    This Google Sheets deduplication method utilizes a custom script written online and hosted by Google’s servers, meaning you will not need to integrate with any other external applications.

    By the end of this tutorial, you should be see any duplicate rows on your spreadsheet deleted in seconds.

    GIF of automated Google Sheets deduplication

    Once your spreadsheet is ready and populated with some data, we’ll set a function up on Apps Script.

    What is Google Apps Script?

    Apps Script is a cloud-based JavaScript platform developed by Google. It enables integration and automation of tasks across different products in the ecosystem. In the case of Google Sheets, it’s commonly used to write custom functions that enhance the capabilities of your spreadsheets beyond the pre-built options.

    Screenshot of Google Apps Script my projects page

    The platform is quite intuitive and has a low barrier to entry. Code is written on a web-based editor and hosted by Google’s servers.

    Screenshot of Google Apps Script editor

    Since it runs completely on the cloud and is connected to your existing Google account, you don’t need to download any additional apps or create new accounts.

    Besides enabling the writing and hosting of custom code, Apps Script has several interesting features, such as being able to monitor the health and usage of your script projects.

    Screenshot of Google Apps Script project details page

    You can also control execution types, which are as follows:

    • Add On. The execution originated from an add-on.
    • Execution API. The execution originated from an invocation of the Apps Script API.
    • Time Driven. The execution was caused by a time event.
    • Trigger. The execution originated from a trigger source.
    • Webapp. The execution originated from a deployed web app.
    • Editor. The execution originated from the Apps Script editor.

    The execution logs allow you to see how each run was triggered as well as its start time, duration, and status.

    Screenshot of Google Apps Script executions page

    Overall, Apps Script is a powerful platform that can supercharge workflows throughout the Google product ecosystem.

    How to Deduplicate with Google Apps Script

    Setting up a deduplication process on Apps Script is just a matter of creating a project linked to your spreadsheet, preparing a script, and adding a trigger. Since we’ll be providing the code for your use, it should take just a few minutes to set up.

    Let’s start from the spreadsheet.

    Populate a Google Sheets Spreadsheet

    You’ll need a spreadsheet populated with sample data to test whether the script is responding correctly. This can be a new sheet or an existing one that already has data.

    Screenshot of Google Sheets spreadsheet

    Keep in mind that your sheet should include a header row and some duplicate rows for testing. This will make it easier to see the script in action.

    Create a Function on Apps Script

    When your spreadsheet is ready, navigate to the Extensions tab and click Apps Script.

    Screenshot of Google Sheets spreadsheet with red arrow pointing to Apps Script extension

    A new tab will open to reveal the editor, where you can write and execute functions.

    Screenshot of Google Apps Script editor page

    Erase the template function data, and paste the following script into the editor:

    function removeDuplicateRows() {
      let sheet = SpreadsheetApp.getActiveSheet();
      const dataRows = sheet.getDataRange().getValues();
      const newDataRows = [];
    
      dataRows.forEach((dataRow) => {
        let duplicate = false;
        
        newDataRows.forEach((newDataRow) => {
          if (dataRow.join() === newDataRow.join()) {
            duplicate = true;
          }
        });
    
        if (!duplicate) {
          newDataRows.push(dataRow);
        }
      });
    
      sheet.clearContents();
      sheet.getRange(1, 1, newDataRows.length, newDataRows[0].length).setValues(newDataRows);
    }
    

    This function will join data rows only containing unique data, then clear and replace the entire spreadsheet with them—removing any duplicate information in the process.

    Click the Save button on the top of the editor. Your function will rename itself as removeDuplicateRows. You can also rename your file and project.

    Now, test the script by clicking Run. You might have to authorize the project by signing in and allowing it access to your sheet.

    Screenshot of Google Apps Script editor with red arrow pointing to run

    Notice that an execution log appears to indicate whether a run was successfully completed.

    Returning to your sheet should reveal all duplicate rows have been deleted.

    Set Up a Trigger

    The final step to fully automate the deduplication process is to set up a trigger that instructs it to execute when needed.

    Click the Triggers tab on the left sidebar of your Apps Script dashboard.

    Screenshot of Google Apps Script editor with red arrow pointing to triggers

    Click + Add Trigger button in the bottom right corner.

    Screenshot of Google Sheets trigger page with red arrow pointing to + add trigger

    Set it up by choosing the removeDuplicateRows function, running the Head deployment, selecting From spreadsheet as the event source, and selecting On edit as the event type.

    Screenshot of Google Apps Script trigger setup

    Click Save.

    You should find that every time new data is added to your sheet (whether manually or automatically), the script will run and remove any duplicate rows.

    GIF of automated Google Sheets deduplication

    Method Considerations

    When choosing the best deduplication method, it’s important to consider the unique characteristics of each one. There are many options available, such as the UNIQUE formula, which can help you quickly filter for duplicates and can easily be nested with other formulas to apply to a specific range or orientation.

    The Apps Script method discussed in this tutorial only flags a row as duplicate when the entire row is an exact match. As a result, it may be better suited for use cases when data is formatted identically from the source, such as with automated web scraping.

    While it is possible to use this method with manually updated spreadsheets, human error during data entry means the process might be less reliable. Automation use cases where you can predict the format of your incoming data are ideal as they minimize data and formatting variations.

    Another consideration is that this method applies deduplication on all spreadsheets in a Google Sheets file. If you would only like deduplication to be triggered and executed on certain sheets, you will have to encode the trigger into your script.

    Simplifying Database Management With Apps Script

    Automating your Google Sheets deduplication process through Apps Script can save you time, increase accuracy, and streamline your workflow. By setting up a custom function and trigger, you can easily remove duplicate workflows in just a few minutes.

    This automated method ensures that your data remains accurate and ready for use, without the need for manual intervention. Say goodbye to the hassle of deleting duplicates manually and embrace the efficiency of automation.

    About the authorJulianne Youngberg@paradoxicaljul
    Julianne is a technical content specialist fascinated with digital tools and how they can optimize our lives. She enjoys bridging product-user gaps using the power of words.

    Automate & Scale
    Your Web Scraping

    Browserbear helps you get the data you need to run your business, with our nocode task builder and integrations

    How To Automate Your Google Sheets Deduplication in Only 5 Minutes
    How To Automate Your Google Sheets Deduplication in Only 5 Minutes