How to Highlight Duplicates in Google Sheets

puneet-gogia-excel-champs

- Written by Puneet

Having duplicates in data can lead to several issues that affect data analysis, reporting accuracy, and workflow efficiency. Here’s why managing duplicates is particularly difficult in Google Sheets.

Calculations such as sums, averages, medians, and other statistical calculations may become inflated or skewed if duplicates exist. For instance, if sales data entries are duplicated, the total sales will appear higher than they are.

In this tutorial, we will learn to highlight duplicates in Google Sheets. As there are multiple ways to do this, we will explore all the methods in detail. So, let’s get started…

Use Conditional Formatting to Highlight Duplicates in Google Sheets

Let’s apply this with an example where we have numbers A1:A12. We want to highlight duplicate numbers:

  • First, select the range of cells you want to check for duplicate values. This can be any column or row where you suspect there might be duplicate data. Once your range is selected, click “Format” in the top menu. A drop-down menu will appear, from which you should select “Conditional formatting”.
  • When you click “Conditional Formatting,” a sidebar will appear on the right. In the ‘Format cells if’ dropdown, select ‘Custom formula is’. It allows you to input a custom formula to tell Google Sheets what condition to look for.
  • In the field below the dropdown, type in the formula =COUNTIF($A$1:A12,A1)>1, or you can replace the range A:A and A1 with your selected range. This formula tells Google Sheets to count the instances where the data in a cell appears more than once.
  • After that, choose the formatting style for the duplicate values. This is how you want the duplicates to be highlighted – for instance, you could choose to have them highlighted in red.
  • Finally, click ‘Done’ to apply the conditional formatting. This will highlight any duplicate values in the range A1:A12.

Do I Need to use $ sign with the first cell of the Range?

You’re right about needing to use absolute references (freeze the cell) when applying conditional formatting formulas in Google Sheets to highlight duplicates.

Let’s clarify how to correctly use the formula with the dollar sign to make it work properly for highlighting duplicates.

=COUNTIF($A$1:A12, A1) > 1

This formula ensures that the range your formula checks will expand as it moves down the column but always starts from the first cell in your selected range (A1 here).

  • $A$1:A12: This range starts at A1 and goes down to the Cell A12 that contains data. The $ signs fix the starting point at A1, no matter where the formula is applied.
  • A1: This is the cell currently being evaluated by the conditional formatting rule. It changes to A2, A3, etc., as the rule is applied to each cell in the column.

Using Dynamic Range in the Formula to Highlight the Duplicate Values

This method works perfectly. However, it’s important to note that while it is useful, it isn’t automatically updated when new data is added below the range.

Let’s say you add a new value to cell A13. If you add, you must change the range in the conditional formatting to ensure that any new duplicates are also highlighted.

So, the best ways to deal with this problem is to create a dynamic named range or refer to the entire column and then use it in the conditional formatting while writing the formula. First let’s understand how to refer to the entire column in the formula:

=COUNTIF($A:A, A1) > 1

In this formula, we have used the entire columns to check for the duplicates instead of a range of specific cells. But we have also used $ dollar sign to freeze the first cell of the range.

Here you can see the entry in the cell A13 is also highlighted when there is duplicated number identified by the conditional formatting.

Highlight Duplicate Values from the Entire Row

To highlight duplicate rows in Google Sheets based on the entirety of the row’s values across multiple columns, you need to adjust your conditional formatting rules accordingly.

It is useful to identify rows where the combination of column values is repeated.

  • First, decide how many columns you want to check for duplicates. For example, if you have columns A and B, Select the range of cells in these columns where duplicates might occur (e.g., A2:B14).

Go to the Format menu. Choose Conditional formatting. Then, choose “Custom formula is” from the dropdown “Format Cells If”. And use the below formula to check for duplicate rows based on all selected columns. Assuming you are checking columns A to B from row 2 to 14.

=COUNTIF(ARRAYFORMULA($A$2:$A$14,$B$2:$B$14),$B2)>1

Select a formatting option to apply to duplicate rows. And then click “Done” to apply the rule.

How this Formula Works?

This formula counts the number of times a specific combination of values from columns A and B appears more than once in Google Sheets.

=COUNTIF(ARRAYFORMULA($A$2:$A$14&$B$2:$B$14),$A2&$B2)>1
  • ARRAYFORMULA($A$2:$A$14&$B$2:$B$14): This part of the formula combines each value from the range A2:A14 with the corresponding value from the range B2:B14 to create a new array. The ARRAYFORMULA function allows us to perform this operation on an array of cells rather than just a single cell.
  • $A2&$B2: This part of the formula combines the values in cells A2 and B2.
  • COUNTIF: The COUNTIF function counts the number of cells within a range that meet the given criterion. In this case, it counts the number of cells in the array created by ARRAYFORMULA that match the combination of values in cells A2 and B2.
  • >1: This part of the formula checks whether the count is greater than 1. If it is, the specific combination of values from cells A2 and B2 appears more than once in the ranges A2:A14 and B2:B14.

Use Conditional Formatting to Highlight only the Second & Subsequent Occurrences of Duplicates

When you use the above methods to highlight the duplicate values, sometimes it’s tough to choose what you want to do with the values highlighted. This problem is because, you have the all the values highlighted and which makes them tough to filter and then delete it.

So, if you want to highlight only the second and subsequent occurrences of duplicates in Google Sheets, leaving the first instance unhighlighted, the correct approach involves a slightly different formula that checks for the count of current and previous appearances.

You need to use this formula:

=COUNTIF(A$1:A1,A1)>1

When you open the conditional formatting, you need to paste this formula to the in the “Custom Formula is” dropdown. And then click “Done” to apply.

This formula counts occurrences up to the current row and compares them to the total count of that item in the entire column.

Now you can choose what you want to do with the data. You can filter these values using the Filter By Color options and then delete these rows to

Remove Conditional Formatting Rule

While working with duplicate values, you might need to delete the conditional formatting rules, and for this:

Go to the Format menu in top menu bar. From the menu, select Conditional formatting. The Conditional Format Rules panel lists all the currently used rules.

Browse through the list to find the rule you want to remove. Each rule has a trash can icon or a “Delete rule” button next to it. Click the trash can icon next to the rule you wish to delete.

Google App Script to Highlight Duplicates from a Range

Here’s a ready to use app script if you want to highlight the cells with the duplicate values from a range of cells in Google Sheet:

function highlightDuplicates() {
  var sheet = SpreadsheetApp.getActiveSpreadsheet().getActiveSheet();
  var range = sheet.getRange("A1:A13");  // Specify the range to search for duplicates
  var values = range.getValues();
  var duplicates = [];
  
  // Create a map to count occurrences
  var valueCount = {};
  for (var i = 0; i < values.length; i++) {
    var value = values[i][0].toString();  // Convert all values to string to handle different data types uniformly
    if (valueCount[value]) {
      valueCount[value].push(i);  // Push the row index to the map
      if (valueCount[value].length == 2) {  // Check if this is the second occurrence
        duplicates.push(valueCount[value][0]);  // Add the first occurrence row index to duplicates
      }
      duplicates.push(i);  // Add the current row index to duplicates
    } else {
      valueCount[value] = [i];
    }
  }
  
  // Clear any previous formatting
  range.setBackground(null);
  
  // Apply background color to duplicates
  for (var j = 0; j < duplicates.length; j++) {
    sheet.getRange(duplicates[j] + 1, range.getColumn()).setBackground('red');
  }
}

To use this, Go to Extensions > App Scripts > Paste the Code > Save > Run.

And below is the code if you want to highlight the duplicate values but only the only the second and subsequent duplicate values.

function highlightDuplicates() {
  var sheet = SpreadsheetApp.getActiveSpreadsheet().getActiveSheet();
  var range = sheet.getRange("A1:A13");  // Specify the range to search for duplicates
  var values = range.getValues();
  var valueCount = {};
  
  // Clear any previous formatting
  range.setBackground(null);
  
  // Iterate over each cell in the range
  for (var i = 0; i < values.length; i++) {
    var value = values[i][0].toString();  // Convert value to string to standardize comparisons
    if (valueCount[value]) {
      valueCount[value]++;
      // Only add to duplicates list if this is at least the second occurrence
      if (valueCount[value] === 2) {
        // Highlight the second occurrence immediately
        sheet.getRange(i + 1, range.getColumn()).setBackground('red');
      } else if (valueCount[value] > 2) {
        // Highlight all subsequent occurrences
        sheet.getRange(i + 1, range.getColumn()).setBackground('red');
      }
    } else {
      valueCount[value] = 1;  // Initialize count of this value
    }
  }
}

Tips to use

  • Use Bright Color: Choose a highlight color that stands out well, so duplicates are immediately noticeable.
  • Check Entire Rows: If you want to highlight rows entirely the same across multiple columns, modify the formula to include all those columns.
  • Performance: Using conditional formatting rules over large ranges can slow down your Google Sheets performance, especially with complex formulas like COUNTIFS.
  • Helper Column: Sometimes using a helper column to first compute occurrences and then applying conditional formatting based on that column’s values can simplify your setup.

Wrap Up

Highlighting duplicate cells in Google Sheets can be a powerful tool for data analysis, helping you quickly identify and manage repeating numbers or texts in your data.

Last Updated: May 09, 2024