← Functions

Count duplicates as one item

How to count unique values in a list with duplicates using Excel's SUM and COUNTIF array formula.

I have a list of duplicate entries. There are 19 entries altogether with duplicates. For example, China in the list appears 4 times. For reporting purposes, I am only interested in the number of countries, regardless of duplications. In this example, the number of individual countries are 7 (China, USA, Philippines, Malaysia, Russia, France, and Germany).

There are several ways of doing it:

  1. You can use "Filter" and manually count the list of filtering. Well, we want to stay away from the manual counting part.
  2. You can also use pivot table to see the number of countries and manually count it. Besides the manual counting work, pivot table requires manual refreshing every time you update the data.
  3. If you know the list of the individual countries, you can use the COUNTIF function to list it. However, it defeats the purpose of this project because we don't know the number of individual countries and that is why we want to count the list.

If we need to run a report for this purpose on a frequent basis, none of the above ways will work well.

Microsoft Excel has a nice way of doing it by using one of the array functions. Excel borrows the data manipulation concept from databases and has its own set of array functions. All array functions require you to press Ctrl+Shift+Enter at the same time to enter them. The formula in the cell will appear enclosed by {}.

To achieve our goal, the formula to use is:

=SUM(1/COUNTIF(A1:A19,A1:A19))

Press Ctrl+Shift+Enter instead of just Enter. The formula will then appear as:

{=SUM(1/COUNTIF(A1:A19,A1:A19))}

The formula can be entered anywhere other than cells A1 to A19. Here is how it works:

  1. The COUNTIF function counts the data against itself. The range and criteria in the COUNTIF function are exactly the same — both are A1:A19. The result is that China occurs 4 times, USA occurs 3 times, Philippines occurs 2 times, and so on. Because this is an array function, there are 7 results, not just one.

  2. We want duplicates counted as one. For China, 1 divided by 4 equals 0.25. Since China occurs 4 times, 4 × 0.25 = 1. So China is counted as one even though it occurs 4 times. The same logic applies to all countries. The SUM function adds these results together and returns 7 — the number of unique countries.

The array function works without any manual counting or refreshing. Every time you update the list, the formula automatically returns the correct unique count.