CMake: how to find all matching entries in a list

I have a CMake list with thousands of entries, in which I want to find (not remove!) all duplicates. Is there a clever way to do this without looping through the whole list and doing a lot of searches?

My problems:

list(REMOVE_DUPLICATES) does not show me which duplicates it removed.
list(FIND) only finds the first occurrence. It does not show how many occurrences there are in the whole list.
list(FIND) only works on the whole list. I cannot tell it to start searching from a certain index.

The approach I would use (but do not like):

Step through each entry of the list, n = 0..length-1.
Create a sublist of the entries after n (n+1..length-1).
Use list(FIND) on the sublist to search for the current entry.
If FIND succeeds, we have a duplicate and know both indices (n, and n+1 plus the search result).

This would work, but it looks very bulky to me, since it has to create thousands of sublists. Is there really no better way offered by CMake?

Solution

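You can sort the list with list(SORT) and then traverse it once. After sorting, equal entries sit next to each other, so each entry only has to be compared with its predecessor.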

Sort your list:
    list(SORT entries)
Create an empty list of duplicates:
    set(duplicates)
Store the first element and remove it from the list (list(POP_FRONT) requires CMake 3.15 or newer):
    list(GET entries 0 previousEntry)
    list(POP_FRONT entries)
Iterate over the remaining entries (EQUAL compares numbers; use STREQUAL if the entries are not numeric):
    foreach (i IN LISTS entries)
      if (i EQUAL previousEntry)
        list(APPEND duplicates ${i})
      else()
        set(previousEntry ${i})
      endif()
    endforeach()

The duplicates list now contains every duplicate. If a value occurs n > 2 times, duplicates will contain n-1 copies of it. Running list(REMOVE_DUPLICATES) on duplicates fixes this.
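
As a minimal sketch of that last step: the starting value of duplicates below is an assumption, namely what the loop produces for the sample list 1 3 5 4 6 6 2 3 6 7 1 used in this answer.

```cmake
# Assumed state of `duplicates` after the loop for the sample list
# 1 3 5 4 6 6 2 3 6 7 1 (the value 6 occurs three times, so it appears twice).
set(duplicates 1 3 6 6)

# Collapse repeated values so that each duplicated entry is listed once.
list(REMOVE_DUPLICATES duplicates)

# Prints "Unique duplicates: 1;3;6"
message("Unique duplicates: ${duplicates}")
```

list(REMOVE_DUPLICATES) keeps the first occurrence of each value, so the sorted order of duplicates is preserved.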

Complexity: O(n log n) for the sort plus O(n) for the single traversal.

Complete CMake file to verify:

# list(POP_FRONT) requires CMake 3.15 or newer
cmake_minimum_required(VERSION 3.15)

project(test-duplicates-list)

set(entries 1 3 5 4 6 6 2 3 6 7 1)

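# After sorting, equal entries are adjacent
list(SORT entries)
set(duplicates)

# Take the first entry as the initial reference and remove it from the list
list(GET entries 0 previousEntry)
list(POP_FRONT entries)

# EQUAL compares numbers; use STREQUAL for non-numeric entries
foreach (i IN LISTS entries)
  if (i EQUAL previousEntry)
    list(APPEND duplicates ${i})
  else()
    set(previousEntry ${i})
  endif()
endforeach()

# Prints "Duplicates: 1;3;6;6"
message("Duplicates: ${duplicates}")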
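
Since the question also asks how many occurrences there are, the same sort-and-traverse idea can be extended to count them. The following is only a sketch: count_duplicates, its out_var parameter and the "value=count" output format are hypothetical names and choices made for illustration, not anything CMake provides. It uses STREQUAL so that non-numeric entries work too, and it avoids list(POP_FRONT).

```cmake
cmake_minimum_required(VERSION 3.15)

# Hypothetical helper (not a CMake built-in): reports each duplicated value
# once, together with how often it occurs, as "value=count" items.
function(count_duplicates out_var)
  set(sorted ${ARGN})   # work on a copy; the caller's list is left untouched
  set(result)
  list(LENGTH sorted len)
  if(len GREATER 0)
    list(SORT sorted)   # equal entries become adjacent
    list(GET sorted 0 previous)
    set(count 0)
    foreach(item IN LISTS sorted)
      if(item STREQUAL previous)
        math(EXPR count "${count} + 1")
      else()
        # A run of equal entries just ended; record it if it was a duplicate
        if(count GREATER 1)
          list(APPEND result "${previous}=${count}")
        endif()
        set(previous "${item}")
        set(count 1)
      endif()
    endforeach()
    # Record the final run, which the loop itself never closes
    if(count GREATER 1)
      list(APPEND result "${previous}=${count}")
    endif()
  endif()
  set(${out_var} "${result}" PARENT_SCOPE)
endfunction()

count_duplicates(report 1 3 5 4 6 6 2 3 6 7 1)

# Prints "Occurrences: 1=2;3=2;6=3"
message("Occurrences: ${report}")
```

Because it contains no project() call, this sketch can be saved to a file and run in script mode with cmake -P. Entries that themselves contain "=" would make the report ambiguous, so a different separator may be needed for such lists.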