I have a CMake list with thousands of entries, and I want to find (not remove!) all duplicates. Is there a clever way to do this without looping through the whole list and doing a lot of searches?
My problems:

- list REMOVE_DUPLICATES does not show me which duplicates it removed.
- list FIND only finds the first occurrence. It does not show how many occurrences there are in the whole list.
- list FIND only works on the whole list. I cannot tell it to start the FIND from a certain index.

The approach I would use (but do not like):

- list FIND on the sublist to search for the current entry
- If FIND was successful, we have a duplicate and know the index (n and search result)

This would work, but looks very bulky to me, since it has to create thousands of sublists. Is there really no better way offered by CMake?
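For reference, a minimal sketch of that sublist approach might look like the following. It is only one possible reading of the steps above: the sample data and the variable names (count, tail, tailStart, offset, duplicateIndex) are made up for illustration, and list(SUBLIST) assumes CMake 3.12 or newer.

```cmake
# Hypothetical sample data; the real list has thousands of entries.
set(entries a b c b d a)

list(LENGTH entries count)
if(count GREATER 1)
  # The last entry has no tail to search, so stop one index earlier.
  math(EXPR lastIndex "${count} - 2")
  foreach(n RANGE 0 ${lastIndex})
    list(GET entries ${n} current)
    # Build the sublist of everything after index n (-1 means "to the end").
    math(EXPR tailStart "${n} + 1")
    list(SUBLIST entries ${tailStart} -1 tail)
    list(FIND tail "${current}" offset)
    if(NOT offset EQUAL -1)
      # offset is relative to the sublist, so shift it back to the full list.
      math(EXPR duplicateIndex "${tailStart} + ${offset}")
      message("Entry '${current}' at index ${n} repeats at index ${duplicateIndex}")
    endif()
  endforeach()
endif()
```

Each iteration copies the tail and scans it linearly, so the work grows quadratically with the list length, which is the bulkiness the question is worried about.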
You can use list SORT and traverse the sorted list once.
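    list(SORT entries)
    set(duplicates)
    # Take the first entry as the initial comparison value and remove it
    # from the list (list(POP_FRONT) needs CMake 3.15 or newer).
    list(POP_FRONT entries previousEntry)
    foreach (i IN LISTS entries)
        # After sorting, equal entries are adjacent. STREQUAL also works
        # for non-numeric entries, unlike EQUAL.
        if ("${i}" STREQUAL "${previousEntry}")
            list(APPEND duplicates "${i}")
        else()
            set(previousEntry "${i}")
        endif()
    endforeach()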
After the loop, duplicates contains all duplicates. If the same value occurs n > 2 times, duplicates will contain it n-1 times. Calling list(REMOVE_DUPLICATES duplicates) afterwards fixes this.
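That cleanup step could look like this. It is a minimal sketch: the starting content of duplicates is an assumed example of what the loop produces when one value occurs three times.

```cmake
# Assumed loop result for an input in which 1 and 3 occur twice and 6 three times.
set(duplicates 1 3 6 6)

# Keep only the first occurrence of each value.
list(REMOVE_DUPLICATES duplicates)

message("Unique duplicates: ${duplicates}")  # prints "Unique duplicates: 1;3;6"
```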
Complexity: O(n log(n)) for the sort + O(n) for the single traversal
Complete CMake file to verify:
    cmake_minimum_required(VERSION 3.15)  # list(POP_FRONT) requires 3.15
    project(test-duplicates-list)

    set(entries 1 3 5 4 6 6 2 3 6 7 1)
    list(SORT entries)
    set(duplicates)
    # Take the first entry as the initial comparison value and remove it
    list(POP_FRONT entries previousEntry)
    foreach (i IN LISTS entries)
        # After sorting, equal entries are adjacent
        if ("${i}" STREQUAL "${previousEntry}")
            list(APPEND duplicates "${i}")
        else()
            set(previousEntry "${i}")
        endif()
    endforeach()
    message("Duplicates: ${duplicates}")  # prints "Duplicates: 1;3;6;6"
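If the check is needed in several places, the same sort-and-scan idea can be wrapped in a function. The sketch below is only an illustration: find_duplicates and its parameter names are hypothetical, not something CMake provides, and it already reduces the result to one entry per duplicated value.

```cmake
cmake_minimum_required(VERSION 3.15)  # list(POP_FRONT) requires 3.15

# Hypothetical helper: stores every value that occurs more than once
# among the remaining arguments in <outVar>, each value listed once.
function(find_duplicates outVar)
  set(sorted ${ARGN})
  set(found "")
  list(LENGTH sorted count)
  if(count GREATER 1)
    list(SORT sorted)
    # The first entry is the initial comparison value.
    list(POP_FRONT sorted previous)
    foreach(entry IN LISTS sorted)
      if("${entry}" STREQUAL "${previous}")
        list(APPEND found "${entry}")
      else()
        set(previous "${entry}")
      endif()
    endforeach()
    # A value occurring n times was appended n-1 times; collapse those.
    list(REMOVE_DUPLICATES found)
  endif()
  set(${outVar} "${found}" PARENT_SCOPE)
endfunction()

find_duplicates(dups 1 3 5 4 6 6 2 3 6 7 1)
message("Duplicates: ${dups}")  # prints "Duplicates: 1;3;6"
```

Saved as a standalone script, this can be run with cmake -P. For an existing list variable, the call would be find_duplicates(dups ${entries}); note that passing the list through ARGN drops empty elements.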