CMake: how to find all matching entries in list


I have a CMake list with thousands of entries in which I want to find (not remove!) all duplicates. Is there a clever way to do this without looping through the whole list and running a separate search for every entry?

The only approach I can think of (and do not like) is this:

  1. step through each entry of the list, n = 0..length-1
  2. create a sublist of the entries after it, n+1..length-1
  3. use list(FIND) on the sublist to search for the current entry
  4. if FIND succeeds, we have a duplicate and know both indices (n, and n+1 plus the search result)

This would work, but it seems very clumsy to me, since it has to create thousands of sublists. Is there really no better way offered by CMake?
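For reference, the approach described above could be sketched roughly as follows. The sample data and the variable names (lastStart, rest, match) are assumptions made for illustration. The sublist starts at n+1, because a sublist that included the current entry would always find it at position 0.

```cmake
cmake_minimum_required(VERSION 3.12)  # list(SUBLIST) needs CMake 3.12 or newer

set(entries a b c b d a)  # assumed sample data

list(LENGTH entries length)
if(length GREATER 1)
  math(EXPR lastStart "${length} - 2")
  foreach(n RANGE 0 ${lastStart})
    list(GET entries ${n} current)
    # Build the sublist of everything after index n.
    math(EXPR start "${n} + 1")
    list(SUBLIST entries ${start} -1 rest)
    # Search the sublist for the current entry; pos is -1 if it is not found.
    list(FIND rest "${current}" pos)
    if(NOT pos EQUAL -1)
      # Translate the position in the sublist back to an index in entries.
      math(EXPR match "${start} + ${pos}")
      message("'${current}' at index ${n} occurs again at index ${match}")
    endif()
  endforeach()
endif()
```

Saved as a script and run with cmake -P, this should report 'a' (indices 0 and 5) and 'b' (indices 1 and 3). It needs one sublist and one linear search per entry, which is the quadratic cost the question wants to avoid.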


Solution

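  • You can use list SORT and then traverse the sorted list once.

    1. Sort your list, so that equal entries end up next to each other.
      list(SORT entries)
    2. Create an empty list for the duplicates.
      set(duplicates)
    3. Store the first element and remove it from the list (POP_FRONT requires CMake 3.15 or newer).
      list(GET entries 0 previousEntry)
      list(POP_FRONT entries)
    4. Iterate over the remaining elements and compare each one with its predecessor. STREQUAL compares as strings, which matches the default order of list(SORT); EQUAL would only work for numeric entries.
      foreach (i IN LISTS entries)
        if ("${i}" STREQUAL "${previousEntry}")
          list(APPEND duplicates "${i}")
        else()
          set(previousEntry "${i}")
        endif()
      endforeach()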

    duplicates now contains all duplicates. If a value occurs n times, duplicates contains n-1 copies of it, so any value that occurs more than twice is listed more than once. Running list(REMOVE_DUPLICATES duplicates) afterwards fixes this.
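    A minimal sketch of that clean-up step, using an assumed sample value for duplicates in which 6 is listed twice:

    ```cmake
    # Assumed sample value: 6 occurred three times in the source list,
    # so the traversal recorded it twice.
    set(duplicates 1 3 6 6)

    # Keep only the first occurrence of each value.
    list(REMOVE_DUPLICATES duplicates)

    message("Duplicates: ${duplicates}")  # prints: Duplicates: 1;3;6
    ```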

    Complexity: O(n log(n)) for the sort plus O(n) for the single traversal, so O(n log(n)) overall.

    Complete CMake file to verify:

    cmake_minimum_required(VERSION 3.15)  # list(POP_FRONT) needs CMake 3.15 or newer
    
    project(test-duplicates-list)
    
    set(entries 1 3 5 4 6 6 2 3 6 7 1)
    
    list(SORT entries)
    set(duplicates)
    list(GET entries 0 previousEntry)
    list(POP_FRONT entries)
    
    foreach (i IN LISTS entries)
      # Compare as strings, matching the string order used by list(SORT).
      if ("${i}" STREQUAL "${previousEntry}")
        list(APPEND duplicates "${i}")
      else()
        set(previousEntry ${i})
      endif()
    endforeach()
    
    message("Duplicates: ${duplicates}")
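    If the original list has to stay unsorted, the same idea can be wrapped in a function that works on a copy. The following is only a sketch: find_duplicates, its signature, and the sample data are hypothetical names chosen for illustration, not an existing CMake command. It compares with STREQUAL so that it also handles non-numeric entries, and it reports each duplicated value once.

    ```cmake
    cmake_minimum_required(VERSION 3.15)  # list(POP_FRONT) needs CMake 3.15 or newer

    # find_duplicates(<out_var> <item>...)
    # Hypothetical helper: stores every value that occurs more than once in <out_var>.
    function(find_duplicates out_var)
      set(sorted ${ARGN})   # work on a copy, the caller's list stays untouched
      set(duplicates)
      list(LENGTH sorted count)
      if(count GREATER 1)
        list(SORT sorted)                  # equal values become neighbours
        list(POP_FRONT sorted previous)    # remove the first element and remember it
        foreach(item IN LISTS sorted)
          if("${item}" STREQUAL "${previous}")
            list(APPEND duplicates "${item}")
          else()
            set(previous "${item}")
          endif()
        endforeach()
        # Report each duplicated value only once.
        if(NOT "${duplicates}" STREQUAL "")
          list(REMOVE_DUPLICATES duplicates)
        endif()
      endif()
      set(${out_var} "${duplicates}" PARENT_SCOPE)
    endfunction()

    set(entries apple pear fig pear kiwi apple pear)  # assumed sample data
    find_duplicates(result ${entries})
    message("Duplicates: ${result}")  # prints: Duplicates: apple;pear
    ```

    The block contains no project() call, so it can be run directly as a script with cmake -P. Note that it reports which values are duplicated, not their positions in the original list; sorting discards the original indices.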