c++ stdset

How to remove duplicate values while they are being read?


Minimal reproducible example (the real code can't be shared because of license issues):

std::set<std::string> pool;

// values: 10 10 1 3 4 3 3 2 5 7 5 4 3 9 8 8 7
// (values is an iterable collection)
for (const auto& value : values) {
    pool.insert(value);
}

// expected read: 10 1 3 4 2 5 7 9 8

// actual read: 1 10 2 3 4 5 7 8 9 (10 after 1 because std::string)

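Which collection should I use to get the expected order, i.e. duplicates dropped while the values keep the order in which they were first read?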


Solution

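  • Use two containers. Insert each value into the set and use the return value of insert to tell whether the value was new: the second member of the returned pair is true only when the element was actually inserted. Push only the new items onto a vector; that vector then holds the unique items in insertion order. The code stores set iterators rather than copies of the strings, which is safe because inserting into a std::set does not invalidate existing iterators.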

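    #include <vector>
    #include <set>
    #include <string>
    #include <iostream>
    
    int main()
    {
        // Iterators to the set elements, in the order they were first inserted.
        std::vector<std::set<std::string>::iterator> itemsInOrder;
        // The set is what detects duplicates.
        std::set<std::string> pool;
    
        std::vector<std::string> values = {"10", "10", "1", "3"};
    
        for (const auto& value : values)
        {
            // insert returns a pair: iterator to the element, and whether it was new.
            auto ret = pool.insert(value);
            if (ret.second)
            {
                itemsInOrder.push_back(ret.first);
            }
        }
    
        // Prints 10, 1, 3 (one per line): unique values in first-seen order.
        for (auto item : itemsInOrder)
        {
            std::cout << *item << '\n';
        }
    }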
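
As a variation on the same two-container idea, here is a minimal sketch that wraps the loop in a reusable function and stores copies of the strings instead of set iterators. The name unique_in_order is hypothetical (it is not from the answer above), and swapping std::set for std::unordered_set is my assumption: the set is only used for the "seen before?" check, so its sorted order is not needed.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper (not from the original answer): returns the distinct
// values of `values` in the order in which each one first appears.
std::vector<std::string> unique_in_order(const std::vector<std::string>& values)
{
    std::unordered_set<std::string> seen;  // used only for the "already seen?" check
    std::vector<std::string> result;       // keeps first-seen order

    for (const auto& value : values)
    {
        // insert() returns {iterator, bool}; the bool is true only for new elements.
        if (seen.insert(value).second)
        {
            result.push_back(value);
        }
    }
    return result;
}
```

Called with the values from the question (10 10 1 3 4 3 3 2 5 7 5 4 3 9 8 8 7 as strings), this should return 10 1 3 4 2 5 7 9 8. The trade-off is that every unique string is stored twice, once in the set and once in the vector; the iterator-based version above avoids those copies.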