qtsortingoptimizationqstringqstringlist

How can I optimize the code for big QStringList?


The following code does not seem to finish in tolerable time if list2 contains more than 200,000 entries.

    QStringList list1; 
    QStringList list2=getalist();
    int count=list2.count();
    for(int j=0;j<count;j++)
    {
        QString e=list2[j];
        if(!list1.contains(e,Qt::CaseInsensitive))
        {
            list1<<e;
        }
    }

What is the bottleneck and how can I optimize it?


Solution

  • The bottleneck is the choice of QStringList as the container for that task. It is a QList<QString> and I believe it uses linear search to implement the function contains.

    The best solution would be to use a tree-based container like std::set or a hash based container like QSet (yes, it it hash based contrary to the std::set). From Qt documentation:

    QSet is one of Qt's generic container classes. It stores values in an unspecified order and provides very fast lookup of the values. Internally, QSet is implemented as a QHash.