The following code does not seem to finish in tolerable time if list2 contains more than 200,000 entries.
QStringList list1;
QStringList list2=getalist();
int count=list2.count();
for(int j=0;j<count;j++)
{
QString e=list2[j];
if(!list1.contains(e,Qt::CaseInsensitive))
{
list1<<e;
}
}
What is the bottleneck and how can I optimize it?
The bottleneck is the choice of QStringList
as the container for that task. It is a QList<QString>
and I believe it uses linear search to implement the function contains
.
The best solution would be to use a tree-based container like std::set or a hash based container like QSet
(yes, it it hash based contrary to the std::set). From Qt documentation:
QSet is one of Qt's generic container classes. It stores values in an unspecified order and provides very fast lookup of the values. Internally, QSet is implemented as a QHash.