c++ qt web-scraping qwebview qwebelement

Web scraping with QWebView and QWebElement returns increasing multiples


I'm currently working on a piece of software that queries gatherer.wizards.com in order to build a card database. While testing my functions I found that each card name is written to the output file more than once. My functions are as follows:

void cardDB::updateDB()
{
    this->view = new QWebView;
    QString urlString("http://gatherer.wizards.com/Pages/Card/Details.aspx?multiverseid=");

    for(int i = 1; i <= 4; i++)
    {

        // Load the page
        view->load(QUrl(urlString+QString::number(i)));
        QObject::connect(view, SIGNAL(loadFinished(bool)), this, SLOT(saveFile()));

        // Wait for saveFile() to finish
        QEventLoop loop;
        QObject::connect(this, SIGNAL(done()), &loop, SLOT(quit()));

        loop.exec();
    }
}

void cardDB::saveFile()
{
    QString fileName("test");
    // Grab the name tag
    QWebElement e = view->page()->mainFrame()->findFirstElement("div#ctl00_ctl00_ctl00_MainContent_SubContent_SubContent_nameRow");
    QString pageString = e.toPlainText();
    pageString.remove(0, 11);

    QFile localFile(fileName +".txt");
    if (!localFile.open(QIODevice::Append))
    {
        // Still need to implement error catching
    }
    else
    {
        localFile.write(pageString.toUtf8());
        localFile.close();
    }

    emit done();
}

My results come out like this:

Ankh of Mishra
Basalt Monolith
Basalt Monolith
Black Lotus
Black Lotus
Black Lotus
Black Vise
Black Vise
Black Vise
Black Vise

Before I added the event loop, I would just get the card name for i repeated i times. Now the number of times each name is written seems to match the iteration of the loop it was loaded in.


Solution

  • The following line of code added at the end of the for loop fixed my issue:

    QObject::disconnect(view, SIGNAL(loadFinished(bool)), this, SLOT(saveFile()));
    

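    I believe this was because each iteration of the loop connected loadFinished to saveFile() again without removing the earlier connection. The connections accumulated, so when the loadFinished signal came through on iteration i, saveFile() ran once for each of the i connections.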
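
The accumulation described above can be reproduced without Qt. The following is a minimal sketch in plain C++, not Qt code: the `Signal` class and the three `scrape...` functions are hypothetical stand-ins invented for illustration, and they only mimic the one behaviour that matters here, namely that every stored connection fires when the signal is emitted.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a Qt signal (NOT Qt's API): it stores callbacks
// and calls every one of them on emit, including duplicates.
class Signal {
public:
    void connect(std::function<void()> slot) { slots_.push_back(std::move(slot)); }
    void disconnectAll() { slots_.clear(); }
    void emitSignal() const {
        for (const auto& slot : slots_) slot();
    }
private:
    std::vector<std::function<void()>> slots_;
};

// Mimics the question's loop: a new connection is added on every iteration
// and never removed, so iteration i runs the slot i times.
std::vector<std::string> scrapeConnectingEachIteration(const std::vector<std::string>& names) {
    Signal loadFinished;
    std::vector<std::string> written;
    std::string currentPage;
    for (const auto& name : names) {
        currentPage = name;  // stands in for view->load(...)
        loadFinished.connect([&] { written.push_back(currentPage); });
        loadFinished.emitSignal();  // stands in for the page finishing its load
    }
    return written;
}

// Mimics the fix from the answer: drop the connection at the end of each iteration.
std::vector<std::string> scrapeDisconnectingEachIteration(const std::vector<std::string>& names) {
    Signal loadFinished;
    std::vector<std::string> written;
    std::string currentPage;
    for (const auto& name : names) {
        currentPage = name;
        loadFinished.connect([&] { written.push_back(currentPage); });
        loadFinished.emitSignal();
        loadFinished.disconnectAll();  // analogous to QObject::disconnect(...)
    }
    return written;
}

// Alternative: connect a single time before the loop starts.
std::vector<std::string> scrapeConnectingOnce(const std::vector<std::string>& names) {
    Signal loadFinished;
    std::vector<std::string> written;
    std::string currentPage;
    loadFinished.connect([&] { written.push_back(currentPage); });
    for (const auto& name : names) {
        currentPage = name;
        loadFinished.emitSignal();
    }
    return written;
}
```

Calling `scrapeConnectingEachIteration` with four names yields the same 1, 2, 3, 4 repetition pattern as the output in the question, while the other two functions yield each name once. In real Qt code the equivalent of the third variant would presumably be to move the `QObject::connect(view, SIGNAL(loadFinished(bool)), ...)` call above the `for` loop. Passing `Qt::UniqueConnection` as the connection type is another way to stop the same signal/slot pair from being connected twice.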