Tags: php, arrays, web-scraping, foreach, simple-html-dom

Doubled elements in result array while scraping HTML content


I'm crawling an old site with more than 10,000 comments, which I'm trying to import into WordPress.

I'm using the simple_html_dom.php library, although that is not important here.

I fetch a listing URL with the first 24 posts, visit each post, and grab the element that holds its comments.

$url = 'http://xx/aktualnosci,wszystkie,0,'.$x.'.html'; //some URL with first 24 posts
$html = file_get_html($url);

$articlesCount = 0;
$commentsCount = 0;
$comments = []; // collected comment fragments

foreach ($html->find('ul.news_codrugi li') as $article) { //get all 24 posts urls
    $rawLink = $article->find('a');

    foreach ($rawLink as $testLink) {    
        $link = 'http://xx/'.$testLink->href;

        $rawTitle = $testLink->href;
        $rawTitle = explode(",", $rawTitle);
        $ggTitle = $rawTitle[1];
        $htmlNew = file_get_html($link);

        foreach ($htmlNew->find('div.komentarz_lista') as $comment) { //comment element
            $comm = $comment->find('p');
            foreach ($comm as $commText) {
                $cleanerCommText = trim(strip_tags($commText));
                $item['commRaw'] = $cleanerCommText;
                $comments[] = $item;
            }
            $commentsCount++;
        }
        $articlesCount++;
    }
    //unset($articles);
}

So far everything works: I've got all the comments in an array. The problem is that the comment text and the date and author are in <p> elements without any class or ID, so I have no selector to fetch them separately, and my array looks like this:

[0] => text, [1] => date and author, [2] => text, [3] => date and author, etc.

I'm trying to put it into a new array like [text] => text, [sign] => date and author:

$x = $commentsCount;
echo $x.'<br />';

$rawComm = array_column($comments, 'commRaw');
$rawCommCount = count($rawComm);

echo 'Pobrane wpisy: '.$rawCommCount.'<br />';
$z = 0;

foreach($rawComm as $commItem) {
    if($z % 2 == 0) {
        $commArr['text']    = $commItem;
    }else{
        $commArr['sign']    = $commItem;
        //echo $commItem;
    }
    echo 'Numer wpisu: '.$z.'<br />';
    $z++;
}

Inside the last loop, foreach($rawComm as $commItem), echoing the values works fine: the comment text and the date and author are printed properly. But when I try to put them into the new array $commArr, I get doubled items, so the array is twice as big, with everything duplicated.

And why do I need it in a new array? Because I want to put it into a DB.

So at this point, I don't know what causes this problem.


Solution

  • I am not a WordPress developer, and it has been years since I used it for a demo. You can use the array keys like this; at least, this is how I would do it in plain PHP.

        $a = array(
          array(
            'id' => 5698,
            'first_name' => 'Peter',
            'last_name' => 'Griffin',
          ),
          array(
            'id' => 4767,
            'first_name' => 'Ben',
            'last_name' => 'Smith',
          ),
          array(
            'id' => 3809,
            'first_name' => 'Joe',
            'last_name' => 'Doe',
          )
        );
        // Collect the values extracted in the foreach loop under prefixed keys
        $Collected_array_result = array();
        foreach($a as $key => $value ) {
            $Collected_array_result[':'.$key] = $value;
        }
        // Print the new array built from those values
        print_r($Collected_array_result);
    

    Output

    Array ( [:0] => Array ( [id] => 5698 [first_name] => Peter [last_name] => Griffin ) [:1] => Array ( [id] => 4767 [first_name] => Ben [last_name] => Smith ) [:2] => Array ( [id] => 3809 [first_name] => Joe [last_name] => Doe ) )
    

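    And how to insert it into the DB:

    // Column names come from the first row's keys (trusted, never user input); values are bound by name, one execute() per row.
    $cols = array_keys($a[0]);
    $stmt = $pdo->prepare("INSERT INTO comments (" . implode(', ', $cols) . ") VALUES (:" . implode(', :', $cols) . ")");
    foreach ($a as $row) { $stmt->execute($row); }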
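    A self-contained way to try a per-row insert is sketched below. It assumes a hypothetical comments table with id, first_name and last_name columns, uses an in-memory SQLite database purely for illustration, and the helper names buildInsertSql and insertRows are made up for this example. The demo part only runs when the SQLite PDO driver is available.

```php
<?php
/**
 * Build an INSERT statement with one named placeholder per column.
 * Table and column names are interpolated, so they must be trusted values.
 */
function buildInsertSql(string $table, array $cols): string
{
    return sprintf(
        'INSERT INTO %s (%s) VALUES (:%s)',
        $table,
        implode(', ', $cols),
        implode(', :', $cols)
    );
}

/**
 * Prepare the statement once and execute it once per row.
 * Returns the number of rows that were executed.
 */
function insertRows(PDO $pdo, string $table, array $rows): int
{
    if ($rows === []) {
        return 0;
    }
    $stmt = $pdo->prepare(buildInsertSql($table, array_keys($rows[0])));
    $count = 0;
    foreach ($rows as $row) {
        $stmt->execute($row); // values are bound, never concatenated
        $count++;
    }
    return $count;
}

// Same sample rows as in the answer above.
$a = [
    ['id' => 5698, 'first_name' => 'Peter', 'last_name' => 'Griffin'],
    ['id' => 4767, 'first_name' => 'Ben',   'last_name' => 'Smith'],
    ['id' => 3809, 'first_name' => 'Joe',   'last_name' => 'Doe'],
];

// Demo against an in-memory SQLite database (an assumption, not the asker's setup).
if (class_exists('PDO') && in_array('sqlite', PDO::getAvailableDrivers(), true)) {
    $pdo = new PDO('sqlite::memory:');
    $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    $pdo->exec('CREATE TABLE comments (id INTEGER, first_name TEXT, last_name TEXT)');
    $inserted = insertRows($pdo, 'comments', $a);
    echo "Inserted rows: $inserted\n";
}
```

    Preparing once and executing per row keeps every value bound as a parameter; only the column list is built from strings, which is why it has to come from code rather than from scraped input.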
    

    Get names from array and create a new array with names:

    $first_name = array_column($a, 'first_name', 'id');
    print_r($first_name);
    

    Output

    Array ( [5698] => Peter [4767] => Ben [3809] => Joe )
    

    UPDATE: Following @Dharman's comment about SQL injection and inserting data with a prepared statement: the question did not ask for an insert query, but if you do use one, filter the values from the array or bind them as follows.

    $first_name = array_column($a, 'first_name');
    $first = implode(', ', $first_name);
     echo $first;
    
    $last_names = array_column($a, 'last_name');
    $last = implode(', ', $last_names);
     echo $last;
    
    $id = array_column($a, 'id');
    $iddd = implode(', ', $id);
     echo $iddd;
    
    $sql = "INSERT INTO comments (first_name, last_name) VALUES (?,?)";
    $stmt= $pdo->prepare($sql);
    $stmt->execute([$first, $last]);
    

    This implodes all values of each column into one comma-separated string and binds each string to its placeholder, so the query inserts a single row.
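
    For the alternating list from the question ([0] => text, [1] => date and author, and so on), one way to build the [text] / [sign] rows is to split the list into pairs with array_chunk. This is only a sketch: the function name pairComments and the sample strings are made up, and it assumes the list really does alternate strictly. One possible cause of doubled rows, assuming the row was appended on every pass of the original loop rather than only after the 'sign' item, is that each pair gets stored twice; chunking avoids that because each pair produces exactly one row.

```php
<?php
/**
 * Turn an alternating list [text, sign, text, sign, ...] into rows
 * of the form ['text' => ..., 'sign' => ...].
 * A trailing item without a partner is skipped rather than guessed.
 */
function pairComments(array $rawComm): array
{
    $rows = [];
    foreach (array_chunk($rawComm, 2) as $pair) {
        if (count($pair) < 2) {
            continue; // odd item out: no signature to go with it
        }
        $rows[] = ['text' => $pair[0], 'sign' => $pair[1]];
    }
    return $rows;
}

// Hypothetical sample data in the same shape as $rawComm from the question.
$rawComm = [
    'First comment',
    '2019-01-01 Ann',
    'Second comment',
    '2019-01-02 Bob',
];

$commArr = pairComments($rawComm);
print_r($commArr);
```

    Each element of $commArr is then one comment with both fields, which can be passed row by row to a prepared statement.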