php

get_meta_tags() throws "failed to open stream: HTTP request failed! HTTP/1.1 403 Forbidden"


I'm trying to get meta data from a website URL using the get_meta_tags() function. Most of the URLs I tried work fine, but one URL throws the error: failed to open stream: HTTP request failed! HTTP/1.1 403 Forbidden.

Is there a way to get past this restriction? If not, is there a way to detect whether a given website can be accessed? That way I could at least handle the failure without the error showing up, because I need some information from the meta data.

My code is simply this:

get_meta_tags("https://www.udemy.com/course/beginning-c-plus-plus-programming/");


Solution

  • It looks like the site blocks requests from PHP scripts to prevent scraping.

    You can try to make the site think it is being accessed by a human using a web browser.

    You can change the User-Agent header sent with the request using stream_context_create():

    $context = stream_context_create(
        array(
            "http" => array(
                "header" => "User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36"
            )
        )
    );
    
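    // get_meta_tags() expects a filename or URL rather than an HTML string and has no
    // context parameter, so fetch the page with the context and pass it via a data: URI.
    $html = file_get_contents('https://www.udemy.com/course/beginning-c-plus-plus-programming/', false, $context);
    $tags = get_meta_tags('data://text/html;base64,' . base64_encode($html));
    var_dump($tags);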
    

    Lists of the most common user agent strings are easy to find online.

    P.S. Keep in mind that this is not really fair to the site, which presumably blocks scripts on purpose.
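
The question also asks how to detect whether a site can be accessed at all. Below is a minimal sketch of one way to do that, not a definitive implementation. The helper names http_status_from_headers() and fetch_meta_tags() are hypothetical. The sketch assumes allow_url_fopen is enabled and relies on the 'ignore_errors' HTTP context option and the $http_response_header variable that PHP's HTTP wrapper fills in. Since get_meta_tags() takes a filename or URL, the fetched HTML is handed to it through a data: URI.

```php
<?php
// Hypothetical helper: extract the final HTTP status code from response header lines.
function http_status_from_headers(array $headers): ?int
{
    $status = null;
    foreach ($headers as $line) {
        // After redirects there are several status lines; the last one wins.
        if (preg_match('#^HTTP/\S+\s+(\d{3})#', $line, $m)) {
            $status = (int) $m[1];
        }
    }
    return $status;
}

// Hypothetical helper: returns ['status' => int|null, 'tags' => array].
function fetch_meta_tags(string $url, string $userAgent): array
{
    $context = stream_context_create([
        'http' => [
            'header'        => "User-Agent: $userAgent",
            // Return the body on 4xx/5xx instead of raising a warning.
            'ignore_errors' => true,
        ],
    ]);

    // @ suppresses warnings for failures that happen before any HTTP response (e.g. DNS).
    $html = @file_get_contents($url, false, $context);

    // $http_response_header is set in the local scope by the HTTP wrapper, if a response arrived.
    $status = http_status_from_headers($http_response_header ?? []);

    if ($html === false || $status === null || $status >= 400) {
        return ['status' => $status, 'tags' => []];
    }

    // get_meta_tags() wants a filename/URL, so wrap the HTML in a data: URI.
    $tags = get_meta_tags('data://text/html;base64,' . base64_encode($html));

    return ['status' => $status, 'tags' => $tags ?: []];
}
```

Calling fetch_meta_tags() with the URL and a browser-like User-Agent string returns the status alongside the tags, so the caller can check whether 'status' is 403 (or null, when no response arrived) and skip that URL instead of letting the warning surface.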