[SOLVED] Why does get_headers() return 400 Bad Request while CLI curl returns 200 OK?

Here's the URL: https://www.grammarly.com

I'm trying to fetch HTTP headers by using the native get_headers() function:

$headers = get_headers('https://www.grammarly.com');

The result is

HTTP/1.1 400 Bad Request
Date: Fri, 27 Apr 2018 12:32:34 GMT
Content-Type: text/plain; charset=UTF-8
Content-Length: 52
Connection: close

But if I do the same with the curl command-line tool, the result is different:

curl -sI https://www.grammarly.com/

HTTP/1.1 200 OK
Date: Fri, 27 Apr 2018 12:54:47 GMT
Content-Type: text/html; charset=UTF-8
Content-Length: 25130
Connection: keep-alive

What is the reason for this difference in responses? Is it some kind of poorly implemented security feature on Grammarly's server side, or something else?

Solution

This happens because get_headers() uses the default stream context, which sends almost no HTTP request headers, and many remote servers reject such bare requests. The missing header most likely to cause trouble is User-Agent: PHP sends none unless the user_agent ini setting or context option is set, whereas curl sends its own by default. You can set it before calling get_headers() with stream_context_set_default(). Here's an example that works for me:

$headers = get_headers('https://www.grammarly.com');

print_r($headers);

// has [0] => HTTP/1.1 400 Bad Request

stream_context_set_default(
    array(
        'http' => array(
            'user_agent' => 'php/testing'
        ),
    )
);

$headers = get_headers('https://www.grammarly.com');

print_r($headers);

// has [0] => HTTP/1.1 200 OK
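
If you would rather not change the default context for the whole script, a per-call context may be an alternative. The following is only a minimal sketch, assuming PHP 7.1 or later (where get_headers() accepts a context as its third argument); the helper names make_ua_context() and fetch_headers_with_ua() are hypothetical and not part of PHP, and I have not verified how Grammarly's server responds to it:

```php
<?php
// Hypothetical helper (not a PHP built-in): builds a one-off stream
// context whose only HTTP option is the User-Agent string.
function make_ua_context(string $userAgent)
{
    return stream_context_create([
        'http' => [
            'user_agent' => $userAgent,
        ],
    ]);
}

// Hypothetical helper: fetches headers using that context, so the
// script-wide default context is left untouched.
// Assumes PHP >= 7.1, where get_headers() takes a context argument.
function fetch_headers_with_ua(string $url, string $userAgent)
{
    return get_headers($url, false, make_ua_context($userAgent));
}

// Inspect the option that will be sent, without making a request.
$options = stream_context_get_options(make_ua_context('php/testing'));
echo $options['http']['user_agent'], PHP_EOL; // prints "php/testing"
```

Calling fetch_headers_with_ua('https://www.grammarly.com', 'php/testing') should then behave like the second get_headers() call above, with the difference that other stream functions in the same script keep using the unmodified default context.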