c, http, sockets, unistd.h

Read call is slowing down execution by 1 minute?


So, I have this simple program in C that tries to make an HTTP request:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>

void err(const char *msg) {
    fprintf(stderr, "[ERROR] %s\n", msg);
    exit(1);
}

int main(int argc,char *argv[])
{
    char *host;
    int port;
    char *request;

    host = "api.ipify.org";
    port = 80;
    request = "GET / HTTP/1.1\r\nHost: api.ipify.org\r\n\r\n";

    struct hostent *server;
    struct sockaddr_in serv_addr;
    int sockfd, bytes, sent, received, total;
    char response[4096];

    sockfd = socket(AF_INET, SOCK_STREAM, 0);
    if (sockfd < 0) err("Couldn't open socket");

    server = gethostbyname(host);
    if (server == NULL) err("No such host");

    memset(&serv_addr, 0, sizeof(serv_addr));
    serv_addr.sin_family = AF_INET;
    serv_addr.sin_port = htons(port);
    memcpy(&serv_addr.sin_addr.s_addr, server->h_addr, server->h_length);

    /* connect the socket */
    if (connect(sockfd, (struct sockaddr *) &serv_addr, sizeof(serv_addr)) < 0) err("Couldn't connect");

    /* send the request */
    total = strlen(request);
    sent = 0;
    while (sent < total) {
        bytes = write(sockfd, request + sent, total - sent);
        if (bytes < 0) err("Couldn't send request");
        if (bytes == 0) break;
        sent += bytes;
    }

    /* receive the response */
    memset(response, 0, sizeof(response));
    total = sizeof(response) - 1;
    received = 0;
    while (received < total) {
        bytes = read(sockfd, response + received, total - received);
        if (bytes < 0) err("Couldn't receive response");
        if (bytes == 0) break;
        received += bytes;
    }

    /*
     * if the number of received bytes is the total size of the
     * array then we have run out of space to store the response
     * and it hasn't all arrived yet - so that's a bad thing
     */
    if (received == total) err("Couldn't store response");

    /* close the socket */
    close(sockfd);

    /* process response */
    printf("Response:\n%s\n",response);

    return 0;
}

When I compile and run it, it does what it is supposed to do, but it takes very long. Running it with the time command shows that it takes ~1m 0.3s to execute. If I comment out the read function call, the execution time goes back to 0.3s. So for some reason the read call is delaying my program by almost exactly 1 minute.

I've tried putting a printf at the very start of the main function, but its output also doesn't appear until 1 minute has passed.

Why is the entire main function delayed by a single function call, and how can I fix this?


Solution

  • First of all, you can include a field in the request header like this:

    Content-length: 0\r\n
    

    The request is a GET, and a GET normally has no body, but sending one is not forbidden (servers generally ignore it). Strictly speaking the field is optional here, because a request without Content-Length or Transfer-Encoding is treated as having no body, but sending Content-length: 0 states that explicitly.

    Second, since you neither close the connection nor shut it down on your side, and you are using HTTP/1.1, the server keeps the connection open and waits for more to come (a new request). Connections are persistent by default in HTTP/1.1. Read RFC 2616 (since superseded by RFC 9112) for information about the HTTP/1.1 connection model.
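
    One way to act on this while staying on HTTP/1.1 is to ask the server to close the connection after its response by adding a Connection: close header. A server that honours it closes the socket once the response is sent, so read returns 0 without waiting for the idle timeout. The following is a minimal sketch; build_request is a hypothetical helper that is not part of the original program.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not in the original program): formats a GET request
 * that asks the server to close the connection after responding.
 * Returns the request length, or -1 if buf is too small.
 */
int build_request(char *buf, size_t size, const char *host, const char *path)
{
    assert(buf != NULL && host != NULL && path != NULL);

    int n = snprintf(buf, size,
                     "GET %s HTTP/1.1\r\n"
                     "Host: %s\r\n"
                     "Connection: close\r\n"
                     "\r\n",
                     path, host);

    /* snprintf reports the length it wanted to write; detect truncation. */
    if (n < 0 || (size_t)n >= size) return -1;
    return n;
}
```

    In the program from the question, the result could replace the hard-coded request string, with the returned length used as total in the write loop.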

    After changing the following lines

        while (received < total) {
            bytes = read(sockfd, response + received, total - received);
            if (bytes < 0) err("Couldn't receive response");
            if (bytes == 0) break;
            received += bytes;
        }
    

    into

        while (received < total) {
            bytes = read(sockfd, response + received, total - received);
            if (bytes < 0) err("Couldn't receive response");
            if (bytes == 0) break;
            write(1, response + received, bytes); /* <<< this line added */
            received += bytes;
        }
    

    to see what we are receiving, I was able to observe that the server sends you a short response:

    $ http
    HTTP/1.1 200 OK
    Server: Cowboy
    Connection: keep-alive
    Content-Type: text/plain
    Vary: Origin
    Date: Mon, 21 Feb 2022 21:39:22 GMT
    Content-Length: 14
    Via: 1.1 vegur
    
    82.181.193.234_    (<-- cursor remains there, as the last _ char)
    

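    and then indeed keeps waiting for the next request (as specified in HTTP/1.1 for a keep-alive connection). Meanwhile your read call blocks, because no more data arrives and the connection is still open. Finally, the server times out and closes the connection, which is the one-minute delay you observe.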

    You can switch to HTTP/1.0, where connections are not persistent by default, and you will see the server close the connection immediately after sending the response. For a good interaction with an HTTP/1.1 server, though, you should parse the response headers as you receive them and use the Content-Length: value sent by the server (or the chunk sizes, for a Transfer-Encoding: chunked response) to detect where the response ends. The reason is the same as for the request: each side relies on this framing to find where one message ends, so that the connection can be reused for the next request. In this case, the server sends a

    Content-Length: 14\r\n
    

    so once you have received the two consecutive CRLFs that end the headers, you should read exactly 14 more bytes and stop there. Anything that arrives after that point belongs to the response to the next request.
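
    The stopping rule above can be sketched as a small check run after each read. This is only an illustration under stated assumptions: expected_response_size is a hypothetical helper, the buffer is assumed to be NUL-terminated (as response is in the question), and chunked responses are not handled.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>   /* strncasecmp (POSIX) */

/*
 * Hypothetical helper: given the bytes received so far (NUL-terminated),
 * return the total expected size of the response (headers + body) once the
 * header block is complete and contains a Content-Length field.
 * Returns 0 if the headers are incomplete or no Content-Length is present.
 */
size_t expected_response_size(const char *buf)
{
    assert(buf != NULL);

    const char *end = strstr(buf, "\r\n\r\n");  /* blank line ends the headers */
    if (end == NULL) return 0;                  /* headers not complete yet */

    /* Walk the header lines that follow the status line. */
    const char *line = strstr(buf, "\r\n");
    while (line != NULL && line < end) {
        line += 2;                              /* step past the CRLF */
        if (strncasecmp(line, "Content-Length:", 15) == 0) {
            unsigned long body = strtoul(line + 15, NULL, 10);
            return (size_t)(end - buf) + 4 + body;
        }
        line = strstr(line, "\r\n");
    }
    return 0;  /* no Content-Length: caller falls back to reading until EOF */
}
```

    In the read loop from the question, this could be called on response right after received += bytes, breaking out of the loop when the result is nonzero and received has reached it.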