apache erlang cgi inets

inets httpd cgi script: How do you retrieve json data?


The cgi scripts that I have tried are unable to retrieve json data from my inets httpd server.

In order to retrieve json data in a cgi script, you need to be able to read the body of the request, which will contain something like:

{"a": 1, "b": 2}

With a perl cgi script, I can read the body of a request like this:

my $cgi = CGI->new;
my $req_body = $cgi->param('POSTDATA');

I assume that is an indirect way of reading what the server pipes to the script's stdin because in a python cgi script I have to write:

req_body = sys.stdin.read()

When I request a cgi script from an apache server, my perl and python cgi scripts successfully get the json data. But when I request the same cgi scripts from my inets httpd server, my perl cgi script reads nothing for the request body, and my python cgi script hangs until the server times out. My cgi scripts are able to retrieve data formatted as "a=1&b=2" from the inets httpd server. In that case the cgi facilities in both perl and python automatically parse the data for me, so instead of reading the body of the request myself, I just access the structures that cgi created.
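To make the contrast between the two body formats concrete, here is a small Python sketch. It uses the standard library's urllib.parse.parse_qs as a stand-in for the parsing that the cgi facilities do for form-encoded bodies (an assumption for illustration, not a claim about what cgi calls internally), and json.loads for the json body, which the script has to decode itself.

```python
import json
from urllib.parse import parse_qs

form_body = "a=1&b=2"            # application/x-www-form-urlencoded
json_body = '{"a": 1, "b": 2}'   # application/json

# Form-encoded data splits into key/value pairs; every value is a list of strings.
form_data = parse_qs(form_body)

# JSON has to be decoded from the raw request body by the script itself.
json_data = json.loads(json_body)

print(form_data)   # {'a': ['1'], 'b': ['2']}
print(json_data)   # {'a': 1, 'b': 2}
```

Note that the form values arrive as strings, while the json values keep their types.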

Here is my httpd server configuration (server.conf):

[
  {modules, [
    mod_alias,
    mod_actions,
    mod_esi,
    mod_cgi,
    mod_get,
    mod_log
  ]},
  {bind_address, "localhost"}, 
  {port,0},
  {server_name,"httpd_test"},
  {server_root,"/Users/7stud/erlang_programs/inets_proj"},
  {document_root,"./htdocs"},
  {script_alias, {"/cgi-bin/", "/Users/7stud/erlang_programs/inets_proj/cgi-bin/"} },
  {erl_script_alias, {"/erl", [mymod]} },
  {erl_script_nocache, true},
  {error_log, "./errors.log"},
  {transfer_log, "./requests.log"}
].

I start my httpd server with this program (s.erl):

-module(s).
-compile(export_all).

%Need to look up port with httpd:info(Server)

ensure_inets_start() ->
    case inets:start() of
        ok -> ok;
        {error,{already_started,inets}} -> ok
    end.

start() ->
    ok = ensure_inets_start(),

    {ok, Server} = inets:start(httpd, 
        [{proplist_file, "./server.conf"}]
    ),
    Server.

stop(Server) ->
    ok = inets:stop(httpd, Server).

My cgi script (1.py):

#!/usr/bin/env python3

import json
import cgi
import cgitb
cgitb.enable()  #errors to browser
import sys

sys.stdout.write("Content-Type: text/html")
sys.stdout.write("\r\n\r\n")

#print("<div>hello</div>")

req_body = sys.stdin.read()
my_dict = json.loads(req_body)

if my_dict:
    a = my_dict.get("a", "Not found")
    b = my_dict.get("b", "Not found")
    total = a + b
    print("<div>Got json: {}</div>".format(my_dict) )
    print("<div>a={}, b={}, total={}</div>".format(a, b, total))
else:
    print("<div>Couldn't read json data.</div>")

My cgi script (1.pl):

#!/usr/bin/env perl

use strict;
use warnings;
use 5.020;
use autodie;
use Data::Dumper;
use CGI;
use CGI::Carp qw(fatalsToBrowser);
use JSON;

my $q = CGI->new;
my $json = $q->param('POSTDATA');  # raw request body, as described above

print $q->header,
      $q->start_html("Test Page"),
      $q->h1("Results:"),
      $q->div("json=$json"),
      $q->end_html;

Server startup in terminal window:

~/erlang_programs/inets_proj$ erl
Erlang/OTP 20 [erts-9.2] [source] [64-bit] [smp:4:4] [ds:4:4:10] [async-threads:10] [hipe] [kernel-poll:false]
Eshell V9.2  (abort with ^G)

1> c(s).              
s.erl:2: Warning: export_all flag enabled - all functions will be exported
{ok,s}

2> Server = s:start().
<0.86.0>

3> httpd:info(Server).
[{mime_types,[{"htm","text/html"},{"html","text/html"}]},
 {server_name,"httpd_test"},
 {erl_script_nocache,true},
 {script_alias,{"/cgi-bin/",
                "/Users/7stud/erlang_programs/inets_proj/cgi-bin/"}},
 {bind_address,{127,0,0,1}},
 {modules,[mod_alias,mod_actions,mod_esi,mod_cgi,mod_get,
           mod_log]},
 {server_root,"/Users/7stud/erlang_programs/inets_proj"},
 {erl_script_alias,{"/erl",[mymod]}},
 {port,51301},
 {transfer_log,<0.93.0>},
 {error_log,<0.92.0>},
 {document_root,"./htdocs"}]
4> 

curl request:

$ curl -v \
> -H 'Content-Type: application/json' \
> --data '{"a": 1, "b": 2}' \
> http://localhost:51301/cgi-bin/1.py

*   Trying ::1...
* TCP_NODELAY set
* Connection failed
* connect to ::1 port 51301 failed: Connection refused
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 51301 (#0)
> POST /cgi-bin/1.py HTTP/1.1
> Host: localhost:51301
> User-Agent: curl/7.58.0
> Accept: */*
> Content-Type: application/json
> Content-Length: 16
> 
* upload completely sent off: 16 out of 16 bytes

===== hangs for about 5 seconds ====

< HTTP/1.1 504 Gateway Time-out
< Date: Thu, 08 Mar 2018 11:02:27 GMT
< Content-Type: text/html
< Server: inets/6.4.5
* no chunk, no close, no size. Assume close to signal end
< 
* Closing connection 0
$ 

My directory structure:

~/erlang_programs$ tree inets_proj/
inets_proj/
├── apache_cl.erl
├── cgi-bin
│   ├── 1.pl
│   └── 1.py
├── cl.beam
├── cl.erl
├── errors.log
├── htdocs
│   └── file1.txt
├── mylog.log
├── mymod.beam
├── mymod.erl
├── old_server.conf
├── old_server3.conf
├── old_server4.conf
├── requests.log
├── s.beam
├── s.erl
├── server.conf
└── urlencoded_post_cl.erl

Solution

  • I dug up the RFC for the cgi spec, which says:

    RFC 3875                    CGI Version 1.1                 October 2004
    
    
    4.2.  Request Message-Body
    
       Request data is accessed by the script in a system-defined method;
       unless defined otherwise, this will be by reading the 'standard
       input' file descriptor or file handle.
    
          Request-Data   = [ request-body ] [ extension-data ]
          request-body   = <CONTENT_LENGTH>OCTET
          extension-data = *OCTET
    
       A request-body is supplied with the request if the CONTENT_LENGTH is
       not NULL.  The server MUST make at least that many bytes available
       for the script to read.  The server MAY signal an end-of-file
       condition after CONTENT_LENGTH bytes have been read or it MAY supply
       extension data.  Therefore, the script MUST NOT attempt to read more
       than CONTENT_LENGTH bytes, even if more data is available.  However,
       it is not obliged to read any of the data.
    

    I don't understand what extension data is, but the key line is:

    the [cgi] script MUST NOT attempt to read more than CONTENT_LENGTH bytes, even if more data is available.

    If I alter my python script to read exactly CONTENT_LENGTH bytes rather than calling sys.stdin.read() with no argument, which doesn't return until it gets an eof signal, then my python cgi script successfully retrieves the json data from my inets httpd server.

    #!/usr/bin/env python3
    
    import json
    import sys
    import os
    
    content_len = int(os.environ["CONTENT_LENGTH"])
    
    req_body = sys.stdin.read(content_len)
    my_dict = json.loads(req_body)
    
    sys.stdout.write("Content-Type: text/html")
    sys.stdout.write("\r\n\r\n")
    
    if my_dict:
        a = my_dict.get("a", "Not found")
        b = my_dict.get("b", "Not found")
        total = a + b
        print("<div>Content-Length={}</div>".format(content_len))
        print("<div>Got json: {}</div>".format(my_dict) )
        print("<div>a={}, b={}, total={}</div>".format(a, b, total))
    else:
        print("<div>Couldn't read json data.</div>")
    
    '''
    form = cgi.FieldStorage()
    
    if "a" not in form:
        print("<H1>Error:</H1>")
        print("<div>'a' not in form</div>")
    else:
        print("<p>a:{}</p>".format( form["a"].value) )
    
    
    if "b" not in form:
        print("<H1>Error:</H1>")
        print("<div>'b' not in form</div>")
    else:
        print("<p>b:{}</p>".format(form["b"].value) )
    '''
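
The difference between the two kinds of read can be reproduced without a server. The sketch below is only an illustration under assumptions: it uses an os.pipe to play the role of the script's stdin, it assumes (as the hang suggests, but I have not confirmed in the inets source) that the server leaves the pipe open after sending the body, and it relies on select, so it is meant for a POSIX system.

```python
import os
import select

body = b'{"a": 1, "b": 2}'
content_length = len(body)      # what the server would put in CONTENT_LENGTH

# The read end plays the script's stdin; the write end plays the server.
read_fd, write_fd = os.pipe()
os.write(write_fd, body)        # the body is sent, but the pipe is not closed

# A bounded read returns as soon as the announced number of bytes has arrived.
data = os.read(read_fd, content_length)

# Nothing is left and no end-of-file has been signalled, so an unbounded
# read() at this point would block until the writer closed its end.
readable, _, _ = select.select([read_fd], [], [], 0.1)
would_block = not readable

print(data, would_block)

os.close(write_fd)
os.close(read_fd)
```

The bounded read gets all 16 bytes, while the select call shows that a further read would have nothing to return and no eof to stop on.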
    

    Server info:

    4> httpd:info(Server).
    [{mime_types,[{"htm","text/html"},{"html","text/html"}]},
     {server_name,"httpd_test"},
     {erl_script_nocache,true},
     {script_alias,{"/cgi-bin/",
                    "/Users/7stud/erlang_programs/inets_proj/cgi-bin/"}},
     {bind_address,{127,0,0,1}},
     {modules,[mod_alias,mod_actions,mod_esi,mod_cgi,mod_get,
               mod_log]},
     {server_root,"/Users/7stud/erlang_programs/inets_proj"},
     {erl_script_alias,{"/erl",[mymod]}},
     {port,65451},
     {transfer_log,<0.93.0>},
     {error_log,<0.92.0>},
     {document_root,"./htdocs"}]
    5> 
    

    curl request (note that curl automatically calculates the content length and puts it in a Content-Length header):

    ~$ curl -v \
    > -H 'Content-Type: application/json' \
    > --data '{"a": 1, "b": 2}' \
    > http://localhost:65451/cgi-bin/1.py
    
    *   Trying ::1...
    * TCP_NODELAY set
    * Connection failed
    * connect to ::1 port 65451 failed: Connection refused
    *   Trying 127.0.0.1...
    * TCP_NODELAY set
    * Connected to localhost (127.0.0.1) port 65451 (#0)
    > POST /cgi-bin/1.py HTTP/1.1
    > Host: localhost:65451
    > User-Agent: curl/7.58.0
    > Accept: */*
    > Content-Type: application/json
    > Content-Length: 16
    > 
    * upload completely sent off: 16 out of 16 bytes
    < HTTP/1.1 200 OK
    < Date: Fri, 09 Mar 2018 04:36:42 GMT
    < Server: inets/6.4.5
    < Transfer-Encoding: chunked
    < Content-Type: text/html
    < 
    <div>Content-Length=16</div
    <div>Got json: {'a': 1, 'b': 2}</div>
    <div>a=1, b=2, total=3</div>
    * Connection #0 to host localhost left intact
    ~$ 
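
One caveat about the Content-Length: 16 shown above: the RFC counts the body in octets, while sys.stdin.read(n) in python 3 counts characters. For ASCII json like mine the two are the same, but with non-ASCII data a character-based read could ask for more bytes than the server sends. Here is a hedged sketch of a helper that reads from the binary stream instead. The name read_json_body and its parameters are hypothetical, and it assumes the body is UTF-8 encoded.

```python
import io
import json
import os
import sys


def read_json_body(environ=os.environ, stream=None):
    """Return the decoded JSON request body, or None if no body was sent."""
    if stream is None:
        stream = sys.stdin.buffer   # binary stdin: read(n) counts bytes
    try:
        content_len = int(environ.get("CONTENT_LENGTH") or 0)
    except ValueError:
        return None
    if content_len <= 0:
        return None
    raw = stream.read(content_len)  # never ask for more than CONTENT_LENGTH
    return json.loads(raw.decode("utf-8"))  # assumes a UTF-8 encoded body


# Usage with a fake environment and body; "é" is one character but two bytes.
fake_body = '{"name": "é"}'.encode("utf-8")
fake_env = {"CONTENT_LENGTH": str(len(fake_body))}
result = read_json_body(fake_env, io.BytesIO(fake_body))
print(result)
```

In a real cgi script the call would simply be read_json_body(), which falls back to os.environ and sys.stdin.buffer.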
    

    Here's the perl script that I got to work with inets httpd (1.pl):

    #!/usr/bin/env perl
    
    use strict;
    use warnings;
    use 5.020;
    use autodie;
    use Data::Dumper;
    use JSON;
    
    if (my $content_len = $ENV{CONTENT_LENGTH}) {
    
        read(STDIN, my $json, $content_len);
        my $href = decode_json($json);
        my $a = $href->{a};
        my $b = $href->{b};
    
        print 'Content-type: text/html';
        print "\r\n\r\n";
        print "<div>a=$a</div>";
        print "<div>b=$b</div>";
    
        #my $q = CGI->new; #Doesn't work with inets httpd server
        #my $q = CGI->new(''); #Doesn't try to read from stdin, so it does work.
    
        #  print $q->header,
        #    $q->start_html("Test Page"),
        #    $q->div("json=$json"),
        #    $q->div("a=$a"),
        #    $q->div("b=$b"),
        #    $q->div("total=$total"),
        #    $q->end_html;
    } 
    else {
        my $error = "Could not read json: No Content-Length header in request.";
        print 'Content-type: text/html';
        print "\r\n\r\n";
        print "<div>$error</div>";
    
    
        #   my $q = CGI->new;
        #   print $q->header,
        #         $q->start_html("Test Page"),
        #         $q->h1($error),
        #         $q->end_html;
    }
    

    I couldn't get perl's CGI module to work in conjunction with reading from STDIN. Edit: A kind soul at perlmonks helped me solve that one:

    my $q = CGI->new('');
    

    The blank string tells CGI->new not to read from stdin and parse the data.