node-http-proxy

How do I inject a snippet after the opening body tag in a reverse-proxied response?


I get an HTML page from upstream in the following format:

<!DOCTYPE html>
<html lang="en" dir="ltr">
  <head>
    <meta charset="utf-8">
    ...
  </head>
  <body class="foo bar baz" data-foo="klaskassa" data-baz="lkaslkas" id="body">
    ...
  </body>
</html>

I have an HTML snippet in the form:

<div class="my-snippet">
  ...
</div>

I'd like to insert the snippet after the opening body tag, to give me:

<!DOCTYPE html>
<html lang="en" dir="ltr">
  <head>
    <meta charset="utf-8">
    ...
  </head>
  <body class="foo bar baz" data-foo="klaskassa" data-baz="lkaslkas" id="body">
    <div class="my-snippet">
      ...
    </div>
    ...
  </body>
</html>

Restrictions

The solution must modify the stream, and not collect the body into a single string before running a transform. This app is restricted by memory, and handles far too many requests to take such a performance hit.

Things I've Tried

  1. Harmon: Apparently you can't read and write to the inner of an element. See this, this, and this.
  2. Replacestream: used as outlined here, but it didn't work; in fact, my responses just stopped.
  3. Transformer-proxy: but the data object can only be appended to.
  4. Spent 4 hours writing a Rack app in Ruby, but then I came to my senses and stopped rewriting my entire codebase.

Please:

Add sample code to the answer. Since this is basically a connect app, I can plug any middlewares you give me.


Solution

  • I created a simple script that launches a proxy server as well as a normal (target) server:

    var http = require('http'),
        httpProxy = require('http-proxy');
    
    
    var proxy = httpProxy.createProxyServer({
        target:'http://localhost:9000',
    }).listen(8000); 
    
    //
    // Create your target server
    //
    http.createServer(function (req, res) {
        let data = 'request successfully proxied!' + '\n' + JSON.stringify(req.headers, null, 2);
        res.writeHead(200, { 'Content-Type': 'text/plain', 'Content-Length': Buffer.byteLength(data) });
        res.write(data);
        res.end();
    }).listen(9000);
    

    I then tested it with curl:

    $ curl "localhost:8000"
    request successfully proxied!
    {
      "accept": "*/*",
      "user-agent": "curl/7.54.0",
      "host": "localhost:8000",
      "connection": "close"
    }
    

    Then I found the following in the documentation:

    selfHandleResponse true/false, if set to true, none of the webOutgoing passes are called and it's your responsibility to appropriately return the response by listening and acting on the proxyRes event
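
    In other words, with selfHandleResponse enabled, http-proxy skips the passes that normally copy the upstream status code and headers onto the client response, so the proxyRes handler has to take care of that too. A minimal sketch of such a step, assuming a hypothetical helper named forwardHead (not part of http-proxy) that is called before the first res.write():

    ```javascript
    // Hypothetical helper, not part of http-proxy: copy the upstream status
    // code and headers to the client response before any body is written.
    function forwardHead(proxyRes, res, bodyWillChange) {
        // Work on a copy so the upstream header object is left untouched.
        var headers = Object.assign({}, proxyRes.headers);
        if (bodyWillChange) {
            // The rewritten body will have a different length, so drop the
            // stale Content-Length and let Node choose the framing.
            delete headers['content-length'];
        }
        res.writeHead(proxyRes.statusCode, headers);
    }
    ```

    Inside a proxyRes handler this would be called as forwardHead(proxyRes, res, true) on the branch that rewrites the body, and as forwardHead(proxyRes, res, false) before a plain proxyRes.pipe(res).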

    So I updated the code as follows:

    var http = require('http'),
        httpProxy = require('http-proxy');
    
    
    var proxy = httpProxy.createProxyServer({
        target:'http://localhost:9000',
        selfHandleResponse: true
    }).listen(8000); // See (†)
    
    proxy.on('proxyRes', function(proxyRes, req, res) {
        if (proxyRes.headers["content-type"] && proxyRes.headers["content-type"].indexOf("text/plain") >=0) {
            // We need to do our modification
            if (proxyRes.headers["content-length"]) {
                //need to remove this header as we may modify the response
                delete proxyRes.headers["content-length"];
            }
            var responseModified = false;
            proxyRes.on('data', (data) => {
                let dataStr = "";
                if (!responseModified && (dataStr = data.toString()) && dataStr.indexOf("proxied!") >= 0) {
                    responseModified = true;
                    dataStr = dataStr.replace("proxied!", "proxied? Are you sure?")
                    res.write(Buffer.from(dataStr, "utf8"));
                    console.log("Writing modified data");
                } else {
                    res.write(data);
                    console.log("Writing unmodified data");
                }
            });
            proxyRes.on('end', () => {
                console.log("data ended");
                res.end();
            });
        } else {
            proxyRes.pipe(res)
        }
    });
    
    
    
    //
    // Create your target server
    //
    http.createServer(function (req, res) {
        let data = 'request successfully proxied!' + '\n' + JSON.stringify(req.headers, null, 2);
        res.writeHead(200, { 'Content-Type': 'text/plain', 'Content-Length': Buffer.byteLength(data) });
        res.write(data);
        res.end();
    }).listen(9000);
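
    One caveat with calling data.toString() on each chunk, as the handler above does: a multi-byte UTF-8 character can straddle two chunks, and decoding the two halves separately turns it into replacement characters. A small sketch using Node's built-in string_decoder module, which carries incomplete bytes over to the next write; the sample text and the split point are made up for illustration:

    ```javascript
    var StringDecoder = require('string_decoder').StringDecoder;

    // "é" is two bytes in UTF-8 (0xC3 0xA9); cut the buffer between them to
    // simulate a chunk boundary that falls inside a character.
    var whole = Buffer.from('café proxied!', 'utf8');
    var first = whole.subarray(0, 4);  // "caf" plus the first byte of "é"
    var second = whole.subarray(4);    // second byte of "é" plus " proxied!"

    // Decoding each chunk on its own corrupts the split character.
    var naive = first.toString() + second.toString();

    // StringDecoder holds back the incomplete byte until the rest arrives.
    var decoder = new StringDecoder('utf8');
    var safe = decoder.write(first) + decoder.write(second) + decoder.end();

    console.log(naive === 'café proxied!'); // false
    console.log(safe === 'café proxied!');  // true
    ```

    If the text being searched for is pure ASCII, another option is to search the raw Buffer and never decode at all.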
    

    Testing it again:

    $ curl "localhost:8000"
    request successfully proxied? Are you sure?
    {
      "accept": "*/*",
      "user-agent": "curl/7.54.0",
      "host": "localhost:8000",
      "connection": "close"
    }
    

    The output on the server console is:

    Writing modified data
    data ended
    

    This doesn't confirm that only part of the stream was modified, so I changed the target server to send its response in two separate writes:

    http.createServer(function (req, res) {
        let data = 'request successfully proxied!' + '\n' + JSON.stringify(req.headers, null, 2);
        res.writeHead(200, { 'Content-Type': 'text/plain', 'Content-Length': Buffer.byteLength(data) * 5 });
        res.write(data + data)
    
        setTimeout(() => {
            res.write(data + data + data);
            res.end();
        });
    
    }).listen(9000);
    

    And opened it in the browser:

    [Screenshot: BrowserOutput]

    As you can see, the data was replaced in the stream. As per the logic, the replacement happens only once, and the rest of the stream is passed through as is.
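
    Applied to the original question, the same idea can be packaged as a Transform stream that inserts the snippet after the opening body tag. The sketch below is one possible way to do it, not something http-proxy provides: makeInjector and createBodyInjector are hypothetical names, the tag is found with a regular expression rather than an HTML parser, and only the tail of a chunk that could be the start of a split <body ...> tag is held back.

    ```javascript
    const { Transform } = require('stream');

    // Matches a complete opening <body ...> tag (case-insensitive).
    const BODY_TAG = /<body(?:\s[^>]*)?>/i;
    // Matches a chunk tail that could still become an opening <body ...> tag.
    const PARTIAL_BODY_TAG = /<(?:b(?:o(?:d(?:y(?:\s[^>]*)?)?)?)?)?$/i;

    // Hypothetical helper: keeps state across chunks and returns, for each
    // chunk fed in, the bytes that are safe to send on.
    function makeInjector(snippet) {
        const insert = Buffer.from(snippet, 'utf8');
        let pending = Buffer.alloc(0); // held-back bytes of a possibly split tag
        let done = false;

        function feed(chunk) {
            if (done) return chunk; // snippet already inserted: pass through
            pending = Buffer.concat([pending, chunk]);
            // latin1 maps one byte to one character, so string offsets equal
            // byte offsets and multi-byte characters are never re-encoded.
            const text = pending.toString('latin1');
            const match = BODY_TAG.exec(text);
            let out;
            if (match) {
                const end = match.index + match[0].length;
                out = Buffer.concat([pending.subarray(0, end), insert, pending.subarray(end)]);
                pending = Buffer.alloc(0);
                done = true;
            } else {
                // Hold back only a tail that might still turn into <body ...>.
                const hold = text.search(PARTIAL_BODY_TAG);
                const cut = hold === -1 ? pending.length : hold;
                out = pending.subarray(0, cut);
                pending = pending.subarray(cut);
            }
            return out;
        }

        // Whatever is still held back when the upstream response ends.
        function rest() {
            const out = pending;
            pending = Buffer.alloc(0);
            return out;
        }

        return { feed: feed, rest: rest };
    }

    // Wraps the helper in a Transform so it can sit between proxyRes and res.
    function createBodyInjector(snippet) {
        const injector = makeInjector(snippet);
        return new Transform({
            transform(chunk, encoding, callback) { callback(null, injector.feed(chunk)); },
            flush(callback) { callback(null, injector.rest()); }
        });
    }
    ```

    In the proxyRes handler, the branch for text/html responses would then be proxyRes.pipe(createBodyInjector(snippet)).pipe(res), with the content-length header removed as before. This assumes the upstream sends uncompressed HTML; a gzip-encoded body would have to be decompressed before the transform can find the tag.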