HTTP Proxies: AKA School Firewall Evaders

As a fun little side project, and because I work with proxies a lot in my current job at Verodin, I decided to build an HTTP proxy. I fiddled with several language choices, but eventually landed on Node.js, as its event-driven nature lends itself well to networking applications. I also wanted to take advantage of cool language features like closures, which JavaScript shares with languages like Common Lisp. Before I talk about my implementation and my stumbles along the way, I’ll give a quick explanation of HTTP proxies.

HTTP Proxies

Image

A proxy, like the one shown above, is often used as a gateway between two networks. In short, a proxy is handed requests from clients on one side (often an internal network) and sends each request off to wherever the client wanted it to go. The intended recipient replies to the proxy, which then forwards the reply back to the client that sent the request.

HTTP proxies work at the application layer, and only mediate the sending and receiving of HTTP requests and responses.

Proxies are useful tools for several reasons, many pertaining to security. At a base level, a proxy can log all traffic that goes through it, making it a breeze to monitor traffic to and from a network. In addition, proxies can have extra capabilities. One of the earliest uses of proxy servers was caching: frequently used resources were stored by the proxy, saving money on bandwidth and processing costs.
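To make the caching idea concrete, here is a minimal sketch of an in-memory cache keyed by URL with a time-to-live. The `ResponseCache` class and its method names are hypothetical and not part of the proxy described in this post. A real caching proxy would also honor headers like Cache-Control, which this sketch ignores.

```javascript
/**
 * Tiny in-memory response cache keyed by URL, with a time-to-live.
 * `ResponseCache` and its method names are hypothetical.
 */
class ResponseCache {
    constructor(ttlMs, now = Date.now) {
        this.ttlMs = ttlMs;
        this.now = now;           // injectable clock, handy for testing
        this.entries = new Map();
    }

    // Store a body under its URL and stamp it with an expiry time.
    set(url, body) {
        this.entries.set(url, { body, expires: this.now() + this.ttlMs });
    }

    // Return the cached body, or undefined on a miss or a stale entry.
    get(url) {
        const entry = this.entries.get(url);
        if (!entry) return undefined;
        if (this.now() >= entry.expires) {
            this.entries.delete(url); // stale: evict and report a miss
            return undefined;
        }
        return entry.body;
    }
}
```

A proxy using something like this would check `get(url)` before contacting the remote host, and call `set(url, body)` after a successful fetch.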

My Node.js implementation

My implementation, in all its glory, can be seen on my GitHub (link on the side). A few key features of Node.js, particularly the http library, let me build the proxy in a simple way. In the end, I went with a combination of the Node.js http library, the winston logging library from npm, sqlite3 with the sqlite3 JavaScript library for making queries, and closures.

Starting out

Starting the code was simple enough: I had to create an http server that I could begin passing requests to. That function looked something like this:

const http = require("http");

/**
 * Create the proxy's HTTP server and start listening.
 */
function startServer() {
    // Create the server. No request handler is attached yet.
    const server = http.createServer();

    // Bind to the port the proxy listens on.
    server.listen(8124, () => {
        console.log("Server bound.");
    });

    return server;
}
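The server above accepts connections but does nothing with them yet. A minimal sketch of attaching a handler follows, assuming a hypothetical `createProxyServer` helper and a placeholder `echoTarget` handler (both names are invented for illustration). The sketch builds the server without calling listen, so it can be started separately.

```javascript
const http = require("http");

/**
 * Build (but do not start) an HTTP server that hands every request to `handler`.
 * `createProxyServer` is a hypothetical helper name used only for this sketch.
 */
function createProxyServer(handler) {
    const server = http.createServer(handler);
    // Malformed requests surface as "clientError"; answer and close the socket.
    server.on("clientError", (err, socket) => {
        if (socket.writable) {
            socket.end("HTTP/1.1 400 Bad Request\r\n\r\n");
        }
    });
    return server;
}

// Placeholder handler: echoes back the request target it was given.
const echoTarget = (request, response) => {
    response.writeHead(200, { "Content-Type": "text/plain" });
    response.end(request.url + "\n");
};

const server = createProxyServer(echoTarget);
```

Calling `server.listen(8124)` and then running something like `curl -x http://127.0.0.1:8124 http://httpbin.org/` should print back whatever request target the client sent to the proxy.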

Once I had the server running, I had to figure out how requests to HTTP proxies should be structured. To do some testing, I wrote code with the py.test library and requests.

import requests

http_proxy = "http://127.0.0.1:8124"

def test_proxy_basic():
    # Route the request through the local proxy and expect a normal 200.
    resp = requests.get("http://httpbin.org", proxies={"http": http_proxy})
    assert resp.status_code == 200

With the test code in place, I modified my proxy with a console.log to show the request object, and used py.test to fire sample requests at the proxy. What I found was surprising.

  trailers: {},
  rawTrailers: [],
  aborted: false,
  upgrade: false,
  url: 'http://httpbin.org/',
  method: 'GET',
  statusCode: null,
  statusMessage: null,

The url field of the IncomingMessage object was the full URI, rather than just a path relative to the host. This is how clients address requests to an HTTP proxy: the request line carries the absolute URL so the proxy knows where to send it. With this in mind, I realized the proxy could simply parse the url field and forward the request without incident, so I set to work building a parsing function. My parsing function, affectionately known as processRequest, was designed with closures in mind.
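As a small illustration of what parsing that field gives you, the sketch below splits an absolute URL into the pieces a proxy needs to re-issue the request. The `splitTarget` helper is a hypothetical name, and the fallback to port 80 assumes plain http.

```javascript
const { URL } = require("url");

/**
 * Split an absolute request target into the pieces needed to
 * re-issue the request. `splitTarget` is a hypothetical helper name.
 */
function splitTarget(rawUrl) {
    // new URL() throws a TypeError for a bare path such as "/".
    const parsed = new URL(rawUrl);
    return {
        protocol: parsed.protocol,             // e.g. "http:"
        hostname: parsed.hostname,             // host without the port
        port: parsed.port || "80",             // port is "" when it is the scheme default (assumes http)
        path: parsed.pathname + parsed.search, // what the origin server expects
    };
}

const target = splitTarget("http://httpbin.org/get?show=1");
console.log(target);
```

Because a bare path throws, a proxy built this way can also tell when a client has contacted it directly instead of using it as a proxy.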

const http = require("http");
const url = require("url");
// logger, myDb, and sendBlockPage are defined elsewhere in the project.

/**
 * Connection handler for proxy, forward request and then pipe response
 * back to the client.
 *
 * @param {http.IncomingMessage} request The request from the client
 * @param {http.ServerResponse} response The response to write back to the client
 */
const processRequest = (request, response) => {
    const chunks = [];
    // Keep references for the closures below.
    let req = request;
    let res = response;

    // Collect the request body as it arrives.
    request.on("data", (chunk) => {
        chunks.push(chunk);
    });

    // Once the whole request has been received, decide what to do with it.
    request.on("end", () => {
        // Keep the body as a Buffer so binary payloads survive intact.
        const body = Buffer.concat(chunks);
        logger.info({message: "Requesting: " + req.url});

        const hostUrl = new url.URL(req.url);

        // Do the work of actually updating our db with the visit.
        // Val in the callback will be false if blocked.
        myDb.visitHost(hostUrl.hostname, (val) => {
            if (val === false) {
                // Blocked.
                console.log("Blocked you fool.");
                sendBlockPage(res);
                return;
            }
            // Not blocked: re-issue the request with the same method and headers.
            const options = {method: req.method, headers: req.headers};
            const upstream = http.request(req.url, options, (resp) => {
                // Relay the upstream status and headers, then pipe the body.
                res.writeHead(resp.statusCode, resp.headers);
                resp.pipe(res);
            });
            // Without an error handler, an unreachable host would crash the proxy.
            upstream.on("error", () => {
                if (!res.headersSent) {
                    res.writeHead(502);
                }
                res.end();
            });
            // The data handler already consumed the body, so send the buffered copy.
            upstream.end(body);
        });
    });
}

It took a few iterations to get here, but I’ll just explain the end product. I first use let bindings for req and res, since both are needed inside closures to forward the request and send the response back to the client. I then set up two event handlers on the request, data and end. The data handler collects the body as it arrives. On end, meaning the proxy has received everything from the client, my proxy asks the DB to check out the host. As currently configured, the myDb.visitHost function queries sqlite to see if the host should be blocked. If it should, visitHost calls the callback with false and the client gets a block page. If the host is allowed, the closure over req lets me build a new http request from the url field, carrying over the method, headers, and buffered body. Then, in the reply callback for that request, I pipe the reply from the outer host back to the client who sent the request to the proxy.
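The sendBlockPage function is called above but not shown. One way it could look is sketched below; the 403 status code and the HTML body are assumptions for illustration, not necessarily what the real function sends.

```javascript
/**
 * Hypothetical stand-in for sendBlockPage: reply with a 403 and a short
 * HTML body. The status code and markup here are assumptions.
 */
function sendBlockPage(response) {
    const html = "<html><body><h1>Blocked</h1>" +
        "<p>This host is on the block list.</p></body></html>";
    response.writeHead(403, {
        "Content-Type": "text/html; charset=utf-8",
        // Byte length, not string length, in case the body gains non-ASCII text.
        "Content-Length": Buffer.byteLength(html),
    });
    response.end(html);
}
```

Sending an explicit status code lets clients and test scripts tell a blocked request apart from a successful one without inspecting the page body.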

My proxy logs all traffic that goes through it, and lets users configure the hosts they want to block in a blocked.json file, which is saved to the database each time the proxy starts.
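The block check itself can be sketched without a database. The example below swaps sqlite for an in-memory Set purely for illustration, and it assumes blocked.json holds a JSON array of hostnames, which may not match the real file format. The `buildBlockList` and `checkHost` names are hypothetical.

```javascript
/**
 * Build a lookup from the parsed contents of blocked.json.
 * Assumes the file holds a JSON array of hostnames, e.g. ["example.com"].
 */
function buildBlockList(hosts) {
    return new Set(hosts.map((h) => h.trim().toLowerCase()));
}

/**
 * Callback-style check mirroring how visitHost is called in processRequest:
 * the callback receives false when the host is blocked, true otherwise.
 */
function checkHost(blockList, hostname, callback) {
    const allowed = !blockList.has(hostname.toLowerCase());
    // Defer the callback so it behaves like an asynchronous database query.
    setImmediate(() => callback(allowed));
}

const blockList = buildBlockList(JSON.parse('["Example.com", " ads.example.net "]'));
checkHost(blockList, "example.com", (val) => console.log("example.com allowed?", val));
checkHost(blockList, "httpbin.org", (val) => console.log("httpbin.org allowed?", val));
```

Normalizing case and whitespace when the list is built keeps the per-request check down to a single Set lookup.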

I had a lot of fun building this project, and look forward to my next one soon!

I run WindleWare! Feel free to reach out!
