I am trying to implement a web proxy server in Java that relays requests and responses between my browser and the web. In the current setup, the browser sends all page requests to localhost on a specified port, and my proxy listens on that port for incoming connections.
The whole thing is threaded so that multiple requests can be handled at the same time, and here's what my code looks like:
private void startProxy(int serverPort) {
    try {
        // create a socket to listen for browser connections
        ServerSocket servSocket = new ServerSocket(serverPort);
        while (true) {
            // create a thread for each accepted connection
            ProxyThread thread = new ProxyThread(servSocket.accept());
            thread.start();
        }
    } catch (IOException e) {
        e.printStackTrace(); // don't swallow bind/accept failures silently
    }
}
class ProxyThread extends Thread {
    private Socket client;
    private Socket server;

    public ProxyThread(Socket client) {
        this.client = client;
        server = new Socket();
    }

    public void run() {
        // passes on requests and responses here
    }
}
I have noticed that when I try to load a page that makes 20 different requests for HTML/CSS/JS, sometimes only 18-19 threads are created, and some requests are lost in the process. Most often it's a request for a JS resource or an image that gets dropped, and it's never among the last requests the browser makes, so it's not an issue of running out of resources.
Using Wireshark, I can see that the lost requests do reach localhost, so for some reason ServerSocket.accept() does not actually accept those connections. Are there any particular reasons why this might be happening? Or is my code wrong in some way?
Here is the body of run():
try {
    BufferedReader clientOut = new BufferedReader(
            new InputStreamReader(client.getInputStream()));
    OutputStream clientIn = client.getOutputStream();
    // default HTTP port
    int port = 80;
    String request = "";
    // the first line of an HTTP request contains the URL
    String subRequest = clientOut.readLine();
    String host = getHost(subRequest);
    // read in the rest of the request headers
    while (!subRequest.equals("")) {
        request += subRequest + "\r\n";
        subRequest = clientOut.readLine();
    }
    request += "\r\n";
    try {
        server.connect(new InetSocketAddress(host, port));
    } catch (IOException e) {
        String errMsg = "HTTP/1.0 500 Server Error\r\nContent-Type: text/plain\r\n\r\n" +
                "Error connecting to the server:\n" + e + "\n";
        clientIn.write(errMsg.getBytes());
        clientIn.flush();
        client.close();
        return; // nothing to forward if the connect failed
    }
    // forward the request to the server
    PrintWriter serverOut = new PrintWriter(server.getOutputStream(), true);
    serverOut.println(request);
    serverOut.flush();
    // relay the response back to the browser
    InputStream serverIn = server.getInputStream();
    byte[] reply = new byte[4096];
    int bytesRead;
    while ((bytesRead = serverIn.read(reply)) != -1) {
        clientIn.write(reply, 0, bytesRead);
        clientIn.flush();
    }
    serverIn.close();
    serverOut.close();
    clientOut.close();
    clientIn.close();
    client.close();
    server.close();
} catch (IOException e) {
    e.printStackTrace();
}
For a webpage with 10 requests, I see 10 HTTP GETs but only 6 SYN / SYN-ACK pairs, with 7 requests successfully passing through the proxy and 3 getting stuck.
So you have 6 separate connections but 10 requests, and you're only processing one request per connection: you've forgotten to implement HTTP keep-alive. See RFC 2616. More than one request may arrive on a single connection. You need to read exactly as many body bytes per request as the Content-Length header specifies (or the sum of the chunks, whichever is present, if anything), and then, instead of just closing the socket, go back and try to read another request. If that gives you end of stream, close the socket.
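A minimal sketch of that per-request framing (the `readRequest` helper is hypothetical; it handles only Content-Length bodies, not chunked transfer encoding):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

public class RequestReader {
    // Reads exactly one HTTP request from the stream: the headers up to the
    // blank line, then exactly Content-Length body bytes (0 if absent).
    // Returns null on end of stream, i.e. the client closed the connection.
    static byte[] readRequest(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        StringBuilder line = new StringBuilder();
        int contentLength = 0;
        boolean sawAny = false;
        int c;
        while ((c = in.read()) != -1) {
            sawAny = true;
            buf.write(c);
            if (c == '\n') {
                String l = line.toString().trim();
                line.setLength(0);
                if (l.isEmpty()) break; // blank line: end of headers
                if (l.toLowerCase().startsWith("content-length:")) {
                    contentLength = Integer.parseInt(l.substring(15).trim());
                }
            } else {
                line.append((char) c);
            }
        }
        if (!sawAny) return null; // end of stream: client is done
        byte[] body = new byte[contentLength];
        new DataInputStream(in).readFully(body); // read the whole body
        buf.write(body);
        return buf.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Two pipelined requests on one "connection"
        String two = "POST / HTTP/1.1\r\nContent-Length: 5\r\n\r\nhello"
                   + "GET /a HTTP/1.1\r\n\r\n";
        InputStream in = new ByteArrayInputStream(two.getBytes("ISO-8859-1"));
        byte[] r;
        int count = 0;
        while ((r = readRequest(in)) != null) {
            count++;
            System.out.println("request " + count + ": " + r.length + " bytes");
        }
    }
}
```

The proxy's run() would call a helper like this in a loop, forwarding each complete request upstream, rather than assuming one request per connection.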
Or else send your response back to the client as HTTP/1.0, or with a Connection: close header, so the browser won't try to reuse the connection for another request.
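For illustration, that alternative could look like this (a hypothetical helper, not the proxy's actual response path):

```java
import java.io.IOException;
import java.io.OutputStream;

public class CloseResponse {
    // Writes a minimal HTTP/1.0 response with Connection: close, telling
    // the browser not to reuse this connection, so the proxy's current
    // one-request-per-connection handling stays correct.
    static void writeResponse(OutputStream out, byte[] body) throws IOException {
        String head = "HTTP/1.0 200 OK\r\n"
                + "Content-Length: " + body.length + "\r\n"
                + "Connection: close\r\n"
                + "\r\n";
        out.write(head.getBytes("ISO-8859-1"));
        out.write(body);
        out.flush();
    }

    public static void main(String[] args) throws IOException {
        writeResponse(System.out, "ok".getBytes("ISO-8859-1"));
    }
}
```

This trades away connection reuse (each resource costs a fresh TCP handshake), but it is much simpler than implementing keep-alive framing.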