Things that have been done:
Installed Hadoop from the following link:
Installed Hping3 to generate flood requests using:
sudo hping3 -c 10000 -d 120 -S -w 64 -p 8000 --flood --rand-source 192.168.1.12
Installed Snort to log the above requests using:
sudo snort -ved -h 192.168.1.0/24 -l .
This generates the log file snort.log.1427021231,
which I can read with
sudo snort -r snort.log.1427021231
which gives output of the form:
=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
03/22-16:17:14.259633 192.168.1.12:8000 -> 117.247.194.105:46639
TCP TTL:64 TOS:0x0 ID:0 IpLen:20 DgmLen:44 DF
AS Seq: 0x6EEE4A6B Ack: 0x6DF6015B Win: 0x7210 TcpLen: 24
TCP Options (1) => MSS: 1460
=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
I used
hdfs dfs -put <localsrc> ... <dst>
to copy this log file to HDFS.
Now, things I want help with:
How can I count the total number of source IP addresses, destination IP addresses, ports, protocols, and timestamps in the log file?
(Do I have to write my own MapReduce program, or is there a library for that?)
I have also found
But I could not make it run. I looked into the contents of the JAR file but still could not run it.
ratan@lenovo:~/Desktop$ hadoop jar ./p3lite.jar p3.pcap.examples.PacketCount
Exception in thread "main" java.lang.ClassNotFoundException: nflow.runner.Runner
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:274)
at org.apache.hadoop.util.RunJar.main(RunJar.java:201)
Thanks.
After a quick search, it appears you might need a custom MapReduce job for this.
The algorithm would look something like the following pseudo-code:
Parse the file line by line (or every n lines if a log record spans more than one line).
In the mapper, use a regex to determine whether a piece of text is a source IP, destination IP, etc.
Output each match as a key-value pair of <type, count>, where
type is the type of text that was matched (e.g. source IP) and
count is the number of times it was matched in the record.
Have the reducer sum all of the values from the mappers to get global totals for each type of information you want.
Write the totals to a file in the desired format.
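The steps above can be sketched in Python as a rough illustration, not a tested Hadoop job. It assumes the input is the text printed by `snort -r` saved to a plain file (the snort.log.* file itself has to be read with `snort -r`, so it is presumably not plain text). The names `HEADER`, `PROTO`, `mapper`, `reducer` and `run_local` are made up for this example, and the regexes are based only on the sample record in the question.

```python
import re
from itertools import groupby

# Record header: "MM/DD-HH:MM:SS.ffffff SRC[:PORT] -> DST[:PORT]".
# Ports are optional because ICMP records carry none (an assumption
# about the format beyond the single sample shown in the question).
HEADER = re.compile(
    r"^(?P<ts>\d{2}/\d{2}-\d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"(?P<src>\d{1,3}(?:\.\d{1,3}){3})(?::(?P<sport>\d+))?\s+->\s+"
    r"(?P<dst>\d{1,3}(?:\.\d{1,3}){3})(?::(?P<dport>\d+))?"
)
# Protocol name directly followed by the TTL field, e.g. "TCP TTL:64".
# This deliberately does not match "TCP Options (1) => ...".
PROTO = re.compile(r"\b(?P<proto>TCP|UDP|ICMP)\s+TTL:")


def mapper(lines):
    """Emit a (type, 1) pair for every field recognised in the input lines."""
    for line in lines:
        header = HEADER.match(line.strip())
        if header:
            yield "timestamp", 1
            yield "source_ip", 1
            yield "dest_ip", 1
            if header.group("sport"):
                yield "source_port", 1
            if header.group("dport"):
                yield "dest_port", 1
        # The protocol may sit on the header line or on the line after it.
        if PROTO.search(line):
            yield "protocol", 1


def reducer(pairs):
    """Sum the counts per key; expects pairs sorted by key, as after a shuffle."""
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        yield key, sum(count for _, count in group)


def run_local(lines):
    """Simulate map -> sort -> reduce in one process for a quick check."""
    return dict(reducer(sorted(mapper(lines))))


if __name__ == "__main__":
    sample = [
        "03/22-16:17:14.259633 192.168.1.12:8000 -> 117.247.194.105:46639",
        "TCP TTL:64 TOS:0x0 ID:0 IpLen:20 DgmLen:44 DF",
        "TCP Options (1) => MSS: 1460",
    ]
    for key, total in run_local(sample).items():
        print(f"{key}\t{total}")
```

To run this on a cluster, one option would be Hadoop Streaming, with the mapper and reducer split into two scripts that read stdin and write tab-separated `key<TAB>count` lines. If you want counts per individual address or port rather than per field type, emit the matched value as part of the key (for example `("source_ip", header.group("src"))`) and leave the reducer unchanged.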