elasticsearch logstash kibana elastic-stack log-analysis

Program to generate sample log to feed to logstash?


I have written a small Java program which generates some dummy logs (it basically writes lines to a txt file). Now I want to feed this data to the ELK stack: Logstash should read the data from the txt file, and I want to visualize it in Kibana, just to get a feel for it.

What I then want to do is change the speed at which my program writes the dummy logs to the txt file, so that I can see the changes in Kibana.

I have just started exploring the ELK stack, and this might be a completely wrong way to do this kind of analysis. Please do suggest if there are better ways to do this (considering I don't have actual logs to work with right now).

Edit: @Val

input {
    generator {
        message => '83.149.9.216 - - [17/May/2015:10:05:03 +0000] "GET /presentations/logstash-monitorama-2013/images/kibana-search.png HTTP/1.1" 200 203023 "http://semicomplete.com/presentations/logstash-monitorama-2013/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.77 Safari/537.36"'
        count => 10
    }
}

So here is my logstash.conf:

input {
  stdin { }
}


filter {
  grok {
    match => {
      "message" => '%{IPORHOST:clientip} %{USER:ident} %{USER:auth} \[%{HTTPDATE:timestamp}\] "%{WORD:verb} %{DATA:request} HTTP/%{NUMBER:httpversion}" %{NUMBER:response:int} (?:-|%{NUMBER:bytes:int}) %{QS:referrer} %{QS:agent}'
    }
  }

  date {
    match => [ "timestamp", "dd/MMM/YYYY:HH:mm:ss Z" ]
    locale => en
  }

  geoip {
    source => "clientip"
  }

  useragent {
    source => "agent"
    target => "useragent"
  }
}

output {
  stdout {
    codec => plain {
      charset => "ISO-8859-1"
    }
  }
  elasticsearch {
    hosts => "http://localhost:9200"
    index => "apache_elk_example"
    template => "./apache_template.json"
    template_name => "apache_elk_example"
    template_overwrite => true
  }
}

Now, after starting Elasticsearch and Kibana, I do:

cat apache_logs | /usr/local/opt/logstash/bin/logstash -f logstash.conf

where apache_logs is being written by my Java program:

public static void main(String[] args) {
    // TODO Auto-generated method stub
    try {
        PrintStream out = new PrintStream(new FileOutputStream("/Users/username/Desktop/user/apache_logs"));
        System.setOut(out);
    } catch (FileNotFoundException ex) {
        System.out.print("Exception");
    }
    while(true)
    //for(int i=0;i<5;++i)
    {
        System.out.println(generateRandomIPs() /* + other log stuff */);
        try {
            Thread.sleep(1000);                 //1000 milliseconds is one second.
        } catch(InterruptedException ex) {
            Thread.currentThread().interrupt();
        }
    }
}
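
generateRandomIPs() isn't shown here; purely for illustration, a minimal sketch of what such a helper could look like (assuming it just produces a random dotted-quad string) would be:

import java.util.Random;

// Hypothetical helper, not part of the original program: returns a random
// dotted-quad string such as "83.149.9.216" to stand in for a client IP.
public static String generateRandomIPs() {
    Random random = new Random();
    return (random.nextInt(254) + 1) + "."
            + random.nextInt(256) + "."
            + random.nextInt(256) + "."
            + (random.nextInt(254) + 1);
}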

So here is the problem:

Kibana doesn't show me a real-time visualization, i.e. as my Java program appends data to the apache_logs file, Kibana does not show it to me. It only shows the data that was already written to apache_logs at the time I executed:

cat apache_logs | /usr/local/opt/logstash/bin/logstash -f logstash.conf

Solution

  • Might be a bit late, but I wrote up a small sample of what I meant.

    I modified your java program to add a timestamp like this:

    import java.io.FileNotFoundException;
    import java.io.FileOutputStream;
    import java.io.PrintStream;
    import java.time.LocalDateTime;
    import java.time.format.DateTimeFormatter;
    import java.util.HashMap;
    import java.util.Map;

    import com.google.gson.Gson;

    public class LogWriter {

        public static Gson gson = new Gson();

        public static void main(String[] args) {

            try {
                PrintStream out = new PrintStream(new FileOutputStream("/var/logstash/input/test2.log"));
                System.setOut(out);
            } catch (FileNotFoundException ex) {
                System.out.print("Exception");
            }

            Map<String, String> timestamper = new HashMap<>();

            while (true) {
                String format = LocalDateTime.now().format(DateTimeFormatter.ISO_DATE_TIME);

                timestamper.put("myTimestamp", format);
                System.out.println(gson.toJson(timestamper));

                try {
                    Thread.sleep(1000);                 // 1000 milliseconds is one second.
                } catch (InterruptedException ex) {
                    Thread.currentThread().interrupt();
                }
            }
        }
    }
    

    This now writes JSON like:

    {"myTimestamp":"2016-06-10T10:42:16.299"}
    {"myTimestamp":"2016-06-10T10:42:17.3"}
    {"myTimestamp":"2016-06-10T10:42:18.301"}
    

    I then set up Logstash to read that file, parse it, and output to stdout:

    input {
      file {
         path => "/var/logstash/input/*.log"
         start_position => "beginning"
         ignore_older => 0
         sincedb_path => "/dev/null"
      }   
    }
    
    filter {
       json {
          source => "message"
       }
    }
    
    output {
        file {
               path => "/var/logstash/out.log"
        }
        stdout { codec => rubydebug }
    }
    

    So Logstash picks up my log (which records when each line was created), parses it, and adds a new @timestamp which represents when it saw the log line:

    {
            "message" => "{\"myTimestamp\":\"2016-06-10T10:42:17.3\"}",
           "@version" => "1",
         "@timestamp" => "2016-06-10T09:42:17.687Z",
               "path" => "/var/logstash/input/test2.log",
               "host" => "pandaadb",
        "myTimestamp" => "2016-06-10T10:42:17.3"
    }
    {
            "message" => "{\"myTimestamp\":\"2016-06-10T10:42:18.301\"}",
           "@version" => "1",
         "@timestamp" => "2016-06-10T09:42:18.691Z",
               "path" => "/var/logstash/input/test2.log",
               "host" => "pandaadb",
        "myTimestamp" => "2016-06-10T10:42:18.301"
    }
    

    Here you can now see how long it takes for a log to be seen and processed, which is around 300 milliseconds. I would attribute that to the fact that your Java writer is an asynchronous writer and will not flush right away.
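
    If you want to sanity-check that number yourself, here is a minimal sketch (not part of the original setup) that computes the lag for the first sample event. It assumes the writer's clock was at UTC+1, which is what the one-hour gap between myTimestamp and @timestamp in the samples above suggests, since myTimestamp carries no zone information while @timestamp is UTC:

    import java.time.Duration;
    import java.time.Instant;
    import java.time.LocalDateTime;
    import java.time.ZoneOffset;

    public class LagCheck {
        public static void main(String[] args) {
            // myTimestamp from the writer (no zone info) and @timestamp from Logstash (UTC).
            LocalDateTime written = LocalDateTime.parse("2016-06-10T10:42:17.3");
            Instant seen = Instant.parse("2016-06-10T09:42:17.687Z");

            // Assumption: the writer ran at UTC+1, as the one-hour gap in the samples suggests.
            Duration lag = Duration.between(written.toInstant(ZoneOffset.ofHours(1)), seen);
            System.out.println("Ingest lag: " + lag.toMillis() + " ms");
        }
    }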

    You can even make this a bit "cooler" by using the elapsed plugin which will calculate the difference between those timestamps for you.

    I hope that helps with your testing :) It might not be the most advanced way of doing it, but it's easy to understand and pretty straightforward and fast.

    Artur