A full-featured logging system with Fluentd, ElasticSearch and Kibana

There are many open source logging, aggregation and monitoring systems, but I have always been a bit worried by their dependencies and features. Fluentd has a small core but is extensible, with a lot of input and output plugins.

Use case. This is how unreliable my ADSL has been over the last few hours: I pay for Alice ADSL 7 Mega (up to), but the modem syncs at 2464 Kbps and I lose the signal when it rains. (*)

Here is the full list of components and responsibilities:

  • Fluentd: event collector with multiple inputs and outputs. Logstash compatible output.
  • ElasticSearch: storage, aggregation
  • Kibana 3: Pretty Web UI to filter and explore the data (Elastic Search API, Logstash conventions, Lucene syntax queries)

Basic usage & use case

I don't want to be tied to a particular client gem or workflow. I know that an app, a script or dmesg produces some data as output, and sometimes I'd like to keep track of this data or see it aggregated. In brief: a general purpose "log me anything" tool with eye-candy charts and aggregation.

Collecting the data

This is my current Fluentd configuration. It does pretty basic things and is very useful for testing what is going on:

# /etc/fluent/fluent.conf
# collect http with: http://localhost:8888/personal.adsl?json={"event":"event-123", "resp_time":5}
<source>
  type http
  port 8888
  bind 0.0.0.0
  body_size_limit 32m
  keepalive_timeout 10s
</source>

# collect the dmesg output
<source>
  type syslog
  port 42185
  tag system
</source>

# collect tail with: echo '{"event":"event-123","duration":2700}' >> /var/log/example.log
# The Fluentd user needs read permission on the .log file and r/w permission on the .pos file
<source>
  type tail
  path /var/log/example.log
  pos_file /var/log/example.log.pos # to store last read position
  tag personal.example
  format json
</source>

# events stored on Elastic Search
<match personal.**>
  type elasticsearch
  logstash_format true
  flush_interval 10s # for testing
  include_tag_key true
  tag_key _key
</match>

# events just printed on the screen for debugging purpose
<match test.**>
  type stdout
</match>

With a few lines, you can already collect a lot of data from HTTP requests, a particular log file or the operating system logger. Many other advanced use cases are covered in the Docs and in the third-party plugin documentation.
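To make the two "personal" inputs above more concrete, here is a minimal Python sketch of a client for them. It is only an illustration, not part of my setup: the function names are hypothetical, and it assumes the http source listens on localhost:8888 and that the tail source reads /var/log/example.log, as in the configuration above. The HTTP body uses the same `json=` form parameter shown in the config comment.

```python
import json
import urllib.parse
import urllib.request

# Assumption: matches the <source> of type http in the configuration above.
FLUENTD_URL = "http://localhost:8888"


def build_event_request(tag, event, base_url=FLUENTD_URL):
    """Build (without sending) a POST request for the Fluentd http input.

    The tag becomes the URL path and the event is sent as a form-encoded
    `json` parameter.
    """
    body = urllib.parse.urlencode({"json": json.dumps(event)}).encode("utf-8")
    url = "{}/{}".format(base_url, urllib.parse.quote(tag))
    return urllib.request.Request(url, data=body, method="POST")


def send_event(tag, event, base_url=FLUENTD_URL, timeout=5):
    """Send one event to Fluentd and return the HTTP status code."""
    request = build_event_request(tag, event, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


def append_json_line(path, event):
    """Append one JSON object per line, the shape `format json` expects."""
    with open(path, "a", encoding="utf-8") as log_file:
        log_file.write(json.dumps(event) + "\n")
```

For example, `send_event("personal.adsl", {"event": "event-123", "resp_time": 5})` would post one event under the `personal.adsl` tag, which the `personal.**` match forwards to Elastic Search, while `append_json_line("/var/log/example.log", {"event": "event-123", "duration": 2700})` is the Python equivalent of the `echo ... >>` line in the config comment.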

Installation on Ubuntu Linux LTS or 13.04

This is my setup. You could use a different storage or other plugins, but it just works because the components use the same conventions.

# Fluentd & ruby  
sudo apt-get update  
sudo apt-get install ruby1.9.1 ruby1.9.1-dev build-essential git curl  
sudo gem install fluentd fluent-plugin-elasticsearch --no-rdoc --no-ri

sudo fluentd --setup /etc/fluent  
sudo vim.tiny /etc/fluent/fluent.conf # see the example above

# Elastic Search & java
sudo apt-get install openjdk-7-jre-headless -y  
wget https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-0.90.2.deb # check latest version  
sudo dpkg -i elasticsearch-*

# Kibana 3
sudo git clone https://github.com/elasticsearch/kibana.git /usr/share/kibana3 # same path as the symlink below
# it's just a static JS app that points to the Elastic Search API
sudo apt-get install nginx

# if you follow their convention
sudo ln -s /usr/share/kibana3/sample/nginx.conf /etc/nginx/sites-enabled/kibana

Recap

  1. Send namespaced events via http://localhost:8888 Fluentd API or via the local system.
  2. Get aggregated events via http://localhost:9200 Elastic Search API (JSON)
  3. Get aggregated events via http://localhost:5432 Kibana Web UI (HTML, human readable)
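As a rough illustration of step 2, here is a minimal Python sketch that builds a search request against the Elastic Search API. It relies on several assumptions: with `logstash_format true` the plugin writes to daily indices named `logstash-YYYY.MM.DD` (its default prefix, with the day taken in UTC), each event carries an `@timestamp` field, and the tag is stored under `_key` as set by `tag_key` in the configuration above. The function names are hypothetical.

```python
import datetime
import json
import urllib.request

# Assumption: Elastic Search listens on its default port on this machine.
ES_URL = "http://localhost:9200"


def logstash_index(day, prefix="logstash"):
    """Return the daily index name used by the Logstash convention."""
    return "{}-{}".format(prefix, day.strftime("%Y.%m.%d"))


def build_search_request(tag, day, size=10, base_url=ES_URL):
    """Build (without sending) a search for the latest events of one tag."""
    query = {
        # Lucene syntax, the same you would type in the Kibana query box.
        "query": {"query_string": {"query": '_key:"{}"'.format(tag)}},
        "sort": [{"@timestamp": {"order": "desc"}}],
        "size": size,
    }
    url = "{}/{}/_search".format(base_url, logstash_index(day))
    return urllib.request.Request(
        url,
        data=json.dumps(query).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def search_events(tag, day, size=10, base_url=ES_URL, timeout=5):
    """Run the search and return the decoded JSON response."""
    request = build_search_request(tag, day, size, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `search_events("personal.adsl", datetime.datetime.utcnow().date())` should return today's most recent ADSL events as JSON, provided the index for that day already exists.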

I really like this combination of software for its simplicity and modularity. An open issue which you have to solve by yourself is security. Kibana can be password protected via NGINX but the ES API is not, so if you read sensitive data over the public internet, use a VPN or other techniques. The same goes for writing events over a public network: if you need SSL to send or forward data to Fluentd, think about it.

Ports. This is easier to fix. If you want to route all the communication through port 80 you can, with virtual hosts and NGINX or Apache configured as a reverse proxy.

I hope you found the post useful, and remember: "You can't improve it if you can't measure it". Here is a live demo used in a more common context.


(*) The Telecom Italia technician said there's nothing to be done because I'm 1.5 km from the exchange (...and I live in a big city).
