# lumberjack

o/~ I'm a lumberjack and I'm ok! I sleep when idle, then I ship logs all day! I parse your logs, I eat the JVM agent for lunch! o/~

## QUESTIONS?

If you have questions and cannot find answers, please join the #logstash IRC
channel on freenode or ask on the logstash-users@googlegroups.com mailing
list.

## What is this?

A tool to collect logs locally in preparation for processing elsewhere!

Problem: logstash jar releases are too fat for constrained systems.

Solution: lumberjack

## Building it

* compile: make 
* rpm package: make rpm
* deb package: make deb

Packages install to /opt/lumberjack. Lumberjack builds all necessary
dependencies itself, so the packages should need no additional run-time
dependencies.

## Running it

Generally: `lumberjack.sh --host somehost --port 12345 /var/log/messages`

You'll need an SSL CA certificate to verify the server (host).

See `lumberjack.sh --help`

## Use with logstash

In logstash, you'll want to use the [lumberjack](http://logstash.net/docs/latest/inputs/lumberjack) input, something like:

    input {
      lumberjack {
        # The port to listen on
        port => 12345

        # The paths to your ssl cert and key
        ssl_certificate => "path/to/ssl.crt"
        ssl_key => "path/to/ssl.key"

        # Set this to whatever you want.
        type => "somelogs"
      }
    }

## Goals

* minimize resource usage where possible (cpu, memory, network)
* secure transmission of logs
* configurable event data
* easy to deploy with minimal moving parts.

Simple inputs only:

* follow files, respect rename/truncation conditions
* stdin, useful for things like 'varnishlog | lumberjack ...'
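
To illustrate what "respect rename/truncation conditions" can mean for a file follower, here is a minimal Python sketch. It is not lumberjack's actual code; the function name and the `last_inode` / `last_offset` bookkeeping are hypothetical, and it assumes a POSIX filesystem where a renamed-and-recreated file gets a new inode.

```python
import os


def classify_change(path, last_inode, last_offset):
    """Return 'rotated', 'truncated', or 'unchanged' for a followed file.

    last_inode and last_offset are what a follower recorded after its
    previous read (hypothetical bookkeeping, not lumberjack's own).
    """
    st = os.stat(path)
    if st.st_ino != last_inode:
        # A different inode at the same path means the old file was renamed
        # away (for example by logrotate) and a new file was created.
        return "rotated"
    if st.st_size < last_offset:
        # Same file, but shorter than where we stopped reading: truncated.
        return "truncated"
    return "unchanged"
```

On "rotated" a follower would typically finish reading the old file handle and then reopen the path; on "truncated" it would seek back to the start.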

## Implementation details 

Below is valid as of 2012/09/19

### Minimize resource usage

* sets small resource limits (memory, open files) on start up based on the
  number of files being watched
* cpu: sleeps when there is nothing to do
* network/cpu: sleeps if there is a network failure
* network: uses zlib for compression
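
As a rough illustration of why zlib helps on the network side, the sketch below compresses a batch of log lines with Python's `zlib` module. The batching scheme and function names are assumptions made for this example; they do not describe lumberjack's wire format.

```python
import zlib


def compress_batch(lines):
    """Join log lines into one payload and compress it with zlib."""
    payload = "\n".join(lines).encode("utf-8")
    # Level 6 is zlib's usual speed/size trade-off.
    return zlib.compress(payload, 6)


def decompress_batch(blob):
    """Inverse of compress_batch (assumes no line contains a newline)."""
    return zlib.decompress(blob).decode("utf-8").split("\n")


if __name__ == "__main__":
    # Log lines tend to repeat hostnames, program names and timestamps,
    # which is exactly the kind of redundancy zlib removes.
    sample = ["Sep 19 12:00:00 web1 sshd[123]: session opened"] * 200
    raw_size = len("\n".join(sample).encode("utf-8"))
    print(raw_size, "bytes raw,", len(compress_batch(sample)), "bytes compressed")
```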

### secure transmission

* uses openssl to transport logs. Currently supports verifying the server
  certificate only (so you know who you are sending to).
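
Server-only verification looks roughly like the following when written with Python's standard `ssl` module. This is a sketch of the idea, not lumberjack's openssl code; `make_client_context`, `connect` and the `ca_path` parameter are hypothetical names.

```python
import socket
import ssl


def make_client_context(ca_path=None):
    """Build a client-side context that requires a valid server certificate."""
    # PROTOCOL_TLS_CLIENT turns on certificate and hostname checking by default.
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    if ca_path is not None:
        # Trust only the CA the operator supplied.
        ctx.load_verify_locations(cafile=ca_path)
    return ctx


def connect(host, port, ca_path):
    """Open a verified TLS connection; raises ssl.SSLError if the check fails."""
    ctx = make_client_context(ca_path)
    raw = socket.create_connection((host, port), timeout=10)
    return ctx.wrap_socket(raw, server_hostname=host)
```

No client certificate is loaded here, so the server cannot authenticate the sender; only the sender verifies the server.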

### configurable event data

* the protocol lumberjack uses supports sending a string:string map
* the lumberjack tool lets you specify arbitrary extra data with `--field name=value`
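
A minimal sketch of how repeated `--field name=value` arguments could be turned into a string:string map. The parsing rules here (split on the first `=`, reject empty names, later values win) are assumptions for illustration, not documented lumberjack behaviour.

```python
def parse_fields(args):
    """Turn ['name=value', ...] into a {str: str} map."""
    fields = {}
    for arg in args:
        # Split on the first '=' only, so values may themselves contain '='.
        name, sep, value = arg.partition("=")
        if not sep or not name:
            raise ValueError("expected name=value, got %r" % arg)
        fields[name] = value
    return fields


if __name__ == "__main__":
    print(parse_fields(["type=syslog", "env=prod"]))
```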

### easy deployment

* all dependencies are built at compile-time (openssl, jemalloc, etc) because many os distributions lack these dependencies.
* 'make deb' (or make rpm) will package everything into a single deb (or rpm)
* bin/lumberjack.sh makes sure the dependencies are found when run in production

## future functional features

* re-evaluate globs periodically to look for new log files
* track position in the log
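
One way periodic glob re-evaluation might work is sketched below in Python. It is only an illustration of the idea; `find_new_files` and the `known` set are hypothetical, and nothing here reflects a planned lumberjack design.

```python
import glob


def find_new_files(pattern, known):
    """Return paths matching pattern that are not yet in the known set."""
    current = set(glob.glob(pattern))
    # Sort so callers see a stable order; known is left unmodified.
    return sorted(current - set(known))
```

A follower would call this on a timer, start watching each returned path, and add it to `known`.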

## future protocol discussion

I would love to not have a custom protocol, but nothing I've found implements
what I need, which is: encrypted, trusted, compressed, latency-resilient, and
reliable transport of events.

* redis development refuses to accept encryption support, would likely reject
  compression as well.
* zeromq lacks authentication, encryption, and compression.
* thrift also lacks authentication, encryption, and compression, and also is an
  RPC framework, not a streaming system.
* websockets don't do authentication or compression, but support encrypted
  channels with SSL. Websockets also require XORing the entire payload of all
  messages - wasted energy.
* SPDY is still changing too frequently and is also RPC. Streaming requires
  custom framing.
* HTTP is RPC and has very high overhead for small events (incompressible
  headers, etc). Streaming requires custom framing.