# lumberjack

o/~ I'm a lumberjack and I'm ok! I sleep when idle, then I ship logs all day! I parse your logs, I eat the JVM agent for lunch! o/~

Collect logs locally in preparation for processing elsewhere!

Problem: logstash jar releases are too fat for constrained systems.

Solution: lumberjack

## Building it

* compile: make 
* rpm package: make rpm
* deb package: make deb

Packages install to /opt/lumberjack. The build compiles all necessary
dependencies itself, so the packages should need no additional run-time
dependencies.

## Running it

Generally: `lumberjack.sh --host somehost --port 12345 /var/log/messages`

See `lumberjack.sh --help`

## Goals

* minimize resource usage where possible (cpu, memory, network)
* secure transmission of logs
* configurable event data
* easy to deploy with minimal moving parts.

Simple inputs only:

* follow files, respect rename/truncation conditions
* stdin, useful for things like 'varnishlog | lumberjack ...'
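
To illustrate what "respect rename/truncation conditions" can mean in practice, here is a minimal Python sketch, not lumberjack's actual implementation (which this README does not describe at that level). It assumes a POSIX filesystem where a rotated file gets a new inode; the class name `FollowedFile` and its method are hypothetical.

```python
import os


class FollowedFile:
    """Hypothetical helper: follow one log file across renames and truncation."""

    def __init__(self, path):
        self.path = path
        self._open()

    def _open(self):
        # Remember the inode of the file we are reading so that a later
        # rename plus recreate of the same path can be detected.
        self.handle = open(self.path, "rb")
        self.inode = os.fstat(self.handle.fileno()).st_ino

    def read_new_lines(self):
        try:
            st = os.stat(self.path)
        except FileNotFoundError:
            st = None  # path is missing mid-rotation; keep reading the old handle

        if st is not None and st.st_ino != self.inode:
            # The path now names a different file: drain what is left of the
            # old one, then switch to the new one from its beginning.
            lines = self.handle.readlines()
            self.handle.close()
            self._open()
            return lines + self.handle.readlines()

        if st is not None and st.st_size < self.handle.tell():
            # The file shrank below our read offset, so it was truncated.
            self.handle.seek(0)

        return self.handle.readlines()
```

A caller would poll `read_new_lines()` and sleep when it returns nothing. The sketch has known gaps: it can return a partial last line, and it misses a truncation if the file has already grown past the old offset by the next poll.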

## Implementation details 

The details below are valid as of 2012/09/19.

### Minimize resource usage

* sets small resource limits (memory, open files) on startup, based on the number of files being watched
* cpu: sleeps when there is nothing to do
* network/cpu: sleeps if there is a network failure
* network: uses zlib for compression
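
To give a feel for why zlib helps on the network side, the following Python sketch compresses a batch of log lines before sending. It is an illustration only: joining lines with newlines is an assumption made for the example and is not the lumberjack wire format, and the function names are hypothetical.

```python
import zlib


def compress_batch(lines):
    """Join log lines (bytes) and compress them as a single zlib stream."""
    payload = b"\n".join(lines)
    return zlib.compress(payload, 6)  # level 6 is zlib's usual default trade-off


def decompress_batch(blob):
    """Reverse compress_batch on the receiving side."""
    return zlib.decompress(blob).split(b"\n")
```

Log lines tend to repeat timestamps, hostnames, and program names, so compressing many events together usually saves far more than compressing each event alone.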

### Secure transmission

* uses openssl to transport logs. Currently supports verifying the server
  certificate only (so you know who you are sending to).
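
The same idea (the client verifies the server certificate and presents no client certificate) can be sketched with Python's `ssl` module, which wraps OpenSSL. This is not lumberjack's code; the function names and the `ca_file` parameter are hypothetical.

```python
import socket
import ssl


def make_verifying_context(ca_file=None):
    """Build a client-side TLS context that requires a valid server certificate.

    ca_file is an optional path to a CA bundle; with None the system
    defaults are used.
    """
    # create_default_context enables certificate and hostname verification.
    return ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)


def connect(host, port, context):
    """Open a TCP connection and wrap it in TLS, verifying the server."""
    raw = socket.create_connection((host, port), timeout=10)
    return context.wrap_socket(raw, server_hostname=host)
```

No client certificate is loaded here, so the server cannot authenticate the sender. That matches the limitation described above.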

### Configurable event data

* the protocol lumberjack uses supports sending a string:string map
* the lumberjack tool lets you specify arbitrary extra data with `--field name=value`
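
As a rough picture of how repeated `--field name=value` arguments could become a string:string map, here is a small Python sketch. The function name is hypothetical. Splitting on the first `=` and letting a later duplicate win are assumptions; this README does not say how lumberjack handles those cases.

```python
def parse_fields(pairs):
    """Turn ["name=value", ...] into a dict of string keys to string values."""
    fields = {}
    for pair in pairs:
        # Split on the first "=" only, so values may themselves contain "=".
        name, sep, value = pair.partition("=")
        if not sep or not name:
            raise ValueError(f"expected name=value, got {pair!r}")
        fields[name] = value  # assumption: a repeated name overwrites the earlier one
    return fields
```

For example, `parse_fields(["type=syslog", "env=prod"])` yields a map with the keys `type` and `env`, which would then travel with every event.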

### Easy deployment

* all dependencies are built at compile-time (openssl, jemalloc, etc)
* `make deb` (or `make rpm`) will package everything into a single deb (or rpm)
* bin/lumberjack.sh makes sure the dependencies are found

## Future

I would love to not have a custom protocol, but nothing I've found implements
what I need, which is: encrypted, trusted, compressed, latency-resilient, and
reliable transport of events.

* the redis developers have declined to add encryption support and would
  likely reject compression as well.
* zeromq lacks authentication, encryption, and compression.
* thrift also lacks authentication, encryption, and compression, and also is an
  RPC framework, not a streaming system.
* websockets don't do authentication or compression, but support encrypted
  channels with SSL. Websockets also require XORing the entire payload of all
  messages - wasted energy.
* SPDY is still changing too frequently and is also RPC. Streaming requires
  custom framing.
* HTTP is RPC and has very high overhead for small events (headers that cannot
  be compressed, etc). Streaming requires custom framing.