# lumberjack

o/~ I'm a lumberjack and I'm ok! I sleep when idle, then I ship logs all day! I parse your logs, I eat the JVM agent for lunch! o/~

## QUESTIONS?

If you have questions and cannot find answers, please join the #logstash IRC
channel on freenode or ask on the logstash-users@googlegroups.com mailing
list.

## What is this?

A tool to collect logs locally in preparation for processing elsewhere!

Problem: logstash jar releases are too fat for constrained systems.

Solution: lumberjack

## Building it

* compile: make 
* rpm package: make rpm
* deb package: make deb

Packages install to /opt/lumberjack. Lumberjack builds all necessary
dependencies itself, so the packages should need no additional run-time
dependencies.

## Running it

Generally: `lumberjack.sh --host somehost --port 12345 /var/log/messages`

See `lumberjack.sh --help` for all the flags

Key points:

* You'll need an SSL CA certificate to verify the server (host) against.
* You can specify custom fields with `--field foo=bar`. Any number of these
  may be specified. I use them to set fields like 'type' and other custom
  attributes relevant to each log.
* Any non-flag argument after the flags is treated as a file path. You can
  watch any number of files.
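
As a rough illustration of the points above, the following sketch assembles a command line with two custom fields and two watched files. The host, port, field values, and log paths are placeholders, and the flag that points at the SSL CA is left out on purpose because it is not named here; check `lumberjack.sh --help` for it.

```shell
# Placeholder values: adjust host, port, fields, and paths for your setup.
HOST=somehost
PORT=12345

# Two custom fields, then two files to watch (the non-flag arguments).
CMD="lumberjack.sh --host $HOST --port $PORT"
CMD="$CMD --field type=syslog --field env=production"
CMD="$CMD /var/log/messages /var/log/secure"

# Dry run: only print the assembled command.
echo "$CMD"
```

Printing the command first makes it easy to review the flags before actually starting the shipper.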

## Use with logstash

In logstash, you'll want to use the [lumberjack](http://logstash.net/docs/latest/inputs/lumberjack) input, something like:

    input {
      lumberjack {
        # The port to listen on
        port => 12345

        # The paths to your ssl cert and key
        ssl_certificate => "path/to/ssl.crt"
        ssl_key => "path/to/ssl.key"

        # Set this to whatever you want.
        type => "somelogs"
      }
    }
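
If you do not yet have a certificate and key for the `ssl_certificate` and `ssl_key` settings, a self-signed pair is one common way to get started for testing. This is a minimal sketch, assuming the `openssl` command-line tool is installed. The file names, the 365-day lifetime, and the `somehost` common name are placeholders, and whether lumberjack checks the certificate name against `--host` is not stated here, so treat the CN choice as an assumption.

```shell
# Work in a scratch directory so nothing existing is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Create a 2048-bit RSA key and a self-signed certificate in one step.
# -nodes leaves the key unencrypted so a daemon can read it without a passphrase.
openssl req -x509 -newkey rsa:2048 -nodes \
  -keyout ssl.key -out ssl.crt \
  -days 365 -subj "/CN=somehost" 2>/dev/null

# Print the subject and expiry date to confirm what was generated.
openssl x509 -in ssl.crt -noout -subject -enddate
```

With a self-signed certificate, the same `ssl.crt` would typically also serve as the CA file on the sending side.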

## Goals

* minimize resource usage where possible (cpu, memory, network)
* secure transmission of logs
* configurable event data
* easy to deploy with minimal moving parts.

Simple inputs only:

* follow files, respect rename/truncation conditions
* stdin, useful for things like 'varnishlog | lumberjack ...'

## Implementation details 

Below is valid as of 2012/09/19

### Minimize resource usage

* sets small resource limits (memory, open files) on start up based on the
  number of files being watched
* cpu: sleeps when there is nothing to do
* network/cpu: sleeps if there is a network failure
* network: uses zlib for compression
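
To get a feel for why compression helps with log traffic, the sketch below compresses a batch of similar log lines and compares sizes. It uses `gzip`, which applies the same DEFLATE algorithm as zlib but with a different wrapper, so this only illustrates the idea and is not lumberjack's actual framing. The sample log lines are made up.

```shell
# Build 1000 similar syslog-style lines (made-up sample data).
sample=$(
  i=0
  while [ "$i" -lt 1000 ]; do
    echo "Sep 19 12:00:00 web1 sshd[1234]: Accepted publickey for user$i from 10.0.0.1"
    i=$((i + 1))
  done
)

# Count bytes before and after compression; tr strips padding some wc versions add.
raw_bytes=$(printf '%s\n' "$sample" | wc -c | tr -d ' ')
gz_bytes=$(printf '%s\n' "$sample" | gzip -c | wc -c | tr -d ' ')

echo "raw: $raw_bytes bytes, compressed: $gz_bytes bytes"
```

Log lines tend to repeat timestamps, host names, and message templates, which is the kind of redundancy DEFLATE removes well.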

### secure transmission

* uses openssl to transport logs. Currently supports verifying the server
  certificate only (so you know who you are sending to).

### configurable event data

* the protocol lumberjack uses supports sending a string:string map
* the lumberjack tool lets you specify arbitrary extra data with `--field name=value`
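
A `name=value` argument maps naturally onto one entry of a string:string map. The sketch below shows one way such an argument could be split. `parse_field` is a hypothetical helper, not part of lumberjack, and splitting at the first `=` is an assumption about the desired behavior rather than a description of what the tool does.

```shell
# Hypothetical helper: split 'name=value' at the first '=' so that
# values may themselves contain '=' characters.
parse_field() {
  name=${1%%=*}   # everything before the first '='
  value=${1#*=}   # everything after the first '='
  printf '%s -> %s\n' "$name" "$value"
}

parse_field "type=syslog"   # prints: type -> syslog
parse_field "query=a=b"     # prints: query -> a=b
```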

### easy deployment

* all dependencies are built at compile-time (openssl, jemalloc, etc) because many os distributions lack these dependencies.
* 'make deb' (or make rpm) will package everything into a single deb (or rpm)
* bin/lumberjack.sh makes sure the dependencies are found when run in production

## future functional features

* re-evaluate globs periodically to look for new log files
* track the current position in the log

## future protocol discussion

I would love to not have a custom protocol, but nothing I've found implements
what I need, which is: encrypted, trusted, compressed, latency-resilient, and
reliable transport of events.

* redis development refuses to accept encryption support, would likely reject
  compression as well.
* zeromq lacks authentication, encryption, and compression.
* thrift also lacks authentication, encryption, and compression, and also is an
  RPC framework, not a streaming system.
* websockets don't do authentication or compression, but support encrypted
  channels with SSL. Websockets also require XORing the entire payload of all
  messages - wasted energy.
* SPDY is still changing too frequently and is also RPC. Streaming requires
  custom framing.
* HTTP is RPC and has very high overhead for small events (incompressible
  headers, etc). Streaming requires custom framing.