# lumberjack

o/~ I'm a lumberjack and I'm ok! I sleep when idle, then I ship logs all day! I parse your logs, I eat the JVM agent for lunch! o/~

## Questions and support

If you have questions and cannot find answers, please join the #logstash IRC
channel on freenode or ask on the logstash-users@googlegroups.com mailing
list.

## What is this?

A tool to collect logs locally in preparation for processing elsewhere!

Problem: logstash jar releases are too fat for constrained systems.

Solution: lumberjack

### Goals

* Minimize resource usage where possible (CPU, memory, network).
* Secure transmission of logs.
* Configurable event data.
* Easy to deploy with minimal moving parts.
* Simple inputs only:
  * Follows files and respects rename/truncation conditions.
  * Accepts `STDIN`, useful for things like `varnishlog | lumberjack...`.
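
To make "respects rename/truncation conditions" concrete, here is a minimal sketch in Go of one way a file follower can tell the two apart. It is not lumberjack's actual code: `fileState`, `checkFile`, and `demo` are hypothetical names, and the approach (compare file identity, then compare size against the last shipped offset) is an assumption about how such a check is commonly done.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// fileState is a hypothetical record of what we knew about a file on the
// previous poll.
type fileState struct {
	info   os.FileInfo // identity of the file we have been reading
	offset int64       // number of bytes already shipped
}

// checkFile reports what happened to path since the last poll:
// "rotated" if path now names a different file, "truncated" if the same
// file is now shorter than our offset, and "ok" otherwise.
func checkFile(path string, st fileState) (string, error) {
	cur, err := os.Stat(path)
	if err != nil {
		return "", err
	}
	if !os.SameFile(st.info, cur) {
		return "rotated", nil
	}
	if cur.Size() < st.offset {
		return "truncated", nil
	}
	return "ok", nil
}

// demo exercises checkFile against a throwaway file in a temp directory.
func demo() ([]string, error) {
	dir, err := os.MkdirTemp("", "follow-sketch")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "app.log")
	if err := os.WriteFile(path, []byte("hello\n"), 0o644); err != nil {
		return nil, err
	}
	info, err := os.Stat(path)
	if err != nil {
		return nil, err
	}
	st := fileState{info: info, offset: 6} // pretend all 6 bytes were shipped

	steps := []func() error{
		func() error { return nil },                  // nothing changed
		func() error { return os.Truncate(path, 0) }, // truncated in place
		func() error { // rotated: rename the old file, then recreate the path
			if err := os.Rename(path, path+".1"); err != nil {
				return err
			}
			return os.WriteFile(path, []byte("new\n"), 0o644)
		},
	}

	var results []string
	for _, step := range steps {
		if err := step(); err != nil {
			return nil, err
		}
		r, err := checkFile(path, st)
		if err != nil {
			return nil, err
		}
		results = append(results, r)
	}
	return results, nil
}

func main() {
	results, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(results)
}
```

On a typical Linux filesystem this prints `[ok truncated rotated]`. A real follower would reopen the path on "rotated" and seek back to the start on "truncated".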

## Building it

1. Install [FPM](https://github.com/jordansissel/fpm)

        $ sudo gem install fpm

2. Install [Go](http://golang.org/doc/install)

3. Compile lumberjack

        $ git clone git://github.com/jordansissel/lumberjack.git
        $ cd lumberjack
        $ make

4. Make packages, either:

        $ make rpm

    Or:

        $ make deb

## Installing it

Packages install to `/opt/lumberjack`. Lumberjack builds all necessary
dependencies itself, so there should be no run-time dependencies for you
to install.

## Running it

Generally:

    $ lumberjack.sh --host somehost --port 12345 /var/log/messages

See `lumberjack.sh --help` for all the flags.

### Key points

* You'll need an SSL CA to verify the server (host) with.
* You can specify custom fields with `--field foo=bar`. Any number of these
  may be specified. I use them to set fields like `type` and other custom
  attributes relevant to each log.
* Any non-flag argument after the flags is considered a file path. You can
  watch any number of files.

## Use with logstash

In logstash, you'll want to use the [lumberjack](http://logstash.net/docs/latest/inputs/lumberjack) input, something like:

    input {
      lumberjack {
        # The port to listen on
        port => 12345

        # The paths to your ssl cert and key
        ssl_certificate => "path/to/ssl.crt"
        ssl_key => "path/to/ssl.key"

        # Set this to whatever you want.
        type => "somelogs"
      }
    }

## Implementation details 

The details below are valid as of 2012/09/19.

### Minimize resource usage

* Sets small resource limits (memory, open files) on start up based on the
  number of files being watched.
* CPU: sleeps when there is nothing to do.
* Network/CPU: sleeps if there is a network failure.
* Network: uses zlib for compression.
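
As a rough illustration of why zlib helps on the network side, the sketch below compresses a batch of repetitive log lines and compares sizes. It uses Go's standard `compress/zlib` package rather than whatever lumberjack links against, and the sample log line, `sampleBatch`, `compress`, and `decompress` are all made up for the example.

```go
package main

import (
	"bytes"
	"compress/zlib"
	"fmt"
	"io"
	"strings"
)

// compress deflates data with zlib at the default compression level.
func compress(data []byte) ([]byte, error) {
	var buf bytes.Buffer
	w := zlib.NewWriter(&buf)
	if _, err := w.Write(data); err != nil {
		return nil, err
	}
	// Close flushes any buffered data and writes the zlib trailer.
	if err := w.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// decompress reverses compress.
func decompress(data []byte) ([]byte, error) {
	r, err := zlib.NewReader(bytes.NewReader(data))
	if err != nil {
		return nil, err
	}
	defer r.Close()
	return io.ReadAll(r)
}

// sampleBatch builds a made-up batch of 100 identical syslog-style lines.
func sampleBatch() []byte {
	line := "Sep 19 12:00:00 somehost sshd[123]: Accepted publickey for user\n"
	return []byte(strings.Repeat(line, 100))
}

func main() {
	batch := sampleBatch()
	packed, err := compress(batch)
	if err != nil {
		fmt.Println("compress error:", err)
		return
	}
	restored, err := decompress(packed)
	if err != nil {
		fmt.Println("decompress error:", err)
		return
	}
	fmt.Printf("raw=%d bytes, compressed=%d bytes, round trip ok=%v\n",
		len(batch), len(packed), bytes.Equal(batch, restored))
}
```

Real logs are less repetitive than this sample, so expect a smaller (but usually still worthwhile) reduction.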

### Secure transmission

* Uses OpenSSL to verify the server certificates (so you know who you
  are sending to).
* Uses OpenSSL to transport logs.
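
The same idea (trust only your own CA, and check who you are talking to) can be sketched with Go's standard `crypto/tls` package. This is a stand-in for illustration, not the OpenSSL code described above; `newTLSConfig` is a hypothetical helper, and the minimum TLS version is my assumption, not something lumberjack specifies.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"errors"
	"fmt"
)

// newTLSConfig builds a client-side TLS config that trusts only the CA
// certificates found in caPEM and expects the server certificate to be
// valid for serverName.
func newTLSConfig(caPEM []byte, serverName string) (*tls.Config, error) {
	pool := x509.NewCertPool()
	if !pool.AppendCertsFromPEM(caPEM) {
		return nil, errors.New("no CA certificates found in PEM data")
	}
	return &tls.Config{
		RootCAs:    pool,             // only these CAs are trusted
		ServerName: serverName,       // hostname checked against the certificate
		MinVersion: tls.VersionTLS12, // assumption: refuse older protocol versions
	}, nil
}

func main() {
	// Garbage input is rejected before any connection is attempted.
	_, err := newTLSConfig([]byte("not a certificate"), "somehost")
	fmt.Println("error:", err)
}
```

With a real CA file loaded into `caPEM`, the returned config could be passed to `tls.Dial("tcp", "somehost:12345", cfg)`, which fails the handshake if the server's certificate does not chain to that CA.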

### Configurable event data

* The protocol lumberjack uses supports sending a `string:string` map.
* The lumberjack tool lets you specify arbitrary extra data with
  `--field name=value`.
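
Here is a small sketch of how repeated `--field name=value` flags can be folded into a `string:string` map, with everything after the flags treated as file paths. It uses Go's standard `flag` package and is only an illustration: `fieldMap` and `parseArgs` are hypothetical names, and lumberjack's own argument handling may differ.

```go
package main

import (
	"flag"
	"fmt"
	"strings"
)

// fieldMap collects repeated --field name=value flags into a map.
// It implements flag.Value so the flag package calls Set once per flag.
type fieldMap map[string]string

func (f fieldMap) String() string { return fmt.Sprint(map[string]string(f)) }

func (f fieldMap) Set(arg string) error {
	name, value, ok := strings.Cut(arg, "=")
	if !ok || name == "" {
		return fmt.Errorf("expected name=value, got %q", arg)
	}
	f[name] = value
	return nil
}

// parseArgs returns the custom fields and the remaining non-flag
// arguments, which are treated as file paths.
func parseArgs(args []string) (map[string]string, []string, error) {
	fields := fieldMap{}
	fs := flag.NewFlagSet("lumberjack-sketch", flag.ContinueOnError)
	fs.Var(fields, "field", "custom field as name=value (repeatable)")
	if err := fs.Parse(args); err != nil {
		return nil, nil, err
	}
	return fields, fs.Args(), nil
}

func main() {
	fields, files, err := parseArgs([]string{
		"--field", "type=syslog",
		"--field", "env=prod",
		"/var/log/messages", "/var/log/syslog",
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(fields, files)
	// prints: map[env:prod type:syslog] [/var/log/messages /var/log/syslog]
}
```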

### Easy deployment

* All dependencies are built at compile-time (OpenSSL, jemalloc, etc.) because
  many OS distributions lack these dependencies.
* The `make deb` or `make rpm` commands will package everything into a
  single DEB or RPM.
* The `bin/lumberjack.sh` script makes sure the dependencies are found
  when run in production.

### Future functional features

* Re-evaluate globs periodically to look for new log files.
* Track position in the log.

### Future protocol discussion

I would love to not have a custom protocol, but nothing I've found implements
what I need, which is: encrypted, trusted, compressed, latency-resilient, and
reliable transport of events.

* Redis development refuses to accept encryption support, and would likely
  reject compression as well.
* ZeroMQ lacks authentication, encryption, and compression.
* Thrift also lacks authentication, encryption, and compression, and also is an
  RPC framework, not a streaming system.
* Websockets don't do authentication or compression, but support encrypted
  channels with SSL. Websockets also require XORing the entire payload of all
  messages - wasted energy.
* SPDY is still changing too frequently and is also RPC. Streaming requires
  custom framing.
* HTTP is RPC and has very high overhead for small events (incompressible
  headers, etc.). Streaming requires custom framing.

## License 

See LICENSE file.