# logstash-forwarder

♫ I'm a lumberjack and I'm ok! I sleep when idle, then I ship logs all day! I parse your logs, I eat the JVM agent for lunch! ♫

(This project was recently renamed from 'lumberjack' to 'logstash-forwarder' to
make its intended use clear. The 'lumberjack' name now remains as the network protocol, and 'logstash-forwarder' is the name of the program. It's still the same lovely log forwarding program you love.)

## Questions and support

If you have questions and cannot find answers, please join the #logstash IRC
channel on freenode or ask on the logstash-users@googlegroups.com mailing
list.

## What is this?

A tool to collect logs locally in preparation for processing elsewhere!

### Resource Usage Concerns

Perceived Problems: Some users view logstash releases as "large" or have a generalized fear of Java.

Actual Problems: Logstash, for right now, runs with a footprint that is not
friendly to underprovisioned systems such as EC2 micro instances; on other
systems it is fine. This project will exist until that is resolved.

### Transport Problems

Few log transport mechanisms provide security, low latency, and reliability.

The lumberjack protocol used by this project exists to provide a network
protocol for transmission that is secure, low latency, low resource usage, and
reliable.

## Configuring

logstash-forwarder is configured with a JSON file that you specify with the `-config` flag:

`logstash-forwarder -config yourstuff.json`

Here's a sample, with comments in-line to describe the settings. Please please
please keep in mind that comments are technically invalid in JSON, so you can't
include them in your config:

    {
      # The network section covers network configuration :)
      "network": {
        # A list of downstream servers listening for our messages.
        # logstash-forwarder will pick one at random and only switch if
        # the selected one appears to be dead or unresponsive
        "servers": [ "localhost:5043" ],

        # The path to your client ssl certificate (optional)
        "ssl certificate": "./logstash-forwarder.crt",
        # The path to your client ssl key (optional)
        "ssl key": "./logstash-forwarder.key",

        # The path to your trusted ssl CA file. This is used
        # to authenticate your downstream server.
        "ssl ca": "./logstash-forwarder.crt",

        # Network timeout in seconds. This is most important for
        # logstash-forwarder determining whether to stop waiting for an
        # acknowledgement from the downstream server. If a timeout is reached,
        # logstash-forwarder will assume the connection or server is bad and
        # will connect to a server chosen at random from the servers list.
        "timeout": 15
      },

      # The list of file configurations
      "files": [
        # An array of hashes. Each hash tells what paths to watch and
        # what fields to annotate on events from those paths.
        {
          "paths": [ 
            # single paths are fine
            "/var/log/messages",
            # globs are fine too, they will be periodically evaluated
            # to see if any new files match the wildcard.
            "/var/log/*.log"
          ],

          # A dictionary of fields to annotate on each event.
          "fields": { "type": "syslog" }
        }, {
          # A path of "-" means stdin.
          "paths": [ "-" ],
          "fields": { "type": "stdin" }
        }, {
          "paths": [
            "/var/log/apache/httpd-*.log"
          ],
          "fields": { "type": "apache" }
        }
      ]
    }
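
For a sense of how a comment-free version of this file maps onto data structures, here is a minimal Go sketch. It is not the forwarder's actual config loader: the type names (`Config`, `NetworkConfig`, `FileConfig`) and the `parseConfig` helper are assumptions made for illustration, and only the JSON keys are taken from the sample above. It also shows that a config containing `#` comments is rejected by a strict JSON parser.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// NetworkConfig mirrors the "network" section of the sample.
// The Go type and field names are hypothetical; only the JSON keys come from the README.
type NetworkConfig struct {
	Servers        []string `json:"servers"`
	SSLCertificate string   `json:"ssl certificate"`
	SSLKey         string   `json:"ssl key"`
	SSLCA          string   `json:"ssl ca"`
	Timeout        int64    `json:"timeout"` // seconds
}

// FileConfig mirrors one entry of the "files" array.
type FileConfig struct {
	Paths  []string          `json:"paths"`
	Fields map[string]string `json:"fields"`
}

// Config is the whole document.
type Config struct {
	Network NetworkConfig `json:"network"`
	Files   []FileConfig  `json:"files"`
}

// parseConfig decodes a JSON document (without comments) into a Config.
func parseConfig(text string) (Config, error) {
	var cfg Config
	err := json.Unmarshal([]byte(text), &cfg)
	return cfg, err
}

// sample is the README example with the comments stripped and one file entry dropped.
const sample = `{
  "network": {
    "servers": [ "localhost:5043" ],
    "ssl ca": "./logstash-forwarder.crt",
    "timeout": 15
  },
  "files": [
    { "paths": [ "/var/log/messages", "/var/log/*.log" ], "fields": { "type": "syslog" } },
    { "paths": [ "-" ], "fields": { "type": "stdin" } }
  ]
}`

func main() {
	cfg, err := parseConfig(sample)
	if err != nil {
		fmt.Println("invalid config:", err)
		return
	}
	fmt.Println("servers:", cfg.Network.Servers, "timeout:", cfg.Network.Timeout, "file groups:", len(cfg.Files))

	// A "#" comment makes the document invalid JSON, so parsing fails.
	_, err = parseConfig("{ # a comment\n \"network\": {} }")
	fmt.Println("config with comment rejected:", err != nil)
}
```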

### Goals

* Minimize resource usage where possible (CPU, memory, network).
* Secure transmission of logs.
* Configurable event data.
* Easy to deploy with minimal moving parts.
* Simple inputs only:
  * Follows files and respects rename/truncation conditions.
  * Accepts `STDIN`, useful for things like `varnishlog | logstash-forwarder...`.

## Building it

1. Install [go](http://golang.org/doc/install)

2. Compile logstash-forwarder

        git clone git://github.com/elasticsearch/logstash-forwarder.git
        cd logstash-forwarder
        go build

## Packaging it (optional)

You can make native packages of logstash-forwarder.

To build the packages, you will need ruby and fpm installed.

    gem install fpm

Now build an rpm:

        make rpm

Or:

        make deb

## Installing it (via packages only)

If you don't use rpm or deb make targets as above, you can skip this section.

Packages install to `/opt/logstash-forwarder`. 

There are no run-time dependencies.

## Running it

Generally:

    logstash-forwarder -config logstash-forwarder.conf

See `logstash-forwarder -help` for all the flags.

The config file is documented further up in this file.

### Key points

* You'll need an SSL CA to verify the server (host) with.
* You can specify custom fields for each set of paths in the config file. Any
  number of these may be specified. I use them to set fields like `type` and
  other custom attributes relevant to each log.

### Generating an ssl certificate

Logstash supports all certificates, including self-signed certificates. To generate a certificate, you can run the following command:

    $ openssl req -x509 -batch -nodes -newkey rsa:2048 -keyout logstash-forwarder.key -out logstash-forwarder.crt

This will generate a key at `logstash-forwarder.key` and the certificate at `logstash-forwarder.crt`. Both the server running logstash-forwarder and the logstash instances receiving logs will require these files on disk to verify the authenticity of messages.

Recommended file locations:

- certificates: `/etc/pki/tls/certs`
- keys: `/etc/pki/tls/private`

## Use with logstash

In logstash, you'll want to use the [lumberjack](http://logstash.net/docs/latest/inputs/lumberjack) input, something like:

    input {
      lumberjack {
        # The port to listen on
        port => 12345

        # The paths to your ssl cert and key
        ssl_certificate => "path/to/ssl.crt"
        ssl_key => "path/to/ssl.key"

        # Set this to whatever you want.
        type => "somelogs"
      }
    }

## Implementation details 

The details below are valid as of 2012/09/19.

### Minimize resource usage

* Sets small resource limits (memory, open files) on start up based on the
  number of files being watched.
* CPU: sleeps when there is nothing to do.
* Network/CPU: sleeps if there is a network failure.
* Network: uses zlib for compression.

### Secure transmission

* Uses OpenSSL to verify the server certificates (so you know who you
  are sending to).
* Uses OpenSSL to transport logs.
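
The idea of verifying the server against a trusted CA can be sketched in Go as follows. Note the substitution: this uses Go's standard `crypto/tls` and `crypto/x509` packages rather than OpenSSL, and the `tlsConfigFromCA` helper is hypothetical. It builds a client TLS configuration that trusts only the certificates found in the given CA data, which is the role the `ssl ca` setting plays in the config file.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"errors"
	"fmt"
)

// tlsConfigFromCA returns a client TLS config that trusts only the
// certificates contained in caPEM (PEM-encoded), instead of the system roots.
// Hypothetical helper for illustration.
func tlsConfigFromCA(caPEM []byte, serverName string) (*tls.Config, error) {
	pool := x509.NewCertPool()
	if !pool.AppendCertsFromPEM(caPEM) {
		return nil, errors.New("no usable certificate found in CA data")
	}
	return &tls.Config{
		RootCAs:    pool,       // the server certificate must chain to this pool
		ServerName: serverName, // the name the server certificate must match
	}, nil
}

func main() {
	// With real CA data (for example the contents of logstash-forwarder.crt),
	// the result could be passed to tls.Dial("tcp", "localhost:5043", cfg).
	_, err := tlsConfigFromCA([]byte("not a certificate"), "localhost")
	fmt.Println("error for invalid CA data:", err)
}
```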

### Configurable event data

* The protocol supports sending a `string:string` map.
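
One way to picture a `string:string` event is the small Go sketch below, which combines a log line with the `fields` configured for its path. The `buildEvent` helper and the `line` key name are assumptions for illustration only; this README does not specify which keys the forwarder actually sends.

```go
package main

import "fmt"

// buildEvent copies the configured fields into a new map and adds the log line.
// The "line" key is a hypothetical name chosen for this sketch.
func buildEvent(line string, fields map[string]string) map[string]string {
	event := make(map[string]string, len(fields)+1)
	for k, v := range fields {
		event[k] = v
	}
	event["line"] = line
	return event
}

func main() {
	fields := map[string]string{"type": "syslog"}
	event := buildEvent("Sep 19 12:00:00 host sshd[42]: session opened", fields)
	fmt.Println(event["type"], "|", event["line"])
}
```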

### Easy deployment

* The `make deb` or `make rpm` commands will package everything into a
  single DEB or RPM.

### Future protocol discussion

I would love to not have a custom protocol, but nothing I've found implements
what I need, which is: encrypted, trusted, compressed, latency-resilient, and
reliable transport of events.

* Redis development refuses to accept encryption support and would likely reject
  compression as well.
* ZeroMQ lacks authentication, encryption, and compression.
* Thrift also lacks authentication, encryption, and compression, and also is an
  RPC framework, not a streaming system.
* Websockets don't do authentication or compression, but support encrypted
  channels with SSL. Websockets also require XORing the entire payload of all
  messages - wasted energy.
* SPDY is still changing too frequently and is also RPC. Streaming requires
  custom framing.
* HTTP is RPC and very high overhead for small events (incompressible headers,
  etc). Streaming requires custom framing.

## License 

See LICENSE file.