# logstash-forwarder

o/~ I'm a lumberjack and I'm ok! I sleep when idle, then I ship logs all day! I parse your logs, I eat the JVM agent for lunch! o/~

## Questions and support

If you have questions and cannot find answers, please join the #logstash IRC
channel on freenode or ask on the logstash-users@googlegroups.com mailing
list.

## What is this?

A tool to collect logs locally in preparation for processing elsewhere!

### Resource Usage Concerns

Perceived Problems: Some users view logstash releases as "large" or have a generalized fear of Java.

Actual Problems: Logstash, for right now, runs with a footprint that is not
friendly to underprovisioned systems such as EC2 micro instances; on other
systems it is fine. This project will exist until that is resolved.

### Transport Problems

Few log transport mechanisms provide security, low latency, and reliability.

The lumberjack protocol used by this project exists to provide a network
transport that is secure, low-latency, light on resources, and reliable.

## Configuring

logstash-forwarder is configured with a JSON file that you specify with the `-config` flag:

`logstash-forwarder -config yourstuff.json`

Here's a sample, with comments in-line to describe the settings. Please keep in
mind that comments are invalid in JSON, so you can't include them in your
actual config:

    {
      # The network section covers network configuration :)
      "network": {
        # A list of downstream servers listening for our messages.
        # logstash-forwarder will pick one at random and only switch if
        # the selected one appears to be dead or unresponsive
        "servers": [ "localhost:5043" ],

        # The path to your client ssl certificate (optional)
        "ssl certificate": "./logstash-forwarder.crt",
        # The path to your client ssl key (optional)
        "ssl key": "./logstash-forwarder.key",

        # The path to your trusted ssl CA file. This is used
        # to authenticate your downstream server.
        "ssl ca": "./logstash-forwarder.crt",

        # Network timeout in seconds. This is most important for
        # logstash-forwarder determining whether to stop waiting for an
        # acknowledgement from the downstream server. If a timeout is reached,
        # logstash-forwarder will assume the connection or server is bad and
        # will connect to a server chosen at random from the servers list.
        "timeout": 15
      },

      # The list of file configurations
      "files": [
        # An array of hashes. Each hash tells what paths to watch and
        # what fields to annotate on events from those paths.
        {
          "paths": [
            # single paths are fine
            "/var/log/messages",
            # globs are fine too, they will be periodically evaluated
            # to see if any new files match the wildcard.
            "/var/log/*.log"
          ],

          # A dictionary of fields to annotate on each event.
          "fields": { "type": "syslog" }
        }, {
          # A path of "-" means stdin.
          "paths": [ "-" ],
          "fields": { "type": "stdin" }
        }, {
          "paths": [
            "/var/log/apache/httpd-*.log"
          ],
          "fields": { "type": "apache" }
        }
      ]
    }
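
Because the annotated sample above is not valid JSON as written, it can help to check a config before deploying it. The following is a minimal Python sketch, not part of logstash-forwarder: the helper name `load_annotated_config` is hypothetical, and it assumes comments only ever appear on lines of their own starting with `#`, as in the sample.

```python
import json

def load_annotated_config(text):
    """Drop whole-line '#' comments, then parse the rest as JSON.

    Only lines whose first non-blank character is '#' are removed;
    a '#' inside a string value or after a setting is left alone.
    """
    kept = [line for line in text.splitlines()
            if not line.lstrip().startswith("#")]
    return json.loads("\n".join(kept))

# A shortened version of the sample config, comments included.
sample = """
{
  # The network section covers network configuration
  "network": {
    # A list of downstream servers listening for our messages.
    "servers": [ "localhost:5043" ],
    "ssl ca": "./logstash-forwarder.crt",
    "timeout": 15
  },
  "files": [
    {
      "paths": [ "/var/log/messages", "/var/log/*.log" ],
      # A dictionary of fields to annotate on each event.
      "fields": { "type": "syslog" }
    }
  ]
}
"""

config = load_annotated_config(sample)
print(config["network"]["servers"])
```

If `json.loads` raises `json.JSONDecodeError`, the remaining text has a problem beyond comments, such as a trailing comma.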

### Goals

* Minimize resource usage where possible (CPU, memory, network).
* Secure transmission of logs.
* Configurable event data.
* Easy to deploy with minimal moving parts.
* Simple inputs only:
  * Follows files and respects rename/truncation conditions.
  * Accepts `STDIN`, useful for things like `varnishlog | logstash-forwarder...`.
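
To illustrate what "respects rename/truncation conditions" involves, here is a rough Python sketch of one common way to notice either condition on a POSIX system: compare the inode and size of whatever currently sits at the path with what was last read. This is an assumption about the general technique, not a description of how logstash-forwarder itself does it; `classify_change` and the temporary file are hypothetical.

```python
import os
import tempfile

def classify_change(path, last_inode, last_offset):
    """Compare the file now at `path` with the one we were reading.

    Returns "rotated" when a different inode sits at the path (the old
    file was renamed away), "truncated" when the same file is now shorter
    than the offset already read, and "same" otherwise.
    """
    st = os.stat(path)
    if st.st_ino != last_inode:
        return "rotated"
    if st.st_size < last_offset:
        return "truncated"
    return "same"

# Demo against a throwaway file (hypothetical path, not a real log).
workdir = tempfile.mkdtemp()
log_path = os.path.join(workdir, "app.log")
with open(log_path, "w") as f:
    f.write("line one\nline two\n")
st = os.stat(log_path)
inode, offset = st.st_ino, st.st_size

results = [classify_change(log_path, inode, offset)]

with open(log_path, "w") as f:        # mode "w" truncates in place, same inode
    f.write("x\n")
results.append(classify_change(log_path, inode, offset))

os.rename(log_path, log_path + ".1")  # rotate: the old file moves aside
with open(log_path, "w") as f:        # and a new file appears at the path
    f.write("fresh\n")
results.append(classify_change(log_path, inode, offset))

print(results)
```

A size check alone misses a file that is truncated and then grows past the old offset before the next check, so a real follower would need to poll often enough or use additional signals.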

## Building it

1. Install [Go](http://golang.org/doc/install)

2. Compile logstash-forwarder

        git clone git://github.com/elasticsearch/logstash-forwarder.git
        cd logstash-forwarder
        go build

## Packaging it (optional)

You can make native packages of logstash-forwarder.

To build the packages, you will need Ruby and fpm installed.

    gem install fpm

Now build an RPM:

    make rpm

Or a DEB:

    make deb

## Installing it (via packages only)

If you don't use the rpm or deb make targets described above, you can skip this section.

Packages install to `/opt/logstash-forwarder`.

There are no run-time dependencies.

## Running it

Generally:

    logstash-forwarder -config lumberjack.conf

See `logstash-forwarder -help` for all the flags.

The config file is documented further up in this file.

### Key points

* You'll need an SSL CA to verify the server (host) with.
* You can specify custom fields for each set of paths in the config file. Any
  number of these may be specified. I use them to set fields like `type` and
  other custom attributes relevant to each log.

### Generating an ssl certificate

Logstash supports all certificates, including self-signed certificates. To generate a certificate, you can run the following command:

    $ openssl req -x509 -batch -nodes -newkey rsa:2048 -keyout logstash-forwarder.key -out logstash-forwarder.crt

This will generate a key at `logstash-forwarder.key` and the certificate at `logstash-forwarder.crt`. Both the server running logstash-forwarder and the logstash instances receiving logs need these files on disk to verify the authenticity of messages.

Recommended file locations:

- certificates: `/etc/pki/tls/certs`
- keys: `/etc/pki/tls/private`

## Use with logstash

In logstash, you'll want to use the [lumberjack](http://logstash.net/docs/latest/inputs/lumberjack) input, something like:

    input {
      lumberjack {
        # The port to listen on
        port => 12345

        # The paths to your ssl cert and key
        ssl_certificate => "path/to/ssl.crt"
        ssl_key => "path/to/ssl.key"

        # Set this to whatever you want.
        type => "somelogs"
      }
    }

## Implementation details 

Below is valid as of 2012/09/19

### Minimize resource usage

* Sets small resource limits (memory, open files) on start up based on the
  number of files being watched.
* CPU: sleeps when there is nothing to do.
* Network/CPU: sleeps if there is a network failure.
* Network: uses zlib for compression.
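
As a rough illustration of why zlib helps here, the Python sketch below compresses a batch of similar log lines and checks that it round-trips. The log lines are made-up sample data, and this shows only the compression step, not the lumberjack wire format.

```python
import zlib

# Made-up, repetitive log lines standing in for a batch of events.
lines = [
    "Oct  1 12:00:%02d host sshd[123]: Accepted publickey for deploy" % i
    for i in range(50)
]
payload = "\n".join(lines).encode("utf-8")

# Level 6 is zlib's usual speed/size trade-off; any level 1-9 is valid.
compressed = zlib.compress(payload, 6)

# The receiver recovers the original bytes exactly.
restored = zlib.decompress(compressed)

print(len(payload), len(compressed), restored == payload)
```

Log lines tend to share timestamps, hostnames, and message templates, which is the kind of redundancy zlib removes well.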

### Secure transmission

* Uses OpenSSL to verify the server certificates (so you know who you
  are sending to).
* Uses OpenSSL to transport logs.
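
For readers who want to see what "verify the server certificate" looks like from a client, here is a small Python sketch using the standard `ssl` module, which wraps OpenSSL. It is an illustration under assumptions, not logstash-forwarder's code: `make_client_context` is a hypothetical helper, and it keeps Python's default hostname checking switched on, which the README does not say anything about.

```python
import ssl

def make_client_context(ca_file=None):
    """Build a client-side TLS context that requires a verified server cert.

    `ca_file` plays the role of the "ssl ca" setting: the only
    certificates trusted are the ones loaded from it. With no file
    loaded, the context trusts nothing and every handshake would fail.
    """
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    if ca_file is not None:
        ctx.load_verify_locations(cafile=ca_file)
    return ctx

ctx = make_client_context()
print(ctx.verify_mode == ssl.CERT_REQUIRED, ctx.check_hostname)
```

In use, you would pass the CA path (for example `./logstash-forwarder.crt`) and then call `ctx.wrap_socket(sock, server_hostname=host)` on a connected socket; the handshake fails if the server's certificate does not chain to that CA.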

### Configurable event data

* The protocol supports sending a `string:string` map.
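
One practical consequence of a `string:string` map is that non-string values have to be turned into strings before they are sent. The Python sketch below shows the idea; `to_string_map` is a hypothetical helper and the `port` and `line` keys are invented for the example.

```python
def to_string_map(fields):
    """Return a copy of `fields` with every key and value as a string."""
    return {str(key): str(value) for key, value in fields.items()}

# "type" mirrors the config's "fields" setting; the other keys are made up.
event = to_string_map({
    "type": "apache",
    "port": 5043,
    "line": "GET / HTTP/1.1",
})
print(event)
```

Any typing of values (numbers, booleans) therefore has to be restored on the receiving side if it matters.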

### Easy deployment

* The `make deb` or `make rpm` commands will package everything into a
  single DEB or RPM.

### Future protocol discussion

I would love to not have a custom protocol, but nothing I've found implements
what I need, which is: encrypted, trusted, compressed, latency-resilient, and
reliable transport of events.

* Redis development refuses to accept encryption support, and would likely
  reject compression as well.
* ZeroMQ lacks authentication, encryption, and compression.
* Thrift also lacks authentication, encryption, and compression, and is an
  RPC framework, not a streaming system.
* Websockets don't do authentication or compression, but support encrypted
  channels with SSL. Websockets also require XORing the entire payload of all
  messages - wasted energy.
* SPDY is still changing too frequently and is also RPC. Streaming requires
  custom framing.
* HTTP is RPC and has very high overhead for small events (incompressible
  headers, etc). Streaming requires custom framing.

## License 

See LICENSE file.