Hi All,
Love the idea of this security solution, and we are actively trialling it in non-production currently.
We are currently unable to contribute our log files to the upstream analysis and analytics, because our NGINX access and error logs are in GELF-JSON format, for ingestion into Graylog2.
Has anyone solved this, or a similar problem? We have considered the following approaches:
1. Write a second log for each vHost in the default "combined" format. There doesn't seem to be any simple way of doing this, however.
2. Write a custom GROK pattern filter to override the default "combined" interpretation. Are there any pre-existing filters for the GELF JSON format already? Or guides for newbies writing GROK filters?
Example GELF JSON log format, from nginx.conf:
log_format gelf_json escape=json '{ "timestamp": "$time_iso8601", '
    '"remote_addr": "$remote_addr", '
    '"connection": "$connection", '
    '"connection_requests": $connection_requests, '
    '"pipe": "$pipe", '
    '"body_bytes_sent": $body_bytes_sent, '
    '"request_length": $request_length, '
    '"request_time": $request_time, '
    '"response_status": $status, '
    '"request": "$request", '
    '"request_method": "$request_method", '
    '"host": "$host", '
    '"upstream_cache_status": "$upstream_cache_status", '
    '"upstream_addr": "$upstream_addr", '
    '"http_x_forwarded_for": "$http_x_forwarded_for", '
    '"http_referrer": "$http_referer", '
    '"http_user_agent": "$http_user_agent", '
    '"http_version": "$server_protocol", '
    '"remote_user": "$remote_user", '
    '"http_x_forwarded_proto": "$http_x_forwarded_proto", '
    '"upstream_response_time": "$upstream_response_time", '
    '"nginx_access": true }';
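With this log_format, every access-log entry is a single self-contained JSON document, one per line. A minimal Python sketch of reading one such line (the sample values are illustrative, not from a real server):

```python
import json

# Hypothetical single line as nginx would emit it with the gelf_json
# format above (values are made up for illustration).
line = (
    '{ "timestamp": "2020-11-04T18:06:01+11:00", '
    '"remote_addr": "203.0.113.10", '
    '"body_bytes_sent": 161205, '
    '"response_status": 200, '
    '"request": "GET /index.html HTTP/1.1", '
    '"nginx_access": true }'
)

# Each line parses independently with a plain JSON parser.
record = json.loads(line)
print(record["remote_addr"], record["response_status"])
```

Note that numeric fields like body_bytes_sent and response_status are emitted unquoted, so they arrive as JSON numbers rather than strings.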
kaa
November 3, 2020, 8:09am
2
Hi,
I believe the produced logs are JSON, aren't they?
We have no built-in parser that can be fed JSON logs for now, but we already have the features needed to build one. You'll find an example in the unit tests: https://github.com/crowdsecurity/crowdsec/tree/master/pkg/parser/tests/base-json-extract
I guess the resulting parser file should look like this:
filter: "evt.Parsed.program startsWith 'nginx'"
onsuccess: next_stage
#debug: true
name: crowdsecurity/nginx-logs
description: "Parse nginx access and error logs"
statics:
  - target: evt.StrTime
    expression: JsonExtract(evt.Line.Raw, "timestamp")
  - parsed: "logsource"
    value: "gelf-nginx"
  - meta: source_ip
    expression: JsonExtract(evt.Line.Raw, "remote_addr")
  - meta: http_status
    expression: JsonExtract(evt.Line.Raw, "response_status")
  - meta: http_path
    expression: JsonExtract(evt.Line.Raw, "request")
  - meta: log_type
    value: http_access-log
This file should be put in the /etc/crowdsec/config/parsers/s00-raw directory, but I can't test it because I don't have any of your logs. If you want, you can provide us a sample of your logs and we'll have a better chance of providing you a ready-to-use parser file.
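As a rough mental model (this is not CrowdSec's actual implementation, just a sketch of the observable behavior), JsonExtract pulls a top-level key out of the raw log line and returns it as a string:

```python
import json

def json_extract(raw: str, key: str) -> str:
    """Rough stand-in for CrowdSec's JsonExtract expression helper:
    return the value of a top-level key from a raw JSON log line,
    stringified; empty string if the key is missing or the line is
    not valid JSON."""
    try:
        value = json.loads(raw).get(key, "")
    except json.JSONDecodeError:
        return ""
    return str(value)

# Illustrative raw line (made-up values).
raw = '{ "remote_addr": "192.0.2.7", "response_status": 404 }'
print(json_extract(raw, "remote_addr"))      # → 192.0.2.7
print(json_extract(raw, "response_status"))  # → 404
```

This is why each static in the parser file above simply names the JSON key to copy into a parsed or meta field.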
Hi Kaa,
Thanks very much for the prompt and informative reply.
Please find an example log file here: https://www.veepshosting.com/ssl-redacted.domain.co.access.log.gz
I’m getting conflicting feedback on whether or not it’s working.
The primary log file ‘/var/log/crowdsec.log’ has the following entry, using the parser above -
time="04-11-2020 18:06:01" level=debug msg="+ Processing 2 statics" func="github.com/crowdsecurity/crowdsec/pkg/parser.(*Node).process" file="/home/runner/work/crowdsec/crowdsec/pkg/parser/node.go:313" id=long-lake name=crowdsecurity/non-syslog stage=s00-raw
time="04-11-2020 18:06:01" level=debug msg=".Parsed[message] = '{ "timestamp": "2020-11-04T18:06:01+11:00", "remote_addr": "54.162.224.1", "connection": "50343", "connection_requests": 1, "pipe": ".", "body_bytes_sent": 161205, "request_length": 325, "request_time": 0.228, "response_status": 200, "request": "GET /media/catalog/product/b/e/bella_rosa.jpg HTTP/1.1", "request_method": "GET", "host": "redacted.domain.co", "upstream_cache_status": "", "upstream_addr": "", "http_x_forwarded_for": "", "http_referrer": "", "http_user_agent": "Ruby", "http_version": "HTTP/1.1", "remote_user": "", "http_x_forwarded_proto": "", "upstream_response_time": "", "nginx_access": true }'" func=github.com/crowdsecurity/crowdsec/pkg/parser.ProcessStatics file="/home/runner/work/crowdsec/crowdsec/pkg/parser/runtime.go:175" id=long-lake name=crowdsecurity/non-syslog stage=s00-raw
Yet the metrics command doesn't show any lines parsed:
sudo cscli metrics
INFO[0000] Buckets Metrics:
+--------+---------------+-----------+--------------+--------+---------+
| BUCKET | CURRENT COUNT | OVERFLOWS | INSTANCIATED | POURED | EXPIRED |
+--------+---------------+-----------+--------------+--------+---------+
+--------+---------------+-----------+--------------+--------+---------+
INFO[0000] Acquisition Metrics:
+----------------------------------------------------+------------+--------------+----------------+------------------------+
| SOURCE                                             | LINES READ | LINES PARSED | LINES UNPARSED | LINES POURED TO BUCKET |
+----------------------------------------------------+------------+--------------+----------------+------------------------+
| /var/log/auth.log                                  | 24         | -            | 24             | -                      |
| /var/log/nginx/ssl-obfuscated.domain.co.access.log | 132        | -            | 132            | -                      |
| /var/log/syslog                                    | 11         | -            | 11             | -                      |
+----------------------------------------------------+------------+--------------+----------------+------------------------+
INFO[0000] Parser Metrics:
+-------------------------------+------+--------+----------+
| PARSERS                       | HITS | PARSED | UNPARSED |
+-------------------------------+------+--------+----------+
| child-crowdsecurity/sshd-logs | 10   | -      | 10       |
| crowdsecurity/non-syslog      | 132  | 132    | -        |
| crowdsecurity/sshd-logs       | 2    | -      | 2        |
| crowdsecurity/syslog-logs     | 35   | 35     | -        |
+-------------------------------+------+--------+----------+
kaa
November 5, 2020, 11:19am
5
The nginx-gelp thingy gets only unparsed logs, so this is not working.
On the other hand, the installation seems functional, because syslog did parse 35 log lines. But that parser has to be followed by another one (and the nginx-gelp thingy is not eligible). From your configuration it seems that, besides your nginx-gelp thingy, only ssh is enabled, so it's the only one able to trigger anything, and it was fed only two lines of log. This actually seems legit.
I’ll take some time to dig into your logs very soon.
kaa
November 5, 2020, 7:08pm
6
Ok I got the parsing to work.
You'll have to configure your /etc/crowdsec/config/acquis.yaml with something like:
filenames:
  - /var/log/<the generated gelp-nginx log file>
labels:
  type: gelp-nginx
---
Then add the following configuration as a parsing file in /etc/crowdsec/config/parsers/s00-raw. Whatever name with the extension .yaml will do.
filter: "evt.Line.Labels.type == 'nginx-gelp'"
onsuccess: next_stage
#debug: true
name: crowdsecurity/nginx-logs
description: "Parse nginx access and error logs"
statics:
  - target: evt.StrTime
    expression: JsonExtract(evt.Line.Raw, "timestamp")
  - parsed: "logsource"
    value: "gelf-nginx"
  - parsed: remote_addr
    expression: JsonExtract(evt.Line.Raw, "remote_addr")
  - parsed: remote_user
    expression: JsonExtract(evt.Line.Raw, "remote_user")
  - meta: source_ip
    expression: JsonExtract(evt.Line.Raw, "remote_addr")
  - meta: http_status
    expression: JsonExtract(evt.Line.Raw, "response_status")
  - meta: http_path
    expression: JsonExtract(evt.Line.Raw, "request")
  - meta: log_type
    value: http_access-log
  - meta: service
    value: http
  - parsed: http_user_agent
    expression: JsonExtract(evt.Line.Raw, "http_user_agent")
  - parsed: http_referer
    expression: JsonExtract(evt.Line.Raw, "http_referrer")
  - parsed: target_fqdn
    expression: JsonExtract(evt.Line.Raw, "host")
  - parsed: method
    expression: JsonExtract(evt.Line.Raw, "request_method")
  - parsed: body_bytes_sent
    expression: JsonExtract(evt.Line.Raw, "body_bytes_sent")
  - parsed: http_version
    expression: JsonExtract(evt.Line.Raw, "http_version")
  - parsed: status
    expression: JsonExtract(evt.Line.Raw, "response_status")
Please keep in mind that the expression "evt.Line.Labels.type == 'nginx-gelp'" has to match the label in the acquis.yaml file.
A final step is required to make all the dots connect with the http-related scenarios. Add the following file in /etc/crowdsec/config/parsers/s01-parse. Whatever name with the .yaml extension will do:
filter: "evt.Meta.service == 'http' && evt.Meta.log_type in ['http_access-log', 'http_error-log']"
onsuccess: next_stage
name: local/gelp-nginx-request
nodes:
  - grok:
      pattern: '%{WORD:method} %{URIPATHPARAM:request} HTTP/%{NUMBER:http_version}'
      apply_on: full_request
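The grok node splits the request line into method, path, and HTTP version. Roughly the same match expressed as a plain regex (the pattern below only approximates the grok macros; it is not byte-for-byte identical to GROK's WORD/URIPATHPARAM/NUMBER definitions):

```python
import re

# Approximation of:
#   %{WORD:method} %{URIPATHPARAM:request} HTTP/%{NUMBER:http_version}
GROK_LIKE = re.compile(
    r'^(?P<method>\w+) (?P<request>\S+) HTTP/(?P<http_version>[\d.]+)$'
)

# Sample request line in the shape nginx's $request variable produces.
m = GROK_LIKE.match("GET /media/catalog/product/b/e/bella_rosa.jpg HTTP/1.1")
print(m.group("method"), m.group("request"), m.group("http_version"))
```

This is also why the s00-raw stage must expose a full_request field: the grok node's apply_on directive names the parsed field it runs against.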
Thanks Kaa, we are testing it currently; however, the log files that match are still sitting in "Unparsed". No errors relating to the above config are listed in the main crowdsec.log file, so possibly it's just a matter of waiting a bit longer.
cscli metrics
INFO[0000] Buckets Metrics:
+--------------------------------+---------------+-----------+--------------+--------+---------+
| BUCKET                         | CURRENT COUNT | OVERFLOWS | INSTANCIATED | POURED | EXPIRED |
+--------------------------------+---------------+-----------+--------------+--------+---------+
| crowdsecurity/ssh-bf           | -             | -         | 12           | 13     | 12      |
| crowdsecurity/ssh-bf_user-enum | -             | -         | 12           | 12     | 12      |
+--------------------------------+---------------+-----------+--------------+--------+---------+
INFO[0000] Acquisition Metrics:
+-----------------------------------------------+------------+--------------+----------------+------------------------+
| SOURCE                                        | LINES READ | LINES PARSED | LINES UNPARSED | LINES POURED TO BUCKET |
+-----------------------------------------------+------------+--------------+----------------+------------------------+
| /var/log/auth.log                             | 877        | 13           | 864            | 25                     |
| /var/log/nginx/ssl-server.access.log          | 1          | -            | 1              | -                      |
| /var/log/nginx/ssl-server.error.log           | 1          | -            | 1              | -                      |
| /var/log/nginx/ssl-preprod.site.co.access.log | 1          | -            | 1              | -                      |
| /var/log/nginx/ssl-test.site.co.access.log    | 1814       | -            | 1814           | -                      |
| /var/log/syslog                               | 517        | -            | 517            | -                      |
+-----------------------------------------------+------------+--------------+----------------+------------------------+
INFO[0000] Parser Metrics:
+--------------------------------+------+--------+----------+
| PARSERS                        | HITS | PARSED | UNPARSED |
+--------------------------------+------+--------+----------+
| child-crowdsecurity/sshd-logs  | 836  | 13     | 823      |
| child-local/gelf-nginx         | 526  | -      | 526      |
| crowdsecurity/dateparse-enrich | 13   | 13     | -        |
| crowdsecurity/geoip-enrich     | 13   | 13     | -        |
| crowdsecurity/nginx-logs       | 526  | 526    | -        |
| crowdsecurity/non-syslog       | 1291 | 1291   | -        |
| crowdsecurity/sshd-logs        | 171  | 13     | 158      |
| crowdsecurity/syslog-logs      | 1394 | 1394   | -        |
| crowdsecurity/whitelists       | 13   | 13     | -        |
| local/gelf-nginx               | 526  | -      | 526      |
+--------------------------------+------+--------+----------+
kaa
November 10, 2020, 7:48am
9
Hi @grant-veepshosting ,
You should definitely be getting a higher count in the parsed column.
But when I reread the config I pasted you, I found a nasty typo: the type: gelp-nginx in the first file should match the nginx-gelp in the filter line of the second. These two entries have to match for the parser to know which type of log it's working on.
Replacing your /etc/crowdsec/config/acquis.yaml with
filenames:
  - /var/log/<the generated gelp-nginx log file>
labels:
  type: nginx-gelp
should do the trick.
Sorry for this
Hey Kaa,
I noticed this typo and corrected it yesterday, thanks for following up. Also gelp should be gelf.
According to the debug logging, it’s being processed, but not according to any cscli metrics / statistics.
Example debug log entry, anonymised -
https://www.veepshosting.com/nginx_gelf_debug_example.log.gz
kaa
November 12, 2020, 11:38am
11
hi @grant-veepshosting ,
Yes, indeed. But it still lacks a step. The s00-raw step is working fine, but the s01-parse step isn't.
I guess a typo still got through my testing, because the apply_on directive in the last file doesn't match anything coming from the s00-raw stage. To fix this, you'll have to add the following lines to the file in the s00-parse stage:
- parsed: full_request
  expression: JsonExtract(evt.Line.Raw, "request")
Furthermore, the last cscli metrics output you showed us indicates that the base-http-scenarios collection is not installed. I wrote the whole gelf parser (sorry for my being confused over the p and the f) to be compatible with this collection.
You can run cscli install collection crowdsecurity/base-http-scenarios to install it.
Thank you kaa!!! It works now, thanks very much for sticking with it.
Would you like me to post the configs in full for reference, or possible addition to the code base?
PS: I had to add the section above ("request") to the s00-raw .yaml file, not the s01-parse .yaml file, for it to work correctly.
kaa
November 17, 2020, 10:28pm
13
Hi @grant-veepshosting,
Yes, it would be great to have this. At some point, we may want to add this to the official stuff in the hub.
Thanks for your feedback!
/etc/crowdsec/config/parsers/s00-raw/nginx-gelf.yaml:
filter: "evt.Line.Labels.type == 'gelf-nginx'"
onsuccess: next_stage
debug: true
name: crowdsecurity/nginx-logs
description: "Parse nginx access and error logs"
statics:
  - target: evt.StrTime
    expression: JsonExtract(evt.Line.Raw, "timestamp")
  - parsed: "logsource"
    value: "gelf-nginx"
  - parsed: remote_addr
    expression: JsonExtract(evt.Line.Raw, "remote_addr")
  - parsed: remote_user
    expression: JsonExtract(evt.Line.Raw, "remote_user")
  - meta: source_ip
    expression: JsonExtract(evt.Line.Raw, "remote_addr")
  - meta: http_status
    expression: JsonExtract(evt.Line.Raw, "response_status")
  - meta: http_path
    expression: JsonExtract(evt.Line.Raw, "request")
  - meta: log_type
    value: http_access-log
  - meta: service
    value: http
  - parsed: http_user_agent
    expression: JsonExtract(evt.Line.Raw, "http_user_agent")
  - parsed: http_referer
    expression: JsonExtract(evt.Line.Raw, "http_referrer")
  - parsed: target_fqdn
    expression: JsonExtract(evt.Line.Raw, "host")
  - parsed: method
    expression: JsonExtract(evt.Line.Raw, "request_method")
  - parsed: body_bytes_sent
    expression: JsonExtract(evt.Line.Raw, "body_bytes_sent")
  - parsed: http_version
    expression: JsonExtract(evt.Line.Raw, "http_version")
  - parsed: status
    expression: JsonExtract(evt.Line.Raw, "response_status")
  - parsed: full_request
    expression: JsonExtract(evt.Line.Raw, "request")
/etc/crowdsec/config/parsers/s01-parse/nginx-gelf-logs.yaml:
filter: "evt.Meta.service == 'http' && evt.Meta.log_type in ['http_access-log', 'http_error-log']"
onsuccess: next_stage
name: local/gelf-nginx
nodes:
  - grok:
      pattern: '%{WORD:method} %{URIPATHPARAM:request} HTTP/%{NUMBER:http_version}'
      apply_on: full_request
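As a sanity check, the two stages above can be simulated on one hypothetical log line in plain Python (the field values and the regex stand-in for the grok pattern are illustrative only, not CrowdSec internals):

```python
import json
import re

# Hypothetical GELF-JSON access-log line (made-up values).
raw = (
    '{ "timestamp": "2020-11-04T18:06:01+11:00", '
    '"remote_addr": "203.0.113.10", "response_status": 200, '
    '"request": "GET /index.html HTTP/1.1", '
    '"request_method": "GET", "host": "example.com" }'
)

doc = json.loads(raw)

# Stage s00-raw: JsonExtract-style statics copy JSON keys into fields.
parsed = {
    "full_request": str(doc.get("request", "")),
    "method": str(doc.get("request_method", "")),
}
meta = {
    "source_ip": str(doc.get("remote_addr", "")),
    "http_status": str(doc.get("response_status", "")),
    "service": "http",
    "log_type": "http_access-log",
}

# Stage s01-parse: grok-like split of full_request into its parts.
m = re.match(r'^(\w+) (\S+) HTTP/([\d.]+)$', parsed["full_request"])
assert m is not None
print(meta["source_ip"], m.group(1), m.group(2), m.group(3))
```

If the second stage's match fails, the line stays in the "unparsed" column, which is exactly the symptom the missing full_request static produced earlier in the thread.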