I'm seeing a repeated pattern of false positives and I can't work out why.
I set debug to true, but it didn't seem to change the logging output.
My http-crawl-non_statics bucket is configured with a capacity of 420 and a 1s leakspeed.
I think this means it takes something like 421 requests, arriving faster than they can leak out, before the bucket spills and the IP is blocked.
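To sanity-check my reading of capacity/leakspeed, here is the rough leaky-bucket model I have in my head, as a Python sketch. This is only my assumption of how the bucket behaves (not CrowdSec's actual code), and the evenly spaced timestamps are made up:

from datetime import datetime, timedelta

def overflow_index(event_times, capacity=420, leak_period=timedelta(seconds=1)):
    """Rough leaky-bucket model: every event adds 1 to the bucket and
    one unit leaks out per leak_period. Returns the index of the event
    that would spill the bucket, or None if it never overflows."""
    level = 0.0
    last = None
    for i, t in enumerate(event_times):
        if last is not None:
            level = max(0.0, level - (t - last) / leak_period)  # leak between events
        last = t
        level += 1
        if level > capacity:
            return i
    return None

# Roughly what the second overflow reports: 603 events over ~3m22s,
# i.e. about 3 events/second if they were evenly spaced.
start = datetime(2021, 9, 29, 11, 3, 26)
events = [start + timedelta(seconds=n / 3) for n in range(603)]
print(overflow_index(events))  # prints None with these made-up timings

With that model the bucket tops out around 400 and never spills, which is exactly why the ban surprises me; if my model is wrong, that is probably what I'm missing.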
Here is what I see in the logs:
(Note: I deleted the ban after the first trigger, so you can see the re-trigger in the logs.)
time="29-09-2021 11:03:15" level=info msg="Ip 69.136.133.107 performed 'crowdsecurity/http-crawl-non_statics' (988 events over 14m15.619972416s) at 2021-09-29 11:03:15.222310069 -0500 CDT m=+7467.250285646"
time="29-09-2021 11:03:15" level=info msg="(e0329a75983f4477ab7afb7f6f09a1094IrAyC6dbDbTOMdU/crowdsec) crowdsecurity/http-crawl-non_statics by ip 69.136.133.107 (US) : 1h ban on Ip 69.136.133.107"
time="29-09-2021 11:06:48" level=info msg="Ip 69.136.133.107 performed 'crowdsecurity/http-crawl-non_statics' (603 events over 3m22.031372879s) at 2021-09-29 11:06:48.755867063 -0500 CDT m=+7680.783842640"
time="29-09-2021 11:06:49" level=info msg="(e0329a75983f4477ab7afb7f6f09a1094IrAyC6dbDbTOMdU/crowdsec) crowdsecurity/http-crawl-non_statics by ip 69.136.133.107 (US) : 1h ban on Ip 69.136.133.107"
Based on the log data, there were 603 events over about 3 minutes. I don't understand why it is logged this way, but in any case it tripped the IP block, and that rate is well under 400/sec.
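For reference, this is the back-of-the-envelope math I'm doing from that second overflow line (plain arithmetic, nothing CrowdSec-specific):

events = 603
window_s = 3 * 60 + 22.03       # "3m22.031372879s" from the log line
rate = events / window_s        # about 2.98 events/second on average
print(f"{rate:.2f} events/s")   # nowhere near the 400+/s I thought was required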
Often when I check my HAProxy logs, I can't even find the same number of 'events' (i.e. log lines) for the suspect IP that CrowdSec reports.
What am I missing? We keep blocking people we don't want to block, and as far as I can tell our thresholds should not be triggering on this traffic.
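For comparison, this is roughly how I've been counting the 'events' on the HAProxy side. It's only a quick sketch: the log path and the timestamp regex are assumptions about my own setup, not anything taken from CrowdSec:

import re
from collections import Counter

SUSPECT_IP = "69.136.133.107"
LOG_PATH = "/var/log/haproxy.log"  # hypothetical path; adjust to your setup

total = 0
per_minute = Counter()
with open(LOG_PATH) as fh:
    for line in fh:
        if SUSPECT_IP not in line:
            continue
        total += 1
        # Assumes an HAProxy HTTP-log timestamp like [29/Sep/2021:11:06:48.755];
        # the capture group keeps everything up to the minute.
        m = re.search(r"\[(\d{2}/\w{3}/\d{4}:\d{2}:\d{2}):\d{2}", line)
        if m:
            per_minute[m.group(1)] += 1

print(f"{total} log lines for {SUSPECT_IP}")
for minute, count in sorted(per_minute.items()):
    print(minute, count)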
Here is my http-crawl-non_statics.yaml file:
type: leaky
name: crowdsecurity/http-crawl-non_statics
description: "Detect aggressive crawl from single ip"
filter: "evt.Meta.log_type in ['http_access-log', 'http_error-log'] && evt.Parsed.static_ressource == 'false'"
distinct: "evt.Parsed.file_name"
leakspeed: 1s
capacity: 420
debug: true
# this limits the memory cache (and event_sequences in the output) to five events
cache_size: 5
groupby: "evt.Meta.source_ip + '/' + evt.Parsed.target_fqdn"
blackhole: 1m
labels:
  service: http
  type: crawl
  remediation: true
Thanks.