Ban each request after overflow

Hello, I would like to implement the following behavior with a scenario, but it doesn’t work as expected.

All requests matching a filter should land in the same bucket with a capacity of 5. After the 5th request, each further request should cause an overflow. The decision should have “captcha” as the action.

As I understand it, the value of groupby decides which bucket a request lands in, so I can’t group by the IP. It has to be another value that is the same for all the requests I want to handle this way.
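To make the role of groupby concrete, here is a minimal Python sketch of the partitioning idea only. It is a simplified model, not CrowdSec's actual implementation: the `partition` helper and the event dictionaries are hypothetical, and the IPs and AS numbers are the ones that appear in this thread.

```python
from collections import defaultdict


def partition(events, groupby):
    """Group event IPs by the value the groupby expression yields."""
    buckets = defaultdict(list)
    for event in events:
        buckets[groupby(event)].append(event["ip"])
    return dict(buckets)


# Hypothetical, already-enriched events (IP plus AS number).
events = [
    {"ip": "149.154.159.147", "asn": "9009"},
    {"ip": "149.154.159.148", "asn": "9009"},
    {"ip": "178.190.110.154", "asn": "8447"},
]

by_ip = partition(events, lambda e: e["ip"])    # one bucket per IP
by_asn = partition(events, lambda e: e["asn"])  # one bucket per provider

print(len(by_ip))      # 3 separate buckets
print(by_asn["9009"])  # both M247 IPs share one bucket
```

Grouping by IP never lets requests from different addresses add up, while grouping by the AS number puts both M247 addresses into one bucket.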

For testing purposes, I set the capacity to 1. So I expect that after the first request, the second one should cause a decision, and the same for the next request from another IP, as long as the capacity is reached. But it’s not working as expected.

My scenario looks like the following:

type: leaky
name: bu/foreign-users
description: "detect users outside of eu"
filter: "evt.Meta.log_type == 'http_access-log' && evt.Meta.IsInEU == 'false'"
leakspeed: "20s"
capacity: 1
groupby: evt.Enriched.ASNNumber
blackhole: 0
reprocess: false
debug: true
labels:
 service: http
 type: server-check
 remediation: true

The behavior is that after the first request I get a decision, the third request doesn’t cause a decision, but the 4th request lands in the decision list again.

+----+----------+--------------------+-----------------+---------+---------+---------------+--------+--------------------+----------+
| ID |  SOURCE  |    SCOPE:VALUE     |     REASON      | ACTION  | COUNTRY |      AS       | EVENTS |     EXPIRATION     | ALERT ID |
+----+----------+--------------------+-----------------+---------+---------+---------------+--------+--------------------+----------+
|  3 | crowdsec | Ip:149.154.159.150 | bu/server-guard | captcha | DE      | 9009 M247 Ltd |      2 | 3h59m8.667557561s  |        3 |
|  1 | crowdsec | Ip:149.154.159.148 | bu/server-guard | captcha | DE      | 9009 M247 Ltd |      2 | 3h58m55.141157649s |        1 |
+----+----------+--------------------+-----------------+---------+---------+---------------+--------+--------------------+----------+
2 duplicated entries skipped

What I want to achieve is this: when we get more than a specific number of requests from another country to a specific URL within a defined time range, we want to show a captcha in the client. So our bouncer should react to the captcha decision and add a special header to the request. Is this possible somehow using the leaky buckets from CrowdSec?

Thanks in advance!

If my problem is too difficult to answer, maybe someone could tell me whether my understanding of how the leaky bucket works is correct.

So what I understand is that a separate bucket is created for each scenario, and within a scenario a separate bucket is created for each unique value of the groupby field. So if I use the ASN of the provider instead of the usual IP, every request from this provider is added to the same bucket (if the filter matches). When the capacity is 5 and I receive 6 matching requests within a time shorter than the leakspeed, the 6th request causes an overflow. If no overflow filter is defined, a decision follows, for example that the IP should be banned or that a captcha should be shown. What should happen when the 7th request matches and there are still 5 requests in the bucket? I would expect the same thing that happened with the 6th request. Is my assumption correct?

If yes, I’m wondering why my test scenario wasn’t working. I defined a capacity of 1 and sent 4 log entries with different IPs within a short time (less than the leakspeed). So I would expect to see not the first request, but all the following ones, in the decision list. Instead, only the second and the 4th requests were in the decision list. So what is wrong here?

Hello,

Your scenario should work.

Do the IPs in the 4 log entries have the same AS number?
If you still have the log lines, you can use cscli explain with the --verbose flag to see what is parsed and enriched by the parsers you have installed (see the cscli explain page in the CrowdSec documentation).

Yes, all IP addresses have the same ASNNumber.

When I use the replay mode with 4 consecutive IP addresses starting with 149.154.159.147, I see the following output:

INFO[06-05-2022 10:59:15] Loading 3 scenario files
INFO[06-05-2022 10:59:15] Adding leaky bucket                           cfg=dawn-glitter file=/etc/crowdsec/scenarios/server-guard.yaml name=bu/server-guard
INFO[06-05-2022 10:59:15] Adding leaky bucket                           cfg=winter-sun file=/etc/crowdsec/scenarios/non-eu-users.yaml name=bu/non-eu-users
INFO[06-05-2022 10:59:15] Adding trigger bucket                         cfg=shy-lake file=/etc/crowdsec/scenarios/backdoor.yaml name=bu/http-backdoors-attempts
WARN[06-05-2022 10:59:15] Loaded 3 scenarios
INFO[06-05-2022 10:59:15] Adding file /var/log/haproxy/access-serverguard.log to filelist  type="file:///var/log/haproxy/access-serverguard.log"
WARN[06-05-2022 10:59:15] Starting processing data
INFO[06-05-2022 10:59:15] reading /var/log/haproxy/access-serverguard.log at once  type="file:///var/log/haproxy/access-serverguard.log"
DEBU[06-05-2022 10:59:15] eval(evt.Meta.log_type == 'http_access-log' && evt.Parsed.captured_request_headers == 'ServerGuard24 HTTP Plugin') = TRUE  cfg=dawn-glitter file=/etc/crowdsec/scenarios/server-guard.yaml name=bu/server-guard
DEBU[06-05-2022 10:59:15] eval variables:                               cfg=dawn-glitter file=/etc/crowdsec/scenarios/server-guard.yaml name=bu/server-guard
DEBU[06-05-2022 10:59:15]        evt.Meta.log_type = 'http_access-log'  cfg=dawn-glitter file=/etc/crowdsec/scenarios/server-guard.yaml name=bu/server-guard
DEBU[06-05-2022 10:59:15]        evt.Parsed.captured_request_headers = 'ServerGuard24 HTTP Plugin'  cfg=dawn-glitter file=/etc/crowdsec/scenarios/server-guard.yaml name=bu/server-guard
DEBU[06-05-2022 10:59:15] Creating TimeMachine bucket                   cfg=dawn-glitter file=/etc/crowdsec/scenarios/server-guard.yaml name=bu/server-guard
DEBU[06-05-2022 10:59:15] Leaky routine starting, lifetime : 1m0s       bucket_id=weathered-firefly capacity=1 cfg=dawn-glitter file=/etc/crowdsec/scenarios/server-guard.yaml name=bu/server-guard partition=d35e00b74d4268d4267d47414e67541af4b540fb
DEBU[06-05-2022 10:59:15] Created new bucket d35e00b74d4268d4267d47414e67541af4b540fb  cfg=dawn-glitter file=/etc/crowdsec/scenarios/server-guard.yaml name=bu/server-guard
DEBU[06-05-2022 10:59:15] bucket 'bu/server-guard' is poured            cfg=dawn-glitter file=/etc/crowdsec/scenarios/server-guard.yaml name=bu/server-guard
DEBU[06-05-2022 10:59:15] First event, bucket creation time : 2022-05-06 10:07:53 +0000 UTC  bucket_id=weathered-firefly capacity=1 cfg=dawn-glitter file=/etc/crowdsec/scenarios/server-guard.yaml name=bu/server-guard partition=d35e00b74d4268d4267d47414e67541af4b540fb
DEBU[06-05-2022 10:59:15] eval(evt.Meta.log_type == 'http_access-log' && evt.Parsed.captured_request_headers == 'ServerGuard24 HTTP Plugin') = TRUE  cfg=dawn-glitter file=/etc/crowdsec/scenarios/server-guard.yaml name=bu/server-guard
DEBU[06-05-2022 10:59:15] eval variables:                               cfg=dawn-glitter file=/etc/crowdsec/scenarios/server-guard.yaml name=bu/server-guard
DEBU[06-05-2022 10:59:15]        evt.Meta.log_type = 'http_access-log'  cfg=dawn-glitter file=/etc/crowdsec/scenarios/server-guard.yaml name=bu/server-guard
DEBU[06-05-2022 10:59:15]        evt.Parsed.captured_request_headers = 'ServerGuard24 HTTP Plugin'  cfg=dawn-glitter file=/etc/crowdsec/scenarios/server-guard.yaml name=bu/server-guard
DEBU[06-05-2022 10:59:15] bucket 'bu/server-guard' is poured            cfg=dawn-glitter file=/etc/crowdsec/scenarios/server-guard.yaml name=bu/server-guard
DEBU[06-05-2022 10:59:15] Bucket overflow at 2022-05-06 10:07:54 +0000 UTC  bucket_id=weathered-firefly capacity=1 cfg=dawn-glitter file=/etc/crowdsec/scenarios/server-guard.yaml name=bu/server-guard partition=d35e00b74d4268d4267d47414e67541af4b540fb
DEBU[06-05-2022 10:59:15] Adding overflow to blackhole (2022-05-06 10:07:53 +0000 UTC)  bucket_id=weathered-firefly capacity=1 cfg=dawn-glitter file=/etc/crowdsec/scenarios/server-guard.yaml name=bu/server-guard partition=d35e00b74d4268d4267d47414e67541af4b540fb
DEBU[06-05-2022 10:59:15] eval(evt.Meta.log_type == 'http_access-log' && evt.Parsed.captured_request_headers == 'ServerGuard24 HTTP Plugin') = TRUE  cfg=dawn-glitter file=/etc/crowdsec/scenarios/server-guard.yaml name=bu/server-guard
DEBU[06-05-2022 10:59:15] eval variables:                               cfg=dawn-glitter file=/etc/crowdsec/scenarios/server-guard.yaml name=bu/server-guard
WARN[06-05-2022 10:59:15] Acquisition is finished, shutting down
DEBU[06-05-2022 10:59:15]        evt.Meta.log_type = 'http_access-log'  cfg=dawn-glitter file=/etc/crowdsec/scenarios/server-guard.yaml name=bu/server-guard
DEBU[06-05-2022 10:59:15]        evt.Parsed.captured_request_headers = 'ServerGuard24 HTTP Plugin'  cfg=dawn-glitter file=/etc/crowdsec/scenarios/server-guard.yaml name=bu/server-guard
DEBU[06-05-2022 10:59:15] Creating TimeMachine bucket                   cfg=dawn-glitter file=/etc/crowdsec/scenarios/server-guard.yaml name=bu/server-guard
DEBU[06-05-2022 10:59:15] Leaky routine starting, lifetime : 1m0s       bucket_id=broken-thunder capacity=1 cfg=dawn-glitter file=/etc/crowdsec/scenarios/server-guard.yaml name=bu/server-guard partition=d35e00b74d4268d4267d47414e67541af4b540fb
DEBU[06-05-2022 10:59:15] Created new bucket d35e00b74d4268d4267d47414e67541af4b540fb  cfg=dawn-glitter file=/etc/crowdsec/scenarios/server-guard.yaml name=bu/server-guard
DEBU[06-05-2022 10:59:15] bucket 'bu/server-guard' is poured            cfg=dawn-glitter file=/etc/crowdsec/scenarios/server-guard.yaml name=bu/server-guard
DEBU[06-05-2022 10:59:15] First event, bucket creation time : 2022-05-06 10:07:55 +0000 UTC  bucket_id=broken-thunder capacity=1 cfg=dawn-glitter file=/etc/crowdsec/scenarios/server-guard.yaml name=bu/server-guard partition=d35e00b74d4268d4267d47414e67541af4b540fb
INFO[06-05-2022 10:59:15] Killing parser routines
DEBU[06-05-2022 10:59:15] eval(evt.Meta.log_type == 'http_access-log' && evt.Parsed.captured_request_headers == 'ServerGuard24 HTTP Plugin') = TRUE  cfg=dawn-glitter file=/etc/crowdsec/scenarios/server-guard.yaml name=bu/server-guard
DEBU[06-05-2022 10:59:15] eval variables:                               cfg=dawn-glitter file=/etc/crowdsec/scenarios/server-guard.yaml name=bu/server-guard
DEBU[06-05-2022 10:59:15]        evt.Meta.log_type = 'http_access-log'  cfg=dawn-glitter file=/etc/crowdsec/scenarios/server-guard.yaml name=bu/server-guard
DEBU[06-05-2022 10:59:15]        evt.Parsed.captured_request_headers = 'ServerGuard24 HTTP Plugin'  cfg=dawn-glitter file=/etc/crowdsec/scenarios/server-guard.yaml name=bu/server-guard
DEBU[06-05-2022 10:59:15] bucket 'bu/server-guard' is poured            cfg=dawn-glitter file=/etc/crowdsec/scenarios/server-guard.yaml name=bu/server-guard
DEBU[06-05-2022 10:59:15] Bucket overflow at 2022-05-06 10:07:56 +0000 UTC  bucket_id=broken-thunder capacity=1 cfg=dawn-glitter file=/etc/crowdsec/scenarios/server-guard.yaml name=bu/server-guard partition=d35e00b74d4268d4267d47414e67541af4b540fb
DEBU[06-05-2022 10:59:15] d35e00b74d4268d4267d47414e67541af4b540fb left blackhole 2s ago  bucket_id=broken-thunder capacity=1 cfg=dawn-glitter file=/etc/crowdsec/scenarios/server-guard.yaml name=bu/server-guard partition=d35e00b74d4268d4267d47414e67541af4b540fb
DEBU[06-05-2022 10:59:15] Adding overflow to blackhole (2022-05-06 10:07:55 +0000 UTC)  bucket_id=broken-thunder capacity=1 cfg=dawn-glitter file=/etc/crowdsec/scenarios/server-guard.yaml name=bu/server-guard partition=d35e00b74d4268d4267d47414e67541af4b540fb
INFO[06-05-2022 10:59:16] Ip 2 sources performed 'bu/server-guard' (2 events over 1s) at 2022-05-06 10:07:54 +0000 UTC
INFO[06-05-2022 10:59:16] Ip 2 sources performed 'bu/server-guard' (2 events over 1s) at 2022-05-06 10:07:56 +0000 UTC
INFO[06-05-2022 10:59:16] Bucket routine exiting
INFO[06-05-2022 10:59:17] crowdsec shutdown

Here’s the output for the alerts and decision lists:

bash-5.1# cscli alerts list
+----+--------------------+-----------------+---------+---------------+-----------+-------------------------------+
| ID |       VALUE        |     REASON      | COUNTRY |      AS       | DECISIONS |          CREATED AT           |
+----+--------------------+-----------------+---------+---------------+-----------+-------------------------------+
|  4 | Ip:149.154.159.150 | bu/server-guard | DE      | 9009 M247 Ltd | captcha:1 | 2022-05-06 10:07:55 +0000 UTC |
|  3 | Ip:149.154.159.150 | bu/server-guard | DE      | 9009 M247 Ltd | captcha:1 | 2022-05-06 10:07:55 +0000 UTC |
|  2 | Ip:149.154.159.148 | bu/server-guard | DE      | 9009 M247 Ltd | captcha:1 | 2022-05-06 10:07:53 +0000 UTC |
|  1 | Ip:149.154.159.148 | bu/server-guard | DE      | 9009 M247 Ltd | captcha:1 | 2022-05-06 10:07:53 +0000 UTC |
+----+--------------------+-----------------+---------+---------------+-----------+-------------------------------+
bash-5.1# cscli decision list
+----+----------+--------------------+-----------------+---------+---------+---------------+--------+-------------------+----------+
| ID |  SOURCE  |    SCOPE:VALUE     |     REASON      | ACTION  | COUNTRY |      AS       | EVENTS |    EXPIRATION     | ALERT ID |
+----+----------+--------------------+-----------------+---------+---------+---------------+--------+-------------------+----------+
|  4 | crowdsec | Ip:149.154.159.150 | bu/server-guard | captcha | DE      | 9009 M247 Ltd |      2 | 3h3m29.444351341s |        4 |
|  2 | crowdsec | Ip:149.154.159.148 | bu/server-guard | captcha | DE      | 9009 M247 Ltd |      2 | 3h3m27.444285869s |        2 |
+----+----------+--------------------+-----------------+---------+---------+---------------+--------+-------------------+----------+

So the IP I would expect in the list is 149.154.159.149, but it’s not there.

@alteredCoder I just posted an answer to your question, but it shows “Akismet has temporarily hidden your post”, so I hope it will appear soon… The short answer is yes, all IP addresses were from the same ASNNumber.

Today I experimented again with a scenario that doesn’t use the IP for the groupby field. Instead, I’m using the IsoCode.

type: leaky
format: 2.0
name: bu/tariffs-by-country
description: "detect too many requests to tariffs endpoint from outside DE"
filter: "evt.Meta.log_type == 'http_access-log' && evt.Enriched.IsoCode != 'DE' && evt.Parsed.request startsWith '/desktopapi/tariffs/'"
leakspeed: "60s"
capacity: 4
groupby: evt.Enriched.IsoCode
blackhole: 0
reprocess: true
debug: true
labels:
 service: http
 type: crawler-check
 remediation: true

What I wanted to achieve is that all requests from outside Germany to a specific URL are counted and grouped by country. When there are more requests than the capacity allows, the bucket for that country should overflow, and I expect a decision and an alert for each IP.

When I run my test scenario and observe the decision list with watch cscli decision list, I see an alert for the IP ending in 155, but not for the one ending in 154.

This is my test data:

May  8 11:51:23 anyserver.intern.bu.xxx.de haproxy[948]: 178.190.110.154:50993 [08/May/2022:11:51:16.264] public~ desktopapi/desktopapi-02 0/0/1/7038/7039 200 5087 - - ---- 212/208/0/0/0 0/0 {Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.54 Safari/537.36} "POST /desktopapi/tariffs/insurances HTTP/2.0"
May  8 11:51:23 anyserver.intern.bu.xxx.de haproxy[948]: 178.190.110.155:50993 [08/May/2022:11:51:16.264] public~ desktopapi/desktopapi-02 0/0/1/7038/7039 200 5087 - - ---- 212/208/0/0/0 0/0 {Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.54 Safari/537.36} "POST /desktopapi/tariffs/insurances HTTP/2.0"

I just duplicated the data multiple times to simulate multiple requests from two IPs.

This is the decision list:

+----+----------+--------------------+-----------------------+---------+---------+----------------------------+--------+-----------------+----------+
| ID |  SOURCE  |    SCOPE:VALUE     |        REASON         | ACTION  | COUNTRY |             AS             | EVENTS |   EXPIRATION    | ALERT ID |
+----+----------+--------------------+-----------------------+---------+---------+----------------------------+--------+-----------------+----------+
|  2 | crowdsec | Ip:178.190.110.155 | bu/tariffs-by-country | captcha | AT      | 8447 A1 Telekom Austria AG |      5 | 4m54.795402488s |        2 |
+----+----------+--------------------+-----------------------+---------+---------+----------------------------+--------+-----------------+----------+
1 duplicated entries skipped

In the alerts list, I see two entries for IP 155, but not for 154.

I guess it’s using the wrong IP for the second alert. Is there a bug in CrowdSec, or am I doing or thinking something completely wrong? I’m using the haproxy parser, and cscli inspect correctly shows me two different IP addresses. Thanks in advance for your support.

Hello,

Just to be sure I understand your need: do you want to ban only the single IP or the multiple IPs that triggered a scenario which groups by the country or the AS?
You don’t want to ban all the IPs belonging to an AS/country after this scenario is triggered?

Hello @alteredCoder,

Let me try to explain our use case. Because of the nature of our products, our customers are mostly based in Germany. So 90 percent of all requests come from providers located in Germany.
Just a few requests come from other countries. In our experience, if there are too many requests from outside Germany, they come from crawlers, either from a single IP or from a group of IPs from the same country. In such cases, we don’t want to ban all those requests, but show them a captcha instead.

The scenario should be defined in the following way. The leaky bucket should have a capacity that a normal user would never reach alone, but that a group of IPs from the same country could reach.
When the capacity of this bucket is reached, all following IPs from the same country should get a captcha decision. Once the requests stop and, because of the leakspeed, the bucket drops below its maximum capacity again, requests from this country should be allowed again.

So the short answer is yes: after the scenario is triggered, I want to show a captcha to all IPs from the same country, for as long as there are more requests than the capacity of the bucket allows.

Hello !

What you’re asking for doesn’t seem to be achievable as things currently stand, but there might be some workarounds :slight_smile:

What bouncer are you using? Some bouncers (e.g. cloudflare, aws waf or php) support geolocation blocking, so you could force the whole country to get a captcha.

Another option would be to apply the scenario to ranges outside of Germany, so that you apply captchas not to a list of individual IPs, but to the ranges they belong to (as you seem to be facing “grouped” attacks).

Please let me know,

Hello @thibault, we’re planning to use the HAProxy bouncer, but we need to extend it first, because it currently only reacts to the ban decision and blocks the request completely. As I explained, we don’t want to ban these clients, but show them a captcha instead. So the bouncer would react to a captcha decision by adding a special header to the original request, and the frontend would then know that a captcha should be shown.

Could you please tell me what happens to the requests that arrive after the capacity has been reached and the first overflow with a decision has occurred? Why don’t they also cause an overflow (as long as the bucket is still full)? I would expect the behavior to be the same for all of them. We also plan to have a second scenario that groups by the ASN of the provider, since crawlers sometimes use the same provider, but from different countries. Those providers have multiple ranges of IPs. For example, AS203020 HostRoyale Technologies Pvt Ltd (see its details page on IPinfo.io).

In the HAProxy bouncer I only have the IP address of the request, and with that I would like to ask CrowdSec whether there is an active decision for this IP or not.
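As a rough illustration of that lookup, here is a Python sketch of asking the local API for decisions on a single IP. It assumes the LAPI endpoint `GET /v1/decisions?ip=…` authenticated with an `X-Api-Key` bouncer key, a JSON body that is `null` when nothing matches, and a `type` field on each decision. The function names, the URL and the key are placeholders, and a real HAProxy bouncer would not be written in Python.

```python
import json
import urllib.parse
import urllib.request


def decision_request(lapi_url: str, api_key: str, ip: str) -> urllib.request.Request:
    """Build the request asking the local API for decisions about one IP."""
    query = urllib.parse.urlencode({"ip": ip})
    return urllib.request.Request(
        f"{lapi_url.rstrip('/')}/v1/decisions?{query}",
        headers={"X-Api-Key": api_key},  # bouncer API key (placeholder value below)
    )


def wants_captcha(response_body: bytes) -> bool:
    """Return True if any decision in the response has the captcha type."""
    decisions = json.loads(response_body) or []  # assumed: `null` means no decision
    return any(d.get("type") == "captcha" for d in decisions)


req = decision_request("http://127.0.0.1:8080", "my-bouncer-key", "149.154.159.150")
print(req.full_url)
print(wants_captcha(b'[{"type": "captcha", "value": "149.154.159.150"}]'))
print(wants_captcha(b"null"))
```

Sending the request with `urllib.request.urlopen(req)` and passing the body to `wants_captcha` would tell the bouncer whether to add the special header.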

I tried to find out how the creation of the buckets really works. What I found is that after each overflow, a new instance of the bucket is created. And then, of course, it starts adding the requests that match the filter to this new bucket.

That’s why my scenario, in which each request after the first overflow should cause a decision, doesn’t work. Unfortunately, this behavior doesn’t help me with my use case. From my perspective, if I want to group by something other than the IP, the overflow handling is questionable, at least for what I want to achieve.

Would it be possible to add this as a feature: as long as the bucket exists and has not expired through an underflow, CrowdSec would not recreate a bucket for the same partition and refill it, but would instead cause an overflow for each request after the capacity is reached? It would be great if I could use CrowdSec in that way! :slight_smile:
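The effect can be reproduced with a small Python model. This is only a sketch of the behavior described above, not CrowdSec code: `overflowing_events` is a hypothetical helper, and it assumes that all events fall into one partition and arrive faster than the bucket leaks, as in the replay with capacity 1.

```python
def overflowing_events(ips, capacity):
    """Return the IPs whose event triggers an overflow."""
    overflows = []
    level = 0  # events held by the currently live bucket
    for ip in ips:
        level += 1
        if level > capacity:
            overflows.append(ip)
            level = 0  # the bucket is destroyed; the next event starts a new one
    return overflows


ips = [
    "149.154.159.147",
    "149.154.159.148",
    "149.154.159.149",
    "149.154.159.150",
]
print(overflowing_events(ips, capacity=1))
# ['149.154.159.148', '149.154.159.150']
```

With capacity 1, only every second event overflows, which matches the decisions for the addresses ending in 148 and 150 and the missing one for 149.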

@thibault @alteredCoder Could you please give me some feedback on that? It’s very important for me to know whether there’s a chance that CrowdSec could work in the expected way soon. If not, I need to find another solution. But it would be great if it worked.

Hello,

I think I understand a bit better what you want to achieve, but it isn’t supported yet.
However, as it’s a relevant use case, we can look into it, but I don’t have an ETA to give you yet.

I will try to take some time to look at how we can achieve this, and I’ll keep you posted!

@thibault Thank you very much for thinking about a solution for our use case. I hope for the best. I don’t think it’s such an unusual use case, since it seems to be the whole idea of using something other than the IP address for the groupby field. Why should I group by the country, for example, and then get a decision for only one of the affected IP addresses? I’m looking forward to hearing from you…

I was also looking into the code and thinking about how the behavior could be changed. My idea would be to add an additional configuration option for the scenario with the name “overflowBehavior”. The default value wouldn’t change anything, but if the value of the option were something like “keep-bucket”, the bucket wouldn’t be destroyed after the first overflow. Instead, each IP that matches the filter would cause another overflow, until the leaky routine has drained the bucket.
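To illustrate the difference, here is a minimal Python model of the idea. It is not CrowdSec’s Go implementation: the `Bucket` class is hypothetical, `keep_bucket` stands in for the proposed “keep-bucket” value, and the leak is modelled as one event draining every `leakspeed` seconds.

```python
from dataclasses import dataclass


@dataclass
class Bucket:
    capacity: int
    leakspeed: float           # seconds it takes one event to leak out
    keep_bucket: bool = False  # stand-in for the proposed "keep-bucket" value
    level: float = 0.0         # events currently in the bucket
    last: float = 0.0          # timestamp of the previous event

    def pour(self, now: float) -> bool:
        """Add one event at time `now`; return True if it overflows."""
        leaked = (now - self.last) / self.leakspeed
        self.level = max(0.0, self.level - leaked)
        self.last = now
        if self.level + 1 > self.capacity:
            if not self.keep_bucket:
                self.level = 0.0  # default: bucket dies, next event starts fresh
            return True  # with keep_bucket the full bucket stays in place
        self.level += 1
        return False


times = [0, 1, 2, 3, 30]  # four quick events, then one after a long pause

default = Bucket(capacity=1, leakspeed=20)
print([default.pour(t) for t in times])  # [False, True, False, True, False]

kept = Bucket(capacity=1, leakspeed=20, keep_bucket=True)
print([kept.pour(t) for t in times])     # [False, True, True, True, False]
```

In the second run every event overflows while the bucket is still full, and the event at 30 seconds passes again because the bucket has drained in the meantime.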

If I’m right, the position in the code would be here:

Yes, this would be the main change: allowing a bucket not to die when it overflows, so that further events continue to generate overflows.

If you want to take a stab at the PR, let me know and I’ll be happy to help you if needed :slight_smile:

I’m glad to hear that my assumption about which change has to be made was right and that you also like my idea. As for the PR, I could try it, but I’m not sure it would pass the tests, since the tests failed on my system without any changes. I’m also not sure whether I have to run all the tests after my change or only specific ones, since I guess my changes won’t affect all of the code.

We’re currently updating the documentation to make the tests easier to run. However, be sure to rebase/pull before attempting changes, as we just merged a handful of big PRs (Windows support :wink: ).

Feel free to drop by the CrowdSec Discord and we can help you with it!

I just created this PR: Add new optional scenario config overflowBehavior by janbaer (Pull Request #1551, crowdsecurity/crowdsec on GitHub), even though I’m aware that it’s not yet good enough to be accepted. But maybe we can improve it together…