CPU usage is much higher than Fail2ban's

Hi!

I’m a long-time fail2ban user, and I’ve already tried to reduce fail2ban’s CPU usage, which I find too high. My server is a low-end Atom D2550, so every saved percentage point counts 🙂

I’ve tried CrowdSec without any bouncer, using journald acquisition with 2 services filtered (ssh and nginx), and all nginx logs in /var/log/nginx/**/*.log.
I’ve installed the Debian/testing package, 1.4.6, with its default settings. So I’m using the Central API as well.

fail2ban is used with the following jails: dovecot, nsd, nginx, sshd, recidive and postfix. Some of them are using journald, others are using file logs.

I’ve done a simple systemctl restart crowdsec fail2ban, and a few seconds after that, a systemctl status crowdsec fail2ban:

● crowdsec.service - Crowdsec agent
     Loaded: loaded (/lib/systemd/system/crowdsec.service; enabled; preset: enabled)
     Active: active (running) since Sun 2023-04-16 18:51:43 CEST; 11s ago
    Process: 2457875 ExecStartPre=/usr/bin/crowdsec -c /etc/crowdsec/config.yaml -t (code=exited, status=0/SUCCESS)
   Main PID: 2457998 (crowdsec)
      Tasks: 12 (limit: 4671)
     Memory: 74.0M
        CPU: 40.905s
     CGroup: /system.slice/crowdsec.service
             ├─2457998 /usr/bin/crowdsec -c /etc/crowdsec/config.yaml
             └─2458048 journalctl --follow -n 0 _SYSTEMD_UNIT=ssh.service _SYSTEMD_UNIT=nginx.service

● fail2ban.service - Fail2Ban Service
     Loaded: loaded (/lib/systemd/system/fail2ban.service; enabled; preset: enabled)
     Active: active (running) since Sun 2023-04-16 18:51:24 CEST; 31s ago
       Docs: man:fail2ban(1)
   Main PID: 2457888 (fail2ban-server)
      Tasks: 19 (limit: 4671)
     Memory: 123.4M
        CPU: 31.500s
     CGroup: /system.slice/fail2ban.service
             ├─2457888 /usr/bin/python3 /usr/bin/fail2ban-server -xf start
             ├─2458136 /bin/sh -c "nft add element inet f2b-table addr-set-sshd \\{ 157.230.1.224 \\}"
             └─2458137 nft add element inet f2b-table addr-set-sshd { 157.230.1.224 }

After startup, memory is lower for crowdsec, but it has used a bit more CPU. Not a big deal here.

After 5 minutes, the result is different:

● crowdsec.service - Crowdsec agent
     Active: active (running) since Sun 2023-04-16 18:51:43 CEST; 4min 10s ago
[…]
     Memory: 76.7M
        CPU: 54.331s

● fail2ban.service - Fail2Ban Service
     Active: active (running) since Sun 2023-04-16 18:51:24 CEST; 4min 29s ago
[…]
     Memory: 96.7M
        CPU: 1min 21.977s

At first, Go seems to be doing its job. But after a much longer time:

● crowdsec.service - Crowdsec agent
     Active: active (running) since Sun 2023-04-16 18:51:43 CEST; 2h 14min ago
[…]
     Memory: 82.6M
        CPU: 7min 56.071s

● fail2ban.service - Fail2Ban Service
     Active: active (running) since Sun 2023-04-16 18:51:24 CEST; 2h 15min ago
[…]
     Memory: 87.5M
        CPU: 2min 13.405s

It has eaten 3 times more CPU than the old Python tool. I am a bit disappointed, given that the README.md insists on a 60× faster speed.

Am I doing something wrong, e.g. running both fail2ban and crowdsec? Is that an expected result on a CPU lacking some modern features?

This is getting worse:

● crowdsec.service - Crowdsec agent
     Active: active (running) since Sun 2023-04-16 18:51:43 CEST; 13h ago
     Memory: 93.1M
        CPU: 45min 44.254s

● fail2ban.service - Fail2Ban Service
     Active: active (running) since Sun 2023-04-16 18:51:24 CEST; 13h ago
     Memory: 63.2M
        CPU: 7min 485ms
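
For reference, the ratios between the two services can be worked out from the `CPU:` lines quoted above. The following is a minimal Python sketch, not part of either tool: the helper names (`cpu_seconds`, `ratios`) and the snapshot labels are made up for illustration, and the parser only assumes the `h`, `min`, `s` and `ms` units that appear in this thread.

```python
import re

# Seconds per unit for the duration tokens seen in the `systemctl status` output.
# Assumption: only h, min, s and ms occur; other systemd units are not handled.
UNIT_SECONDS = {"h": 3600.0, "min": 60.0, "s": 1.0, "ms": 0.001}
TOKEN = re.compile(r"(\d+(?:\.\d+)?)(ms|min|h|s)")


def cpu_seconds(text):
    """Convert a string such as '7min 56.071s' or '7min 485ms' to seconds."""
    return sum(float(value) * UNIT_SECONDS[unit] for value, unit in TOKEN.findall(text))


# (label, crowdsec CPU, fail2ban CPU), copied from the status output above.
SNAPSHOTS = [
    ("just after restart", "40.905s", "31.500s"),
    ("about 4 min", "54.331s", "1min 21.977s"),
    ("about 2h 15min", "7min 56.071s", "2min 13.405s"),
    ("about 13h", "45min 44.254s", "7min 485ms"),
]


def ratios(snapshots):
    """Return (label, crowdsec_cpu / fail2ban_cpu) for each snapshot."""
    return [(label, cpu_seconds(c) / cpu_seconds(f)) for label, c, f in snapshots]


for label, ratio in ratios(SNAPSHOTS):
    print(f"{label:>20}: crowdsec used {ratio:.2f}x the CPU time of fail2ban")
```

With the figures posted here, crowdsec is ahead only in the 4-minute snapshot; the ratio is roughly 3.6 after two hours and roughly 6.5 after 13 hours, so the gap widens over time rather than settling.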

Hi @Glandos,

To compare the two fairly, both need to run nearly the same parsers. So if you have a lot of collections installed in crowdsec, that may be the reason.
Could you please list your collections and parsers:
cscli collections list && cscli parsers list

cscli collections list && cscli parsers list

COLLECTIONS
──────────────────────────────────────────────────────────────────────────────────────────────────────────────
 Name                               📦 Status    Version  Local Path
──────────────────────────────────────────────────────────────────────────────────────────────────────────────
 crowdsecurity/base-http-scenarios  ✔️ enabled   0.6      /etc/crowdsec/collections/base-http-scenarios.yaml
 crowdsecurity/http-cve             ✔️ enabled   1.9      /etc/crowdsec/collections/http-cve.yaml
 crowdsecurity/linux                ✔️ enabled   0.2      /etc/crowdsec/collections/linux.yaml
 crowdsecurity/nginx                ✔️ enabled   0.2      /etc/crowdsec/collections/nginx.yaml
 crowdsecurity/sshd                 ✔️ enabled   0.2      /etc/crowdsec/collections/sshd.yaml
──────────────────────────────────────────────────────────────────────────────────────────────────────────────

PARSERS
───────────────────────────────────────────────────────────────────────────────────────────────────────────────
 Name                            📦 Status    Version  Local Path
───────────────────────────────────────────────────────────────────────────────────────────────────────────────
 crowdsecurity/dateparse-enrich  ✔️ enabled   0.2      /etc/crowdsec/parsers/s02-enrich/dateparse-enrich.yaml
 crowdsecurity/http-logs         ✔️ enabled   1.1      /etc/crowdsec/parsers/s02-enrich/http-logs.yaml
 crowdsecurity/nginx-logs        ✔️ enabled   1.3      /etc/crowdsec/parsers/s01-parse/nginx-logs.yaml
 crowdsecurity/sshd-logs         ✔️ enabled   2.0      /etc/crowdsec/parsers/s01-parse/sshd-logs.yaml
 crowdsecurity/syslog-logs       ✔️ enabled   0.8      /etc/crowdsec/parsers/s00-raw/syslog-logs.yaml
 crowdsecurity/whitelists        ✔️ enabled   0.2      /etc/crowdsec/parsers/s02-enrich/whitelists.yaml
───────────────────────────────────────────────────────────────────────────────────────────────────────────────

fail2ban-client status
Status
|- Number of jail: 7
`- Jail list: dovecot, nginx-botsearch, postfix, postfix-ddos, postfix-sasl, recidive, sshd

As far as I understand, the major difference is the nginx jail in my fail2ban, which reads from journald as currently configured. But given the traffic on my web server (no more than 10 hits/minute), I think the difference is still too high.
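
To put "too high" into numbers, the CPU time can be divided by the uptime to get an average share of one core. This is only a back-of-the-envelope sketch using the 2h 14min / 2h 15min snapshot quoted earlier in the thread; `core_share` is a hypothetical helper, and the uptimes are rounded to the minute as systemd printed them.

```python
def core_share(cpu_s, wall_s):
    """Average share of one CPU core, in percent, over the service's uptime."""
    return 100.0 * cpu_s / wall_s


# Figures from the snapshot taken about two hours after the restart.
crowdsec_share = core_share(7 * 60 + 56.071, (2 * 60 + 14) * 60)  # 7min 56.071s over 2h 14min
fail2ban_share = core_share(2 * 60 + 13.405, (2 * 60 + 15) * 60)  # 2min 13.405s over 2h 15min

print(f"crowdsec: {crowdsec_share:.1f}% of one core on average")
print(f"fail2ban: {fail2ban_share:.1f}% of one core on average")
```

That works out to about 5.9% of one core for crowdsec against about 1.6% for fail2ban over the same period, averaged over the whole uptime including startup.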

Thanks. Your crowdsec collections look fine (only what you need).

If it’s possible and if we want to understand what’s going on with the CPU, can you please give us a CPU pprof: