CrowdSec event quantity calculation

Hello,

I have written a custom parser and scenario, but I'm seeing an issue: the real event count (2) is smaller than the capacity configured for the scenario (3), yet when the notification is sent it reports 4 events while including only 2 of them.

Attaching the parser, scenario and webhook contents for your reference:

Parser `parsers/s01-parse/endlessh-logs.yaml` (it is customized for my fork of endlessh)
onsuccess: next_stage
filter: "evt.Parsed.program == 'endlessh'"
name: crowdsecurity/endlessh-logs
description: "Parse Endlessh logs"
pattern_syntax:
  ENDLESSH_ACCEPT_V4: "%{TIMESTAMP_ISO8601:timestamp} ACCEPT srcaddr=(::ffff:)?%{IPV4:source_ip} srcport=%{INT:src_port} dstaddr=(::ffff:)?%{IPV4:destination_ip} dstport=%{INT:dest_port} "
  ENDLESSH_ACCEPT_V6: "%{TIMESTAMP_ISO8601:timestamp} ACCEPT srcaddr=%{IPV6:source_ip} srcport=%{INT:src_port} dstaddr=%{IPV6:destination_ip} dstport=%{INT:dest_port} "
nodes:
  - grok:
      name: "ENDLESSH_ACCEPT_V4"
      apply_on: Line.Raw
      statics:
        - meta: log_type
          value: endlessh_accept
  - grok:
      name: "ENDLESSH_ACCEPT_V6"
      apply_on: Line.Raw
      statics:
        - meta: log_type
          value: endlessh_accept
statics:
  - meta: service
    value: endlessh
  - target: evt.StrTime
    expression: evt.Parsed.timestamp
  - meta: source_ip
    expression: "evt.Parsed.source_ip"
  - meta: destination_ip
    expression: "evt.Parsed.destination_ip"
  - meta: source_port
    expression: "evt.Parsed.src_port"
  - meta: destination_port
    expression: "evt.Parsed.dest_port"
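To sanity-check what the `ENDLESSH_ACCEPT_V4` grok pattern extracts, here is a rough Python regex equivalent (the sample log line and the regex itself are my own approximations shaped to match the grok pattern, not CrowdSec's actual grok engine):

```python
import re

# Approximate Python translation of the ENDLESSH_ACCEPT_V4 grok pattern.
# Named groups mirror the grok capture names used in the parser.
ACCEPT_V4 = re.compile(
    r"(?P<timestamp>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?Z?) ACCEPT "
    r"srcaddr=(?:::ffff:)?(?P<source_ip>\d+\.\d+\.\d+\.\d+) "
    r"srcport=(?P<src_port>\d+) "
    r"dstaddr=(?:::ffff:)?(?P<destination_ip>\d+\.\d+\.\d+\.\d+) "
    r"dstport=(?P<dest_port>\d+) "
)

# Hypothetical sample line in the shape the pattern expects (note the
# IPv4-mapped "::ffff:" prefix and the trailing space).
line = ("2023-07-02T16:59:50.595Z ACCEPT srcaddr=::ffff:203.0.113.7 "
        "srcport=56062 dstaddr=::ffff:192.0.2.236 dstport=23 ")
m = ACCEPT_V4.match(line)
print(m.groupdict())
```

This also makes the V6 pattern's copy-paste risk visible: if `dstaddr` captures into `source_ip` instead of `destination_ip`, the second capture silently overwrites the first.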
Scenario `scenarios/endlessh.yaml`
type: leaky
leakspeed: "24h"
capacity: 3
cache_size: 1
name: custom/scenario
description: "Detect all connections caught by Endlessh"
filter: "evt.Meta.log_type == 'endlessh_accept'"
groupby: evt.Meta.source_ip
distinct: evt.Meta.destination_ip + ":" + evt.Meta.destination_port
blackhole: 5m
reprocess: true
labels:
  service: endlessh
  type: scan
  remediation: true
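For context, the `groupby` + `distinct` combination in the scenario means events are bucketed per source IP, and within a bucket only events with a previously unseen `destination_ip:destination_port` key are counted. A minimal Python sketch of that selection logic (my own illustration, not CrowdSec's implementation):

```python
from collections import defaultdict

def count_distinct(events):
    """Count events per source_ip, deduplicated on destination_ip:port."""
    seen = defaultdict(set)      # source_ip -> distinct keys already counted
    counted = defaultdict(int)   # source_ip -> number of counted events
    for evt in events:
        key = f'{evt["destination_ip"]}:{evt["destination_port"]}'
        if key not in seen[evt["source_ip"]]:
            seen[evt["source_ip"]].add(key)
            counted[evt["source_ip"]] += 1
    return dict(counted)

# Hypothetical events mirroring the webhook payload below.
events = [
    {"source_ip": "203.0.113.7", "destination_ip": "192.0.2.236", "destination_port": "23"},
    {"source_ip": "203.0.113.7", "destination_ip": "192.0.2.236", "destination_port": "23"},  # duplicate, skipped
    {"source_ip": "203.0.113.7", "destination_ip": "192.0.2.139", "destination_port": "23"},
]
print(count_distinct(events))  # {'203.0.113.7': 2}
```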
Webhook body (some of the details are redacted for privacy reasons)
[
  {
    "capacity": 3,
    "decisions": [
      {
        "duration": "36h",
        "origin": "crowdsec",
        "scenario": "custom/scenario",
        "scope": "Ip",
        "type": "ban",
        "uuid": "1d146c5f-f0b2-4c44-a297-0098e0546eaa",
        "value": "ATTACKER"
      }
    ],
    "events": [
      {
        "meta": [
          {
            "key": "ASNNumber",
            "value": "AS_NUMBER"
          },
          {
            "key": "ASNOrg",
            "value": "AS_NAME"
          },
          {
            "key": "IsInEU",
            "value": "true"
          },
          {
            "key": "IsoCode",
            "value": "DE"
          },
          {
            "key": "SourceRange",
            "value": "ATTACKER_CIDR"
          },
          {
            "key": "datasource_path",
            "value": "journalctl-_SYSTEMD_UNIT=endlessh@23.service"
          },
          {
            "key": "datasource_type",
            "value": "journalctl"
          },
          {
            "key": "destination_ip",
            "value": "VICTIM_PREFIX.236"
          },
          {
            "key": "destination_port",
            "value": "23"
          },
          {
            "key": "log_type",
            "value": "endlessh_accept"
          },
          {
            "key": "machine",
            "value": "VICTIM_MACHINE"
          },
          {
            "key": "service",
            "value": "endlessh"
          },
          {
            "key": "source_ip",
            "value": "ATTACKER"
          },
          {
            "key": "source_port",
            "value": "56062"
          },
          {
            "key": "timestamp",
            "value": "2023-07-02T16:59:50.595Z"
          }
        ],
        "timestamp": "2023-07-02T16:59:50.595Z"
      },
      {
        "meta": [
          {
            "key": "ASNNumber",
            "value": "AS_NUMBER"
          },
          {
            "key": "ASNOrg",
            "value": "AS_NAME"
          },
          {
            "key": "IsInEU",
            "value": "true"
          },
          {
            "key": "IsoCode",
            "value": "DE"
          },
          {
            "key": "SourceRange",
            "value": "ATTACKER_CIDR"
          },
          {
            "key": "datasource_path",
            "value": "journalctl-_SYSTEMD_UNIT=endlessh@23.service"
          },
          {
            "key": "datasource_type",
            "value": "journalctl"
          },
          {
            "key": "destination_ip",
            "value": "VICTIM_PREFIX.139"
          },
          {
            "key": "destination_port",
            "value": "23"
          },
          {
            "key": "log_type",
            "value": "endlessh_accept"
          },
          {
            "key": "machine",
            "value": "VICTIM_MACHINE"
          },
          {
            "key": "service",
            "value": "endlessh"
          },
          {
            "key": "source_ip",
            "value": "ATTACKER"
          },
          {
            "key": "source_port",
            "value": "59734"
          },
          {
            "key": "timestamp",
            "value": "2023-07-02T17:00:04.379Z"
          }
        ],
        "timestamp": "2023-07-02T17:00:04.379Z"
      }
    ],
    "events_count": 4,
    "labels": null,
    "leakspeed": "24h0m0s",
    "machine_id": "e9f7f4e827034fd59271a062e2d8a2d3AK3KwObeypixPPUb",
    "message": "Ip ATTACKER performed 'custom/scenario' (4 events over 3m28.53867533s) at 2023-07-02 17:00:04.423655844 +0000 UTC",
    "meta": [
      {
        "key": "destination_ip",
        "value": "[\"VICTIM_PREFIX.236\",\"VICTIM_PREFIX.139\"]"
      },
      {
        "key": "destination_port",
        "value": "[\"23\"]"
      }
    ],
    "remediation": true,
    "scenario": "custom/scenario",
    "scenario_hash": "",
    "scenario_version": "",
    "simulated": false,
    "source": {
      "as_name": "AS_NAME",
      "as_number": "AS_NUMBER",
      "cn": "DE",
      "ip": "ATTACKER",
      "latitude": 50.1103,
      "longitude": 8.7147,
      "range": "ATTACKER_CIDR",
      "scope": "Ip",
      "value": "ATTACKER"
    },
    "start_at": "2023-07-02T16:56:35.88498122Z",
    "stop_at": "2023-07-02T17:00:04.42365655Z",
    "uuid": "2978a17b-5d27-49aa-8d4b-f563fe159bef"
  }
]

Thanks for your help!

What happens if you don't set a `cache_size`? Setting it splits the bucket into chunks, and when the bucket overflows the alert may contain the current chunk plus the overflow chunk's events.

If unsetting this value gives the desired output, we know which part of the codebase to look in for the bug.

However, 4 events is correct, because the capacity of the bucket is the top-level limit, meaning it needs 4 events to overflow. If you want it to overflow at 3 events, the capacity should be 2. I am more concerned about the lack of events in the alert object.
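The capacity semantics can be sketched like this (a minimal illustration of leaky-bucket overflow, not CrowdSec's actual code): each event adds one token, and the bucket overflows only once the token count *exceeds* capacity.

```python
def events_to_overflow(capacity):
    """Return how many events a leaky bucket of the given capacity
    needs before it overflows (token count exceeds capacity)."""
    tokens = 0
    events = 0
    while tokens <= capacity:  # overflow happens strictly above capacity
        tokens += 1
        events += 1
    return events

print(events_to_overflow(3))  # 4 -> a capacity-3 bucket overflows on the 4th event
print(events_to_overflow(2))  # 3 -> use capacity 2 to overflow at 3 events
```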

I have spoken to the team, and this is the intended behaviour for `cache_size`.

Since it increments the underlying bucket counter without retaining the events themselves, those event details are lost. However, since your bucket has a low capacity, not having them should not really cause issues.
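The eviction behaviour described above can be sketched as follows (a hedged illustration using a bounded deque, not CrowdSec's implementation): the event counter keeps increasing, but only the most recent `cache_size` events are retained for the alert.

```python
from collections import deque

class Bucket:
    """Toy bucket: counts every event but caches only the newest ones."""
    def __init__(self, cache_size):
        self.events_count = 0
        self.cache = deque(maxlen=cache_size)  # oldest entry evicted when full

    def push(self, event):
        self.events_count += 1  # counter grows unbounded
        self.cache.append(event)  # cache stays at most cache_size long

b = Bucket(cache_size=1)
for e in ["evt1", "evt2", "evt3", "evt4"]:
    b.push(e)
print(b.events_count, list(b.cache))  # 4 ['evt4']
```

This matches the symptom in the report: `events_count` says 4, but only the cached events make it into the alert's `events` array.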

I'll add an information note about this in the documentation to make it clearer.

Added a warning about this to the documentation.