How to block by http referer

Hi,
new user here.

I’ve installed crowdsec all ok.
Very easy to get on board. All works fine out of the box.

But I’m having a hard time with some basic tasks - like adding a custom collection, scenario, parser.

Question:
How to block by http referer mentioned in apache error log and/or access log ?

Here is my current attempt:

  1. Created custom collection file:
    /etc/crowdsec/collections/apache2-custom.yaml
name: custom/apache2-custom.yaml
description: "custom apache2 scenarios"
author: custom
parsers:
  - crowdsecurity/http-logs
#generic post-parsing of http stuff
  - crowdsecurity/apache2-logs
scenarios:
  - custom/http-referer-abc-com
tags:
  - linux
  - apache2
  - crawl
  - scan
  2. Created this scenario file:

/etc/crowdsec/scenarios/http-referer-abc-com.yaml

type: trigger
format: 2.0
#debug: true
name: custom/http-referer-abc-com
description: "block hosts that cause referer: abc.com in our logs"
filter: |
  evt.Meta.log_type in ["http_access-log", "http_error-log"] and
    evt.Parsed.message contains ",referer: abc.com"
groupby: "evt.Meta.source_ip"
# blackhole: 2m
labels:
  type: scan
  remediation: true

Example apache log lines are:

  • error.log example:
[Mon Oct 25 04:17:18.547111 2021] [authz_core:error] [pid 17689] [client 12.34.45.56:60815] AH01630: client denied by server configuration: /path/to/website.com/web/.well-known, referer: abc.com
  • access.log example:
website.com:443 12.34.45.56 - - [25/Oct/2021:11:05:52 +0200] "GET /web/.well-known HTTP/1.1" 403 6963 "abc.com" "User-agent-string"
  • abc.com is the referer I want to block
  • 12.34.45.56 is the IP address of the “attacker”

Any help appreciated.

PS. There are no docs about the evt.Parsed structure. My guess is that evt.Parsed.message is the whole log line, but I’m not sure.

More info:

I restart crowdsec via:
systemctl restart crowdsec

Which is better - restart or reload - each time I make changes to the custom yaml files?

The ‘cscli metrics’ output seems fine - looks like the apache logs are getting parsed OK. I suppose my error is in the evt.Parsed.message part.

Or I’m declaring the wrong parsers in my custom collection yaml.

Hello @R67 ,

PS. There are no docs about the evt.Parsed structure. My guess is that evt.Parsed.message is the whole log line, but I’m not sure.

Yes it is, and your filter should work if “referer: abc.com” is present in your log line.
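The substring condition can be checked offline against the two sample lines quoted above. The sketch below is only a rough check: it uses Python's `in` as a stand-in for the expr `contains` operator, and it assumes evt.Parsed.message holds the full raw line. Under those assumptions, the needle from the first filter (`,referer: abc.com`, no space after the comma) does not occur in either sample, because the error.log line has a space after the comma and the access.log line carries the referer as a bare quoted field.

```python
# Sample lines copied from this thread.
ERROR_LINE = (
    "[Mon Oct 25 04:17:18.547111 2021] [authz_core:error] [pid 17689] "
    "[client 12.34.45.56:60815] AH01630: client denied by server "
    "configuration: /path/to/website.com/web/.well-known, referer: abc.com"
)
ACCESS_LINE = (
    'website.com:443 12.34.45.56 - - [25/Oct/2021:11:05:52 +0200] '
    '"GET /web/.well-known HTTP/1.1" 403 6963 "abc.com" "User-agent-string"'
)


def contains(haystack: str, needle: str) -> bool:
    """Stand-in for the expr `contains` operator (plain substring test)."""
    return needle in haystack


original_needle = ",referer: abc.com"  # exactly as written in the first filter

checks = {
    "error.log / original needle": contains(ERROR_LINE, original_needle),
    "access.log / original needle": contains(ACCESS_LINE, original_needle),
    "error.log / ', referer: abc.com'": contains(ERROR_LINE, ", referer: abc.com"),
    "access.log / 'abc.com'": contains(ACCESS_LINE, "abc.com"),
}

for label, result in checks.items():
    print(f"{label}: {result}")
```

A shorter needle such as `abc.com` matches both samples, at the cost of also matching the string anywhere else in the line (for example in the request path).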

Can you try something like this instead:

  evt.Meta.log_type in ["http_access-log", "http_error-log"] and
    evt.Parsed.referrer contains "abc.com"

If this doesn’t work, can you provide an example of a log line with a referer please ?

About the reload/restart question, reloading crowdsec when you modify a configuration file should be enough.

Thanks, @alteredCoder, using these worked

    evt.Parsed.referrer contains "abc.com"

# this also worked:

    evt.Parsed.message contains "abc.com"

This works because the access.log lines match.

Example access.log line:

website.com:443 12.34.45.56 - - [25/Oct/2021:11:05:52 +0200] "GET /web/.well-known HTTP/1.1" 403 6963 "abc.com" "User-agent-string"

Example error.log line:

[Mon Oct 25 04:17:18.547111 2021] [authz_core:error] [pid 17689] [client 12.34.45.56:60815] AH01630: client denied by server configuration: /path/to/website.com/web/.well-known, referer: abc.com

I just found out (using cscli metrics) that my error log is not parsed correctly, because I use a non-default format for the error log in my apache config and crowdsec can’t parse it.

So questions:

  1. how should I tweak, override the default error log parser?
    It seems it is misconfigured and hinders all my default crowdsec collections/scenarios. And the custom one(s) too.
    a) there is no easy/clean way to do that, so I should revert my apache config to the defaults and then configure a 2nd, custom error log for my needs
    b) copy the parser files and edit the collections/scenarios that reference them
    c) edit the parser files

b) and c) seem like bad ideas - hard to maintain in the future
And yet, if I go with a) - in general not having a way to fine-tune the parsers kinda sucks.

  2. Maybe start a new topic in the forum about that one ^ ?

  3. Where can I read some docs/howto-s about creating custom parsers, tweaking/overriding the default ones ?

Hello @R67,

how should I tweak, override the default error log parser?

I think the best way is to create a new parser for that:
1 - create your parser in /etc/crowdsec/parsers/s01-parse/apache_custom_error.yaml
2 - You can use this skeleton for your parser:

filter: "evt.Parsed.program startsWith 'apache2'"
onsuccess: next_stage
name: r67/apache2-custom-error-logs
description: "Parse Apache2 custom error logs"
nodes:
  - grok:
      pattern: <YOUR CUSTOM GROK HERE>
      apply_on: message
      # these ones apply for both grok patterns
      statics:
        - meta: log_type
          value: http_error-log
        - target: evt.StrTime
          expression: evt.Parsed.timestamp
        - meta: service
          value: http
        - meta: source_ip
          expression: evt.Parsed.clientip
    onsuccess: next_stage

and adapt it to your needs (don’t forget to replace the grok pattern with one for your custom format and to add/remove statics as needed).
3 - Then restart crowdsec; your custom error logs should then be parsed.
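Before writing the grok pattern, it can help to prototype the field extraction as a plain regular expression. The sketch below is only an illustration for the error.log sample quoted earlier in this thread, not for the custom format, which is not shown here. The pattern and the group names other than `timestamp`, `clientip` and `referrer` are hypothetical, and Python `re` named groups still have to be translated into grok's `%{PATTERN:name}` form by hand.

```python
import re

# Hypothetical prototype for the error.log sample in this thread.
# A custom ErrorLogFormat will need different fields; IPv6 clients
# are not handled by the simple clientip group below.
ERROR_RE = re.compile(
    r"\[(?P<timestamp>[^\]]+)\] "
    r"\[(?P<module>[^:\]]*):(?P<level>[^\]]+)\] "
    r"\[pid (?P<pid>\d+)\] "
    r"\[client (?P<clientip>[^:\]]+):(?P<clientport>\d+)\] "
    r"(?P<message>.*?)"
    r"(?:, referer: (?P<referrer>\S+))?$"
)

sample = (
    "[Mon Oct 25 04:17:18.547111 2021] [authz_core:error] [pid 17689] "
    "[client 12.34.45.56:60815] AH01630: client denied by server "
    "configuration: /path/to/website.com/web/.well-known, referer: abc.com"
)

match = ERROR_RE.match(sample)
fields = match.groupdict() if match else {}

# These three are the fields the skeleton's statics and the scenario rely on.
for key in ("timestamp", "clientip", "referrer"):
    print(key, "=", fields.get(key))
```

The group names `timestamp` and `clientip` are chosen to line up with the `evt.Parsed.timestamp` and `evt.Parsed.clientip` expressions in the skeleton above, and `referrer` with the field used in the scenario filter.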

  2. Maybe start a new topic in the forum about that one ^ ?

Yes, this can be a good idea if it is not clear what to do!

  3. Where can I read some docs/howto-s about creating custom parsers, tweaking/overriding the default ones ?

Here is the documentation on creating a new parser: Creating parsers | CrowdSec

Thanks a lot!

Final question:
I’ve decided to use a local file for the offending referrers - so I created a local file:
/var/lib/crowdsec/data/bad_referers_custom.txt

I put in my scenario:

description: "block hosts using offending referers"
filter: |
  evt.Meta.log_type == 'http_access-log' and any(File('bad_referers_custom.txt'), {evt.Parsed.referrer == #})"
data:
  - source_url: "…"
    dest_file: bad_referers_custom.txt
    type: string

I don’t want to host the data file at some public URL. How can I use a local file in the filter?

Is this not working? Do you get any errors?

The File() function takes a file path relative to the /var/lib/crowdsec/data/ folder. So your filter should work.
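As a rough mental model of what the filter does, the sketch below mimics `any(File('bad_referers_custom.txt'), {evt.Parsed.referrer == #})` in Python. It is not CrowdSec's implementation: the helper names are made up, a temporary directory stands in for /var/lib/crowdsec/data/, and skipping blank lines is an assumption of this model.

```python
import tempfile
from pathlib import Path


def load_data_file(data_dir: Path, name: str) -> list[str]:
    """Rough model of File(): resolve `name` under data_dir, return its lines.

    Assumption of this sketch: surrounding whitespace and blank lines are dropped.
    """
    text = (data_dir / name).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def referrer_is_listed(referrer: str, entries: list[str]) -> bool:
    """Model of any(File(...), {evt.Parsed.referrer == #}): exact match on any line."""
    return any(referrer == entry for entry in entries)


with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp)  # stands in for /var/lib/crowdsec/data/
    (data_dir / "bad_referers_custom.txt").write_text(
        "abc.com\nspam.example\n", encoding="utf-8"
    )
    entries = load_data_file(data_dir, "bad_referers_custom.txt")

print(entries)
print(referrer_is_listed("abc.com", entries))
print(referrer_is_listed("good.example", entries))
```

The point of the model is that the file name is resolved relative to one data directory, and that each line of the file is compared against the referrer in turn.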

At first I had a non-matching double quote, and that led to a failing reload/restart.
Then I removed the data: section and now the file looks like:

type: trigger
format: 2.0
#debug: true
name: r2/http-bad-referers-custom
description: "block hosts using offending referers"
filter: |
  evt.Meta.log_type == 'http_access-log' and
  any(File('bad_referers_custom.txt'), {evt.Parsed.referrer contains #})
groupby: "evt.Meta.source_ip"
# blackhole: 2m
labels:
  service: http
  type: scan
  remediation: true

No syntax errors on reload/restart but it does not block my test log entries.

This now works:

filter: |
  evt.Meta.log_type == 'http_access-log' and
  any(File('bad_referers_custom.txt'), {evt.Parsed.referrer contains #})
data:
  - source_url: "..."
    dest_file: bad_referers_custom.txt
    type: string

… but it uses “contains”, and what I want is an exact match, something like == or “matches”.
So far I can’t make it work with exact matching though.

Tried adding quotes around #, tried adding quotes in the .txt file itself. No luck.
Tried using expr’s matches with something like ‘^#$’ - did not work.

docs here: expr/Language-Definition.md at master · antonmedv/expr · GitHub

The problem is that the referer is captured inside double quotes, like "abc.com". I will fix the parser so that it works with ==. I will keep you posted when the parser is fixed.
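The effect of those captured quotes can be reproduced with plain string comparisons. This is only a sketch in Python, assuming the captured value is literally `"abc.com"` including the quote characters, as described above. It shows why a substring test succeeded while an exact comparison did not.

```python
captured = '"abc.com"'  # referrer as captured before the parser fix (quotes included)
entry = "abc.com"       # one line from bad_referers_custom.txt

exact = captured == entry                          # models `==`
substring = entry in captured                      # models `contains`
exact_after_strip = captured.strip('"') == entry   # what a fixed capture would allow

print("exact match:", exact)
print("substring match:", substring)
print("exact match once quotes are stripped:", exact_after_strip)
```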

Hello @R67,

The fix has been merged. Can you upgrade the apache2-logs parser (sudo cscli hub update and sudo cscli parsers upgrade crowdsecurity/apache2-logs ) and retry with the == please?