CrowdSec is blocking Discord

Hey,

Today I tried to debug an issue where the Discord bot responsible for these preview tiles is unable to connect to my server.

After disabling the CrowdSec bouncer and tracking the traffic, I saw the following requests from the Discord bot:

35.227.62.178 - - [29/Jun/2023:15:07:26 +0000] "GET /my/url HTTP/1.1" 200 34028 "-" "Mozilla/5.0 (compatible; Discordbot/2.0; +https://discordapp.com)"
34.148.205.167 - - [29/Jun/2023:15:07:27 +0000] "GET /my/url/image.png HTTP/1.1" 200 136117 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 11.6; rv:92.0) Gecko/20100101 Firefox/92.0"
34.148.151.190 - - [29/Jun/2023:15:07:28 +0000] "GET /my/url/image.png HTTP/1.1" 200 136117 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 11.6; rv:92.0) Gecko/20100101 Firefox/92.0"

(Yes, the 2nd and 3rd requests are also from Discord)

The first request is fetching the metadata of the page, like title and meta tags.
The 2nd and 3rd requests are getting the image set in the og:image meta tag.
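For anyone who wants to check their own access log for the same pattern, here is a minimal sketch that parses the three lines above and shows which ones announce themselves as Discordbot. It assumes the combined log format shown above; the names `LOG_PATTERN`, `parse_line` and `announces_discordbot` are made up for this example.

```python
import re

# Combined log format: ip, identity, user, [time], "request", status, size,
# "referer", "user agent". This pattern is an assumption based on the lines above.
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)

def parse_line(line):
    """Return the fields of one access-log line as a dict, or None if it does not match."""
    match = LOG_PATTERN.match(line.strip())
    return match.groupdict() if match else None

def announces_discordbot(entry):
    """True if the user agent string contains the Discordbot token."""
    return "Discordbot" in entry["agent"]

SAMPLE = [
    '35.227.62.178 - - [29/Jun/2023:15:07:26 +0000] "GET /my/url HTTP/1.1" 200 34028 "-" "Mozilla/5.0 (compatible; Discordbot/2.0; +https://discordapp.com)"',
    '34.148.205.167 - - [29/Jun/2023:15:07:27 +0000] "GET /my/url/image.png HTTP/1.1" 200 136117 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 11.6; rv:92.0) Gecko/20100101 Firefox/92.0"',
    '34.148.151.190 - - [29/Jun/2023:15:07:28 +0000] "GET /my/url/image.png HTTP/1.1" 200 136117 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 11.6; rv:92.0) Gecko/20100101 Firefox/92.0"',
]

if __name__ == "__main__":
    for line in SAMPLE:
        entry = parse_line(line)
        print(entry["ip"], entry["request"], announces_discordbot(entry))
```

Only the first request carries the Discordbot user agent; the two image requests look like a regular Firefox browser, so the user agent alone does not identify all of the traffic.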

Checking the IPs in the CTI revealed that the IP address 35.227.62.178 is listed in the CrowdSec Community Blocklist. (https://app.crowdsec.net/cti/35.227.62.178)

What do I have to do to allow Discord to access my website?
They probably use more than this IP address, so simply whitelisting this one will not solve the issue.

Hi,

This was discussed some time ago on Discord by some users. Unfortunately, there is no unique factor to these requests: no unique user agent, and the reverse DNS isn't from Discord but some AWS FQDN.

However, I do note that there is a unique user agent, so you could whitelist it locally, but the issue stems from it being in the community blocklist.

EDIT: It seems I need to drink more coffee before replying in the morning, because I just told you what you already knew :laughing:

Yeah, so because it's in the community blocklist, there is no way to get around it, as the CAPI whitelist is based only on IP.

Actually, they have an xxx.ptr.discord.com rdns

Tbh, I really think the IP shouldn’t be blacklisted in the first place.

So, how can we fix that?

We should be flagging this in our consensus enricher, as we are currently not marking it as a crawler. Furthermore, we at CrowdSec should have a conversation about whether IPs marked this way are whitelisted from the blocklist (this does not prevent local bans).

Here is a postoverflow you can use to prevent local bans. However, at this moment in time, because the IP is in the blocklist, you would have to manually remove it and hard-code a whitelist as per the instructions:

Local whitelisting

name: crowdsecurity/discord-crawler
description: Discord PTR whitelist
whitelist:
  reason: Discord PTR domain
  expression:
    # Discord PTR
    - evt.Enriched.reverse_dns endsWith '.ptr.discord.com.'

This assumes you have installed the rDNS enricher.
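To check by hand what the expression above would match, the same suffix test can be sketched in Python. This is only an illustration of the `endsWith '.ptr.discord.com.'` check, not how CrowdSec evaluates it internally; the helper names `is_discord_ptr` and `reverse_dns` are hypothetical.

```python
import socket

# Suffix used by the whitelist expression above (note the trailing dot).
DISCORD_PTR_SUFFIX = ".ptr.discord.com."

def is_discord_ptr(hostname):
    """Return True if a reverse DNS name ends with the Discord PTR suffix."""
    fqdn = hostname.lower()
    # Resolvers may return the name with or without a trailing dot,
    # so normalise to the fully qualified form before comparing.
    if not fqdn.endswith("."):
        fqdn += "."
    return fqdn.endswith(DISCORD_PTR_SUFFIX)

def reverse_dns(ip):
    """Look up the PTR name for an IP address, or return None if there is none."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None

if __name__ == "__main__":
    name = reverse_dns("35.227.62.178")  # performs a live DNS lookup
    print(name, is_discord_ptr(name) if name else False)
```

A PTR record on its own is controlled by whoever owns the IP range, so a stricter check would also resolve the returned name forward and confirm that it points back to the same IP.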

Also, I have just started adding a collection to make this easier for users who want to achieve it.

It has been merged and is available via the hub.

Thanks a lot for your quick work!

I guess it would be a good idea to exclude them from the community blocklist.
It might be a bot, but I would say it is a good one, which usually enriches Discord communities.

If someone wants to block the bot, it probably makes more sense for them to "opt in" by blocking the user agent.

For a user to whitelist them afterwards is more or less impossible because Discord did not publish a list of IPs used by their bots.
And dynamically reacting to them by using the rDNS enricher afterwards is also not possible because the request never comes through as it is already blacklisted.

So, it relies on how you build the community blocklist and use the enriched rDNS information to react appropriately.
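The ordering problem described above can be written down as a toy model. This is a deliberately simplified sketch and not CrowdSec's actual pipeline: `handle_request` and its parameters are invented for illustration, and the host name in the usage lines is a placeholder.

```python
DISCORD_PTR_SUFFIX = ".ptr.discord.com."

def handle_request(ip, reverse_dns, community_blocklist, triggers_scenario):
    """Toy model of the order in which a request is evaluated."""
    # Step 1: the bouncer checks existing decisions first. A blocklisted IP is
    # rejected here, so the request is never served and never enriched with rDNS.
    if ip in community_blocklist:
        return "blocked"
    # Step 2: only requests that got through can be enriched, so the rDNS
    # whitelist can only prevent a *local* ban.
    if triggers_scenario and not reverse_dns.endswith(DISCORD_PTR_SUFFIX):
        return "served, then banned locally"
    return "served"

if __name__ == "__main__":
    blocklist = {"35.227.62.178"}
    # Blocklisted IP: the whitelist never gets a chance to run.
    print(handle_request("35.227.62.178", "host.ptr.discord.com.", blocklist, False))
    # Same IP without the blocklist entry: the whitelist prevents a local ban.
    print(handle_request("35.227.62.178", "host.ptr.discord.com.", set(), True))
```

In this model the first call returns "blocked" regardless of the reverse DNS name, which is why the fix has to happen where the community blocklist is built.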

Exactly, so you can use the new collection to prevent local bans. However, we need to enrich it on our side to make a decision that prevents these from going into the community blocklist.

We have a meeting scheduled to discuss this internally next week.

Hey @iiAmLoz :wave:

Do you have an update on this?

We are still investigating it at the moment. We will most likely have more information in the following weeks. Do not expect as fast a turnaround, as we have lots of tests to run to ensure the integrity of the consensus engine.