http-bad-user-agent false positive for Censys scans

Hi community, I added CrowdSec a month ago to my self-hosted stack with Traefik, Nextcloud and a few other services. CrowdSec is only configured to parse the Traefik reverse proxy log. I’m seeing many alerts (~80% of all alerts) generated by

crowdsecurity/http-bad-user-agent for 398324 CENSYS-ARIN-01 (or 398324 CENSYS-ARIN-02).

While at first glance the alerts seem valid, Censys is a security research project that frequently scans the whole internet. According to their description, which matches my logs, they don’t do any harm and perform very limited scans initiated from well-documented IP sources without any visible bad intent.

For me, creating “real” alerts for “known good” actors is meaningless and adds a lot of noise that covers up real attacks.

I think the following valid reactions exist to address this problem:

  • blocking Censys IP addresses at the ingress level, preventing them from reaching the application
  • creating a whitelist to mitigate (IMO) false positives, or adding their IPs to a known-good list
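
To make the second option concrete, here is a minimal Python sketch of the check a whitelist boils down to: does the alert’s source IP fall inside a range I consider known-good? The ranges below are reserved documentation networks used as placeholders, not real Censys ranges, and `is_known_scanner` is a hypothetical helper name. In CrowdSec itself this would be handled by a whitelist rather than custom code.

```python
import ipaddress

# Placeholder ranges (reserved documentation networks), NOT real Censys ranges.
# Replace them with the ranges the scanner operator publishes.
KNOWN_SCANNER_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def is_known_scanner(ip: str) -> bool:
    """Return True if ip falls inside any of the configured ranges."""
    addr = ipaddress.ip_address(ip)
    # Only compare against networks of the same IP version.
    return any(
        addr.version == net.version and addr in net
        for net in KNOWN_SCANNER_RANGES
    )


if __name__ == "__main__":
    for ip in ("192.0.2.17", "198.51.100.5", "2001:db8::1"):
        print(ip, is_known_scanner(ip))
```

The same check works for either reaction: a match can be dropped as noise (whitelist) or blocked outright (ingress rule).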

What is your opinion? Am I missing something on this topic?

So even though I work for CrowdSec, I’ll give you my personal opinion.

Censys is a security research company. They allow pretty much anyone to search their data lake using a wide array of data points. The reason I point this out: let’s say you run a specific application that exposes an identifier and a nasty CVE gets released. Somebody can then quickly find your exposed service by querying Censys. (Yes, there are others out there that do this too, but I block them as well.)

Now I know, “but anyone can just scan the whole internet in 10 minutes using serverless”, so it’s no silver bullet, but I’d rather not have that information stored by them :laughing:

So you can do whatever you want; it’s your own personal take on security. If you want them to be able to search and probe your infrastructure, then create an IP whitelist, or you could even whitelist their ASN if you wanted.

Hi @iiAmLoz, thank you for the reply. The point is not about allowing or blocking, and also not about whether it adds any value to block this research company, knowing that many good and bad actors scan the internet all the time. Blocking known scanners is a fully valid approach.

My intention was to understand why local alerts are generated and to improve the visibility of “more interesting” scans by reducing the noise. I’m wondering why their IPs are not permanently on a ban list, assuming they scan the whole internet a few times a day and each CrowdSec system reports signals for these scans to CAPI. I would expect enough signals to arrive to block the scanner all the time. Right now I have a list of 393 alerts in total, and 309 of them belong to CENSYS, caught by crowdsecurity/http-bad-user-agent. IMHO this continuous noise could be addressed better, e.g. by permanently adding their scanner to some blocklist. Please give me a hint if I’m missing some important point.
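
For reference, this is roughly how the noise share can be computed from an alert export. The sketch assumes each alert is a dict with a `scenario` string and a `source` object carrying `as_name`; that layout is my assumption about the JSON export, so check it against your own output. The sample data is synthetic and built only to reproduce the 309 of 393 figure above; `noise_share` and `synthetic_alerts` are hypothetical names.

```python
def noise_share(alerts, scenario, as_prefix):
    """Return (matching, total, percent) for alerts matching both filters."""
    matching = 0
    for alert in alerts:
        # Tolerate a missing or null source / as_name field.
        as_name = (alert.get("source") or {}).get("as_name") or ""
        if alert.get("scenario") == scenario and as_name.startswith(as_prefix):
            matching += 1
    total = len(alerts)
    percent = 100.0 * matching / total if total else 0.0
    return matching, total, percent


def synthetic_alerts():
    """Build fake alerts matching the counts quoted in the post (309 of 393)."""
    censys = [
        {
            "scenario": "crowdsecurity/http-bad-user-agent",
            "source": {"as_name": "CENSYS-ARIN-01"},
        }
    ] * 309
    # The remaining 84 alerts use a made-up AS name for illustration.
    other = [
        {
            "scenario": "crowdsecurity/http-probing",
            "source": {"as_name": "EXAMPLE-AS"},
        }
    ] * 84
    return censys + other


if __name__ == "__main__":
    m, t, p = noise_share(
        synthetic_alerts(), "crowdsecurity/http-bad-user-agent", "CENSYS"
    )
    print(f"{m} of {t} alerts ({p:.1f}%)")  # 309 of 393 alerts (78.6%)
```

That 78.6% is where the “~80% of all alerts” estimate in my first post comes from.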

It’s because Censys reached out to us and followed our whitelist guidelines for security companies to prevent them from entering our community blocklists. However, we didn’t want to make the decision to whitelist them locally for users, so it’s up to the user whether to block their ranges with a manual decision (with a long duration) or simply whitelist them if they determine them to be “false positives”.

CTI Search (must be logged in) : CrowdSec Cyber Threat Intelligence | CrowdSec Console

However, as you can see, we still provide them via other blocklists; it’s just the community blocklist in particular that they are whitelisted from.

Thank you Laurence, really appreciated.

I’m fine handling this one on my own, but in general, don’t you find it counter-intuitive to keep the company/IP range off the regular community blocklist but keep blocking their user agent? For me the decision should be consistent: either treat them like everyone else and act according to the general rules, or exclude them from every blocklist.

for the reference