Nginx parser and scenario for malformed HTTP requests

Hi,

I’m trying to write a custom nginx parser and scenario to detect attacks that are completely invalid HTTP requests. See the sample below. I came across another post that handles many, but not all, of these types of requests. As the sample shows, not all of these requests start with a backslash. One is simply HELP and nothing else.

I thought about writing a very generic parser that would parse the entire request payload as just a single field, but then it would also catch valid requests.

Is there any way to write this as a lower-priority catch-all parser, so that it only activates if higher-priority parsers, like the standard nginx-logs parser, fail?

Sample:

172.105.110.211 - - [29/Jul/2023:10:26:06 -0700] "lv|'|'|VHJvamFuX0M0NkY2RTk=|'|'|MARK|'|'|user|'|'|2013-11-22|'|'||'|'|Win XP|'|'|No|'|'|0.6.4|'|'|..|'|'||'|'|[endof]" 400 157 "-" "-"
172.105.110.211 - - [29/Jul/2023:10:26:06 -0700] "Gh0st\xAD\x00\x00\x00\xE0\x00\x00\x00x\x9CKS``\x98\xC3\xC0\xC0\xC0\x06\xC4\x8C@\xBCQ\x96\x81\x81\x09H\x07\xA7\x16\x95e&\xA7*\x04$&g+\x182\x94\xF6\xB000\xAC\xA8rc\x00\x01\x11\xA0\x82\x1F\x5C`&\x83\xC7K7\x86\x19\xE5n\x0C9\x95n\x0C;\x84\x0F3\xAC\xE8sch\xA8^\xCF4'J\x97\xA9\x82\xE30\xC3\x91h]&\x90\xF8\xCE\x97S\xCBA4L?2=\xE1\xC4\x92\x86\x0B@\xF5`\x0CT\x1F\xAE\xAF]" 400 157 "-" "-"
172.105.110.211 - - [29/Jul/2023:10:26:06 -0700] "145.ll|'|'|SGFjS2VkX0Q0OTkwNjI3|'|'|WIN-JNAPIER0859|'|'|JNapier|'|'|19-02-01|'|'||'|'|Win 7 Professional SP1 x64|'|'|No|'|'|0.7d|'|'|..|'|'|AA==|'|'|112.inf|'|'|SGFjS2VkDQoxOTIuMTY4LjkyLjIyMjo1NTUyDQpEZXNrdG9wDQpjbGllbnRhLmV4ZQ0KRmFsc2UNCkZhbHNlDQpUcnVlDQpGYWxzZQ==12.act|'|'|AA==" 400 157 "-" "-"
172.105.110.211 - - [29/Jul/2023:10:26:06 -0700] "H\x00\x00\x00tj\xA8\x9E#D\x98+\xCA\xF0\xA7\xBBl\xC5\x19\xD7\x8D\xB6\x18\xEDJ\x1En\xC1\xF9xu[l\xF0E\x1D-j\xEC\xD4xL\xC9r\xC9\x15\x10u\xE0%\x86Rtg\x05fv\x86]%\xCC\x80\x0C\xE8\xCF\xAE\x00\xB5\xC0f\xC8\x8DD\xC5\x09\xF4" 400 157 "-" "-"
172.105.110.211 - - [29/Jul/2023:10:26:06 -0700] "HELP" 400 157 "-" "-"
172.105.110.211 - - [29/Jul/2023:10:26:06 -0700] "\x1B\x84\xD5\xB0]\xF4\xC4\x93\xC50\xC2X\x8C\xDA\xB1\xD7\xAC\xAFn\x1D\xE1\x1E\x1A3*\x85\xB7\x1D'\xB1\xC9k\xBF\xF0\xBC" 400 157 "-" "-"
172.105.110.211 - - [29/Jul/2023:10:26:07 -0700] "batman" 400 157 "-" "-"
172.105.110.211 - - [29/Jul/2023:10:26:07 -0700] "\x16\x03\x01\x00t\x01\x00\x00p\x03\x01YF}\xF6\x7F3\xD3\xA2'O\xAE\xB6\x041p\x87F\xE5\xA6\xA2\x18\xD1\x0B}\x0C\x9FO)u\xFE\xB1\xD9\x00\x00\x18\xC0\x14\xC0\x13\x005\x00/\xC0" 400 157 "-" "-"
172.105.110.211 - - [29/Jul/2023:10:26:07 -0700] "\x01\x82\x00\x00\x00\x01,\xEF:\xE7\x89\xFEH\xAF\xAC\xF8\xC1Pq\xD7\xC3\xE8S\x8A\xD6:\x17\xD93\x14o)S}\xBB\xBB\x97b\xCE\xB6\x0B\x9B\xB97>\x01\xCFv\xAE\xA0E\xB6D\xEA\xE1\xEAA\xC4\xDB\xEE\x09\xAC\xFB\xF0\x84)k\xBBc\x18]V\x85V\xC5_\x05T\x0Bt\xC4\x0B\xBE\xB5w\xBCM=[1\xE1\x06\x9C\xFD\xD3g^\xE3\x01\x9BK\xD7\xFC>\xFFk\xAF\x95\x99\xFB\xDBH\x90\x8BD\x88`k\x92\xF5e\x1C\xAA\xBB{_LP\x15\x85\x1E\x0E\x8F\xDD\xC5J" 400 157 "-" "-"
172.105.110.211 - - [29/Jul/2023:10:26:07 -0700] "\xBD\xFF\x9E\xFFE\xFF\x9E\xFF\xBD\xFF\x9E\xFF\xA4\xFF\x86\xFF\xC4\xFF\xBE\xFF\xC7\xFF\xDB\xFF\xEE\xFFx\x5Cd9\xFF\xED\xFF\xA4\xFF\x9D\xFF\xCF\xFF\xD8\xFF\xE5\xFF\x04\xFF\x12\xFF0\xFF\xB1\xFF\xBD\xFF\xE7\xFF\xE2\xFF\xDD\xFF\xDC\xFF\xDE\xFF\xC8\xFF\xCC\xFF\xBE\xFF\xF8\xFF&\xFF\x01\xFF\x0F\xFF\xF5\xFF\x06\xFF\xFF\xFF\xF7\xFF!\xFF\xDE\xFF\x02\xFF&\xFF\x0C\xFF\x01\xFF\xF5\xFF" 400 157 "-" "-"
172.105.110.211 - - [29/Jul/2023:10:26:07 -0700] "A\x00\x00\x00\x03fH\xBBd~\x8E\xFC\x94g\xD2\xDB\xFC\xEE\x8D\xFF\x98 \xB1\xBET\xA4\x9AZ\x9A\xA0?\x90\xE0\xF2t0\x5C\xED\xAE\xACX\x98\xDEJ\xEC\xF2\xC8\x9Cl\xD0\x9C\xC0\xE0\x98\x12\x8F\xE7\xCB\x8F\xA1\xA3\x16\xF1J\xA9<\xBD\xDA`" 400 157 "-" "-"
172.105.110.211 - - [29/Jul/2023:10:26:07 -0700] "\x09\x12;Bo3\xA2D\xFD\x01\x86si=\xAE\x12\xBB\xC6\x19\xFD\x1A:\xF3\x11\xC9\xAE\xDA<0\xBC8\x81\x9E\x00\x0F\xCAN\xFB\x05\xC6\xDE\xB7<oN\x01\xA2\x87\x82\xF5/\x8E\xED*\x1F\x0E\xB7C\x0C\xA04]\xBD\x80PVf\x1A\x11\xAF\xF5\xC8\xA3\x16+b\xB1\xD7" 400 157 "-" "-"

If you add the grok pattern as a node within the default nginx parser, the nodes are tried in the order they are listed. If a line matches the first or second node, it won't reach the last one, which would be your catch-all.
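
To illustrate the ordering, here is a rough sketch of how the nodes list might be laid out. The patterns are abridged with "..." and are not complete. It assumes the stock access-log node requires a method, a path, and an HTTP version inside the quoted request, which is why the malformed lines fall through it.

```yaml
# Sketch only: abridged layout, not the full upstream nginx-logs parser.
# The "..." parts stand for the rest of each pattern and must be filled in.
nodes:
  # 1. Existing access-log node. Assumed to require
  #    "<verb> <request> HTTP/<version>" inside the quotes,
  #    so a request like "HELP" does not match here.
  - grok:
      pattern: '... "%{WORD:verb} %{DATA:request} HTTP/%{NUMBER:http_version}" ...'
      apply_on: message
  # 2. Any other existing nodes (for example, error logs) stay where they are.
  # 3. Catch-all node goes last, so it only sees lines that
  #    none of the nodes above matched.
  - grok:
      pattern: '... "%{DATA:request}" ...'
      apply_on: message
```

Because the catch-all is last, valid requests are consumed by the first node and never reach it.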

That’s great. It worked perfectly, thanks! I added a node to detect generic requests and an accompanying scenario. Here’s the node, in case it helps anyone else:

  - grok:
      pattern: '(%{IPORHOST:target_fqdn} )?%{IPORHOST:remote_addr} - (%{NGUSER:remote_user})? \[%{HTTPDATE:time_local}\] "%{DATA:request}" %{NUMBER:status} %{NUMBER:body_bytes_sent} "%{NOTDQUOTE:http_referer}" "%{NOTDQUOTE:http_user_agent}"( %{NUMBER:request_length} %{NUMBER:request_time} \[%{DATA:proxy_upstream_name}\] \[%{DATA:proxy_alternative_upstream_name}\])?'
      apply_on: message
      statics:
        - meta: log_type
          value: http_malformed-log
        - target: evt.StrTime
          expression: evt.Parsed.time_local
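
For anyone who wants a starting point for the scenario side, below is a hypothetical sketch, not the scenario used above. The name, capacity, leakspeed, and blackhole values are made up for illustration. It also assumes the parser sets evt.Meta.source_ip from remote_addr for lines matched by this node.

```yaml
# Hypothetical scenario sketch: the name and thresholds are illustrative only.
type: leaky
name: example/http-malformed-requests
description: "Detect repeated malformed HTTP requests"
# Only consider events tagged by the catch-all node above.
filter: "evt.Meta.log_type == 'http_malformed-log'"
# Assumes the parser populates source_ip from remote_addr.
groupby: evt.Meta.source_ip
# Overflow once more than 5 events arrive faster than the bucket leaks
# (one event leaks out every 10 seconds).
capacity: 5
leakspeed: 10s
# Suppress repeat alerts for the same source for one minute.
blackhole: 1m
labels:
  remediation: true
```

Tune capacity and leakspeed to your traffic. A burst like the sample above, with 12 requests in two seconds from one address, would overflow a bucket this size.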