Issue
We are seeing an increase in malicious and/or heavy traffic coming from IP ranges that identify as "Huawei Cloud".
These requests commonly specify one of the User Agents noted below:
Mozilla/5.0 (Linux; Android 7.0; FRD-AL00 Build/HUAWEIFRD-AL00; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/53.0.2785.49 Mobile MQQBrowser/6.2 TBS/043602 Safari/537.36 MicroMessenger/6.5.16.1120 NetType/WIFI Language/zh_CN
Mozilla/5.0(Linux;Android 5.1.1;OPPO A33 Build/LMY47V;wv) AppleWebKit/537.36(KHTML,link Gecko) Version/4.0 Chrome/42.0.2311.138 Mobile Safari/537.36 Mb2345Browser/9.0
Mozilla/5.0(Linux;U;Android 5.1.1;zh-CN;OPPO A33 Build/LMY47V) AppleWebKit/537.36(KHTML,like Gecko) Version/4.0 Chrome/40.0.2214.89 UCBrowser/11.7.0.953 Mobile Safari/537.36
Mozilla/5.0(Linux;Android 5.1.1;OPPO A33 Build/LMY47V;wv) AppleWebKit/537.36(KHTML,link Gecko) Version/4.0 Chrome/43.0.2357.121 Mobile Safari/537.36 LieBaoFast/4.51.3
Resolution
You will need to evaluate options for blocking this traffic. These include rolling your own solution (for example, blocking traffic in .htaccess, or disabling specific Drupal features such as particular Views or modules) and/or purchasing a WAF or a CDN with security features.
Note that it can be difficult to tell "good" traffic from "bad"; you may need to rely on the information available to you (such as your logfiles), along with some research and testing, to arrive at a good solution. In an emergency, when you just need to get the site back up for most of your users, you may have to accept temporarily blocking some "good" traffic along with the bad.
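Example: Blocking requests by source IP range
Warning: This is only a sketch, not an authoritative list. The two ranges below are RFC 5737 documentation ranges used purely as placeholders; we do not maintain a definitive list of "Huawei Cloud" ranges. Replace them with the ranges you actually observe misbehaving in your own access logs before using this. As with the User Agent example further below, place the rules immediately after the RewriteEngine On line in your .htaccess file.
# .htaccess sketch: deny requests from specific source IP ranges.
# The ranges below are placeholders (RFC 5737 documentation ranges);
# substitute the ranges you see causing problems in your own logs.
RewriteCond %{REMOTE_ADDR} ^192\.0\.2\. [OR]
RewriteCond %{REMOTE_ADDR} ^198\.51\.100\.
RewriteRule ^.* - [F,L]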
Example: Blocking requests from some known Robots to URLs that include query string arguments
Warning: The next snippet is provided as an example; we cannot guarantee it will block only "bad" traffic (nor all of it!). We can only say that it has successfully helped some sites that were under a similar traffic spike.
This snippet is meant to block certain User Agents when they request URLs that carry query strings (such as search pages with various filtering or faceting options, or listings paginated across dozens or hundreds of pages).
We recommend using it alongside your own observation and analysis of the traffic in your logs. You may need to adjust it if it blocks too strictly, and you should plan to repeat this analysis-and-blocking cycle periodically.
That said, if you decide to use this block in your .htaccess, please follow the placement instructions below. Remember that this method and its User Agent list are neither exhaustive nor authoritative: review your logs and decide how to use and modify this code to suit your particular traffic patterns and situation.
NOTE: Place these rules immediately after the RewriteEngine On line in your .htaccess file.
# .htaccess section to block some known robots/crawlers on URLs where
# query arguments are present.
# * DOES allow basic URLs like /news/feed, /node/1 or /rss, etc.
# * ONLY BLOCKS when search arguments are present like
# /news/feed?search=XXX or /rss?page=21.
# Note: You can add more conditions if needed.
# For example, to only block on URLs that begin with '/search', add this
# line before the RewriteRule:
# RewriteCond %{REQUEST_URI} ^/search
#
RewriteCond %{QUERY_STRING} .
RewriteCond %{HTTP_USER_AGENT} 11A465|Ahrefs|ArchiveBot|AspiegelBot|Baiduspider|bingbot|BLEXBot|Bytespider|CCBot|Curebot|Daum|Detectify|DotBot|Grapeshot|heritrix|Kinza|LieBaoFast|Linguee|LMY47V|MauiBot|Mb2345Browser|MegaIndex|MicroMessenger|MJ12bot|MQQBrowser|PageFreezer|PetalBot|PiplBot|Riddler|Screaming.Frog|Search365bot|SearchBlox|Seekport|SemanticScholarBot|SemrushBot|SEOkicks|serpstatbot|Siteimprove.com|Sogou.web.spider|trendictionbot|TurnitinBot|UCBrowser|weborama-fetcher|Vagabondo|VelenPublicWebCrawler|YandexBot|YisouSpider [NC]
RewriteRule ^.* - [F,L]
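If you are in an emergency and the query-string condition above is not enough, a rougher variant is to drop the QUERY_STRING condition so the listed User Agents are blocked on every URL. This is only a sketch of that trade-off: it will also block any legitimate visitors who happen to use one of those User Agents, so treat it as a temporary measure and keep reviewing your logs. The short User Agent list shown is illustrative; reuse or adjust the full list from the snippet above.
# EMERGENCY variant: block these User Agents on ALL URLs, not just
# URLs with query strings. This can also block legitimate visitors,
# so only use it temporarily while you analyze your traffic.
RewriteCond %{HTTP_USER_AGENT} LieBaoFast|Mb2345Browser|MicroMessenger|MQQBrowser|UCBrowser [NC]
RewriteRule ^.* - [F,L]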