Bot traffic is any non-human traffic to a website. It might surprise you that close to 50% of all internet traffic comes from bots. In 2019, 24.1% of total internet traffic came from bad bots, that is, bots performing malicious tasks. At the time, that was the highest share ever recorded.
When not managed properly, bad bots can cause damage ranging from content scraping to serious data breaches, which is why protecting your server from them is essential.
However, protecting your server from bots isn’t as simple as blocking all incoming bot traffic. In fact, that can be counterproductive. This is why it is important to learn how to manage bots, which is what we will cover here.
To Block or Not To Block
Why isn’t blocking all bots a good way to protect your server? There are three main reasons:
- Not all bots are bad. Some bots are beneficial for your website, and you wouldn’t want to block them. For example, we wouldn’t want to stop Google’s bot from crawling and indexing our site, or else our pages won’t appear on Google’s SERP.
- Bad bots will mask themselves. Today’s bot programmers are skillful and quick to adopt the latest technologies (including AI and machine learning) to create smart bots that can bypass your detection system. These bots can mimic human behavior while rotating between many different IP addresses. If you are not careful, you might end up blocking your valuable human users in the process.
- Attackers can use a block against you. When you block a bot and return an error message, the attacker can use that information to modify the bot so that it comes back stronger. The attacker can also send different types of bots to ‘test’ why they are being blocked, and then quickly bypass your security measures.
This is why blocking all bot activity outright (also known as ‘blackholing’) is generally not a good idea unless you are 100% sure that you are only blocking malicious bots. Instead, we should manage bot activity case by case, which we will discuss below.
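One practical consequence of the first two points is that a User-Agent claiming to be Google’s crawler should not be trusted on its own. A minimal sketch of the reverse-then-forward DNS check that Google describes for verifying its crawlers is shown here. The function names and the injectable resolver parameters are assumptions made for illustration and testing, and the domain suffix list should be checked against Google’s current documentation.

```python
import socket
from typing import Callable, List

# Assumption: reverse-DNS names of Google's crawlers end in one of these
# suffixes. Verify the exact list against Google's current documentation.
GOOGLE_CRAWLER_SUFFIXES = (".googlebot.com", ".google.com")


def _reverse_lookup(ip: str) -> str:
    """Return the primary hostname registered for an IP address."""
    return socket.gethostbyaddr(ip)[0]


def _forward_lookup(hostname: str) -> List[str]:
    """Return the IPv4 addresses a hostname resolves to."""
    return socket.gethostbyname_ex(hostname)[2]


def is_verified_googlebot(
    ip: str,
    reverse: Callable[[str], str] = _reverse_lookup,
    forward: Callable[[str], List[str]] = _forward_lookup,
) -> bool:
    """Check a client IP with a reverse lookup followed by a forward lookup."""
    try:
        hostname = reverse(ip)
    except OSError:  # covers socket.herror and socket.gaierror
        return False
    if not hostname.lower().rstrip(".").endswith(GOOGLE_CRAWLER_SUFFIXES):
        return False
    try:
        # The hostname must resolve back to the same IP, otherwise the
        # reverse record could have been forged by the IP's owner.
        return ip in forward(hostname)
    except OSError:
        return False
```

Because the default forward lookup only handles IPv4, an IPv6-aware resolver would need to be passed in for IPv6 clients. Results are usually worth caching, since DNS lookups on every request are slow.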
How Does Bot Management Work?
As malicious bots have become more capable, bot management techniques have evolved to tackle the challenges above. Although many techniques are in use, they generally fall into three main approaches:
- Fingerprinting-based (static) approach: the bot management software gathers as much information as possible from the suspected bot traffic and compares it with known fingerprints such as OS type, browser type, header information, and IP address. This technique is static (or passive) because it can only detect bots with known fingerprints, so a bot using a brand-new technique might go undetected.
- Challenge-based approach: the idea is to test the client with challenges that are easy for humans to solve but very difficult, if not impossible, for bots. CAPTCHA is the most common form of challenge-based bot mitigation, but with the rise of CAPTCHA farms (services where humans solve challenges on behalf of bots), it is far less effective than it used to be.
- Behavior-based (dynamic) approach: the bot management software analyzes the client’s action patterns and compares them with a known baseline to determine whether the client is a human or a bot. Advanced bot protection solutions like DataDome can use AI and machine learning to keep improving at recognizing bot behavior, which makes this a dynamic, continuously improving approach.
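To make the difference between the static and dynamic approaches concrete, here is a minimal sketch of each. It is only an illustration under stated assumptions: the marker list, the function names, and the thresholds are all hypothetical, header names are assumed to arrive in canonical capitalization, and a commercial solution would combine far more signals than these.

```python
from statistics import mean, pstdev
from typing import Mapping, Sequence

# Hypothetical signature list: substrings found in the User-Agent header
# of common automation tools. A real list would be larger and maintained.
KNOWN_BOT_UA_MARKERS = ("python-requests", "curl/", "scrapy", "headlesschrome")


def matches_known_fingerprint(headers: Mapping[str, str]) -> bool:
    """Static check: compare request headers with known bot fingerprints."""
    user_agent = headers.get("User-Agent", "").lower()
    if not user_agent:
        return True  # assumption: real browsers always send a User-Agent
    if any(marker in user_agent for marker in KNOWN_BOT_UA_MARKERS):
        return True
    # Assumption: mainstream browsers send Accept-Language; many scripts do not.
    return "Accept-Language" not in headers


def looks_automated(
    request_times: Sequence[float],
    min_requests: int = 5,
    max_variation: float = 0.1,
) -> bool:
    """Behavioral check: flag clients whose request gaps are nearly constant."""
    if len(request_times) < min_requests:
        return False  # not enough data to judge
    gaps = [later - earlier
            for earlier, later in zip(request_times, request_times[1:])]
    average_gap = mean(gaps)
    if average_gap <= 0:
        return True  # many requests at the same instant
    # Coefficient of variation: humans browse irregularly, so a low value
    # suggests a scripted loop.
    return pstdev(gaps) / average_gap < max_variation
```

The static check is cheap but trivially evaded by copying a browser’s headers, which is the limitation the first bullet describes. The timing check does not depend on what the client claims to be, although a bot that randomizes its delays would pass it too.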
Once a bot is identified as malicious, there are several ways to handle its traffic:
- Block (blackholing)
If the bot management solution is 100% sure that blocking is the best approach, then blocking the traffic altogether is the most effective and cost-efficient option, since you no longer need to process the bot’s requests in any way. However, as discussed above, this can be counterproductive against persistent attackers, who will simply modify the bot code and attempt another attack.
- Throttle (rate-limit)
You still allow the bot to access your resources, but (severely) limit the processing speed to slow down its operation. Running bots consumes resources, which can be expensive, so if you slow them down enough, the attacker might no longer see your site as a profitable target and move on.
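Slowing a client down rather than rejecting it can be sketched with a token bucket that reports a delay instead of a yes/no answer. This is a minimal sketch, not a production rate limiter: the `Throttle` class, its parameter names, and the injectable `clock` are assumptions made for illustration.

```python
import time
from typing import Callable


class Throttle:
    """Token bucket that reports how long a request should be delayed."""

    def __init__(
        self,
        rate: float,   # tokens refilled per second
        burst: int,    # requests allowed before delays begin
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.rate = rate
        self.burst = burst
        self.clock = clock
        self.tokens = float(burst)
        self.updated = clock()

    def delay_for_next_request(self) -> float:
        """Consume one token and return the delay in seconds (0.0 if none)."""
        now = self.clock()
        # Refill according to elapsed time, capped at the burst size.
        self.tokens = min(self.burst,
                          self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        self.tokens -= 1.0
        if self.tokens >= 0:
            return 0.0
        # The bucket is in debt: the longer the client keeps hammering,
        # the longer each following request has to wait.
        return -self.tokens / self.rate
```

In a request handler you would keep one `Throttle` per suspected client and sleep for the returned number of seconds before responding. Because the debt keeps growing while requests keep arriving, a persistent bot gets progressively slower responses, while a client that backs off recovers as the bucket refills.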
- Redirect or serve alternative content
You can redirect the bot to another (fake) page with similar but false information. This way, the bot won’t be able to access your original content, and the attacker won’t be alerted that it has been detected. You can also feed the bot fake content, for example modified pricing information.
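The modified-pricing idea can be sketched as follows. This is a hypothetical illustration: `decoy_price`, `render_product`, the product dictionary layout, and the 15% maximum shift are all assumptions, not part of any particular bot management product.

```python
import hashlib
from typing import Dict


def decoy_price(real_price: float, product_id: str,
                max_shift: float = 0.15) -> float:
    """Return a plausible but wrong price that is stable per product."""
    digest = hashlib.sha256(product_id.encode("utf-8")).digest()
    # Map the first digest byte (0..255) to a fraction in [-1, 1].
    fraction = digest[0] / 127.5 - 1.0
    # Keep the decoy noticeably different from the real price.
    if abs(fraction) < 0.2:
        fraction = 0.2 if fraction >= 0 else -0.2
    return round(real_price * (1.0 + fraction * max_shift), 2)


def render_product(product: Dict[str, object],
                   is_suspected_bot: bool) -> Dict[str, object]:
    """Return the data to serve: real for humans, altered for suspected bots."""
    if not is_suspected_bot:
        return dict(product)
    decoy = dict(product)
    decoy["price"] = decoy_price(float(product["price"]), str(product["id"]))
    return decoy
```

Deriving the shift from a hash of the product ID means a scraper that fetches the same page twice sees the same false price, so the decoy is harder to spot than random noise. The risk is a false positive: a human who is misclassified would see wrong prices, so this only makes sense when detection confidence is high.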
- Invisible challenge
For example, you can ask the user to move the cursor or type something in a mandatory form field to stop the bot activity in its tracks.
To really protect your server from bots, especially malicious bots, a proper bot management solution is necessary.
A good bot management solution should be able to:
- Distinguish between bots and legitimate human users
- Distinguish between good bots and bad bots, and assess each bot’s reputation
- Analyze a bot’s behavior, even if it’s a brand-new bot
- Add good bots to a whitelist/allowlist
- Manage any suspected bots accordingly, including serving alternative content or rate-limiting/throttling the bot
Failing to protect your server with a reliable bot management solution might expose it to various attacks, such as DDoS attacks, credential stuffing, brute-force attacks, content scraping, click fraud, spam, and more.