At the end of last year, I ran a poll that received nearly 100 responses about the percentage of bot traffic on the web.
The results were fascinating - mainly because of how wrong I was.
Whether it reflects my immediate circle or a broader pessimism about the state of the web, the majority of respondents were wildly off the mark.
The Reality of Bot Traffic
According to Cloudflare Radar (see data), bot traffic consistently accounts for about 30% of total web traffic.
This figure is far lower than most people—at least within the confines of my small poll—seem to believe.
It’s worth acknowledging there are likely some caveats about how bot traffic is measured and categorised, but however you frame it, this number is surprising to many in the SEO and broader web industry.
30% Bot - A Small Percentage or a Big Problem?
On the one hand, 30% might feel like a relatively small slice of the pie. On the other, when you think about the implications of nearly a third of all web traffic being non-human, the scope starts to sink in.
Every interaction, every server request, consumes bandwidth and computational power to serve data that, in many cases, will only be partially used. Much of that capacity goes to bots and scrapers, which deliver minimal value to the majority of human users.
The Good Bots
Not all bot traffic is bad, of course. Search engine crawlers and AI providers scraping the web play crucial roles. These bots power the tools and services that most of us rely on daily—from accurate search results to AI models that enhance productivity. In these cases, the benefit of bot traffic is evident and widely distributed.
The Questionable Bots
Beyond these essential functions, you have to wonder: what purpose does the rest of bot traffic serve? What value does it provide? And is it ultimately worth the cost? Questionable bots could fall into many categories, but here are a few:
Scraping for unauthorised data collection: Bots that harvest data without permission, often repurposing it for commercial or malicious ends. Think prices, stock data, personal information etc.
Credential stuffing or brute force attacks: Malicious bots attempting to gain unauthorised access to user accounts.
Ad fraud bots: Bots generating fake clicks on advertisements, draining marketing budgets without delivering value.
Spam bots: Bots that post irrelevant or harmful content on forums, comment sections, or social media platforms.
Search manipulation: Bots that generate fake traffic to manipulate rankings, often causing harm to genuine websites.
Excessive monitoring or scraping: Even well-intentioned bots, such as competitor monitoring tools, can overload servers with unnecessary requests.
There are hundreds, if not thousands, of other kinds of bot activity you should be aware of that don't serve any kind of greater good.
Potential Mitigation of Bad Bots
A well-intentioned webmaster or website owner can take targeted measures to mitigate the impact on their own site.
To address these questions, here are some actionable steps to mitigate questionable bot activity:
Monitor bot traffic: Use analytics tools to understand the scale and nature of bot interactions on your site.
Implement bot management solutions: Consider tools like bot detection services to filter out non-essential bots.
Encourage the use of APIs: Provide accessible APIs to reduce the need for scraping and to streamline data sharing. This is a more advanced case for people that have heavily-scraped sites AND well-structured data.
Set up rate limiting: Restrict the frequency of requests from specific IP addresses or agents to protect your resources. BUT BE CAREFUL of which bots you limit here.
By taking these steps, we can collectively work towards a more efficient and sustainable internet.
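To make the rate limiting step a little more concrete, here is a minimal sketch of a per-IP token bucket in Python. It is an illustration under assumptions, not a recommendation of any particular product: the class names, the default limits and the allowlist of user-agent tokens are all hypothetical, and in practice this job is usually handled by a web server, CDN or bot management service rather than application code. The allowlist is there to reflect the warning above about being careful which bots you limit.

```python
import time

# Hypothetical allowlist of user-agent tokens we choose not to throttle.
# User agents are trivially spoofed, so a real setup should verify crawlers
# properly (for example via reverse DNS) rather than trust this string.
ALLOWLISTED_AGENTS = ("googlebot", "bingbot")


class TokenBucket:
    """Allows `rate` requests per second on average, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, now):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # start with a full bucket
        self.updated = now

    def allow(self, now):
        # Refill in proportion to the time elapsed, capped at capacity.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class RateLimiter:
    """Keeps one bucket per client IP. Assumed limits; tune for your own site."""

    def __init__(self, rate=1.0, capacity=5.0, allowlist=ALLOWLISTED_AGENTS):
        self.rate = rate
        self.capacity = capacity
        self.allowlist = tuple(token.lower() for token in allowlist)
        self.buckets = {}  # grows with each new IP; a real service would expire entries

    def allow(self, ip, user_agent, now=None):
        if now is None:
            now = time.monotonic()
        agent = user_agent.lower()
        if any(token in agent for token in self.allowlist):
            return True  # never throttle allowlisted crawlers
        if ip not in self.buckets:
            self.buckets[ip] = TokenBucket(self.rate, self.capacity, now)
        return self.buckets[ip].allow(now)
```

A request handler would call `limiter.allow(ip, user_agent)` and respond with HTTP 429 (Too Many Requests) when it returns `False`. A token bucket is used here because it tolerates short bursts, which ordinary browsing produces, while still capping sustained request rates.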
Being Mindful of the Cost of Bot Activity
Years ago, I attended a talk where a speaker demonstrated how to scrape the Days of the Year website for content marketing ideas. During the Q&A session, Jono Alderson, who happened to be in the audience, politely pointed out an alternative: “Please use the free API; don’t scrape the site!”
This exchange was a pivotal moment for me. It was the first time I truly considered the “cost” of scraping—not just in terms of server load but also in terms of ethical use of resources. It made me question how often we default to scraping when structured data, APIs, or data feeds are readily available. How different might the web be if we embraced these more efficient tools?
Who Feels the Burden of Bot Traffic?
For large websites, bot traffic’s costs are glaringly apparent. These sites often deploy sophisticated tools to manage and mitigate bot activity.
Smaller websites, on the other hand, experience such low volumes of bot traffic that it barely registers as an issue. For those outside the web development or digital marketing worlds, the idea that bot traffic has any “cost” might be an entirely foreign concept.
Yet the cost is not nothing. Consider:
That LinkedIn post you just viewed?
Your out-of-office email that gets pinged?
The rank tracker collecting keyword positions?
The automated Screaming Frog crawl set to fully render JavaScript?
All of these interactions, powered by bots, consume bandwidth and server resources—and they all come at a cost.
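If you want a rough sense of that cost on your own site, one option is to tally requests and bytes served by user agent straight from the access log. The Python sketch below assumes the common Apache/nginx "combined" log format and uses a deliberately crude keyword heuristic to guess which requests are bots. The keyword list, the function names and the sample log lines are all made up for illustration, and the heuristic will miss any bot that presents a browser-like user agent, so treat the output as a lower bound rather than a measurement.

```python
import re
from collections import Counter

# Matches the tail of a "combined" format line:
# "request" status bytes "referer" "user-agent"
LINE_RE = re.compile(
    r'"[^"]*" (?P<status>\d{3}) (?P<bytes>\d+|-) "[^"]*" "(?P<agent>[^"]*)"'
)

# Crude, assumed keyword hints; real bot detection is far more involved.
BOT_HINTS = ("bot", "crawl", "spider", "scrape")


def is_probable_bot(user_agent):
    agent = user_agent.lower()
    # A missing user agent is treated as a bot here, which is an assumption.
    return agent in ("", "-") or any(hint in agent for hint in BOT_HINTS)


def tally(lines):
    """Returns (request counts, byte totals), each keyed by 'bot' or 'other'."""
    requests = Counter()
    byte_totals = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if not match:
            continue  # skip lines that are not in the assumed format
        kind = "bot" if is_probable_bot(match["agent"]) else "other"
        requests[kind] += 1
        byte_totals[kind] += 0 if match["bytes"] == "-" else int(match["bytes"])
    return requests, byte_totals


# Invented sample lines, purely for demonstration.
SAMPLE = [
    '66.249.66.1 - - [10/Jan/2025:10:00:00 +0000] "GET /a HTTP/1.1" 200 5120 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.9 - - [10/Jan/2025:10:00:01 +0000] "GET /a HTTP/1.1" 200 2048 "-" '
    '"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"',
    '198.51.100.7 - - [10/Jan/2025:10:00:02 +0000] "GET /b HTTP/1.1" 200 1024 "-" '
    '"ExampleCrawler/1.0"',
]

requests, byte_totals = tally(SAMPLE)
bot_byte_share = byte_totals["bot"] / sum(byte_totals.values())
```

To run it against a real log, pass an open file to `tally` instead of `SAMPLE`. Counting bytes as well as requests matters because a single fully rendered page fetch can weigh far more than many lightweight pings.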
Are We Doing Enough to Tame the Bots?
Bot traffic isn’t inherently good or bad, but its prevalence raises important questions about the way we build and use the web. Are we doing enough to minimise unnecessary bot activity?
Are we leveraging APIs, structured data, and other efficient tools instead of defaulting to scraping?
For those who create or manage digital ecosystems, these questions aren’t just theoretical; they’re critical to the sustainability of the web.
The next time you set up an automated process, pause and consider the cost—not just to your own operations, but to the ecosystem as a whole. The web, after all, is a shared resource. Let’s treat it with care.