
Introduction
There is a category of digital-marketing problem that almost no one talks about because it is invisible by design. Your site ranks. Your content is good. Your structured data is clean. And yet ChatGPT, Claude, and Perplexity will not cite you — and you have no idea why. In 2026, one of the most common causes is your hosting provider quietly rate-limiting AI crawlers at the firewall layer, before those crawlers ever see a page on your site. You did not opt in. You cannot see the block in your dashboard. And depending on which managed WordPress host you use, you may not be able to turn it off.
A May 6, 2026, investigation by Search Influence founder Will Scott, published in Search Engine Land's report on managed WordPress hosts blocking AI bots, confirmed what some technical SEOs had suspected for months. Several major managed WordPress platforms — most prominently WP Engine — silently apply platform-level blocks to AI training crawlers. The blocks operate at the infrastructure level, override customer-side allow rules, and are explicitly described by WP Engine support as something that “can't be selectively disabled per bot.” For the small businesses we work with across Northeast Indiana, this is a real problem hiding inside otherwise-good websites.
This post does three things. First, it explains why this is happening and which hosts are doing it as of May 2026. Second, it walks through a non-technical detection method any business owner can run in about thirty minutes. Third, it gives an honest, host-by-host playbook for what to do — including the cases where some bot blocking is reasonable security and should stay on. We will be specific about the trade-offs because the sourced data is complicated and the right answer is rarely “open everything.”
Key Takeaways
- WP Engine applies platform-wide rate limits to AI bots like ClaudeBot, GPTBot, Amazonbot, and Bytespider; the policy is not customer-controllable on standard plans
- A seven-day analysis of 29,099 bot requests on a sample WP Engine site found 65.8% came from AI bots, with ClaudeBot and GPTBot rate-limited at 29% and Bytespider blocked outright at 61%
- Not every managed host does this — Kinsta's CTO publicly said in March 2026 they will not block platform-wide, and Pressable and Pantheon explicitly do not blanket-block identified bots
- Citation presence on the affected site was 37.8% in Google AI Mode but 0% in Claude and 0% in Meta AI — the pattern matches the bots that were rate-limited
- A four-step detection audit using `curl` with a bot user-agent string can confirm or rule out platform blocking in under 30 minutes
- Some bot blocking is legitimate infrastructure protection; the goal is informed control, not blanket allow-listing of every crawler
- For Fort Wayne small businesses on managed WordPress, this is now the first thing to check when AI search visibility flatlines
Why Managed WordPress Hosts Are Blocking AI Bots in the First Place

Before we get to the detection steps, it helps to understand why this is happening, because the fix depends on whether the block is solving a real problem.
AI crawlers got loud, fast. According to Akamai's SOTI Security Insight Series report on the AI bot era, AI bot activity rose roughly 300% during 2025, and Search Engine Land's reporting on the publisher-side surge documented the same pattern from the publisher's perspective. We covered the small-business angle in our breakdown of the 300% AI bot traffic surge, but the short version is that bandwidth, CPU, and database load on shared infrastructure all went up at once. Hosting providers got the support tickets. Performance degraded for sites in noisy neighborhoods. Cost-of-goods-sold for managed hosts got worse without a corresponding price increase.
The numbers behind the load problem are striking. Cloudflare's published data, summarized in their AI bot audit overview, shows that ClaudeBot makes about 20,583 crawl requests for every one referral it sends back, GPTBot is about 1,255 to 1, PerplexityBot is roughly 111 to 1, and Google sits at around 5 to 1. Those ratios are not opinions; they describe an asymmetric trade. Your server pays the cost of being crawled, and your site receives a vanishingly small fraction of that cost back as user-facing traffic.
For a managed WordPress host running thousands of sites on shared infrastructure, the math points one direction: aggressive default rate limits on AI training bots, applied at the platform level, before the request ever reaches a customer's WordPress install. The Search Engine Land investigation found exactly that pattern. On a sample WP Engine site over April 4–10, 2026, 29,099 bot requests arrived in seven days. AI bots accounted for 65.8% of them. ClaudeBot was rate-limited (HTTP 429 responses) on 29% of its requests, GPTBot on 29%, Amazonbot on 51%, and Bytespider was blocked outright with 520-class errors on 61% of its attempts. PerplexityBot, ChatGPT-User, and the older anthropic-ai user-agent — all of which fetch on behalf of a live user — were not blocked.
That distinction matters, because it tells you the host is not blocking AI broadly. It is blocking the training crawlers and letting the retrieval crawlers through. As Search Influence founder Will Scott wrote in the investigation, this is “a deliberate policy, not an accident.” The policy makes sense from the host's perspective. It is also the policy that quietly removes your business from the training data of the next generation of language models. The full asymmetry is worth sitting with: as Search Engine Land's earlier guidance on log file analysis for AI crawlers and our companion log file analysis for AI crawlers post discussed, training-bot exclusion has a long tail because models are retrained on cycles of months, not days.
How Do You Tell If Your Host Is Blocking AI Bots?
You do not need to be a developer to run this audit. You need a terminal, about thirty minutes, and the user-agent strings of the major AI crawlers. Mac and Linux ship with `curl` built in; on Windows, the same command works in PowerShell or in WSL.
The technique published in the Search Engine Land investigation is a controlled comparison. You hit your homepage thirty times in rapid succession with a normal browser user-agent, then thirty times with the ClaudeBot user-agent string, and you compare HTTP response codes. If the browser run returns 200s and the ClaudeBot run returns a wall of 429s, your host is rate-limiting AI bots at the platform level. The exact ClaudeBot string is documented in Anthropic's published crawler information, and the GPTBot string is published in OpenAI's bot documentation. The user-agents are public on purpose — that is how site owners are supposed to identify them.
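The controlled comparison can be sketched as two shell loops. This is a minimal sketch, not the investigation's exact script: `yourdomain.com` is a placeholder, and the ClaudeBot user-agent string matches what Anthropic publishes at the time of writing — verify it against Anthropic's crawler documentation before relying on it.

```shell
# Placeholder site -- replace with your own domain before running.
SITE="https://yourdomain.com/"

# User-agent strings: a typical desktop browser, and ClaudeBot as published
# by Anthropic (verify the current string in Anthropic's crawler docs).
UA_BROWSER="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0 Safari/537.36"
UA_CLAUDEBOT="Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; ClaudeBot/1.0; +claudebot@anthropic.com)"

# Send 30 rapid requests with a given user-agent, printing one HTTP status
# code per line (curl prints 000 when the request fails entirely).
probe() {
  for _ in $(seq 1 30); do
    curl -s -o /dev/null -w '%{http_code}\n' -A "$1" "$SITE"
  done
}

# Collapse a column of status codes into per-code counts, highest first,
# e.g. "  30 200" for a clean run or "  21 200 / 9 429" for a blocked one.
summarize() { sort | uniq -c | sort -rn; }

# Run the comparison (uncomment once SITE points at your real domain):
# echo "Browser run:";   probe "$UA_BROWSER"   | summarize
# echo "ClaudeBot run:"; probe "$UA_CLAUDEBOT" | summarize
```

If the browser run summarizes to thirty 200s and the bot run shows a large block of 429s, you have the asymmetry the investigation describes.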
Here is the practical four-step audit:
- Confirm your host. Run `curl -I https://yourdomain.com` and inspect the response headers. The `x-powered-by` and `server` headers will usually name the platform. WP Engine sites typically include `wpe` markers; Kinsta sites include Cloudflare and GCP markers; SiteGround and Bluehost expose their own server signatures.
- Run a baseline. Send thirty rapid requests with a normal browser user-agent. Record how many return 200. On a healthy site, all thirty should succeed.
- Run the bot test. Send thirty rapid requests with the ClaudeBot user-agent string, then repeat with GPTBot, then with Amazonbot. Count 429s and 5xx errors. A platform-level block looks like a sudden jump from near-zero failures on the browser run to 25–60% failures on the bot runs.
- Verify customer-side controls are off. Inside your hosting dashboard, confirm that any “block scrapers” or “bot redirect” toggles are disabled and that no firewall rules in tools like Wordfence or Cloudflare are catching the bot user-agents. If the failures persist with all customer-side controls off, the block is platform-level.
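Putting steps 2 and 3 together, the pass/fail logic can be sketched as a small helper. The thresholds below are our reading of the failure rates reported in the investigation (near-zero on browser runs, 25–60% on bot runs), not an official specification.

```shell
# Classify a test run from its failure rate. Arguments: total requests sent,
# and how many came back as 429 or 5xx. Thresholds are illustrative, based on
# the patterns reported in the Search Engine Land investigation.
verdict() {
  total=$1; failures=$2
  pct=$(( failures * 100 / total ))
  if [ "$pct" -le 5 ]; then
    echo "clean ($pct% failures)"
  elif [ "$pct" -ge 25 ]; then
    echo "likely platform-level block ($pct% failures)"
  else
    echo "inconclusive ($pct% failures) -- re-run off-peak and check customer-side rules"
  fi
}

# Example: a clean 30-request browser run vs a bot run with 9 failures,
# roughly the one-in-three rate observed on the sample WP Engine site.
verdict 30 0   # -> clean (0% failures)
verdict 30 9   # -> likely platform-level block (30% failures)
```

The middle "inconclusive" band exists on purpose: a handful of 429s can be ordinary burst protection, which is why the audit compares failure rates across user-agents rather than counting errors in isolation.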
Two cautions. First, do not run this against production at peak hours; the rapid-fire requests can themselves trigger rate limits unrelated to the bot policy. Second, even a clean test does not prove your site is being seen well by AI systems — it only proves the bots are not being blocked at the door. After detection, the next step is server-log analysis to confirm the bots are actually crawling at a depth and frequency consistent with indexing. We walk through the log-side workflow in log file analysis for AI crawlers.
Which Managed WordPress Hosts Block AI Bots — and Which Do Not?

The investigation found that “managed WordPress” is not a uniform category in 2026. The behavior splits roughly along three lines, and the differences are worth knowing before you decide whether to migrate, escalate, or stay put.
| Host | AI bot policy as of May 2026 | Customer control? |
|---|---|---|
| WP Engine | Platform-wide rate limits on ClaudeBot, GPTBot, Amazonbot, Bytespider | No (standard plans); described as “exceptional use case” |
| Flywheel (WP Engine-owned since 2019) | Same parent infrastructure; no documented difference | No |
| Kinsta | No platform-level blocking; opt-in Bot Protection with four levels | Yes (customer-controllable) |
| Pressable | No blanket bot disallow; defers to robots.txt | Yes |
| Pantheon | Explicitly does not block identified bot traffic | Yes |
| SiteGround | Blocks training bots by default, distinguishes from user-action bots | Partial; documented policy |
| Bluehost / GoDaddy Managed WP | Variable; default firewall rules often catch AI UAs | Partial |
The Search Engine Land piece quotes the WP Engine support agent directly: “WP Engine does enforce platform-wide rate limiting on certain high-impact bots to protect overall server performance, and that part can't be selectively disabled per bot.” A second support response added that “allowing AI bot IPs via Web Rules Engine does not override WP Engine's platform-wide rate limiting rules, which operate at the infrastructure level.” A third quoted line from WP Engine's support documentation declined to elaborate, citing security: “Further information cannot be provided around our firewall, as this can compromise its secure integrity.” The investigation noted that opening an internal product-engineering ticket may be possible for “exceptional” cases, but the standard customer experience is that the block stays on.
By contrast, Kinsta's CTO publicly said in Kinsta's March 2026 statement on AI bot policy that the company will not implement platform-level blocking and will not bill customers for AI bot bandwidth, while offering an opt-in Bot Protection feature with four customer-selectable levels. Pressable and Pantheon also confirmed in the investigation that they do not blanket-block identified bots. SiteGround sits in the middle: it blocks training crawlers by default but documents the policy publicly and distinguishes between training bots and user-action bots — the latter, like ChatGPT-User and PerplexityBot, are allowed through.
The citation impact is the part that should make any business owner pause. The same investigation reported that on the sample WP Engine site, citation presence was 37.8% in Google AI Mode but only 9.6% in ChatGPT, 7.8% in Perplexity, 0% in Claude, and 0% in Meta AI. Google's crawler runs on a different fetch path and was not blocked, which is why Google AI Mode citation held up. The 0% in Claude and 0% in Meta AI mirror exactly which bots were being rate-limited. Correlation is not causation, but the pattern is hard to ignore.
What Should You Actually Do About It?

The honest answer depends on how much AI search visibility matters to your specific business and how much risk you are willing to accept. Here is the framework we use with clients, in priority order.
If you can stay on your current host, prefer that. Migrating WordPress sites is expensive, error-prone, and rarely worth doing for a single visibility issue. Confirm the block exists, then file a support ticket asking for a documented allow-list at your account level. WP Engine has handled this for “exceptional use cases,” per the investigation; the request is more likely to succeed if you can attach business-impact data — for example, a Search Console export showing your AI-referrer traffic before and after a known crawl-rate change.
Consider migrating only if AI search is core to your business model. A B2B SaaS, a publisher, or a content-marketing-led service business stands to lose meaningful pipeline if the next two model generations are trained without their content. A local plumber probably does not. We covered the structural argument in your website as the source of truth in local AI search; the more your business depends on being summarized by an AI rather than clicked, the more this matters.
Make sure customer-side controls are clean before escalating. In the dashboard, disable any “bot blocker” or “scraper protection” features that catch known AI user-agents. Audit Cloudflare WAF rules, Wordfence configs, and .htaccess includes for user-agent matches. If you have an LLMs.txt for AI discoverability file, confirm it is reachable. None of these will defeat a platform-level rate limit, but they will eliminate the customer-side noise so the conversation with support is about the platform layer specifically.
Do not blanket-allow every bot. Some bots are not AI-related, and some AI bots are abusive. The Bytespider behavior of being blocked at 61% on the WP Engine sample is a reasonable response to a crawler that has historically ignored robots.txt. The goal here is informed control over the four or five bots that materially affect AI citation presence — primarily GPTBot, ClaudeBot, PerplexityBot, ChatGPT-User, and Google-Extended — not a wholesale removal of bot defenses.
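As an illustration of "informed control," a robots.txt along these lines allows the citation-relevant crawlers named above while refusing the historically non-compliant one. This is a sketch, not a recommendation for every site — and note that because Bytespider has historically ignored robots.txt, the Disallow line is advisory; real enforcement has to happen at the WAF or firewall layer.

```txt
# Allow the crawlers that materially affect AI citation presence
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: Google-Extended
Allow: /

# Refuse a historically non-compliant crawler (advisory only -- enforce at the WAF)
User-agent: Bytespider
Disallow: /
```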
Pair the fix with structural AEO work. Even with crawlers allowed in, AI systems still need extractable content. We unpacked the structural side in our answer engine optimization fundamentals post, and the no-JavaScript fallbacks for AI crawlers piece walks through the rendering issue that is the second most common cause of AI-citation gaps. Allowing the bot in is necessary but not sufficient.
A Worked Detection Example: What the Output Actually Looks Like

The detection method described in the Search Engine Land investigation returns specific patterns. Here is what each result looks like in practice and what it means.
A clean run, where no platform blocking is present, looks like 30 successful 200 responses on the browser user-agent and 28–30 successful 200s on the ClaudeBot user-agent. There may be one or two 429s as the request volume spikes — that is normal rate limiting kicking in temporarily — but the failure rate is similar across user-agents. This is what Kinsta- and Pantheon-hosted sites typically return.
A platform-blocked run looks different. The browser user-agent returns 30 clean 200s, and the ClaudeBot run returns a wall of 429s at the same rate the investigation observed — roughly one in three requests on WP Engine's sample site. The same pattern, often more severe, shows up for Amazonbot and Bytespider. If you see this asymmetry, the block is not at your WordPress install or your plugins; it is upstream, at the host or its CDN/WAF layer.
A configuration error looks similar but is different in kind. If a WAF rule on your account is matching the ClaudeBot user-agent and returning 403 instead of 429, the block is yours to remove. The 403-vs-429 distinction is the telltale: rate limits typically return 429, while access-denied rules typically return 403 or, on hosts that prefer to obscure, return a generic 5xx.
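The 403-vs-429 triage described above can be sketched as a lookup. The mapping is a heuristic drawn from the distinctions in this section, not a guarantee — some stacks return nonstandard codes, as the WP Engine 520-class errors show.

```shell
# Rough triage of the dominant failure code from a bot-user-agent test run.
# Heuristic only: rate limits usually return 429, account-level WAF denials
# usually return 403, and some hosts obscure blocks behind generic 5xx codes.
diagnose_code() {
  case "$1" in
    429) echo "rate limit -- usually platform or CDN level" ;;
    403) echo "access denied -- check your own WAF / security plugin rules" ;;
    5??) echo "server error -- could be an obscured block or a capacity problem" ;;
    2??) echo "success -- no block on this request" ;;
    *)   echo "unexpected code -- inspect full response headers" ;;
  esac
}

diagnose_code 429   # -> rate limit -- usually platform or CDN level
diagnose_code 403   # -> access denied -- check your own WAF / security plugin rules
diagnose_code 520   # -> server error -- could be an obscured block or a capacity problem
```

The last case is why the capacity scenario in the next paragraph matters: a 5xx on its own does not tell you whether you are looking at policy or at an overloaded server.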
There is one more category worth flagging: legitimate intermittent failure. AI training crawlers are not gentle, and a small site with a 1-vCPU PHP-FPM pool can buckle under an aggressive ClaudeBot crawl. If you see 5xx errors that scale with bot request rate but not with overall traffic, the issue is server capacity, not policy. The fix is infrastructure, not a support ticket.
What This Means for Fort Wayne and Northeast Indiana Small Businesses
Most of the small businesses we work with across Auburn, Fort Wayne, and Allen County run their websites on managed WordPress. The reasons are reasonable: managed WordPress takes care of updates, security patches, and backups so a business owner can focus on running the business. The trade-off has always been that you give up some control over the infrastructure — and in 2026, that trade-off now includes a quiet say in whether AI systems can train on your content.
The practical implication is local. A DeKalb County HVAC contractor's home page, a Fort Wayne dental practice's services pages, a Northeast Indiana law firm's attorney bios — these are the exact pages that AI assistants pull from when someone asks “best HVAC company in Auburn” or “dentist near Aboite Township.” If your host is rate-limiting the bots that build those answers, you are slowly disappearing from the layer of search that small businesses cannot afford to lose. The audit in this post takes thirty minutes and is the cheapest first step.
We also want to be honest about the limits. Not every Fort Wayne small business needs to optimize for ChatGPT or Claude citation. A trade business that runs almost entirely on Google Local Pack rankings and Google Maps presence is not getting much from AI training crawlers, and the marginal benefit of unblocking ClaudeBot is small. Talk to your team — or to us — about which channels actually drive business before deciding how much to invest in the fix.
How Button Block Helps
If you are not sure whether your managed WordPress host is blocking AI bots, we run the audit as a free first step on engagements that include AEO scope. We will check your host configuration, run the curl-based detection sequence, pull thirty days of server logs to confirm crawler depth, and write you a one-page report that says either “your host is blocking [bots], here is what to do” or “your host is not the issue; the visibility problem is structural.” If structural work is needed, our Next.js web development team builds AI-friendly site architectures from the ground up — server-rendered, machine-readable, and fast — so you do not have to fight your host for crawl access in the first place. Reach out and we will tell you what we find.
Want to Know If Your Host Is Quietly Blocking AI Crawlers?
Button Block runs the four-step managed-WordPress bot audit as a free first step for clients in Auburn, Fort Wayne, and Northeast Indiana. If your AI search visibility has plateaued, the cause may be upstream of your content. Let us check.
Frequently Asked Questions
- Is my managed WordPress host blocking AI bots?
- The fastest way to find out is to send 30 rapid HTTPS requests to your homepage with the ClaudeBot user-agent string and compare the response codes to a 30-request run with a normal browser user-agent. If the bot run returns a wall of 429 responses while the browser run returns 200s, your host is applying a platform-level rate limit. The full four-step audit takes about 30 minutes and does not require any coding.
- Which managed WordPress hosts block AI bots in 2026?
- As of May 2026, WP Engine applies platform-wide rate limits to ClaudeBot, GPTBot, Amazonbot, and Bytespider that customers cannot disable on standard plans. SiteGround blocks training bots by default but documents the policy. Kinsta, Pressable, and Pantheon do not blanket-block identified bots and offer customer-controllable bot protection instead.
- Is blocking AI bots always wrong?
- No. Some AI crawlers, like Bytespider, have a documented history of ignoring robots.txt and creating significant server load. A reasonable policy is informed control: allow GPTBot, ClaudeBot, PerplexityBot, ChatGPT-User, and Google-Extended for AI search visibility, while still defending against abusive scrapers. The problem is when the policy is invisible to you and overrides your preferences.
- Will allowing AI bots increase my server costs?
- It can, especially on shared or budget plans. Cloudflare’s published data shows ClaudeBot crawls roughly 20,583 times for every one referral it sends back, so you pay the bandwidth and CPU cost without a proportional traffic return. On a managed plan with metered bandwidth, monitor the cost for a billing cycle after unblocking before deciding whether the visibility benefit is worth the operating cost.
- Does this affect Google rankings?
- Generally no. The Google bots that drive traditional search rankings — Googlebot and the related image and video crawlers — are not the same as Google-Extended (the AI training bot) or any of the third-party AI crawlers. Hosts blocking AI training bots have generally not been blocking Googlebot, so traditional organic rankings stay intact. AI Mode citations, however, do appear to be affected because Google AI Mode pulls from a different fetch path that often is not platform-blocked.
- What if my host says they cannot disable the block?
- That is the standard WP Engine response on standard plans. Three options remain. Ask for a documented account-level allow-list as an "exceptional use case" — the investigation found this has been granted in some cases. Migrate to a host that does not platform-block, such as Kinsta or Pantheon. Or accept the block and focus on the AI search channels (Google AI Mode, in particular) that your current host’s policy does not affect.
- What does this mean for a Fort Wayne small business on managed WordPress?
- Most Fort Wayne and Northeast Indiana small businesses run on managed WordPress, so the blocking pattern in this post applies directly. The 30-minute audit is the cheapest first step: if your host is rate-limiting ClaudeBot and GPTBot, you are slowly disappearing from the AI answers your future customers see when they ask "best HVAC in Auburn" or "dentist near Aboite." Pair the bot fix with the upstream AEO work — LLMs.txt, structured data, and clean entity definitions tell AI systems what to do with your content once they reach it, but only after the host stops blocking the door.
Sources & Further Reading
- Search Engine Land: searchengineland.com/managed-wordpress-blocking-ai-bots-476510 — Is your managed WordPress host blocking AI bots without telling you?
- Search Engine Land: searchengineland.com/ai-bot-traffic-surged-publishers-report-473900 — AI bot traffic surged 300% in 2025, publishers report
- Search Engine Land: searchengineland.com/log-file-analysis-ai-crawlers-search-visibility-474428 — Log file analysis for AI crawlers and search visibility
- Akamai Technologies: akamai.com/resources/state-of-the-internet — SOTI Security Insight Series: Navigating the AI Bot Era
- OpenAI: platform.openai.com/docs/bots — OpenAI GPTBot crawler documentation
- Anthropic: support.anthropic.com/en/articles/8896518 — Anthropic ClaudeBot crawler information
- Cloudflare: blog.cloudflare.com/ai-audit-overview — AI bot crawl-to-referral ratio data
- Kinsta: kinsta.com/blog/ai-bots — Kinsta on AI bot policy and bot protection
