Robots.txt Tester
Inspect how 100cuci.io controls crawler access, blocked paths, sitemap references and AI crawler rules.
Preview
Score: 92
- Sitemap URL appears to be unreachable: https://100cuci.io/sitemap.xml
Robots.txt Status
Robots.txt Status: Present
Score: 92/100 · Strong
Robots.txt Content Preview
    # ==============================================================
    # robots.txt - 100cuci.io
    # Last updated: April 2026
    # Purpose: Maximum search engine visibility across all crawlers
    # ==============================================================

    # ── Google (main + image + video + news crawlers) ──
    User-agent: Googlebot
    Allow: /

    User-agent: Googlebot-Image
    Allow: /

    User-agent: Googlebot-Video
    Allow: /

    User-agent: Googlebot-News
    Allow: /

    # ── Bing & Microsoft ──
    User-agent: Bingbot
    Allow: /

    User-agent: msnbot
    Allow: /

    User-agent: msnbot-media
    Allow: /

    User-agent: BingPreview
    Allow: /

    # ── Yahoo ──
    User-agent: Slurp
    Allow: /

    # ── Yandex ──
    User-agent: YandexBot
    Allow: /

    User-agent: YandexImages
    Allow: /

    # ── DuckDuckGo ──
    User-agent: DuckDuckBot
    Allow: /

    # ── Baidu ──
    User-agent: Baiduspider
    Allow: /

    # ── Apple (Spotlight / Siri suggestions) ──
    User-agent: Applebot
    Allow: /

    # ── Facebook / Meta link preview ──
    User-agent: facebookexternalhit
    Allow: /

    # ── Twitter / X link preview ──
    User-agent: Twitterbot
    Allow: /

    # ── LinkedIn link preview ──
    User-agent: LinkedInBot
    Allow: /

    # ── WhatsApp link preview ──
    User-agent: WhatsApp
    Allow: /

    # ── Telegram link preview ──
    User-agent: TelegramBot
    Allow: /

    # ── SEO audit tools (keeps your rankings data accurate) ──
    User-agent: AhrefsBot
    Allow: /

    User-agent: SemrushBot
    Allow: /

    User-agent: MJ12bot
    Allow: /

    User-agent: DotBot
    Allow: /

    # ── Catchall: allow every other legitimate crawler ──
    User-agent: *
    Allow: /
    # Block only these private / system paths from ALL bots
    Disallow: /wp-admin/
    Disallow: /wp-includes/
    Disallow: /wp-login.php
    Disallow: /wp-register.php
    Disallow: /xmlrpc.php
    Disallow: /wp-cron.php
    Disallow: /wp-json/
    Disallow: /feed/
    Disallow: /comments/feed/
    Disallow: /?s=               # internal search result pages
    Disallow: /*?*replytocom=    # comment reply URLs (duplicate content)
    Disallow: /cdn-cgi/          # Cloudflare internal endpoints
    # ── Crawl-delay hint for less aggressive bots ──
    # (Googlebot & Bingbot ignore this - they self-regulate)
    Crawl-delay: 5

    # ── Sitemap locations ──
    Sitemap: https://100cuci.io/sitemap.xml
    Sitemap: https://100cuci.io/sitemap_index.xml
    Sitemap: https://100cuci.io/sitemap-posts.xml
    Sitemap: https://100cuci.io/sitemap-pages.xml
    Sitemap: https://100cuci.io/sitemap-categories.xml
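The catch-all group lists `Allow: /` before its `Disallow` lines, which can look contradictory. Under the longest-match rule described in RFC 9309, the order of rules inside a group does not matter: the rule with the longest matching path wins, and `Allow` wins a tie. The following is a minimal sketch of that evaluation, not the tester's own implementation. The names `CATCH_ALL_RULES`, `pattern_to_regex` and `is_allowed` are hypothetical, and the sketch assumes only the `*` and `$` wildcards, with no percent-encoding normalization.

```python
import re

# Rules copied from the catch-all (User-agent: *) group shown above.
CATCH_ALL_RULES = [
    ("allow", "/"),
    ("disallow", "/wp-admin/"),
    ("disallow", "/wp-includes/"),
    ("disallow", "/wp-login.php"),
    ("disallow", "/wp-register.php"),
    ("disallow", "/xmlrpc.php"),
    ("disallow", "/wp-cron.php"),
    ("disallow", "/wp-json/"),
    ("disallow", "/feed/"),
    ("disallow", "/comments/feed/"),
    ("disallow", "/?s="),
    ("disallow", "/*?*replytocom="),
    ("disallow", "/cdn-cgi/"),
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex anchored at the start.

    '*' matches any run of characters; a trailing '$' anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_allowed(path: str, rules=CATCH_ALL_RULES) -> bool:
    """Longest matching pattern wins; Allow wins ties; no match means allowed."""
    best_len, allowed = -1, True
    for kind, pattern in rules:
        if not pattern:  # an empty pattern matches nothing
            continue
        if pattern_to_regex(pattern).match(path):
            length = len(pattern)
            if length > best_len or (length == best_len and kind == "allow"):
                best_len, allowed = length, (kind == "allow")
    return allowed
```

With these rules, `is_allowed("/blog/post-1/")` is true while `is_allowed("/wp-admin/options.php")` and `is_allowed("/?s=shoes")` are false. Patterns are anchored at the start of the path, so `Disallow: /feed/` covers the root feed only and leaves a path such as `/category/feed/` crawlable.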
User-agent Rules
| User-agent(s) | Allowed paths | Disallowed paths |
|---|---|---|
| googlebot | / | No explicit Disallow rules. |
| googlebot-image | / | No explicit Disallow rules. |
| googlebot-video | / | No explicit Disallow rules. |
| googlebot-news | / | No explicit Disallow rules. |
| bingbot | / | No explicit Disallow rules. |
| msnbot | / | No explicit Disallow rules. |
| msnbot-media | / | No explicit Disallow rules. |
| bingpreview | / | No explicit Disallow rules. |
| slurp | / | No explicit Disallow rules. |
| yandexbot | / | No explicit Disallow rules. |
| yandeximages | / | No explicit Disallow rules. |
| duckduckbot | / | No explicit Disallow rules. |
| baiduspider | / | No explicit Disallow rules. |
| applebot | / | No explicit Disallow rules. |
| facebookexternalhit | / | No explicit Disallow rules. |
| twitterbot | / | No explicit Disallow rules. |
| linkedinbot | / | No explicit Disallow rules. |
| whatsapp | / | No explicit Disallow rules. |
| telegrambot | / | No explicit Disallow rules. |
| ahrefsbot | / | No explicit Disallow rules. |
| semrushbot | / | No explicit Disallow rules. |
| mj12bot | / | No explicit Disallow rules. |
| dotbot | / | No explicit Disallow rules. |
| * | / | /wp-admin/, /wp-includes/, /wp-login.php, /wp-register.php, /xmlrpc.php, /wp-cron.php, /wp-json/, /feed/, /comments/feed/, /?s=, /*?*replytocom=, /cdn-cgi/ |
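The table shows a side effect of how groups are selected: a crawler obeys the group that matches its own name and falls back to `*` only when no named group matches. Every named group here contains only `Allow: /`, so the `Disallow` lines in the catch-all group do not apply to those crawlers. The sketch below illustrates that selection. It assumes exact, case-insensitive matching on the product token, which is a simplification (some crawlers also fall back to a more general name), and the function names are hypothetical.

```python
# User-agent groups from the robots.txt above, lowercased as in the table.
# Every named group contains only "Allow: /"; the Disallow lines live in "*".
NAMED_GROUPS = {
    "googlebot", "googlebot-image", "googlebot-video", "googlebot-news",
    "bingbot", "msnbot", "msnbot-media", "bingpreview", "slurp",
    "yandexbot", "yandeximages", "duckduckbot", "baiduspider", "applebot",
    "facebookexternalhit", "twitterbot", "linkedinbot", "whatsapp",
    "telegrambot", "ahrefsbot", "semrushbot", "mj12bot", "dotbot",
}


def select_group(product_token: str) -> str:
    """Return the group a crawler would obey: its own if present, else '*'."""
    token = product_token.strip().lower()
    return token if token in NAMED_GROUPS else "*"


def catch_all_disallows_apply(product_token: str) -> bool:
    """True when the crawler falls through to '*' and its Disallow lines."""
    return select_group(product_token) == "*"
```

For example, `catch_all_disallows_apply("Googlebot")` is false, so `/wp-admin/` is not blocked for Googlebot by this file, while `catch_all_disallows_apply("GPTBot")` is true. If the private paths should be blocked for every crawler, one option is to repeat the `Disallow` lines in each named group or to drop the named groups entirely.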
Blocked and Allowed Paths
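| Blocked paths | /wp-admin/, /wp-includes/, /wp-login.php, /wp-register.php, /xmlrpc.php, /wp-cron.php, /wp-json/, /feed/, /comments/feed/, /?s=, /*?*replytocom=, /cdn-cgi/ |
|---|---|
| Allowed paths | / |
| Crawl-delay | 5.0 seconds |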
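To put the 5-second crawl delay in perspective, the arithmetic below estimates the ceiling it places on a crawler that honours the directive. This is a worked example only: the 2,000-URL site size is a hypothetical figure, not a measurement of 100cuci.io, and the function names are illustrative.

```python
import math

CRAWL_DELAY_SECONDS = 5.0  # value reported in the table above
SECONDS_PER_DAY = 24 * 60 * 60


def max_fetches_per_day(delay_seconds: float) -> int:
    """Upper bound on requests per day for a crawler that honours the delay."""
    return math.floor(SECONDS_PER_DAY / delay_seconds)


def hours_to_crawl(url_count: int, delay_seconds: float) -> float:
    """Minimum hours for one sequential pass over url_count URLs."""
    return url_count * delay_seconds / 3600


print(max_fetches_per_day(CRAWL_DELAY_SECONDS))  # 17280
print(round(hours_to_crawl(2000, CRAWL_DELAY_SECONDS), 2))  # 2.78 (hypothetical 2,000-URL site)
```

The directive sits inside the catch-all group, so it only reaches crawlers that fall through to `*` and that support `Crawl-delay` at all.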
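Sitemaps Detected
Sitemap URLs declared in robots.txt:
- https://100cuci.io/sitemap.xml
- https://100cuci.io/sitemap_index.xml
- https://100cuci.io/sitemap-posts.xml
- https://100cuci.io/sitemap-pages.xml
- https://100cuci.io/sitemap-categories.xml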
AI Crawler Policy
No explicit blocks were detected for common AI crawlers (GPTBot, ChatGPT-User, ClaudeBot, PerplexityBot, Google-Extended).
Issues Found
- Sitemap URL appears to be unreachable: https://100cuci.io/sitemap.xml
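The unreachable-sitemap finding can be re-checked independently. The sketch below pulls the `Sitemap:` lines out of a robots.txt body and reports every URL that does not answer with HTTP 200. It uses only the Python standard library. The helper names are hypothetical, the use of a HEAD request is an assumption (some servers reject HEAD, in which case a GET is the safer probe), and the `fetch` parameter exists so the check can be exercised without network access.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, Iterable, List


def extract_sitemaps(robots_txt: str) -> List[str]:
    """Collect the URLs from 'Sitemap:' lines (field name is case-insensitive)."""
    urls = []
    for line in robots_txt.splitlines():
        field, _, value = line.partition(":")
        if field.strip().lower() == "sitemap" and value.strip():
            urls.append(value.strip())
    return urls


def http_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status for a HEAD request, or 0 if the request fails."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # server answered, but with an error status
    except OSError:
        return 0  # DNS failure, refused connection, timeout, and similar


def unreachable(
    urls: Iterable[str], fetch: Callable[[str], int] = http_status
) -> Dict[str, int]:
    """Map each URL that did not answer 200 to the status it returned."""
    statuses = {url: fetch(url) for url in urls}
    return {url: code for url, code in statuses.items() if code != 200}
```

Calling `unreachable(extract_sitemaps(robots_txt))` on the file's contents returns an empty dictionary when every declared sitemap responds with 200. Any entry that remains is a candidate for removal from robots.txt or for a fix on the server.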
Recommendations
- Document your AI crawler policy explicitly in robots.txt so future bots know how to treat your content.
- Ensure important pages, CSS and JavaScript assets are crawlable so search engines can fully render your site.
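One way to act on the first recommendation is to generate an explicit group for each AI crawler named in this report. The sketch below renders such groups as robots.txt text. The function name `render_ai_policy` and the leading comment line are illustrative, and whether to allow or disallow these crawlers is a policy decision the report does not make.

```python
from typing import Iterable

# Crawler names taken from the "AI Crawler Policy" section of this report.
AI_CRAWLERS = ["GPTBot", "ChatGPT-User", "ClaudeBot", "PerplexityBot", "Google-Extended"]


def render_ai_policy(crawlers: Iterable[str], allow: bool) -> str:
    """Render one robots.txt group per crawler with an explicit Allow or Disallow."""
    directive = "Allow: /" if allow else "Disallow: /"
    groups = [f"User-agent: {name}\n{directive}" for name in crawlers]
    return "# AI crawler policy\n" + "\n\n".join(groups) + "\n"


print(render_ai_policy(AI_CRAWLERS, allow=False))
```

A crawler with its own group stops reading the catch-all group, so an explicit `Allow: /` group for an AI crawler would also lift the catch-all `Disallow` lines for that crawler unless they are repeated in its group.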