Robots.txt Tester

Inspect how 100cuci.io controls crawler access, blocked paths, sitemap references and AI crawler rules.

Preview

Score: 92

  • Sitemap URL appears to be unreachable: https://100cuci.io/sitemap.xml

Robots.txt Status

Robots.txt status: Present
Score: 92/100 (Strong)
Domain: 100cuci.io
Last analyzed: May 30, 2026

Robots.txt Content Preview

# ==============================================================
# robots.txt — 100cuci.io
# Last updated: April 2026
# Purpose: Maximum search engine visibility across all crawlers
# ==============================================================
 
 
# ── Google (main + image + video + news crawlers) ──────────────
User-agent: Googlebot
Allow: /
 
User-agent: Googlebot-Image
Allow: /
 
User-agent: Googlebot-Video
Allow: /
 
User-agent: Googlebot-News
Allow: /
 
 
# ── Bing & Microsoft ───────────────────────────────────────────
User-agent: Bingbot
Allow: /
 
User-agent: msnbot
Allow: /
 
User-agent: msnbot-media
Allow: /
 
User-agent: BingPreview
Allow: /
 
 
# ── Yahoo ──────────────────────────────────────────────────────
User-agent: Slurp
Allow: /
 
 
# ── Yandex ─────────────────────────────────────────────────────
User-agent: YandexBot
Allow: /
 
User-agent: YandexImages
Allow: /
 
 
# ── DuckDuckGo ─────────────────────────────────────────────────
User-agent: DuckDuckBot
Allow: /
 
 
# ── Baidu ──────────────────────────────────────────────────────
User-agent: Baiduspider
Allow: /
 
 
# ── Apple (Spotlight / Siri suggestions) ───────────────────────
User-agent: Applebot
Allow: /
 
 
# ── Facebook / Meta link preview ──────────────────────────────
User-agent: facebookexternalhit
Allow: /
 
 
# ── Twitter / X link preview ──────────────────────────────────
User-agent: Twitterbot
Allow: /
 
 
# ── LinkedIn link preview ─────────────────────────────────────
User-agent: LinkedInBot
Allow: /
 
 
# ── WhatsApp link preview ─────────────────────────────────────
User-agent: WhatsApp
Allow: /
 
 
# ── Telegram link preview ─────────────────────────────────────
User-agent: TelegramBot
Allow: /
 
 
# ── SEO audit tools (keeps your rankings data accurate) ───────
User-agent: AhrefsBot
Allow: /
 
User-agent: SemrushBot
Allow: /
 
User-agent: MJ12bot
Allow: /
 
User-agent: DotBot
Allow: /
 
 
# ── Catchall: allow every other legitimate crawler ────────────
User-agent: *
Allow: /
 
# Block only these private / system paths from ALL bots
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-login.php
Disallow: /wp-register.php
Disallow: /xmlrpc.php
Disallow: /wp-cron.php
Disallow: /wp-json/
Disallow: /feed/
Disallow: /comments/feed/
Disallow: /?s=                  # internal search result pages
Disallow: /*?*replytocom=        # comment reply URLs (duplicate content)
Disallow: /cdn-cgi/              # Cloudflare internal endpoints
 
 
# ── Crawl-delay hint for less aggressive bots ─────────────────
# (Googlebot & Bingbot ignore this — they self-regulate)
Crawl-delay: 5
 
 
# ── Sitemap locations ─────────────────────────────────────────
Sitemap: https://100cuci.io/sitemap.xml
Sitemap: https://100cuci.io/sitemap_index.xml
Sitemap: https://100cuci.io/sitemap-posts.xml
Sitemap: https://100cuci.io/sitemap-pages.xml
Sitemap: https://100cuci.io/sitemap-categories.xml

User-agent Rules

User-agent(s)         Allowed paths   Disallowed paths
googlebot             /               No explicit Disallow rules.
googlebot-image       /               No explicit Disallow rules.
googlebot-video       /               No explicit Disallow rules.
googlebot-news        /               No explicit Disallow rules.
bingbot               /               No explicit Disallow rules.
msnbot                /               No explicit Disallow rules.
msnbot-media          /               No explicit Disallow rules.
bingpreview           /               No explicit Disallow rules.
slurp                 /               No explicit Disallow rules.
yandexbot             /               No explicit Disallow rules.
yandeximages          /               No explicit Disallow rules.
duckduckbot           /               No explicit Disallow rules.
baiduspider           /               No explicit Disallow rules.
applebot              /               No explicit Disallow rules.
facebookexternalhit   /               No explicit Disallow rules.
twitterbot            /               No explicit Disallow rules.
linkedinbot           /               No explicit Disallow rules.
whatsapp              /               No explicit Disallow rules.
telegrambot           /               No explicit Disallow rules.
ahrefsbot             /               No explicit Disallow rules.
semrushbot            /               No explicit Disallow rules.
mj12bot               /               No explicit Disallow rules.
dotbot                /               No explicit Disallow rules.
*                     /               /wp-admin/
                                      /wp-includes/
                                      /wp-login.php
                                      /wp-register.php
                                      /xmlrpc.php
                                      /wp-cron.php
                                      /wp-json/
                                      /feed/
                                      /comments/feed/
                                      /?s= (internal search result pages)
                                      /*?*replytocom= (comment reply URLs, duplicate content)
                                      /cdn-cgi/ (Cloudflare internal endpoints)
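
The table above lists Disallow rules only for the catch-all group. Under the Robots Exclusion Protocol (RFC 9309), a crawler that finds a group naming it follows that group and does not also apply the `*` group, so the Disallow paths here would likely not bind the named crawlers. The following is a minimal sketch of that selection logic, not the tester's actual implementation. The function names (parse_groups, rules_for, is_allowed) and the two-group SAMPLE file are hypothetical, agent matching is simplified to an exact case-insensitive name match, and paths are matched by plain prefix only.

```python
# Hypothetical two-group file that mirrors the structure of the report above.
SAMPLE = """\
User-agent: Googlebot
Allow: /

User-agent: *
Allow: /
Disallow: /wp-admin/
"""


def parse_groups(text):
    """Split robots.txt text into (agents, rules) groups."""
    groups, agents, rules = [], [], []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop inline comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if rules:  # a user-agent line after rules starts a new group
                groups.append((agents, rules))
                agents, rules = [], []
            agents.append(value.lower())
        elif field in ("allow", "disallow") and agents:
            rules.append((field, value))
    if agents:
        groups.append((agents, rules))
    return groups


def rules_for(groups, crawler):
    """Use the groups naming the crawler; fall back to '*' only if none do."""
    crawler = crawler.lower()
    matching = [rules for agents, rules in groups if crawler in agents]
    if not matching:
        matching = [rules for agents, rules in groups if "*" in agents]
    return [rule for rules in matching for rule in rules]


def is_allowed(rules, path):
    """Longest matching prefix wins; Allow wins ties; no match means allowed."""
    best_len, allowed = -1, True
    for kind, pattern in rules:
        if pattern and path.startswith(pattern):
            longer = len(pattern) > best_len
            tie_for_allow = len(pattern) == best_len and kind == "allow"
            if longer or tie_for_allow:
                best_len, allowed = len(pattern), kind == "allow"
    return allowed


groups = parse_groups(SAMPLE)
print(is_allowed(rules_for(groups, "Googlebot"), "/wp-admin/"))     # True
print(is_allowed(rules_for(groups, "SomeOtherBot"), "/wp-admin/"))  # False
```

If the intent is to keep every crawler out of the private paths, one option is to repeat the Disallow lines inside each named group, or to drop the named groups and rely on the catch-all group alone.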

Blocked and Allowed Paths

Blocked paths
  • /wp-admin/
  • /wp-includes/
  • /wp-login.php
  • /wp-register.php
  • /xmlrpc.php
  • /wp-cron.php
  • /wp-json/
  • /feed/
  • /comments/feed/
  • /?s= (internal search result pages)
  • /*?*replytocom= (comment reply URLs, duplicate content)
  • /cdn-cgi/ (Cloudflare internal endpoints)
Allowed paths
  • /
Crawl-delay: 5 seconds
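
Two of the blocked paths rely on pattern syntax rather than a plain directory prefix. As a rough illustration of how such patterns are commonly interpreted (`*` matches any run of characters, a trailing `$` anchors the end, and matching starts at the beginning of the path plus query string), here is a small sketch. The helper names pattern_to_regex and matches are hypothetical, and individual crawlers may differ in the details.

```python
import re


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regular expression."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape every literal piece and let each '*' match any run of characters.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def matches(pattern, path):
    """Return True if the pattern matches from the start of the path."""
    return pattern_to_regex(pattern).match(path) is not None


print(matches("/*?*replytocom=", "/post/?replytocom=12"))  # True
print(matches("/?s=", "/?s=shoes"))                        # True
print(matches("/?s=", "/blog/?s=shoes"))                   # False
```

The last line shows a consequence worth checking: `/?s=` only covers search URLs at the site root, so a search URL under a subdirectory would need a pattern such as `/*?s=`.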

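Sitemaps Detected

  • https://100cuci.io/sitemap.xml
  • https://100cuci.io/sitemap_index.xml
  • https://100cuci.io/sitemap-posts.xml
  • https://100cuci.io/sitemap-pages.xml
  • https://100cuci.io/sitemap-categories.xml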

AI Crawler Policy

No explicit blocks were detected for common AI crawlers (GPTBot, ChatGPT-User, ClaudeBot, PerplexityBot, Google-Extended).
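
A check of this kind can be approximated by scanning the User-agent lines for the crawler names listed above. The sketch below is an assumption about how such a check might work, not the tester's own code. The names AI_CRAWLERS, declared_agents, and ai_policy_report are hypothetical, and the sketch only reports whether a crawler is named at all, not whether its group allows or blocks anything.

```python
# Crawler names taken from the report text above.
AI_CRAWLERS = ["GPTBot", "ChatGPT-User", "ClaudeBot", "PerplexityBot", "Google-Extended"]


def declared_agents(text):
    """Collect every User-agent value in the file, lowercased."""
    agents = set()
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop inline comments
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "user-agent":
            agents.add(value.strip().lower())
    return agents


def ai_policy_report(text):
    """Map each AI crawler name to True if the file has a group naming it."""
    agents = declared_agents(text)
    return {name: name.lower() in agents for name in AI_CRAWLERS}


# Hypothetical input: one named search crawler plus the catch-all group.
sample = "User-agent: Googlebot\nAllow: /\n\nUser-agent: *\nAllow: /\n"
print(ai_policy_report(sample))
```

For a file shaped like the sample, every entry comes back False, meaning those crawlers would fall under the catch-all group.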

Issues Found

  • Sitemap URL appears to be unreachable: https://100cuci.io/sitemap.xml
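
To re-check this finding, and the other Sitemap entries in the file, one option is to request each URL and record the HTTP status. The sketch below uses only the Python standard library. The function names (sitemap_urls, http_status, check_sitemaps) are hypothetical, the HEAD method is an assumption (some servers answer HEAD differently from GET), and the status codes in the demo come from a stub, not from the live site.

```python
import urllib.error
import urllib.request


def sitemap_urls(text):
    """Return the URL of every 'Sitemap:' line in robots.txt text."""
    urls = []
    for raw in text.splitlines():
        field, sep, value = raw.partition(":")
        if sep and field.strip().lower() == "sitemap":
            urls.append(value.strip())
    return urls


def http_status(url, timeout=10):
    """Return the HTTP status for a HEAD request, or None if unreachable."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, but with an error status
    except OSError:
        return None  # DNS failure, refused connection, timeout, and so on


def check_sitemaps(text, fetch=http_status):
    """Map each declared sitemap URL to the status returned by fetch."""
    return {url: fetch(url) for url in sitemap_urls(text)}


robots = (
    "Sitemap: https://100cuci.io/sitemap.xml\n"
    "Sitemap: https://100cuci.io/sitemap_index.xml\n"
)
# Stubbed fetch with made-up statuses so the demo runs without network access.
stub = {"https://100cuci.io/sitemap.xml": 404}
print(check_sitemaps(robots, fetch=lambda url: stub.get(url, 200)))
```

Calling check_sitemaps(robots) without the fetch argument performs real requests. Any entry that returns None or a 4xx or 5xx status is a candidate for removal from robots.txt or for a fix on the server.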

Recommendations

  • State your AI crawler policy explicitly in robots.txt, for example with dedicated User-agent groups for GPTBot, ClaudeBot and similar crawlers, so they do not simply fall back to the catch-all rules.
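  • Ensure important pages, CSS and JavaScript assets are crawlable so search engines can fully render your site. Note that Disallow: /wp-includes/ in the catch-all group can hide core WordPress scripts and styles from crawlers that use that group.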
