Robots.txt Tester

Inspect how richproxy.app controls crawler access, blocked paths, sitemap references and AI crawler rules.

Robots.txt Status: Present
Score: 92/100 (Strong)

Domain: richproxy.app

robots.txt URL: https://richproxy.app/robots.txt

Last analyzed: May 24, 2026

Robots.txt Content Preview

# As a condition of accessing this website, you agree to abide by the following
# content signals:

# (a)  If a Content-Signal = yes, you may collect content for the corresponding
#      use.
# (b)  If a Content-Signal = no, you may not collect content for the
#      corresponding use.
# (c)  If the website operator does not include a Content-Signal for a
#      corresponding use, the website operator neither grants nor restricts
#      permission via Content-Signal with respect to the corresponding use.

# The content signals and their meanings are:

# search:   building a search index and providing search results (e.g., returning
#           hyperlinks and short excerpts from your website's contents). Search does not
#           include providing AI-generated search summaries.
# ai-input: inputting content into one or more AI models (e.g., retrieval
#           augmented generation, grounding, or other real-time taking of content for
#           generative AI search answers).
# ai-train: training or fine-tuning AI models.

# ANY RESTRICTIONS EXPRESSED VIA CONTENT SIGNALS ARE EXPRESS RESERVATIONS OF
# RIGHTS UNDER ARTICLE 4 OF THE EUROPEAN UNION DIRECTIVE 2019/790 ON COPYRIGHT
# AND RELATED RIGHTS IN THE DIGITAL SINGLE MARKET.

# BEGIN Cloudflare Managed content

User-agent: *
Content-Signal: search=yes,ai-train=no
Allow: /

User-agent: Amazonbot
Disallow: /

User-agent: Applebot-Extended
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: CloudflareBrowserRenderingCrawler
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: GPTBot
Disallow: /

User-agent: meta-externalagent
Disallow: /

# END Cloudflare Managed Content

# Rich Proxy - Optimized robots.txt
# For help, see: https://developers.google.com/search/docs/crawling-indexing/robots/intro

User-agent: *
Allow: /

# EXCLUDE SYSTEM AND AUTOMATION FILES
Disallow: /*.py$
Disallow: /*.json$
Disallow: /*.log$
Disallow: /*.bak$
Disallow: /*.txt$
Allow: /robots.txt
Allow: /llms.txt

# EXCLUDE SCRIPTS AND INTERNAL DIRECTORIES
Disallow: /__pycache__/
Disallow: /node_modules/
Disallow: /scripts/
Disallow: /tmp/

# Keep public rendering assets crawlable so search engines can render pages correctly.
Allow: /assets/

# EXCLUDE DEBUG & BACKUP HTML FILES
Disallow: /*_debug.html
Disallow: /*_test.html
Disallow: /*_final_v*.html
Disallow: /index_debug.html
Disallow: /light-theme-test.html
Disallow: /404.html

# PROTECT SENSITIVE FILES
Disallow: /package.json
Disallow: /package-lock.json
Disallow: /webhook.php

# SET SITEMAP PATH
Sitemap: https://richproxy.app/sitemap.xml

# PREVENT AI CRAWLERS FROM SCRAPING DATA (Optional, but recommended for niche data businesses)
# User-agent: GPTBot
# Disallow: /

User-agent Rules

User-agent(s)	Allowed paths	Disallowed paths
*	/	No explicit Disallow rules.
amazonbot	No explicit Allow rules.	/
applebot-extended	No explicit Allow rules.	/
bytespider	No explicit Allow rules.	/
ccbot	No explicit Allow rules.	/
claudebot	No explicit Allow rules.	/
cloudflarebrowserrenderingcrawler	No explicit Allow rules.	/
google-extended	No explicit Allow rules.	/
gptbot	No explicit Allow rules.	/
meta-externalagent	No explicit Allow rules.	/
*	/ /robots.txt /llms.txt /assets/	/*.py$ /*.json$ /*.log$ /*.bak$ /*.txt$ /__pycache__/ /node_modules/ /scripts/ /tmp/ /*_debug.html /*_test.html /*_final_v*.html /index_debug.html /light-theme-test.html /404.html /package.json /package-lock.json /webhook.php
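
The table lists `*` twice because the file contains two `User-agent: *` groups. RFC 9309 says that groups matching the same user-agent are combined into one. The following is a minimal sketch of that merge, not the tester's actual logic; the function name `merged_groups` and the shortened `EXCERPT` are illustrative assumptions, and only `Allow` and `Disallow` lines are collected.

```python
from collections import defaultdict

# Shortened excerpt of the file above (assumption: trimmed for illustration).
EXCERPT = """\
User-agent: *
Content-Signal: search=yes,ai-train=no
Allow: /

User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
Disallow: /tmp/
Allow: /assets/
"""

def merged_groups(robots_text):
    """Collect Allow/Disallow rules per user-agent, merging repeated groups."""
    groups = defaultdict(list)
    current_agents = []
    in_rules = False  # True once the current group has seen a rule line
    for raw in robots_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        key, value = (part.strip() for part in line.split(":", 1))
        key = key.lower()
        if key == "user-agent":
            if in_rules:
                # A user-agent line after rules starts a new group.
                current_agents = []
                in_rules = False
            current_agents.append(value.lower())
        elif key in ("allow", "disallow"):
            in_rules = True
            for agent in current_agents:
                groups[agent].append((key, value))
        # Other keys (Content-Signal, Sitemap, ...) are ignored in this sketch.
    return dict(groups)

groups = merged_groups(EXCERPT)
print(groups["*"])
print(groups["gptbot"])
```

With this merge, both `*` rows of the table describe a single effective rule set for crawlers that have no dedicated group.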

Blocked and Allowed Paths

Blocked paths	/ /*.py$ /*.json$ /*.log$ /*.bak$ /*.txt$ /__pycache__/ /node_modules/ /scripts/ /tmp/ /*_debug.html /*_test.html /*_final_v*.html /index_debug.html /light-theme-test.html /404.html /package.json /package-lock.json /webhook.php
Allowed paths	/ /robots.txt /llms.txt /assets/
Crawl-delay	No Crawl-delay directive detected.
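
The blocked and allowed lists overlap: `/*.txt$` is blocked while `/robots.txt` and `/llms.txt` are allowed. Under the precedence described in RFC 9309, the longest matching pattern decides and `Allow` wins a tie. The sketch below illustrates that rule with `*` and `$` wildcards. The names `pattern_to_regex`, `is_allowed` and `RULES` are hypothetical, the rule list is a subset of the file, and percent-encoding is not handled.

```python
import re

# Subset of the wildcard group's rules (assumption: trimmed for illustration).
RULES = [
    ("allow", "/"),
    ("disallow", "/*.txt$"),
    ("allow", "/robots.txt"),
    ("allow", "/llms.txt"),
    ("disallow", "/*_final_v*.html"),
    ("disallow", "/tmp/"),
]

def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern: '*' is any run, trailing '$' anchors the end."""
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(regex + ("$" if anchored else ""))

def is_allowed(path, rules):
    """Return True if the longest matching rule is an Allow (or nothing matches)."""
    best = None  # (pattern length, is_allow); on equal length, Allow sorts higher
    for directive, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            candidate = (len(pattern), directive == "allow")
            if best is None or candidate > best:
                best = candidate
    return True if best is None else best[1]

for path in ["/robots.txt", "/notes.txt", "/index_final_v2.html", "/pricing"]:
    print(path, is_allowed(path, RULES))
```

Here `/robots.txt` stays crawlable because the literal `Allow: /robots.txt` is longer than `Disallow: /*.txt$`, while `/notes.txt` is blocked.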

Sitemaps Detected

https://richproxy.app/sitemap.xml

AI Crawler Policy

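robots.txt blocks the following AI crawlers site-wide with Disallow: /: Amazonbot, Applebot-Extended, Bytespider, CCBot, ClaudeBot, Google-Extended, GPTBot and meta-externalagent. ChatGPT-User and PerplexityBot have no dedicated group, so the User-agent: * rules apply to them.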
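
A site-wide block like this can be checked locally with Python's standard `urllib.robotparser`. This is a minimal sketch, not the tester's implementation: the helper `blocked_agents` and the shortened `ROBOTS_EXCERPT` are assumptions for illustration. Note that `urllib.robotparser` uses plain prefix matching and does not interpret `*` or `$` inside paths, so it is only suitable here for whole-site checks against `/`.

```python
from urllib.robotparser import RobotFileParser

# Shortened excerpt of the file above (assumption: trimmed for illustration).
ROBOTS_EXCERPT = """\
User-agent: *
Content-Signal: search=yes,ai-train=no
Allow: /

User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /
"""

def blocked_agents(robots_text, agents, path="/"):
    """Return the agents that may not fetch `path` according to robots_text."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, path)]

result = blocked_agents(ROBOTS_EXCERPT, ["GPTBot", "ClaudeBot", "Googlebot"])
print(result)
```

Agents without a dedicated group, such as Googlebot in this excerpt, fall back to the `User-agent: *` group and remain allowed.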

Issues Found

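Nine named user-agents (Amazonbot, Applebot-Extended, Bytespider, CCBot, ClaudeBot, CloudflareBrowserRenderingCrawler, Google-Extended, GPTBot and meta-externalagent) have Disallow: /, which blocks the entire site for those crawlers. The wildcard group (User-agent: *) is not blocked site-wide.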

Recommendations

Avoid blocking the entire site (Disallow: /) for any crawler you want to index your content; restrict only sensitive or low-value paths instead.
Review your AI crawler policy for GPTBot, ChatGPT-User, ClaudeBot, PerplexityBot and Google-Extended to ensure it matches your content strategy.
Ensure important pages, CSS and JavaScript assets are crawlable so search engines can fully render your site.