# ============================================================================
# robots.txt - Zain Ul Abideen Portfolio
# https://zainulabideen-portfolio.netlify.app
# ============================================================================

# ─── Sitemap (not tied to any user-agent group; listed first for visibility) ─
Sitemap: https://zainulabideen-portfolio.netlify.app/sitemap.xml

# ─── Default: allow everything for all crawlers ──────────────────────────────
User-agent: *
Allow: /
Disallow: /api/
Disallow: /*?*utm_
Disallow: /*?*ref=
Disallow: /*?*fbclid=
Disallow: /*?*gclid=
# ─── Explicit allowlists for public media (helps page-rendering bots) ────────
Allow: /images/
Allow: /models/
Allow: /assets/
Allow: /poppins.woff2
Allow: /sitemap.xml
Allow: /robots.txt
Allow: /site.webmanifest
Allow: /humans.txt
Allow: /.well-known/security.txt

# ============================================================================
# Search-engine crawlers: priority bots
# Note: a crawler that matches a named group below follows only that group,
# so the Disallow rules in the default group above do not apply to it.
# ============================================================================

# Google
User-agent: Googlebot
Allow: /

User-agent: Googlebot-Image
Allow: /images/
Allow: /assets/

User-agent: Googlebot-Mobile
Allow: /

User-agent: Googlebot-News
Allow: /

User-agent: AdsBot-Google
Allow: /

User-agent: Mediapartners-Google
Allow: /

# Bing
User-agent: Bingbot
Allow: /

User-agent: AdIdxBot
Allow: /

# Yahoo / Slurp
User-agent: Slurp
Allow: /

# DuckDuckGo
User-agent: DuckDuckBot
Allow: /

# Yandex
User-agent: YandexBot
Allow: /

# Baidu
User-agent: Baiduspider
Allow: /

# Naver
User-agent: Yeti
Allow: /

# Seznam
User-agent: SeznamBot
Allow: /

# Apple Spotlight / Siri
User-agent: Applebot
Allow: /

# ============================================================================
# Social / preview crawlers (for OG cards, rich previews)
# ============================================================================

User-agent: LinkedInBot
Allow: /

User-agent: facebookexternalhit
Allow: /

User-agent: Facebot
Allow: /

User-agent: Twitterbot
Allow: /

User-agent: WhatsApp
Allow: /

User-agent: TelegramBot
Allow: /

User-agent: Discordbot
Allow: /

User-agent: SkypeUriPreview
Allow: /

User-agent: Slackbot
Allow: /

User-agent: Slackbot-LinkExpanding
Allow: /

User-agent: Pinterest
Allow: /

User-agent: redditbot
Allow: /

# ============================================================================
# AI / LLM crawlers: explicitly allowed (helps surface portfolio in AI answers)
# ============================================================================

User-agent: GPTBot
Allow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: Claude-Web
Allow: /

User-agent: anthropic-ai
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Perplexity-User
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: Meta-ExternalAgent
Allow: /

User-agent: Meta-ExternalFetcher
Allow: /

User-agent: cohere-ai
Allow: /

User-agent: Bytespider
Allow: /

User-agent: Amazonbot
Allow: /

User-agent: Diffbot
Allow: /

User-agent: DuckAssistBot
Allow: /

User-agent: YouBot
Allow: /

User-agent: Kagibot
Allow: /

# ============================================================================
# Archive crawlers
# ============================================================================

User-agent: ia_archiver
Allow: /

User-agent: archive.org_bot
Allow: /

# ============================================================================
# Throttle known aggressive / low-value scrapers
# (Crawl-delay slows these bots but does not block them; it is a non-standard
# directive, so honoring it is up to each crawler.)
# ============================================================================

User-agent: AhrefsBot
Crawl-delay: 10

User-agent: SemrushBot
Crawl-delay: 10

User-agent: MJ12bot
Crawl-delay: 10

User-agent: DotBot
Crawl-delay: 10