# robots.txt for media.schoolreadinglist.co.uk (Cloudflare R2)

# Allow all major search engine crawlers
User-agent: Googlebot
Disallow:

User-agent: Bingbot
Disallow:

User-agent: DuckDuckBot
Disallow:

User-agent: Applebot
Disallow:

User-agent: CCBot
Disallow:

User-agent: GPTBot
Disallow:

User-agent: anthropic-ai
Disallow:

User-agent: ClaudeBot
Disallow:

User-agent: Sogou
Disallow:

User-agent: Baiduspider
Disallow:

User-agent: Yeti
Disallow:

# Allow all major social media bots
User-agent: Twitterbot
Disallow:

User-agent: facebookexternalhit
Disallow:

User-agent: LinkedInBot
Disallow:

User-agent: Pinterestbot
Disallow:

User-agent: Slackbot
Disallow:

User-agent: PerplexityBot
Disallow:

# Default rule for all others (allow)
User-agent: *
Disallow:

# Sitemaps
Sitemap: https://media.schoolreadinglist.co.uk/sitemap.xml
Sitemap: https://media.schoolreadinglist.co.uk/free-reading-resources/sitemap.xml
Sitemap: https://media.schoolreadinglist.co.uk/files-sitemap.xml

# Block known malicious, scraping, or non-beneficial bots
User-agent: Nuclei
User-agent: WikiDo
User-agent: Riddler
User-agent: PetalBot
User-agent: Zoominfobot
User-agent: Go-http-client
User-agent: Node/simplecrawler
User-agent: CazoodleBot
User-agent: dotbot/1.0
User-agent: Gigabot
User-agent: Barkrowler
User-agent: BLEXBot
User-agent: magpie-crawler
User-agent: MJ12bot
User-agent: AhrefsBot
User-agent: SemrushBot
User-agent: SeznamBot
User-agent: Yandex
User-agent: Exabot
User-agent: Qwantify
User-agent: trendictionbot
User-agent: spbot
User-agent: VelenPublicWebCrawler
Disallow: /