robots.txt

robots.txt is a plain-text file placed in the site root (e.g. https://example.com/robots.txt) that tells search engine crawlers which URL paths they may crawl and which to skip.

robots.txt blocks crawling ONLY, not indexing. A disallowed page can still enter the index if other sites link to it; it then appears in results without a description snippet, since the crawler never fetched the page. To exclude a page from the index, use a noindex directive instead. Note that noindex only works if the page is crawlable: if robots.txt also disallows the page, the crawler never sees the directive.
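A noindex directive can be expressed as a meta tag in the page markup, or as an `X-Robots-Tag: noindex` HTTP response header for non-HTML resources such as PDFs. A minimal sketch of the meta-tag form:

```html
<!-- Inside the page's <head>: tells compliant crawlers not to index this page -->
<meta name="robots" content="noindex">
```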

User-agent: *
Allow: /
Disallow: /api/
Disallow: /dashboard/
Sitemap: https://example.com/sitemap.xml
