Tabelog Robots.txt, Cracked (May 2026)

| Want to crawl? | Allowed? |
|----------------|----------|
| Restaurant detail pages | ✅ (implicitly, via no explicit block) |
| Search results | ❌ |
| Review pages | ❌ |
| Photo galleries | ❌ |
| Regional index pages | ❌ |
| Ranking lists | ❌ |

For a site built on user contributions and openness, Tabelog's robots.txt is remarkably closed. But that's the point. In a market where restaurant data is a strategic asset (competitors include Google Maps, Retty, and Gurunavi), a robots.txt becomes a legal-engineering hybrid: "We've told you not to crawl these paths. If you do, you're violating our terms and potentially the Unfair Competition Prevention Act of Japan."

Final take

If you're building a crawler for Tabelog, don't bother negotiating with robots.txt: it's not a negotiation. It's a warning. Real access requires official APIs or commercial partnerships. The robots.txt is just the polite "Keep Out" sign before the electric fence.
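For what it's worth, this is what mechanically honoring those rules looks like with Python's standard-library `urllib.robotparser`. The robots.txt content in the sketch is an illustrative reconstruction from the table above, not a copy of the real file, and every path and URL in it is hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Illustrative reconstruction based only on the categories discussed above;
# NOT a verbatim copy of Tabelog's robots.txt. All paths are hypothetical.
ILLUSTRATIVE_ROBOTS_TXT = """\
User-agent: *
Disallow: /search/
Disallow: /reviews/
Disallow: /photos/
Disallow: /rankings/
Disallow: /tokyo/
Disallow: /osaka/
Disallow: /kyoto/
"""

rp = RobotFileParser()
rp.parse(ILLUSTRATIVE_ROBOTS_TXT.splitlines())

# In this sketch the detail URL sits under no disallowed prefix,
# mirroring the "implicitly allowed" row in the table.
checks = [
    ("restaurant detail page (hypothetical URL)", "https://tabelog.com/aichi/A2301/A230101/23000001/"),
    ("regional index page", "https://tabelog.com/tokyo/"),
    ("review page (hypothetical URL)", "https://tabelog.com/reviews/13001234/"),
]
for label, url in checks:
    verdict = "allowed" if rp.can_fetch("*", url) else "blocked"
    print(f"{verdict:>7}  {label}: {url}")
```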

The list of `Disallow: /tokyo/`, `/osaka/`, `/kyoto/`, etc., is unusual. Most sites want their city landing pages indexed; Tabelog explicitly blocks them. Why? Possibly because those pages are thin, auto-generated, or contain internal navigation that leads to disallowed content. More likely, Tabelog prefers to control how its regional authority is presented: via its own sitemap and internal linking, not via open-ended crawler access.
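If you would rather inspect the current Disallow list yourself than trust a summary, a few lines of standard-library Python are enough. This is only a sketch: it fetches the live file and prints whatever Disallow directives it contains on the day you run it.

```python
import urllib.request

ROBOTS_URL = "https://tabelog.com/robots.txt"

# Fetch the live robots.txt and print every Disallow directive it contains.
# The output depends entirely on what Tabelog serves at the time you run this.
with urllib.request.urlopen(ROBOTS_URL, timeout=10) as resp:
    robots_txt = resp.read().decode("utf-8", errors="replace")

for line in robots_txt.splitlines():
    if line.strip().lower().startswith("disallow:"):
        print(line.strip())
```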

A surprising omission: a robots.txt usually points crawlers to a sitemap via a `Sitemap:` directive, but Tabelog's doesn't (a quick way to verify this is sketched at the end of this section). Either they rely on sitemaps submitted through Google Search Console, or they deliberately avoid publicizing their URL structure. Given the number of blocked paths, the latter feels intentional.

Defensive design

The subtext: Tabelog's robots.txt is not about politeness. It's about asymmetry. They want Google to index their restaurant detail pages (the core content users need), but not the scaffolding that makes those pages discoverable in bulk.
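Circling back to the missing `Sitemap:` line: Python's `RobotFileParser.site_maps()` (available since Python 3.8) returns the URLs from any `Sitemap:` directives, or `None` when there are none. A minimal sketch for checking the live file, assuming it is reachable when you run this:

```python
from urllib.robotparser import RobotFileParser

# site_maps() (Python 3.8+) returns the URLs listed in Sitemap: directives,
# or None when the robots.txt declares none -- the omission discussed above.
rp = RobotFileParser("https://tabelog.com/robots.txt")
rp.read()

sitemaps = rp.site_maps()
if sitemaps is None:
    print("No Sitemap: directive found in robots.txt")
else:
    for url in sitemaps:
        print("Sitemap:", url)
```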