What is an orphan page in SEO?

An orphan page is a page on your website that has no internal links pointing to it from any other page on the same site. Because search engine crawlers navigate sites by following links, a page with no internal links is difficult for Googlebot to discover and recrawl regularly. Orphan pages also receive no PageRank from your site's internal link structure, which limits their ability to rank for competitive keywords.

Do orphan pages hurt SEO?

Yes, in two ways. First, they receive no internal PageRank - they are isolated from your site's link equity distribution, which limits their ranking potential. Second, they are crawled infrequently or inconsistently, meaning updates and new content on those pages take longer to be discovered and indexed by Google. On a site with many valuable orphan pages, fixing the internal linking structure is one of the highest-return SEO improvements available.

How do I find orphan pages without paid tools?

The manual method is to compare your sitemap URL list against a crawl of your site. Export all URLs from your sitemap.xml, then run a crawl with Screaming Frog (free up to 500 URLs) starting from your homepage. Any URL in the sitemap that does not appear in the crawl results is a candidate orphan page, provided the crawl completed; on a site larger than the 500-URL free limit, the crawl stops early and missing URLs may simply not have been reached. Also check Google Search Console - the Indexing -> Pages report shows indexed pages that are not in your sitemap, some of which may be orphans discovered through external links.

Is it okay to have some orphan pages?

Some pages are legitimately standalone - privacy policy, terms of service, some legal or administrative pages that are linked from a footer rather than contextual content. Whether these appear as orphans depends on whether your crawler picks up footer links. For content pages, orphan status is rarely intentional and usually indicates a publishing workflow gap. A small number of orphans on a large site is normal; a high proportion (more than 15-20% of indexed pages) indicates a structural problem worth addressing.

What is the fastest way to fix a large number of orphan pages?

The fastest approach is to add links to orphan pages from your most relevant hub or pillar pages. If you have a blog index or category page that should logically include these posts, add them there. For topic clusters, find the pillar page for each orphan's topic and add a contextual link. Avoid adding links in bulk to an artificial 'all posts' page just to eliminate orphan status - the links should be contextually relevant to pass meaningful equity and provide value to readers.

How to Find Orphan Pages on Your Website

March 10, 2026 • 13 min read

#seo #orphan-pages #internal-linking #technical-seo #how-to

An orphan page is any page on your website that has no internal links pointing to it.

It might exist in your sitemap. It might even be indexed by Google. But because nothing on your site links to it, crawlers can only reach it through the sitemap, an external link, or the direct URL - there is no navigable path from the rest of your site.

For SEO, this creates two problems. First, orphan pages receive no internal PageRank. Every page that earns authority through backlinks and content quality can pass that authority to other pages through internal links - but not to orphans. They sit isolated, disconnected from your site’s link graph.

Second, Googlebot relies heavily on internal links to discover and recrawl content. A page with no internal links will be recrawled far less frequently than one embedded in your site’s structure. New content on that page takes longer to be indexed. Updates take longer to surface.

The good news is that orphan pages are findable, and fixing them is mostly a matter of adding the right internal links.

What Causes Orphan Pages

They accumulate in predictable ways:

Content published without being linked from anywhere. A blog post goes live, gets submitted to the sitemap, but no one added a link to it from a related post, a relevant guide, or a category page. This is the most common cause on content-heavy sites.

Pages created for specific campaigns that were never removed. A landing page for a seasonal promotion, an event page, a PPC-specific page. The campaign ends and the links pointing to the page are removed, but the page itself stays live and is never redirected or deleted.

Old content left behind during a redesign or navigation update. When nav menus are rebuilt, some pages that were previously reachable through the old navigation fall off. They stay in the sitemap but nothing in the new structure links to them.

Pagination pages and filtered views. Page 2, page 3, filtered search result pages - these are often generated dynamically but not explicitly linked from anywhere.

Tags and categories with very few posts. A tag page with one post, no description, and no links pointing to it is essentially an orphan that adds little value.

Step 1: Crawl Your Site to Map Reachable Pages

Start by running a full-site crawl that begins from your homepage and follows every internal link it finds. The output is the list of pages reachable through your internal link structure.

Using redCacti: Add your site and run a crawl.

The pages report shows every URL discovered through internal links. Export this as a CSV.

Using Screaming Frog (free up to 500 URLs): Enter your homepage URL and run a standard crawl. Export all crawled URLs from the Internal tab.

What this crawl represents: This approximates what Googlebot can reach by following links from your homepage. Every URL in this list, other than the homepage itself, has at least one internal link pointing to it. Every URL NOT in this list is a candidate orphan.
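To make "reachable through internal links" concrete, here is a minimal sketch of the traversal a crawler performs, written as a breadth-first search over an in-memory link graph. The `site_links` dictionary and its paths are hypothetical; a real crawler would build this graph by fetching each page and extracting its links.

```python
from collections import deque


def reachable_pages(links, start):
    """Return every URL reachable from `start` by following internal links.

    `links` maps each URL to the list of internal URLs it links to.
    """
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


# Hypothetical link graph: /old-landing links out, but nothing links to it.
site_links = {
    "/": ["/blog", "/pricing"],
    "/blog": ["/blog/post-1", "/"],
    "/blog/post-1": ["/pricing"],
    "/old-landing": ["/pricing"],
}

crawled = reachable_pages(site_links, "/")
print(sorted(crawled))  # ['/', '/blog', '/blog/post-1', '/pricing']
```

Note that `/old-landing` never appears in the result even though it has outgoing links: only inbound links make a page reachable.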

Step 2: Export Your Sitemap URL List

Your sitemap is your declared inventory of pages - the URLs you have told Google and other crawlers about.

Getting the sitemap URL list:

Most sites publish their sitemap at yoursite.com/sitemap.xml. Open this in a browser and you will see either the list of page URLs or a sitemap index that points to multiple child sitemaps, each of which has to be opened in turn.

For a site with a large sitemap, use a tool to extract all URLs:

# Extract all URLs from a sitemap using curl and grep
# (-P needs GNU grep; the stock macOS/BSD grep does not support it)
curl -sL https://yoursite.com/sitemap.xml | grep -oP '(?<=<loc>)[^<]+'

Or paste the sitemap URL into a sitemap parser tool to get a clean list.

Export this list to a separate column in the same spreadsheet as your crawl results.
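Where GNU grep is not available, or the file turns out to be a sitemap index, the same extraction can be sketched with Python's standard library. This is an illustration under assumptions, not a full sitemap client: `parse_sitemap` is a hypothetical helper, the sample XML is made up, and fetching the file (and any child sitemaps) is left out.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def parse_sitemap(xml_text):
    """Return (kind, urls) where kind is 'urlset' or 'sitemapindex'.

    For a 'urlset' the URLs are pages; for a 'sitemapindex' they are
    child sitemaps that still need to be fetched and parsed.
    """
    root = ET.fromstring(xml_text)
    kind = root.tag.replace(SITEMAP_NS, "")
    urls = [loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc") if loc.text]
    return kind, urls


sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://yoursite.com/</loc></url>
  <url><loc>https://yoursite.com/blog/post-1</loc></url>
</urlset>"""

kind, urls = parse_sitemap(sample)
print(kind, urls)
```

Checking `kind` first avoids a common mistake: treating the child sitemap URLs from an index as if they were page URLs.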

Step 3: Compare Crawl Results Against the Sitemap

The orphan pages are the URLs that appear in your sitemap but do not appear in your crawl results.

In a spreadsheet:

Column A: all URLs from your site crawl
Column B: all URLs from your sitemap
Use VLOOKUP or COUNTIF to find URLs in column B that have no match in column A

=IF(COUNTIF($A:$A, B2)=0, "ORPHAN", "OK")

Any row marked “ORPHAN” is a page that exists in your sitemap but cannot be reached by following internal links from your homepage.
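An exact-match comparison can flag false orphans when the two lists write the same URL differently, for example with and without a trailing slash. The sketch below does the same comparison in Python after a light normalization step. The `normalize` rules are assumptions chosen for illustration; treating `/blog/` and `/blog` as the same page is only safe if your server does too.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize(url):
    """Lowercase scheme and host, drop the fragment, strip a trailing slash."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/") or "/"
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, parts.query, "")
    )


def find_orphans(sitemap_urls, crawled_urls):
    """Return normalized sitemap URLs that are absent from the crawl."""
    crawled = {normalize(u) for u in crawled_urls}
    return sorted({normalize(u) for u in sitemap_urls} - crawled)


# Hypothetical exports from Step 1 and Step 2.
sitemap_urls = [
    "https://yoursite.com/",
    "https://yoursite.com/blog/",
    "https://yoursite.com/old-landing",
]
crawled_urls = ["https://yoursite.com", "https://YourSite.com/blog"]

print(find_orphans(sitemap_urls, crawled_urls))
# ['https://yoursite.com/old-landing']
```

Without normalization, all three sitemap URLs in this example would have been marked as orphans.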

In redCacti:

The orphan pages report does this comparison automatically. Pages flagged as orphans are those present in your sitemap or discovered through other means but not reachable through your internal link graph.

Because the comparison is automated, it is easy to rerun regularly and catch new orphan pages as they appear.

Step 4: Cross-Reference with Google Search Console

GSC adds a third data set: pages Google has actually indexed. This is valuable because it can reveal orphan pages that are not even in your sitemap.

Finding indexed pages not in your sitemap:

In GSC -> Indexing -> Pages, look at the “Indexed, not submitted in sitemap” list. These pages are indexed but you have not declared them in your sitemap. Some may have internal links (fine), some may be genuine orphans that Google found through external links or old cached data.

Finding indexed orphans:

The overlap you want to identify:

In GSC index: yes
In your sitemap: yes (or no)
In your crawl results: no

These are the highest-priority orphans - Google knows they exist but cannot reliably reach them through your site’s link structure.
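The three-way overlap can be sketched with set operations. This assumes you have three plain URL lists: one exported from the GSC report, one from the sitemap, and one from the crawl. The function name and sample paths are hypothetical.

```python
def indexed_orphans(indexed, sitemap, crawled):
    """Split indexed URLs missing from the crawl by sitemap membership."""
    missing = set(indexed) - set(crawled)
    declared = set(sitemap)
    return {
        "in_sitemap": sorted(missing & declared),
        "not_in_sitemap": sorted(missing - declared),
    }


# Hypothetical exports from GSC, the sitemap, and the site crawl.
indexed = ["/", "/blog/post-1", "/old-landing", "/forgotten-guide"]
sitemap = ["/", "/blog/post-1", "/old-landing"]
crawled = ["/", "/blog/post-1"]

print(indexed_orphans(indexed, sitemap, crawled))
# {'in_sitemap': ['/old-landing'], 'not_in_sitemap': ['/forgotten-guide']}
```

The `not_in_sitemap` group is the one a sitemap-versus-crawl comparison alone would never surface.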

Step 5: Classify Each Orphan

Once you have your list, classify each orphan before deciding what to do. Not every orphan is the same kind of problem.

Type A - Valuable content that should be linked: Good pages that simply got published without being linked from anywhere. These should get internal links added pointing to them from relevant pages. This is the most common type and the most straightforward fix.

Type B - Intentionally standalone pages: Privacy policy, terms of service, some legal pages. These may legitimately have no contextual internal links beyond a footer link. Check whether a footer or navigation link counts as an internal link in your crawl - it often does. If these pages show as orphans, it may mean your footer links are not being followed by the crawler.

Type C - Old campaign and landing pages: Pages that served a past purpose but have no ongoing value. Assess whether to: delete and redirect to a relevant page, keep with a noindex tag, or link from a relevant archive or resources page.

Type D - Duplicate or thin content: Near-duplicate pages, very thin pages with no unique content, or generated pages (empty tag/category pages). These are candidates for noindex or deletion rather than internal linking.

Type E - Pages that should not be indexed: Internal tools, admin-adjacent pages, draft content that was published by mistake. These should have a noindex tag added or be password protected, not fixed with internal links.
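Once each orphan has been labelled by hand, the labels can be turned into a work list. The sketch below groups URLs by the action suggested for each type above. The `ACTIONS` wording is a condensed paraphrase of those descriptions, and the labelled URLs are hypothetical.

```python
# Condensed from the Type A-E descriptions; adjust the wording to your process.
ACTIONS = {
    "A": "add internal links from relevant pages",
    "B": "confirm footer/navigation links are crawled",
    "C": "delete and redirect, noindex, or link from an archive page",
    "D": "noindex or delete",
    "E": "noindex or password protect",
}


def plan(labelled):
    """Group hand-labelled orphans (url -> type letter) by suggested action."""
    grouped = {}
    for url, kind in labelled.items():
        grouped.setdefault(ACTIONS[kind], []).append(url)
    return grouped


# Hypothetical labels assigned during review.
labelled = {
    "/guide": "A",
    "/privacy": "B",
    "/spring-sale": "C",
    "/tag/misc": "D",
    "/case-study": "A",
}

for action, urls in plan(labelled).items():
    print(f"{action}: {urls}")
```

An unknown type letter raises a `KeyError`, which is deliberate here: an unclassified orphan should be reviewed rather than silently skipped.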

A Simpler Mental Model

Think of your website as a building. Internal links are the corridors. A page with no internal links pointing to it is a room with no corridor connecting it to the rest of the building. A visitor who finds the room from outside (a direct URL) can enter, but anyone walking through the building will never discover it exists.

Fixing orphan pages means adding corridors - placing internal links on relevant pages that naturally lead readers (and crawlers) to the orphaned content.

How Many Orphan Pages Is Normal?

Our analysis of 95 enterprise SaaS companies found orphan pages across virtually every site audited - even well-maintained ones. The issue is not whether you have orphan pages (you almost certainly do) but the proportion.

A site with 200 blog posts and 15 orphans is in reasonable shape. A site with 200 posts and 80 orphans has a structural problem - it means a significant portion of its content is disconnected from the internal link graph.

As a rough benchmark: if more than 15-20% of your indexed pages have no internal links, orphan page cleanup should be a priority.
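The arithmetic behind the two examples above can be written out as a short calculation. The sketch uses the lower bound of the 15-20% range as the cut-off, which is an assumption made for illustration, and it treats the 200 posts as the full set of indexed pages.

```python
def orphan_share(orphans, total):
    """Return orphan pages as a percentage of all pages counted."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100 * orphans / total


THRESHOLD = 15  # assumed cut-off: lower bound of the 15-20% benchmark

for orphans, total in [(15, 200), (80, 200)]:
    share = orphan_share(orphans, total)
    status = "priority" if share > THRESHOLD else "reasonable"
    print(f"{orphans}/{total} = {share:.1f}% -> {status}")
# 15/200 = 7.5% -> reasonable
# 80/200 = 40.0% -> priority
```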

Summary Checklist

Run full-site crawl starting from homepage - export URL list
Export sitemap URL list
Compare the two lists to identify URLs in sitemap but not in crawl
Cross-reference with GSC indexed pages for additional orphans
Classify each orphan: valuable content / intentional standalone / old campaign / thin content / should not be indexed
Prioritise orphans with existing backlinks or historical traffic
Plan internal linking fixes for Type A orphans
Plan deletion or noindex for Type C, D, E orphans

Orphan pages are one of the easier SEO wins available on most sites because the fix - adding internal links - does not require content creation, technical changes, or external outreach. It is purely a matter of connecting what you already have.

The challenge is finding them systematically. Manual browsing will not surface orphans, because by definition there is no link to follow to them.

Also in this series: How to Fix Orphan Pages That Google Can’t Find - How to Identify Pages Missing from Your Sitemap

redCacti Team

The team behind redCacti - helping websites improve their SEO through better internal linking.