External link rot is unavoidable; you don't control the destinations. Internal link rot is entirely your fault, and entirely fixable. A well-run internal link audit catches broken links, finds orphan pages that have no inbound paths, identifies redirect chains, and reveals which content the rest of your site is — or isn't — supporting with link signals.
This guide is a practical workflow for sites with at least a few hundred pages. Smaller sites can do most of this manually with Broken Link Finder; the workflow below scales to sites with tens of thousands of pages.
What an Internal Link Audit Actually Covers
A complete audit looks at five things:
- Broken internal links: links pointing to URLs that return a 4xx or 5xx status, or that fail to resolve at all.
- Redirect chains and unnecessary redirects: internal links that should point directly to the final URL but don't.
- Orphan pages: pages that exist in your sitemap but have no inbound internal links.
- Link depth distribution: how many clicks it takes to reach each page from the homepage. Pages five or more clicks deep tend to be crawled rarely, if at all.
- Link equity concentration: which pages have the most inbound internal links, and whether that matches your content strategy.
The first three are concrete problems. The last two are diagnostic — they tell you whether your site's architecture is helping or hurting.
The Tooling
For small sites or one-off checks: Broken Link Finder is enough. Free, instant, no signup, scans every link on a page.
For full-site crawls of larger sites:
- Screaming Frog SEO Spider — desktop app, one-time fee, handles up to ~500k URLs comfortably.
- Sitebulb — similar to Screaming Frog with friendlier reporting.
- Ahrefs Site Audit — cloud-based, includes link equity flow visualisation.
- Custom crawler — for huge sites or specific needs, a Python/Go crawler around 200 lines does the job.
For all of them, the output you want is a CSV of every link found, with source URL, destination URL, anchor text, and HTTP status.
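As a rough illustration of working with that export, the sketch below reads a link CSV into typed rows. The column names (source_url, destination_url, anchor_text, status) and the sample URLs are assumptions made for this example; each tool names its export columns differently, so adjust them to match your file.

```python
import csv
import io
from typing import NamedTuple


class LinkRow(NamedTuple):
    source: str       # page the link was found on
    destination: str  # URL the link points to
    anchor: str       # visible link text
    status: int       # HTTP status of the destination (0 = no response)


def load_link_table(csv_file) -> list[LinkRow]:
    """Read a crawler export into typed rows. Column names are assumed."""
    rows = []
    for record in csv.DictReader(csv_file):
        rows.append(LinkRow(
            source=record["source_url"],
            destination=record["destination_url"],
            anchor=record["anchor_text"].strip(),
            status=int(record["status"] or 0),  # blank status is treated as 0
        ))
    return rows


# Hypothetical two-row export, used here instead of a real file.
SAMPLE = """source_url,destination_url,anchor_text,status
https://example.com/,https://example.com/pricing,Pricing,200
https://example.com/,https://example.com/old-post,Read more,404
"""

links = load_link_table(io.StringIO(SAMPLE))
```

For a real export, pass an open file instead of the in-memory sample, for example `open("links.csv", newline="", encoding="utf-8")`.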
Step 1: Crawl the Site
Configure the crawler to:
- Start from the homepage.
- Follow only internal links (don't fetch external destinations to check status).
- Respect robots.txt exclusions (they're real pages from a UX perspective, but Googlebot won't see them either).
- Crawl with a polite delay (1–2 requests per second is plenty for most sites).
- Capture HTTP status, redirect chain, anchor text, and link source URL.
- Render JavaScript if the site uses client-side routing — without rendering, an SPA looks like one page.
For sites with tens of thousands of pages, crawl during off-peak hours and split into batches by directory if needed.
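If you go the custom-crawler route, the core of "follow only internal links" might look something like the sketch below. It uses only the Python standard library and covers just the link-extraction step; fetching, robots.txt checks, delays, and JavaScript rendering are left out. The class and function names are invented for this example, and treating "internal" as "same host as the page" is an assumption that will not suit sites spread across several subdomains.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collect (href, anchor text) pairs from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def internal_links(html, page_url):
    """Return absolute internal (url, anchor) pairs found in html."""
    parser = LinkExtractor()
    parser.feed(html)
    site = urlparse(page_url).netloc
    found = []
    for href, anchor in parser.links:
        # Resolve relative hrefs and drop #fragments so URLs compare cleanly.
        absolute, _fragment = urldefrag(urljoin(page_url, href))
        parts = urlparse(absolute)
        if parts.scheme in ("http", "https") and parts.netloc == site:
            found.append((absolute, anchor))
    return found
```

Each pair returned here becomes one row of the link table once the destination's status has been fetched.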
Step 2: Find Broken Internal Links
Filter the link table to rows where the destination's HTTP status is 4xx or 5xx, or where the request failed outright (DNS error, timeout). Sort by source URL frequency: broken links that appear on many source pages (typically because they're in a global header, footer, or sidebar) are the highest priority.
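A minimal sketch of that filter-and-sort, assuming the link table has been reduced to (source, destination, status) tuples and that a status of 0 stands for "no response". Both conventions are assumptions of this example, not something a particular crawler guarantees.

```python
from collections import defaultdict


def rank_broken_links(rows):
    """Group broken destinations and rank them by number of distinct source pages.

    rows: iterable of (source_url, destination_url, status) tuples.
    A status of 0 is used here to mean "no response" (DNS failure, timeout).
    """
    sources = defaultdict(set)
    for source, destination, status in rows:
        if status == 0 or status >= 400:
            sources[destination].add(source)
    # Most widely linked broken destination first; ties broken alphabetically.
    ranked = sorted(sources.items(), key=lambda item: (-len(item[1]), item[0]))
    return [(destination, len(pages)) for destination, pages in ranked]
```

A destination that tops this list with a count close to your total page count is almost certainly sitting in a shared template.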
For each broken link:
- If the destination still exists at a different URL, update the link to point directly there. Don't add a redirect; fix the link.
- If the destination is permanently gone, remove the link or replace it with a different relevant link.
- If the destination is a placeholder page that hasn't been published yet, either publish it or remove the link.
Re-crawl after fixes and verify the broken count has dropped to zero.
Step 3: Eliminate Redirect Chains
Filter to rows where the destination's redirect chain length is greater than 0. These are internal links pointing to URLs that redirect.
The fix is to update each link to point directly to the final destination, not to the redirect. The redirect itself can stay (visitors arriving from external sources still need it) but no internal link should rely on it.
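One way to work out where each link should point is to build a map of redirects from the crawl and follow it to the end. The sketch below assumes that map is a plain dict from a redirecting URL to its immediate target; the function names and the max_hops limit are hypothetical choices for this example.

```python
def final_destination(url, redirects, max_hops=10):
    """Follow a redirect map to the last URL; return (final_url, hops).

    redirects: dict mapping a redirecting URL to its immediate target.
    Raises ValueError on a loop or when max_hops is exceeded.
    """
    seen = {url}
    hops = 0
    while url in redirects:
        url = redirects[url]
        hops += 1
        if url in seen or hops > max_hops:
            raise ValueError(f"redirect loop or chain too long at {url}")
        seen.add(url)
    return url, hops


def links_to_rewrite(rows, redirects):
    """Yield (source, old_destination, final_destination) for redirected links."""
    for source, destination in rows:
        final, hops = final_destination(destination, redirects)
        if hops > 0:
            yield source, destination, final
```

The output is effectively the find-and-replace list: every old destination paired with the URL that should replace it.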
If the same redirect appears in many internal links, the fix is usually a global find-and-replace in your CMS or templates. In WordPress, search the database for the old URL and replace it with the new one; in a custom CMS, the same approach applies.
See our redirect chains guide for more detail on flattening chains.
Step 4: Find Orphan Pages
Cross-reference your URL list (from sitemap.xml or CMS export) with the list of URLs reached by the crawler. Any URL in the first list but not in the second is an orphan — it exists, but no internal link points to it.
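The cross-reference is a set difference, but it only works if both lists spell URLs the same way. The sketch below normalises before comparing; treating a trailing slash as insignificant is an assumption that holds for many sites but not all, so drop that step if /page and /page/ are different pages on yours.

```python
from urllib.parse import urlsplit, urlunsplit


def normalise(url):
    """Lowercase scheme and host, drop the fragment and any trailing slash."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/") or "/"
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, parts.query, ""))


def find_orphans(known_urls, crawled_urls):
    """Return known URLs that the crawler never reached, sorted."""
    crawled = {normalise(u) for u in crawled_urls}
    return sorted({normalise(u) for u in known_urls} - crawled)
```

Here known_urls would come from sitemap.xml or the CMS export, and crawled_urls from the crawler's list of reached pages.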
Common reasons for orphans:
- Pages that were unpublished from menus but still exist in the CMS.
- Old archive content that was meant to be retired but wasn't.
- Pages that are only linked from external sources (intentional in some contexts; usually unintentional).
- Pages whose only inbound links were broken or have since been removed (fixing links in Step 2 may already have reconnected some of them).
- Pages reachable only through search results or sitemaps.
For each orphan, decide whether the page should exist. If yes, link to it from somewhere appropriate. If no, return 410 Gone or 301-redirect it to a related page.
Step 5: Analyse Link Depth
For each page, count the minimum number of clicks from the homepage to reach it. The crawler should provide this directly.
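If your crawler doesn't report depth, it can be derived from the link table with a breadth-first search, which by construction finds the minimum number of clicks. The sketch below assumes the links have been grouped into a dict from each page to the pages it links to; the function name is made up for this example.

```python
from collections import deque


def link_depths(graph, homepage):
    """Breadth-first search: minimum clicks from homepage to each reachable page.

    graph: dict mapping a page URL to the list of internal URLs it links to.
    """
    depths = {homepage: 0}
    queue = deque([homepage])
    while queue:
        page = queue.popleft()
        for target in graph.get(page, []):
            if target not in depths:
                # First visit is the shortest path, because BFS expands by depth.
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths
```

Pages missing from the result were never reached from the homepage, which makes them orphan candidates for Step 4.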
Distribution targets for a healthy site:
- Top-tier content (cornerstone pages, key landing pages): depth 1, linked directly from the homepage.
- Mid-tier content (category pages, important blog posts): depth 2, two clicks from the homepage.
- Long-tail content (older blog posts, niche pages): depth 3 is usually fine.
- Anything at depth 4+ is hard for Googlebot to find regularly. At depth 5+ it may not be crawled at all.
If you have important content at depth 4+, the fix is usually to add internal links — from the homepage, from category pages, from related blog posts — that bring it closer to the top of the site's hierarchy.
Step 6: Check Internal Link Equity Flow
Count the inbound internal links to each page and sort the list in descending order. The pages at the top should be the ones you most want to rank.
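A sketch of that count, assuming (source, destination) pairs taken from the link table. It counts distinct source pages rather than raw link occurrences and skips self-links. Both are judgment calls made for this example, so change them if you'd rather count every link.

```python
from collections import defaultdict


def inbound_link_counts(rows):
    """Count distinct source pages linking to each destination, highest first.

    rows: iterable of (source_url, destination_url) pairs. Self-links are skipped.
    """
    sources = defaultdict(set)
    for source, destination in rows:
        if source != destination:
            sources[destination].add(source)
    counts = {destination: len(pages) for destination, pages in sources.items()}
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))
```

Pages with zero inbound links never appear as a destination, so they are absent from this list rather than shown with a count of 0.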
Common findings:
- Privacy policy and terms pages dominate the list. Common because they're linked from every page's footer. Not a problem unless real content is being out-linked.
- Old content has more inbound links than new. Often older blog posts have accumulated internal links over time. New posts need promotion.
- Important commercial pages are weakly linked. If your "Pricing" or "Book a Demo" page only has 3 inbound internal links, you're undermining your own conversion path.
- Important resource pages are weakly linked. Cornerstone content that should rank well often has few or no inbound internal links; it lives only as a destination for external promotion, with no internal scaffolding.
The fix is editorial: add contextual links from related content to the pages that should be ranking. Avoid stuffing — every link should make sense in context.
Step 7: Anchor Text Audit
Group internal links by destination URL, then look at the distribution of anchor texts pointing at each.
What healthy looks like:
- A mix of exact-match, partial-match, and natural anchor text.
- No single page has hundreds of identical "click here" anchors.
- Generic anchors ("read more", "click here") are minimised in favour of descriptive ones.
What to fix:
- Pages with all-identical anchor text often signal a CMS template issue — every link from the same template uses the same anchor. Vary the anchor text where possible.
- Pages with mostly generic anchors lose contextual relevance. "Read more" links are clear in context but tell crawlers nothing about the destination.
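The grouping and both checks above can be sketched as follows. The list of generic anchors and the 50% threshold are illustrative assumptions, not established cut-offs; tune both to your own content.

```python
from collections import Counter, defaultdict

GENERIC_ANCHORS = {"click here", "read more", "here", "learn more", "more"}  # assumed list


def anchor_report(rows, generic_threshold=0.5):
    """Summarise anchor text per destination and flag weak distributions.

    rows: iterable of (destination_url, anchor_text) pairs.
    Returns {destination: {"anchors": Counter, "generic_share": float, "flags": [...]}}.
    """
    by_destination = defaultdict(Counter)
    for destination, anchor in rows:
        by_destination[destination][anchor.strip().lower()] += 1

    report = {}
    for destination, anchors in by_destination.items():
        total = sum(anchors.values())
        generic = sum(n for text, n in anchors.items() if text in GENERIC_ANCHORS)
        flags = []
        if total > 1 and len(anchors) == 1:
            flags.append("all-identical")   # likely a template-generated anchor
        if generic / total > generic_threshold:
            flags.append("mostly-generic")  # little descriptive context
        report[destination] = {
            "anchors": anchors,
            "generic_share": generic / total,
            "flags": flags,
        }
    return report
```

Anchors are compared case-insensitively, so "Read more" and "read more" count as the same text.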
Step 8: Build a Punch List
From the audit, build a prioritised list:
- P0: Broken internal links. Fix immediately.
- P1: Redirect chains. Update the source links to the final destination.
- P1: Important orphan pages. Add internal links bringing them into the site graph.
- P2: Important pages at depth 4+. Add internal links to bring them shallower.
- P2: Important pages with weak inbound link counts. Add contextual internal links.
- P3: Anchor text improvements. Replace generic anchors with descriptive ones over time.
Don't try to fix everything at once. P0 in week 1, P1 in weeks 2–4, P2 over the following month, P3 as content gets edited.
Cadence
For most sites:
- Quarterly — full crawl + audit. New broken links surface; new content needs to be linked into existing structure.
- After every migration or significant URL change — full audit immediately to catch chains and breakage.
- Monthly spot checks with Broken Link Finder — run your top 10 most-trafficked pages through the tool. Catches issues fast without the overhead of a full crawl.
The Compound Effect
One internal link audit is useful. Four audits a year, run consistently for two years, transform a site's link health. The cumulative effect of fixing broken links, flattening chains, surfacing orphans, and concentrating link equity on the right pages is one of the most reliable SEO improvements available — and it requires no new content, no link building, no algorithm guesswork.
Start small. Run Broken Link Finder against your homepage today, fix what it finds, and schedule the next pass for a month from now. Compound from there.