A soft 404 is a page that returns a successful HTTP status (usually 200 OK) but contains no actual content, typically a "page not found" message rendered with the same template as the rest of the site. To a server log it looks fine. To a casual link checker it looks fine. To Google, it looks like duplicate or empty content masquerading as a real page, and a site that accumulates enough of these pages risks being treated as thin or low quality.
Soft 404s are sneaky because the symptoms are invisible. Real 404s show up in error logs, Search Console reports, and uptime monitors. Soft 404s look identical to working pages from the outside, and you often only find out about them when Google flags them in Search Console, by which point they may have been hurting you for months.
What Causes a Soft 404
1. CMS misconfiguration
The most common cause. A WordPress, Drupal, or custom CMS doesn't have a route for the requested URL, so its handler returns a "page not found" template — but the application returns HTTP 200 instead of 404 because the handler succeeded in generating output. The HTTP layer thinks everything's fine; the content layer is showing an error.
WordPress sites are frequently affected. Core normally sends a 404 status for URLs that match no post or page, but if your theme's 404.php, a plugin, or a page cache ends up sending 200 instead, every unmatched URL silently produces a soft 404. Verify with curl:
curl -sI https://example.com/this-url-definitely-does-not-exist | head -1
If the status line shows 200 (for example HTTP/2 200 or HTTP/1.1 200 OK), you have a soft 404. The fix is to make sure your 404 handler explicitly sets the status. In WordPress, that means calling status_header(404); at the top of 404.php, before any output is sent.
2. Single-page apps without SSR
React, Vue, and other SPAs typically serve index.html for every URL and rely on JavaScript to render the right content. If the JS router lands on a non-existent route and shows a "Not Found" component, the HTTP response was already 200 — JavaScript can't go back and change it.
Mitigation: server-side rendering or pre-rendering, with the server actually returning 404 for non-existent routes. Frameworks like Next.js, Nuxt, and SvelteKit generally handle this for you when they run in a server-rendered mode; vanilla SPAs, and framework builds deployed behind a catch-all index.html fallback, need explicit work.
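As a rough illustration of "the server actually returning 404", here is a minimal WSGI sketch that still serves the SPA shell for every path but picks the status from a route table. The names `KNOWN_ROUTES`, `INDEX_HTML`, `spa_app`, and `status_for` are hypothetical, and a real deployment would derive the route list from the router's own configuration rather than hard-coding it.

```python
from wsgiref.util import setup_testing_defaults

# Hypothetical route table: paths the client-side router actually knows about.
KNOWN_ROUTES = {"/", "/about", "/pricing"}

# Stand-in for the built index.html shell that boots the JavaScript app.
INDEX_HTML = b"<!doctype html><div id='app'></div><script src='/app.js'></script>"


def spa_app(environ, start_response):
    """Serve the SPA shell for every path, but with an honest status code."""
    # Treat "/about/" and "/about" as the same route; keep "/" as is.
    path = environ.get("PATH_INFO", "/").rstrip("/") or "/"
    status = "200 OK" if path in KNOWN_ROUTES else "404 Not Found"
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(INDEX_HTML))),
    ]
    start_response(status, headers)
    # The body is the same shell either way, so the client-side
    # "Not Found" component still renders for unknown routes.
    return [INDEX_HTML]


def status_for(path):
    """Call the app in-process and return the status line it would send."""
    environ = {}
    setup_testing_defaults(environ)
    environ["PATH_INFO"] = path
    captured = []
    spa_app(environ, lambda status, headers: captured.append(status))
    return captured[0]
```

The visitor sees the same "Not Found" component either way; only the status line changes, and that is the part crawlers read first.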
3. Empty product or category pages
Ecommerce sites often serve "0 results" pages with HTTP 200 when a category is empty or a product is out of stock. Google sometimes flags these as soft 404s, especially if the page contains nothing but boilerplate ("No products available in this category").
Fix: return 404 (if the category will permanently have no products), 301 to a parent or related page, or serve genuinely useful content (related categories, a "back in stock" form, recently viewed products).
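The three options can be written down as a small decision function. This is only a sketch: the `Category` fields and the order of the checks are assumptions made for illustration, not rules taken from any particular ecommerce platform.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Category:
    # Hypothetical fields; a real catalogue model will look different.
    product_count: int
    retired: bool = False                  # will never have products again
    replacement_url: Optional[str] = None  # parent or related page, if any
    has_fallback_content: bool = False     # related categories, back-in-stock form


def response_for(category: Category) -> Tuple[int, Optional[str]]:
    """Return (status_code, redirect_location) for a category page request."""
    if category.product_count > 0:
        return 200, None
    if category.replacement_url:
        return 301, category.replacement_url
    if category.retired:
        return 404, None
    if category.has_fallback_content:
        return 200, None
    # Empty, not redirected, and nothing useful to show. This sketch
    # chooses 404; adding real content to the page is the alternative.
    return 404, None
```

For example, `response_for(Category(0, replacement_url="/shoes"))` yields a 301 to `/shoes`, while an empty category that still offers a back-in-stock form keeps its 200.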
4. Stale internal links to deleted pages
If a page was deleted but the route still exists in your CMS (perhaps because the slug was reused or the post was switched to draft instead of deleted), the URL might serve an empty template with a 200 status. Internal links that still point at it keep sending crawlers there, and the result is a soft 404.
5. Server-side error fallbacks
A try/catch in application code that catches every error and returns a generic "something went wrong" page with status 200, instead of letting the error propagate as a 5xx response, produces the same problem. The page exists but conveys no useful content.
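One way to keep the friendly error page without the misleading status is to catch the exception at the edge of the application and send the page with a 500. The sketch below does this as WSGI middleware; `flaky_app` and `honest_errors` are hypothetical names, and it only covers exceptions raised while the handler is being called, not ones raised later while a streamed body is being iterated.

```python
import logging
import sys

log = logging.getLogger("app")


def flaky_app(environ, start_response):
    """Hypothetical handler that fails, standing in for real application code."""
    raise RuntimeError("database unavailable")


def honest_errors(app):
    """Wrap a WSGI app so unhandled exceptions become a 500, not a friendly 200."""
    def wrapper(environ, start_response):
        try:
            return app(environ, start_response)
        except Exception:
            log.exception("unhandled error for %s", environ.get("PATH_INFO"))
            body = b"<h1>Something went wrong</h1>"
            headers = [
                ("Content-Type", "text/html; charset=utf-8"),
                ("Content-Length", str(len(body))),
            ]
            # Passing exc_info lets the server replace headers the app may
            # already have set, as described in the WSGI specification.
            start_response("500 Internal Server Error", headers, sys.exc_info())
            return [body]
    return wrapper
```

Visitors still get a readable page, while crawlers and monitoring see a 5xx and treat the failure as temporary rather than as empty content.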
How Google Detects Soft 404s
Google doesn't rely on HTTP status alone. Its crawler detects soft 404s heuristically, and although the exact signals aren't published, the usual suspects are:
- Pages with very little unique content (the body is mostly nav, sidebar, and footer).
- Pages whose body matches phrases like "not found", "no results", "page does not exist", "sorry, we couldn't find".
- Pages on patterns that historically have been 404s (e.g. URLs that don't match the site's URL structure).
- Pages whose content is identical to other pages on the site (often the case with template-rendered "not found" responses).
Once Google decides a page is a soft 404, it's typically dropped from the index entirely. Search Console reports this under Pages → Soft 404 with a list of affected URLs.
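The identical-content signal is easy to approximate on your own crawl data. The sketch below is not Google's algorithm, just a simple stand-in: it hashes each page's visible text and groups URLs that share a hash. The `pages` mapping (URL to HTML) and the `min_size` threshold are assumptions for illustration.

```python
import hashlib
import re
from collections import defaultdict
from typing import Dict, List


def fingerprint(html: str) -> str:
    """Hash the visible text of a page, ignoring tags, case, and whitespace."""
    text = re.sub(r"(?is)<(script|style)\b.*?</\1>", " ", html)  # drop inline code
    text = re.sub(r"(?s)<[^>]+>", " ", text)                      # drop remaining tags
    text = " ".join(text.lower().split())
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def duplicate_groups(pages: Dict[str, str], min_size: int = 3) -> List[List[str]]:
    """Group URLs whose visible text is identical; large groups are suspicious."""
    groups = defaultdict(list)
    for url, html in pages.items():
        groups[fingerprint(html)].append(url)
    return [sorted(urls) for urls in groups.values() if len(urls) >= min_size]
```

A group of many URLs that all return 200 with the same text is a strong hint that they share a "not found" template. Exact hashing misses near-duplicates (for example a template that echoes the requested URL), so treat this as a first pass.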
Why Soft 404s Are Worse Than Real 404s
A real 404 is honest. The server says "this page doesn't exist", search engines drop it from the index, and the link equity that flowed to it disappears (which is fine, because the page is gone). A soft 404, by contrast:
- Wastes crawl budget: Googlebot follows the link, requests the page, gets a 200, processes the content, and only then decides it's a soft 404. That's a much heavier crawl than a quick HTTP-level 404.
- Damages quality signals: Sites with many soft 404s look like they're filled with thin content, and there's no guarantee that Google's quality classifiers distinguish "this whole site is low value" from "this site has a bunch of accidentally-empty pages".
- Confuses link maintenance: Tools that only check HTTP status report the soft 404 as healthy, so you never even know there's a problem.
- Lingers in the index: Detection is heuristic and can take weeks, and until it happens the URL stays indexed and keeps sending traffic to a useless page.
Finding Soft 404s on Your Site
Three layered approaches:
1. Google Search Console
The authoritative source. Search Console → Pages → Soft 404 lists the URLs Google has classified as soft 404s. This is the most important data, but it's also the slowest: there's a delay of days to weeks between when Google detects a soft 404 and when it appears in the report.
2. Crawl with status check + content check
A crawler that checks both HTTP status and page content can flag suspect pages itself. Broken Link Finder reports HTTP 200 responses with very small content size or "not found"-pattern text in the body — these are likely soft 404s even before Google notices them.
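A content check of this kind can be sketched in a few lines. This is not how Broken Link Finder is implemented; it is a minimal illustration that reuses the phrases listed earlier, and the 500-character threshold (`min_chars`) is an arbitrary assumption you would tune per site.

```python
import re

# Phrases from the detection signals listed earlier; extend for your templates.
NOT_FOUND_PATTERNS = re.compile(
    r"not found|no results|page does not exist|sorry, we couldn'?t find",
    re.IGNORECASE,
)


def visible_text(html: str) -> str:
    """Strip scripts, styles, and tags, then collapse whitespace."""
    text = re.sub(r"(?is)<(script|style)\b.*?</\1>", " ", html)
    text = re.sub(r"(?s)<[^>]+>", " ", text)
    return " ".join(text.split())


def looks_like_soft_404(status: int, html: str, min_chars: int = 500) -> bool:
    """Flag 200 responses that are very small or read like an error page."""
    if status != 200:
        return False  # real 404s and redirects are not soft 404s
    text = visible_text(html)
    return len(text) < min_chars or bool(NOT_FOUND_PATTERNS.search(text))
```

Expect false positives: a long, legitimate article that happens to contain the words "not found" will be flagged too, so the output is a list of pages to review rather than a verdict.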
3. Server log analysis
Look for URLs that return 200 but with very small response sizes (under, say, 5KB compressed for an HTML page on a typical CMS). These are likely empty templates. Also look for URLs with high request volume but no engagement (no subsequent requests for assets or APIs from the same session).
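The size-based check might look like the following, assuming access logs in the common or combined format (request line in quotes, then status, then bytes sent). The function name `small_200s`, the asset suffix list, and the 5KB default are assumptions; the sketch covers only the response-size signal, not the engagement one.

```python
import re
from collections import Counter

# Matches the request, status, and byte-count fields of a common/combined log line.
LOG_LINE = re.compile(
    r'"GET (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\d+|-)'
)

# Static assets are often legitimately tiny, so skip them.
ASSET_SUFFIXES = (".css", ".js", ".png", ".jpg", ".svg", ".ico", ".woff2")


def small_200s(lines, max_bytes=5 * 1024):
    """Count 200 responses below max_bytes per path, skipping obvious assets."""
    hits = Counter()
    for line in lines:
        m = LOG_LINE.search(line)
        if not m or m["status"] != "200" or m["size"] == "-":
            continue
        path = m["path"].split("?", 1)[0]  # ignore query strings when grouping
        if path.lower().endswith(ASSET_SUFFIXES):
            continue
        if int(m["size"]) < max_bytes:
            hits[path] += 1
    return hits
```

Passing an open log file works directly, for example `small_200s(open("access.log"))`, and `most_common(20)` on the result gives a short list to inspect by hand.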
Fixing a Soft 404
For each detected soft 404, you have four options:
- Return a real 404. If the page genuinely doesn't exist, fix the application so it returns HTTP 404. This is the right answer most of the time — search engines drop the URL from the index and the problem disappears.
- 301 to a relevant page. If the missing content has been moved or merged, redirect to the new location. Preserves any inbound link equity.
- Restore the content. If the page was deleted by mistake, put the content back. Especially worth doing for URLs that have inbound links from other sites.
- Add real content. For empty category or "no results" pages, add something useful — related categories, recently viewed items, a search form, related blog posts. The page now has a reason to exist.
Whatever you pick, verify with curl (status code) and visually (real content shown). Then request indexing for the URL through Search Console's URL Inspection tool so Google sees the fix sooner rather than later.
Preventing Soft 404s in New Code
- Test your 404 handler explicitly. Every CI run should make a request to a known-bad URL and confirm the response is HTTP 404.
- Use the framework's status helpers correctly. Don't return 200 from an exception handler unless you really mean it.
- Avoid all-200 SPAs. Use SSR or pre-rendering for content pages. Reserve client-only routing for app routes that don't need to be indexed.
- Add monitoring. A regular automated scan that hits a known-non-existent URL and asserts HTTP 404 catches regressions immediately.
- Track 200-with-thin-content in your analytics. If you have access to server logs or a CDN dashboard, alert on URLs returning HTTP 200 with sub-2KB responses.
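The CI test and the monitoring scan from the list above can share one small check. The sketch below uses only the standard library; the function names and the `soft-404-probe-` path prefix are made up for illustration, and `fetch` is injectable so the logic can be exercised without network access.

```python
import urllib.error
import urllib.request
import uuid


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the final HTTP status for url (urlopen follows redirects)."""
    request = urllib.request.Request(url, method="GET")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the status code is on the exception.
        return err.code


def probe_url(base_url: str) -> str:
    """Build a URL that should not exist by appending a random path segment."""
    return f"{base_url.rstrip('/')}/soft-404-probe-{uuid.uuid4().hex}"


def handler_returns_404(base_url: str, fetch=fetch_status) -> bool:
    """True when a request for a random, non-existent path yields HTTP 404."""
    return fetch(probe_url(base_url)) == 404
```

In a CI job or scheduled check this reduces to `assert handler_returns_404("https://example.com")`. Because redirects are followed, a site that bounces unknown URLs to its homepage also fails the check, which is usually what you want.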
Soft 404s are one of the easiest SEO problems to ignore and one of the most embarrassing to find years late. Spend ten minutes verifying your 404 handler works correctly today, and run Broken Link Finder against your homepage and key landing pages — it'll flag any 200-but-empty responses on the spot.