SEO Foundations for Custom Web Apps That Actually Rank

If Googlebot hits your React route and gets a mostly empty HTML shell, you do not have an “SEO problem.” You have a rendering, crawl control, and release-process problem, and it shows up fast in Search Console: “Discovered - currently not indexed,” sudden index bloat from parameter URLs, or crawl stats that spike while rankings sit still.

Custom B2B web apps fail SEO in repeatable ways because the defaults are working against you. Client-side rendering hides content until JavaScript runs. Security rules block CSS, JS, or API responses that Google needs to render. Facets, filters, and tracking parameters generate infinite URL spaces that burn crawl budget and create duplicates. Canonicals drift when templates reuse the wrong base URL or point at staging. Then performance finishes the job: heavy bundles, long tasks, hydration delays, and third-party scripts that drag Core Web Vitals down.

This guide turns those failure modes into engineering acceptance criteria you can ship against: how Googlebot actually renders JavaScript, how to control crawl and indexation, what to fix for Core Web Vitals, how to wire internal links so routes get discovered, and how to launch or migrate without waking up to a traffic drop.

What Is Technical SEO for Modern Web Apps?

If technical SEO is acceptance criteria, you need a shared definition of “passes.” For modern web apps, SEO is the engineering work that makes every important route discoverable, renderable, indexable, and worth ranking. It is less about keywords and more about whether Google can fetch, understand, and trust what your app shows.

Technical SEO for SPAs and app-like sites boils down to four jobs: (1) let crawlers reach URLs that matter (crawlability), (2) deliver meaningful HTML when they arrive (rendering), (3) control what lands in the index (indexation), and (4) send clean signals about the preferred version and its importance (signals).

  • Crawlability: Googlebot needs stable, linkable URLs and access to required assets (JS, CSS, API responses where applicable). If your navigation lives behind onClick handlers, discovery collapses.
  • Rendering: If content appears only after client-side JavaScript runs, you risk partial indexing, delayed indexing, or missing content. SSR (server-side rendering) and SSG (static site generation) reduce that risk for public pages.
  • Indexation: You decide what should index using robots.txt, meta robots noindex, canonical tags, and sane URL patterns. This is where you prevent parameter spam, faceted crawl traps, and duplicate “state” pages.
  • Signals: Titles, meta descriptions, internal links, structured data, hreflang (if needed), and performance signals like Core Web Vitals help Google choose your page over competitors.

What “Good” Looks Like in 2026

Good technical SEO in 2026 looks boring in the best way. A new route ships with a canonical URL, server-rendered or pre-rendered HTML for public content, consistent status codes, and an internal link path from a hub page. Google Search Console shows steady indexing, no spikes in “Duplicate, Google chose different canonical than user,” and no explosions in crawled URLs from filters.

Teams that win treat SEO checks like unit tests. In practice, that means developers and SEO agree on route-level requirements, then verify them with Google Search Console’s URL Inspection and Lighthouse in Chrome DevTools. Google’s own guidance on JavaScript indexing is the baseline, not an optional extra (Google Search Central: JavaScript SEO).
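
To make “SEO checks like unit tests” concrete, here is a minimal sketch of a route-level check using only Python's standard library. The function name check_route and the three rules it enforces (non-empty title, expected canonical, no noindex) are assumptions chosen for illustration, not a complete acceptance suite.

```python
from html.parser import HTMLParser


class HeadSignals(HTMLParser):
    """Collects the title, canonical URL, and meta robots value from raw HTML."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.canonical = None
        self.robots = None
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "link" and (a.get("rel") or "").lower() == "canonical":
            self.canonical = a.get("href")
        elif tag == "meta" and (a.get("name") or "").lower() == "robots":
            self.robots = (a.get("content") or "").lower()

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def check_route(html, expected_canonical):
    """Return a list of failures for one indexable route; empty means it passes."""
    parser = HeadSignals()
    parser.feed(html)
    failures = []
    if not parser.title.strip():
        failures.append("missing <title>")
    if parser.canonical != expected_canonical:
        failures.append(
            f"canonical is {parser.canonical!r}, expected {expected_canonical!r}"
        )
    if parser.robots and "noindex" in parser.robots:
        failures.append("indexable route carries noindex")
    return failures
```

Run it against the HTML your server returns for each public route in CI, and treat a non-empty list as a failed test.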

How Do Googlebot and Rendering Actually Work With JavaScript?

URL Inspection in Google Search Console often surprises teams because it shows two different realities: the raw HTML Googlebot fetches, and the rendered DOM after JavaScript runs. For SEO on app-like sites, that gap decides whether Google indexes a real page or a blank shell.

Googlebot crawls a URL, downloads HTML, then may render it with a headless Chromium environment. Rendering is slower and more resource-intensive than crawling, so JavaScript-heavy pages can lag in indexation, miss content, or drop internal links if your app only creates them after client-side code executes. Google documents this two-step process in Google Search Central’s JavaScript SEO guidance (Google: Understand the JavaScript SEO basics).
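
You can approximate the “raw HTML” half of that gap without Search Console. The sketch below counts visible text and plain <a href> links in unrendered HTML and flags likely empty shells. The thresholds (200 characters, 1 link) are arbitrary assumptions, and the check is a heuristic, not a model of how Google evaluates a page.

```python
from html.parser import HTMLParser


class RawHtmlStats(HTMLParser):
    """Counts visible text characters and <a href> links in unrendered HTML."""

    SKIP = {"script", "style", "noscript", "template"}

    def __init__(self):
        super().__init__()
        self.text_chars = 0
        self.links = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1
        elif tag == "a" and not self._skip_depth:
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.text_chars += len(data.strip())


def looks_like_empty_shell(raw_html, min_text_chars=200, min_links=1):
    """True when the HTML has too little text or too few links before JavaScript runs."""
    stats = RawHtmlStats()
    stats.feed(raw_html)
    return stats.text_chars < min_text_chars or len(stats.links) < min_links
```

Feed it the response body from a plain HTTP fetch (no browser). If a public route trips the check, Googlebot depends on the render step to see that page's content and links.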

SSR, SSG, CSR, and Dynamic Rendering for JavaScript SEO

Pick a rendering strategy based on how much of the page must exist in the initial HTML for discovery and ranking.

  • SSR (Server-Side Rendering): Your server returns HTML with the main content and links already present. SSR is the least risky default for B2B marketing pages and indexable product pages because Googlebot can understand the page before JavaScript runs. Framework examples include Next.js (React) and Nuxt (Vue).
  • SSG (Static Site Generation): Your build process outputs HTML files ahead of time. SSG works well for docs, help centers, and landing pages where content changes on a predictable schedule. Tools include Next.js SSG, Gatsby, and Astro.
  • CSR (Client-Side Rendering): The server returns a thin HTML shell and JavaScript fills it in. CSR can rank, but it increases indexation risk: delayed rendering, missing metadata, and links that only appear after hydration. Use CSR for authenticated app areas you plan to keep out of the index anyway.
  • Dynamic Rendering: You serve pre-rendered HTML to bots and a JavaScript app to users. Google has said dynamic rendering is a workaround, not a long-term solution (Google: Dynamic rendering). It can still make sense when you cannot ship SSR quickly, but it adds an extra system to maintain.

The safest pattern for most custom B2B builds is simple: SSR or SSG for every public, indexable route, CSR for logged-in workflows, and a hard line between the two in routing, headers, and robots rules.
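
One way to keep that line hard is to derive rendering mode and robots headers from a single route table. This is a sketch under assumptions: the prefixes and the route_policy helper are hypothetical, and your framework's middleware will look different. X-Robots-Tag itself is a real response header that Google honors.

```python
# Hypothetical prefixes; replace with your app's real logged-in route groups.
PRIVATE_PREFIXES = ("/app/", "/account/", "/admin/")


def is_private(path):
    """True for a private prefix, with or without its trailing slash."""
    return any(
        path == prefix.rstrip("/") or path.startswith(prefix)
        for prefix in PRIVATE_PREFIXES
    )


def route_policy(path):
    """Return the rendering mode and extra response headers for a request path."""
    if is_private(path):
        # Logged-in workflows: client-side rendering, kept out of the index.
        return {"render": "csr", "headers": {"X-Robots-Tag": "noindex, nofollow"}}
    # Everything else is public and must ship meaningful HTML.
    return {"render": "ssr", "headers": {}}
```

Because both decisions come from one function, a route cannot quietly end up client-rendered and indexable at the same time.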

Crawl and Indexation Controls That Prevent SEO Disasters

SSR or SSG gets you renderable HTML. SEO still fails when crawlers hit the wrong URLs, index the wrong states, or burn crawl budget on junk. Crawl control is engineering, not a plugin: you decide what Googlebot can fetch, what Google can index, and which version counts.

  • robots.txt: Block true non-public areas (for example, /admin/ and /auth/, plus /api/ if no public page needs those responses to render). Do not block CSS or JS needed to render public pages. Keep staging on a separate host and block it at the edge (IP allowlist or auth), then add noindex as a backup.
  • XML sitemaps: Include only canonical, indexable URLs that return 200. Split large sets into multiple sitemaps and reference them from a sitemap index. Update timestamps when content changes, not on every deploy.
  • Canonical tags: Every indexable route needs a self-referencing canonical. Normalize trailing slashes, letter case, and parameter order. Never canonicalize to staging, and never canonicalize a filter state to itself unless you truly want it indexed.
  • noindex rules: Use <meta name="robots" content="noindex,follow"> for thin states (empty search results, internal sort orders, user-specific views). Prefer noindex over robots.txt when you need Google to crawl links but keep pages out of indexation. Do not combine the two on the same URL: a page blocked in robots.txt is never fetched, so Google never sees its noindex.
  • Pagination: Keep paginated URLs crawlable. Give each paginated URL a self-referencing canonical, then strengthen the series with internal links and a view-all or hub page when it makes sense. Avoid infinite scroll without crawlable paginated URLs.
  • Faceted navigation: Decide which facets deserve landing pages, then block or noindex the rest. Limit indexable combinations aggressively, or you create an infinite URL space.
  • URL hygiene: Kill tracking parameters at the source. Keep one URL per state. Enforce 301s for http to https, www to non-www (or the reverse), and any legacy patterns.
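
The canonical and URL hygiene rules above are easier to enforce when one function owns them. Here is a minimal sketch. The host, the no-trailing-slash policy, and the list of tracking parameters are all assumptions, so swap in your own. It deliberately leaves path case alone, because lowercasing is only safe if your router treats paths case-insensitively.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

CANONICAL_HOST = "www.example.com"  # hypothetical preferred host
TRACKING_PREFIXES = ("utm_",)
TRACKING_PARAMS = {"gclid", "fbclid"}


def canonical_url(url):
    """Normalize a URL: https, one host, no tracking params, sorted params, no trailing slash."""
    parts = urlsplit(url)
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in TRACKING_PARAMS
        and not key.lower().startswith(TRACKING_PREFIXES)
    ]
    params.sort()  # one parameter order per state
    path = parts.path or "/"
    if len(path) > 1 and path.endswith("/"):
        path = path.rstrip("/")
    # The fragment is dropped because it never reaches the server.
    return urlunsplit(("https", CANONICAL_HOST, path, urlencode(params), ""))
```

Use the same function to emit the canonical tag, build sitemap entries, and decide when a request should 301, so the three can never disagree.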

How to Catch Crawl Traps Early

Watch Google Search Console for spikes in total crawl requests (Crawl stats report) and in “Duplicate, Google chose different canonical than user” (Page indexing report). Use URL Inspection on a few filtered URLs to confirm canonicals and robots behavior. If you have access to server logs, validate what Googlebot actually hits, not what the app routes suggest. Google documents the basics in Search Central robots.txt guidance and XML sitemap guidance.
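
If you do have logs, a short script can surface parameter traps before Search Console does. This sketch assumes access logs in the common “combined” format and matches Googlebot by user-agent string only. User agents can be spoofed, so verify with reverse DNS before acting on anything surprising. The function name and the min_variants threshold are illustrative.

```python
import re
from collections import defaultdict
from urllib.parse import urlsplit

# Assumes combined log format:
# ip - - [time] "GET /path?query HTTP/1.1" status size "referer" "user-agent"
LINE = re.compile(
    r'"(?:GET|HEAD) (?P<target>\S+) [^"]*" (?P<status>\d{3}) .*"(?P<ua>[^"]*)"$'
)


def parameter_hotspots(log_lines, min_variants=3):
    """Map each path to the number of distinct query strings Googlebot requested."""
    variants = defaultdict(set)
    for line in log_lines:
        match = LINE.search(line.strip())
        if not match or "Googlebot" not in match.group("ua"):
            continue
        target = urlsplit(match.group("target"))
        if target.query:
            variants[target.path].add(target.query)
    return {
        path: len(queries)
        for path, queries in variants.items()
        if len(queries) >= min_variants
    }
```

A path with hundreds of query variants and one real page behind it is a crawl trap. Fix it with the canonical, noindex, or facet rules above.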

Core Web Vitals for App-Like Sites: Fix the Usual Bottlenecks

When Google Search Console shows a page as “indexed” but it still underperforms, Core Web Vitals is often the missing piece of SEO. App-like sites ship heavy JavaScript, long task chains, and layout shifts from late-loading UI. Those problems map directly to the three CWV metrics Google reports: LCP (Largest Contentful Paint), INP (Interaction to Next Paint), and CLS (Cumulative Layout Shift).

Use a two-source workflow: PageSpeed Insights for field data from the Chrome UX Report (CrUX), and Lighthouse in Chrome DevTools for repeatable lab runs. If CrUX has “No Data,” your traffic is too low for field reporting, so treat Lighthouse plus real-user monitoring as your guardrails.

Core Web Vitals Fixes That Usually Move the Needle

  1. Fix LCP at the template level: Make the LCP element predictable. Preload the hero image or primary font, keep it in initial HTML for SSR/SSG routes, and remove client-only skeletons that delay real content. Serve images as AVIF or WebP via a CDN like Cloudflare Images or ImageKit, and set explicit width and height.
  2. Reduce hydration and main-thread work for INP: Split bundles by route, remove unused code, and defer non-critical components. In Next.js, use next/dynamic for client-only widgets. In React, use React.lazy where it fits. Avoid long tasks from analytics and chat widgets by loading them after user interaction or after the page becomes idle.
  3. Stop layout shifts that cause CLS: Reserve space for images, embeds, and ad slots. Avoid inserting banners above existing content. Use CSS aspect-ratio for media containers, and keep font swaps controlled with font-display: swap plus sane fallback metrics.
  4. Cache like an app, not a brochure: Give static assets content-hashed filenames and serve them with a long-lived, immutable Cache-Control header. Cache HTML where safe, then revalidate. Use a CDN (Cloudflare or Fastly) and compress responses with Brotli.
  5. Audit third-party scripts ruthlessly: Tag Manager sprawl is a CWV killer. In Google Tag Manager, remove unused tags, block duplicate pixels, and restrict triggers. If a script does not tie to revenue or compliance, cut it.
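
As an illustration of the caching rule in item 4, the sketch below picks a Cache-Control value from the request path. The hashed-filename pattern, the /app/ prefix, and the specific lifetimes are assumptions. The directives themselves (immutable, s-maxage, stale-while-revalidate, no-store) are standard.

```python
import re

# Assumes the bundler emits content-hashed names such as app.3f9a1c2b.js.
HASHED_ASSET = re.compile(r"\.[0-9a-f]{8,}\.(?:js|css|woff2|avif|webp|png|jpg|svg)$")


def cache_control_for(path):
    """Choose a Cache-Control header value for a response path."""
    if HASHED_ASSET.search(path):
        # The hash changes whenever the content does, so cache for a year.
        return "public, max-age=31536000, immutable"
    if path.startswith("/app/"):
        # Hypothetical logged-in area: never store in shared caches.
        return "private, no-store"
    # Public HTML: browsers revalidate, the CDN holds it briefly.
    return "public, max-age=0, s-maxage=300, stale-while-revalidate=60"
```

Unhashed assets fall through to the short-lived rule on purpose. A long-lived cache on a filename that never changes is how stale bundles survive a deploy.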

Core Web Vitals work best as release criteria: fail the build when Lighthouse regresses, then confirm improvements in PageSpeed Insights field data over the following weeks. CrUX reports a rolling 28-day window, so a fix takes about a month to show in full.
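
A release gate needs numbers. The sketch below encodes Google's published “good” and “poor” boundaries for the three metrics, which are assessed at the 75th percentile of page loads. The function names and the input shape are assumptions. Note that Lighthouse lab runs do not measure INP, so in CI you would typically gate on Total Blocking Time as a proxy and apply the INP threshold to field data.

```python
# (good, poor) boundaries: at or below "good" passes, above "poor" is poor.
THRESHOLDS = {
    "lcp_ms": (2500, 4000),
    "inp_ms": (200, 500),
    "cls": (0.1, 0.25),
}


def rate(metric, value):
    """Classify one metric value as good, needs improvement, or poor."""
    good, poor = THRESHOLDS[metric]
    if value <= good:
        return "good"
    if value <= poor:
        return "needs improvement"
    return "poor"


def release_gate(p75):
    """Return the metrics that are not 'good'; an empty dict means the gate passes."""
    ratings = {metric: rate(metric, value) for metric, value in p75.items()}
    return {metric: label for metric, label in ratings.items() if label != "good"}
```

For example, release_gate({"lcp_ms": 3100, "inp_ms": 180, "cls": 0.05}) reports only LCP as needing improvement.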

Why Your IA and Internal Links Matter More Than Your Framework

If you treat Lighthouse regressions as release blockers, apply the same discipline to information architecture. Technical SEO dies quietly when Google cannot discover pages, or when PageRank bleeds into dead ends. Framework debates (Next.js vs Nuxt vs SvelteKit) matter less than whether your routes form a crawlable graph.

Information architecture (IA) is the set of page types and link paths that explain your site. Internal links are the wiring that moves discovery and authority through that structure. A React app with clean hubs can outrank a “perfect” SSR build that strands pages behind search boxes and filters.

Internal Linking SEO: The Patterns That Actually Move Rankings

Good IA looks boring: clear categories, stable URLs, and repeated link paths that Googlebot can crawl without executing app state. Build around a few durable patterns.

  • Hub pages: Create indexable hubs for each product line, integration, industry, or use case. Hubs should link to every child page with plain <a href> links in the initial HTML, not onClick navigation.
  • Breadcrumbs: Add breadcrumbs on detail pages and keep them consistent with your URL structure. Breadcrumbs reduce orphan risk and reinforce hierarchy. If you use schema, follow Google’s Breadcrumb structured data rules.
  • Orphan-page prevention: Treat “no internal links” as a bug. A page that exists only in an XML sitemap often indexes slowly and ranks poorly. Enforce a rule: every indexable route needs at least one link from an indexable hub.
  • Pagination with real links: If you have collections, ship crawlable paginated URLs (for example, ?page=2 or /page/2/) with server-rendered links. Infinite scroll can stay for users, but it cannot be the only path.
  • Facets with intent: Promote a small set of filter combinations into curated landing pages, then link them from hubs. Keep the rest noindex or canonicalized, as covered in crawl controls.
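
Treating “no internal links” as a bug is easier when a script can find the bug. This sketch assumes you already have a link graph from a crawl (page mapped to the hrefs found in its initial HTML) and a list of indexable routes. It checks reachability from hubs, which is a looser test than the “at least one link from an indexable hub” rule, and the function name is illustrative.

```python
from collections import deque


def find_orphans(links, hubs, indexable):
    """Return indexable routes that cannot be reached by following links from the hubs."""
    seen = set(hubs)
    queue = deque(hubs)
    while queue:
        page = queue.popleft()
        for target in links.get(page, ()):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return sorted(set(indexable) - seen)
```

Any route this returns exists for Google only through the sitemap, if at all. Link it from a hub or question whether it should be indexable.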

Want a quick audit? In Google Search Console, pick a page that should rank and check who links to it in the Links report. If the answer is “almost nobody,” your IA is the problem, not your framework.

SEO Launch and Migration Checklist for Custom Builds

A page can have perfect internal links and still lose SEO the day you ship a new build. Launches and migrations fail for boring reasons: redirect gaps, missing content, staging leaks, and Google Search Console staying quiet until traffic drops. Treat release day like an engineering change with measurable acceptance criteria.

Technical SEO Launch Checklist (Pre-Launch)

  1. Freeze and map URLs: export every indexable URL from the old site (CMS export, XML sitemap, and a crawl with Screaming Frog SEO Spider). Decide which URLs stay, which merge, and which die.
  2. Redirect with intent: implement 301 redirects for every retired URL to the closest equivalent page, not the homepage. Avoid chains and loops. Validate at the edge (Cloudflare, Fastly, or your load balancer) before app routing runs.
  3. Confirm content parity: compare titles, H1s, canonical tags, meta robots, and main body content between old and new templates. Migrations often “simplify” pages into thin shells that render fine in the browser and index poorly.
  4. Lock down staging: put staging on a separate host, require authentication or IP allowlisting, and add noindex as a backup. Do not rely on robots.txt alone for access control.
  5. Ship crawl controls: production robots.txt, XML sitemaps with only canonical 200 URLs, and consistent canonicals (host, protocol, trailing slash rules).
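
Steps 1 and 2 produce a redirect map, and that map can be linted before it reaches the edge. The sketch below assumes a simple {old_path: new_path} dictionary and a set of URLs that exist on the new site. The function name and the problem labels are illustrative. It flags loops, chains, missing targets, and redirects dumped on the homepage.

```python
def audit_redirects(redirects, live_urls, homepage="/"):
    """Check a {old_path: new_path} map and return (source, problem) pairs."""
    problems = []
    for source, target in redirects.items():
        if target == source:
            problems.append((source, "loop"))
        elif target in redirects:
            # The target is itself redirected: follow hops to tell a loop from a chain.
            seen, hop = {source}, target
            while hop in redirects and hop not in seen:
                seen.add(hop)
                hop = redirects[hop]
            problems.append((source, "loop" if hop in seen else "chain"))
        elif target not in live_urls:
            problems.append((source, "target missing from new site"))
        elif target == homepage:
            problems.append((source, "redirects to homepage"))
    return problems
```

A homepage redirect is sometimes legitimate, so treat that flag as a prompt for review, not an automatic failure.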

Before you flip traffic, run a small set of “money URLs” through Google Search Console URL Inspection in a test property if possible, and through Lighthouse for performance regressions.
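
Alongside those checks, a few lines of Python can confirm that the production robots.txt does not block your money URLs or the assets they need. This is a sketch with a caveat: Python's urllib.robotparser applies rules in file order and does not support wildcards inside paths, so it only approximates Google's matching. Keep the test to simple prefix rules, and treat the helper name as illustrative.

```python
from urllib.robotparser import RobotFileParser


def blocked_for_googlebot(robots_txt, urls):
    """Return the URLs that the given robots.txt text disallows for Googlebot."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch("Googlebot", url)]
```

Pass it the robots.txt you are about to deploy plus your money URLs and their CSS and JS bundles. Anything public in the returned list is a launch blocker.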

Post-Launch Monitoring (First 72 Hours, Then 30 Days)

  • Google Search Console: watch Indexing reports for spikes in “Duplicate, Google chose different canonical than user” and “Crawled - currently not indexed.” Check Crawl stats for sudden URL volume increases that suggest parameter traps.
  • Server logs or CDN logs: confirm Googlebot hits the expected routes and receives 200 HTML with the right canonicals. If you use Cloudflare, review Web Analytics and firewall events for blocked bot requests.
  • Redirect validation: re-crawl the old URL list and confirm each URL returns a single 301 to its intended destination, and that the destination returns 200.

Write a rollback plan before launch: which deploy tag you revert to, which database migrations are reversible, and which edge rules (redirects, headers, WAF) you must restore. If you want a practical next step, pick 20 high-intent URLs, build a redirect map, and make passing that map a release gate.