SEO Technical Foundations Checklist for Custom Web Apps
Google Search Console doesn’t say “your app is invisible.” It says “Excluded,” “Duplicate,” “Crawled, currently not indexed,” and it leaves your team to guess which release caused it. In custom web apps, those failures are usually self-inflicted: a route that only exists after JavaScript runs, a canonical that points to the wrong URL, a parameter that explodes into duplicates, or a redirect chain that turns a migration into a crawl trap.
This checklist is the engineering layer of SEO. It gives you concrete checks you can ship and verify so crawlers see the same site your users see: crawl and index controls (robots, canonicals, sitemaps, status codes), rendering choices that don’t hide primary content (SSR vs CSR and common hydration gotchas), internal linking patterns that keep routes discoverable, and Core Web Vitals work that shows up in CrUX field data.
You won’t find keyword research or content planning here. This is for the moments where good content still loses because the app returns the wrong status code, navigation links aren’t crawlable, or a “small” routing change creates a second version of every page.
If you own a custom build, whether you sit in product, engineering, or marketing, use this as an acceptance checklist. The payoff is measurable: cleaner index coverage in Google Search Console, fewer surprises after deploys, and performance you can track in PageSpeed Insights.
Which Pages Can Google Crawl and Index? (Robots, Canonicals, Sitemaps)
“Excluded” in Google Search Console usually means Google found a URL, then decided it could not or should not index it. This is where SEO becomes mechanical: you control what crawlers can fetch, what they should index, and which URL is the main version.
Run these checks on every indexable template (homepage, category, product, article, docs, app landing pages), not on a single example URL.
- robots.txt: Confirm you are not blocking CSS/JS or key routes (common mistake: Disallow: /app or Disallow: /*?). Use Google’s robots.txt guidance and test patterns carefully. (Google Search Central: robots.txt)
- Meta robots and X-Robots-Tag: Verify production pages do not ship noindex. Check the HTML <meta name="robots"> tag and server headers for PDFs or API-rendered pages.
- Canonicals: Every indexable page should self-canonicalize, unless you intentionally consolidate duplicates. Watch for canonicals pointing to staging, HTTP, or a generic route like /. Google treats canonicals as a hint, so keep them consistent with internal links and sitemaps.
- XML sitemaps: Include only canonical, 200-status URLs. Exclude parameter URLs, search results, and paginated pages unless you have a reason. Submit in Google Search Console. (Google Search Central: Sitemaps)
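The robots.txt check is easy to automate as a smoke test. Here is a minimal Python sketch, assuming a made-up robots.txt body and a made-up list of paths that must stay crawlable. Python's standard-library parser matches plain path prefixes and is not a full implementation of Google's wildcard syntax, so a rule like Disallow: /*? still needs a manual test.

```python
from urllib.robotparser import RobotFileParser


def blocked_paths(robots_txt: str, paths: list[str], user_agent: str = "Googlebot") -> list[str]:
    """Return the paths that the given robots.txt body disallows for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [path for path in paths if not parser.can_fetch(user_agent, path)]


# Hypothetical robots.txt containing the over-broad rule described above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /app
"""

# Hypothetical routes and assets that must stay crawlable.
MUST_BE_CRAWLABLE = ["/", "/pricing", "/app/landing", "/assets/main.js"]

if __name__ == "__main__":
    # Any path printed here is blocked and should fail the check.
    print(blocked_paths(ROBOTS_TXT, MUST_BE_CRAWLABLE))
```

In a real check you would read the robots.txt body from your production host and fail the build when the returned list is not empty.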
Indexability Checks: Status Codes, Redirects, and Parameter Duplicates
- Status codes: Serve 200 for real pages, 404 for URLs that do not exist, and 410 for content you removed on purpose and do not plan to restore. Avoid soft 404s where you return 200 with “Not found” UI.
- Redirect strategy: Use 301 for permanent moves, 302 only for truly temporary cases. Avoid redirect chains and loops; they waste crawl budget and slow down users.
- URL parameters: Inventory parameter patterns (UTM tags, sorting, filtering, session IDs). For faceted navigation, decide which combinations deserve indexing, then enforce it with canonical rules, internal linking rules, and sometimes noindex,follow on thin variants.
- Duplicate hosts and protocols: Force one version (https, one hostname). Redirect http to https and non-preferred host to preferred host.
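The parameter and host rules above boil down to one deterministic function that maps any incoming URL to its canonical form. The following is a rough Python sketch under stated assumptions: PREFERRED_HOST and INDEXABLE_PARAMS are placeholders, and every input URL is assumed to belong to your own site.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumptions for illustration: your preferred host and the parameters worth indexing.
PREFERRED_HOST = "www.example.com"
INDEXABLE_PARAMS = {"page"}


def canonical_url(url: str) -> str:
    """Force https and one host, then drop every parameter not on the allowlist."""
    parts = urlsplit(url)
    kept = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key in INDEXABLE_PARAMS
    )
    path = parts.path or "/"
    # The fragment is dropped because it never reaches the server.
    return urlunsplit(("https", PREFERRED_HOST, path, urlencode(kept), ""))


if __name__ == "__main__":
    print(canonical_url("http://example.com/shoes?utm_source=x&sort=price&page=2"))
```

Using the same function for the canonical tag, the sitemap generator, and the redirect layer keeps the three signals from drifting apart.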
How Do JavaScript Apps Get Indexed? (SSR vs CSR vs Hybrid)
Run those template-level checks in a browser with JavaScript turned off, because that is where SEO breaks for many custom apps. Google can render JavaScript, but indexing often happens in two steps: Googlebot crawls the HTML, then a rendering service executes JS later. If your primary content only exists after client-side code runs, you risk partial indexing, delayed updates, and thin snippets.
SSR (server-side rendering) returns real HTML for each URL. CSR (client-side rendering) returns a shell and relies on JS to build the page. Hybrid approaches (Next.js, Nuxt, SvelteKit, Remix) mix SSR, static generation, and client hydration per route. For technical SEO, SSR or hybrid with strong content parity is the safer default for marketing pages, docs, and category pages.
JavaScript SEO Checklist for SSR, CSR, and Hybrid Apps
- View source test: In Chrome, use View Page Source. Confirm the title, H1, primary copy, and key internal links exist in the initial HTML for indexable routes.
- Rendered HTML test: Compare View Source vs Elements. Large differences mean bots may see less than users.
- Google URL Inspection: In Google Search Console, run URL Inspection and open “View crawled page” to confirm Google renders the same content you expect.
- Hydration stability: Watch for layout shifts and content flicker during hydration. These often correlate with mismatched SSR and client state, and they can break internal links or headings.
- Client-side routing: Every route must be a real URL that returns a 200 with a unique canonical. Avoid hash routing (#/path) for indexable content.
- Infinite scroll: Provide paginated URLs (for example, ?page=2) with crawlable links, and keep a stable canonical strategy. Google cannot reliably discover items that only load on scroll.
- Metadata rendering: Ensure <title>, meta description, canonical, and robots meta render server-side. Client-only head tags often fail in crawlers and social scrapers.
If you need a neutral reference point while debugging, Google’s documentation on JavaScript SEO is the baseline: Google Search Central: JavaScript SEO.
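The view source test can also run as a script against the raw HTML your server returns, before any JavaScript executes. Here is a minimal Python sketch using only the standard library. The class and function names are illustrative, and the check covers only the title, first H1, canonical, and robots meta.

```python
from html.parser import HTMLParser


class HeadSignals(HTMLParser):
    """Collect title, first H1, canonical, and meta robots from raw HTML."""

    def __init__(self) -> None:
        super().__init__()
        self.signals = {"title": "", "h1": "", "canonical": "", "robots": ""}
        self._capture = None  # name of the text element being read, if any

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag in ("title", "h1") and not self.signals[tag]:
            self._capture = tag
        elif tag == "link" and (attr.get("rel") or "").lower() == "canonical":
            self.signals["canonical"] = attr.get("href") or ""
        elif tag == "meta" and (attr.get("name") or "").lower() == "robots":
            self.signals["robots"] = (attr.get("content") or "").lower()

    def handle_endtag(self, tag):
        if tag == self._capture:
            self._capture = None

    def handle_data(self, data):
        if self._capture:
            self.signals[self._capture] += data


def check_initial_html(html: str) -> list[str]:
    """Return a list of problems found in the server-delivered HTML."""
    parser = HeadSignals()
    parser.feed(html)
    parser.close()
    found = parser.signals
    problems = [
        f"missing {name}"
        for name in ("title", "h1", "canonical")
        if not found[name].strip()
    ]
    if "noindex" in found["robots"]:
        problems.append("noindex present")
    return problems
```

An empty CSR shell fails all three presence checks, which is exactly the case this test is meant to catch.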
Internal Linking That Scales in Custom Apps (URLs, Nav, Breadcrumbs)
Google can execute JavaScript, but discovery still depends on links. In custom apps, SEO often fails because your internal linking only exists inside components, not in crawlable HTML, or because routes exist with no path from the main navigation.
Use this checklist to make your URL system and internal links predictable as the app grows.
- Define a stable URL pattern per template: pick one canonical shape for each content type (for example, /solutions/{slug}, /docs/{product}/{topic}). Avoid mixing IDs and slugs across the same template unless you 301 one to the other.
- Keep URLs boring: lowercase, hyphenated, no tracking parameters in internal links, no session IDs, no client-only state (for example, ?tab=) on indexable pages.
- Enforce trailing slash rules: choose slash or no-slash and redirect the other version. Make sure canonicals match.
- Prevent orphan routes: every indexable route needs at least one static, crawlable link from an indexable page. Verify this with Screaming Frog SEO Spider, a site crawler used for internal link audits.
- Build “hub” pages on purpose: category, use-case, and docs landing pages should link to all child pages you expect to rank. Do not rely on an internal search box for discovery.
- Use real anchor text: link labels should describe the destination (“API authentication errors”) instead of “Learn more.”
- Paginate with links: for long lists, render <a href> links to page 2, 3, 4. Infinite scroll can exist for users, but keep a paginated URL path for crawlers.
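Orphan detection is a graph reachability problem. Here is a small Python sketch, assuming you already have a crawl export shaped as a mapping from each page to the set of URLs it links to. That shape is an assumption for illustration, not the export format of any specific crawler. It uses a stricter test than "at least one inbound link": a route counts as an orphan if no chain of links reaches it from the start page.

```python
from collections import deque


def find_orphans(links: dict[str, set[str]], indexable: set[str], start: str = "/") -> set[str]:
    """Return indexable routes that cannot be reached by following links from start."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, set()):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return indexable - seen


if __name__ == "__main__":
    # Hypothetical link graph: /docs/legacy links out but nothing links to it.
    graph = {
        "/": {"/docs", "/pricing"},
        "/docs": {"/docs/auth"},
        "/docs/legacy": {"/docs"},
    }
    routes = {"/", "/docs", "/docs/auth", "/docs/legacy", "/pricing"}
    print(find_orphans(graph, routes))
```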
Navigation and Breadcrumbs for Scalable SEO
Navigation should match your information architecture. If the app uses role-based menus, create a public, crawlable equivalent for public content so bots do not hit a dead end.
Implement breadcrumbs in HTML and add BreadcrumbList structured data. Google documents the required properties and testing approach in Google Search Central: Breadcrumb structured data. Breadcrumbs reduce internal link depth and make canonicals easier to reason about during URL changes.
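As a sketch of the structured data side, this Python helper builds BreadcrumbList JSON-LD from an ordered trail of (name, URL) pairs. The function name and the sample trail are illustrative. The output belongs in a script tag of type application/ld+json in the server-rendered HTML, and it is worth validating against Google's documentation before you rely on it.

```python
import json


def breadcrumb_jsonld(trail: list[tuple[str, str]]) -> str:
    """Build BreadcrumbList JSON-LD from (name, absolute URL) pairs, root first."""
    data = {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": [
            {"@type": "ListItem", "position": position, "name": name, "item": url}
            for position, (name, url) in enumerate(trail, start=1)
        ],
    }
    return json.dumps(data, indent=2)


if __name__ == "__main__":
    # Hypothetical trail for a docs page.
    print(breadcrumb_jsonld([
        ("Docs", "https://www.example.com/docs"),
        ("API", "https://www.example.com/docs/api"),
        ("Authentication", "https://www.example.com/docs/api/authentication"),
    ]))
```

Generating the visible breadcrumb links and the JSON-LD from the same trail keeps the two from disagreeing after a URL change.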
Finally, treat internal link modules as product features. When engineering ships a new template, require a link plan (nav placement, hub page inclusion, breadcrumbs) in the same ticket. That is how you keep technical SEO maintainable in a fast-moving codebase.
Core Web Vitals Checklist for Custom Builds (LCP, INP, CLS)
Internal links only help if pages load fast enough for users to stay and convert. In technical SEO, Core Web Vitals (CWV) is the user-experience layer Google measures at scale through field data in the Chrome User Experience Report (CrUX). CWV focuses on three metrics: Largest Contentful Paint (LCP) for loading, Interaction to Next Paint (INP) for responsiveness, and Cumulative Layout Shift (CLS) for visual stability.
- Set targets per template: Define budgets for marketing pages, docs, and app landing pages. Measure each template type separately in PageSpeed Insights and Lighthouse.
- Use field data first: Check CrUX and Google Search Console (Core Web Vitals report) to see real-user failures before chasing lab-only issues.
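To turn per-template targets into something checkable, this short Python sketch rates 75th-percentile field values against the thresholds published on web.dev: "good" means LCP at or under 2.5 s, INP at or under 200 ms, and CLS at or under 0.1. The template names and sample numbers are made up for illustration.

```python
# Thresholds as published on web.dev: (upper bound for "good", upper bound for "needs improvement").
THRESHOLDS = {
    "LCP": (2500, 4000),  # milliseconds
    "INP": (200, 500),    # milliseconds
    "CLS": (0.1, 0.25),   # unitless score
}


def rate(metric: str, p75: float) -> str:
    """Rate a 75th-percentile value as good, needs improvement, or poor."""
    good, needs_improvement = THRESHOLDS[metric]
    if p75 <= good:
        return "good"
    if p75 <= needs_improvement:
        return "needs improvement"
    return "poor"


def templates_failing(field_data: dict[str, dict[str, float]]) -> dict[str, list[str]]:
    """Map each template to the metrics whose p75 value is not rated good."""
    failing = {}
    for template, metrics in field_data.items():
        bad = [name for name, value in metrics.items() if rate(name, value) != "good"]
        if bad:
            failing[template] = bad
    return failing


if __name__ == "__main__":
    # Hypothetical p75 field values per template.
    sample = {
        "docs": {"LCP": 2100, "INP": 180, "CLS": 0.05},
        "product": {"LCP": 3100, "INP": 620, "CLS": 0.08},
    }
    print(templates_failing(sample))
```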
Core Web Vitals SEO Checklist: LCP, INP, CLS Levers
- LCP (loading): Optimize the hero element. Serve images as AVIF or WebP, set width/height, and preload the LCP image with rel="preload" when it is stable. Use a CDN like Cloudflare or Fastly for static assets. Cache HTML with a reverse proxy like NGINX or a platform cache (Vercel, Netlify) when pages are public.
- INP (responsiveness): Reduce main-thread work. Split bundles (Webpack, Vite), defer non-critical scripts, and remove heavy third-party tags. Audit with Chrome DevTools Performance and WebPageTest to find long tasks. Prefer server components or partial hydration where your framework supports it (Next.js, Nuxt, SvelteKit).
- CLS (stability): Reserve space for images, video, and embeds. Avoid injecting banners above the fold after first paint. Use font-display: swap with a fallback font of similar metrics so the swap does not shift text, and preconnect to the font origin when fonts load from a third-party host such as Google Fonts.
QA CWV like a regression test. For each template, run Lighthouse in CI (GitHub Actions), track trends in SpeedCurve or Calibre, and alert on real-user spikes in Sentry or Datadog RUM. Treat new components, chat widgets, and A/B testing scripts (Google Tag Manager, Optimizely) as performance changes that need review.
Reference: web.dev: Core Web Vitals.
The Contrarian Fix: Stop “Shipping Features” That Create SEO Debt
Core Web Vitals failures are visible fast in PageSpeed Insights. SEO debt is quieter: it accumulates release after release until Google Search Console fills with “Duplicate,” “Alternate page with proper canonical,” and “Crawled, currently not indexed.” In custom apps, that debt usually comes from feature work that changes URLs, routing, or crawl rules without an SEO acceptance test.
- Migrations without a redirect map: When you change slugs, folders, or the IA, ship 301s for every old URL to its closest new equivalent. Validate with Screaming Frog SEO Spider exports and server logs. A “we’ll redirect later” migration becomes permanent ranking loss.
- Blocked staging that leaks to production: Teams often add Disallow: / in robots.txt or noindex headers on staging, then copy the config to prod. Put robots and meta robots under environment-specific config, and add a deploy gate that fails if production returns noindex.
- Broken canonicals: Canonicals that point to the homepage, HTTP, or a staging host ask Google to fold many distinct pages into a single URL, which can drop those pages from the index. Make canonicals deterministic per route, and unit test them like any other output.
- Faceted navigation explosion: Filters like ?color=, ?size=, ?sort= can generate millions of crawlable URLs. Decide which facets get indexable landing pages, then point all other combinations at a canonical URL, and keep them out of XML sitemaps.
- Client-side routing mistakes: Hash routes (#/), inconsistent trailing slashes, and 200 responses for “not found” states create index bloat. Every indexable route needs a stable 200, a unique canonical, and a real 404 for missing content.
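A redirect map is just data, so you can lint it before it ships. This Python sketch assumes the map is a plain dictionary of old path to new path. It flattens chains so every old URL points straight at its final destination, and it reports any source that ends up in a loop.

```python
def resolve_redirects(redirects: dict[str, str]) -> tuple[dict[str, str], set[str]]:
    """Flatten chains so each old URL maps to its final target; report sources stuck in loops."""
    flattened: dict[str, str] = {}
    loops: set[str] = set()
    for source in redirects:
        seen = {source}
        target = redirects[source]
        # Follow the chain until the target is no longer itself redirected.
        while target in redirects:
            if target in seen:
                loops.add(source)
                break
            seen.add(target)
            target = redirects[target]
        else:
            flattened[source] = target
    return flattened, loops


if __name__ == "__main__":
    # Hypothetical migration map with one chain and one loop.
    migration = {"/a": "/b", "/b": "/c", "/x": "/y", "/y": "/x"}
    print(resolve_redirects(migration))
```

Shipping the flattened map means each old URL reaches its destination in a single 301, and a non-empty loop set is a reason to block the release.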
Assign Ownership So SEO Debt Does Not Ship
Engineering owns crawlability mechanics: status codes, redirects, rendering mode (SSR/CSR), robots, canonicals, and sitemap generation. Product owns URL change control: any route change requires a redirect map and a launch checklist item. Marketing owns intent and content, plus validation in Google Search Console with URL Inspection and the Page indexing report (formerly Index Coverage).
Use Google’s guidance as the shared baseline for disputes about canonicals, sitemaps, and crawling behavior: Google Search Central: Crawling and Indexing.
Pre-Launch and Ongoing SEO QA Workflow (Tests, Logs, GSC)
Google Search Central gives you the rules. Your release process decides whether the app follows them. Treat SEO QA as a pipeline: every deploy proves crawlability, indexability, rendering, and performance stayed intact.
Use this pre-launch checklist on staging and again on production after go-live:
- Lock down staging correctly: Block staging with HTTP auth or IP allowlists, not with a blanket Disallow: / that can leak into production. Add a staging-only banner so nobody screenshots the wrong environment.
- Run an automated crawl: Use Screaming Frog SEO Spider or Sitebulb to crawl your canonical host. Fail the release if you find pages that should be indexed but carry noindex, missing canonicals, 4xx spikes, redirect chains, or parameter URLs in internal links.
- Verify templates, not sample URLs: Check each template type (home, solution, docs, blog, category, product) for title, H1, canonical, robots meta, structured data, and 200 status.
- Check rendering parity: In Google Search Console URL Inspection, compare “View crawled page” HTML to what users see. Catch client-side routing bugs, empty SSR shells, and missing internal links.
- Validate sitemaps and robots: Confirm XML sitemaps list only canonical 200 URLs. Fetch /robots.txt in production and confirm it allows CSS, JS, and key routes.
- Performance smoke test per template: Run Lighthouse in CI (GitHub Actions) with budgets for LCP, CLS, and Total Blocking Time, the lab proxy for INP, since INP itself depends on real user interactions. Re-test any page that gained third-party scripts in Google Tag Manager.
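The sitemap check in this list can run as a script in the same pipeline. Here is a minimal Python sketch, assuming you have the sitemap XML as a string and a crawl result shaped as a mapping from URL to its status and canonical. That shape is an assumption for illustration, not the output of a particular crawler.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_problems(sitemap_xml: str, crawl: dict[str, dict]) -> list[str]:
    """Flag sitemap URLs that were not crawled, are not 200, or are not self-canonical."""
    root = ET.fromstring(sitemap_xml)
    problems = []
    for loc in root.findall("sm:url/sm:loc", SITEMAP_NS):
        url = (loc.text or "").strip()
        page = crawl.get(url)
        if page is None:
            problems.append(f"{url}: not crawled")
        elif page["status"] != 200:
            problems.append(f"{url}: status {page['status']}")
        elif page["canonical"] != url:
            problems.append(f"{url}: canonical points to {page['canonical']}")
    return problems
```

In CI, exit with a non-zero status when the list is not empty so the release fails before the sitemap is resubmitted.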
Ongoing SEO Monitoring: GSC, Analytics, and Server Logs
After launch, monitor what bots and users actually experience:
- Google Search Console: Watch Indexing and Core Web Vitals reports, then spot check releases with URL Inspection. Set email alerts for coverage and manual actions.
- Analytics: Use Google Analytics 4 to annotate releases and watch landing-page sessions and conversions by template.
- Log-based validation: Use NGINX access logs, Cloudflare logs, or AWS CloudFront logs to confirm Googlebot hits your sitemap URLs, receives 200s, and does not waste time in redirect loops.
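For the log check, a short script is often enough to start. This Python sketch assumes NGINX's default "combined" log format and matches Googlebot by user-agent string only. User agents can be spoofed, so treat the counts as a first pass and verify suspicious traffic with the reverse DNS method Google documents.

```python
import re
from collections import Counter

# Pulls the request path, status code, and user agent out of a "combined" format line.
LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_hits(lines: list[str]) -> tuple[Counter, list[tuple[str, str]]]:
    """Return status counts for Googlebot requests and the (path, status) pairs that were not 200."""
    counts: Counter = Counter()
    non_200: list[tuple[str, str]] = []
    for line in lines:
        match = LINE.search(line)
        if match is None or "Googlebot" not in match["agent"]:
            continue
        counts[match["status"]] += 1
        if match["status"] != "200":
            non_200.append((match["path"], match["status"]))
    return counts, non_200
```

A rising share of 301s or 404s in the Googlebot counts after a deploy is usually the earliest signal that a routing change went wrong.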
If you implement one habit this week, make every PR that changes routing, canonicals, robots, or rendering include an automated crawl report. It costs minutes and prevents months of SEO debt.