Most technical SEO problems do not announce themselves. They accumulate quietly in robots.txt files, canonical tags, and bloated index counts until rankings drop and no one can explain why. This checklist covers every layer of technical SEO that matters in 2026, including what has changed since the last time your audit was run.
- 25% of Google searches now trigger an AI Overview (Conductor, Q1 2026)
- A growing share of all searches end without a click in 2026, driven by AI Overviews and snippets (Incremys)
- 44% of LLM citations come from the first 30% of a page's text (Position Digital, April 2026)
Technical SEO falls into eight distinct layers. A problem in any one of them can suppress pages that are otherwise well-optimised. Work through these sections in order: crawlability issues at the top of the list prevent fixes further down from having any effect.
Crawlability and Indexing
Crawlability is the foundation of every other technical SEO effort. Google cannot rank a page it cannot find, and it cannot benefit from schema on a page it cannot render. Most ranking problems in 2026 trace back to indexation issues, not content quality.
- Verify your `robots.txt` is not blocking any page you want indexed. Check `yourdomain.com/robots.txt` directly and test specific URLs with the robots.txt report in Google Search Console.
- Submit your XML sitemap to Google Search Console and Bing Webmaster Tools. Resubmit after any major site restructure.
- Ensure your sitemap contains only canonical, indexable URLs returning 200 status. Sitemaps with noindex pages or redirects confuse crawlers and waste crawl budget.
- Check all important pages return 200 OK status codes. Soft 404s (pages returning 200 but displaying "not found" content) are one of the most common and least-noticed indexation problems.
- Review the Page indexing report (formerly the Coverage report) in Google Search Console weekly. Investigate every excluded page, especially "Crawled - currently not indexed" and "Discovered - currently not indexed."
- Noindex or consolidate thin, near-duplicate, and low-value pages to protect your index budget. In 2026, most large sites suffer from index bloat rather than crawl budget issues.
- Use the URL Inspection tool in Google Search Console to confirm how Google actually renders your key pages. The rendered HTML view reveals JavaScript-rendered content Google can and cannot see.
- Check that important navigation links are present in the DOM at load time, not dependent on JavaScript interactions. Hamburger menus that only appear after a click may be invisible to Googlebot.
- Run a log file analysis quarterly for large sites. Server logs show which URLs Googlebot actually crawled versus what your sitemap claims exists. The gap between the two reveals wasted crawl activity.
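The first check above can be scripted: Python's standard library includes a robots.txt parser, so you can confirm that none of your must-index URLs are blocked for Googlebot. The rules and URLs below are hypothetical placeholders for your own crawl inventory.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules for illustration
ROBOTS_TXT = """\
User-agent: *
Disallow: /cart/
Disallow: /internal-search
""".splitlines()

parser = RobotFileParser()
parser.parse(ROBOTS_TXT)

# Pages you expect to be indexed -- replace with your own URL inventory
must_index = [
    "https://example.com/",
    "https://example.com/blog/technical-seo-checklist",
]

# Any URL in this list is a crawlability bug to fix before anything else
blocked = [url for url in must_index
           if not parser.can_fetch("Googlebot", url)]
print(blocked)  # -> [] when nothing important is blocked
```

Because no `Googlebot`-specific group exists in the sample rules, the parser correctly falls back to the `User-agent: *` group, mirroring how Googlebot itself resolves the file.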
Fix crawl budget issues in `robots.txt`. Fix index budget issues with `noindex` and canonical tags.
Core Web Vitals
Core Web Vitals are confirmed ranking signals. When two pages have equivalent content quality and authority, the one with better Core Web Vitals ranks higher. The thresholds below are measured at the 75th percentile of real user data from CrUX, not from PageSpeed Insights lab scores.
| Metric | Good | Needs Work | Poor | Main Culprits |
|---|---|---|---|---|
| LCP (Largest Contentful Paint) | Under 2.5s | 2.5s to 4s | Over 4s | Unoptimised hero images, no CDN, slow TTFB |
| INP (Interaction to Next Paint) | Under 200ms | 200ms to 500ms | Over 500ms | Heavy JavaScript, third-party scripts, long tasks |
| CLS (Cumulative Layout Shift) | Under 0.1 | 0.1 to 0.25 | Over 0.25 | Images without dimensions, late-loading fonts, ads |
| TTFB (Time to First Byte) | Under 800ms | 800ms to 1.8s | Over 1.8s | Slow hosting, no CDN, unoptimised server logic |
- Confirm you are measuring INP, not First Input Delay (FID). FID was retired in March 2024. If your dashboards still show FID data, update your monitoring tools before auditing.
- Add `fetchpriority="high"` to your LCP element (usually the hero image). This tells the browser to load it before other resources.
- Preload the LCP image using `<link rel="preload" as="image">` in the document head to avoid it being discovered late in the render chain.
- Set explicit `width` and `height` attributes on every image and video embed. This prevents CLS by reserving layout space before the resource loads.
- Defer non-critical JavaScript with `defer` or `async` attributes. Long JavaScript tasks on the main thread are the primary cause of poor INP scores.
- Reduce third-party scripts. Each analytics tag, chat widget, and ad script adds main-thread execution time. Audit third-party impact in PageSpeed Insights under "Reduce the impact of third-party code."
- Serve assets from a CDN. TTFB over 800ms almost always indicates a server response problem, and a CDN is the fastest fix for globally distributed traffic.
- Measure using CrUX field data in Google Search Console, not just PageSpeed Insights lab scores. Lab scores measure a synthetic environment. Field data measures real users on real devices and networks.
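The missing-dimensions check can be partially automated with the standard-library HTML parser. This sketch flags `<img>` tags lacking explicit `width`/`height` attributes (a common CLS cause); the markup is hypothetical.

```python
from html.parser import HTMLParser

class ImgAudit(HTMLParser):
    """Collects <img> tags that are missing width or height attributes."""
    def __init__(self):
        super().__init__()
        self.missing_dimensions = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attrs = dict(attrs)
        if "width" not in attrs or "height" not in attrs:
            self.missing_dimensions.append(attrs.get("src", "(no src)"))

# Hypothetical page fragment for illustration
HTML = """
<img src="/hero.avif" width="1200" height="630" fetchpriority="high">
<img src="/team.webp" loading="lazy">
"""

audit = ImgAudit()
audit.feed(HTML)
print(audit.missing_dimensions)  # -> ['/team.webp']
```

Run the same pass over your rendered templates rather than raw source, since many CMS plugins inject or strip dimension attributes at render time.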
Struggling to pass Core Web Vitals on mobile?
We identify the exact scripts, images, and render patterns holding back your field data scores.
HTTPS and Security
- Enforce HTTPS site-wide with a 301 redirect from all HTTP URLs. Check for any pages still accessible over HTTP by crawling with Screaming Frog using the HTTP prefix.
- Fix all mixed content warnings. Mixed content (HTTP resources on an HTTPS page) triggers browser warnings and can depress rankings. Audit with Chrome DevTools Security panel or a crawler.
- Verify your SSL certificate is valid, covers all subdomains you use, and has at least 60 days remaining. Expired certificates cause catastrophic drops in organic traffic within 24 hours.
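The 60-day buffer is easy to monitor by computing days to expiry from the `notAfter` string that Python's `ssl` module returns from `getpeercert()`. The expiry date below is a hypothetical placeholder.

```python
from datetime import datetime, timezone

def days_until_expiry(not_after: str) -> int:
    """not_after uses the format ssl.getpeercert() returns,
    e.g. 'Jan  1 00:00:00 2030 GMT'."""
    expires = datetime.strptime(
        not_after, "%b %d %H:%M:%S %Y %Z"
    ).replace(tzinfo=timezone.utc)
    return (expires - datetime.now(timezone.utc)).days

# Hypothetical expiry date -- alert when under the 60-day threshold
remaining = days_until_expiry("Jan 1 00:00:00 2030 GMT")
print(remaining >= 60)
```

Wire this into a daily cron or monitoring check so the alert fires well before renewal deadlines, not after the certificate has lapsed.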
Mobile-First Indexing
Google has indexed using the mobile version of pages since 2020. If your mobile and desktop experiences differ in content, structured data, or internal links, Google indexes the mobile version and your desktop-only content is invisible to search.
- Use responsive design rather than a separate `m.` subdomain. If you still run a separate mobile site, ensure content parity is exact and the `rel="alternate"`/`rel="canonical"` annotations between the desktop and mobile URLs are correct.
- Confirm the mobile version of your pages contains the same body content, headings, and structured data as the desktop version. Verify using a Googlebot Smartphone user agent, not a desktop agent.
- Audit mobile usability with Lighthouse or Chrome DevTools; Search Console's dedicated Mobile Usability report was retired in 2023. Common failures: text too small to read, clickable elements too close together, content wider than the screen.
- Verify navigation elements are present in the DOM at page load. Navigation inside JavaScript that only renders on interaction may be invisible to Googlebot's mobile crawler.
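Content parity between mobile and desktop renders can be diffed mechanically. A minimal sketch, assuming you have already fetched both HTML variants (the fragments here are hypothetical): extract the headings from each and compare the sets.

```python
from html.parser import HTMLParser

class HeadingCollector(HTMLParser):
    """Collects the text content of h1-h3 elements."""
    def __init__(self):
        super().__init__()
        self.headings = []
        self._in_heading = False

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "h2", "h3"):
            self._in_heading = True

    def handle_endtag(self, tag):
        if tag in ("h1", "h2", "h3"):
            self._in_heading = False

    def handle_data(self, data):
        if self._in_heading and data.strip():
            self.headings.append(data.strip())

def headings(html: str) -> set:
    collector = HeadingCollector()
    collector.feed(html)
    return set(collector.headings)

# Hypothetical renders: the mobile template drops a section
desktop = "<h1>Pricing</h1><h2>Enterprise plans</h2>"
mobile = "<h1>Pricing</h1>"

# Anything desktop-only is invisible under mobile-first indexing
print(headings(desktop) - headings(mobile))  # -> {'Enterprise plans'}
```

The same diff approach extends to structured data blocks and internal link targets, the other two parity gaps called out above.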
URL Structure and Canonicals
Canonical tags are the most commonly misconfigured technical SEO element on large sites. A self-referencing canonical on a page that should be pointing to a consolidation URL pushes duplicate content signals exactly the wrong direction.
- Ensure all URLs are lowercase, hyphen-separated, and free of tracking parameters in their canonical form. Use canonical tags to consolidate URLs carrying UTM, session ID, or filter parameters (Search Console's URL Parameters tool has been retired).
- Add a self-referencing canonical tag to every page that should be indexed independently. This protects against duplicate content from URL variations (HTTP vs HTTPS, trailing slash vs no trailing slash, www vs non-www).
- Canonical tags on paginated category pages should point to page 1 only if the paginated content is truly duplicate. If each page surfaces unique products, a self-referencing canonical on each page is correct.
- Resolve www and non-www inconsistency. Choose one preferred version and 301 redirect the other; consistent canonical tags and sitemap URLs signal the preference to Google and Bing.
- Check for canonical chains: Page A canonicals to Page B, which canonicals to Page C. Google may ignore the chain entirely. All canonicals should point directly to the final preferred URL.
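Chain detection is easy to script once you have a crawl export of URL-to-canonical pairs. A sketch with hypothetical URLs: any canonical that takes more than one hop to resolve should be repointed directly at its final target.

```python
def canonical_chains(canonicals: dict) -> dict:
    """Flag URLs whose canonical takes more than one hop to resolve.
    `canonicals` maps URL -> declared canonical (from a crawl export)."""
    chains = {}
    for url, target in canonicals.items():
        hops, seen = 0, {url}
        # Follow declarations until we hit a self-canonical, an unknown
        # URL, or a loop
        while (target in canonicals
               and canonicals[target] != target
               and target not in seen):
            seen.add(target)
            target = canonicals[target]
            hops += 1
        if hops:
            chains[url] = target  # repoint the canonical here directly
    return chains

# Hypothetical crawl export for illustration
crawl = {
    "https://ex.com/a": "https://ex.com/b",   # chain: a -> b -> c
    "https://ex.com/b": "https://ex.com/c",   # single hop, acceptable
    "https://ex.com/c": "https://ex.com/c",   # self-referencing, correct
}
print(canonical_chains(crawl))  # -> {'https://ex.com/a': 'https://ex.com/c'}
```

Single-hop canonicals pass untouched; only multi-hop chains, which Google may ignore entirely, are reported.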
Site Architecture and Internal Linking
Site architecture determines how crawl equity flows across your domain. A well-structured site concentrates authority on the pages that matter most. A poorly structured one dilutes it across thousands of orphaned pages, tag archives, and auto-generated facet URLs.
- No important page should be more than three clicks from the homepage. Pages buried at depth 6 or deeper receive significantly less crawl attention and rank proportionally lower.
- Implement breadcrumb navigation on all interior pages. Breadcrumbs create shallow internal link paths, help Google understand your hierarchy, and are eligible for breadcrumb rich results in SERPs.
- Audit for orphaned pages: pages with no internal links pointing to them. A page without internal links receives no crawl equity from the rest of your site. Run a crawl, export all internal links, and cross-reference against your URL inventory.
- Use descriptive anchor text on all internal links. "Read our guide to local SEO" passes relevance signals. "Click here" and "Learn more" pass nothing.
- Limit the number of links on any single page to a reasonable amount. Pages with hundreds or thousands of links dilute the equity passed per link and may trigger quality concerns from Google's systems.
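Click depth and orphaned pages fall out of the same breadth-first traversal over your internal link graph. A sketch using a hypothetical link export:

```python
from collections import deque

def click_depths(links: dict, home: str) -> dict:
    """BFS from the homepage over an internal-link adjacency map;
    returns each reachable page's minimum click depth."""
    depths, queue = {home: 0}, deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths

# Hypothetical crawl export: page -> pages it links to
links = {
    "/": ["/services", "/blog"],
    "/services": ["/services/seo"],
    "/blog": ["/blog/post-1"],
}
all_pages = {"/", "/services", "/blog", "/services/seo",
             "/blog/post-1", "/old-campaign"}

depths = click_depths(links, "/")
orphans = all_pages - depths.keys()          # no internal path reaches these
too_deep = {p for p, d in depths.items() if d > 3}
print(orphans, too_deep)
```

Here `/old-campaign` surfaces as an orphan: it exists in the URL inventory but no internal link reaches it, so it receives no crawl equity.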
Structured Data and Schema
Structured data has become more strategically important in 2026 than at any previous point. It is now the primary mechanism by which AI search engines, including Google AI Mode and Perplexity, extract factual information from your pages to cite in generated answers. A site with comprehensive, error-free schema is architecturally positioned to appear in AI-generated responses regardless of traditional ranking position.
- Implement the schema type that most precisely describes each page: `Article` for editorial content, `Product` for product pages, `LocalBusiness` for location pages, `FAQPage` for FAQ sections, `HowTo` for instructional guides.
- Use JSON-LD format for all structured data, placed in the `<head>` or just before the closing `</body>` tag. Google recommends JSON-LD. Avoid microdata on new implementations.
- Validate every schema implementation with Google's Rich Results Test before and after deployment. Zero errors is the requirement. Warnings are acceptable but should be investigated.
- Ensure schema data matches visible page content exactly. If your `Product` schema states a price of $49 but the page shows $79, Google may suppress the rich result and flag a policy violation.
- Add `FAQPage` schema to your highest-traffic informational pages. FAQ-formatted content with schema is among the content patterns most frequently cited in AI Overviews and AI Mode responses.
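For reference, a minimal `FAQPage` payload, generated here with `json.dumps` so the required structure is explicit. The question and answer are placeholders; yours must mirror the visible on-page text exactly.

```python
import json

faq_schema = {
    "@context": "https://schema.org",
    "@type": "FAQPage",
    "mainEntity": [{
        "@type": "Question",
        "name": "What is a soft 404?",  # must match the on-page question
        "acceptedAnswer": {
            "@type": "Answer",
            "text": "A page that returns a 200 status code but shows "
                    "'not found' content to the user.",
        },
    }],
}

# Emit inside a <script type="application/ld+json"> tag in the page
print(json.dumps(faq_schema, indent=2))
```

Generating the payload from the same data source that renders the visible FAQ is the simplest way to guarantee the schema-versus-page match required above.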
Want to know which of your pages are citation-ready for AI search?
We map your schema coverage, content structure, and AI Overview eligibility across your full site.
Page Speed and Image Optimisation
- Convert all images to WebP or AVIF format. AVIF achieves 50 to 80% smaller file sizes than JPEG at equivalent quality. Use WebP as a fallback for browsers that do not yet support AVIF.
- Apply `loading="lazy"` to all below-fold images and `fetchpriority="high"` to your LCP element. Never lazy-load the hero or first visible image.
- Minify CSS, JavaScript, and HTML. Most modern build tools do this automatically. If you are on WordPress, verify your caching and minification plugin is active and configured correctly.
- Enable browser caching and compression (Gzip or Brotli) at the server level. These are typically server configuration or CDN settings, not application-level changes.
- Use `content-visibility: auto` on long page sections below the fold. This defers rendering of off-screen content and can meaningfully reduce initial rendering work without affecting user experience.
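To see why server-level compression earns its place on the list, compare payload sizes with the standard-library `gzip` module. The repetitive sample markup below stands in for a real page; real HTML often shrinks by well over half.

```python
import gzip

# Repetitive markup (like most templated HTML) compresses well
html = b"<div class='card'><h3>Title</h3><p>Description text</p></div>" * 500

compressed = gzip.compress(html)
savings = 1 - len(compressed) / len(html)
print(f"{len(html)} -> {len(compressed)} bytes ({savings:.0%} smaller)")
```

Brotli typically compresses a few percent better than Gzip at equivalent CPU cost, which is why CDNs default to it when the client advertises support.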
Duplicate Content and Hreflang
- Audit for duplicate title tags and meta descriptions across your site using Screaming Frog or Sitebulb. Duplicate titles signal that pages may have duplicate content. Each page needs a unique title.
- Handle faceted navigation and filter URLs deliberately. Decide for each filter type whether it warrants indexing (real keyword demand exists) or should be noindexed (near-duplicate of parent category). Apply your decision consistently across the template.
- Noindex or consolidate thin content: tag pages, author archive pages, date archive pages, and empty category pages. These accumulate in CMS installs and dilute index quality at scale.
- For multilingual or multiregional sites, implement `hreflang` tags on every page variant. Hreflang must be bidirectional: each language version must reference all others. A one-way hreflang annotation is ignored by Google.
- Handle discontinued or out-of-stock product pages with a 301 redirect to the nearest live replacement. Never redirect to the homepage: Google treats homepage redirects from product or content URLs as soft 404s.
Common duplicate-content sources to audit for:
- Print-friendly versions of pages indexed without a canonical pointing to the original
- Session IDs appended to URLs (e.g. `?sessionid=abc123`) generating unique crawlable URLs
- Both `yourdomain.com` and `www.yourdomain.com` serving identical content with no redirect
- HTTP and HTTPS versions of pages both accessible without a 301 redirect in place
- Boilerplate content in CMS category descriptions shared across multiple category pages
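Missing return links are the usual hreflang failure, and they are checkable from a crawl export. A sketch with hypothetical URLs: every alternate a page declares must declare that page back.

```python
def missing_return_links(hreflang: dict) -> list:
    """hreflang maps URL -> {lang: alternate URL} from a crawl export.
    Returns (page, alternate) pairs where the alternate never links back."""
    problems = []
    for url, alternates in hreflang.items():
        for alt in alternates.values():
            if alt == url:
                continue  # self-reference, nothing to reciprocate
            return_targets = hreflang.get(alt, {}).values()
            if url not in return_targets:
                problems.append((url, alt))
    return problems

# Hypothetical two-language site for illustration
crawl = {
    "https://ex.com/en/": {"en": "https://ex.com/en/",
                           "de": "https://ex.com/de/"},
    "https://ex.com/de/": {"de": "https://ex.com/de/"},  # missing en return tag
}
print(missing_return_links(crawl))
# -> [('https://ex.com/en/', 'https://ex.com/de/')]
```

Each pair in the output identifies an annotation Google will ignore until the alternate page adds the matching return tag.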
Technical SEO for AI Search Visibility
AI Overviews appear on 25% of Google searches in Q1 2026, and the rate continues rising. Being cited in an AI Overview earns significantly more clicks than ranking organically on the same query without citation. The technical foundation for AI citation is distinct from traditional ranking and requires deliberate attention.
What the Technical Audit Looks Like for AI Visibility
- Structure your most important pages in BLUF format (Bottom Line Up Front): state the direct answer in the first paragraph, then support it with detail. 44% of LLM citations come from the first 30% of page text.
- Use HTML elements that AI systems parse efficiently: definition lists (`<dl>`), tables, and ordered lists. Unstructured narrative paragraphs are extracted less reliably by RAG systems.
- Add FAQ schema to every page where users commonly ask specific questions. FAQ content with schema is one of the highest-citation formats in Google AI Mode and AI Overviews.
- Verify your site is not blocking AI crawlers in `robots.txt`. If you have a blanket `User-agent: *` disallow rule anywhere, check it is not also catching Googlebot-News, Googlebot-Image, or Google-Extended (the token that controls whether your content is used for Google's AI training).
- Ensure entity consistency across your site. If your brand or main topic is referenced inconsistently (abbreviations, alternate spellings, old names), AI systems may not correctly associate all your content with the target entity.
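The BLUF rule can be spot-checked: measure how far into the page text the key answer first appears. This is a rough heuristic sketch over hypothetical text, not a model of any particular LLM's extraction behaviour.

```python
def answer_position(text: str, answer: str) -> float:
    """Fraction of the way into the text where the answer first
    appears (1.0 if absent)."""
    index = text.lower().find(answer.lower())
    return index / len(text) if index >= 0 else 1.0

# Hypothetical page: direct answer first, supporting detail after
page_text = (
    "Technical SEO is the work of making a site crawlable, fast, and "
    "unambiguous to search engines. " + "Supporting detail follows. " * 50
)

position = answer_position(page_text, "making a site crawlable")
print(position < 0.3)  # True: the answer sits in the first 30% of the text
```

Pages where the answer first appears past the 0.3 mark are the ones to restructure into BLUF format first.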
Full 49-Point Master Checklist
Use this as your audit template. Run it quarterly for maintained sites and immediately before and after any major migration, platform change, or redesign.
- robots.txt not blocking any crawlable page you want indexed
- XML sitemap submitted to Google Search Console and Bing Webmaster Tools
- Sitemap contains only canonical, indexable, 200-status URLs
- All important pages return 200 OK (no soft 404s on key URLs)
- Page indexing report reviewed weekly in Google Search Console
- Thin, duplicate, and low-value pages noindexed or consolidated
- URL Inspection used to verify actual Google render of key pages
- Navigation links present in DOM at load time (not JS-gated)
- Log file analysis run quarterly (large sites)
- Monitoring INP, not FID (FID was retired March 2024)
- LCP element uses fetchpriority="high"
- LCP image preloaded in document head
- All images and video embeds have explicit width and height attributes
- Non-critical JavaScript deferred or async
- Third-party script impact reviewed and minimised
- Static assets served from a CDN
- Performance measured via CrUX field data, not only lab scores
- HTTPS enforced site-wide with 301 redirect from HTTP
- No mixed content warnings on any page
- SSL certificate valid with 60+ days remaining
- Responsive design implemented (not a separate mobile subdomain)
- Mobile content identical to desktop content (verified with Googlebot Smartphone agent)
- Mobile usability audited with zero issues (Lighthouse or Chrome DevTools)
- Navigation accessible in DOM without JavaScript interaction
- URLs lowercase, hyphen-separated, no tracking parameters in canonical form
- Self-referencing canonical on every independently indexable page
- No canonical chains (A to B to C). All canonicals point directly to final URL
- www vs non-www resolved with a site-wide 301 redirect and consistent canonical URLs
- Pagination handled with self-referencing canonicals (Google no longer uses rel=next/prev)
- No important page more than 3 clicks from homepage
- Breadcrumb navigation on all interior pages
- No orphaned pages (every important page has at least one internal link)
- Internal links use descriptive anchor text
- No page with an unreasonable number of outgoing links diluting equity
- Correct schema type applied to each page (Article, Product, FAQ, LocalBusiness, HowTo)
- All schema implemented in JSON-LD format
- Rich Results Test returns zero errors on every implemented schema
- Schema data matches visible page content (price, rating, availability)
- FAQPage schema added to high-traffic informational pages
- All images in WebP or AVIF format
- Below-fold images use loading="lazy", hero image uses fetchpriority="high"
- CSS, JavaScript, and HTML minified
- Browser caching and Gzip or Brotli compression enabled at server level
- content-visibility: auto applied to long off-screen page sections
- No duplicate title tags or meta descriptions across the site
- Faceted navigation and filter URLs managed with noindex or canonicals
- Thin content pages (tag, author, date archives) noindexed or consolidated
- Hreflang tags bidirectional and consistent across all language variants
- Discontinued pages 301 redirected to nearest live replacement (not to homepage)
Want an Expert Eye on Your Technical SEO?
We run the full 49-point audit against your site and deliver a prioritised fix list with estimated traffic impact per issue.