
The 47-Point Technical SEO Checklist Every Site Needs in 2026


Most technical SEO problems do not announce themselves. They accumulate quietly in robots.txt files, canonical tags, and bloated index counts until rankings drop and no one can explain why. This checklist covers every layer of technical SEO that matters in 2026, including what has changed since the last time your audit was run.

25% of Google searches now trigger an AI Overview (Conductor, Q1 2026)

60% of all searches end without a click in 2026, driven by AI Overviews and snippets (Incremys)

44% of LLM citations come from the first 30% of a page's text (Position Digital, April 2026)

Technical SEO falls into nine distinct layers. A problem in any one of them can suppress pages that are otherwise well-optimised. Work through these sections in order: crawlability issues at the top of the list prevent fixes further down from having any effect.

Crawlability and Indexing

Crawlability is the foundation of every other technical SEO effort. Google cannot rank a page it cannot find, and it cannot benefit from schema on a page it cannot render. Most ranking problems in 2026 trace back to indexation issues, not content quality.

9 Crawlability Checks
  • Verify your robots.txt is not blocking any page you want indexed. Check yourdomain.com/robots.txt directly and test specific URLs with the robots.txt report in Google Search Console (the standalone robots.txt Tester was retired in 2023). A minimal example follows this list.
  • Submit your XML sitemap to Google Search Console and Bing Webmaster Tools. Resubmit after any major site restructure.
  • Ensure your sitemap contains only canonical, indexable URLs returning 200 status. Sitemaps with noindex pages or redirects confuse crawlers and waste crawl budget.
  • Check all important pages return 200 OK status codes. Soft 404s (pages returning 200 but displaying "not found" content) are one of the most common and least-noticed indexation problems.
  • Review the Coverage report in Google Search Console weekly. Investigate every page in the "Excluded" tab, especially "Crawled - currently not indexed" and "Discovered - currently not indexed."
  • Noindex or consolidate thin, near-duplicate, and low-value pages to protect your index budget. In 2026, most large sites suffer from index bloat rather than crawl budget issues.
  • Use the URL Inspection tool in Google Search Console to confirm how Google actually renders your key pages. The rendered HTML view reveals JavaScript-rendered content Google can and cannot see.
  • Check that important navigation links are present in the DOM at load time, not dependent on JavaScript interactions. Hamburger menus that only appear after a click may be invisible to Googlebot.
  • Run a log file analysis quarterly for large sites. Server logs show which URLs Googlebot actually crawled versus what your sitemap claims exists. The gap between the two reveals wasted crawl activity.
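A minimal robots.txt sketch covering the first three checks. The blocked paths and sitemap URL are placeholders for illustration, not recommendations for any particular platform:

    # Allow all crawlers by default; block only genuinely private paths
    User-agent: *
    Disallow: /admin/
    Disallow: /cart/

    # Point crawlers at a sitemap containing only canonical, indexable URLs
    Sitemap: https://yourdomain.com/sitemap.xml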
⚠️
Index budget vs. crawl budget: These are different problems. Crawl budget is how many pages Googlebot will crawl in a given period. Index budget is how many pages Google considers worth retaining in its index. Most small and medium sites have no crawl budget problem. Most large sites have an index budget problem caused by thousands of near-duplicate filter URLs, thin tag pages, and paginated content. Fix crawl budget issues with robots.txt. Fix index budget issues with noindex and canonical tags.
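The index budget fix lives in page markup rather than robots.txt. A sketch for a hypothetical thin filter URL, shown as two alternatives (use one, not both, since a noindexed page is a contradictory canonical target):

    <!-- Option 1: keep the filter page crawlable but out of the index -->
    <meta name="robots" content="noindex, follow">

    <!-- Option 2: where the filtered view duplicates its parent category,
         consolidate signals to the parent instead -->
    <link rel="canonical" href="https://yourdomain.com/category/">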

Core Web Vitals

Core Web Vitals are confirmed ranking signals. When two pages have equivalent content quality and authority, the one with better Core Web Vitals ranks higher. The thresholds below are measured at the 75th percentile of real user data from CrUX, not from PageSpeed Insights lab scores. TTFB is included because it is the most common upstream cause of LCP failures, though it is not itself a Core Web Vital.

Metric | Good | Needs Work | Poor | Main Culprits
LCP (Largest Contentful Paint) | Under 2.5s | 2.5s to 4s | Over 4s | Unoptimised hero images, no CDN, slow TTFB
INP (Interaction to Next Paint) | Under 200ms | 200ms to 500ms | Over 500ms | Heavy JavaScript, third-party scripts, long tasks
CLS (Cumulative Layout Shift) | Under 0.1 | 0.1 to 0.25 | Over 0.25 | Images without dimensions, late-loading fonts, ads
TTFB (Time to First Byte) | Under 800ms | 800ms to 1.8s | Over 1.8s | Slow hosting, no CDN, unoptimised server logic
8 Core Web Vitals Checks
  • Confirm you are measuring INP, not First Input Delay (FID). FID was retired in March 2024. If your dashboards still show FID data, update your monitoring tools before auditing.
  • Add fetchpriority="high" to your LCP element when it is an image (usually the hero image; a text element such as an H1 can be the LCP element, but the attribute does not apply to it). This signals the browser to load it before other resources. See the markup sketch after this list.
  • Preload the LCP image using <link rel="preload" as="image"> in the document head to avoid it being discovered late in the render chain.
  • Set explicit width and height attributes on every image and video embed. This prevents CLS by reserving layout space before the resource loads.
  • Defer non-critical JavaScript with defer or async attributes. Long JavaScript tasks on the main thread are the primary cause of poor INP scores.
  • Reduce third-party scripts. Each analytics tag, chat widget, and ad script adds main thread execution time. Audit third-party impact in PageSpeed Insights under "Reduce the impact of third-party code."
  • Serve assets from a CDN. TTFB over 800ms almost always indicates a server response problem, and a CDN is the fastest fix for globally distributed traffic.
  • Measure using CrUX field data in Google Search Console, not just PageSpeed Insights lab scores. Lab scores measure a synthetic environment. Field data measures real users on real devices and networks.
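A sketch of the fetchpriority, preload, dimension, and defer checks in markup, assuming the LCP element is a hero image at the placeholder path /images/hero.avif:

    <head>
      <!-- Let the browser discover the LCP image before layout begins -->
      <link rel="preload" as="image" href="/images/hero.avif" fetchpriority="high">
      <!-- Defer non-critical JavaScript off the critical rendering path -->
      <script src="/js/app.js" defer></script>
    </head>
    <body>
      <!-- Explicit dimensions reserve layout space and prevent CLS -->
      <img src="/images/hero.avif" width="1200" height="630"
           fetchpriority="high" alt="Hero image">
    </body>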

Struggling to pass Core Web Vitals on mobile?

We identify the exact scripts, images, and render patterns holding back your field data scores.

Request a Performance Audit →

HTTPS and Security

3 HTTPS Checks
  • Enforce HTTPS site-wide with a 301 redirect from all HTTP URLs (a server config sketch follows this list). Check for pages still accessible over HTTP by crawling the http:// version of your domain with Screaming Frog.
  • Fix all mixed content warnings. Mixed content (HTTP resources on an HTTPS page) triggers browser warnings and can depress rankings. Audit with Chrome DevTools Security panel or a crawler.
  • Verify your SSL certificate is valid, covers all subdomains you use, and has at least 60 days remaining. Expired certificates cause catastrophic drops in organic traffic within 24 hours.
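A minimal sketch of the site-wide 301, assuming an nginx server (Apache achieves the same with a Redirect or mod_rewrite rule):

    # nginx: catch every plain-HTTP request and 301 it to HTTPS
    server {
        listen 80;
        server_name yourdomain.com www.yourdomain.com;
        return 301 https://$host$request_uri;
    }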

Mobile-First Indexing

Google has used the mobile version of pages for indexing since 2020. If your mobile and desktop experiences differ in content, structured data, or internal links, Google indexes the mobile version and your desktop-only content is invisible to search.

4 Mobile Checks
  • Use responsive design rather than a separate m. subdomain. If you still run a separate mobile site, ensure content parity is exact and the rel=alternate (desktop) and rel=canonical (mobile) annotations between versions are correct; hreflang is for language variants, not device variants.
  • Confirm the mobile version of your pages contains the same body content, headings, and structured data as the desktop version. In 2026, verify using a Googlebot Smartphone user agent, not a desktop agent (a command-line sketch follows this list).
  • Audit mobile usability and fix the classic failures: text too small to read, clickable elements too close together, content wider than the screen. Search Console's standalone Mobile Usability report was retired in late 2023, so use Lighthouse or Chrome DevTools device emulation instead.
  • Verify navigation elements are present in the DOM at page load. Navigation inside JavaScript that only renders on interaction may be invisible to Googlebot's mobile crawler.
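One way to spot-check mobile content parity from the command line is to fetch a page with curl using a Googlebot Smartphone user-agent string. The string below is illustrative; Google updates it over time, so copy the current one from its crawler documentation. This shows only the raw HTML response, so pair it with the URL Inspection tool's rendered view for the post-JavaScript comparison:

    # Fetch a key page as Googlebot Smartphone and inspect the returned HTML
    UA="Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
    curl -A "$UA" https://yourdomain.com/key-page/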

URL Structure and Canonicals

Canonical tags are the most commonly misconfigured technical SEO element on large sites. A self-referencing canonical on a page that should point to a consolidation URL sends duplicate content signals in exactly the wrong direction.

5 URL and Canonical Checks
  • Ensure all URLs are lowercase, hyphen-separated, and free of tracking parameters in their canonical form. Use canonical tags to consolidate URLs with UTM, session ID, or filter parameters; Search Console's URL Parameters tool was retired in 2022, so canonicals and robots rules are now the controls.
  • Add a self-referencing canonical tag to every page that should be indexed independently. This protects against duplicate content from URL variations (HTTP vs HTTPS, trailing slash vs no trailing slash, www vs non-www).
  • Canonical tags on paginated category pages should point to page 1 only if the paginated content is truly duplicate. If each page surfaces unique products, a self-referencing canonical on each page is correct.
  • Resolve www and non-www inconsistency. Choose one preferred version, 301 redirect the other, and set the preferred version in Google Search Console and Bing Webmaster Tools.
  • Check for canonical chains: Page A canonicals to Page B, which canonicals to Page C. Google may ignore the chain entirely. All canonicals should point directly to the final preferred URL.
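The self-referencing canonical and chain checks in markup, using a placeholder URL:

    <!-- On https://yourdomain.com/services/ and every variant of it
         (http://, www, trailing-slash, ?utm_source=... versions), the
         canonical points directly at the single preferred URL - never
         at an intermediate page that canonicals somewhere else -->
    <link rel="canonical" href="https://yourdomain.com/services/">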
💡
Canonical tags are hints, not directives. Google treats canonical tags as strong hints, not absolute instructions. If your canonical tag points to a page that is noindexed, blocked by robots.txt, or has significantly different content, Google may override it. The canonical and the target page need to be consistent for the hint to be respected.

Site Architecture and Internal Linking

Site architecture determines how crawl equity flows across your domain. A well-structured site concentrates authority on the pages that matter most. A poorly structured one dilutes it across thousands of orphaned pages, tag archives, and auto-generated facet URLs.

5 Architecture Checks
  • No important page should be more than three clicks from the homepage. Pages buried at depth 6 or deeper receive significantly less crawl attention and rank proportionally lower.
  • Implement breadcrumb navigation on all interior pages. Breadcrumbs create shallow internal link paths, help Google understand your hierarchy, and are eligible for breadcrumb rich results in SERPs (a markup sketch follows this list).
  • Audit for orphaned pages: pages with no internal links pointing to them. A page without internal links receives no crawl equity from the rest of your site. Run a crawl, export all internal links, and cross-reference against your URL inventory.
  • Use descriptive anchor text on all internal links. "Read our guide to local SEO" passes relevance signals. "Click here" and "Learn more" pass nothing.
  • Limit the number of links on any single page to a reasonable amount. Pages with hundreds or thousands of links dilute the equity passed per link and may trigger quality concerns from Google's systems.
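A minimal BreadcrumbList sketch in JSON-LD with placeholder page names; the final item conventionally omits the URL because it is the current page:

    <script type="application/ld+json">
    {
      "@context": "https://schema.org",
      "@type": "BreadcrumbList",
      "itemListElement": [
        { "@type": "ListItem", "position": 1, "name": "Home",
          "item": "https://yourdomain.com/" },
        { "@type": "ListItem", "position": 2, "name": "Guides",
          "item": "https://yourdomain.com/guides/" },
        { "@type": "ListItem", "position": 3, "name": "Technical SEO Checklist" }
      ]
    }
    </script>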

Structured Data and Schema

Structured data has become more strategically important in 2026 than at any previous point. It is now the primary mechanism by which AI search engines, including Google AI Mode and Perplexity, extract factual information from your pages to cite in generated answers. A site with comprehensive, error-free schema is architecturally positioned to appear in AI-generated responses regardless of traditional ranking position.

5 Structured Data Checks
  • Implement the schema type that most precisely describes each page: Article for editorial content, Product for product pages, LocalBusiness for location pages, FAQPage for FAQ sections, HowTo for instructional guides.
  • Use JSON-LD format for all structured data, placed in the <head> or just before the closing </body> tag. Google recommends JSON-LD. Avoid microdata on new implementations.
  • Validate every schema implementation with Google's Rich Results Test before and after deployment. Zero errors is the requirement. Warnings are acceptable but should be investigated.
  • Ensure schema data matches visible page content exactly. If your Product schema states a price of $49 but the page shows $79, Google may suppress the rich result and flag a policy violation.
  • Add FAQPage schema to your highest-traffic informational pages. FAQ-formatted content with schema is among the content patterns most frequently cited in AI Overviews and AI Mode responses.
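A minimal FAQPage sketch in JSON-LD. The question and answer here mirror this article's own FAQ, and per the parity check above, the same text must appear visibly on the page:

    <script type="application/ld+json">
    {
      "@context": "https://schema.org",
      "@type": "FAQPage",
      "mainEntity": [{
        "@type": "Question",
        "name": "How often should I run a full technical SEO audit?",
        "acceptedAnswer": {
          "@type": "Answer",
          "text": "Run a full audit quarterly, and immediately before and after any major migration, platform change, or redesign."
        }
      }]
    }
    </script>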
💡
Structured data for AI citation: Structured content (headings, lists, tables, FAQ sections) is the most effective format for AI search visibility. 44% of all LLM citations come from the first 30% of a page's text. Lead with your most important, directly answerable content. Place the context after the answer, not before.

Want to know which of your pages are citation-ready for AI search?

We map your schema coverage, content structure, and AI Overview eligibility across your full site.

Get Your Structured Data Audit →

Page Speed and Image Optimisation

5 Page Speed Checks
  • Convert all images to WebP or AVIF format. AVIF achieves 50 to 80% smaller file sizes than JPEG at equivalent quality. Use WebP as a fallback for browsers that do not yet support AVIF.
  • Apply loading="lazy" to all below-fold images and fetchpriority="high" to your LCP element. Never apply lazy loading to the hero or first visible image.
  • Minify CSS, JavaScript, and HTML. Most modern build tools do this automatically. If you are on WordPress, verify your caching and minification plugin is active and configured correctly.
  • Enable browser caching and compression (Gzip or Brotli) at the server level. These are typically server configuration or CDN settings, not application-level changes.
  • Use content-visibility: auto on long-page sections below the fold. This defers rendering of off-screen content and can meaningfully reduce initial page load work without affecting user experience.
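A sketch of the image format and lazy-loading checks, assuming AVIF, WebP, and JPEG versions of the same image exist at the placeholder paths shown:

    <!-- Hero / LCP image: modern formats with fallback, never lazy-loaded -->
    <picture>
      <source srcset="/images/hero.avif" type="image/avif">
      <source srcset="/images/hero.webp" type="image/webp">
      <img src="/images/hero.jpg" width="1200" height="630"
           fetchpriority="high" alt="Hero image">
    </picture>

    <!-- Below-fold images: lazy-load, keep explicit dimensions -->
    <img src="/images/chart.webp" width="800" height="450"
         loading="lazy" alt="Results chart">

    <style>
      /* Defer rendering work for long sections far below the fold;
         contain-intrinsic-size reserves approximate space to avoid CLS */
      .below-fold-section { content-visibility: auto; contain-intrinsic-size: auto 800px; }
    </style>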

Duplicate Content and Hreflang

5 Duplicate Content Checks
  • Audit for duplicate title tags and meta descriptions across your site using Screaming Frog or Sitebulb. Duplicate titles signal that pages may have duplicate content. Each page needs a unique title.
  • Handle faceted navigation and filter URLs deliberately. Decide for each filter type whether it warrants indexing (real keyword demand exists) or should be noindexed (near-duplicate of parent category). Apply your decision consistently across the template.
  • Noindex or consolidate thin content: tag pages, author archive pages, date archive pages, and empty category pages. These accumulate in CMS installs and dilute index quality at scale.
  • For multilingual or multiregional sites, implement hreflang tags on every page variant. Hreflang must be bidirectional: each language version must reference all others. A one-way hreflang is ignored by Google.
  • Handle discontinued or out-of-stock product pages with a 301 redirect to the nearest live replacement. Never redirect to the homepage. Google treats homepage redirects from product or content URLs as soft 404s.
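For the hreflang check, a minimal bidirectional sketch for a page with English and German variants (placeholder URLs). The identical block must appear on both versions, each referencing every variant including itself:

    <link rel="alternate" hreflang="en" href="https://yourdomain.com/en/page/">
    <link rel="alternate" hreflang="de" href="https://yourdomain.com/de/page/">
    <link rel="alternate" hreflang="x-default" href="https://yourdomain.com/en/page/">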
⚠️
Common mistakes that create duplicate content without anyone noticing
  • Print-friendly versions of pages indexed without a canonical pointing to the original
  • Session IDs appended to URLs (e.g. ?sessionid=abc123) generating unique crawlable URLs
  • Both yourdomain.com and www.yourdomain.com serving identical content with no redirect
  • HTTP and HTTPS versions of pages both accessible without a 301 redirect in place
  • Boilerplate content in CMS category descriptions shared across multiple category pages

Technical SEO for AI Search Visibility

AI Overviews appear on 25% of Google searches in Q1 2026, and the rate continues rising. Being cited in an AI Overview earns significantly more clicks than ranking organically on the same query without citation. The technical foundation for AI citation is distinct from traditional ranking and requires deliberate attention.

What the Technical Audit Looks Like for AI Visibility

  • Structure your most important pages in BLUF format (Bottom Line Up Front): state the direct answer in the first paragraph, then support it with detail. 44% of LLM citations come from the first 30% of page text.
  • Use HTML elements that AI systems parse efficiently: definition lists (<dl>), tables, and ordered lists. Unstructured narrative paragraphs are extracted less reliably by RAG systems.
  • Add FAQ schema to every page where users commonly ask specific questions. FAQ content with schema is one of the highest-citation formats in Google AI Mode and AI Overviews.
  • Verify your site is not blocking AI crawlers in robots.txt. If you have a blanket User-agent: * disallow rule anywhere, check it is not catching Googlebot-News, Googlebot-Image, or Google-Extended (the AI training crawler).
  • Ensure entity consistency across your site. If your brand or main topic is referenced inconsistently (abbreviations, alternate spellings, old names), AI systems may not correctly associate all your content with the target entity.
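For the AI crawler check, a robots.txt sketch showing one illustrative policy (stay citable in Google Search, opt out of model training). Whether to block training crawlers is a business decision, not a recommendation here:

    # Standard Google crawling stays open - required for AI Overview citation
    User-agent: Googlebot
    Allow: /

    # Blocking Google-Extended opts out of AI model training only;
    # it does not remove pages from AI Overviews (separate systems)
    User-agent: Google-Extended
    Disallow: /

    # GPTBot controls OpenAI training data use
    User-agent: GPTBot
    Disallow: /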

Full 47-Point Master Checklist

Use this as your audit template. Run it quarterly for maintained sites and immediately before and after any major migration, platform change, or redesign.

9 Crawlability and Indexing
  • robots.txt not blocking any crawlable page you want indexed
  • XML sitemap submitted to Google Search Console and Bing Webmaster Tools
  • Sitemap contains only canonical, indexable, 200-status URLs
  • All important pages return 200 OK (no soft 404s on key URLs)
  • Coverage report reviewed weekly in Google Search Console
  • Thin, duplicate, and low-value pages noindexed or consolidated
  • URL Inspection used to verify actual Google render of key pages
  • Navigation links present in DOM at load time (not JS-gated)
  • Log file analysis run quarterly (large sites)
8 Core Web Vitals
  • Monitoring INP, not FID (FID was retired March 2024)
  • LCP element uses fetchpriority="high"
  • LCP image preloaded in document head
  • All images and video embeds have explicit width and height attributes
  • Non-critical JavaScript deferred or async
  • Third-party script impact reviewed and minimised
  • Static assets served from a CDN
  • Performance measured via CrUX field data, not only lab scores
3 HTTPS and Security
  • HTTPS enforced site-wide with 301 redirect from HTTP
  • No mixed content warnings on any page
  • SSL certificate valid with 60+ days remaining
4 Mobile-First Indexing
  • Responsive design implemented (not a separate mobile subdomain)
  • Mobile content identical to desktop content (verified with Googlebot Smartphone agent)
  • Mobile usability verified with zero critical issues (GSC's Mobile Usability report was retired in 2023; use Lighthouse)
  • Navigation accessible in DOM without JavaScript interaction
5 URL Structure and Canonicals
  • URLs lowercase, hyphen-separated, no tracking parameters in canonical form
  • Self-referencing canonical on every independently indexable page
  • No canonical chains (A to B to C). All canonicals point directly to final URL
  • www vs non-www resolved with 301 redirect and Search Console preference set
  • Pagination handled with self-referencing canonicals and crawlable links between pages (Google no longer uses rel=next/prev)
5 Site Architecture and Internal Linking
  • No important page more than 3 clicks from homepage
  • Breadcrumb navigation on all interior pages
  • No orphaned pages (every important page has at least one internal link)
  • Internal links use descriptive anchor text
  • No page with an unreasonable number of outgoing links diluting equity
5 Structured Data and Schema
  • Correct schema type applied to each page (Article, Product, FAQ, LocalBusiness, HowTo)
  • All schema implemented in JSON-LD format
  • Rich Results Test returns zero errors on every implemented schema
  • Schema data matches visible page content (price, rating, availability)
  • FAQPage schema added to high-traffic informational pages
5 Page Speed and Images
  • All images in WebP or AVIF format
  • Below-fold images use loading="lazy", hero image uses fetchpriority="high"
  • CSS, JavaScript, and HTML minified
  • Browser caching and Gzip or Brotli compression enabled at server level
  • content-visibility: auto applied to long off-screen page sections
5 Duplicate Content and Hreflang
  • No duplicate title tags or meta descriptions across the site
  • Faceted navigation and filter URLs managed with noindex or canonicals
  • Thin content pages (tag, author, date archives) noindexed or consolidated
  • Hreflang tags bidirectional and consistent across all language variants
  • Discontinued pages 301 redirected to nearest live replacement (not to homepage)

Frequently Asked Questions

How often should I run a full technical SEO audit?
Run a full audit quarterly for actively maintained sites. Run one immediately before and after any major change: platform migration, domain change, redesign, or significant URL restructure. Between full audits, monitor Google Search Console weekly for new crawl errors, coverage drops, and Core Web Vitals regressions. Technical SEO issues introduced by development deployments can go undetected for weeks if there is no ongoing monitoring in place.
What tools do I need to run this checklist?
The free tools cover most of the checklist: Google Search Console (crawl errors, Coverage report, Core Web Vitals field data, URL Inspection), Google PageSpeed Insights (lab performance data), Google's Rich Results Test (schema validation), and Bing Webmaster Tools (secondary index coverage). For a deeper audit, Screaming Frog SEO Spider (desktop crawler, free up to 500 URLs) handles URL analysis, duplicate content detection, and canonical auditing. Log file analysis requires raw server log access and a log analyser tool or a developer.
My PageSpeed Insights lab score is good but my CrUX field data fails. Which matters more?
Field data matters more for rankings. PageSpeed Insights lab scores measure a controlled synthetic environment using specific hardware and network conditions. CrUX field data measures actual users visiting your site on their own devices and connections. Google uses field data in its ranking systems. A high lab score with poor field data often indicates your page performs well in ideal conditions but fails for real users on slower devices or congested networks. Diagnose the discrepancy by comparing your lab results against your field data by device type in Search Console.
Should I block AI crawlers in my robots.txt?
It depends on your goals. Blocking Google-Extended (the AI training crawler) prevents your content from being used in AI training data but does not prevent your pages from being cited in AI Overviews. Those are separate systems. Blocking GPTBot (OpenAI) prevents ChatGPT training data use. If your goal is to appear in AI-generated answers in Google Search, you should not block Googlebot or Google's standard crawlers. Review your robots.txt to ensure blanket User-agent: * disallow rules are not catching crawlers you want to allow.
How long do technical SEO fixes take to affect rankings?
It depends on the fix and your crawl frequency. Canonical tag corrections and robots.txt changes typically take effect within 1 to 2 Googlebot crawl cycles, which can range from days to weeks depending on your site's crawl rate. Core Web Vitals improvements are measured over a rolling 28-day window in CrUX, so ranking improvements appear 4 to 6 weeks after the fix is live. Noindexing large numbers of thin pages can produce ranking improvements in 4 to 8 weeks as Google reassesses your site's overall quality signal.
Is technical SEO different for JavaScript-heavy sites?
Yes, significantly. JavaScript-rendered content is crawled in two waves: the HTML is crawled first, then Google queues the page for full rendering. The rendering queue means JavaScript-dependent content can take days or weeks longer to be indexed compared to server-rendered HTML. For critical content (navigation, body text, structured data, internal links), always render server-side or ensure it is present in the initial HTML response. Use the URL Inspection tool's rendered HTML view to confirm exactly what Google sees after rendering your pages.

Want an Expert Eye on Your Technical SEO?

We run the full 47-point audit against your site and deliver a prioritised fix list with estimated traffic impact per issue.

Audit delivered in 48 hours • Prioritised by traffic impact • No contract required
