Why I'm publishing the scoring logic

IndexReady is a free tool I built to automatically grade websites on both traditional SEO and Generative Engine Optimization (GEO). Enter any URL and you get two independent 100-point scores within seconds.

I'm Josef — a web engineer, and IndexReady is a personal project I run and improve continuously.

One question I hear more than any other is: "What exactly are you grading, and why should I trust the score?" It's a fair question. Acting on a black-box number is how teams end up spending weeks on the wrong fixes.

This post documents the scoring logic for all 26 items, the threshold values, and the design intent as concretely as I can. The implementation lives under src/lib/analyzers/seo/ and src/lib/analyzers/geo/, with one file per item. The rest of this article maps onto those files one-to-one.

Design intent — why SEO and GEO are kept independent

The biggest design decision is treating SEO and GEO as two independent categories, each scored out of 100. Most existing SEO tools collapse everything into a single "SEO score." That's user-friendly, but it's a problem in an AI-search era.

Imagine a corporate site with a flawless title tag, polished meta description, and a clean sitemap. If llms.txt is missing and GPTBot is blocked in robots.txt, ChatGPT and Perplexity have essentially zero chance of citing the site. Compressing that situation into a single number hides a critical signal.

IndexReady takes the position that Google search optimization and AI search optimization are separate axes, and it surfaces them side by side so users can see at a glance which side is weak.

Scoring algorithm — how items aggregate

Each item returns a status (ok / warning / error) and a numeric score from 0 up to its weight. Each category score is the simple sum of its item scores (100 maximum for SEO, 100 for GEO), and the total adds the two categories together for a 200-point ceiling.

Status	Color	Meaning
ok	Green	Full points
warning	Yellow	Partial credit (often around 50% of the weight)
error	Red	0 points in most cases: a critical absence or misconfiguration (a few items keep a floor of 1–2 points, as the tables below show)

The result page surfaces the top three highest-loss items as "Top Priorities" so improvement order is obvious. The point of the tool is not the total score — it's deciding what to fix first.
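
To make the aggregation concrete, here is a minimal sketch of the category sums and the "Top Priorities" ranking. It is an illustration only, not the code under src/lib/analyzers/: the `ItemResult` shape, the `summarize` function, and the item ids are hypothetical names, and "highest-loss" is assumed to mean weight minus score.

```typescript
// Hypothetical result shape; the real analyzer types are not shown in this post.
type ItemStatus = "ok" | "warning" | "error";

interface ItemResult {
  id: string;
  category: "seo" | "geo";
  weight: number; // maximum points for the item
  score: number; // between 0 and weight
  status: ItemStatus;
}

interface ScoreSummary {
  seo: number;
  geo: number;
  total: number;
  topPriorities: string[];
}

function summarize(results: ItemResult[]): ScoreSummary {
  // A category score is the plain sum of its item scores.
  const sumFor = (category: "seo" | "geo"): number =>
    results
      .filter((r) => r.category === category)
      .reduce((acc, r) => acc + r.score, 0);

  const seo = sumFor("seo");
  const geo = sumFor("geo");

  // Assumption: "loss" is weight minus score; keep the three largest losses.
  const topPriorities = results
    .filter((r) => r.score < r.weight)
    .sort((a, b) => (b.weight - b.score) - (a.weight - a.score))
    .slice(0, 3)
    .map((r) => r.id);

  return { seo, geo, total: seo + geo, topPriorities };
}
```

Ranking by points lost rather than by status means a 12-point item scoring 6 outranks a 4-point item scoring 0, which matches the "fix the most expensive gap first" intent.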

SEO scoring — 15 items with thresholds

The SEO category sums to 100 points. Weights balance "impact on rankings" against "implementation cost."

Item	Weight	Full marks	Partial credit	Zero
Title tag	10	30–60 chars	Out of range (5)	Missing
Meta description	10	70–160 chars	Out of range (5)	Missing
Meta robots noindex	8	No noindex directive	—	noindex detected (likely misconfig)
PageSpeed	8	PSI ≥ 90	50–89 (5) / unavailable (3)	< 50 (1)
Core Web Vitals	8	LCP < 2.5s / INP < 200ms / CLS < 0.1 all pass	One metric over threshold or unavailable (5 or 3)	Two or more over threshold (1)
Heading structure	6	Exactly one h1 + at least one h2	h1 present, h2 absent (4) / multiple h1s (2)	No h1
OGP tags	6	og:title / description / image all set	One or two missing (2)	All missing
HTTPS	6	URL on https	—	http
Canonical tag	6	Present	—	Missing
Image alt	6	All images have alt (or no images)	≤30% of images missing alt (3)	>30% missing
robots.txt	6	Both User-agent and Sitemap directives	One of the two missing (3)	Missing
sitemap.xml	6	Contains `<urlset>` or `<sitemapindex>`	Found but not valid XML sitemap (3)	Missing
Content length	6	800+ chars (JA) or 500+ words (EN)	300–799 chars / 200–499 words (3)	< 300 chars / < 200 words
HTML lang	4	lang attribute present and well-formed (`ja`, `en-US`, etc.)	Malformed value (2)	Missing
viewport	4	Contains `width=device-width`	viewport set but `width=device-width` absent (2)	Missing

Why title and meta description are 10 points each

These two carry the maximum weight because they directly affect CTR. They're the first thing a user sees in search results, and a weak title kills clicks even when the page ranks well. Implementation cost is near zero, so any site that hasn't done them is leaving easy traffic on the table. The 30–60 / 70–160 ranges follow typical truncation lengths in search results; Google Search Central's Title Link Best Practices asks for concise, descriptive titles but does not publish a character limit.
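
The length rules in the table reduce to one range check. The sketch below assumes lengths are measured on the trimmed string with JavaScript's `length`; `LengthRule`, `TITLE_RULE`, `DESCRIPTION_RULE`, and `scoreLength` are hypothetical names, not the tool's API.

```typescript
interface LengthRule {
  min: number;
  max: number;
  weight: number; // full marks
  partial: number; // score when present but out of range
}

// Values taken from the SEO table above.
const TITLE_RULE: LengthRule = { min: 30, max: 60, weight: 10, partial: 5 };
const DESCRIPTION_RULE: LengthRule = { min: 70, max: 160, weight: 10, partial: 5 };

function scoreLength(text: string | null, rule: LengthRule): number {
  const trimmed = (text ?? "").trim();
  if (trimmed.length === 0) return 0; // missing or empty
  const inRange = trimmed.length >= rule.min && trimmed.length <= rule.max;
  return inRange ? rule.weight : rule.partial;
}
```

For example, a 45-character title would earn 10, a 5-character title 5, and a missing title 0.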

Why noindex is worth 8 points

I weighted the noindex check at 8 because the cost of a misconfiguration is wildly out of proportion to the cost of the fix. Staging templates leaking into production and silently de-indexing the entire site is a recurring incident class. The damage is enormous; the fix takes seconds. The score reflects that asymmetry.

Why PageSpeed and Core Web Vitals are separate

PageSpeed (8) and Core Web Vitals (8) look redundant but measure different things. PageSpeed is the Lighthouse aggregate score (mobile, lab data), which shows how much headroom for improvement remains. CWV uses field data (CrUX) for LCP / INP / CLS, which reflects what real users actually experience. I keep both because they answer different questions. Thresholds follow web.dev's Core Web Vitals guidance (LCP < 2.5s, INP < 200ms, CLS < 0.1).
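
As a rough sketch, the two items can be written as threshold functions. The table lists "5 or 3" for the Core Web Vitals partial bracket without saying which case gets which value, so this sketch assumes one failing metric earns 5 and missing field data earns 3, mirroring the PageSpeed row. `FieldVitals`, `scorePageSpeed`, and `scoreCoreWebVitals` are hypothetical names.

```typescript
interface FieldVitals {
  lcpMs: number | null; // Largest Contentful Paint in milliseconds
  inpMs: number | null; // Interaction to Next Paint in milliseconds
  cls: number | null; // Cumulative Layout Shift (unitless)
}

function scorePageSpeed(psi: number | null): number {
  if (psi === null) return 3; // PSI result unavailable
  if (psi >= 90) return 8;
  return psi >= 50 ? 5 : 1;
}

function scoreCoreWebVitals(v: FieldVitals): number {
  const checks: Array<[number | null, number]> = [
    [v.lcpMs, 2500],
    [v.inpMs, 200],
    [v.cls, 0.1],
  ];
  // Assumption: any missing field metric counts as "unavailable" (3).
  if (checks.some(([value]) => value === null)) return 3;
  // A metric passes only when it is strictly below its threshold.
  const failures = checks.filter(
    ([value, limit]) => value !== null && value >= limit
  ).length;
  if (failures === 0) return 8;
  return failures === 1 ? 5 : 1; // assumption: one failure earns 5
}
```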

Why lang and viewport are 4 points

HTML lang and meta viewport don't tank rankings on their own, so the weights are modest. They stay in the scorer because mobile-first indexing makes them basics nobody should skip.

Why content-length thresholds differ by language

Information density differs between Japanese and English. The implementation detects character classes to decide whether the page is Japanese, then counts characters (excluding whitespace) for JA and words for EN. The thresholds (~800 chars JA ≈ ~500 words EN) are operational rules for separating a "thin content" warning from a "clear deficiency" error; they are not research-grade values.
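
A minimal sketch of that branch might look like the following. The character ranges and the 30% cutoff for "this page is Japanese" are assumptions made for illustration; the post does not state how the real detector decides. `scoreContentLength` is a hypothetical name.

```typescript
// Assumed character classes: hiragana, katakana, and CJK unified ideographs.
const JAPANESE_CHARS = /[\u3040-\u30ff\u4e00-\u9fff]/g;

function scoreContentLength(text: string): number {
  const nonWhitespace = text.replace(/\s+/g, "");
  const japaneseCount = (nonWhitespace.match(JAPANESE_CHARS) ?? []).length;

  // Assumption: treat the page as Japanese when at least 30% of its
  // non-whitespace characters fall in the ranges above.
  const isJapanese =
    nonWhitespace.length > 0 && japaneseCount / nonWhitespace.length >= 0.3;

  if (isJapanese) {
    const chars = nonWhitespace.length;
    if (chars >= 800) return 6;
    return chars >= 300 ? 3 : 0;
  }

  // English path: count whitespace-separated words.
  const words = text
    .trim()
    .split(/\s+/)
    .filter((w) => w.length > 0).length;
  if (words >= 500) return 6;
  return words >= 200 ? 3 : 0;
}
```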

GEO scoring — 11 items with thresholds

The GEO category also sums to 100 points, but across only 11 items, so the average per-item weight is higher than in SEO.

Item	Weight	Full marks	Partial credit	Zero
llms.txt	12	Both llms.txt and llms-full.txt, llms.txt has ≥5 lines	Both present but thin content (9) / llms.txt only (7)	Neither file
AI crawler permissions	12	All 6 major AI bots allowed	robots.txt missing or 1–2 bots blocked (6)	3+ bots blocked (2)
Structured data JSON-LD	10	3+ `@type` values detected	1–2 types (6) / JSON-LD present but not parseable (3)	No JSON-LD
Citation quality	10	3+ unique authoritative external domains	1–2 authoritative (5) / 3+ external links but no authoritative (3)	Few external links / none authoritative
Clear answers	8	3+ paragraphs of 30–200 chars + a definition pattern (`is defined as`, `means that`, `とは`, etc.)	1–2 paragraphs + definition pattern (5) / 3+ paragraphs only (5) / 1–2 paragraphs only (warning bracket)	No paragraphs or internal score below 3
FAQ / list / definition	8	Composite ≥5 from: FAQ heading (3) + ≥3-item list (2) + `<dl>` (2) + `<details>` (2)	Composite 2–4 (raw value reported as the score)	Composite 0–1
Question-format headings	8	3+ question-format h2/h3	1–2 (4)	0 or no h2/h3
Statistics	8	5+ numeric expressions (%, years, counts, etc.)	2–4 (4)	0–1
Content freshness	8	At least 2 of: `datePublished`, `dateModified`, `<time datetime>`, `last-modified`	Exactly 1 signal (4)	None
E-E-A-T signals	8	Composite ≥4 from: author info (2) + ≥2 external references (2) + date info (1)	Composite 2–3 (reported as composite + 1, capped at 8)	Composite 0–1
Google-recommended schema	8	3+ Google-recommended types	1–2 types (4)	0
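
The two composite rows (FAQ / list / definition and E-E-A-T signals) are the least obvious in the table, so here is a small sketch of how those rules could be computed from already-detected signals. The signal interfaces and function names are hypothetical; only the point values and brackets come from the table.

```typescript
interface FaqSignals {
  faqHeading: boolean; // a heading that looks like an FAQ section
  listWithThreeItems: boolean; // a list with at least 3 items
  definitionList: boolean; // a <dl> element
  detailsElement: boolean; // a <details> element
}

function scoreFaq(s: FaqSignals): number {
  const composite =
    (s.faqHeading ? 3 : 0) +
    (s.listWithThreeItems ? 2 : 0) +
    (s.definitionList ? 2 : 0) +
    (s.detailsElement ? 2 : 0);
  if (composite >= 5) return 8; // full marks
  return composite >= 2 ? composite : 0; // 2–4: raw composite is the score
}

interface EeatSignals {
  authorInfo: boolean;
  externalReferences: number; // count of external references found
  dateInfo: boolean;
}

function scoreEeat(s: EeatSignals): number {
  const composite =
    (s.authorInfo ? 2 : 0) +
    (s.externalReferences >= 2 ? 2 : 0) +
    (s.dateInfo ? 1 : 0);
  if (composite >= 4) return 8; // full marks
  return composite >= 2 ? Math.min(composite + 1, 8) : 0; // 2–3: composite + 1
}
```

With these rules a page that has only an FAQ heading scores 3 on the FAQ item, while adding a three-item list lifts the composite to 5 and the score to 8.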

Why llms.txt and AI crawler permissions are each 12 points

These two share a property: without them, AI exposure is effectively zero. If llms.txt is missing, AI clients can't efficiently understand your site structure. If GPTBot or ClaudeBot is blocked in robots.txt, those crawlers never fetch your pages at all. Neither is a "make it better" item; both are "do this or stay invisible." The weight reflects that the door is shut. The major-bot list as of 2026 is GPTBot / ClaudeBot / PerplexityBot / Google-Extended / CCBot / anthropic-ai.
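
The crawler check can be sketched as a small robots.txt reader. This is a deliberately simplified illustration, not the tool's parser: it assumes a bot counts as "blocked" only when the group that applies to it (its own group if one exists, otherwise the `*` group) contains `Disallow: /`, and it ignores `Allow` overrides and partial paths. `parseRobotsGroups` and `scoreAiCrawlers` are hypothetical names.

```typescript
// Bot list as given in the post.
const AI_BOTS = [
  "GPTBot",
  "ClaudeBot",
  "PerplexityBot",
  "Google-Extended",
  "CCBot",
  "anthropic-ai",
];

// Collects every user-agent token seen, plus those with "Disallow: /".
function parseRobotsGroups(robotsTxt: string): {
  named: Set<string>;
  blocked: Set<string>;
} {
  const named = new Set<string>();
  const blocked = new Set<string>();
  let agents: string[] = [];
  let inRules = false;

  for (const rawLine of robotsTxt.split(/\r?\n/)) {
    const line = rawLine.replace(/#.*$/, "").trim(); // strip comments
    const sep = line.indexOf(":");
    if (sep === -1) continue;
    const field = line.slice(0, sep).trim().toLowerCase();
    const value = line.slice(sep + 1).trim();

    if (field === "user-agent") {
      // A user-agent line after rules starts a new group.
      if (inRules) {
        agents = [];
        inRules = false;
      }
      agents.push(value.toLowerCase());
      named.add(value.toLowerCase());
    } else if (field === "allow" || field === "disallow") {
      inRules = true;
      if (field === "disallow" && value === "/") {
        for (const agent of agents) blocked.add(agent);
      }
    }
  }
  return { named, blocked };
}

function scoreAiCrawlers(robotsTxt: string | null): number {
  if (robotsTxt === null) return 6; // robots.txt missing
  const { named, blocked } = parseRobotsGroups(robotsTxt);
  const blockedCount = AI_BOTS.filter((bot) => {
    const token = bot.toLowerCase();
    // A bot-specific group takes precedence over the wildcard group.
    return named.has(token) ? blocked.has(token) : blocked.has("*");
  }).length;
  if (blockedCount === 0) return 12;
  return blockedCount <= 2 ? 6 : 2;
}
```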

The "citability" cluster

The cluster of items measuring how easy it is for AI to cite a page centers on:

Clear answers: AI prefers concise definitional paragraphs over rambling prose. The implementation combines two signals: the number of paragraphs that are 30–200 characters long and the presence of definition patterns (とは, is defined as, refers to, means that).
Question-format headings: phrasings like "What is X?" / "How do I X?" align with question-driven AI search.
Statistics: concrete numbers are trust signals; the implementation extracts numeric expressions (percentages, years, counts, million/billion) from body text via regex.
E-E-A-T: Experience, Expertise, Authoritativeness, Trustworthiness. The item scores author info (rel="author", [itemprop="author"], etc.), the presence of at least two external references, and date info (<time>, datePublished, dateModified, etc.).
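
To illustrate the Statistics item, the sketch below counts numeric expressions with a single regex and applies the brackets from the GEO table. The pattern is an assumption written for this example (percentages, "million/billion" figures, four-digit years starting with 19 or 20, and comma-grouped counts); the actual regex is not shown in this post. `scoreStatistics` is a hypothetical name.

```typescript
// Assumed pattern, one alternative per kind of numeric expression:
//   42% or 3.5 %  |  2.5 million / 3 billion  |  1999 / 2024  |  1,200 / 3,400,000
const NUMERIC_EXPRESSION =
  /\d+(?:\.\d+)?\s?%|\d+(?:\.\d+)?\s?(?:million|billion)\b|\b(?:19|20)\d{2}\b|\b\d{1,3}(?:,\d{3})+\b/gi;

function scoreStatistics(bodyText: string): number {
  const count = (bodyText.match(NUMERIC_EXPRESSION) ?? []).length;
  if (count >= 5) return 8; // 5+ numeric expressions
  return count >= 2 ? 4 : 0; // 2–4 earn partial credit
}
```

Using one alternation rather than several separate regexes keeps a value such as "2.5 million" from being counted twice.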

These were minor concerns under classic SEO and have moved into the spotlight under GEO.

How citation quality is judged

The citation-quality item parses outbound links and resolves each to a hostname. Government, academic, and Japanese institutional TLDs (*.gov, *.edu, *.ac.jp, *.go.jp, *.or.jp) plus a curated list of authoritative publishers (wikipedia.org, scholar.google.com, pubmed, doi.org, arxiv.org, w3.org, schema.org, developers.google.com, web.dev, mdn.mozilla.org) count as authoritative. 3+ unique authoritative domains earn full marks (10); 1–2 authoritative domains earn 5; if no authoritative domain is present but the page links out to 3+ external domains, you still get 3; otherwise 0. The weight is on quality of sources, not the raw link count.
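
The classification described above can be sketched as a hostname check followed by the three brackets. Several things here are assumptions: uniqueness is counted per hostname (so two Wikipedia subdomains would count twice), links to the page's own host are skipped, and "pubmed" is left out of the list because the post does not give its full hostname. `isAuthoritative`, `hostnameOf`, and `scoreCitations` are hypothetical names.

```typescript
const AUTHORITATIVE_SUFFIXES = [".gov", ".edu", ".ac.jp", ".go.jp", ".or.jp"];
const AUTHORITATIVE_DOMAINS = [
  "wikipedia.org",
  "scholar.google.com",
  "doi.org",
  "arxiv.org",
  "w3.org",
  "schema.org",
  "developers.google.com",
  "web.dev",
  "mdn.mozilla.org",
];

function isAuthoritative(hostname: string): boolean {
  const host = hostname.toLowerCase();
  return (
    AUTHORITATIVE_SUFFIXES.some((suffix) => host.endsWith(suffix)) ||
    AUTHORITATIVE_DOMAINS.some((d) => host === d || host.endsWith("." + d))
  );
}

function hostnameOf(link: string): string | null {
  try {
    return new URL(link).hostname.toLowerCase();
  } catch {
    return null; // relative or malformed URL
  }
}

function scoreCitations(links: string[], pageHost: string): number {
  const external = new Set<string>();
  const authoritative = new Set<string>();
  for (const link of links) {
    const host = hostnameOf(link);
    if (host === null || host === "" || host === pageHost.toLowerCase()) continue;
    external.add(host);
    if (isAuthoritative(host)) authoritative.add(host);
  }
  if (authoritative.size >= 3) return 10;
  if (authoritative.size >= 1) return 5;
  return external.size >= 3 ? 3 : 0;
}
```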

How content freshness is judged

Content freshness is worth 8 points. Full marks require at least two distinct freshness signals from datePublished, dateModified, <time datetime>, and last-modified meta. One signal earns partial credit (4); zero earns 0. Putting both datePublished and dateModified into JSON-LD is the most robust way to clear this, but combinations like <time> + datePublished also reach full marks.
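
As a sketch, the freshness rule is a count of distinct signals. The regexes below are rough stand-ins that scan raw HTML text and are assumptions for illustration; a real detector would more likely parse the JSON-LD and the DOM. `countFreshnessSignals` and `scoreFreshness` are hypothetical names.

```typescript
// One assumed pattern per signal named in the post.
const FRESHNESS_PATTERNS: RegExp[] = [
  /"datePublished"\s*:/, // JSON-LD datePublished
  /"dateModified"\s*:/, // JSON-LD dateModified
  /<time\b[^>]*\bdatetime\s*=/i, // <time datetime="...">
  /<meta\b[^>]*\b(?:name|http-equiv)\s*=\s*["']last-modified["']/i, // last-modified meta
];

function countFreshnessSignals(html: string): number {
  return FRESHNESS_PATTERNS.filter((pattern) => pattern.test(html)).length;
}

function scoreFreshness(html: string): number {
  const signals = countFreshnessSignals(html);
  if (signals >= 2) return 8; // two or more distinct signals
  return signals === 1 ? 4 : 0;
}
```

Because each pattern is tested at most once, repeating the same signal (say, ten `<time>` elements) still counts as a single signal.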

Three principles behind every weight

The whole weight scheme follows three principles:

Items with high misconfiguration risk get heavy weights — noindex, canonical, robots.txt
Items that are cheap to implement and high-impact get heavy weights — title, meta description
Items that gate the entire category if missing get the maximum weight — llms.txt, AI crawler permissions

Conversely, leaf-level details (e.g., the exact wording of an alt attribute) are weighted lightly. For most sites, the wins come from "no fatal misconfigurations" plus "the basics are covered" — not micro-tuning.

Official references the rules build on

Where possible, every threshold maps to public documentation:

General SEO: Google Search Central — SEO Starter Guide
Title and snippet length: Title Link Best Practices
Core Web Vitals: web.dev — Core Web Vitals
Structured data: Schema.org and Google Search Gallery
robots.txt: Robots Exclusion Protocol (RFC 9309)
OGP: Open Graph protocol
llms.txt: llmstxt.org — an emerging standard

Internal AI model behavior isn't public, so some GEO items reflect "best practices observed today" rather than spec text. When specs or behavior change, weights and thresholds change with them.

Reading the score

The recommended order when reviewing a result is:

Errors first — red items are critical, fix these before anything else
Warnings next — yellow items are early signs of impact
Compare SEO vs GEO — see which side is weaker
Concentrate on heavy weights — for limited time, focus on items worth more

Trying to hit 200/200 isn't realistic. "No errors plus full marks on the top 5 weights" is a more practical bar. As a worked example, my own verification report on index-ready.jp landed at 185/200 — the gap is concentrated in PageSpeed/LCP, which is exactly where I'm going to spend the next iteration.

FAQ

Will improving this score actually move my rankings?

The score measures whether your site has the technical foundation in place. It doesn't promise rankings, but a site without the foundation can't compete for them either. Conversely, a high score is not a ranking guarantee — content quality, link signals, and topical authority all matter and are out of scope for this tool.

Why is GEO important enough to be its own category?

Citation traffic from chat-style AI search and AI Overviews keeps growing. The signals that make a page citable (llms.txt, clear answers, structured data) overlap only partially with classic SEO. Folding them into a single SEO score lets them get drowned out; making GEO a peer category keeps them visible.

Will the scoring items change?

Yes. The list will be tuned as AI search behavior and Google updates evolve. The current 26 items reflect early-2026 reality. When a spec or rule changes, this post and the Scoring page get updated together.

Are weights going to be rebalanced?

No major rebalancing is planned, because changing weights frequently makes historical scores incomparable. When a new item is added, neighboring weights may shift slightly to keep the category sum at 100; those shifts are documented in this post.

Try it on your own URL

IndexReady takes a single URL and returns the full 26-item evaluation in seconds. It's free and there's no signup.

Each item's detection rules and improvement guidance are also published on the Scoring criteria page. Read this post and that page together for the complete picture.

IndexReady's Scoring Logic Explained: How 26 Items Are Graded with Real Thresholds