What It Is
Entity salience is a 0.0-1.0 score that measures how prominent and important a specific named entity is within a piece of content. A salience of 0.8 means the entity is central to the page's topic; a salience of 0.1 means it's a passing mention. Google uses entity salience in its Natural Language API and ranking systems to understand what a page is truly about — not just what keywords it contains.
Why It Matters for Your SEO
Search engines have evolved from keyword matching to entity understanding. Google builds knowledge graphs connecting entities (people, places, products, organisations) and evaluates whether your content demonstrates genuine topical coverage. Pages that mention key entities with high salience — in titles, headings, and throughout the body — signal deep relevance. Pages where important entities only appear once in a footnote signal shallow coverage.
How Korvex Measures It
Each entity receives a salience score from 5 weighted components:
| Component | Weight | What It Measures |
|---|---|---|
| Frequency | 30% | How often the entity appears (10+ mentions approaches maximum) |
| Position | 25% | Where the entity first appears (earlier = higher salience) |
| Keyword Proximity | 20% | Distance between the entity and target keywords |
| Title/Heading Presence | 10% | Whether the entity appears in the page title (+7%) or headings (+3%) |
| Contextual Importance | 15% | Semantic similarity between the entity and the overall page topic |
Salience Thresholds
| Range | Meaning |
|---|---|
| 0.5-1.0 | Primary entity — the page is fundamentally about this entity |
| 0.3-0.5 | Supporting entity — important to the topic, mentioned substantially |
| 0.15-0.3 | Referenced entity — relevant but not central |
| 0.0-0.15 | Passing mention — barely relevant to the page's core topic |
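The thresholds above can be expressed as a small lookup. This is a minimal sketch, not Korvex's implementation: the function name `salience_tier` and the short tier labels are hypothetical, and because the table's ranges share their boundary values, the sketch assumes a score on a boundary belongs to the higher tier.

```python
def salience_tier(score: float) -> str:
    """Map a 0.0-1.0 salience score to a tier from the thresholds table."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"salience must be in [0.0, 1.0], got {score}")
    # Assumption: boundary values (0.5, 0.3, 0.15) fall into the higher tier.
    if score >= 0.5:
        return "primary"
    if score >= 0.3:
        return "supporting"
    if score >= 0.15:
        return "referenced"
    return "passing"
```

For example, `salience_tier(0.8)` returns `"primary"` and `salience_tier(0.1)` returns `"passing"`, matching the two cases described in the introduction.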
How to Improve Your Score
- Mention key entities early — the position signal rewards entities that appear in the first 20% of content
- Use entities in headings — placing an entity in an H2 or H3 adds up to 3% to its salience
- Include entities in the title — worth up to 7% of the salience score
- Reference entities consistently — 5-10 natural mentions across different sections scores well
- Co-locate entities with keywords — keep target entities near your tracked keywords for the proximity signal
Salience Formula
salience = (frequency × 0.30) + (position × 0.25) + (proximity × 0.20) + (title_heading × 0.10) + (contextual × 0.15)
Where each weighted term (the component score multiplied by its weight) is computed as:
- Frequency: `min(occurrence_count / 10.0, 1.0) × 0.30`, so 10+ occurrences reaches the cap
- Position: `(1.0 - (first_position / text_length)) × 0.25`, so appearing at position 0 (the start) scores the maximum
- Keyword Proximity: `(1.0 / (1.0 + min_distance / 1000.0)) × 0.20`, so an entity closer to the keywords scores higher
- Title/Heading: in title = +0.07, in heading = +0.03, capped at 0.10
- Contextual Importance: `cosine_similarity(entity_embedding, mean_context_embedding) × 0.15`
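The five terms can be combined as in the sketch below. It is an illustration under stated assumptions rather than the production scorer: the function signature is hypothetical, the frequency term is read as capping at 10 occurrences (as the components table states), positions and distances are assumed to be character offsets, and a negative cosine similarity is assumed to contribute nothing.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def salience(
    occurrence_count: int,
    first_position: int,
    text_length: int,
    min_distance: float,
    in_title: bool,
    in_heading: bool,
    entity_embedding: Sequence[float],
    mean_context_embedding: Sequence[float],
) -> float:
    """Sum the five weighted terms into a 0.0-1.0 salience score."""
    if text_length <= 0:
        raise ValueError("text_length must be positive")
    # Frequency: reaches its 0.30 cap at 10 occurrences.
    frequency = min(occurrence_count / 10.0, 1.0) * 0.30
    # Position: a first mention at offset 0 earns the full 0.25.
    position = (1.0 - first_position / text_length) * 0.25
    # Proximity: distance 0 earns the full 0.20 and decays with distance.
    proximity = (1.0 / (1.0 + min_distance / 1000.0)) * 0.20
    # Title/heading: +0.07 and +0.03, capped at 0.10.
    title_heading = min((0.07 if in_title else 0.0) + (0.03 if in_heading else 0.0), 0.10)
    # Assumption: negative similarity is clamped to 0 so the score stays in range.
    similarity = cosine_similarity(entity_embedding, mean_context_embedding)
    contextual = max(similarity, 0.0) * 0.15
    return frequency + position + proximity + title_heading + contextual
```

As a worked case, an entity mentioned 5 times, first appearing halfway through a 1,000-character page, 1,000 characters from the nearest keyword, present in a heading only, and with zero contextual similarity scores 0.15 + 0.125 + 0.10 + 0.03 + 0 = 0.405, which places it in the supporting range.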
Entity Extraction Pipeline
- NER: spaCy `en_core_web_sm` identifies named entities and their types
- Type mapping: spaCy types → standardised types (e.g., `ORG` → `ORGANIZATION`, `GPE`/`LOC`/`FAC` → `LOCATION`)
- Filtering: entities must be > 2 characters with at least one alphabetic character
- Quality filter: 3-layer filter removes technical artifacts, template text, and low-quality entities
- Salience scoring: 5-component weighted formula applied to each surviving entity
- Corpus storage: entities stored in `corpus_entity_occurrences` with page reference and salience score
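The basic filtering step in the pipeline above can be sketched as a single predicate. The name `passes_basic_filter` is hypothetical, and trimming surrounding whitespace before measuring length is an assumption. The 3-layer quality filter is not specified here, so it is left out.

```python
def passes_basic_filter(entity_text: str) -> bool:
    """Keep entities longer than 2 characters that contain a letter."""
    # Assumption: surrounding whitespace does not count toward the length.
    text = entity_text.strip()
    return len(text) > 2 and any(ch.isalpha() for ch in text)
```

Under this rule `"Google"` and `"IBM"` pass, while `"AI"` (too short) and `"2024"` (no alphabetic character) are dropped.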
Entity Types
| Standardised Type | Source Types | Example |
|---|---|---|
| PERSON | PERSON | "Koray Tugberk" |
| ORGANIZATION | ORG | "Google", "Korvex" |
| LOCATION | GPE, LOC, FAC | "London", "Canary Wharf" |
| CONSUMER_GOOD | PRODUCT | "iPhone 15" |
| EVENT | EVENT | "Google I/O" |
| NUMBER | PERCENT, QUANTITY, CARDINAL | "45%", "3.5 million" |
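The table above amounts to a lookup from spaCy labels to standardised types. The sketch below uses only the mappings listed in the table. The names `SPACY_TO_STANDARD` and `standardise_type` are hypothetical, and returning `None` for an unlisted label is an assumption, since the handling of other spaCy labels is not described here.

```python
from typing import Optional

# Mappings taken directly from the Entity Types table.
SPACY_TO_STANDARD = {
    "PERSON": "PERSON",
    "ORG": "ORGANIZATION",
    "GPE": "LOCATION",
    "LOC": "LOCATION",
    "FAC": "LOCATION",
    "PRODUCT": "CONSUMER_GOOD",
    "EVENT": "EVENT",
    "PERCENT": "NUMBER",
    "QUANTITY": "NUMBER",
    "CARDINAL": "NUMBER",
}


def standardise_type(spacy_label: str) -> Optional[str]:
    """Return the standardised type, or None for labels the table omits."""
    # Assumption: unlisted labels are signalled with None rather than an error.
    return SPACY_TO_STANDARD.get(spacy_label)
```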
Data Sources
- Extraction: Phase 5 page scoring pipeline
- Embeddings: SentenceTransformer `all-MiniLM-L6-v2` (384 dimensions)
- Storage: `corpus_entity_occurrences` table (entity × page × salience)
- Knowledge graph: Linkable types (PERSON, ORGANIZATION, LOCATION, EVENT, WORK_OF_ART) stored in Neo4j
Related Concepts
- The Koray Score — entity coverage feeds the Central Entity fundamental
- Information Gain — unique entities drive the information gain score
- Semantic Networks — entities form nodes in the internal linking graph