Search engines index web pages by evaluating semantic concepts and search terms. To rank for a specific keyword, your copy must contain that keyword and its related variations naturally, without over-optimizing or triggering spam filters.
If you write content without reviewing keyword density, you risk either under-optimizing (failing to signal what the page is about) or over-optimizing (triggering keyword stuffing penalties).
To help you audit copy in real time, FluxToolkit provides a free, client-side Keyword Extractor.
Keyword Extractor
Pull top keywords and 1–3 word phrases from any text. Get frequency counts, density percentages, and CSV export — all client-side, no account needed.
The Mathematics of Keyword Density and N-Grams
Keyword density is how often a term or phrase appears in a block of text, expressed as a percentage of the total word count.
Formula for Keyword Density
$$\text{Density (\%)} = \left( \frac{\text{Term Count}}{\text{Total Words}} \right) \times 100$$
Example: If a 1,000-word blog post contains the phrase "SQL formatter" exactly 15 times, the keyword density for that term is:
$$\text{Density} = \left( \frac{15}{1000} \right) \times 100 = 1.5\%$$
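The formula translates directly into code. The sketch below is a minimal illustration, not the tool's actual implementation: the function name `keywordDensity` is hypothetical, and it assumes the term count and total word count have already been computed.

```javascript
// Hypothetical helper: applies the density formula above.
// Assumes termCount and totalWords were counted elsewhere.
function keywordDensity(termCount, totalWords) {
  if (totalWords <= 0) return 0; // avoid dividing by zero on empty input
  return (termCount / totalWords) * 100;
}

// The worked example: 15 occurrences in a 1,000-word post.
console.log(keywordDensity(15, 1000).toFixed(1) + '%'); // prints "1.5%"
```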
Understanding N-Grams
An n-gram is a contiguous sequence of $n$ items from a given sample of text. The keyword extractor parses text into three types of n-grams:
- 1-Gram (Unigram): Single words (e.g., "developer", "database").
- 2-Gram (Bigram): Two-word phrases (e.g., "SQL formatter", "source code").
- 3-Gram (Trigram): Three-word phrases (e.g., "free online tool", "open graph preview").
Analyzing bigrams and trigrams matters because multi-word phrases express intent that single words cannot: "formatter" alone is ambiguous, while "SQL formatter" names a specific kind of tool.
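Generating n-grams amounts to sliding a window of size n across the list of tokens. A minimal sketch follows; `buildNGrams` is a hypothetical name, and whether the extractor builds phrases before or after stop-word filtering is an assumption left open here.

```javascript
// Hypothetical helper: builds every contiguous n-word phrase from a token list.
function buildNGrams(tokens, n) {
  const grams = [];
  // Slide a window of size n across the tokens.
  for (let i = 0; i + n <= tokens.length; i++) {
    grams.push(tokens.slice(i, i + n).join(' '));
  }
  return grams;
}

const tokens = ['free', 'online', 'sql', 'formatter'];
console.log(buildNGrams(tokens, 1)); // [ 'free', 'online', 'sql', 'formatter' ]
console.log(buildNGrams(tokens, 2)); // [ 'free online', 'online sql', 'sql formatter' ]
console.log(buildNGrams(tokens, 3)); // [ 'free online sql', 'online sql formatter' ]
```

A text of k tokens yields k - n + 1 n-grams, so longer phrases are always rarer than the words they contain.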
Tokenization and Stop-Word Filtering
To extract meaningful keywords, the text engine performs two key prep steps:
- Tokenization: Splits the text input into individual word tokens, strips punctuation, and converts all letters to lowercase.
- Stop-Word Filtering: Filters out high-frequency grammatical connector words (like "the", "is", "at", "which", "and", "on") that do not carry semantic weight. Left unfiltered, stop-words would dominate the extraction table and hide your actual topics.
Here is a simple example of stop-word filtering in JavaScript:
```javascript
const stopWords = new Set(['the', 'is', 'and', 'a', 'to', 'in', 'of', 'for', 'on', 'with', 'at']);

function cleanText(rawText) {
  return rawText
    .toLowerCase()
    .replace(/[^\w\s]/g, '') // Strip punctuation
    .split(/\s+/) // Tokenize
    .filter(word => word.length > 2 && !stopWords.has(word)); // Drop short words and stop-words
}
```
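One limitation worth knowing: in JavaScript, `\w` matches only ASCII letters, digits, and the underscore, so the pattern above also deletes accented letters ("café" becomes "caf"). If your copy includes non-English words, a Unicode-aware variant along these lines may work better. This is a sketch rather than the tool's implementation; `cleanTextUnicode` and the abbreviated stop-word list are illustrative names.

```javascript
// Abbreviated stop-word list, for illustration only.
const basicStopWords = new Set(['the', 'is', 'and', 'a', 'to', 'in', 'of', 'for', 'on', 'with', 'at']);

// Hypothetical variant of cleanText that keeps letters and digits from any script.
function cleanTextUnicode(rawText) {
  return rawText
    .toLowerCase()
    // Strip anything that is not a letter, combining mark, digit, or whitespace.
    .replace(/[^\p{L}\p{M}\p{N}\s]/gu, '')
    .split(/\s+/) // Tokenize on runs of whitespace
    .filter(word => word.length > 2 && !basicStopWords.has(word));
}

console.log(cleanTextUnicode('The café menu is in São Paulo!'));
// [ 'café', 'menu', 'são', 'paulo' ]
```

The `u` flag is required for `\p{...}` property escapes; without it the pattern does not match Unicode categories.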
SEO Best Practices: Keyword Density Benchmarks
There is no "magic percentage" for keyword density that guarantees rankings, but following these industry guidelines will keep your copy natural and compliant:
- Primary Keywords: Keep density between 1.0% and 2.5% for your main search queries.
- Secondary / Related Keywords (often labeled "LSI"): Target 0.5% to 1.5% for related terms and synonyms.
- Keyword Stuffing Boundary: Treat densities above 3.0% as a warning sign. Search engines do not publish a fixed threshold, but copy this repetitive tends to read as written for robots rather than for humans, which can hurt your rankings.
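These bands are easy to encode as a quick check for a primary keyword. The sketch below uses the thresholds from the list above; the function name and the labels it returns are hypothetical, and the bands remain rules of thumb rather than limits published by any search engine.

```javascript
// Hypothetical helper: maps a primary keyword's density (in percent)
// to the guideline bands listed above.
function classifyPrimaryDensity(densityPercent) {
  if (densityPercent > 3.0) return 'possible stuffing';
  if (densityPercent > 2.5) return 'high';
  if (densityPercent >= 1.0) return 'within guideline';
  return 'low';
}

console.log(classifyPrimaryDensity(1.5)); // "within guideline"
console.log(classifyPrimaryDensity(3.4)); // "possible stuffing"
```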
Step-by-Step: How to Analyze Your Copy
Follow these instructions to audit your text:
Step 1: Input Your Text Copy
Paste your article draft or webpage text into the textarea workspace.
Step 2: Configure Extraction Parameters
Adjust settings in the sidebar:
- Stop-Word Filter: Toggle to filter out common articles and prepositions.
- Min Word Length: Set to ignore very short words (default is 3 letters).
- Phrase Type: Switch between 1-Gram, 2-Gram, and 3-Gram tabs to view different phrase lengths.
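To make the effect of these settings concrete, here is a sketch of how the three options could drive an extraction pass. The option names (`filterStopWords`, `minWordLength`, `phraseLength`), the function name, and the abbreviated stop-word list are all assumptions made for illustration; the tool's internal names are not documented here.

```javascript
// Default settings mirroring the sidebar options described above.
// The option names are hypothetical, chosen for this sketch.
const defaultOptions = {
  filterStopWords: true, // Stop-Word Filter toggle
  minWordLength: 3,      // Min Word Length
  phraseLength: 1,       // Phrase Type: 1, 2, or 3 words
};

// Abbreviated stop-word list, for illustration only.
const sampleStopWords = new Set(['the', 'is', 'and', 'a', 'to', 'in', 'of', 'for', 'on', 'with', 'at']);

function extractPhrases(text, userOptions = {}) {
  const options = { ...defaultOptions, ...userOptions };

  const words = text
    .toLowerCase()
    .replace(/[^\w\s]/g, '') // Strip punctuation
    .split(/\s+/)            // Tokenize
    .filter(word => word.length >= options.minWordLength)
    .filter(word => !options.filterStopWords || !sampleStopWords.has(word));

  // Join each run of phraseLength consecutive words into one phrase.
  const phrases = [];
  for (let i = 0; i + options.phraseLength <= words.length; i++) {
    phrases.push(words.slice(i, i + options.phraseLength).join(' '));
  }
  return phrases;
}

console.log(extractPhrases('The SQL formatter is a free online tool.', { phraseLength: 2 }));
// [ 'sql formatter', 'formatter free', 'free online', 'online tool' ]
```

Note the phrase "formatter free" in the output: because this sketch filters words before building phrases, words that were separated by a stop-word end up adjacent. Building phrases first and filtering afterwards would avoid that, at the cost of more phrases containing stop-words.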
Step 3: Analyze the Density Table
Review the output grid. The tool displays terms sorted by frequency and density. Check if your target search queries are near the top and ensure no single term has a density over 3.0%.
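A table like this can be produced by counting each phrase, computing its density, and sorting by count. The sketch below is one way to do it, not the tool's actual code: `buildDensityTable` and the row fields are hypothetical, and the total word count is passed in explicitly because it is an assumption whether words are counted before or after filtering.

```javascript
// Hypothetical helper: turns a list of extracted phrases into table rows,
// flagging any phrase whose density exceeds maxDensity (in percent).
function buildDensityTable(phrases, totalWords, maxDensity = 3.0) {
  const counts = new Map();
  for (const phrase of phrases) {
    counts.set(phrase, (counts.get(phrase) || 0) + 1);
  }

  return [...counts.entries()]
    .map(([phrase, count]) => {
      // Same formula as count / totalWords * 100; multiplying first limits rounding noise.
      const density = totalWords > 0 ? (count * 100) / totalWords : 0;
      return { phrase, count, density, overLimit: density > maxDensity };
    })
    .sort((a, b) => b.count - a.count); // Most frequent first
}

const rows = buildDensityTable(['sql', 'formatter', 'sql', 'tool', 'sql'], 50);
console.log(rows[0]); // { phrase: 'sql', count: 3, density: 6, overLimit: true }
```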
Step 4: Export Your Keyword Data
Click the Export CSV button to download your n-gram analysis as a CSV file that you can open in any spreadsheet application for content audits.
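For readers curious how a client-side CSV export can work, here is a hedged sketch. The column names, `toCsv`, and `downloadCsv` are assumptions for illustration; the actual file layout produced by the tool may differ. `downloadCsv` relies on the browser's `Blob` and `URL.createObjectURL` APIs, so it only runs in a browser.

```javascript
// Hypothetical helper: serializes table rows to CSV text.
// Fields containing commas, quotes, or line breaks are quoted, with inner quotes doubled.
function toCsv(rows) {
  const escapeField = (value) => {
    const text = String(value);
    return /[",\r\n]/.test(text) ? '"' + text.replace(/"/g, '""') + '"' : text;
  };

  const header = ['phrase', 'count', 'density'];
  const lines = rows.map(row =>
    [row.phrase, row.count, row.density.toFixed(2)].map(escapeField).join(',')
  );
  return [header.join(','), ...lines].join('\r\n');
}

// Browser-only: offers the CSV as a download without contacting a server.
function downloadCsv(csvText, filename = 'keywords.csv') {
  const blob = new Blob([csvText], { type: 'text/csv;charset=utf-8' });
  const url = URL.createObjectURL(blob);
  const link = document.createElement('a');
  link.href = url;
  link.download = filename;
  link.click();
  URL.revokeObjectURL(url);
}

console.log(toCsv([{ phrase: 'sql formatter', count: 15, density: 1.5 }]));
// phrase,count,density
// sql formatter,15,1.50
```

Because the file is assembled from an in-memory `Blob`, the text never has to leave the page to be exported.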
Frequently Asked Questions
What are semantically related keywords (formerly LSI)?
While older SEO guides often refer to "LSI (Latent Semantic Indexing) keywords", modern search engines like Google use neural language models (such as BERT) rather than LSI. However, the core concept remains: search engines expect to find conceptually related terms and synonyms alongside your primary keyword. For example, if your main keyword is "car", related terms might include "engine", "tires", and "mileage". Our extractor helps you check whether these secondary terms are adequately represented.
Will keyword density help me rank if my content is low quality?
No. Google's helpful content systems evaluate user engagement, depth of coverage, and authority. Having perfect keyword density on a thin or poorly written article will not help it rank. Density optimization should only be used to refine high-quality drafts.
How does Google handle stop-words in search queries?
Google's language models (such as BERT and MUM) use prepositions and connectors to determine search intent (e.g., "flights from Paris to London" vs. "flights from London to Paris"). For a density audit, however, stop-words are filtered out because they say little about a page's topic.
Does the extractor count keywords inside image alt text?
No. The tool analyzes only the text you paste into the textarea. Alt text and meta tags do not appear in a page's rendered text, so to audit them, paste their values into the textarea directly. For visible content, including code blocks, copy the rendered text of the page rather than the raw HTML source code.
Does the tool send my drafts to an external database?
No. Our suite is fully private. All text parsing, tokenization, n-gram calculations, and CSV formatting happen inside your browser. No copy is transmitted to a server.
Related Articles
- SERP Preview Tool Guide: Preview how your primary keywords look in search results.
- Schema Markup Generator Guide: Support semantic search with structured data.
- Readability Score Guide: Check if your keyword-optimized text is readable for your target audience.