Generative AI and SEO: How to optimize content for Google search and LLMs with cosine similarity (in other words, LLMO)

Key learnings

  • Natural language processing (NLP) is the foundation that allows search engines and LLMs to understand and interpret your content.
  • Cosine similarity measures how semantically aligned your content is with user queries, helping you optimize for intent rather than just keywords.
  • Retrieval-augmented generation (RAG) enhances LLMs by retrieving external knowledge, making your content more accurate and up-to-date.
  • Combining NLP, cosine similarity, and RAG helps you create content that ranks well in Google SERPs and performs well in LLMs.
  • Eliminating content overlap and filling content gaps ensures your content is comprehensive and avoids cannibalizing its own rankings.
  • Semantic alignment and external knowledge integration are more important than ever.

Keyword stuffing is dead. Let’s face it—Google’s AI-powered algorithms, like BERT and MUM, have moved far beyond matching exact keywords. Imagine you’re selling gaming chairs. Ranking for “best gaming chair” alone won’t cut it. Why? Because users aren’t just typing keywords—they’re searching for solutions. They care about ergonomics, durability, and affordability. Your job is to craft content that answers these needs.

And here’s the secret sauce: cosine similarity. This powerful tool measures how semantically aligned your content is with a user’s query. But its uses go beyond just aligning content with queries. You can also use it to:

  • Identify content gaps: Find topics or angles missing from your content library.
  • Detect content overlap or cannibalization: Ensure your posts aren’t competing for the same keywords.
  • Cluster similar content: Group related articles for internal linking or topic hubs.
  • Benchmark against competitors: Compare your content to competitors’ to find opportunities for improvement.

But there’s more. While LLMs are great at generating text, they sometimes lack access to the most up-to-date or domain-specific information. This is where retrieval-augmented generation (RAG) comes in. RAG allows LLMs to pull in external knowledge, making their responses—and your content—more accurate, comprehensive, and authoritative.

Google AI Overviews is here to stay, meaning semantic alignment and external knowledge integration are more critical than ever. Google AI Overviews uses AI models like BERT and MUM to understand the intent behind a query and generate concise, relevant summaries. By using cosine similarity to measure how well your content matches the query’s intent—and enriching it with RAG—you can increase your chances of being featured in these AI-powered search experiences.

By understanding and applying these tools, you can create content that not only ranks but truly resonates with your audience—and even earns a spot in Google’s AI-driven search experience.

Let’s dive into the emerging world of “large language model optimization” (LLMO).

How it all fits together

At the core of modern SEO and AI-driven content optimization are three key technologies: natural language processing (NLP), cosine similarity, and retrieval-augmented generation (RAG). Together, they form a powerful framework for creating content that ranks well in Google SERPs and performs well in LLMs.

NLP is the foundation. It’s what allows search engines and LLMs to process and understand human language. For example, when someone searches for “best ergonomic gaming chair,” NLP helps Google’s algorithms understand that the user is looking for a chair that’s comfortable, supportive, and designed for long hours of use—not just any gaming chair. Similarly, LLMs use NLP to generate responses that feel human-like and contextually relevant.

But understanding language is only the first step. To create content that truly resonates, you need to measure how well it aligns with what users are searching for. This is where cosine similarity comes in. Cosine similarity is a powerful tool for measuring the semantic alignment between your content and user queries. It’s not just about matching keywords—it’s about ensuring your content answers the intent behind the query. This is critical for ranking well in Google SERPs and being featured in AI Overviews.

While LLMs are great at generating text, they sometimes lack access to the most up-to-date or domain-specific information. This is where retrieval-augmented generation (RAG) steps in. RAG allows LLMs to retrieve and incorporate external knowledge, making their responses more accurate and comprehensive. For content creators, this means your content can be enriched with the latest data, making it more authoritative and useful.

By combining these tools and techniques, you can create content that not only ranks well in Google but also performs well in LLMs. This means writing content that answers user intent, editing and consolidating content to eliminate overlap and redundancy, and enriching content with external data to make it more comprehensive and up-to-date.


The evolution of search: From keywords to context

Google’s ranking algorithms have come a long way. Gone are the days when stuffing “cheap gaming chair” across your site would boost your rankings. Today, Google’s AI models, like BERT and MUM, are designed to understand what users really want—not just what they type.

What is BERT, and how does it work?

BERT (Bidirectional Encoder Representations from Transformers) is an AI model that reads entire sentences to understand word relationships and intent. Think of it as a super-smart reader that doesn’t just look at individual words but understands how they fit together in a sentence.

BERT processes text bidirectionally, meaning it considers the context of words both before and after them in a sentence. This allows it to grasp the nuance and intent behind a query, making it a powerful tool for semantic understanding.

For example, if someone searches for “best ergonomic chair for long hours,” BERT helps Google recognize that the user wants a chair that offers comfort during extended use—not just any chair. It’s not just about the keywords “ergonomic” or “chair”; it’s about the context and intent behind the query.

How Google AI Overviews uses BERT

  • Query interpretation: When a user enters a query (e.g., “best ergonomic gaming chairs”), BERT analyzes the entire query to understand its meaning. For instance, it recognizes that the user is looking for gaming chairs that prioritize comfort and ergonomics, not just any gaming chair.
  • Contextual matching: BERT helps AI Overviews (AIO) match the query to relevant content by understanding the semantic relationships between words. For example, it can identify that “ergonomic” is related to “comfort” and “lumbar support,” ensuring the summaries include these concepts.

What about MUM?

MUM (Multitask Unified Model) is the next step in Google’s AI evolution. It’s even more advanced than BERT and can handle multimodal inputs like text, images, and videos. Think of it as a multitasking powerhouse that not only understands complex queries but also provides holistic, cross-domain answers.

MUM is trained across 75 languages and can perform multitask learning, meaning it can handle multiple tasks (e.g., summarization, translation, and comparison) simultaneously. This makes it ideal for generating comprehensive responses to complex queries.

For example, if someone searches for “compare gaming chairs for posture and durability,” MUM can aggregate reviews, product comparisons, and expert recommendations into a single, holistic response. It’s not just about finding relevant information—it’s about synthesizing it into a concise, user-friendly summary.

How Google AIO uses MUM

  • Summarization: MUM excels at distilling complex information into concise, easy-to-understand summaries. For example, it can highlight key features like “adjustable lumbar support” and “premium materials for long-lasting use” in response to a query about gaming chairs.
  • Multimodal understanding: MUM can process not just text but also images, videos, and other data types. For instance, if a query involves comparing gaming chairs, MUM can analyze product images, reviews, and specifications to generate a richer summary.
  • Cross-language and cross-domain insights: MUM’s multilingual and cross-domain capabilities allow it to draw insights from diverse sources, ensuring the summaries are comprehensive and globally relevant.

The building blocks of modern SEO: LLMs, NLP, and transformers

To understand cosine similarity, we need to start with the basics: Large Language Models (LLMs), Natural Language Processing (NLP), and transformers. These are the technologies powering Google’s AI-driven search.

What are LLMs?

LLMs are AI models trained on vast amounts of text data. They can understand and generate human-like language, making them perfect for interpreting search queries. Examples include GPT (used in ChatGPT) and BERT.

What is NLP?

NLP (Natural Language Processing) is the field of AI that focuses on enabling machines to understand and interpret human language. It’s the magic behind search engines, chatbots, and voice assistants.

What are transformers?

Transformers are a type of neural network architecture that powers LLMs. Unlike older models that analyzed text word by word, transformers look at entire sentences at once. This allows them to understand context and relationships between words, even in complex or ambiguous queries.

So, what does this mean for your content strategy? It’s all about semantic alignment. Instead of repeating keywords like “best gaming chair” or “cheap gaming chair,” focus on creating content that answers specific user questions. For example, “What makes a gaming chair ergonomic?” or “How to choose a chair for comfort during marathon gaming sessions.”

Cosine similarity helps you measure how well your content aligns with these intent-driven queries.


Tokens and tokenization: The language of AI

To dive deeper into how these models work, we need to talk about tokens and tokenization. Tokens are the building blocks of NLP—they’re the units of text that AI models analyze. Depending on the model, a token might represent a word, a subword, or even a single character.

For example, the sentence “Best gaming chair” could be tokenized into [“Best,” “gaming,” “chair”]. More complex words, like “unbelievable,” might be broken into subwords like [“un-,” “believ-,” “-able”]. Tokenization is the process of splitting text into these smaller units, bridging the gap between raw text and machine-readable data.
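If you want to see tokenization in action, here’s a quick sketch using Hugging Face’s transformers library and BERT’s WordPiece tokenizer. The exact subword splits depend on the model’s vocabulary, so treat the outputs as illustrative:

```python
# A quick sketch of tokenization using Hugging Face's BERT tokenizer
# (pip install transformers). Exact subword splits depend on the
# model's vocabulary, so treat the outputs below as illustrative.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

print(tokenizer.tokenize("Best gaming chair"))
# e.g. ['best', 'gaming', 'chair']

print(tokenizer.tokenize("unbelievable"))
# e.g. ['un', '##believable'] -- '##' marks a subword continuation
```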

Why does this matter?

Because tokenization allows NLP models to handle complex queries, maintain context, and match content to user intent. For instance, a query like “affordable gaming chair with lumbar support” can be broken down into tokens, enabling the model to understand and respond to the user’s specific needs.


What is cosine similarity, and how does it work?

Now, let’s get to the heart of the matter: cosine similarity. At its core, cosine similarity measures how semantically similar two pieces of text are. It does this by representing text as vectors in a multidimensional space and calculating the cosine of the angle between them.
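In formula terms, for two text vectors A and B:

cosine similarity = (A · B) / (‖A‖ × ‖B‖)

A score of 1 means the vectors point in the same direction (near-identical meaning), 0 means they share no meaningful overlap, and values in between indicate partial semantic alignment.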

Here’s how it works in practice:

  1. Preprocess your text: Normalize it (convert to lowercase), remove stop words (like “the” or “and”), tokenize it, and lemmatize it (reduce words to their root form). See the preprocessing sketch just after this list.
  2. Convert text to vectors: Use methods like TF-IDF (Term Frequency-Inverse Document Frequency) or word embeddings (e.g., Word2Vec, GloVe, BERT).
  3. Calculate similarity: Use Python libraries like Scikit-learn to compute the cosine similarity score.
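Here’s a minimal preprocessing sketch using NLTK. It assumes you’ve downloaded NLTK’s “punkt”, “stopwords”, and “wordnet” data packages; the sample sentence is just an illustration:

```python
# A minimal preprocessing sketch with NLTK (pip install nltk). Assumes
# you've downloaded the "punkt", "stopwords", and "wordnet" data packages
# via nltk.download().
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer
from nltk.tokenize import word_tokenize

lemmatizer = WordNetLemmatizer()
stop_words = set(stopwords.words("english"))

def preprocess(text):
    tokens = word_tokenize(text.lower())                  # normalize + tokenize
    tokens = [t for t in tokens if t.isalpha()]           # drop punctuation/numbers
    tokens = [t for t in tokens if t not in stop_words]   # remove stop words
    return [lemmatizer.lemmatize(t) for t in tokens]      # reduce to root form

print(preprocess("The best ergonomic gaming chairs for long hours"))
# e.g. ['best', 'ergonomic', 'gaming', 'chair', 'long', 'hour']
```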

For example, let’s say your query is “best ergonomic gaming chair,” and your content is “Top gaming chairs designed for ergonomic support during long sessions.” The cosine similarity score might be 0.92, indicating a high level of alignment.
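As a sketch, here’s one way to compute that kind of score with sentence embeddings, assuming the sentence-transformers library. The exact number depends on the embedding model you pick, so don’t expect precisely 0.92:

```python
# A sketch using the sentence-transformers library (pip install
# sentence-transformers). The model name is one common choice, not a
# requirement, and the exact score depends on the model, so it won't
# necessarily match the 0.92 in the example above.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

query = "best ergonomic gaming chair"
content = "Top gaming chairs designed for ergonomic support during long sessions."

# Encode both texts as dense vectors, then take the cosine of the angle
# between them.
query_vec = model.encode(query)
content_vec = model.encode(content)
score = util.cos_sim(query_vec, content_vec).item()

print(f"Cosine similarity: {score:.2f}")
```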


How to use cosine similarity to analyze and optimize content

Let’s walk through a practical example. Imagine you’ve written a blog post about gaming chairs, and you want to see how well it aligns with the query “gaming chairs with lumbar support.” Here’s what you’d do:

  1. Preprocess your text: Normalize, tokenize, and lemmatize both the query and your content.
  2. Convert text to vectors: Use TF-IDF or embeddings to represent the text numerically.
  3. Calculate cosine similarity: Use Python to compute the score.
  4. Interpret the results:
    • A score of 0.9–1.0 means your content aligns well with the query.
    • A score of 0.7–0.9 suggests there’s room for improvement.
    • A score of 0.4–0.7 indicates only moderate alignment; the content likely needs substantial rework.
    • A score of 0–0.4 indicates significant misalignment.

Example: Analyzing a long-form document

For longer documents (e.g., blog posts or articles), you can break the content into smaller sections (e.g., paragraphs or headings) and calculate cosine similarity for each section. Here’s a minimal sketch in Python using scikit-learn’s TF-IDF vectorizer (the query and section texts are illustrative placeholders):
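```python
# A sketch using scikit-learn's TF-IDF vectorizer (pip install scikit-learn).
# The query and section texts are illustrative placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

query = "best ergonomic gaming chair"

# Break the long-form document into sections (one string per heading).
sections = [
    "Gaming chairs come in many styles, from racing-inspired to minimalist.",
    "Ergonomic features like lumbar support and adjustable armrests keep "
    "you comfortable during long sessions.",
    "Budget options under $200 can still offer solid build quality.",
]

# Fit TF-IDF on the query plus all sections so they share one vocabulary.
vectors = TfidfVectorizer(stop_words="english").fit_transform([query] + sections)

# Compare the query vector (row 0) against each section vector.
scores = cosine_similarity(vectors[0], vectors[1:]).flatten()

for i, score in enumerate(scores, start=1):
    print(f"Section {i} similarity: {score:.2f}")
print(f"Overall (mean) similarity: {scores.mean():.2f}")
```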

Output: the script prints a similarity score for each section, plus an overall score for the document. For this walkthrough, say the overall score comes out to 0.68.

Interpret the score

The cosine similarity score of 0.68 indicates that the content is moderately aligned with the query. To improve the score, the content could include more specific details about “best” or “ergonomic” features.


Identifying content overlap and cannibalization

Cosine similarity can also help you identify content overlap or cannibalization, which occurs when multiple pieces of content target similar topics or keywords, potentially competing with each other in search rankings. Here’s how it works:

Example: Comparing multiple blog posts

Let’s say you have three blog posts:

  • Post A: “Top ergonomic gaming chairs for long hours.”
  • Post B: “Affordable gaming chairs with lumbar support.”
  • Post C: “Best gaming chairs for tall gamers.”

After preprocessing and vectorizing, you calculate the cosine similarity scores (a code sketch follows this list):

  • Post A vs. Post B: 0.85 (high similarity—both focus on comfort but differ in emphasis).
  • Post A vs. Post C: 0.45 (low similarity—Post C focuses on height, which isn’t covered in Post A).
  • Post B vs. Post C: 0.50 (low similarity—Post C doesn’t address affordability or lumbar support).
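Here’s a sketch of that pairwise comparison with scikit-learn. The post texts below are one-line placeholders standing in for the full articles, so the scores will differ from the illustrative numbers above:

```python
# A sketch of the pairwise comparison using TF-IDF vectors. The post texts
# are one-line placeholders standing in for the full articles.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

posts = {
    "Post A": "Top ergonomic gaming chairs for long hours.",
    "Post B": "Affordable gaming chairs with lumbar support.",
    "Post C": "Best gaming chairs for tall gamers.",
}

vectors = TfidfVectorizer(stop_words="english").fit_transform(posts.values())
matrix = cosine_similarity(vectors)  # 3x3 matrix of pairwise scores

names = list(posts)
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        print(f"{names[i]} vs. {names[j]}: {matrix[i, j]:.2f}")
```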

What this tells you

  • Posts A and B are highly similar, which could lead to cannibalization. For example, both might rank for queries like “best gaming chairs for comfort” or “ergonomic gaming chairs,” potentially splitting traffic.
  • Post C is distinct and doesn’t overlap significantly with the other posts, so it’s unlikely to cannibalize their rankings.

How to address content overlap

If you identify high similarity scores, here’s how you can address the issue:

  1. Merge overlapping content: Combine Posts A and B into a single, comprehensive piece.
  2. Differentiate the content: Update Post A to focus on “ergonomic gaming chairs for professional gamers” and Post B to focus on “budget-friendly gaming chairs for casual gamers.”
  3. Create bridging content: Write a post titled “Affordable gaming chairs for tall gamers” to address the gap between Posts B and C.
  4. Optimize for distinct keywords: Use keyword research tools to identify unique keywords for each post.

Putting it all together: NLP, BERT, MUM, LLMs, vectors, and embeddings

1. Map user journeys and identify intent

  • Break down the journey: Divide the user journey into stages (e.g., awareness, consideration, decision) and identify the intent behind each stage.
    • Example: For “gaming chairs,” awareness-stage users might search for “what makes a good gaming chair,” while decision-stage users might look for “best ergonomic gaming chair under $300.”
  • Understand user needs: At each stage, determine what users are looking for—whether it’s general information, comparisons, or specific product details.

2. Use cosine similarity to align content with intent

  • Analyze query semantics: Use cosine similarity to measure how well your content matches the semantic meaning of user queries at each stage.
    • Example: If the query is “best ergonomic gaming chair,” ensure your content includes terms like “lumbar support,” “adjustable armrests,” and “comfort for long hours.”
  • Optimize for semantic relevance: Focus on entity coverage rather than keyword stuffing. Ensure your content covers all relevant entities (e.g., “ergonomics,” “durability,” “affordability”) that align with the query’s intent.
  • Tools to use: Use tools like TF-IDF, word embeddings (e.g., Word2Vec, BERT), or Python libraries (e.g., Scikit-learn) to calculate cosine similarity and refine your content.

3. Balance breadth and depth across pages

  • Awareness stage: Create short, broad pages that introduce the topic and answer general questions. Use cosine similarity to ensure these pages align with exploratory queries.
    • Example: A page titled “What to look for in a gaming chair” should cover key features (e.g., ergonomics, materials) without going into excessive detail.
  • Consideration stage: Develop focused pages that dive deeper into specific aspects. Use cosine similarity to align with comparison or evaluation queries.
    • Example: A page titled “Top 5 ergonomic gaming chairs for 2023” should compare features, pros, and cons.
  • Decision stage: Provide detailed, actionable pages that help users make a purchase. Use cosine similarity to align with transactional queries.
    • Example: A page titled “Best budget gaming chairs with lumbar support” should include pricing, reviews, and buying links.

4. Create a cohesive content ecosystem

  • Internal linking: Connect pages across stages to guide users seamlessly through their journey.
    • Example: Link from an awareness-stage page (“What makes a good gaming chair”) to a consideration-stage page (“Top 5 ergonomic gaming chairs”).
  • Avoid overlap: Use cosine similarity to ensure pages target distinct intents and avoid cannibalizing each other.
    • Example: If two pages score high in similarity (e.g., >0.8), merge them or differentiate their focus (see the sketch below).
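To operationalize that check across a set of pages, a sketch like the following flags any pair above the threshold (the page slugs and summaries are placeholders):

```python
# A sketch: compute pairwise similarities across a set of pages and flag any
# pair above a merge/differentiate threshold. Page slugs and summaries are
# placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

pages = {
    "what-makes-a-good-gaming-chair": "Intro guide to key gaming chair features.",
    "top-5-ergonomic-gaming-chairs": "Comparison of ergonomic gaming chairs.",
    "best-ergonomic-chairs-for-gamers": "Roundup of ergonomic chairs for gamers.",
}

vectors = TfidfVectorizer(stop_words="english").fit_transform(pages.values())
matrix = cosine_similarity(vectors)

THRESHOLD = 0.8  # the cutoff suggested above for merging or differentiating
names = list(pages)
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        if matrix[i, j] > THRESHOLD:
            print(f"Review for overlap: {names[i]} vs. {names[j]} "
                  f"({matrix[i, j]:.2f})")
```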
