Schema markup guide: Everything you’ll need to know to be prepared for the future of semantic search

Most of the web’s schema markup is either inaccurate, incomplete or missing.

It’s not your fault or your developer’s. The people who decide which structured data types and ontologies exist, and how schema markup should be used, are academics.

That is, they operate in completely different circles to SEO professionals who want to implement it.

No wonder the web’s schema markup is either inaccurate, incomplete or missing.

In this explainer you will learn everything you will need to begin implementing connected schema markup so that every important webpage on your website is rich with descriptive context for search engines.

What is semantic SEO?

Semantic SEO is the practice of translating unstructured data into structured data. It reduces your reliance on Google to get things right by removing ambiguity.

Most SEOs focus purely on the uniqueness and helpfulness of the content itself. While I stand by this tactic, this approach relies too heavily on Google to interpret unstructured data.

Google’s Knowledge Graph is not yet mature, meaning it doesn’t know everything. What this means for you is that you can give your content a significant boost by relaying the same information through schema markup.

Therefore, the purpose of semantic SEO is to amplify your investment in user experience so that you get the maximum return from it.

By the way, semantic search, semantic SEO, and entity-based SEO refer to the same practice: using structured data to create schemas so that search engines can better understand how one entity relates to another.

What is schema markup?

Schema markup is a way to convert unstructured data into structured data.

What is unstructured data then?

Well, any text you see on a page, images on a page, and embedded videos on a page are all forms of unstructured data.

You can process and understand unstructured data with relative ease because humans start developing schemas as children and keep refining them as adults.

But machines struggle to do so and this is why structured data exists.

That is, structured data is a way that machines can process and understand all these complicated things, meanings and nuances humans have created.
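As a rough illustration of the difference, the sketch below puts the same facts side by side as a sentence and as property/value pairs. The person, company and university are made-up placeholders, and the flat values are a simplification (real markup would usually nest an Organization rather than use a plain string).

```python
# Unstructured: a human reads this easily, but a program sees one opaque string.
sentence = "Jane Example works for Example Co and studied at Example University."

# Structured: the same facts as explicit property/value pairs.
# All values are hypothetical placeholders, simplified to plain strings.
structured = {
    "@type": "Person",
    "name": "Jane Example",
    "worksFor": "Example Co",
    "alumniOf": "Example University",
}

# A program can answer "who does this person work for?" with a key lookup,
# whereas the sentence would need language processing to extract the same fact.
print(structured["worksFor"])
```

The point is not the Python itself but the shape of the data: once each fact has a named slot, a machine no longer has to guess which words in the sentence carry it.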

What is schema?

Schemas are ways humans understand concepts and things. We use these mental models every single day to disambiguate between similar things.

Machines, on the other hand, need a lot of help to understand these concepts and things and this is why schema markup is needed.

To demonstrate this concept, imagine trying to describe your dog to an alien species.

If you said, “It is a creature with 4 legs”, you would not be incorrect, but a horse, a cow, and a cat also have 4 legs.

So you expand with, “It is a creature with 4 legs and a tail.”

But a horse, a cow, and a cat each have these features too. You would have to keep adding distinguishing attributes until only a dog fits the description.

In other words, machines such as search engines rely on their own engineering to interpret unstructured data on the web, but you can make their job far more efficient with schema markup.

That is, the less work the machine has to do, the quicker your content may be indexed and ranked on the SERPs.

What is syntax in the context of structured data markup?

There are 3 syntaxes used to communicate schema to machines.

They are RDFa, Microdata, and JSON-LD.

All three are syntaxes, that is, formats for writing structured data into a webpage.

In other words, RDFa, Microdata, and JSON-LD are not programming languages but different notations for expressing the same information.

For example, just as you can say “chicken” in different languages and dialects (e.g., 鸡 vs kip vs kyckling vs pollo vs κοτόπουλο), you can describe the same entity using different syntax.

Today, JSON-LD is the preferred way to turn on-page unstructured data into structured data (it is the format Google recommends), and the vocabulary you can use is defined by schema.org.
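For a sense of what JSON-LD looks like once it is on a page, here is a minimal sketch that serializes one entity and wraps it in the `<script type="application/ld+json">` tag that JSON-LD is embedded in. The organization name and URL are placeholders, and `to_script_tag` is a hypothetical helper written for this example.

```python
import json

# Hypothetical organization used only for illustration.
entity = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Example Co",
    "url": "https://www.example.com/",
}


def to_script_tag(data: dict) -> str:
    """Serialize a dict as JSON-LD and wrap it in the script tag used in HTML."""
    body = json.dumps(data, indent=2, ensure_ascii=False)
    return f'<script type="application/ld+json">\n{body}\n</script>'


snippet = to_script_tag(entity)
print(snippet)
```

The printed snippet is what you would place in the page's HTML. The same Organization could also be expressed in RDFa or Microdata; only the notation would change, not the vocabulary.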

What is schema.org vocabulary?

Launched initially by Bing, Google and Yahoo!, schema.org was formed as an alliance to create and support a common set of schemas for structured data markup.

Schema.org is a collaborative community made up of individual contributors, a community group and a steering committee.

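Schema.org is the entity that defines how entities can be described, and schema.org vocabulary refers to the attributes (called properties in the schema.org documentation) that can be used to describe an entity.

For example, the Person schema.org Type comes with its own vocabulary, including attributes such as name, alumniOf, worksFor, and sameAs.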

What is an entity?

Anything and everything can be an entity. They can be real, visible, imaginary, a work of fiction, or otherwise.

This is because everything is a thing and this is why we need to provide machines with schemas so they can differentiate between things.

An entity can be a physical thing (e.g., a chair), something we cannot see but know exists (e.g., an electron), or something fictional (e.g., Harry Potter).

An easy way to understand entities is the following: if it is a noun then it is an entity.

What are schema Types?

Types of schema (schema.org Types) are used to describe specific entities.

For example, any person can be an entity, but there are more than 7 billion people on earth, so the role of schema.org vocabulary and its attributes is to help describe as many as possible of the unique features that make a particular person identifiable to a machine.

The Person schema.org Type comes with a set of vocabulary so that you can describe who that person is.

Vocabulary such as name, alumniOf, gender, worksFor, memberOf, sibling, sameAs, and relatedTo let you describe who a person is. This in turn disambiguates them from someone else who shares the same name, gender, and other overlapping attributes.

Using the JSON-LD syntax, you can give Google the information it needs to understand who you’re talking about.

This is the power of structured data markup.
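To make the disambiguation idea concrete, the sketch below describes two people who share a name and then lists the attributes that tell them apart. Both people, their employers and their profile URLs are invented placeholders, and `distinguishing_properties` is a hypothetical helper, not part of any schema.org tooling.

```python
import json

# Two hypothetical people with the same name; all values are placeholders.
jane_a = {
    "@context": "https://schema.org",
    "@type": "Person",
    "name": "Jane Example",
    "worksFor": {"@type": "Organization", "name": "Example Co"},
    "sameAs": ["https://www.linkedin.com/in/jane-example-a/"],
}
jane_b = {
    "@context": "https://schema.org",
    "@type": "Person",
    "name": "Jane Example",
    "worksFor": {"@type": "Organization", "name": "Other Co"},
    "sameAs": ["https://www.linkedin.com/in/jane-example-b/"],
}


def distinguishing_properties(a: dict, b: dict) -> list:
    """Return the attributes whose values differ between two descriptions."""
    keys = sorted((set(a) | set(b)) - {"@context", "@type"})
    return [key for key in keys if a.get(key) != b.get(key)]


print(json.dumps(jane_a, indent=2))
print(distinguishing_properties(jane_a, jane_b))
```

The name alone cannot separate the two, but worksFor and sameAs can, which is why filling in more attributes than just name matters.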

Common schema.org Types you should use

At the time of writing, there are 803 schema.org Types.

Of these, you will most likely use:

  • Organization
  • Person
  • Event
  • CreativeWork
  • Product

These are all specific schema.org Types that sit under the umbrella Type called “Thing” and, as the schema.org documentation says, “Thing is the most generic type of item”.

Within many of these Types are more specific schema.org Types.

Which one should you use – Schema.org Type or specific schema.org Types?

There is a hierarchy to schema.org Types.

For example, you can use CollectionPage schema.org Type to describe an e-commerce product category page just as you can also use WebPage schema.org Type.

Neither is wrong, because CollectionPage schema.org Type is a child of WebPage schema.org Type, which is a child of CreativeWork schema.org Type, which is a child of Thing schema.org Type.

That is, you can describe an entity using more than one Type, which raises two questions: which one should you use and, more importantly, does it matter whether you use WebPage or CollectionPage?

In this instance, CollectionPage is more specific and relevant to the product category page, and this is why it makes sense to use CollectionPage instead of its ancestors WebPage or CreativeWork.

Always use the most specific schema.org Type possible because specificity matters.

This means choosing the most appropriate schema.org Type to describe an entity, because the more closely you can describe an entity, the better Google can understand what you are describing.
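The "pick the most specific Type" rule can be sketched in a few lines. The parent map below is hand-written from the hierarchy described in this section and covers only those four Types; it is not the full schema.org vocabulary, and `lineage` and `most_specific` are hypothetical helper names.

```python
# Parent of each Type, hard-coded from the hierarchy described in this section.
# This is a tiny hand-written subset, not the full schema.org vocabulary.
PARENT = {
    "CollectionPage": "WebPage",
    "WebPage": "CreativeWork",
    "CreativeWork": "Thing",
    "Thing": None,
}


def lineage(schema_type: str) -> list:
    """Return the Type followed by its ancestors, most specific first."""
    chain = []
    current = schema_type
    while current is not None:
        chain.append(current)
        current = PARENT[current]
    return chain


def most_specific(candidates: list) -> str:
    """Pick the candidate that sits deepest in the hierarchy."""
    return max(candidates, key=lambda t: len(lineage(t)))


print(lineage("CollectionPage"))
print(most_specific(["WebPage", "CreativeWork", "CollectionPage"]))
```

Because a child Type inherits everything from its ancestors, choosing the deepest applicable Type costs you nothing and gives search engines the narrowest description.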

Can I have more than one schema.org Type per page?

Yes, you can have more than one Type per page. However, schema.org Types should be linked to each other.

That is, you’re telling Google the relationship between one Type and another, and thus establishing how one entity relates to another entity on the page.

In doing so, you help Google understand the context for your target keywords and audience. The correct way to optimise your structured data is to build nested relationships.

For example, using JSON-LD, you can tell Google how a FAQPage is related to the CollectionPage, where an Organization is the publisher of the WebSite.

This process of joining schema.org Types together is how you create page-level knowledge graphs, which are a pillar of semantic SEO.
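One common way to join Types is to give each entity an `@id` and point to it from other entities. The sketch below is one possible arrangement under stated assumptions: the domain, names and URLs are placeholders, the page is typed as both CollectionPage and FAQPage, and `dangling_references` is a hypothetical helper that only checks references between nodes in the same `@graph`.

```python
import json

SITE = "https://www.example.com/"  # hypothetical domain

graph = {
    "@context": "https://schema.org",
    "@graph": [
        {
            "@type": "Organization",
            "@id": SITE + "#organization",
            "name": "Example Co",
            "url": SITE,
        },
        {
            "@type": "WebSite",
            "@id": SITE + "#website",
            "url": SITE,
            # The Organization is the publisher of the WebSite.
            "publisher": {"@id": SITE + "#organization"},
        },
        {
            "@type": ["CollectionPage", "FAQPage"],
            "@id": SITE + "shoes/",
            "url": SITE + "shoes/",
            # The page is part of the WebSite.
            "isPartOf": {"@id": SITE + "#website"},
        },
    ],
}


def dangling_references(doc: dict) -> list:
    """Return @id references that do not match any node defined in @graph."""
    defined = {node["@id"] for node in doc["@graph"]}
    missing = []
    for node in doc["@graph"]:
        for value in node.values():
            is_reference = isinstance(value, dict) and set(value) == {"@id"}
            if is_reference and value["@id"] not in defined:
                missing.append(value["@id"])
    return missing


print(json.dumps(graph, indent=2))
print(dangling_references(graph))
```

An empty list from the check means every relationship points at an entity that is actually described, so nothing in the graph is left disconnected. References to external identifiers (for example Wikidata URLs) would be reported by this simple check even though they are legitimate.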

Why should schema be connected to each other?

An entity by itself does not help Google or your website.

Similarly, having multiple disconnected entities described on a webpage is not helpful.

This is because context provides additional information and meaning. When you describe how one entity is related to another entity, Google can begin to see how the things that matter to you should matter to its users.

Connecting structured data is the process of providing rich and meaningful context to search engines and, if you think about it, it is literally search engine optimization.

What does a page-level knowledge graph look like?

When you define relationships between entities on a webpage, you create a map that search engines understand.

For example, in a diagram of a page-level knowledge graph, each circle is a node representing an entity, and each line connecting two circles represents the relationship that has been defined between those entities.

This was created with JSON-LD syntax with relevant schema.org Types connected to each other using schema.org vocabularies.

In fact, if you want to see the raw JSON-LD, here is the full code.

{
    "@context":"https://schema.org",
    "@type":["WebPage","FAQPage"],
    "url":"https://www.danielkcheung.com/how-to-describe-saas-product-with-schema/",
    "headline":"How to describe your SaaS product with semantic SEO",
    "@id":"https://www.danielkcheung.com/how-to-describe-saas-product-with-schema/",
    "author":{
        "@type":"Person",
        "name":"Daniel Cheung",
        "@id":"https://www.danielcheung.com.au/about/",
        "sameAs":"https://www.linkedin.com/in/danielkcheung/",
        "alternateName":"Daniel K Cheung",
        "image":{
          "@type":"ImageObject",
          "url":"https://www.danielkcheung.com/wp-content/uploads/2021/07/danielkcheung-bio-portrait.webp"
        },
        "knowsAbout":[
          {
            "@type":"Thing",
            "name":"search engine optimization",
            "@id":"https://www.wikidata.org/entity/Q180711",
            "alternateName":"seo"
          },
          {
            "@type":"Thing",
            "name":"semantic search",
            "@id":"http://www.wikidata.org/entity/Q1891170",
            "sameAs":"https://en.wikipedia.org/wiki/Semantic_search"
          },
          {
            "@type":"Thing",
            "name":"JSON-LD",
            "@id":"http://www.wikidata.org/entity/Q6108942"
          },
          {
            "@type":"Thing",
            "name":"digital marketing",
            "@id":"https://www.wikidata.org/entity/Q1323528"
          }
        ],
        "memberOf":{
          "@type":"OnlineBusiness",
          "name":"Daniel K Cheung: Digital Marketing Consultancy",
          "legalName":"Daniel K Cheung",
          "sameAs":[
            "https://abr.business.gov.au/ABN/View/97136392116",
            "https://www.linkedin.com/company/daniel-k-cheung-consultancy"
          ],
          "taxID":"97 136 392 116",
          "@id":"https://www.danielkcheung.com/",
          "url":"https://www.danielkcheung.com/"
        }
    },
    "keywords":["SaaS product schema","WebApplication schema","SaaS subscription pricing schema"],
    "mentions":[
        {
            "@type":"Thing",
            "name":"web application",
            "sameAs":"https://www.wikidata.org/wiki/Q189210"
        },
        {
            "@type":"Thing",
            "name":"application",
            "sameAs":"https://www.wikidata.org/wiki/Q166142"
        },
        {
            "@type":"Thing",
            "name":"Schema.org",
            "sameAs":"https://www.wikidata.org/wiki/Q3475322"
        },
        {
            "@type":"Thing",
            "name":"JSON-LD",
            "@id":"http://www.wikidata.org/entity/Q6108942"
        },
        {
            "@type":"Thing",
            "name":"semantic web",
            "sameAs":"https://www.wikidata.org/wiki/Q54837"
        }
    ],
    "mainEntity": [{
        "@type": "Question",
        "name": "What schema type should I use for a Saas product?",
        "acceptedAnswer": {
          "@type": "Answer",
          "text": "Many SaaS products are accessed through a web browser. For this reason, WebApplication is the best schema type to describe a SaaS product. For example, to use Canva, Clearscope, Semrush, Ahrefs, and Adobe Express, you would navigate to their respective websites in your web browser (e.g., Chrome, Safari, Edge)."
        }
      },{
        "@type": "Question",
        "name": "What is WebApplication schema and how is it different to SoftwareApplication schema?",
        "acceptedAnswer": {
          "@type": "Answer",
          "text": "WebApplication is a subtype of SoftwareApplication schema which is a subtype of CreativeWork schema. The main difference between SoftwareApplication and WebApplication schema is how they are accessed. Programs such as Final Cut Pro, Google Chrome and Screaming Frog require physical downloads and installation whereas apps such as Google Sheets, Canva, and Substack are accessed via a web browser."
        }
      },{
        "@type": "Question",
        "name": "What schema.org properties does WebApplication schema type offer?",
        "acceptedAnswer": {
          "@type": "Answer",
          "text": "When you compare Schema.org documentation between WebApplication and SoftwareApplication you will notice WebApplication has only one unique attribute – the browserRequirements item property. This makes sense because this is the defining feature of disambiguation between a piece of software you install on your device versus a cloud-based software you access directly through a web browser. But because WebApplication is a child of SoftwareApplication and CreativeWork schema, you can use all of their attributes to describe your Saas product."
        }
      },{
        "@type": "Question",
        "name": "Where should I use WebApplication schema for the SaaS Product?",
        "acceptedAnswer": {
          "@type": "Answer",
          "text": "We need to nest WebApplication schema on one or more URLs of your website. For the SaaS product schema I compiled for Adobe Express, the most appropriate way to use this schema would be on the marketing-led URL by connecting the WebApplication schema to WebPage schema using the about item property. That is, we are going to tell Google that the main focus of adobe.com/express/ is the WebApplication schema which happens to be our SaaS product Adobe Express."
        }
      },{
        "@type": "Question",
        "name": "Is there any benefit of marking up a SaaS product with semantic SEO?",
        "acceptedAnswer": {
          "@type": "Answer",
          "text": "If you’re a startup, it is probably not worth the effort because there are other things you should focus on from a business perspective. However, for an organisation such as Adobe that have many SaaS products in its portfolio, marking up individual SaaS solutions may help Google understand each product better, especially if the product was part of an acquisition and has been rebranded (e.g., from Magento Ecommerce to Adobe Commerce). In my professional opinion, semantic SEO is the missing piece to most of the web’s SEO strategy. This is because schema markup tells a search engine what it will find when it crawls a webpage. While search engines will still use NLP to understand the unstructured data, having the same information described in structured data reduces their need to make guesses on your behalf. Therefore, semantic SEO is quite literally optimisation for search engines."
        }
      }],
    "image":[
        {
            "@type":"ImageObject",
            "name":"WebApplication Schema.org type is best for describing SaaS products",
            "contentUrl":"https://www.danielkcheung.com/wp-content/uploads/2023/05/webapplication-schema-type-screenshot.jpg",
            "creator":{"@id":"https://www.danielcheung.com.au/about/"},
            "copyrightHolder":{"@id":"https://www.danielcheung.com.au/about/"},
            "copyrightYear":2023,
            "about":{
                "@type":"Thing",
                "name":"web application",
                "@id":"https://www.wikidata.org/wiki/Q189210"
            },
            "keywords":["webapplication schema","SaaS product schema type"]
        },
        {
            "@type":"ImageObject",
            "contentUrl":"https://www.danielkcheung.com/wp-content/uploads/2023/05/ahrefs-saas-product-schema-with-pricing.jpg",
            "creator":{"@id":"https://www.danielcheung.com.au/about/"},
            "copyrightHolder":{"@id":"https://www.danielcheung.com.au/about/"},
            "copyrightYear":2023,
            "keywords":["aggregateOffer for SaaS product","SaaS product subscription schema"]
        }
    ],
    "isPartOf":{
        "@type":"WebSite",
        "name":"Daniel K Cheung",
        "@id":"https://www.danielkcheung.com/",
        "url":"https://www.danielkcheung.com/"
    }
}

Copy and paste the entire code snippet into the Schema.org validator and Google’s Rich Results Test.
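If you would like to see the nodes and lines of a knowledge graph as plain text before using those tools, the sketch below walks a JSON-LD document and prints each relationship as a (subject, property, object) triple. It is a rough local aid, not a validator; `label` and `edges` are hypothetical helpers, and the short document is a made-up stand-in for a full snippet.

```python
import json


def label(node: dict) -> str:
    """Pick a readable label for a node: name, then @id, then @type."""
    for key in ("name", "@id", "@type"):
        if key in node:
            return str(node[key])
    return "(unnamed)"


def edges(node: dict):
    """Yield (subject, property, object) for every nested entity."""
    for prop, value in node.items():
        children = value if isinstance(value, list) else [value]
        for child in children:
            if isinstance(child, dict):
                yield (label(node), prop, label(child))
                yield from edges(child)


# Trimmed-down, hypothetical stand-in for a full JSON-LD snippet.
raw = """
{
  "@context": "https://schema.org",
  "@type": "WebPage",
  "@id": "https://www.example.com/guide/",
  "author": {
    "@type": "Person",
    "name": "Jane Example",
    "memberOf": {"@type": "Organization", "name": "Example Co"}
  },
  "isPartOf": {"@type": "WebSite", "name": "Example Site"}
}
"""

doc = json.loads(raw)  # raises an error if the JSON itself is malformed
for triple in edges(doc):
    print(triple)
```

Each printed triple corresponds to one line in the graph: the first and last items are the circles, and the middle item is the schema.org property that connects them.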

Does schema markup help with SEO?

Yes, but schema markup is a lever, just as on-page content, links, and clean technical SEO are levers.

In other words, semantic SEO is not a magic bullet that will skyrocket your keyword rankings or organic traffic.

Interconnected schema markup will amplify your on-site content because you’re delivering efficient code that search engines understand.

So instead of relying on Google’s natural language processing, schema markup gives Google something to validate against.

Doing this may shorten the time it takes for a page to be indexed and to rank, especially on JavaScript-heavy front-end websites.

Therefore, if you’re thinking of getting buy-in from stakeholders or a client on semantic SEO, or you want to do it for your own website, think of schema markup as a reflection of content. That is, the content on a page should be the marker for how many entities you can mark up.

And remember, you don’t need to have schema on every single webpage – only the most important ones that you want served on Google Search.

And for these pages, you don’t need to mark up every single element – only the important entities.

In closing

  • Entities can be described using Types with the vocabulary as defined by schema.org.
  • Joining one entity to another builds context for search engines and this is the goal of semantic SEO.
  • It is OK to have multiple entities per page as long as their relationships to each other are clearly defined using schema.org attributes.
  • Interconnected schema.org Types create a knowledge graph on the webpage and this knowledge graph is how Google understands context and meaning.
  • But adding connected schema markup is not magic. It is merely another lever at your disposal.

Ready to apply this theory?

The next two how-to guides in this three-part series are:

  1. ‘How to write JSON-LD from scratch’
  2. ‘How to do semantic SEO using nothing but JSON-LD and schema.org vocabulary’
