In this article you will learn:
1. What Semantic Search is;
2. How Semantic Search began at Google;
3. Why Semantic Search was the natural direction for search engine development;
4. What Schema.org is and how you can use structured data on your site;
5. What to look for when creating content on a site – beyond structured data.
Semantics is the branch of linguistics and logic concerned with the study of meaning. Its two main areas are:
- Logical semantics – concerned with matters such as sense and reference, presupposition and implication.
- Lexical semantics – concerned with the analysis of word meanings and the relations between them.
The way users search for information has changed dramatically over recent years.
The need to provide relevant, high-quality search results has driven the development of new methods and algorithms for presenting content. At the same time, emerging technologies let users search and communicate with devices in completely new ways (e.g. voice search), making natural queries and conversations with devices and search engines possible. This, in turn, has driven the development of techniques for better understanding natural language and the meaning of words, in order to meet the growing demands of users. In this way, technology has come full circle.
It’s easy to understand why search engines strive to meet the needs of their users and cut out unnecessary (and junk) results. The quality of the results increasingly decides whether a user stays with a given search engine or abandons it. Users want products and services that meet their needs. If the search engine does not provide results that match the user’s query, there is a higher chance that they will turn to the competition.
Competition in search is not limited to a computer and a browser, however. Amazon is an excellent search engine for products and their reviews. Moreover, personal assistants – e.g. Google Home, Amazon’s Echo and Alexa, and Apple’s Siri and newly announced HomePod – are direct competitors in the field of voice search. Such new technologies increasingly force the development of new methods for seeking and retrieving information.
So where does Semantic Search fit in amongst all of this?
In the Beginning, there was Google Knowledge Graph and Hummingbird
It is fairly safe to say that Semantic Search first appeared at Google with the arrival of the Knowledge Graph in May 2012 and the Hummingbird algorithm in 2013. The two updates were closely related and marked a breakthrough in how Google read user queries and then presented search results.
In order to better understand the subject of Semantic Search, it is worth briefly looking at the historical context (if 2012 and 2013 can be called “history”) in which Semantic Search developed.
Knowledge Graph – “Things not Strings”
“Things” means that words are no longer just a patchwork of characters. They are now objects with meanings that can refer to other objects. The Knowledge Graph presents the information the user is searching for quickly and clearly. It does so by selecting information from a database of millions of objects and gathering the most relevant sources together with closely linked information. This will become clearer later on. The user does not have to click through the search results to find the information that interests them; it is available directly on the Google search results page.
Figure 1: Knowledge Graph when searching for “Stranger Things”
The Hummingbird algorithm focused on reading and understanding the user’s intentions even more accurately. This allowed for a better match of search results to a specific query, taking into account the intent, context and meaning of the query sent by the user. Words ceased to be only a patchwork of letters. Now they started to make sense.
Search Engine Land has highlighted how the Hummingbird update was a complete change to the way Google’s algorithms had previously worked (https://goo.gl/FGgsoM), with the name itself chosen to reflect that it is “fast and precise”.
Hummingbird had a significant impact on the development of voice search, which has created a lot of hype – particularly of late – in the world of e-marketing and SEO. There is a reason for this. Sales of personal assistant devices (Google Home, Amazon Alexa etc.) and the use of voice search are growing dramatically. If you want to find out more about the voice search trend, I would refer you to a fantastic entry on the Branded3 blog that summarises these trends.
What is Semantic Search?
Taking what has been described above into account and collating it with the supporting definition from Wikipedia, we can begin to accurately define what we mean by Semantic Search.
Semantic Search seeks to improve search accuracy by understanding the searcher’s intent and the contextual meaning of the terms they use as they appear, in order to provide relevant search results. By using semantics, Google can provide the user with more tailored, relevant and personalised search results. This is achieved by ‘understanding’ the intentions, word meanings and context in which questions are asked through an awareness of what the terms mean in themselves as well as the links between them. It must be said, Google is doing quite well with this.
When I used natural language (either typed into the search box or spoken) to ask “who plays in stranger things”, Google returned a result that was appropriate to the context. It responded with the cast of the series in the form of a carousel that included pictures, names and the characters they played – despite the fact that I did not include the word “cast” in my query:
Figure 2: An example of understanding user intentions
Algorithms can understand the intent and meaning of a query, resulting in the display of relevant search results. In the following case, Google knows that I am looking for the title of the latest album released by Iron Maiden. I do not have to go to the band’s website or browse a forum in search of this information. I get what I want straight away from the search results themselves.
Figure 3: An example of understanding user intentions
Google also provides a more concrete answer when the queries themselves are more specific. It can understand, for example, the difference between asking for an age or the date of birth of a specific person (in both cases the answer will be matched to the question). It also suggests other information that may be of interest:
Figure 4: Intention of the user: “How old is …” vs “What year …”
The connections between ‘things’, the Hummingbird update and Semantic Search – how does it all come together?
Figure 5: Combined data – “things” not text. The relationships between “things”.
In the chart, I have marked examples of “things” (bold font, blue frame) – objects to which other “things” relate – as well as the types of connections between them (arrows and text in a green frame). See for yourself what these objects and associations look like for Metallica on WikiData.
Google recognises the intent and meaning of a query and displays a specific answer based on the available data. What’s more, the algorithms “know” that it is worth suggesting “similar searches” for other band members, because they know what I am asking about and what else I am likely to ask for!
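To make the idea of “things” and the connections between them more concrete, here is a minimal Python sketch that stores a handful of facts as subject-relation-object triples and looks up related objects. The relation names and the helper functions (`build_graph`, `related`, `siblings`) are illustrative assumptions; this is not how Google or WikiData actually store or query their data.

```python
from collections import defaultdict

# Facts as (subject, relation, object) triples.
# The relation labels are illustrative, not official identifiers.
TRIPLES = [
    ("Metallica", "instance of", "musical group"),
    ("Metallica", "has member", "James Hetfield"),
    ("Metallica", "has member", "Lars Ulrich"),
    ("Metallica", "genre", "heavy metal"),
    ("James Hetfield", "instance of", "human"),
    ("Lars Ulrich", "instance of", "human"),
]


def build_graph(triples):
    """Index the triples as graph[subject][relation] -> list of objects."""
    graph = defaultdict(lambda: defaultdict(list))
    for subject, relation, obj in triples:
        graph[subject][relation].append(obj)
    return graph


def related(graph, subject, relation):
    """Objects linked to `subject` through `relation` (empty list if none)."""
    return list(graph.get(subject, {}).get(relation, []))


def siblings(graph, entity, relation):
    """Other objects that share a subject with `entity` through `relation`,
    e.g. the remaining members of the same band."""
    found = []
    for relations in graph.values():
        objects = relations.get(relation, [])
        if entity in objects:
            found.extend(o for o in objects if o != entity)
    return found


graph = build_graph(TRIPLES)
print(related(graph, "Metallica", "has member"))
print(siblings(graph, "Lars Ulrich", "has member"))
```

A lookup such as `siblings` is one simple way to picture why a question about one band member can lead to suggestions for the others: they are all attached to the same “thing” by the same kind of connection.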
Continuing the search, Google has already understood my intentions and will now provide me with focused hints that keep within the context of my previous query:
Figure 6: Context-based hints
If I search for the same phrase without first providing context to the search engine, I will receive general hints that may lead me to a different topic that I might be looking for:
Figure 7: Tips without prior context
How to optimise a website for Semantic Search?
You can help search engines to read and interpret the content and context of your pages by using structured data markup (Schema.org). By properly tagging items on a page, we can influence the way Google displays search results – for example, in the Knowledge Graph we discussed earlier.
The markup identifies the “things” and the connections between them.
Google is constantly expanding the base of elements that it can use in search results to present “extended results”. The two most important resources to pay attention to are:
• http://schema.org/ – created jointly by Google, Microsoft, Yahoo and Yandex. It is a catalogue of agreed and standardised definitions/tags that help search engines understand specific pieces of content on a website and then connect them with other “things”. Other companies also contribute extensions to the Schema.org vocabulary. The semantic extensions for ‘Car’ (focused on the automotive industry) and for the financial industry (FIBO) were both created by MakoLab and are now officially implemented parts of Schema.org.
• “Google’s Search Gallery” – this constantly developing tool shows which extended results are currently supported by the search engine and how implementing structured data on a website can help in achieving such results.
Implementing structured data signals to Google’s robots that the content may be used as part of the extended results. Using Schema, we can, for example, influence the information contained in the Knowledge Graph, or display new information in the results (if it is not already there).
The Google Search Gallery highlights many of the elements that Google can display in search results. These include:
• Contact information
• Logo and internal site search
• Your social network profiles
• Information on upcoming events
• Job offers (recently added)
• Product information (prices, reviews, availability, other)
• Recipes (photo, cooking time, calories)
• Ratings and reviews (number of ratings, total rating)
Figure 8: Awarded Stars, Rating and Number of Reviews in Google Search Results
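As one illustration of the ratings item in the list above, here is a minimal Python sketch that builds Schema.org `Product` markup with a nested `AggregateRating` and wraps it in a JSON-LD script tag. The function names and all the values (product name, rating, review count) are hypothetical; `ratingValue` and `reviewCount` are Schema.org property names. Marking a page up this way does not guarantee that stars will be shown.

```python
import json


def product_with_rating(name, rating_value, review_count):
    """Build a Schema.org Product with an aggregate rating as a plain dict."""
    return {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "aggregateRating": {
            "@type": "AggregateRating",
            "ratingValue": rating_value,
            "reviewCount": review_count,
        },
    }


def to_script_tag(data):
    """Serialise the dict as JSON-LD inside a script tag for the page's HTML."""
    body = json.dumps(data, ensure_ascii=False, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Hypothetical example values, for illustration only.
markup = product_with_rating("Example Widget", 4.4, 89)
print(to_script_tag(markup))
```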
Describing the implementation of every individual element goes beyond the scope of this article, but the possibilities and the approach to implementation are clearly presented in the “Search Gallery”, which is linked in the list of resources later in this article.
However, it is worth describing the more extensive implementation of structured data for the ‘Organisation’ (http://schema.org/Organization), which can affect the quantity and quality of the information displayed in the Knowledge Graph. Below is an example of the implementation of structured data for “Organization” and “Address” for the MakoLab site in JSON-LD format. Modify the data and enjoy!
"name": "MakoLab S.A.",
"name": "MakoLab centrala",
"addressLocality": "Łódź, Polska",
"streetAddress": "ul. Rzgowska 30",
"telephone": "+48-42-239-28-50",
"contactType": "customer support",
"availableLanguage": [
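The fragment above is incomplete as printed, so here is a minimal Python sketch of how a full “Organization” object could be assembled around the same values. The nesting is an assumption on my part: I have placed the address fields in a `PostalAddress` and the phone details in a `ContactPoint`, which are the Schema.org types those properties normally belong to. The `organization` helper is hypothetical, and because the language list is cut off in the fragment, `availableLanguage` is left out rather than guessed.

```python
import json


def organization(name, address_name, locality, street,
                 telephone, contact_type, languages=None):
    """Build Schema.org Organization markup as a plain dict."""
    contact_point = {
        "@type": "ContactPoint",
        "telephone": telephone,
        "contactType": contact_type,
    }
    if languages:
        # Only added when the caller supplies a list of languages.
        contact_point["availableLanguage"] = list(languages)
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "address": {
            "@type": "PostalAddress",
            "name": address_name,
            "addressLocality": locality,
            "streetAddress": street,
        },
        "contactPoint": [contact_point],
    }


# Values taken from the fragment above; the structure around them is assumed.
makolab = organization(
    name="MakoLab S.A.",
    address_name="MakoLab centrala",
    locality="Łódź, Polska",
    street="ul. Rzgowska 30",
    telephone="+48-42-239-28-50",
    contact_type="customer support",
)
print(json.dumps(makolab, ensure_ascii=False, indent=2))
```

The printed JSON would go inside a `<script type="application/ld+json">` element in the page’s HTML. Note that JSON requires straight double quotes; typographic quotes picked up when copying from a formatted page make the markup invalid.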
If you want to implement structured data on your website, you will likely need these resources:
- Search Gallery: https://developers.google.com/search/docs/guides/search-gallery
- Structured Data Testing Tool: https://search.google.com/structured-data/testing-tool
- Data Highlighter: https://www.google.com/webmasters/tools/data-highlighter – if you do not have the option to implement markup directly in your site’s code.
Figure 9: Structured Data Testing Tool – the right implementation
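Before pasting a page into the testing tool, a quick local sanity check can catch the most basic mistakes. The sketch below, which uses only Python’s standard library, pulls every JSON-LD script block out of an HTML string and reports whether it parses and whether it carries `@context` and `@type`. The class and function names are hypothetical, and this is in no way a substitute for Google’s own validation: it knows nothing about which properties a given result type requires, and it only handles a single top-level object per block.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the raw text of every <script type="application/ld+json">."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def check_jsonld(html):
    """Return one (type, problem) tuple per JSON-LD block found in `html`.
    `problem` is None when the basic checks pass."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    results = []
    for raw in parser.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            results.append((None, "invalid JSON"))
            continue
        if not isinstance(data, dict):
            results.append((None, "top-level value is not a JSON object"))
        elif "@context" not in data or "@type" not in data:
            results.append((data.get("@type"), "missing @context or @type"))
        else:
            results.append((data["@type"], None))
    return results


# Hypothetical page used only to exercise the check.
sample = (
    '<html><head><script type="application/ld+json">'
    '{"@context": "https://schema.org", "@type": "Organization", "name": "Example"}'
    "</script></head><body></body></html>"
)
for found_type, problem in check_jsonld(sample):
    print(found_type, problem or "looks fine")
```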
It is not all about structured data – going back to basics
Implementing structured data is not the only way to optimise for Semantic Search. Search engine algorithms (including Google’s) are so advanced that they can often find the right answer to a user’s question even without the techniques we have discussed above. Of course, search engines were already doing this before, with every update dedicated to giving users more relevant search results.
Search engines do not rely only on webmasters and their willingness and skill in marking up pages with structured data. After reading a user’s query and intentions, the search engines have to determine which results will be most useful.
Let’s take a look at a Featured Snippet, where Google chooses one search result and then selects a fragment from the entire page dedicated to a specific topic that will hopefully best answer the user’s question.
Figure 10: A Featured Snippet answering the question “how to cook rice”
Figure 11: A Featured Snippet answering the question “Why do stars twinkle”
The Featured Snippet doesn’t have to be marked up with any structured data – it’s entirely up to Google to identify the most relevant sub-page. However, there are several methods (in addition to structured data markup) that can help search engine crawlers to better read a page’s content – and these haven’t changed for years:
- Write for users, not search engine robots (do not stuff keywords)
- Write in natural language, use complete sentences and use synonyms of keywords
- Respond to users’ questions – prepare content in a way that allows you to easily identify the answer
- Structure your content so that search engine robots can interpret it more easily. Use formatting, tables and lists.
- Find words semantically linked to the main theme of the content. I presented a good example earlier in the article – the question “who plays in stranger things” returned the answer “Stranger Things > Cast”
There is no doubt that the way users search the internet is evolving towards more natural language and direct conversations with devices. Simply using key phrases in content is no longer enough to optimise a site for semantic search. Increase your chances of getting in front of the user through the use of structured data, natural language, semantic keywords and a response-oriented content structure. Do not be left behind.
This article was initially published in Online Marketing Magazine