Also known as “data clustering” or “data partitioning”, clustering is an organization strategy that consists of partitioning a heterogeneous data set into homogeneous subsets. Each subset is obtained by grouping elements that share common characteristics. In SEO, it is a strategy known as Topic Clustering, which recommends grouping the contents of a site by theme.
Recently introduced in content marketing as a new SEO approach, clustering is also used in other disciplines because of its effectiveness.
Like other SEO strategies, clustering aims to help sites produce quality content and rank better, with its own little twist that makes all the difference.
But in concrete terms:
- What does “clustering” mean?
- In which contexts can it be used?
- What are its advantages and limitations?
- How can you set up a successful data clustering strategy?
We cover all of this in this mini-guide dedicated to the term.
Chapter 1: What is clustering?
Let’s start our guide with a complete definition:
1.1. Clustering – general definition
The term clustering designates a statistical data analysis technique. It is used to organize a set of raw data into small homogeneous groups.
Each resulting subset groups data that share common characteristics.
Generally, clustering relies on algorithms that use proximity criteria to distribute the data.
For a balanced distribution of the data, the algorithms will notably:
- Maximize the inertia between the subsets;
- And minimize the inertia within the subsets.
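To make these two criteria concrete, here is a minimal Python sketch that computes both inertias for a given partition. The function names (`centroid`, `inertias`) and the sample points are illustrative assumptions, not taken from any particular library, and inertia is understood here as a sum of squared Euclidean distances.

```python
from statistics import fmean


def centroid(points):
    """Mean point of a list of equal-length coordinate tuples."""
    return tuple(fmean(axis) for axis in zip(*points))


def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def inertias(clusters):
    """Return (within, between) inertia for a list of clusters.

    within  = squared distances from each point to its own cluster centroid
    between = cluster size * squared distance from the cluster centroid
              to the centroid of the whole data set
    """
    all_points = [point for cluster in clusters for point in cluster]
    global_center = centroid(all_points)
    within = 0.0
    between = 0.0
    for cluster in clusters:
        center = centroid(cluster)
        within += sum(squared_distance(point, center) for point in cluster)
        between += len(cluster) * squared_distance(center, global_center)
    return within, between


# Two ways of splitting the same four (made-up) points into two groups.
good = [[(0, 0), (0, 2)], [(10, 0), (10, 2)]]
bad = [[(0, 0), (10, 0)], [(0, 2), (10, 2)]]
```

On these four points, the first partition has a within inertia of 4 and a between inertia of 100, while the second has the reverse, so an algorithm following the two criteria above would prefer the first.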
Also known as “data clustering” or “data partitioning”, clustering can be used to organize data into a hierarchy or group it, in many fields.
1.2. Clustering in computer science
In computing, the term essentially has two uses:
- Clustering as a server cluster;
- Clustering for data storage on a personal computer (PC).
1.2.1. Clustering or “server clustering”
The best way to explain this expression is to start with a tangible example. Let’s say you have a dating web application hosted on a single server.
At the beginning, your application works well and your visitors connect without any problem. Navigation is smooth and fast, and users exchange messages within seconds.
Then, gradually, your application becomes successful. The number of visitors grows and the server processes more and more requests.
Consequence: the response time increases considerably and the server takes longer and longer to respond.
Under the weight of the ever-increasing number of requests, the server finally gives in and your dating application becomes inaccessible.
You then decide to relaunch your application, but this time with a solution that ensures:
- Availability: your application will be accessible 24 hours a day, every day of the week;
- Scalability: your application must be able to scale and support a growing number of visitors.
The solution you need in such circumstances is a cluster, or “server cluster” in computer jargon.
This is a set of servers that work together to serve a web application more efficiently.
On the one hand, if one of the servers in the cluster stops working, another one automatically takes over and continues processing the user’s request without the user noticing anything.
On the other hand, the cluster increases the processing capacity of the application. Even if the number of visitors grows, it is enough to add more servers to the cluster to handle the incoming requests.
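The two properties described above, failover and the ability to add servers, can be illustrated with a toy Python sketch. The `ServerCluster` class, its method names and the round-robin routing are assumptions made for this example. A real cluster relies on a dedicated load balancer and health checks rather than on code like this.

```python
class ServerCluster:
    """Toy model of a server cluster with round-robin routing and failover."""

    def __init__(self, names):
        self._order = list(names)
        self.healthy = {name: True for name in names}
        self._next = 0  # index of the next server to try

    def add_server(self, name):
        """Scalability: a new server joins the rotation."""
        self._order.append(name)
        self.healthy[name] = True

    def mark_down(self, name):
        """Record that a server has stopped responding."""
        self.healthy[name] = False

    def route(self, request):
        """Return the name of the server that will handle the request.

        Servers are tried in turn; a server that is down is skipped,
        so another one takes over without the user noticing.
        """
        for _ in range(len(self._order)):
            name = self._order[self._next % len(self._order)]
            self._next += 1
            if self.healthy[name]:
                return name
        raise RuntimeError("no healthy server available")
```

With two servers `a` and `b`, requests alternate between them. If `a` is marked as down, every request goes to `b`, and adding a server `c` spreads the load again.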
1.2.2. A storage cluster for PCs
In terms of data storage on a computer, a cluster is a unit of file storage. For each file saved on your computer’s hard disk, one or more clusters are used.
For a single file, the clusters used can occupy several locations on the hard disk. An average user who reads a file does not necessarily realize that its data is split across several clusters.
However, this is what happens under the hood, and the location of each cluster used can be found through the File Allocation Table (FAT) of the hard disk.
The cluster is a logical unit and not a physical one, i.e. it has no tangible form. It is not locked somewhere in your hard disk, nor in any other physical component of the computer. It is handled by the operating system.
This explains why the size of a cluster is not fixed in advance. It can vary, and the number of clusters a hard disk can support depends only on the size of the FAT.
Initially, under the DOS 4.0 operating system, a FAT was only 16 bits wide and could reference at most 65,536 clusters.
Since Windows 95 OSR2, this size has increased considerably: a 32-bit FAT can address up to 2 terabytes of data, provided the hard disk has enough capacity of course.
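The figures above come down to simple arithmetic, sketched here in Python. The helper name `clusters_needed` and the 4,096-byte cluster size are assumptions chosen for illustration. The 2-terabyte figure is derived here from a 32-bit sector count with 512-byte sectors, which is the usual explanation for that limit but is not stated in this guide.

```python
def clusters_needed(file_size_bytes, cluster_size_bytes):
    """Number of whole clusters a file occupies (rounded up)."""
    return -(-file_size_bytes // cluster_size_bytes)  # ceiling division


# A 16-bit FAT entry can take 2**16 distinct values.
FAT16_MAX_ENTRIES = 2 ** 16

# Assumption: 32-bit sector count, 512-byte sectors.
FAT32_MAX_BYTES = 2 ** 32 * 512

# A 10,000-byte file with (assumed) 4,096-byte clusters.
example = clusters_needed(10_000, 4096)
```

Here `example` is 3: the file fills two clusters and part of a third, and the unused remainder of the last cluster cannot be given to another file.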
Today, clustering is mainly used to organize a database into a hierarchy or to partition it. Scientists, for example, use this technique to organize their data and run advanced calculations.
They can use up to 5 different types of clusters to process their data. In satellite imagery, for example, the data in each image (forests, ocean, city, etc.) is compressed and organized into several clusters to reduce file size.
1.3. Clustering and search engines
Clustering is also used in online search. Search engines use it in two contexts:
- To divide websites into different clusters, i.e. into different themes: religion, finance, politics, education, sports, etc.;
- To control the number of times the same site can appear in the search results pages.
1.3.1. Clustering to distribute information by theme
The first use of clustering is to sort information (sites, videos, pages, images, etc.) by theme. This distribution is flexible.
A web page, for example, can belong to several clusters at the same time. The idea is simply to help search engines understand the theme covered by each site in order to answer users’ queries without ambiguity.
The value of clustering becomes clear when we consider that some words have several meanings. For example, the term “orange” can designate a fruit, a color or a brand.
The same goes for “python”, which can be a reptile or a programming language, or “jaguar”, which designates an animal or a car brand.
Thanks to clustering, all the pages that deal with the programming language “python” are grouped in a specific cluster so that they are not mixed up with the results of a query about the animal “python”.
All search engines, Google included, use clustering algorithms to sort sites by topic.
Some search engines (Vivisimo, MSN, Clusty…) even went so far as to let Internet users run their searches within the cluster of their choice.
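As a rough illustration of the “python” example, the Python sketch below assigns a text to one or more themes based on the words around the ambiguous term. The vocabulary lists and the `assign_clusters` function are invented for this example. Real search engines use far more sophisticated methods, which are not described in this guide.

```python
import re

# Hypothetical vocabularies: a few words that hint at each theme.
TOPIC_VOCABULARY = {
    "programming": {"code", "language", "function", "library", "script"},
    "animals": {"snake", "reptile", "species", "habitat", "prey"},
}


def assign_clusters(text, vocabulary=TOPIC_VOCABULARY, min_hits=1):
    """Return every theme whose vocabulary appears in the text.

    A text can land in several clusters at once, or in none.
    """
    words = set(re.findall(r"[a-z]+", text.lower()))
    return sorted(
        topic
        for topic, terms in vocabulary.items()
        if len(words & terms) >= min_hits
    )


page_a = "Python is a language with a large standard library."
page_b = "The python is a large snake that squeezes its prey."
page_c = "A Python script that classifies snake species."
```

Under these assumptions, `page_a` falls in the programming cluster, `page_b` in the animals cluster, and `page_c` in both, which matches the idea that a page can belong to several clusters at the same time.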
1.4. Clustering and SEO
The thematic cluster, known as the Topic Cluster, is a strategy that recommends grouping the contents of a site by theme.
For those who are not familiar with it, it is a new approach built on the concept of pillar pages, with the principle of connecting your content on the same theme through hyperlinks.
The semantic cocoon is simply a reworked version of the topic cluster.
The main benefit of clustering is to lead Internet users as well as search engines to discover the depth of your pages and to reach all the pages of a theme in a few clicks.
In other words, instead of targeting a generic keyword in a single article, we start from a general subject that stays at the center and leads to other content.
It is a very popular strategy that gives web editors a relevant and efficient architecture for creating their content. With the thematic cluster, the content is well organized, which makes it easier for visitors to find information on the site.
By adopting this approach on your blog or website, you offer your audience a centralized content bank that addresses all aspects of the original topic by partitioning it into small autonomous sections.
In the rest of this mini-guide, we will focus on the cluster in the context of SEO, i.e. the thematic cluster.
Chapter 2: Thematic Cluster – Strategy Composition, Advantages and Disadvantages
Let’s start this chapter by comparing the topic cluster with the classic blog model:
2.1. Topic Cluster and Classic Structure: What is the difference?
To appreciate the full value of the topic cluster, it is useful to compare it to the traditional structure often seen on blogs. This is how most blogs arrange their content.
In the image, each circle represents a blog post written to rank the site for a specific keyword. Each icon inside a circle (suitcase, bed, plane…) designates the general topic developed in that article.
It can be hotels for a stay, reservations, places to visit, restaurants to discover, etc.
Overall, while the keywords are addressed in the articles, the blog itself has no general theme, nor any organizational structure for the different topics addressed.
The result: a well-run blog with a good amount of content, but no organizational architecture for it.
It would probably be difficult for users to find the information they need in such a configuration.
Worse, leaving articles in bulk like this could harm the site’s organic ranking. Pages of the same site that target similar keywords tend to compete with each other for a good ranking on Google.
The thematic cluster addresses this problem in two ways:
- Organizing the information to make it more accessible, both for Internet users and for search engines;
- Answering users’ questions, including conversational ones asked through voice assistants (Siri, Amazon Alexa, Google Assistant…).
So, what does the architecture proposed by the thematic cluster look like?
Here’s a look at what your site or blog should look like if you use this strategy:
In contrast to the structural disorder found on classic blogs, the topic cluster makes it easier for web marketers to organize their content. A user looking for a specific piece of information can now find it quickly.
The same goes for search engines, which are becoming increasingly intelligent thanks to AI and understand human language better. They are now able to understand an architecture based on a subject and not necessarily on a keyword.
2.2. The composition of a thematic cluster strategy
Before showing you how to build a thematic cluster strategy, it is important to study its composition, which has three main parts:
- The pillar page: this is the basic page that covers a theme as a whole. It contains hyperlinks directing the reader to other content on the same site that covers different aspects of the theme in detail.
To make the pillar page more effective, it is recommended to link to it directly from the site’s menu or home page.
- Cluster articles (or cluster content): these articles develop a specific aspect of the original theme, that of the pillar page. All cluster or satellite articles must be linked to the pillar page.
- Hyperlinks: these are the internal links that connect the cluster articles to each other and to the pillar page.
In short, the thematic cluster is composed of a mother page (pillar page), several daughter pages (cluster articles) and hyperlinks that tie everything together.
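For readers who think in terms of data structures, the three parts can be sketched in Python as follows. The `Page` and `TopicCluster` classes, their fields and the example URLs are hypothetical and only serve to show how the pillar page, the cluster articles and the links relate to each other.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    url: str
    title: str
    links: set = field(default_factory=set)  # URLs this page links to


@dataclass
class TopicCluster:
    pillar: Page
    articles: list = field(default_factory=list)

    def add_article(self, article):
        """Attach a cluster article and create the two-way internal links."""
        self.pillar.links.add(article.url)   # pillar page -> article
        article.links.add(self.pillar.url)   # article -> pillar page
        self.articles.append(article)


# Hypothetical URLs for an "SEO" pillar page and two cluster articles.
cluster = TopicCluster(pillar=Page("/seo", "SEO"))
cluster.add_article(Page("/seo/techniques", "What are the best SEO techniques?"))
cluster.add_article(Page("/seo/agency", "How to recognize a good SEO agency?"))
```

Storing the links on each page keeps the mother/daughter relationship explicit: every article added through `add_article` is reachable from the pillar page and links back to it.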
Now let’s look at the composition of the thematic cluster in more detail.
2.2.1. The pillar page and its content
The pillar page can be seen as the heart of each important theme of your business sector and, at the same time, as a reference for your visitors.
It is the starting point where all the aspects of the theme come together. Readers can go deeper into each of these aspects through blog posts, which they reach in one click from the pillar page.
In order to cover all aspects of a topic, the content of the pillar page should be long enough, around 3,000 words or more.
With such a length, the writer can more easily address every facet of the topic and answer users’ questions, without giving the impression of making a list or going into too much detail.
This format is ideal for Internet users who are not knowledgeable about a subject but want a global understanding of it.
At Twaino, for example, one of our pillar pages could be a page dedicated to SEO, since it is our specialty and a service we offer to our clients.
2.2.2. The different cluster articles of the theme
While the pillar page deals with a theme as a whole, each cluster article focuses exclusively on a specific keyword of this theme in order to produce a complete, in-depth post.
Moreover, just as the pillar page contains links to the cluster articles, the cluster articles contain links that let the reader return to the pillar page at any time.
So if I take “SEO” as one of my main topics, my cluster articles could be about:
- What are the best SEO techniques?
- What are the essential tools for a successful SEO campaign?
- What is the difference between SEO and SEA?
- How to recognize a good SEO agency?
Ideally, a pillar page can have up to ten cluster articles in order to fully address a topic and answer the majority of users’ questions.
If you lack content ideas, you can consult my article on 21 keyword research tools.
2.2.3. The internal link
You now have a clearly defined main topic, with blog posts that go into more detail on each aspect of the topic.
The last part of the thematic cluster consists of linking the pillar page to the satellite articles you have written. This is called internal linking, and it is made possible by hyperlinks.
To encourage readers to discover these cluster articles from the pillar page, it is important to include the main keyword of the topic in the anchor text, as in this example: “How to choose your SEO agency”.
2.3. The advantages of the thematic cluster
The thematic cluster strategy has several advantages, which we present below.
2.3.1. The topic cluster allows better keyword management
This strategy allows you to make the most of your keywords, generic as well as long tail.
As a reminder, a keyword is called “long tail” when it is composed of at least 3 to 4 words. These keywords are generally searched less often by Internet users.
In contrast, “short tail” keywords are composed of a maximum of 2 words and are searched heavily by Internet users.
Throughout the clustering process, you will need to organize the use of these keywords.
All the keywords will be gathered in the pillar page, while the long tail keywords will be targeted in the cluster articles.
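The word-count rule of thumb given above can be turned into a small Python helper for sorting a keyword list. The function names and the sample keywords are illustrative, and the threshold of 3 words is simply the lower bound mentioned in this section, not an industry standard.

```python
def keyword_tail(keyword):
    """Classify a keyword by its number of words (3 or more = long tail)."""
    word_count = len(keyword.split())
    return "long tail" if word_count >= 3 else "short tail"


def group_by_tail(keywords):
    """Split a keyword list into short tail and long tail groups."""
    groups = {"short tail": [], "long tail": []}
    for keyword in keywords:
        groups[keyword_tail(keyword)].append(keyword)
    return groups


# Sample keywords, invented for the example.
plan = group_by_tail(["SEO", "SEO agency", "how to choose an SEO agency"])
```

In this sample, “SEO” and “SEO agency” end up in the short tail group, and “how to choose an SEO agency” in the long tail group, which makes it a candidate for a cluster article.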
2.3.2. The topic cluster improves your search ranking
Search ranking remains a priority of any SEO strategy, and the use of pillar pages associated with a cluster of well-detailed content is undoubtedly one of the best ways to improve the ranking of your site.
This is due to the semantic cocoon, which creates an internal mesh between your articles. Thanks to this mesh, Internet users can “travel” through your pages and browse all the information you provide on a theme in just 2 or 3 clicks.
This makes their sessions on your site last longer. Google will deduce that you offer relevant information to your visitors and will reward you by improving your ranking in its SERP.
Also remember that the thematic cluster helps you rank on conversational queries (voice searches), which currently represent 20% of all queries that Google receives.
2.3.3. The topic cluster allows you to reduce your bounce rate
Again thanks to the cocoon you create by applying the topic cluster, your visitors will easily discover the depths of your site without getting lost.
If the mesh is successful, visitors will always find an internal link, either to return to the pillar page or to continue reading and deepen their understanding of the topic.
This ease of navigation greatly reduces your bounce rate, since the bounce rate refers to the share of visitors who leave your page without interacting with its elements or consulting another page of the site.
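As a quick illustration, the bounce rate can be computed from a list of sessions, as in the Python sketch below. It assumes that a bounce is a session with a single page view, which simplifies the definition above because it ignores interactions within the page. The function name and the sample data are invented for the example.

```python
def bounce_rate(pages_per_session):
    """Share of sessions that viewed exactly one page (0.0 to 1.0)."""
    if not pages_per_session:
        return 0.0  # no sessions, so nothing to measure
    bounces = sum(1 for pages in pages_per_session if pages == 1)
    return bounces / len(pages_per_session)


# Four made-up sessions: two visitors left after a single page.
rate = bounce_rate([1, 3, 1, 2])
```

Here `rate` is 0.5, i.e. a bounce rate of 50%. Every visitor who follows an internal link to a second page lowers this figure.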
2.3.4. Improve the experience of your users
The topic cluster strategy improves the user experience of your site in that it allows your audience to:
- Start from a query, find one of your pieces of content and then come across a whole bank of content related to their initial query;
- Be sufficiently informed and, at the same time, find the right solution to their problem thanks to the products and services you offer. Your visitors are satisfied and both parties win.
Note that you will benefit from all these advantages if you already have a lot of content on your site. You will be able to organize it better, offer a good experience to your users and obtain a good ranking in return.
2.4. The disadvantages of the topic cluster
Despite all these advantages, clustering still has some disadvantages worth mentioning. Although it is a beneficial strategy, the big problem is time.
Setting up a topic cluster requires quite a bit of time and know-how. Creating long, well-written and well-illustrated content is time-consuming, especially if you don’t have any content on your site yet.
For those who already have a blog with a lot of content, reorganizing it into clusters will certainly take less time than starting from scratch.
But when you remember the advantages of the topic cluster, you realize that the effort is well worth it.
Chapter 3: How to set up a topic cluster strategy?
We’ve gone through the composition, the advantages and the disadvantages of the topic cluster. In this penultimate chapter, we will discuss the steps to set it up so that you can try it too.
3.1. Find a theme and sub-topics
This is the first step in the clustering process: selecting a relevant topic that is in line with your activities and of interest to your readers.
Ideally, choose a short tail keyword for the topic. In our previous example, it was the keyword “SEO”.
In your context, here are some questions you can try to answer to find a relevant topic:
- What are the challenges my target audience faces?
- Which of my services does my target audience need to solve their problems?
- On which topics would I like to be a reference on the Internet?
- And so on.
Once you have chosen your main topic, you can move on to finding the sub-topics that will revolve around it. These are in fact your blog posts.
For this satellite content, favor long tail keywords. To find them, I invite you to consult my complete guide on keyword research.
You’ll find out how to get keyword ideas with tools like:
- Google Search Console
- Google Trends
- Answer The Public
Once your keyword list is ready, the creation of the topic cluster itself can begin.
3.2. Creating the pillar page and optimizing its content
Here, we choose the format of our pillar page. There are two main formats:
- The “How to…” pillar page: it details the entire process of the chosen theme. It is a sort of complete guide that allows the reader to understand the theme in full.
- The “Summary” or “Resources” pillar page: this is a brief summary built from a corpus of cluster articles. It is arguably the fastest way to create a pillar page.
3.3. Write your cluster articles (satellite)
At this stage, you have a theme to develop, the sub-topics and a pillar page format of your choice. Now it’s time to audit your existing content.
Go through your blog and find all the articles that address a specific aspect of the main theme. This step can make you more productive.
However, if you don’t have any written content yet, this is the time to get out your best pen and start writing. The goal is to produce quality, web-optimized content.
If you don’t know how to go about producing content, you’ll find all the steps in this guide dedicated entirely to web writing.
3.4. Link your content
Here we are at the last step of the thematic cluster creation process. It consists of linking all the written content together, without forgetting, of course, the core: the pillar page.
This is probably the most important step of all, because this is where you organize your content to make it easier for Internet users and search engine bots to find information.
But in concrete terms, how does all this work? As explained above, everything is based on hyperlinks. Remember how I invited you earlier to discover my complete guides on web writing and keyword research: those invitations are exactly the kind of place where an internal link belongs.
However, do not force the insertion of internal links. Google could interpret it as keyword stuffing and penalize you. Moreover, if you insert links every 2 sentences, even your readers will lose their way.
So be natural and insert links only in relevant places, where they are really necessary for the reader’s understanding.
In addition to this article, you can consult my complete guide on the semantic cocoon, which will provide you with a much more detailed linking process.
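Once the links are in place, a small script can check that the mesh is complete. The Python sketch below assumes you already have, for each page, the set of internal URLs it links to (for example from a crawl). The `audit_cluster` function, the `site` dictionary and its URLs are all hypothetical.

```python
def audit_cluster(pillar, links):
    """Find gaps in the internal linking of one topic cluster.

    `links` maps each page URL to the set of URLs it links to.
    Returns two sorted lists:
      - articles the pillar page does not link to;
      - articles that do not link back to the pillar page.
    """
    articles = [url for url in links if url != pillar]
    pillar_links = links.get(pillar, set())
    not_linked_from_pillar = sorted(a for a in articles if a not in pillar_links)
    no_link_back = sorted(a for a in articles if pillar not in links[a])
    return not_linked_from_pillar, no_link_back


# Made-up link map for an "SEO" cluster.
site = {
    "/seo": {"/seo/techniques", "/seo/tools"},
    "/seo/techniques": {"/seo", "/seo/tools"},
    "/seo/tools": {"/seo/techniques"},  # forgets to link back to the pillar
    "/seo/agency": {"/seo"},            # the pillar does not link to it
}
missing_from_pillar, missing_back_link = audit_cluster("/seo", site)
```

In this example, the audit reports that the pillar page does not link to `/seo/agency` and that `/seo/tools` has no link back to the pillar page, two gaps to fix by hand and only where a link reads naturally.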
Chapter 4: Other questions about clustering
4.1. What is clustering for?
Clustering (sometimes called cluster analysis) is generally used to classify data into structures that are easier to understand and manipulate.
4.2. What are the different types of clustering?
The different types of clustering are:
- Connectivity-based clustering (hierarchical clustering);
- Clustering based on centroids (partitioning methods);
- Distribution-based clustering (model-based methods);
- Density-based clustering;
- Fuzzy Clustering;
- Constraint-based (supervised clustering).
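To give an idea of how a centroid-based method works, here is a minimal Python sketch of the classic k-means iteration on one-dimensional data. It is a teaching example under simplifying assumptions (starting centroids supplied by the caller, plain Python lists), not a production implementation. The function name `k_means_1d` is made up for this guide.

```python
def k_means_1d(values, centroids, max_iterations=100):
    """Group numbers around centroids by repeating two steps:

    1. assign each value to its nearest centroid;
    2. move each centroid to the mean of the values assigned to it.
    The loop stops when the centroids no longer move.
    """
    centroids = list(centroids)
    groups = [[] for _ in centroids]
    for _ in range(max_iterations):
        groups = [[] for _ in centroids]
        for value in values:
            nearest = min(
                range(len(centroids)),
                key=lambda i: abs(value - centroids[i]),
            )
            groups[nearest].append(value)
        # An empty group keeps its previous centroid.
        new_centroids = [
            sum(group) / len(group) if group else old
            for group, old in zip(groups, centroids)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, groups


final_centroids, final_groups = k_means_1d([1, 2, 3, 10, 11, 12], [0, 5])
```

Starting from the centroids 0 and 5, the sketch settles on the groups `[1, 2, 3]` and `[10, 11, 12]`, with centroids 2.0 and 11.0.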
4.3. What is a cluster page?
A cluster page tends to focus more closely on a specific user intent. This approach is not just about “finding a long tail keyword and writing about it”. It’s about covering related topics in more detail.
4.4. How to create a content cluster or topic cluster?
Here are 10 main steps to follow:
- First understand the essential parts of a topic cluster;
- Perform a content audit;
- Identify the main themes and sub-themes;
- Strategize your subtopics;
- Conduct your keyword research;
- Link your existing content into your topic cluster;
- Identify content gaps;
- Create a content creation strategy to fill the gaps;
- Optimize content for humans and search engines;
- Monitor your results.
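The “identify content gaps” step, for instance, boils down to comparing the sub-topics you planned with the articles you already have. The Python sketch below shows one naive way to do it, by exact title match after lowercasing. The function name and the sample titles are assumptions, and a real audit would need fuzzier matching.

```python
def content_gaps(planned_subtopics, existing_titles):
    """Return the planned sub-topics that no existing article covers.

    Matching is a simple case-insensitive comparison of titles.
    """
    covered = {title.strip().lower() for title in existing_titles}
    return [
        topic
        for topic in planned_subtopics
        if topic.strip().lower() not in covered
    ]


# Made-up plan and inventory for an "SEO" cluster.
planned = [
    "What are the best SEO techniques?",
    "What is the difference between SEO and SEA?",
    "How to recognize a good SEO agency?",
]
existing = ["what are the best seo techniques?"]
gaps = content_gaps(planned, existing)
```

With this data, `gaps` contains the two sub-topics still to be written, which feeds directly into the next step of the list: a content creation strategy to fill the gaps.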
4.5. What are the three main parts of a topic cluster?
The topic cluster usually consists of 3 parts:
- Pillar article;
- Content clusters;
- Hyperlinks.
4.6. What is the Pillar Article?
The first step in creating your content cluster is to identify what you want your blog or brand to be known for. This would serve as the main objective of your topic cluster.
The topic should be general, as it will be broken down into different sub-topics later on. The pillar article or pillar page should be a thorough and complete discussion of your chosen topic, so it is usually long. Use links to pages that relate to the topic to keep your article organized.
4.7. What are Content Clusters?
Content clusters are sub-topics of the pillar article and are usually “shorter” blog posts and other types of content.
Each of these content clusters addresses a particular subsection of the pillar article in more detail. They also always link back to the pillar article.
4.8. What are Hyperlinks?
Hyperlinks are essential in topic clusters because they link the pillar article to the content clusters and vice versa, making it easier for visitors to navigate your website. This encourages your visitors to explore your site further.
In summary, clustering is a grouping technique used to organize data in many fields. In content marketing, the clustering strategy is revolutionizing the traditional blog architecture.
With this strategy, you can organize your content for better search engine indexing and a good experience for your users.
Although it is a technique that requires time and know-how, it can considerably improve your site’s SEO.
See you soon!