What Is TF-IDF and Why Is It Important for SEO

TF IDF Report from Moz

SEMRush is a central SEO tool that I use to optimize my website on a daily basis. While working through its On Page SEO Checker report, I came across a keyword optimization section on TF-IDF. TF-IDF is not new in 2018, but I feel like it’s getting more attention through third-party SEO tools. In a nutshell, TF-IDF stands for term frequency-inverse document frequency: it weighs how often you use a term in a post against how common that term is across other documents, and SEO tools use it to suggest semantic variations to add. In this blog post, I’ll share what this strategy is along with the success I’m seeing from the optimizations.


The picture in this blog post is from Moz.


What Does TF-IDF in SEO Stand For?


The TF stands for term frequency, which means the number of times you use a target keyword in a piece of content. The IDF stands for inverse document frequency, which measures how rare that keyword is across a wider set of documents. The full name of the formula is “term frequency times inverse document frequency.” If this definition is confusing, I’ll explain it with more clarity below.
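To make “term frequency times inverse document frequency” concrete, here is a minimal Python sketch of the textbook version of the formula. It is only an illustration: the three-document corpus is made up, the tokenizer just splits on whitespace, and the function names are my own. SEMRush and the search engines use their own weighting schemes, which this does not try to reproduce.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and split on whitespace (deliberately naive)."""
    return text.lower().split()


def term_frequency(term, document):
    """Share of the words in `document` that are exactly `term`."""
    words = tokenize(document)
    return Counter(words)[term] / len(words)


def inverse_document_frequency(term, documents):
    """log(total documents / documents containing the term)."""
    containing = sum(1 for doc in documents if term in tokenize(doc))
    if containing == 0:
        return 0.0  # term never appears, so there is nothing to weight
    return math.log(len(documents) / containing)


def tf_idf(term, document, documents):
    """Term frequency times inverse document frequency."""
    return term_frequency(term, document) * inverse_document_frequency(term, documents)


# Hypothetical mini-corpus, invented for this example only.
corpus = [
    "how to lose weight with a diet plan for men",
    "the best diet plan for men over 50",
    "how to train for a marathon",
]

for term in ("weight", "diet", "for"):
    print(term, round(tf_idf(term, corpus[0], corpus), 4))
```

In this toy corpus, “for” appears in every document, so its score is zero no matter how often the first post uses it, while “weight” appears in only one document and scores highest. That is the idea behind the formula: a word matters more when it is frequent in your post and uncommon elsewhere.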


Ignoring the acronym for a moment, you should think of this tactic as a two-prong approach to SEO marketing. The simplest way to digest this strategy is to start with how many times a word appears in the document or post you wrote. Let’s say that you are writing a blog post about how to lose weight. Your target keyword and title for the blog post is How to Lose Weight for Men Over 50.


You optimize your content to include that target term in the title tag, h1 tag, body content, and the ALT tag. To help this content rank even higher for that target phrase, you will need to work related terms into the material. Google and Bing are looking for more context around a subject because they want to rank the best content at the top of the results. The concept is similar to how Wikipedia seems to rank for everything. If I search for the Red Sox, I will see content around Boston, Fenway Park, the curse, players, and more. Even though my search was for the Red Sox, I’m given related topics that encompass the Red Sox.


TF-IDF Example Using SEMRush


With the Wikipedia example above in mind, I want to illustrate what my report looks like. The first thing I do is head over to the On Page SEO Checker within SEMRush and type in my target URL. I then put in a few keywords that I would like the page to rank for. SEMRush then crawls the page and uses its database to pull suggestions based on why others rank for my target terms.


Running an On Page SEO Report for Key Target Blog Post to find Semantic Ideas



When I click on the green box that says 8 ideas, I’m brought to a semantic SEO report. If you scroll all the way to the bottom of the page, you will see the TF-IDF report. Since my blog post is about earning rewards with Microsoft, I’m given a list of additional terms to use in my post. The TF-IDF report will even show me how often I use the target term compared to my rivals. My rivals are the pages that already rank in the top ten results in Google for the terms that I identified at the start of the report. If you are curious, the report goes into more detail on TF-IDF weighting when you hover over the columns. SEMRush’s weighting schemes and formulas are a little too wordy to cover in this post.
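The comparison against rivals can be sketched in Python as well. This is not how SEMRush builds its report; it is a simplified stand-in that counts raw word occurrences and flags terms I use less often than the rivals’ average. The page texts, the term list, and the function names are all hypothetical.

```python
from collections import Counter


def term_counts(text, terms):
    """Count how often each term appears as a whole word in the text."""
    words = Counter(text.lower().split())
    return {term: words[term] for term in terms}


def usage_gaps(my_text, rival_texts, terms):
    """Map each term I use less than the rival average to (my count, rival average)."""
    mine = term_counts(my_text, terms)
    rival_counts = [term_counts(text, terms) for text in rival_texts]
    gaps = {}
    for term in terms:
        rival_avg = sum(counts[term] for counts in rival_counts) / len(rival_counts)
        if mine[term] < rival_avg:
            gaps[term] = (mine[term], rival_avg)
    return gaps


# Hypothetical page copy, invented for this example only.
my_page = "earn microsoft rewards by searching with bing"
rival_pages = [
    "microsoft rewards points can be redeemed for gift cards",
    "earn points and redeem points for gift cards with microsoft rewards",
]

gaps = usage_gaps(my_page, rival_pages, ["rewards", "points", "cards"])
print(gaps)
```

Here “rewards” is not flagged because I already use it as often as the rivals do, while “points” and “cards” come back as candidates to work into the post.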


TF-IDF Report in SEMRush


When you use these keyword suggestions, you should understand that you won’t necessarily rank for these phrases. Instead, these additional keywords help amplify the target keywords that you entered at the start of the report. The Wikipedia example that I gave shows how a page can give the reader and search spiders a lot of context around a particular subject. Even if you set aside how powerful Wikipedia is as a site, you would be impressed as a user by how much info can be found on a topic. Search engines like Google and Bing are looking for that type of content to rank high in their results because the content will satisfy their searchers.


Results Using TF-IDF


Google Search Console Rebound via Organic Traffic and Clicks


Using Google Search Console, I can see that my blog post is starting to drive more impressions and clicks after making the changes. Impression growth is an early indicator that a strategy in SEO is beginning to bear fruit. In theory, you are ranking for more keywords, which means more opportunity for impression growth. Even more impressive, you may be moving keywords to page one in Google. Depending on the search volume for the given keyword, you can see tremendous growth in impressions and clicks.


Below are the results from Google Analytics. You can see that I’m now driving more organic clicks to this page after making the updates that SEMRush suggested. By combining Google Search Console’s impressions with Google Analytics, you can spot trends in impressions before they show up as clicks.


Growth in Google Analytics after using TF IDF Report in SEMRush


Additional Steps I Took and Resources


SEO takes a long time to show results. You may hear that weeks or months are needed for most optimizations to show up, which is true. However, I would like to point out that the post I worked on was already in Google and Bing’s index for over a year. I had natural backlinks pointing to this page and had this page in my sitemap file for a while. That meant that Google and Bing already had a memory of my content, so small updates should, in theory, take less time to show results.


To speed up the crawling of your page, you should use Google Search Console and Bing Webmaster Tools. Both tools have an option to alert their bots to crawl a page, so you should take advantage of this feature. If you are looking for help with this, I would check out my post on a 10-minute check for daily SEO work. You should also check out Moz’s Whiteboard Friday episode on refurbishing old content because they go into greater detail than I did.


Final Thoughts on TF-IDF for SEO


Content creation is a long and challenging process for online marketing. The research, editing, publishing, promoting, and more take up a considerable amount of time. My theory is that you should spend time trying to improve your organic performance as often as you can. I rerun the report in SEMRush often on target pages to see if I should use new words. SEO is constantly evolving, so you can’t stay stagnant in this field. One important element to stress is that content should not be judged solely on the total number of words on a page. There are a lot of studies showing that longer content ranks better, but it’s really about covering the topic in many ways. Just because a keyword occurs in a document a few times does not necessarily mean the page will rank. The frequency of the word in context with the rest of the content on the page is what helps the page rank in Google.


The process that I’ve outlined above is a standard optimization that I perform on TM Blast and client websites. Routinely looking for new ways to optimize a site should always be the focus of SEO services. TF-IDF (term frequency-inverse document frequency) is a fantastic way to drive more traffic to your site. Let me know if you have any questions!