How to Clean a Dirty XML Sitemap Using WordPress


A clean sitemap is a vital part of a strong SEO strategy for your website. Bing has said it has a low tolerance for dirty sitemaps, while Google has stated it is more lenient on the issue. I believe it is still essential to have a clean sitemap nonetheless. A clean sitemap also cuts the number of unnecessary steps search bots have to take when they crawl it. Assuming you know how to crawl your sitemap for errors, you are ready for the next step if you use WordPress.


Screaming Frog Results + WordPress


Here is an example of a Screaming Frog crawl that I ran on my site. I sorted the view to show the three 301 redirects in my sitemap. Looking at the URLs, I can see which blog posts need my attention to fix this issue.


301 Redirects Screaming Frog
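
If you would rather script this check than read it off a crawl report, the same idea can be sketched in Python: pull every URL out of the sitemap, ask for each one's status code without following redirects, and keep anything that is not a 200. This is a rough sketch, not what Screaming Frog does internally. The function names (sitemap_urls, head_status, dirty_entries, find_dirty) are my own, and it assumes a plain urlset sitemap rather than a sitemap index.

```python
import http.client
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

# Namespace used by standard XML sitemaps.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return every <loc> value found in a sitemap XML string."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.findall(".//sm:loc", SITEMAP_NS) if loc.text]


def head_status(url, timeout=10):
    """Send a HEAD request and return the raw status code.

    http.client does not follow redirects, so a 301 is reported as 301.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        return conn.getresponse().status
    finally:
        conn.close()


def dirty_entries(status_by_url):
    """Keep only the URLs whose status code is not 200 (redirects, errors)."""
    return {url: code for url, code in status_by_url.items() if code != 200}


def find_dirty(xml_text, status_fn=head_status):
    """Check every sitemap URL and return the ones that need attention."""
    return dirty_entries({url: status_fn(url) for url in sitemap_urls(xml_text)})
```

Calling find_dirty with the contents of your sitemap file returns a dictionary of URL to status code for every entry that is not a clean 200, which is the list of posts to look at next.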


Assuming you are using Yoast, head over to the SEO area and look for the XML sitemap section. From there, look for the exclude posts option, where you can specify which posts to leave out of the sitemap. The picture below shows what you should be seeing. The posts exclude section is where you list the posts you want to exclude.


How to Exclude Posts from your Sitemap


How Do I Find the Post ID in WordPress?


Now that we know where to add the exclusion, we need to find the ID number of the post that we want to remove. To do this, head to the Posts section of your site and open the blog post that you want to remove from your sitemap. If you click Edit, the URL at the top of the screen contains the post ID. Here is an example of the ID number associated with this blog post. If you have a lot of posts to remove, I recommend collecting the IDs in Excel so you can copy and paste them all at the end.


How to Find the Post ID in WordPress
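
If you have a long list of posts, a few lines of Python can pull the IDs out of the edit URLs for you. This is only a sketch: it relies on the WordPress edit URL carrying the ID in its post query parameter (for example post.php?post=123&action=edit), and it assumes the exclusion field accepts a comma-separated list of IDs. The function names and the example.com URLs are made up for illustration.

```python
from urllib.parse import parse_qs, urlsplit


def post_id_from_edit_url(edit_url):
    """Return the numeric post ID from a WordPress edit URL, or None if absent."""
    query = parse_qs(urlsplit(edit_url).query)
    values = query.get("post", [])
    if values and values[0].isdigit():
        return int(values[0])
    return None


def exclusion_list(edit_urls):
    """Build a comma-separated ID string, skipping duplicates and bad URLs.

    Assumption: the exclusion field takes IDs separated by commas.
    """
    ids = []
    for url in edit_urls:
        post_id = post_id_from_edit_url(url)
        if post_id is not None and post_id not in ids:
            ids.append(post_id)
    return ",".join(str(post_id) for post_id in ids)


# Hypothetical edit URLs copied from the browser address bar.
urls = [
    "https://example.com/wp-admin/post.php?post=123&action=edit",
    "https://example.com/wp-admin/post.php?post=456&action=edit",
]
print(exclusion_list(urls))
```

The printed string is what you would paste into the exclusion field in place of typing each ID by hand.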


Now we head over to the Yoast XML section, and we paste in that number to exclude it from the sitemap. Once that ID number is in place in Yoast, we can head back to Screaming Frog and re-run the report.


One of the best things about Screaming Frog is that you can run this report with the free version. If you upload the sitemap as a file, you can go past the 500 URLs that the free version limits you to. You can also enter the sitemap URL to run the report.


Why Do Some URLs Never Get Indexed in Google and Bing?


There are many reasons why some of your key pages don’t receive any organic traffic. Assuming you have covered the basics, like title tags, you might want to take a deeper dive with tools like Google Search Console and Bing Webmaster Tools. Look at the sitemap report in Google Search Console to see the ratio of submitted pages to indexed pages. You want that ratio to be as close to 1:1 as possible. Anything that looks far off usually needs a more in-depth look to figure out why the pages are not indexed in Google. If you want to run a comprehensive analysis, check out how to use log files for SEO, where I go over how to see what search bots see when they visit your website.

Here is an example of what Google Search Console says about my sitemap. As you can see here, I am very close to the 1:1 ratio, so I know that what I am presenting is being properly indexed. This ratio is the ideal state that you want to strive for in SEO.



Clean Sitemap
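
To put a number on "close to 1:1", you can divide the indexed count by the submitted count. The sketch below uses made-up figures rather than the numbers from my own report, and the function name is just for illustration.

```python
def indexed_ratio(submitted, indexed):
    """Return indexed pages as a fraction of submitted pages (1.0 means 1:1)."""
    if submitted <= 0:
        raise ValueError("submitted must be a positive count")
    return indexed / submitted


# Hypothetical counts read off a sitemap report.
ratio = indexed_ratio(submitted=50, indexed=48)
print(f"{ratio:.0%} of submitted pages are indexed")
```

The further that percentage falls below 100, the more of your submitted URLs deserve a closer look.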


Another reason why individual pages might not be indexed is an improper sitemap file. You should be using an XML sitemap file when you submit it to the search engines. You might also have rules in your robots.txt file that explicitly block individual pages from being seen by their bots, or NOINDEX tags on your site that prevent these pages from being indexed. Either way, you now have a few things to look at when you are reviewing your site.
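
Both of those checks can be roughed out with Python's standard library. The sketch below tests a URL against the text of a robots.txt file and looks for a noindex value in a page's robots meta tag. The function names and sample inputs are my own. The standard-library robots parser does simple path-prefix matching, so it may not agree with how Google or Bing treat wildcard rules, and the sketch does not look at the X-Robots-Tag HTTP header.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


def blocked_by_robots(robots_txt, url, user_agent="Googlebot"):
    """Return True if the robots.txt text disallows this URL for the user agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


class _RobotsMetaFinder(HTMLParser):
    """Flags pages whose robots meta tag contains 'noindex'."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        # Generic and bot-specific meta names are treated the same way here.
        if attr.get("name", "").lower() in ("robots", "googlebot", "bingbot"):
            if "noindex" in attr.get("content", "").lower():
                self.noindex = True


def has_noindex(html):
    """Return True if the HTML contains a robots meta tag with noindex."""
    finder = _RobotsMetaFinder()
    finder.feed(html)
    return finder.noindex
```

Any sitemap URL that comes back True from either function is sending search engines mixed signals: the sitemap asks for it to be indexed while the site itself says not to.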


If you are looking for help, I offer NY SEO Expert services along with an SEO audit that can help get you back on track!