Google Explains Why Index Coverage Report is Slow

Google clarified that the Search Console Index Coverage report does not show up-to-the-minute coverage data. For anyone who needs the most current confirmation of whether a URL is indexed, Google recommends using the URL Inspection tool.

Google Clarifies Index Coverage Report Data

A number of tweets pointed to what seemed like an error in the Index Coverage report that caused it to list a URL as crawled but not indexed.

It turns out that this isn’t a bug but rather a limitation of the Index Coverage report.

Google explained it in a series of tweets.

Reports of Search Console Report Bug

“A few Google Search Console users reported that they saw URLs in the Index Coverage report marked as “Crawled – currently not indexed” that, when inspected with the URL Inspection tool, were listed as “Submitted and indexed” or some other status.”

Google Explains the Index Coverage Report

Google then shared in a series of tweets how the Index Coverage report works.

“This is because the Index Coverage report data is refreshed at a different (and slower) rate than the URL Inspection.

The results shown in URL Inspection are more recent, and should be taken as authoritative when they conflict with the Index Coverage report. (2/4)

Data shown in Index Coverage should reflect the accurate status of a page within a few days, when the status changes. (3/4)

As always, thanks for the feedback, we’ll look for ways to decrease this discrepancy so our reports and tools are always aligned and fresh! (4/4)”
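The rule Google describes (trust URL Inspection when the two sources disagree) can be sketched in a few lines of Python. This is only an illustration, assuming the two statuses for each URL have already been collected by hand or from an export. The `UrlStatus` class and both function names are hypothetical and are not part of any Google tool or API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class UrlStatus:
    """Hypothetical record pairing the two statuses seen for one URL."""
    url: str
    coverage_report: Optional[str]  # status listed in the Index Coverage report
    url_inspection: Optional[str]   # status shown by the URL Inspection tool


def authoritative_status(status: UrlStatus) -> Optional[str]:
    """Prefer the URL Inspection result whenever one is available."""
    if status.url_inspection is not None:
        return status.url_inspection
    return status.coverage_report


def find_stale_report_rows(statuses: List[UrlStatus]) -> List[str]:
    """Return URLs where both statuses exist and disagree.

    Per Google's explanation, a disagreement most likely means the
    Index Coverage report has not been refreshed yet.
    """
    return [
        s.url
        for s in statuses
        if s.coverage_report is not None
        and s.url_inspection is not None
        and s.coverage_report != s.url_inspection
    ]
```

URLs returned by `find_stale_report_rows` are candidates to recheck in the Index Coverage report after a few days, rather than evidence of an indexing problem.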

John Mueller Answers Question About Index Coverage Report

Google’s John Mueller answered a question about this issue on October 8, 2021. That was before it was understood that the Index Coverage report had no error, and that the real cause was a gap between the data freshness users expected and the slower pace at which the report is actually refreshed.

The person asking the question related that in July 2021 they noticed that URLs submitted through Google Search Console reported the error of submitted but not indexed, even though the pages didn’t have a noindex tag.

Thereafter Google would return to the website, crawl the page and index it normally.

“The problem is we get 300 errors/no index and then on subsequent crawls only five get crawled before they re-crawl so many more.

So, given that that they are noindexed and granted if things can’t render or they can’t find the page, they’re directed to our page not found, which does have a no-index.

And so I know somehow they’re getting directed there.

Is this just a memory issue or since they’re subsequently crawled fine, is it just a…”

John Mueller answered:

“It’s hard to say without looking at the pages.

So I would really try to double-check if this was a problem then and is not a problem anymore or if it’s still something that kind of intermittently happens.
Because if it doesn’t matter, if it doesn’t kind of take place now anymore then like whatever…”

The person asking the question responded by insisting that it still takes place and that it continues to be an ongoing problem.

John Mueller responded by saying that his hunch is that something with the rendering might be going wrong.

“And if that’s something that still takes place, I would try to figure out what might be causing that.

And it might be that when you test the page in Search Console, nine times out of ten it works well. But kind of that one time out of ten when it doesn’t work well and redirects to the error page or we think it redirects to the error page.

That’s kind of the case I would try to drill down into and try to figure out is it that there are too many requests to render this page or there’s something complicated with the JavaScript that sometimes takes too long and sometimes works well and then try to narrow things down from that point of view.”
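One way to look for the "one time out of ten" failure Mueller describes is to repeat the same check many times and tally how often it fails. The sketch below is a rough illustration under stated assumptions: the function names and the `/page-not-found` path are hypothetical, and a plain HTTP fetch only catches server-side redirects, errors, and timeouts. It does not execute JavaScript, so it cannot reproduce rendering problems on Google's side.

```python
import urllib.request
from typing import Callable
from urllib.parse import urlparse


def lands_on_error_page(url: str,
                        error_path: str = "/page-not-found",  # assumed path
                        timeout: float = 10.0) -> bool:
    """Return True if the request fails or ends up on the error page."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            # geturl() reports the final URL after any HTTP redirects.
            return urlparse(response.geturl()).path == error_path
    except OSError:
        # Covers HTTP errors, connection failures, and timeouts.
        return True


def failure_rate(check: Callable[[], bool], attempts: int = 20) -> float:
    """Run a check repeatedly and return the fraction of failing runs."""
    failures = sum(1 for _ in range(attempts) if check())
    return failures / attempts
```

A call such as `failure_rate(lambda: lands_on_error_page("https://example.com/some-page"))` returns a value between 0 and 1. Anything consistently above zero suggests an intermittent problem worth drilling into.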

Mueller next explained how crawling and rendering work on Google’s side.

He refers to a “Chrome-type” browser, which may be a reference to Google’s headless Chrome bot, essentially a Chrome browser without the front-end user interface.

“What happens on our side is we crawl the HTML page and then we try to process the HTML page in kind of the Chromium kind of Chrome-type browser.

And for that we try to pull in all of the resources that are mentioned on there.

So if you go to the Developer Console in Chrome and you look at the network section, it shows you a waterfall diagram of everything that it loads to render the page.

And if there are lots of things that need to be loaded, then it can happen that things time out and then we might run into that error situation.”

Mueller next suggested reducing the number of resource requests made for JavaScript and CSS files by combining or trimming them, and minimizing images, which is always a good thing to do.
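To get a rough sense of how many resources a page asks a browser to load, the HTML can be scanned for external scripts, stylesheets, and images. The sketch below uses Python's standard `html.parser` module. The `ResourceCounter` class and `count_resources` function are illustrative names, and the count covers only resources referenced directly in the HTML, not ones that JavaScript requests later. The network waterfall in Chrome's developer tools remains the fuller picture.

```python
from collections import Counter
from html.parser import HTMLParser
from typing import Dict


class ResourceCounter(HTMLParser):
    """Tally external scripts, stylesheets, and images in an HTML document."""

    def __init__(self) -> None:
        super().__init__()
        self.counts: Counter = Counter()

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "script" and attributes.get("src"):
            # Inline scripts have no src and do not trigger a request.
            self.counts["script"] += 1
        elif tag == "link" and attributes.get("href"):
            rel_values = (attributes.get("rel") or "").lower().split()
            if "stylesheet" in rel_values:
                self.counts["stylesheet"] += 1
        elif tag == "img" and attributes.get("src"):
            self.counts["image"] += 1


def count_resources(html: str) -> Dict[str, int]:
    """Return the number of resource references found, grouped by type."""
    parser = ResourceCounter()
    parser.feed(html)
    parser.close()
    return dict(parser.counts)
```

Running `count_resources` on a page's source before and after combining files gives a quick, approximate check that the number of requests actually went down.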

Mueller’s suggestion is related to Rendering SEO, a topic discussed by Google’s Martin Splitt, in which the technical aspects of how a web page is downloaded and rendered in a browser are optimized for fast and efficient performance.

Some Crawl Errors Are Server Related

Mueller’s answer was not precisely relevant to this specific situation because the problem was one of expected data freshness and not an indexing error.

However, his advice is still accurate for the many cases where a server-related issue causes resource timeouts that block the proper rendering of a web page.

This can happen in the early morning hours, when rogue bots swarm a website and slow it down.

A site that doesn’t have optimized resources, particularly one on a shared server, can experience dramatic slowdowns where the server begins returning 500 error response codes.
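Server access logs are one place to check whether 5xx responses cluster at particular hours. The following sketch assumes the log uses the common "combined" format that Nginx and Apache produce by default. The regular expression and the `server_errors_by_hour` function are illustrative and would need adjusting for a custom log format.

```python
import re
from collections import Counter
from typing import Iterable

# Matches the timestamp, quoted request line, and status code of a
# combined-format access log entry (an assumed format).
LOG_LINE = re.compile(
    r'\[(?P<day>[^:\]]+):(?P<hour>\d{2}):\d{2}:\d{2} [^\]]+\] '
    r'"[^"]*" (?P<status>\d{3}) '
)


def server_errors_by_hour(lines: Iterable[str]) -> Counter:
    """Count responses with a 5xx status code, grouped by hour of day."""
    counts: Counter = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match and match.group("status").startswith("5"):
            counts[match.group("hour")] += 1
    return counts
```

Passing an open log file to `server_errors_by_hour` returns a tally keyed by hour (for example `"03"`), which makes an overnight spike in server errors easy to spot.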

Speaking from experience maintaining a dedicated server, I have seen that misconfiguration in Nginx, Apache or PHP at the server level, or a failing hard drive, can also contribute to the website failing to show requested pages to Google or to website visitors.

Some of these issues can creep in unnoticed when software updates leave settings less than optimal, requiring troubleshooting to identify the errors.

Fortunately, server software like Plesk has diagnostic and repair tools that can help fix these problems when they arise.

This time the problem was that Google hadn’t adequately set the correct expectation for the Index Coverage Report.

But next time it could be a server or rendering issue.

Citations

Google Search Central Tweets Explanation of Index Coverage Report

Google Index Coverage Report and Reported Indexing Errors

Watch at the 6:00 Minute Mark