Why GSC Shows Crawl Errors When a Page Loads Normally

Google’s John Mueller answered a question about Google Search Console (GSC) showing crawl errors on pages that load fine in the browser. Mueller explained that the problem is typically on the server side and is not an issue with Googlebot.

The person asking the question had tried validating the web pages in GSC several times, but validation kept failing. It was as if something was blocking Googlebot and serving it an error message.

Here is the question:

“I had a server error on some of my pages. When I checked the few pages shown in the example they worked just fine.

I also used the validate option a few times but Google keeps marking the pages with an error.

It’s been a month since then.

I’ve waited for Googlebot to index these pages with no success.

This has affected my organic impressions and clicks.

Is there anything I can do here?”

John Mueller affirmed that if GSC reports server errors from Googlebot’s crawl of the web pages, then those errors really occurred and are not a bug on Google’s side.

Here’s what John said:

“We don’t invent errors on pages. When Googlebot checks the page and there’s a server error then we really see a server error there.”

Temporary Crawl Errors

Next Mueller observed that some of these problems are temporary.

Here’s what he said:

“And it might be that this is something that is temporary on your website and if it’s temporary then with one of the future crawls we will …try that again.

And if the error is gone then we can index the page normally.”

Sometimes these errors are indeed temporary.

For example, a server might go down for maintenance, a DNS problem might take part of the Internet offline, or the server might be overloaded and unable to respond to crawling.

But that’s not the issue that the person asking the question is experiencing.

If the web pages load normally in a browser but fail every time you validate them in GSC (or test them with one of Google’s tools, like the Rich Results Test or the Mobile-Friendly Test), that’s an issue at the server.

Server Issues Can Cause GSC Indexing Errors

John Mueller next suggested that the problem the person asking the question is experiencing may be server related.

Mueller:

“So that’s something where if you see these kinds of issues come up regularly and in particular you use the validate feature in search console and the validation comes back and says there are still server errors then that’s something I would take up with your hoster.

And try to see if there’s something that they can do to diagnose this issue to double check what might be happening here to give you a sense of how many URLs does this affect, actually.

Because sometimes it can be tricky where if Googlebot is crawling millions of pages from your website and one hundred of them have a server error then probably that’s irrelevant because like there are always some errors somewhere.

But if Googlebot is crawling 200 pages from your website and a hundred of those are server errors then that’s a little bit more concerning that something you’d probably want to fix and make sure that it doesn’t happen.

So that’s kind of the direction I would head there.

It’s not something that Google can fix.

It’s really something that you need to fix on your side with regards to the hosting.”
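Mueller’s comparison comes down to simple arithmetic: the same count of errors means very different things depending on how many URLs Googlebot crawled. The sketch below works through his two scenarios. The function name is hypothetical, and 1,000,000 is only an assumed stand-in for his “millions of pages.”

```python
def server_error_share(crawled_urls: int, error_urls: int) -> float:
    """Return the fraction of crawled URLs that came back with a server error."""
    if crawled_urls <= 0:
        raise ValueError("crawled_urls must be positive")
    return error_urls / crawled_urls


# Assumed figures based on Mueller's two examples.
large_site = server_error_share(1_000_000, 100)
small_site = server_error_share(200, 100)

print(f"{large_site:.2%}")  # 0.01%
print(f"{small_site:.2%}")  # 50.00%
```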

Diagnosing Googlebot Crawl Errors

If your site shares its IP address with other sites on the same server, there’s a diagnostic trick you can use to check whether this is a server-wide configuration issue.

First, identify the IP address that the site is hosted on, then run that IP address through a reverse IP checker, which can show what other sites are hosted on the same address.

Then take that list of domain names and run them through one of Google’s tools, like the AMP Test or the Rich Results Test.

If the tool reports an error response for one or more of those domains, that may indicate a server-wide error.
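Reverse IP checkers can return stale results, so it may be worth confirming which of the listed domains still resolve to your site’s IP address before testing them in Google’s tools. The following is a minimal sketch of that check, not part of Mueller’s advice. The function name and the `resolve` parameter are hypothetical, and it only looks at a single IPv4 address per domain.

```python
import socket


def domains_sharing_ip(site, candidates, resolve=socket.gethostbyname):
    """Return (site_ip, domains from `candidates` that resolve to the same IPv4 address).

    `resolve` defaults to a real DNS lookup; it is a parameter only so the
    function can be exercised without network access.
    """
    site_ip = resolve(site)
    shared = []
    for domain in candidates:
        try:
            if resolve(domain) == site_ip:
                shared.append(domain)
        except OSError:
            # The domain no longer resolves; skip it.
            continue
    return site_ip, shared
```

Calling `domains_sharing_ip("your-site.example", list_from_reverse_ip_checker)` would give a shorter list of neighbors that are worth running through the testing tools.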

Every web server keeps logs, and they are a good place to start diagnosing what is causing the problem. The logs show the date and exact time of each error as well as the IP address of the visitor that triggered it.
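As a rough illustration of what to look for in the logs, the sketch below pulls out 5xx responses that were served to a Googlebot user agent. It assumes the server writes the widely used “combined” access log format, which may not match your host’s configuration. The function names and the sample log lines are invented for the example.

```python
import re
from collections import Counter

# Pattern for the "combined" access log format (an assumption about the server).
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_server_errors(lines):
    """Yield (time, ip, path, status) for 5xx responses sent to a Googlebot user agent."""
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # Not in the expected format; ignore.
        status = int(match.group("status"))
        if 500 <= status <= 599 and "Googlebot" in match.group("agent"):
            yield match.group("time"), match.group("ip"), match.group("path"), status


def errors_by_path(lines):
    """Count Googlebot-facing server errors per (path, status) pair."""
    return Counter((path, status) for _, _, path, status in googlebot_server_errors(lines))


# Made-up sample lines using documentation-reserved IP addresses.
SAMPLE_LINES = [
    '192.0.2.10 - - [10/Oct/2021:13:55:36 +0000] "GET /page-a HTTP/1.1" 503 512 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.0.2.10 - - [10/Oct/2021:13:56:02 +0000] "GET /page-b HTTP/1.1" 200 2048 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '198.51.100.7 - - [10/Oct/2021:13:57:10 +0000] "GET /page-a HTTP/1.1" 500 512 "-" '
    '"Mozilla/5.0 (Windows NT 10.0; Win64; x64)"',
]

for row in googlebot_server_errors(SAMPLE_LINES):
    print(row)
```

With the sample lines, only the `/page-a` request with status 503 is reported, because it is the only 5xx response sent to a Googlebot user agent. A pattern like this in real logs would be something concrete to bring to the hosting provider.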

A typical cause is a firewall that is configured too strictly and ends up blocking Googlebot.
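Before loosening a firewall rule for an IP address found in the logs, it can help to confirm that the address really belongs to Google, since the user agent string is easy to fake. Google documents a two-step check: a reverse DNS lookup on the IP, then a forward lookup on the returned hostname to confirm that it maps back to the same IP. The sketch below follows that approach under some assumptions. The function name and the injectable lookup parameters are hypothetical, and the default forward lookup only covers IPv4.

```python
import socket

# Hostname suffixes Google documents for its crawlers.
GOOGLE_CRAWLER_DOMAINS = (".googlebot.com", ".google.com", ".googleusercontent.com")


def is_verified_googlebot(
    ip,
    reverse_lookup=socket.gethostbyaddr,
    forward_lookup=socket.gethostbyname_ex,
):
    """Return True if `ip` reverse-resolves to a Google crawler hostname
    and that hostname forward-resolves back to the same IP.

    The lookup functions are parameters only so the check can be tested offline.
    """
    try:
        hostname = reverse_lookup(ip)[0]
    except OSError:
        return False  # No reverse DNS record.
    if not hostname.endswith(GOOGLE_CRAWLER_DOMAINS):
        return False
    try:
        addresses = forward_lookup(hostname)[2]
    except OSError:
        return False
    return ip in addresses
```

If an address from a blocked request passes this check, the firewall rule that blocked it is a good candidate to raise with the hosting provider.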

If you don’t have access to the server logs then a call to the web hosting customer support is in order, as John Mueller recommended.

Citation

Watch the Google Office Hours Hangout

Why Does GSC Show Server Errors on Pages that Load Fine?
Watch at the 47:05 minute mark
