This post is purely theoretical and based on what has worked for our client. There are two possible scenarios:
Google cache caused indexing issues, which in turn may have affected ranking during the core update
The Google May core update caused the site to be indexed slowly, and the caching issues are unrelated and an overreaction on our part
Recurring problem: slow indexing and poor ranking
This issue has happened to our client twice now. The website focuses on consumer tech news and reviews and has a large amount of traffic, with content being posted daily. It is therefore essential that content is indexed as quickly as possible to stay ahead of the curve when reporting news or doing product reviews.
The website is in Google News, and content generally gets indexed within a few minutes, if not seconds.
In September, around another major Google update, the posts suffered a massive delay in indexing. It could be several hours, if not longer, before content showed up.
This caused issues with featured snippets and content getting exposure in Google News.
Furthermore, sites scraping the content would get that content indexed first, which in turn would cause the client site to suffer greatly for any rankings based on their new content.
This took quite a long time to identify and then fix last year, but since then traffic has grown, breaking all records in April.
Google May 2020 Core Update
There was a small drop in rankings during this update, and it became apparent that traffic was dropping off slightly, by approximately 20%. There are many other factors at play here: the lockdown has had a significant effect on traffic, and the easing of the lockdown seems to correlate with a drop in traffic.
However, it quickly became apparent that content was not getting indexed properly, and certain sites were then ranking above our client for the content the client had written.
Google News Slow Indexing
This issue also affects Google News. The client has a legacy Google News listing, and they have also listed the site via Google Publisher Centre.
In the legacy listing, which we can’t manage anymore, content would not get listed, though you can force a content refresh with the newer Google Publisher Centre.
Google News and Google AMP Cache
We found there is a strong correlation between Google News listing an article and the Google AMP Cache refreshing.
While this does not match up 100% of the time, we found that when the AMP cache updates and shows the homepage correctly, Google News also shows the content.
Persistent delays in Google AMP Cache on Ampproject.org
Accelerated Mobile Pages (AMP) have been a fickle mistress. There is strong evidence that AMP content has helped traffic to the site grow exponentially, and the AMP pages currently receive 44% of the website’s traffic.
These pages are specifically designed for mobile devices and aim to speed up the website and offer an improved mobile experience.
Google clearly prioritises these pages when it comes to mobile search, and the website currently receives 56% of its traffic via mobile.
However, Google has two ways of serving AMP content: your actual AMP pages, such as https://www.dolphinpromotions.co.uk/amp, or its own cached copy, which in the case of our website is https://dolphinpromotions-co-uk.cdn.ampproject.org/c/s/dolphinpromotions.co.uk/amp
We have found that ampproject.org doesn’t refresh its cache often in some scenarios; the homepage of the site is particularly problematic.
When we experience these indexing issues, the cache can lag behind by days, if not weeks.
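To make the relationship between the two URLs concrete, here is a minimal sketch of how the cache URL can be derived from the origin URL. The function name `amp_cache_url` is our own, and the sketch only covers the simple case of a short ASCII domain: existing hyphens are doubled, then dots become hyphens. Internationalised or very long domains follow extra rules that are not handled here, and query strings are ignored.

```python
from urllib.parse import urlsplit


def amp_cache_url(origin_url: str) -> str:
    """Build the Google AMP Cache URL for an AMP page (simple ASCII domains only)."""
    parts = urlsplit(origin_url)
    host = parts.hostname
    if not host:
        raise ValueError(f"No hostname found in {origin_url!r}")

    # Escape existing hyphens first, then turn dots into hyphens.
    subdomain = host.replace("-", "--").replace(".", "-")

    # "/c/" marks an AMP document; "s/" is added only when the origin is HTTPS.
    scheme_marker = "s/" if parts.scheme == "https" else ""
    path = parts.path or "/"

    return f"https://{subdomain}.cdn.ampproject.org/c/{scheme_marker}{host}{path}"


if __name__ == "__main__":
    print(amp_cache_url("https://dolphinpromotions.co.uk/amp"))
```

Running this for an article URL gives you the cached address to open when you want to compare the cached copy against the live page.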
Is it cache problems that cause delays in indexing?
Before we move on to possible solutions, it is worth noting that we eventually came to think the source of the problem wasn’t just the AMP cache, but perhaps caching in general.
The site runs on Cloudflare and uses WP Rocket. Previously we used the StackPath CDN.
The issues we faced also seemed to line up with CDN changes. We theorised that one possible cause was aggressive caching, meaning that Google simply doesn’t see the new content.
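One way to sanity-check this theory is to look at the caching headers a page returns shortly after publishing. The sketch below is an illustration rather than the exact process we followed: the function names `fetch_headers` and `describe_caching` are hypothetical, and it only reads the standard `Age` and `Cache-Control` headers plus Cloudflare’s `CF-Cache-Status` header.

```python
from typing import Dict, Optional
from urllib.request import Request, urlopen


def fetch_headers(url: str) -> Dict[str, str]:
    """Fetch response headers with a HEAD request (needs network access)."""
    with urlopen(Request(url, method="HEAD"), timeout=10) as response:
        return dict(response.headers.items())


def _to_int(value: Optional[str]) -> Optional[int]:
    """Return the value as an int if it is a plain non-negative number."""
    return int(value) if value and value.isdigit() else None


def describe_caching(headers: Dict[str, str]) -> Dict[str, object]:
    """Summarise how heavily a response appears to be cached."""
    lowered = {key.lower(): value for key, value in headers.items()}

    # Split "public, max-age=14400" into {"public": "", "max-age": "14400"}.
    directives: Dict[str, str] = {}
    for part in lowered.get("cache-control", "").split(","):
        name, _, value = part.strip().lower().partition("=")
        if name:
            directives[name] = value

    age = _to_int(lowered.get("age"))
    cf_status = lowered.get("cf-cache-status")
    return {
        "age_seconds": age,
        "max_age": _to_int(directives.get("max-age")),
        "s_maxage": _to_int(directives.get("s-maxage")),
        "cf_cache_status": cf_status,
        # A HIT from Cloudflare or a non-zero Age means an edge copy was served.
        "served_from_cache": cf_status == "HIT" or bool(age),
    }
```

Calling `describe_caching(fetch_headers(url))` on the homepage just after a new post goes live shows whether an old cached copy is still being served and roughly how old it is.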
Summary of possible causes of the problem
Aggressive caching on the server
AMP Project Caching
The first time we had this problem, we found that disabling StackPath and serving the content directly from the server improved the indexing issues considerably. This led to us ditching StackPath entirely and going back to Cloudflare.
This time around, we also disabled WP Rocket to make sure the caching issues weren’t on the server side. In our experience, this didn’t noticeably affect the site speed, but the Google PageSpeed test did show a drop in the score.
Permanently serving content directly from the server is not ideal, so you will likely need to slowly re-introduce the various optimisations, making sure the site keeps getting indexed properly.
This is also true for WP Rocket, and we have just re-enabled it.
Refresh AMP Cache on Ampproject.org
The Ampproject.org cache should refresh automatically when a user visits the page. You can force this by visiting the cache URL for the page in question, which you can find via AMP.dev.
What we sometimes find works is to refresh both the article and homepage cache URLs, then reload those pages in a separate browser or in incognito. This should simulate two users visiting the cached content, forcing the cache to update.
Force AMP cache update
You can tell Google to update the cache using the update-cache request. This is not a particularly easy solution, as it requires you to generate your own RSA key and then submit the URL signed with that key.
There is a GitHub project that simplifies this process, but even that was tricky to get working.
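For reference, here is a rough sketch of what the update-cache request looks like, as far as we understand Google’s documentation. The function name `build_update_cache_url` and the `sign` parameter are our own inventions. `sign` stands in for the RSA step: it should sign the bytes with your private key (RSA with SHA-256 and PKCS#1 v1.5 padding), with the matching public key served from `/.well-known/amphtml/apikey.pub` on your domain. The sketch assumes a simple ASCII domain and a web-safe base64 signature with the padding stripped.

```python
import base64
import time
from typing import Callable, Optional
from urllib.parse import urlsplit


def build_update_cache_url(
    origin_url: str,
    sign: Callable[[bytes], bytes],
    timestamp: Optional[int] = None,
) -> str:
    """Build a signed update-cache URL that asks the AMP Cache to flush a page."""
    parts = urlsplit(origin_url)
    host = parts.hostname
    if not host:
        raise ValueError(f"No hostname found in {origin_url!r}")

    # Same subdomain rule as the normal cache URL (simple ASCII domains only).
    subdomain = host.replace("-", "--").replace(".", "-")
    scheme_marker = "s/" if parts.scheme == "https" else ""
    path = parts.path or "/"
    ts = int(time.time()) if timestamp is None else timestamp

    # The signature covers the path and query string, not the cache hostname.
    to_sign = (
        f"/update-cache/c/{scheme_marker}{host}{path}"
        f"?amp_action=flush&amp_ts={ts}"
    )

    raw_signature = sign(to_sign.encode("ascii"))
    # Web-safe base64 with "=" padding removed (an assumption of this sketch).
    signature = base64.urlsafe_b64encode(raw_signature).rstrip(b"=").decode("ascii")

    return f"https://{subdomain}.cdn.ampproject.org{to_sign}&amp_url_signature={signature}"
```

Keeping the signing step as a plain callable means the URL logic can be checked without a real key, and you can plug in whichever RSA library you prefer for the actual signature.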
The easiest solution is to use the AMP for WP plugin to handle all your AMP pages, then use their AMP Cache plugin. This was updated back in January to work natively, with no need to generate RSA keys on your end.
It doesn’t immediately force the homepage to show new content, but it speeds up the refresh considerably.
Force Google Index
While not a permanent solution, if you are experiencing slow indexing issues, we recommend using Google Webmaster Tools (now Google Search Console) to check whether the content is indexed and, if it is not, submitting it for indexing. We find this normally gets content indexed within a few minutes of submission and prevents scrapers from outranking you with your own content.
DMCA takedown for content indexing higher than you
If you find a site ranking higher than you for your own content, a Google DMCA request can fix it. It won’t stop the site scraping your content, but it works as a partial fix. Google is quick with takedowns: sometimes it takes a few hours, though it can take a few days.
You can only submit 10 URLs at a time, and you have to supply an explanation per URL, so it’s a bit of a chore, but it works.
The problem now appears to be fixed: content gets indexed within a few minutes, Google News lists new content within 15 minutes, and no scraped content ranks above us. Featured snippets also appear to be performing better.
However, the website doesn’t seem to perform quite as well as it did prior to the Core Update or the indexing issues. This was true back in September and is true again now in May. Assuming the trend follows September, this will take a while to improve. We can only assume this is the time it takes Google to rebuild trust in the site, start showing more featured snippets and rank the site higher in Google News.