Channel: On Site SEO – Scritty's SEO Blog

Crawl Budget. How Often Does Google Visit Your Website?


Paul Rone-Clarke, SEO expert, is happy to share the following.

Crawl budget is a term used to describe the number of times that search engine spiders visit your website over a given period of time. It also has some impact on the depth that Google tends to search: how many links the spiders will follow, what notice they take of some of your metadata, and how that informs the length of stay and depth of search.

Make it easy to find your content

 

Looks nice – but is it easy to find, and is everything there you would expect to be there once a visitor arrives? Sorry – another of my tortured analogies.

 

For example, if the Google spider hits your site 5 times in two weeks, you can say that 5 is your bi-weekly crawl budget for Googlebot. Keep in mind that there is no fixed limit on the number of times search engine spiders can visit your site; many factors influence your crawl budget over a given period of time.

Didn’t The Google Caffeine Update Fix This Issue Years Ago?

Before we go any further, isn’t this an issue that was fixed more than 6 years ago?

Those of you who remember that far back probably recall the Google Caffeine update.

In the long path from “update runs” to real time updating (true real time updating that will likely never be achieved, at least not before Google servers work on quantum technology) Caffeine came along and promised more regular and deeper searches.

Well, yes it did. Importantly, though, the march towards a more responsive web index is iterative. When Google Caffeine was released, the overall search landscape was very different. Mobile search, for instance, was barely a consideration: the need to optimise for mobile devices, the subtly different terms searchers tend to use on them, and the content presentation changes required to work well with phones and tablets.

Google’s Potential Indexing Deficit Issues

In the last 6 years most users have refined their searches to a very large degree, becoming more and more specific about what they are looking for. I often bemoan the lack of context in the media and nuance in many things covered by professional journalists and individuals, but in terms of search habits? It’s gone the other way. Users would far rather search only once and hit the exact content they are looking for near the top of page one than search many times or – heaven forbid – actually venture into the netherworld of “page 2” of their favourite search engine.

Yes, a lot of this is gossip dressed up as news, but it is what is being devoured by the terabyte in terms of downloads and views. Google is keeping up. Yesterday’s news, after all, isn’t “news” at all.

The process that Caffeine was a part of 6 years ago was well overdue for an update, and we are seeing improvements in the amount of new content being indexed all the time, but these improvements are not quite keeping up with the massive volumes of content being added to the internet daily. In short, Google, despite speeding up its indexing and ranking, risks falling behind and developing an “indexing deficit”.

Enter (with fanfare)… The Crawl Budget

Crawl budget is one of the important SEO factors that many site owners do not pay much attention to. Many of you might have heard about it but tend to assume that your website has been assigned a fixed crawl quota and that there is nothing you can do to change it.

Crawl budget is something that any site owner who wants to perform well in the search engines must pay attention to. As a site owner, you should be very concerned with your crawl budget because you want Google to discover all the pages on your site. You also, obviously, want Google to find newly published content fast. A bigger crawl budget means that Google will be able to discover more of the pages on your site and find new content faster.

Pages that have not been crawled recently will not rank well in the search engine results pages. If you want good SERP performance, you have to do the things that will improve your crawl budget. Generally, pages that have been crawled within the past two weeks rank better than those that have not.

It is highly likely that Google visits your website several times every single day. However, during these visits, Google may not crawl all the pages on your site. There is a misconception among site owners that Google crawls their entire site on every visit. The reality is very different. If there are certain pages on your site that Googlebot does not crawl, it is not because the bot is not thorough; it is because those pages have not earned Googlebot’s attention, either directly or indirectly.

How Search Engines Assign Crawl Budget

No one outside Google knows exactly how Google assigns crawl budget to different sites. According to Google’s Matt Cutts, the number of pages that Google crawls on a monthly basis is roughly proportional to PageRank. It is safe to assume that the number of times Google crawls a given website is roughly proportional to the number of backlinks it has and how important the website is in Google’s eyes. Google is always trying to ensure that the most important pages appear when people search. Here are three areas that should be of major concern to you when you want to improve your crawl budget:

1. PageRank

Although Google no longer makes PageRank information public, it plays a very important role in determining your site’s crawl budget. Google grades PageRank on a scale of zero to 10, based on the number of quality links pointing to your site. Google has added more than 200 ranking signals that are meant to improve the quality and relevance of the pages displayed on the search engine results page, but PageRank remains one of the most important factors that Google considers. In the recent past, Google has been trying to discourage site owners from being too concerned with PageRank when they want to improve their rankings; the PageRank indicator was even removed from the Google Toolbar. However, you can still check PageRank using third-party tools. Googlebot uses a site’s PageRank to prioritize the pages that it is going to crawl.

2. Loading speed

Googlebot wants to crawl your website as quickly as possible, because there are many other sites that have to be crawled every single day. The fastest way for Google to crawl a site is to send as many concurrent requests as possible. Unfortunately, sites can respond very slowly when they receive too many simultaneous requests, so Google adjusts the crawl rate for your site automatically based on how it responds to these requests. Sites that demonstrate the ability to handle higher concurrency receive bigger crawl budgets than those that do not, because they allow Googlebot to go deeper within the same period of time allocated to every site. There are many factors that affect the time it takes for a page on your site to load, from the way a page is coded to the speed of the web server. Google has to load your page as well, and the longer it takes, the less inclined its bots will be to search deeper and more thoroughly. Their time is a resource they pay for; with a slow site, Google might just give up at the front door.

3. Clear Navigation

Google follows 2 distinct routes through your site: the one pointed out to it by your sitemap, and the links on your pages. Yes indeed, despite the naysayers and fear mongers, Google pays as much attention to links as it ever did – inbound, internal, cross-site and outbound. And yes, links still play a massive part in your site’s ability to rank.

While you probably know that inbound links are a valuable indicator of your site’s popularity and still vital for ranking in competitive markets, it is also worth bearing in mind that your internal site links will be looked at. So have clear navigation, and no extraneous links, dead links, or outbound links to sites that you would rather not be associated with.

 

How to Maximize your Crawl Budget

It is very important to focus on maximizing your crawl budget when you want to improve your performance on Google and other search engines. There are a few things you can do to ensure that search engine spiders consume as many pages as possible when they visit your site, and that they visit more often. Here is what you should do to maximize your crawl budget:

 

1. Make sure that search engine spiders can crawl all the important pages on your site

Your robots.txt and .htaccess files should not block the important pages on your site. The search engine spiders should be able to access and crawl all the important pages, and should also have easy access to your JavaScript and CSS files. If there is any content on your site that you don’t want in the search engine results pages, you can block it to prevent search engines from crawling and indexing it. This includes pages that are under construction, pages with duplicate content, and so on. Blocking them helps ensure your crawl budget is spent only on pages that matter.
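As a minimal sketch – all the paths here are hypothetical – a robots.txt along these lines keeps spiders out of low-value sections while explicitly leaving CSS and JavaScript crawlable:

```
# Applies to all well-behaved crawlers
User-agent: *
Disallow: /under-construction/
Disallow: /search-results/
Allow: /assets/css/
Allow: /assets/js/

# Point crawlers at the sitemap as well
Sitemap: https://www.example.com/sitemap.xml
```

Remember that robots.txt only stops crawling, not indexing; a blocked URL can still appear in results if other sites link to it.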

 

2. Manage your site’s URL parameters

Most popular content management systems generate a lot of dynamic URLs that lead to the same page. By default, search engines will treat each unique URL as a separate page. As a result, your crawl budget will be wasted, and you may also be penalized for breeding duplicate content. If your website adds parameters to URLs, make sure that Google knows about it by declaring the parameters in your Google webmaster tools account. This will help ensure that your crawl budget is utilized in the best way possible.
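To see the idea in code, here is a small sketch (the parameter names are hypothetical examples) of collapsing parameter variants down to one canonical URL, the way you would want a crawler to see your pages:

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical parameters that do not change the page content.
CONTENT_NEUTRAL_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sessionid"}

def canonicalize(url):
    """Strip content-neutral parameters so duplicate URLs collapse to one."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k not in CONTENT_NEUTRAL_PARAMS]
    # Rebuild the URL without the noise parameters (and without any fragment).
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(kept), ""))

print(canonicalize("https://example.com/shoes?color=red&utm_source=mail&sessionid=abc"))
# https://example.com/shoes?color=red
```

Every variant that maps to the same canonical URL is a candidate for a `rel="canonical"` tag or a parameter declaration in webmaster tools.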

 

3. Find all broken links on your site and fix them

You obviously don’t want to spend a lot of the crawl budget allocated to your site on 404 pages. That is why you should take the time to check whether your site has any broken links. If you find broken links, fix them as soon as possible. Having a lot of broken links on your site makes your crawl budget go to waste.
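A simple self-hosted checker needs two parts: pull the links out of each page, then request each one and look at the status code. A sketch in Python's standard library (the example HTML is made up):

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen
from urllib.error import HTTPError, URLError

class LinkExtractor(HTMLParser):
    """Collect href values from the anchor tags on a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def check_link(url, timeout=10):
    """Return the HTTP status for a URL, or None if the request fails."""
    try:
        # HEAD keeps the check cheap; some servers only answer GET properly.
        req = Request(url, method="HEAD")
        with urlopen(req, timeout=timeout) as resp:
            return resp.status
    except HTTPError as err:
        return err.code   # e.g. 404 for a broken link
    except URLError:
        return None

parser = LinkExtractor()
parser.feed('<p><a href="/about">About</a> <a href="https://example.com">Home</a></p>')
print(parser.links)   # ['/about', 'https://example.com']
```

Anything coming back 404 (or failing entirely) goes on the fix list.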

 

4. Ensure that there are no long redirect chains on your site

It is okay to have a few 301 and 302 redirects on your site. Keep in mind, however, that if a URL passes through too many 301 and 302 redirects, the search engine spiders will at some point stop following the chain, which means the destination page will not be crawled. Each redirected URL then becomes a waste: every time the spiders refuse to follow a redirect chain, part of your crawl budget goes to waste.
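A quick way to audit this offline is to model your redirect rules as a map and count the hops each URL takes, as in this sketch (the paths and hop limit are hypothetical):

```python
def redirect_chain(url, redirects, max_hops=5):
    """Follow a URL through a redirect map and return the full chain.

    `redirects` is a dict of {source: destination}, standing in for the
    301/302 responses a crawler would see.
    """
    chain = [url]
    while url in redirects and len(chain) <= max_hops:
        url = redirects[url]
        chain.append(url)
    return chain

# Hypothetical site with three hops before the final page.
hops = {"/old": "/older", "/older": "/oldest", "/oldest": "/final"}
print(redirect_chain("/old", hops))  # ['/old', '/older', '/oldest', '/final']
```

Any chain longer than one hop is worth flattening: point the original URL straight at the final destination.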

 

5. Ensure that your sitemap is clean and up to date

XML sitemaps are very important when you want search engine spiders to crawl your website in the best way possible. XML sitemaps make it easy for search engines to crawl your content, help them discover new content faster, and show Googlebot and other search engine spiders how your content is organized. Ensure that your XML sitemap is regularly updated and free from garbage such as 404 pages, URLs that redirect to other pages, pages blocked from indexation, and non-canonical pages.

If you have a large website with a lot of subsections, it is best to make a separate sitemap for each section. This makes your sitemaps easier to manage and makes it much easier to detect the problematic areas of your site. For example, if you are running an ecommerce website, it is best to create an individual sitemap for categories with many products; if your site has a blog, a discussion board and regular pages, create a sitemap for each of these sections. Make sure that it is easy for search engine spiders to discover all the sitemaps you have created for your site.
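The standard way to tie per-section sitemaps together is a sitemap index file. A sketch, with hypothetical URLs and dates, following the sitemaps.org format:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://www.example.com/sitemap-blog.xml</loc>
    <lastmod>2017-01-15</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://www.example.com/sitemap-products.xml</loc>
    <lastmod>2017-01-14</lastmod>
  </sitemap>
</sitemapindex>
```

Submitting just the index file is enough; crawlers discover the section sitemaps from it, and the `lastmod` dates hint at which sections have fresh content.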

 

6. Use RSS

RSS feeds are visited regularly by Googlebot. If there is a section of your website that you update on a regular basis, create an RSS feed for it and submit it to FeedBurner. Keep RSS feeds free of 404 pages, non-canonical pages and blocked content.
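For reference, a minimal RSS 2.0 feed looks like this (titles, URLs and dates here are hypothetical):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <link>https://www.example.com/blog/</link>
    <description>Recently updated posts</description>
    <item>
      <title>Latest post</title>
      <link>https://www.example.com/blog/latest-post/</link>
      <pubDate>Sun, 15 Jan 2017 10:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>
```

Most content management systems generate this for you; the point is simply that every `<link>` in it should resolve to a live, canonical, indexable page.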

 

7. Increase the number of links pointing to the important pages on your site

If you want to increase the number of times that Google crawls a given page on your site, you have to increase the number of links (both internal and external) pointing to that page. You can shift page authority within your site easily by increasing the number of internal links pointing to the important pages and decreasing the number of links that point to the less important ones. This means going through your site and changing its link structure.
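Before reshuffling links, it helps to know where they currently point. A sketch that tallies internal links from a hypothetical link map of your site:

```python
from collections import Counter

def internal_link_counts(pages):
    """Count the internal links pointing at each page.

    `pages` maps each page URL to the list of internal URLs it links to.
    """
    counts = Counter()
    for targets in pages.values():
        counts.update(targets)
    return counts

# Hypothetical three-page site.
site = {
    "/": ["/products", "/blog", "/products"],
    "/blog": ["/products", "/about"],
    "/about": ["/"],
}
print(internal_link_counts(site)["/products"])  # 3
```

Pages you consider important but which score near zero here are the ones to link to more prominently.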

 

8. Use pagination only when you have to

If you have content on your site that is separated into multiple pages, chances are that most of the higher page numbers are not being crawled by search engine spiders, and most of them will have a low PageRank. Go through your site and analyze your pages to determine whether pagination was really necessary. Are there enough entries on a given page to justify it? If not, you should de-paginate so that everything appears on one page. Google will be able to crawl more of your pages when you de-paginate pages that did not need pagination in the first place. For the pages that genuinely need pagination, make sure that you have followed Google’s guidelines.
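At the time, Google's pagination guidelines recommended declaring the sequence with `rel="prev"` and `rel="next"` link elements in the page head (Google has since said it no longer uses these as an indexing signal, so treat this as historical advice). A sketch for a hypothetical page 2 of a category:

```html
<!-- In the <head> of /shoes/page/2/ (hypothetical URLs) -->
<link rel="prev" href="https://www.example.com/shoes/page/1/" />
<link rel="next" href="https://www.example.com/shoes/page/3/" />
```

Each paginated page should still be self-canonical rather than canonicalized to page 1, so the deeper pages remain indexable.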

 

9. Check your site’s navigation structure

When you want to improve your crawl budget, you have to ensure that your site has a good navigation structure. Your navigation structure determines how easy or difficult it is for search engine crawlers to find your pages. A poor navigation structure means it will be very difficult for search engines to discover and crawl the pages on your site. Ensure that the links that are supposed to take search engines to your different pages are crawlable; it will not be possible for search engines to crawl and index your pages when there is no clear path to them.

 

10. Keep your content fresh

Today, Google can update its index as fast as it crawls the web. Google can easily tell which sites are being updated on a regular basis, and which pages on a given site are getting regular updates. A page that has not been updated recently is less likely to be crawled than one that was. To make sure that your pages are crawled continuously by the search engine spiders, try to make routine enhancements and time-sensitive additions as regularly as possible. The whole idea is to improve the value that your pages provide to visitors; you are on the right track if you have been improving your pages consistently.

The post Crawl Budget. How Often Does Google Visit Your Website? appeared first on Scritty's SEO Blog.

