Does Googlebot Support HTTP/2? Challenging Google’s Indexing Claims – An Experiment

Posted by goralewicz

I was recently challenged with a question from a client, Robert, who runs a small PR firm and needed to optimize a client’s website. His question inspired me to run a small experiment in HTTP protocols. So what was Robert’s question? He asked…

Can Googlebot crawl using HTTP/2 protocols?

You may be asking yourself, why should I care about Robert and his HTTP protocols?

As a refresher, HTTP protocols are the basic set of standards that allow the World Wide Web to exchange information. They are the reason a web browser can display data stored on another server. The first version dates back to 1989, which means that, like everything else, the older HTTP protocols are getting outdated. HTTP/2 is one of the latest versions, created to replace these aging ones.

So, back to our question: why should you, as an SEO, care to know more about HTTP protocols? The short answer is that none of your SEO efforts matter, or can even be carried out, without a basic understanding of the HTTP protocol. Robert knew that if his site wasn’t indexing correctly, his client would miss out on valuable web traffic from searches.

The hype … Read the rest

October 17, 2017  Posted in: SEO / Traffic / Marketing

Google Shares Details About the Technology Behind Googlebot

Posted by goralewicz

Crawling and indexing have been hot topics over the last few years. As soon as Google launched Google Panda, people rushed to their server logs and crawling stats and began fixing their index bloat. Those problems didn’t exist in the “SEO = backlinks” era of a few years ago. With this exponential growth of technical SEO, we need to get more and more technical. That being said, we still don’t know exactly how Google crawls our websites. Many SEOs still can’t tell the difference between crawling and indexing.

The biggest problem, though, is that when we want to troubleshoot indexing problems, the only tools in our arsenal are Google Search Console and its Fetch and Render tool. Once your website includes more than HTML and CSS, there’s a lot of guesswork about how your content will be indexed by Google. This approach is risky, expensive, and can fail multiple times. Even when you discover the pieces of your website that weren’t indexed properly, it’s extremely difficult to get to the bottom of the problem and find the fragments of code responsible.

Fortunately, this is about to change. Recently, Ilya … Read the rest

October 16, 2017  Posted in: SEO / Traffic / Marketing

Optimizing AngularJS Single-Page Applications for Googlebot Crawlers

Posted by jrridley

It’s almost certain that you’ve encountered AngularJS on the web somewhere, even if you weren’t aware of it at the time. Here’s a list of just a few sites using Angular:

  • Upwork.com
  • Freelancer.com
  • Udemy.com
  • Youtube.com

Any of those look familiar? If so, it’s because AngularJS is taking over the Internet. There’s a good reason for that: AngularJS, React, and similar frameworks make for a better user and developer experience on a site. For background, AngularJS and ReactJS are part of a web design movement called single-page applications, or SPAs. A traditional website loads each individual page as the user navigates the site, which involves calls to the server and cache, loading resources, and rendering the page. SPAs cut out much of that back-end activity by loading the entire site when a user first lands on a page. Instead of loading a new page each time you click on a link, the site dynamically updates a single HTML page as the user interacts with it.

[Image: image001.png, c/o Microsoft]

Why is this movement taking over the Internet? With SPAs, users are treated to a screaming fast site through which they can navigate almost instantaneously, while developers … Read the rest

May 31, 2017  Posted in: SEO / Traffic / Marketing

SEO Finds in Your Server Logs, Part 2: Optimizing for Googlebot

Posted by timresnik

This is a follow-up to a post I wrote a few months ago that goes over some of the basics of why server log files are a critical part of your technical SEO toolkit. In this post, I provide more detail around formatting the data in Excel in order to find and analyze Googlebot crawl optimization opportunities.

Before digging into the logs, it’s important to understand the basics of how Googlebot crawls your site. There are three basic factors that Googlebot considers. First is which pages should be crawled. This is determined by factors such as the number of backlinks that point to a page, the internal link structure of the site, the number and strength of the internal links that point to that page, and other internal signals like sitemaps.

Next, Googlebot determines how many pages to crawl. This is commonly referred to as the “crawl budget.” Factors that are most likely considered when allocating crawl budget are domain authority and trust, performance, load time, and clean crawl paths (Googlebot getting stuck in your endless faceted search loop costs them money). For much more detail on crawl budget, check out Ian Lurie’s post on the … Read the rest
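One rough way to see where crawl budget is going is to tally Googlebot requests by URL and by status code. The sketch below is only an illustration in Python rather than the Excel workflow this post describes. It assumes the log lines have already been parsed into rows with `path`, `status`, and `user_agent` fields (hypothetical names), and the sample rows are made up. Matching on the user-agent string alone can be fooled by spoofed bots, so treat the counts as a first pass.

```python
from collections import Counter

def googlebot_crawl_summary(hits):
    """Count Googlebot requests per path and per status code."""
    by_path = Counter()
    by_status = Counter()
    for hit in hits:
        # Naive check: the user-agent string can be spoofed.
        if "googlebot" not in hit["user_agent"].lower():
            continue
        by_path[hit["path"]] += 1
        by_status[hit["status"]] += 1
    return by_path, by_status

# Hypothetical, hand-written rows standing in for parsed log lines.
hits = [
    {"path": "/", "status": 200, "user_agent": "Googlebot/2.1"},
    {"path": "/search?color=red&size=m", "status": 200, "user_agent": "Googlebot/2.1"},
    {"path": "/search?color=red&size=m", "status": 200, "user_agent": "Googlebot/2.1"},
    {"path": "/old-page", "status": 404, "user_agent": "Googlebot/2.1"},
    {"path": "/", "status": 200, "user_agent": "Mozilla/5.0"},
]

by_path, by_status = googlebot_crawl_summary(hits)

# The most-crawled paths come first; repeated faceted-search URLs stand out here.
for path, count in by_path.most_common():
    print(count, path)
```

Paths that dominate the tally without being important pages, or a large share of 404 responses, are the kind of crawl paths worth cleaning up.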

August 3, 2013  Posted in: SEO / Traffic / Marketing

Googlebot Crawl Issue Identification Through Server Logs

Posted by Dave Sottimano

Sifting through server logs has made me infinitely better at my job as an SEO. If you're already using them as part of your analysis, congrats – if not, I encourage you to read this post.

In this post we’re going to:

  • Briefly introduce a server log hit
  • Understand common issues with Googlebot's crawl
  • Use a server log to see Googlebot's crawl path.
  • Look at a real issue with Googlebot wasting crawl budget and fix it.
  • Introduce or reacquaint you with my favourite data analyzer.

Server log analysis is critical to SEOs because:

  • Webmaster tools, third-party crawlers, and search operators won’t give you the full story.
  • You’ll understand how Googlebot behaves on your site, and it will make you a better SEO.

I’m going to casually assume that you at least know what server logs are and how to obtain them. Just in case you've never seen a server log before, let's take a look at a sample "hit".

Anatomy of a server log hit

Each line in a server log represents a "hit" to the web server. The following illustrations can help explain:

File request example: brochure_download.pdf
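For readers who prefer text to pictures, a hit can also be pulled apart in code. The following is a minimal sketch, assuming the server writes the Combined Log Format that many Apache and nginx setups use; your own log format may differ. The sample line is invented for illustration (only the file name comes from the example above), and `HIT_PATTERN` and `parse_hit` are hypothetical names.

```python
import re

# One hit in Combined Log Format:
# ip, identity, user, [time], "method path protocol", status, size, "referrer", "user agent"
HIT_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"'
)

def parse_hit(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = HIT_PATTERN.match(line)
    return match.groupdict() if match else None

# Made-up sample hit, not taken from a real log.
sample = (
    '66.249.66.1 - - [02/Jul/2012:10:15:32 +0000] '
    '"GET /brochure_download.pdf HTTP/1.1" 200 51200 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"'
)

hit = parse_hit(sample)
for field, value in hit.items():
    print(f"{field}: {value}")
```

Each named group corresponds to one part of the hit: who asked, when, for what, how the server answered, and which client made the request.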

A request for /page-a.html will likely end up … Read the rest

July 2, 2012  Posted in: SEO / Traffic / Marketing


