The Machine Learning Revolution: How it Works and its Impact on SEO

Posted by EricEnge

Machine learning is already a very big deal. It’s here, and it’s in use in far more businesses than you might suspect. A few months back, I decided to take a deep dive into this topic to learn more about it. In today’s post, I’ll get into a certain amount of technical detail about how it works, but I also plan to discuss its practical impact on SEO and digital marketing.

For reference, check out Rand Fishkin’s presentation about how we’ve entered into a two-algorithm world. In it, Rand addresses in detail how machine learning influences search and SEO. I’ll come back to that later.

For fun, I’ll also include a tool that allows you to predict your chances of getting a retweet based on a number of things: your Followerwonk Social Authority, whether you include images, hashtags, and several other similar factors. I call this tool the Twitter Engagement Predictor (TEP). To build the TEP, I created and trained a neural network. The tool will accept input from you, and then use the neural network to predict your chances of getting an RT.

The TEP leverages the data from a study I published in December 2014 on Twitter engagement, where we reviewed information from 1.9M original tweets (as opposed to RTs and favorites) to see what factors most improved the chances of getting a retweet.

My machine learning journey

I got my first meaningful glimpse of machine learning back in 2011 when I interviewed Google’s Peter Norvig, and he told me how Google had used it to teach Google Translate.

Basically, they looked at all the language translations they could find across the web and learned from them. This is a very intense and complicated example of machine learning, and Google had deployed it by 2011. Suffice it to say that all the major market players — such as Google, Apple, Microsoft, and Facebook — already leverage machine learning in many interesting ways.

Back in November, when I decided I wanted to learn more about the topic, I started doing a variety of searches of articles to read online. It wasn’t long before I stumbled upon this great course on machine learning on Coursera. It’s taught by Andrew Ng of Stanford University, and it provides an awesome, in-depth look at the basics of machine learning.

Warning: This course is long (19 total sections with an average of more than one hour of video each). It also requires an understanding of calculus to get through the math. In the course, you’ll be immersed in math from start to finish. But the point is this: If you have the math background, and the determination, you can take a free online course to get started with this stuff.

In addition, Ng walks you through many programming examples using a language called Octave. You can then take what you’ve learned and create your own machine learning programs. This is exactly what I have done in the example program included below.

Basic concepts of machine learning

First of all, let me be clear: this process didn’t make me a leading expert on this topic. However, I’ve learned enough to provide you with a serviceable intro to some key concepts. You can break machine learning into two classes: supervised and unsupervised. First, I’ll take a look at supervised machine learning.

Supervised machine learning

At its most basic level, you can think of supervised machine learning as creating a series of equations to fit a known set of data. Let’s say you want an algorithm to predict housing prices (an example that Ng uses frequently in the Coursera classes). You might get some data that looks like this (note that the data is totally made up):

In this example, we have (fictitious) historical data that indicates the price of a house based on its size. As you can see, the price tends to go up as house size goes up, but the data does not fit into a straight line. However, you can calculate a straight line that fits the data pretty well, and that line might look like this:

This line can then be used to predict the pricing for new houses. We treat the size of the house as the “input” to the algorithm and the predicted price as the “output.” For example, if you have a house that is 2,600 square feet, the line suggests a price of about $xxxK.
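As a rough sketch of what “fitting a line” means in practice, here’s a minimal supervised example. The size/price numbers are invented to mirror the scatter described above, and the `predict_price` helper is hypothetical:

```python
import numpy as np

# Invented (size in sq ft, price in $K) training examples,
# standing in for the fictitious table above.
sizes = np.array([1100, 1400, 1800, 2100, 2600, 3000], dtype=float)
prices = np.array([199, 245, 310, 360, 415, 470], dtype=float)

# Least-squares fit of a straight line: price ~ theta1 * size + theta0
theta1, theta0 = np.polyfit(sizes, prices, deg=1)

def predict_price(size_sqft):
    """Use the fitted line to predict the price (in $K) of a new house."""
    return theta1 * size_sqft + theta0

estimate = predict_price(2600)  # what the line suggests for 2,600 sq ft
```

The fitted line is the simplest possible "model": two learned numbers (slope and intercept) that let you map any new input to a predicted output.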

However, this model turns out to be a bit simplistic. There are other factors that can play into housing prices, such as the total rooms, number of bedrooms, number of bathrooms, and lot size. Based on this, you could build a slightly more complicated model, with a table of data similar to this one:

Already you can see that a simple straight line will not do, as you’ll have to assign weights to each factor to come up with a housing price prediction. Perhaps the biggest factors are house size and lot size, but rooms, bedrooms, and bathrooms all deserve some weight as well (all of these would be considered new “inputs”).

Even now, we’re still being quite simplistic. Another huge factor in housing prices is location. Pricing in Seattle, WA is different than it is in Galveston, TX. Once you attempt to build this algorithm on a national scale, using location as an additional input, you can see that it starts to become a very complex problem.

You can use machine learning techniques to solve any of these three variations of the problem. In each of these examples, you’d assemble a large data set of examples, which can be called training examples, and run a set of programs to design an algorithm to fit the data. This allows you to submit new inputs and use the algorithm to predict the output (the price, in this case). Using training examples like this is what’s referred to as “supervised machine learning.”

Classification problems

This is a special class of problems where the goal is to predict a specific outcome. For example, imagine we want to predict the chances that a newborn baby will grow to be at least 6 feet tall. The inputs might be as follows:

The output of this algorithm might be a 0 if the person was going to be shorter than 6 feet tall, or a 1 if they were going to be 6 feet or taller. What makes it a classification problem is that you’re putting each input item into one specific class or another. For the height prediction problem as I described it, we’re not trying to guess the precise height, but simply to make an over/under 6 feet prediction.

Some examples of more complex classifying problems are handwriting recognition (recognizing characters) and identifying spam email.
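As an illustration of a classifier like this, here’s a tiny logistic-regression sketch for the over/under-6-feet problem. The features (parents’ heights in inches) and all labels are invented for the example:

```python
import numpy as np

# Invented training examples: columns are [father's height, mother's height]
# in inches; label is 1 if the child reached 6 feet, 0 otherwise.
X = np.array([[70, 64], [72, 66], [75, 68], [66, 62], [68, 63], [74, 70]], float)
y = np.array([0, 1, 1, 0, 0, 1], float)

# Standardize features, add a bias column, then fit logistic regression
# by plain gradient descent on the log-loss.
mu, sd = X.mean(axis=0), X.std(axis=0)
Xb = np.hstack([np.ones((len(X), 1)), (X - mu) / sd])
theta = np.zeros(Xb.shape[1])
for _ in range(5000):
    p = 1.0 / (1.0 + np.exp(-Xb @ theta))    # predicted P(class 1)
    theta -= 0.1 * Xb.T @ (p - y) / len(y)   # gradient step

def classify(father_in, mother_in):
    """Return 1 if the model predicts 'reaches 6 feet', else 0."""
    x = (np.array([father_in, mother_in]) - mu) / sd
    prob = 1.0 / (1.0 + np.exp(-np.dot(np.concatenate([[1.0], x]), theta)))
    return int(prob >= 0.5)
```

The key difference from the housing example is the output: instead of a continuous number, the model emits a probability that gets thresholded into one of two classes.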

Unsupervised machine learning

Unsupervised machine learning is used in situations where you don’t have training examples. Basically, you want to determine how to recognize groups of objects with similar properties. For example, you may have data that looks like this:

The algorithm will then attempt to analyze this data and find out how to group them together based on common characteristics. Perhaps in this example, all of the red “x” points in the following chart share similar attributes:

However, the algorithm may have trouble recognizing outlier points, and may group the data more like this:

What the algorithm has done is find natural groupings within the data, but unlike supervised learning, it had to determine the features that define each group. One industry example of unsupervised learning is Google News. For example, look at the following screen shot:

You can see that the main news story is about Iran holding 10 US sailors, but there are also related news stories shown from Reuters and Bloomberg (circled in red). The grouping of these related stories is an unsupervised machine learning problem, where the algorithm learns to group these items together.
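One classic algorithm for this kind of unlabeled grouping is k-means. Here’s a minimal sketch, using synthetic 2-D points in place of the charts above; the group centers are invented, and the initialization is deliberately simplified to keep the demo deterministic (real implementations use random or k-means++ seeding):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic unlabeled points drawn around three invented group centers,
# standing in for the scattered data in the charts above.
true_means = [(0, 0), (4, 4), (0, 4)]
points = np.vstack([rng.normal(m, 0.5, (30, 2)) for m in true_means])

# Minimal k-means loop: no labels are given; the loop itself discovers
# the groups. One seed point per (here, known) group keeps it deterministic.
k = 3
centers = points[[0, 30, 60]].astype(float)
for _ in range(20):
    # assign each point to its nearest center...
    labels = np.argmin(((points[:, None, :] - centers) ** 2).sum(axis=2), axis=1)
    # ...then move each center to the mean of its assigned points
    centers = np.array([points[labels == j].mean(axis=0) for j in range(k)])
```

After convergence, `labels` holds a group number for every point, discovered purely from the geometry of the data, which is the essence of the Google News grouping example.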

Other industry examples of applied machine learning

A great example of a machine learning algo is the Author Extraction algorithm that Moz has built into their Moz Content tool. You can read more about that algorithm here. The referenced article outlines in detail the unique challenges that Moz faced in solving that problem, as well as how they went about solving it.

As for Stone Temple Consulting’s Twitter Engagement Predictor, this is built on a neural network. A sample screen for this program can be seen here:

The program makes a binary prediction as to whether you’ll get a retweet or not, and then provides you with a percentage probability for that prediction being true.

For those who are interested in the gory details, the neural network configuration I used was six input units, fifteen hidden units, and two output units. The algorithm used one million training examples and two hundred training iterations. The training process required just under 45 billion calculations.
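For the curious, a network of that shape is just a couple of matrix multiplications per prediction. Here’s a sketch of a single forward pass through a 6-15-2 network; the weights below are random stand-ins, since training against the million examples is what would actually set them:

```python
import numpy as np

rng = np.random.default_rng(42)

# Network shape described above: 6 input units, 15 hidden units,
# 2 output units. These random weights are placeholders for what
# training would learn.
W1 = rng.normal(0, 0.1, (15, 6)); b1 = np.zeros(15)
W2 = rng.normal(0, 0.1, (2, 15)); b2 = np.zeros(2)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x):
    """One forward pass: 6 inputs in, 2 output activations out."""
    hidden = sigmoid(W1 @ x + b1)
    return sigmoid(W2 @ hidden + b2)

out = forward(np.zeros(6))  # two numbers, one per output unit
```

Each training iteration runs passes like this over the examples and nudges the weights, which is where the tens of billions of calculations come from.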

One thing that made this exercise interesting is that there are many conflicting data points in the raw data. Here’s an example of what I mean:

What this shows is the data for people with Followerwonk Social Authority between 0 and 9, and a tweet with no images, no URLs, no @mentions of other users, two hashtags, and between zero and 40 characters. We had 1156 examples of such tweets that did not get a retweet, and 17 that did.

The most desirable outcome for the resulting algorithm is to predict that these tweets will not get a retweet, which would make it wrong 1.4% of the time (17 times out of 1,173). Note that the resulting neural network assesses the probability of getting a retweet at 2.1%.

I did a calculation to tabulate how many of these cases existed. I found that we had 102,045 individual training examples where the best available prediction was the wrong one for that particular example, which is just over 10% of all our training data. What this means is that the best the neural network can do is make the right prediction just under 90% of the time.

I also ran two other sets of data (470K and 473K samples in size) through the trained network to see the accuracy level of the TEP. I found that it was 81% accurate in its absolute (yes/no) prediction of the chance of getting a retweet. Bearing in mind that those also had approximately 10% of the samples where making the wrong prediction is the right thing to do, that’s not bad! And, of course, that’s why I show the percentage probability of a retweet, rather than a simple yes/no response.

Try the predictor yourself and let me know what you think! (You can discover your Social Authority by heading to Followerwonk and following these quick steps.) Mind you, this was simply an exercise for me to learn how to build out a neural network, so I recognize the limited utility of what the tool does — no need to give me that feedback ;->.

Examples of algorithms Google might have or create

So now that we know a bit more about what machine learning is about, let’s dive into things that Google may be using machine learning for already:

Penguin

One approach to implementing Penguin would be to identify a set of link characteristics that could potentially be an indicator of a bad link, such as these:

  1. External link sitting in a footer
  2. External link in a right side bar
  3. Proximity to text such as “Sponsored” (and/or related phrases)
  4. Proximity to an image with the word “Sponsored” (and/or related phrases) in it
  5. Grouped with other links with low relevance to each other
  6. Rich anchor text not relevant to page content
  7. External link in navigation
  8. Implemented with no user-visible indication that it’s a link (i.e. no underline)
  9. From a bad class of sites (from an article directory, from a country where you don’t do business, etc.)
  10. …and many other factors

Note that any one of these things isn’t necessarily inherently bad for an individual link, but the algorithm might start to flag sites if a significant portion of all of the links pointing to a given site have some combination of these attributes.

What I outlined above would be a supervised machine learning approach where you train the algorithm with known bad and good links (or sites) that have been identified over the years. Once the algo is trained, you would then run other link examples through it to calculate the probability that each one is a bad link. Based on the percentage of links (and/or total PageRank) coming from bad links, you could then make a decision to lower the site’s rankings, or not.
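To make that pipeline concrete, here’s a toy sketch: score each link on a few binary features, then flag the site if too large a share of its links look bad. The feature names, weights, and thresholds are all invented for illustration; in the real approach, a trained model would learn them from the labeled good/bad links:

```python
import math

# Hypothetical binary link features, echoing the list above.
FEATURES = ["in_footer", "in_sidebar", "near_sponsored", "off_topic_anchor"]

# Stand-in weights a trained model would actually learn; here they just
# encode "more bad signals = more likely a bad link."
WEIGHTS = {"in_footer": 0.8, "in_sidebar": 0.6,
           "near_sponsored": 1.5, "off_topic_anchor": 1.2}
BIAS = -2.0

def p_bad(link):
    """Probability a single link is bad: logistic over its features."""
    score = BIAS + sum(WEIGHTS[f] for f in FEATURES if link.get(f))
    return 1.0 / (1.0 + math.exp(-score))

def site_flagged(links, threshold=0.5, max_bad_fraction=0.3):
    """Flag the site only if too large a share of its links look bad."""
    bad = sum(1 for link in links if p_bad(link) > threshold)
    return bad / len(links) > max_bad_fraction
```

Note how a single bad-looking feature barely moves the score, while several in combination push a link over the threshold, matching the point that no one attribute is inherently damning.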

Another approach to this same problem would be to start with a database of known good links and bad links, and then have the algorithm automatically determine the characteristics (or features) of those links. These features would probably include factors that humans may not have considered on their own.

Panda

Now that you’ve seen the Penguin example, this one should be a bit easier to think about. Here are some things that might be features of sites with poor-quality content:

  1. Small number of words on the page compared to competing pages
  2. Low use of synonyms
  3. Overuse of main keyword of the page (from the title tag)
  4. Large blocks of text isolated at the bottom of the page
  5. Lots of links to unrelated pages
  6. Pages with content scraped from other sites
  7. …and many other factors

Once again, you could start with a known set of good sites and bad sites (from a content perspective) and design an algorithm to determine the common characteristics of those sites.

As with the Penguin discussion above, I’m in no way representing that these are all parts of Panda — they’re just meant to illustrate the overall concept of how it might work.

How machine learning impacts SEO

The key to understanding the impact of machine learning on SEO is understanding what Google (and other search engines) want to use it for. A key insight is that there’s a strong correlation between Google providing high-quality search results and the revenue they get from their ads.

Back in 2009, Bing and Google performed some tests that showed how even introducing small delays into their search results significantly impacted user satisfaction. In addition, those results showed that with lower satisfaction came fewer clicks and lower revenues:

The reason behind this is simple. Google has other sources of competition, and this goes well beyond Bing. Texting friends for their input is one form of competition. So are Facebook, Apple/Siri, and Amazon. Users have alternative sources of information and answers, and those alternatives are working every day to improve the quality of what they offer. So must Google.

I’ve already suggested that machine learning may be a part of Panda and Penguin, and it may well be a part of the “Search Quality” algorithm. And there are likely many more of these types of algorithms to come.

So what does this mean?

Given that higher user satisfaction is of critical importance to Google, it means that content quality and user satisfaction with the content of your pages must now be treated by you as an SEO ranking factor. You’re going to need to measure it, and steadily improve it over time. Some questions to ask yourself include:

  1. Does your page meet the intent of a large percentage of visitors to it? If your page is about a product, do users need help selecting it? Learning how to use it?
  2. What about related intents? If someone comes to your site looking for a specific product, what other related products could they be looking for?
  3. What gaps exist in the content on the page?
  4. Is your page a higher-quality experience than that of your competitors?
  5. What’s your strategy for measuring page performance and improving it over time?

There are many ways that Google can measure how good your page is, and use that to impact rankings. Here are some of them:

  1. When they arrive on your page after clicking on a SERP, how long do they stay? How does that compare to competing pages?
  2. What is the relative rate of CTR on your SERP listing vs. competition?
  3. What volume of brand searches does your business get?
  4. If you have a page for a given product, do you offer thinner or richer content than competing pages?
  5. When users click back to the search results after visiting your page, do they behave like their task was fulfilled? Or do they click on other results or enter followup searches?

For more on how content quality and user satisfaction has become a core SEO factor, please check out the following:

  1. Rand’s presentation on a two-algorithm world
  2. My article on Term Frequency Analysis
  3. My article on Inverse Document Frequency
  4. My article on Content Effectiveness Optimization

Summary

Machine learning is becoming highly prevalent. The barrier to learning basic algorithms is largely gone. All the major players in the tech industry are leveraging it in some manner. Here’s a little bit on what Facebook is doing, and machine learning hiring at Apple. Others are offering platforms to make implementing machine learning easier, such as Microsoft and Amazon.

For people involved in SEO and digital marketing, you can expect that these major players are going to get better and better at leveraging these algorithms to help them meet their goals. That’s why it will be of critical importance to tune your strategies to align with the goals of those organizations.

In the case of SEO, machine learning will steadily increase the importance of content quality and user experience over time. For you, that makes it time to get on board and make these factors a key part of your overall SEO strategy.

Sign up for The Moz Top 10, a semimonthly mailer updating you on the top ten hottest pieces of SEO news, tips, and rad links uncovered by the Moz team. Think of it as your exclusive digest of stuff you don’t have time to hunt down but want to read!


Moz Blog

February 4, 2016 · Posted in: SEO / Traffic / Marketing

New and Improved Local Search Expert Quiz: What’s Up with Local SEO in 2016?

Posted by Isla_McKetta

Think you’re up on the latest developments in local SEO?

One year ago we asked you to test your local SEO knowledge with the Local Search Expert Quiz. Because the SERPs are changing so fast and (according to our latest Industry Survey) over 42% of online marketers report spending more time on local search in the past 12 months, we’ve created an updated version.

Written by local search expert Miriam Ellis, the quiz contains 40 questions designed to test both your general local SEO knowledge and your industry awareness. Bonus? The quiz takes less than 10 minutes to complete.

Ready to get started? When you are finished, we’ll automatically score your quiz.

Rating your score

Although the Local Search Expert Quiz is just for fun, we’ve established the following guidelines for bragging rights:

  • 0–14 Newbie: Time to study up on your citation data!
  • 15–23 Beginner: Good job, but you’re not quite in the 3-pack yet.
  • 24–29 Intermediate: You’re getting close to the centroid!
  • 30–34 Pro: Let’s tackle multi-location!
  • 35–40 Guru: We all bow down to your local awesomeness.

Resources to improve your performance

Didn’t get the score you hoped for? We’ve included all the correct answers and references here. Or, brush up on your local SEO knowledge with this collection of free learning resources:

  1. The Moz Local Learning Center
  2. Glossary of Local Search Terms and Definitions
  3. Guidelines for Representing Your Business on Google
  4. Local Search Ranking Factors
  5. Blumenthal’s Blog
  6. Local SEO Guide
  7. Whitespark Blog

You can also learn the latest local search tips and tricks by signing up for the MozCon Local one-day conference, subscribing to the Moz Local Top 7 newsletter, or reading local SEO posts on the Moz Blog.

Don’t forget to brag about your local search expertise in the comments below!

This post has been edited to include a link to the answer sheet.


February 3, 2016 · Posted in: SEO / Traffic / Marketing

Moz Content Launches New Tiers and Google Analytics Integration

Posted by JayLeary

When we launched Moz Content at the end of November, we limited the subscription to a single tier. At the time we wanted to get the product out in the wild and highlight the importance of content auditing, competitive research, and data-driven content strategy. To all of those early adopters who signed up as Strategists over the last few months, a big thanks.

Since then, we’ve made a long list of improvements to both Audit performance/stability and the size of our Content Search index. Along with those updates we also added two new subscriptions for larger sites and agencies handling multiple clients: Teams and Agencies. The new tiers not only increase page and Audit limits, but also enable Google Analytics integration with Tracked Audits.

More data for Tracked Audits

For those that aren’t familiar, Tracked Audits let you trend and monitor the performance of your site’s content over time. On the backend, Moz Content re-audits your site every week in order to discover new pages and update site metrics. This allows you to compare, say, the average shares per article across your entire site from week-to-week or month-to-month.

To date, Moz Content Audits have focused on links and shares. We did that intentionally, since a major goal for the product was to enable the analysis of any site on the web, including competitors. We also wanted to give agencies the freedom to prospect or audit clients without painful integrations or code snippets.

That said (and I’m guessing you’ll agree), figuring out your content ROI usually requires a deeper look at site performance.

With Moz Content, we stopped short of a full, conversion-focused analysis suite with custom tracking code and the like. Instead we’re focused on a product that delivers content analysis and insights to our entire community of online marketers — not just the select few that can afford it. Besides, there are plenty of big-ticket tools out there filling the niche, great products like Newscred, Kapost, SimpleReach, and Idio.

So while we’re not jumping into the “enterprise” ring (just yet), we did want to give data-driven marketers a leg up in their analysis. The good news: if you’re a Teams or Agencies subscriber, you now have the added option of integrating Google Analytics with Tracked Audits.

It’s easy to get started, and if you’re a Moz Pro subscriber you’re probably familiar with the authentication flow. Just go to any Tracked Audit you have GA data for, scroll down to the middle of the report, and click “Connect Google Analytics”:

authentication.png

Once you’ve connected a profile, Moz Content immediately pulls in key metrics for each URL in the Inventory:

inventory.png

You’ll probably notice we’re only displaying page views in the inventory. While we didn’t have the real estate to include all of the metrics we’re collecting in the interface, we’ve added them to the CSV export:

csv.png

Once you dump the data to Excel you’ll see the following metrics for the Organic and Paid segments, as well as a rollup of all referrers:

  • Unique Page Views
  • All Page Views
  • Time on Page
  • Bounce Rate
  • Page Value

After we’ve collected multiple audits, Moz Content also starts trending aggregated metrics so you can get a sense of performance over time:

graph.png

We’re hoping this added reporting gets you a step closer to that all-important ROI analysis. Some of you will already have a sense of how much a page view is worth, or the impact of a unique page view on a specific conversion. And for the Google Analytics pros out there, properly configuring Page Value will give you a direct indicator of a page’s effectiveness.

This is just the beginning, and we’d love to hear about other ways we could make the GA data more useful. Please reach out and let us know what you think by clicking the round, blue Help button in the lower-right corner of the Moz Content app.

More to come!

We’re pushing out code every day to improve the app experience and build on the current features. We’re also growing our Content Search index with new sources of popular, trending content. If you haven’t tried it for a while, we encourage you to take a second look. Head over to http://moz.com/content and start a search or enter a domain to preview an Audit (be sure to sign into your community account to access both the Dashboard and increased limits).

On the feature front, we’re pushing to integrate Twitter data into both the Audit and Content Search. As I’m sure you’ve noticed, this has been missing in our reporting to date. While we won’t have exact share counts for individual articles (see Twitter’s decision to deprecate the share count endpoint), we’re confident we can provide related information that’ll be useful for your Twitter analysis.

As we develop this and other features, we’re always on the lookout for feedback from our community. As always, feel free to reach out to our Help team with any issues or feedback about the product. And if you’ve used Moz Content and are interested in beta testing the latest, shoot us an email at mozcontent+testing@moz.com and we’ll add you to the list.


February 2, 2016 · Posted in: SEO / Traffic / Marketing

What’s It Take to "Go Viral"? 11 Traits to Give Your Idea Wings

Posted by mattround

Before joining Distilled I worked for UsVsTh3m, an experimental Trinity Mirror project, where we created hundreds of games, quizzes and daft “toys.” We had unprecedented freedom to try out new interactive formats, learning a great deal about what works… and what doesn’t.

The key to success was “viral” traffic. You’ve probably heard the term bandied about in reference to something popular, and might even have rolled your eyes; it’s a much-abused buzzword.

The idea is that online word-of-mouth can drive exponential traffic growth and broad media coverage with little or no traditional promotional support, but achieving this requires a certain way of thinking. This article focuses on interactive content, but many of the same principles will apply to other formats.

The viral life cycle

It’s useful to aim for interactive content to be…

  1. Clickable — When someone sees a link and description (on social media or a site), it seems compelling enough to take a look.
  2. Playable — The visitor sticks with it and finds it enjoyable or interesting.
  3. Shareable — There’s a strong urge to tell others, often involving the visitor sharing their individual result/score.

You usually need all three aspects to be strong to get a viral hit. It’s easy to focus on one, and an experienced team can usually achieve two, but it’s difficult to consistently nail the full set.

Crudely, you can think of it in terms of losing potential sharers at each stage; for the content to spread, each cycle ultimately needs to produce more than one new viewer per existing viewer. This image explains it nicely:

Congratulations, it’s going viral! That’s a massive simplification, but a helpful one.
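That simplification can be made concrete with a toy "viral coefficient" model: multiply the funnel rates together, and whether the product lands above or below 1 decides growth or fizzle. All the rates below are invented for illustration:

```python
# Toy model of the click -> play -> share funnel described above.
def viral_coefficient(click_rate, play_rate, share_rate, reach_per_share):
    """New viewers generated per existing viewer, per cycle."""
    return click_rate * play_rate * share_rate * reach_per_share

def simulate(seed_views, coefficient, cycles):
    """Total views after a number of sharing cycles."""
    views, total = seed_views, seed_views
    for _ in range(cycles):
        views = views * coefficient  # each cycle multiplies the audience
        total += views
    return total

# Coefficient above 1: each cycle grows, so traffic compounds.
hit = simulate(500, viral_coefficient(0.3, 0.8, 0.5, 10.0), 10)
# Coefficient below 1: the same seed audience fizzles out.
dud = simulate(500, viral_coefficient(0.3, 0.8, 0.2, 10.0), 10)
```

This is why small improvements to clickability or shareability matter so much: nudging one rate can move the whole product across the growth threshold.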

11 ways to make it shareable

1. Attributes

Develop a concept that ties in with the player’s personal attributes: age, location, abilities, personality, etc. For example, measuring reaction time in milliseconds is fine… but if you can correlate it with age, then you’ve immediately got something far more compelling.

2. Tribes

Reinforce a sense of belonging; tribes can be regional, generational, interest-based, political, etc. Perhaps play different tribes off against each other so that your interactive content can address niche groups while having broad appeal overall.

3. Insights

It tells you something about yourself or, more likely, confirms a flattering/intriguing attribute, leading into…

4. Humblebrags

Sharing to make yourself look good… but without it seeming too blatant.

5. “One more go…”

Ensure the player is hooked and will want others to share in that. Although bear in mind that the best games often aren’t the most viral — adding multiple levels and features to a game tends to put off non-gamers and can actually reduce sharing (enthusiasm has a chance to ebb away, and the game will tend to end on a low note when the player finally fails or quits).

6. Topical

People are impressed by fast-turnaround topical content, and sharing it can show you’re up-to-date (perhaps even the first in your social circle to discover something). We regularly developed and launched games in half a day at UsVsTh3m, and more than once within an hour. This obviously isn’t feasible for most commercial projects, but with more agile development and approval processes you can reduce lead times.

7. Delight

Overwhelm the senses: strong use of music, dance, animation, spectacular explosions, anything that’s a straightforward pleasure.

8. Competition

“Can you beat this score?”

9. Comparison

“I got this result, how about you?” This is much “softer” than direct competition and typically more welcoming for a broad audience.

10. Collaborating

Things like global counters, polls, or territorial maps can create the sense of playing your part in a bigger cause. Even just clicking something to increment a number can be made hard to resist with the right “cause.”

11. Quality

It’s still possible for something to succeed simply by being good, but in the absence of any other aspects it’d better be really good. Knock-your-socks-off good.

Of course, all of the above overlap and interrelate, and it’s by no means an exhaustive list.

Telling the world

Something that’s strongly viral can actually just be exposed to a few hundred people via Twitter or Facebook. It won’t need a big push; the viral mechanism will ensure it spreads and attracts media attention.

It’s often useful to accompany a launch with relevant press material, perhaps teasing out key angles or supplementary content/data to suit each type of media outlet. Don’t force a story if there isn’t one, though; you don’t want to jeopardize later coverage based around “this cool thing is going viral.”

If the stats are showing it’s strongly going viral (this should be obvious within minutes), you’re then in the fortunate position of planning for success. Keep an eye out for initial coverage that may benefit from additional material, and look to do a follow-up press campaign at a suitable milestone (e.g. at X million visitors, or when you have interesting data to share), broadening the coverage.

If it’s not going viral, stop and consider whether minor changes to wording might make it more clickable. Look at whether it needs to break into a niche audience or broaden its appeal, and retarget accordingly. Although Twitter drives far less traffic than Facebook, it offers more freedom to experiment, target influential individuals, and re-promote over time. If a topical angle may arise, perhaps wait and be ready to repackage and relaunch at a moment’s notice.

Case studies: Two simple games that went viral

The North-o-Meter

UsVsTh3m’s North-o-Meter (sadly, this is currently broken due to hosting issues) used multiple-choice questions to guesstimate how Northern/Southern you are. Despite being entirely UK-focused, within just 4 days of launch it had 3.6 million visitors, 1 million Likes, 1.1 million comments on Facebook, and 41,000 tweets. It went on to get millions more visits, virtually saturating the potential audience. Countless similar quizzes had used this topic before, so why did this one make such a big impact?

  • It was clickable because the wording of tweets and Facebook posts worked well, teasing the Northern/Southern cultural identity element in a way that seemed intriguing and non-threatening.
  • It was playable thanks to working well on mobile (people were playing and comparing scores late into Friday nights down the pub), being easy to play and giving constant visual feedback, unlike many similar things that simply ape static magazine personality quizzes.
  • It was shareable by tapping into attributes (location/origin), tribes (north vs. south), insights (using mundane questions to infer something greater), competition, comparison, and quality (the visual feedback and often surprisingly accurate conclusions).
  • Northern/Southern cultural identity is immensely strong in the UK. It’s a key part of how many people define themselves.
  • The whole quiz was grounded in honest personal experience. One of our young journalists had written about moving to London, and the way it resonated with people led us to think about how to apply that to an interactive format.
  • Naming a specific place to go with the percentage meant it sometimes got the player’s location/origin spot-on, so they were then likely to share it in a very enthusiastic way.


How Old Are Your Reactions?

How Old Are Your Reactions?, produced by Distilled for JustPark, is a simple web game where you stop a car with a tap/click. Your reaction time is then used to look up the corresponding age for that score, based on a survey of 2,000 players.

Our thinking beforehand was that it would work well due to the following aspects:

  • It would be clickable by setting an intriguing personal challenge.
  • It would be playable thanks to clear, quick gameplay and good presentation, including full mobile compatibility.
  • It would be shareable due to attributes (age and reactions), insights (inferring age from reactions), humblebrags (impressively young age result), “one more go…” (few will play it just once), competition, comparison, and quality.
  • Age is a key personal attribute, and age estimation prompts a great deal of conversation and comparison, whether the result is accurate or lower/higher than the player’s actual age. Lively conversations on Facebook help ensure visibility.
  • Driving is a relevant, relatable way to dress the game up, particularly for this brand. A straightforward, bare-bones reactions test would have been “colder” and less engaging.
  • The combination of elements would allow for multiple storytelling angles in coverage, to do with good-natured rivalry between generations, road safety, etc.

This all seems to have been borne out by the stats since launch: Over 3 million unique page views, nearly 300,000 social shares, and links from over 400 domains.

In summary…

Always ask yourself:

  • How can we make it clickable, playable, and shareable? Judge your ideas harshly — you need all three.
  • Which sharing impulses can it tap into? It should be possible to readily pick out a few motivations, or refine the concept to strengthen this aspect.
  • What will be the best way to capitalize on success? Be ready to build a story around it, using popularity as the foundation for broad and varied coverage.

The way people share and interact is constantly changing, and reaching large audiences is always challenging, but the approaches I’ve outlined can help you to devise interactive content that’ll have a great shot at going viral.

Sign up for The Moz Top 10, a semimonthly mailer updating you on the top ten hottest pieces of SEO news, tips, and rad links uncovered by the Moz team. Think of it as your exclusive digest of stuff you don’t have time to hunt down but want to read!


Moz Blog

February 1, 2016  ·  Posted in: SEO / Traffic / Marketing

The Most Important Things We Learned About Google’s Panda Algo

Posted by jenstar

Webmasters were caught by surprise two weeks ago when Google released a number of new statements about its Panda algorithm to The SEM Post. Google tends to be rather quiet about its search algorithms, so these comments were a departure from tradition: Google was quite transparent and shared a lot of Panda-related information that many SEOs weren’t aware of.

Here are what I consider to be the top new takeaways from Google about the Panda algorithm. These are all things that SEOs can put into action, either to create new, great-quality content or to increase the quality value of their current content.

First, the Panda algorithm is specifically about content. It’s not about links, it’s not about mobile-friendliness, it’s not about having an HTTPS site. Rather, the Panda algorithm rewards great-quality content by demoting content that’s either quite spammy in nature or that’s simply not very good.

Now, here are the most important things you should know about Panda, including some of the mistakes and misconceptions about the algorithm that have confused even expert SEOs.

Removing content Google considers good

One big issue is that many SEOs have been promoting the widespread removal of content from websites hit by Panda. What many webmasters don’t realize, however, is that they could be shooting themselves in the foot by doing this.

When performing content audits, many penalty experts will cut a wide swath through the site’s content and remove it. Whether the claim is that X% of content needs to be removed to recover from Panda, or that older, less fresh content needs to go, doing this without proper research can cause rankings to decrease even further. It’s never a “surefire Panda recovery tactic,” despite what some might say.

Unfortunately for SEOs, there’s no magic formula to recover from Panda when it comes to the quantity, age, or length of the content on the site. Instead, you need to look at each page to determine its value. The last thing you want to do is remove pages that are actually helping.

Fortunately, we have the tools to be able to determine the “good versus bad” when it comes to figuring out what Google considers quality. And the answer is in both Google Analytics (or whatever your preferred site analytics program is) and in Google Search Console.

If Google is sending traffic to a page, then it considers it quality enough to rank. If you were going to remove one of these pages because it was written a few years ago or because it was below a magic word count threshold, you would lose all the future traffic Google would send to that page.

If you’re determined to remove content, at least verify that Google isn’t sending those pages traffic before you add to your Panda problems by losing more traffic.
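If you want to automate that sanity check, a minimal sketch along these lines can flag removal candidates from a Search Console performance export. The CSV shape is modeled on the standard “Pages” export, but the URLs, numbers, and threshold here are assumptions for illustration:

```python
import csv
from io import StringIO

# Hypothetical data in the shape of a Search Console "Pages" export;
# the URLs and figures are made up for illustration.
SAMPLE = """\
Page,Clicks,Impressions
https://example.com/guide,120,1500
https://example.com/old-post,0,40
https://example.com/news-2013,0,3
"""

def pages_without_clicks(csv_text, min_clicks=1):
    """Return pages receiving fewer than min_clicks clicks from Google search."""
    reader = csv.DictReader(StringIO(csv_text))
    return [row["Page"] for row in reader if int(row["Clicks"]) < min_clicks]

# Only zero-click pages are even candidates for removal; anything Google
# is already sending traffic to should stay (or be improved instead).
print(pages_without_clicks(SAMPLE))
```

Run against a real export, the pages this prints are the only ones worth a closer look before pruning.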

Your content should match the search query

We all laugh when we look in our Google Search Console Search Analytics and see the funny keywords people search for. However, part of providing quality content is also delivering on those content expectations. In other words, if a search is repeatedly bringing visitors to a specific page, you’ll want to make sure that page delivers the promised content.

From the Panda Algo Guide:

A Google spokesperson also took it a step further and suggested using it also to identify pages where the search query isn’t quite matching the delivered content. “If you believe your site is affected by the Panda algorithm, in Search Console’s Search Analytics feature you can identify the queries which lead to pages that provide overly vague information or don’t seem to satisfy the user need for a query.”

So if your site has been impacted by Panda — or you’re concerned it might be and want to be proactive — start matching up popular queries with their pages, making sure you’re fully delivering on those content query expectations. While this won’t be as big of a concern for sites not impacted by Panda, it’s something to keep in mind if you do notice those “odd” keywords popping up with frequency.
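One rough way to surface such pages at scale is to scan a Search Console queries export for query/page pairs with lots of impressions but almost no clicks; a very low CTR can hint that the page isn’t delivering what the query promises. The CSV columns, thresholds, and data below are illustrative assumptions, not anything prescribed by Google:

```python
import csv
from io import StringIO

# Hypothetical sample in the shape of a Search Console "Queries" export.
SAMPLE = """\
Query,Page,Clicks,Impressions
best widget reviews,https://example.com/widgets,5,2000
widget installation guide,https://example.com/widgets,300,2500
blue widgets,https://example.com/blue,40,90
"""

def mismatch_candidates(csv_text, min_impressions=500, max_ctr=0.01):
    """Flag query/page pairs shown often but rarely clicked --
    a hint the page may not satisfy that query."""
    flagged = []
    for row in csv.DictReader(StringIO(csv_text)):
        clicks, impressions = int(row["Clicks"]), int(row["Impressions"])
        if impressions >= min_impressions and clicks / impressions <= max_ctr:
            flagged.append((row["Query"], row["Page"]))
    return flagged

print(mismatch_candidates(SAMPLE))
```

Each flagged pair is a prompt to reread the page with that query in mind, not an automatic verdict.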

Ensuring your content matches the query is also one of the easiest Panda fixes you can do, although it might take some legwork to spot those queries that under-deliver. Often, it’s just a matter of slightly tweaking a paragraph or two, or adding an additional few paragraphs to change the content for those queries from “meh” to “awesome.” And if you deliver that content on the visitor’s landing page, it means they’re more likely to stick around, view more of your content, and share it with others — rather than hitting the back button to find a page that does answer their query.

Fixable? Or kill it with fire?

“Fixing” versus “removing” is another area where many experts disagree. Luckily, it’s been one of the areas that Google has been pretty vocal about if you know where to find those comments.

Google has been a longtime advocate of fixing poor quality content. Both Gary Illyes and John Mueller have repeatedly talked about improving the quality of content.

In a hangout, John Mueller said:

Overall, the quality of the site should be significantly improved so we can trust the content. Sometimes what we see with a site like that will have a lot of thin content, maybe there’s content you are aggregating from other sources, maybe there’s user-generated content where people are submitting articles that are kind of low quality, and those are all the things you might want to look at and say, what can I do; on the one hand, if I want to keep these articles, maybe prevent these from appearing in search.

Now, there are always edge cases, and this is what many experts get hung up on. The important thing to remember is that Google’s not talking about those weird, random edge cases, but rather what applies to most websites. Is it forum spam for the latest and greatest Uggs seller? Of course, you’ll want to remove or noindex it. But if it’s the content you hired your next-door neighbor to write for you, or “original” content you bought off of Fiverr? Improve it instead.

If you do have thin content that you’ll want to upgrade in the future, you can always noindex it for now. If it’s not indexable by Google, it’s not going to hurt you, from a Panda perspective. However, it’s important to note that you still need to have enough quality content on your site, even if you’re noindexing or removing the bad stuff.
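For reference, a page-level noindex is a single tag in the page’s head section (shown here on a hypothetical thin page you plan to improve later):

```html
<!-- Keeps the page live for visitors but out of Google's index -->
<meta name="robots" content="noindex">
```

Once the content is upgraded, simply remove the tag and let Google recrawl the page.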

This is also what Google recommended in the Panda Algo Guide:

A Google spokesperson also said this, when referring to lower quality pages. “Instead of deleting those pages, your goal should be to create pages that don’t fall in that category: pages that provide unique value for your users who would trust your site in the future when they see it in the results.”

Still determined to remove it after checking all the facts? Gary Illyes gave suggestions during his keynote at Pubcon last year on how to remove thin content properly.

Ranking with Panda

One of the most surprising revelations from Google is that sites can still rank while being affected by Panda. While Panda often impacts an entire site, and this is probably true in the majority of cases, it’s possible for only some pages to be negatively impacted. This is yet another reason you want to be careful when removing pages.

From the Panda Algo Guide:

What most people are seeing are sites that have content that is overwhelmingly poor quality, so it can seem that an entire site is affected. But if a site does have quality content on a page, those pages can continue to rank.

A Google spokesperson confirmed this as well.

The Panda algorithm may continue to show such a site for more specific and highly-relevant queries, but its visibility will be reduced for queries where the site owner’s benefit is disproportionate to the user’s benefit.

This comment reinforces the idea from Google that a key part of Panda is where Google feels the site owner is getting the most benefit from a visitor to their site, rather than vice-versa.

Duplicate content

One of the first things webmasters do when they get hit by Panda is freak out over duplicate content. And while managing your duplicate content is always a good idea from a technical standpoint, it doesn’t actually play any role in Panda, as John Mueller confirmed late last year.

And even then, John Mueller described fixing duplicate content on a priority scale as “somewhere in the sidebar or even quite low on the list.” In other words, focus on what Panda is impacting first, then clean up the non-Panda related technical details at the end.

Bottom line: Duplicate content can certainly affect your SEO. But from a Panda perspective, if your main focus is on getting your site ranking well again in Google after a Panda hit, leave it until the end. Google is usually pretty good about sorting it out, and if not, it’s fixable with either some redirects or canonicals.
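For reference, the rel=canonical fix is a one-line tag on the duplicate page pointing Google at the preferred version (the URL here is hypothetical); the alternative is a server-side 301 redirect:

```html
<!-- On the duplicate page, identifying the preferred version: -->
<link rel="canonical" href="https://example.com/preferred-page/">
```

A canonical keeps both URLs reachable while consolidating ranking signals; a 301 is the stronger choice when the duplicate URL no longer needs to exist.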

Word count

Many webmasters fixate on the idea that content has to be a certain number of words to be deemed “Panda-proof.” There are plenty of instances of thousand-word articles that are extremely poor quality, and other examples of content so great that even having only a hundred or so words will trigger a featured snippet… something Google tends to give only to higher-quality sites.

Now, if you’re writing content, there’s nothing wrong with trying to set up certain benchmarks for the number of words — especially if you have contributors or you’re hiring writers. There’s no issue with that. The issue is with falsely believing that word count is related to quality, both in Google’s eyes and from the Panda algo perspective.

It’s very dangerous to assume that an article or post under a specific word count needs to be removed or improved. Instead, as when considering whether to remove content, look at whether Google is sending referrals to those pages. If they’re ranking and receiving traffic from Google, word count is not an issue.

Advertising & affiliate links

The role that advertising and affiliate links play in Google Panda is an interesting one, and it’s a topic John Mueller from Google has brought up in his Google Hangouts as well. This isn’t to say that all advertising or all affiliate links are bad. The problem is the content surrounding them: how much there is and what it’s like.

Where there’s an impact is in the amount of advertising and affiliate links relative to the content. Will Google consider a page that’s essentially just affiliate links, without any quality content, as good? Probably not. But it’s not that Panda specifically targets ads or affiliate content; there are lots of excellent affiliate sites out there that rank really well and aren’t affected by Panda whatsoever.

The problem lies in the disconnect between the balance of useful content and monetization. At Pubcon, Gary Illyes said the value to the visitor should be higher than the value to the site owner. But as we see on many sites, that balance has tipped the other way, where the visitor is seen merely as a means of revenue, without concern about giving that visitor any value back.

You don’t need to hit your visitors over the head with a huge amount of advertising and affiliate links to make money. That visitor brings a lot of additional value to your site when they don’t feel your site is too ad heavy. From the Panda Algo Guide:

There are also benefits from traffic even if it doesn’t convert into a click on an affiliate link. Maybe they share it on social media, maybe they recommend it to someone, or they return at a later time, remembering the good user experience from the previous visit.

A Google spokesperson also said, “Users not only remember but also voluntarily spread the word about the quality of the site, because the content is produced with care, it’s original, and shows that the author is truly an expert in the topic of the site.” And this is where many affiliate sites run into problems.

There’s another thing that often happens when a website is hit by Panda: naturally, revenue from the ads already on the site goes down. Unfortunately, the response to this loss of revenue is often to increase the number of ads or affiliate links to compensate. But this degrades the value of the content even further and, despite being a common knee-jerk reaction, is not the appropriate move in a Panda-busting plan.

Bottom line: There is absolutely nothing wrong with having advertising or affiliate links on a site. That alone won’t cause a Panda issue. What can cause a Panda issue, rather, is how and how much you present these things. Ads and affiliate links should support your content, not overwhelm it.

User-generated content

What about user-generated content? Sadly, it’s getting a pretty bad rap these days. But it’s earned that reputation from the crappy user-generated content out there, not from the high-quality user-generated content you see on many sites. Many so-called experts advise removing all user-generated content, and again, that’s one of those moves that can negatively impact your site.

Instead, look at the actual user-generated content you have on your site and decide whether or not it’s quality. For example, YouMoz is considered fairly high-quality user-generated content: all posts still have to be approved by editors, and only a small percentage of submitted articles make it live on the site. Even then, the editors also work to improve and edit the pieces as necessary, ensuring that even though it’s user-generated content, it’s still high quality.

But like any content on the web, user-generated or not, there are different levels of quality. If your user-generated content quality is very high, then you have nothing to worry about. You could have a different contributor for every single article if you wanted to. It has nothing to do with how you obtained the content for your site, but rather how high-quality and valuable that content is.

Likewise, with forums or community-driven sites where all the content is user-contributed, it’s about how high-quality that content is, not about who contributes it. Sites like Stack Overflow have hundreds of thousands of contributors, yet the site is considered very high-quality and does extremely well in Google’s search results.

If your user-generated content has both high points and low points regarding quality, there are a few actions Google recommends so that the lower-quality content doesn’t drag down the entire site. John Mueller said that if you can recognize the types of lower-quality content on the forum, or the patterns that tend to match it, you can block it from being indexed by Google. This might mean noindexing your welcome forum where people post introductions about themselves, or blocking the chitchat forums while leaving the helpful Q&A indexable.

And, of course, you need to deal with any spam in your user-generated content, whether it’s something like YouMoz or a forum for people who all love a specific hobby. Have good guidelines in place to prevent your active users from spamming or link-dropping. And use some of the many forum add-ons that identify and remove spam before Google can even see it.

Do not follow the advice of those who say all user-generated content is bad… it’s not. Just ensure that it’s high quality, and you won’t have a problem with Panda from the start.

Commenting

You may have noticed a trend lately: Many blogs and news sites are removing comments from their sites completely. When you do this, though, you’re removing a signal that Google can use that shows how well people are responding to your content. Like any content, comments aren’t all bad simply because they’re comments — their quality is the deciding factor, and this will vary.

And it’s not just the Google perspective that dictates why you should keep them. Having a comment section can keep visitors coming back to your site to check for new commentary, and it can often offer additional insights and viewpoints on the content. Communities can even form around comment sections. And, of course, it adds more content.

But, like user-generated content, you need to make sure you’re keeping it high quality. Have a good comments policy in place; if you’re in doubt, don’t approve the comment. Your goal is to keep those comments high-quality, and if there’s any suspicion (such as a username of “Buy Keyword Now,” or it’s nothing more than an “I agree” comment), just don’t allow it.

That said, allowing low-quality comments can affect the site, something John Mueller has confirmed. I wouldn’t panic over a handful of low-quality comments, but if the overall value of the comments is pretty low, you probably want to weed them out, keep the high-quality comments, and be a little bit more discriminating going forward.

Technical issues

No, technical issues do not cause Panda. However, it’s still a widespread belief that things like page speed, duplicate content, or even what TLD the site is on can have an impact on Panda. This is not accurate at all.

That said, these kinds of technical issues do have an impact on your overall rankings — just not for Panda reasons. So, it’s best practice to ensure your page speed is good, you’re not running long redirect chains, and your URL structure is good; all these things do affect your overall SEO with Google’s core algorithm. With regards to recovering from Panda, though, it doesn’t have an impact at all.

“Core” Algo

One of the surprises was the revelation, disclosed to The SEM Post, that Panda is now part of Google’s core algorithm. But what does this mean? Is it even important to the average SEO?

The answer is no. Previously, Panda was a filter added after the core search algo. Now, while it’s moved to become part of that core algo, Panda itself is essentially the same, and it still impacts websites the same way.

Google confirmed the same. Gary Illyes from Google commented on it being one of the worst takeaways from all the Panda news.

A2. I think this is the worst takeway of the past few days, but imagine an engine of a car. It used to be that there was no starter (https://en.wikipedia.org/wiki/Starter_(engine)), the driver had to go in front of the car, and use some tool to start the engine. Today we have starters in any petrol engine, it’s integrated. It became more convenient, but essentially nothing changed.

For a user or even a webmaster it should not matter at all which components live where, it’s really irrelevant, and that’s why I think people should focus on these “interesting” things less.

It really doesn’t make a difference from an SEO’s perspective, despite the initial speculation it might have.

Overall

Google released a lot of great Panda information last week, and all of it contained advice that SEOs can put into action immediately — whether to ensure their site is Panda-proofed, or to fix a site that had been slapped by Panda previously.

The bottom line: Create high-quality content for your websites, and you won’t have to worry about Pandas.



Moz Blog

January 30, 2016  ·  Posted in: SEO / Traffic / Marketing


