Replicate Google’s Panda Questionnaire – Whiteboard Friday

Posted by caseyhen

Want to avoid the next Panda Update and improve your website’s quality? This week Will Critchlow from Distilled joins Rand to discuss an amazing idea of Will’s to help those who are having problems with Panda and others who want to avoid future updates. Feel free to leave your thoughts on his idea and anything you might do to avoid Panda.

 

Video Transcription

Rand: Howdy, SEOmoz fans. Welcome to a very special edition of Whiteboard Friday. I am joined today by Will Critchlow, founder and Director of Distilled, now in three cities – New York, Seattle, London. My God, 36 or 37 people at Distilled?

Will: That’s right. Yeah, it’s very exciting.

Rand: Absolutely amazing. Congratulations on all the success.

Will: Thank you.

Rand: Will, despite the success that Distilled is having, there are a lot of people on the Web who have been suffering lately.

Will: It’s been painful.

Rand: Yeah. What we’re talking about today is this brilliant idea that you came up with, which is essentially to replicate Google’s Panda questionnaire, send it out to people, and help them essentially improve their site, make suggestions for management, for content producers, content creators, for people on the Web to improve their sites through the same sort of search signals that Panda’s getting.

Will: That’s right. I would say actually the core thing of this, what I was trying to do, is persuade management. This isn’t necessarily about things that we as Internet marketers don’t know. We could just look at the site and tell people this, but that doesn’t persuade a boss or a client necessarily. So a big part of this was about persuasion as well.

So, background, I guess, people probably know but Google gave this questionnaire to a bunch, I think they used students mainly to assess a bunch of websites, then ran machine learning algorithms over the top of that so that they could algorithmically determine the answer.

Rand: Take a bunch of metrics from maybe user and usage data, from possibly link data, although it doesn’t feel like link data, but certainly onsite analysis, social signals, whatever they’ve got. Run these over these pages that had been marked as good or bad, classified in some way by Panda questionnaire takers, and then produce results that would push down the bad ones, push up the good ones, and we have Panda, which changed 12% of search results in the U.S.

Will: Yeah, something like that.

Rand: And possibly more.

Will: And repeatedly now, right? Panda two point whatever and so forth. So, yeah, and of course, we don’t know exactly what questions Google asked, but . . .

Rand: Did you try to find out?

Will: Obviously. No luck yet. I’ll let you know if I do. But there’s a load of hints. In fact, Google themselves have released a lot of these questions.

Rand: That’s true. They talked about it in the Wired article.

Will: They did. There have been some that have come out on Search Engine Land I think as well. There have been some that have come out on Twitter. People have referred to different kinds of questions.

Rand: Interesting. So you took these and aggregated them.

Will: Yeah. So I just tried to pull . . . I actually ignored quite a chunk that I found because they were hard to turn into questions that I could phrase well for the kinds of people I knew I was going to be sending this questionnaire to. Maybe I’ll write some more about that in the accompanying notes.

Rand: Okay.

Will: I basically ended up with some of these questions that were easy to have yes/no answers for anybody. I could just send it to a URL and say, "Yes or no?"

Rand: Huh, interesting. So, basically, I have a list of page level and domain level questions that I ask my survey takers here. I put this into a survey, and I send people through some sort of system. We’ll talk about Mechanical Turk in a second. Then, essentially, they’ll grade my pages for me. I can have dozens of people do this, and then I can show it to management and say, "See, people don’t think this is high enough quality. This isn’t going to get past the Panda filter. You’re in jeopardy."

Will: That’s right. The first time I actually did this, because I wasn’t really sure whether this was going to be persuasive or useful even, so I did it through a questionnaire I got together and sent it to a small number of people and got really high agreement. Out of the 20 people I sent the questionnaire to, for most questions you’d either see complete disagreement, complete disarray, basically people saying don’t know, or you’d see 18 out of 20 saying yes or 18 out of 20 saying no.

Rand: Wow.

Will: With those kind of numbers, you don’t need to ask 100 people or 1,000 people.

Rand: Right. That’s statistically valid.

Will: This is looking like people think this.

Rand: People think this article contains obvious errors.
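As a rough illustration of the agreement Will describes, the yes / no / don’t know answers for one question can be tallied and checked against a threshold. This is only a sketch: the function names, the 80% threshold, and the use of a Wilson score interval are assumptions for illustration, not anything Will or Google is known to have used.

```python
from collections import Counter
from math import sqrt


def summarize(answers, threshold=0.8):
    """Return the most common answer, its share, and whether it clears the threshold.

    The 0.8 threshold is an illustrative assumption, not a figure from the transcript.
    """
    counts = Counter(answers)
    top_answer, top_count = counts.most_common(1)[0]
    share = top_count / len(answers)
    return top_answer, share, share >= threshold


def wilson_interval(successes, total, z=1.96):
    """Approximate 95% Wilson score interval for a proportion."""
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half, centre + half


# 18 of 20 respondents say "yes", as in Will's example.
answers = ["yes"] * 18 + ["no"] + ["don't know"]
print(summarize(answers))
print(wilson_interval(18, 20))
```

For 18 out of 20, the interval comes out at roughly 0.70 to 0.97, so even with only 20 respondents a clear majority view is well supported, which is the point being made here.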

Will: Right. Exactly. So I felt like straight away that was quite compelling to me. So I just put it into a couple of charts in a deck, took it into the client meeting, and they practically redesigned that "catch me" page in that meeting because the head of marketing and the CEO were like okay, yeah.

Rand: That’s fantastic. So let’s share with people some of these questions.

Will: And they’re simple, right, dead simple.

Rand: So what are the page level ones?

Will: Page level, what I would do is typically find a page of content, a decent, good page of content on the site, and Google may well have done this differently, but all I did was say find a recent, good, well presented, nothing desperately wrong with it versus the rest of the content on the site. So I’m not trying to find a broken page. I’m just trying to say here’s a page.

Rand: Give me something average and representative.

Will: Right. So, from SEOmoz, I would pick a recent blog post, for example.

Rand: Okay, great.

Will: Then I would ask these questions. The answers were: yes, no, don’t know.

Rand: Gotcha.

Will: That’s what I gave people. Would you trust the information presented here?

Rand: Makes tons of sense.

Will: It’s straightforward.

Rand: Easy.

Will: Is this article written by an expert? That is deliberately, vaguely worded, I think, because it’s not saying are you certain this article’s written by an expert? But equally, it doesn’t say do you think this article . . . people can interpret that in different ways, but what was interesting was, again, high agreement.

Rand: Wow.

Will: So people would either say yes, I think it is. Or if there’s no avatar, there’s no name, there’s no . . . they’re like I don’t know.

Rand: I don’t know.

Will: And we’d see that a lot.

Rand: Interesting.

Will: Does this article have obvious errors? And I actually haven’t found very many things where people say yes to this.

Rand: Gotcha. And this doesn’t necessarily mean grammatical errors, logical errors.

Will: Again, it’s open to interpretation. As I understand it, so was Google’s. There are some of these that could be very easily detected algorithmically. If you’re talking spelling mistakes, obviously, they can catch those. But here, where we’re talking about they’re going to run machine learning, it could be much broader. It could be formatting mistakes. It could be . . .

Rand: Or this could be used in concert with other questions where they say, boy, it’s on the verge and they said obvious errors. It’s a bad one.

Will: Exactly.

Rand: Okay.

Will: Does the article provide original content or information? A very similar one. Now, as SEOs, we might interpret this as content, right?

Rand: But a normal survey taker is probably going to think to themselves, are they saying something that no one has said before on this topic?

Will: Yeah, or even just, "Do I get the sense that this has been written for this site rather than just cribbed from somewhere?"

Rand: Right.

Will: And that may just be a gut feel.

Rand: So this is really going to hurt the Mahalos out there who just aggregate information.

Will: You would hope so, yeah. Does this article contain insightful analysis? Again, quite vague, quite open, but quite a lot of agreement on it. Would you consider bookmarking this page? I think this is a fascinating question.

Rand: That’s a beautiful one.

Will: Obviously, again, here I was sending these to a random set of people, again which, as I understand it, is very similar to what Google did. They didn’t take domain experts.

Rand: Ah, okay.

Will: As I understand it. They took students, so smart people, I guess.

Rand: Right, right.

Will: But if it’s a medical site, these weren’t doctors. They weren’t whatever. I guess some people would answer no to this question because they’re just not interested in it.

Rand: Sure.

Will: You send an SEOmoz page to somebody who’s just not . . .

Rand: But if no one considers bookmarking a page, not even consider it, that’s . . .

Will: Again, I think the consider phrasing is quite useful here, and people did seem to get the gist, because they’ve answered all of the questions by this point. I would send the whole set to one person as well. They kind of get what we’re asking. Are there excessive adverts on this page? I love this question.

Tom actually was one of the guys, he was speculating early on that this was one of the factors. He built a custom search engine, I think, of domains that had been hit by the first Panda update, and then was like, "These guys are all loaded with adverts. Is that maybe a signal?" We believe it is, and this is one of the ones that management just . . . so this was the one where I presented a thing that said 90% of people who see your site trust it. They believe that it’s written by experts, it’s quality content, but then I showed 75% of people who hit your category pages think there are too many adverts, too much advertising.

Rand: It’s a phenomenal way to get someone to buy in when they say, "Hey, our site is just fine. It’s not excessive. There’s tons of websites on the Internet that do this."

Will: Yeah.

Rand: And you can say, "Let’s not argue about opinions."

Will: Yes.

Rand: "Let’s look at the data."

Will: Exactly. And finally, would you expect to see this article in print?

Rand: This is my absolute favorite question, I’ve got to say, on this list. Just brilliant. I wish everyone would ask that of everything that they put on the Internet.

Will: So you have a chart that you published recently that was the excessive returns from exceptional content.

Rand: Yeah, yeah.

Will: Good content is . . .

Rand: Mediocre at this point in terms of value.

Will: And good is good, but exceptional actually is exponential. I think that’s a question that really gets it.

Rand: What’s great about this is that all of the things that Google hates about content farms, all of the things that users hate about not just content farms but content producers who are low quality, who are thin, who aren’t adding value, you would never say yes to that.

Will: What magazine is going to go through this effort?

Rand: Forget it. Yeah. But you can also imagine that lots of great pieces, lots of authentic, good blog posts, good visuals, yeah, that could totally be in a magazine.

Will: Absolutely. I should mention that I think there’s some caveats in here. You shouldn’t just take this blindly and say, "I want to score 8 out of 8 on this." There’s no reason to think that a category page should necessarily be capable of appearing in print.

Rand: Or bookmarked where the . . .

Will: Yes, exactly. Understand what you’re trying to get out of this, which is data to persuade people with, typically, I think.

Rand: Love it, love it. So, last set of questions here. We’ve got some at the domain level, just a few.

Will: Which are similar and again, so the process, sometimes I would send people to the home page and ask them these questions. Sometimes I would send them to the same page as here. Sometimes it would be a category page or just kind of a normal page on the site.

Rand: Right, to give them a sense of the site.

Will: Yeah. Obviously, they can browse around. So the instructions for this are answer if you have an immediate impression or if you need to take some time and look around the site.

Rand: Go do that.

Will: Yeah. Would you give this site your credit card details? Obviously, there are some kinds of sites this doesn’t apply to, but if you’re trying to take payment, then it’s kind of important.

Rand: A little bit, a little bit, just a touch.

Will: There’s obvious overlaps with all of this, with conversion rate optimization, right? This specific example, "Would you trust medical information from this site," is one that I’ve seen Google refer to.

Rand: Yeah, I saw that.

Will: They talk about it a lot because I think it’s the classic rebuttal to bad content. Would you want bad medical content around you? Yeah, okay. Obviously, again only applies if you’re . . .

Rand: You can swap out medical information with whatever type is . . .

Will: Actually, I would just say, "Would you trust information from this site?" And just say, "Would you trust it?"

Rand: If we were using it on Moz, we might say, "Would you trust web marketing information? Would you trust SEO information? Would you trust analytics information?"

Will: Are these guys domain experts in your opinion? This is almost the same thing. Would you recognize this site as an authority? This again has so much in it, because if you send somebody to Nike.com, no matter what the website is, they’re probably going to say yes because of the brand.

Rand: Right.

Will: If you send somebody to a website they’ve never heard of, a lot of this comes down to design.

Rand: Yes. Well, I think this one comes down to . . .

Will: I think an awful lot of it does.

Rand: A lot of this comes down to design, and authority is really branding familiarity. Have I heard of this site? Does it seem legitimate? So I might get to a great blog like Stuntdubl.com, and I might think to myself, I’m not very familiar with the world of web marketing. I haven’t heard of Stuntdubl, so I don’t recognize him as an authority, but yeah, I would probably trust SEO information from this site. It looks good, seems authentic, the provider’s decent.

Will: Yeah.

Rand: So there’s kind of that balance.

Will: Again, it’s very hard to know what people are thinking when they’re answering these questions, but the degree of agreement is . . .

Rand: Is where you get something. So let’s talk about Mechanical Turk, just to end this up. You take these questions and put them through a process using Mechanical Turk.

Will: So I actually used something called SmartSheet.com, which is essentially a little bit like Google Doc spreadsheets. It’s very similar to Google Doc spreadsheets, but it has an interface with Mechanical Turk. So you can just literally put the column headings as the questions. Then, each row you have the page that you want somebody to go to, the input, if you like.

Rand: The URL field.

Will: So SEOmoz.org/blog/whatever, and then you select how many rows you want, click submit to Mechanical Turk, and it creates a task on Mechanical Turk for each row independently.
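The sheet layout Will describes (questions as column headings, one page URL per row) can be sketched as a plain CSV file. This is a minimal sketch under stated assumptions: the transcript does not say what format Smartsheet expects, `build_sheet` and `responses_per_url` are hypothetical names, and repeating a URL once per wanted answer is an inference from "a task for each row", not something Will states. The questions themselves are the page-level ones from this transcript.

```python
import csv
import io

# Page-level questions as asked in the transcript.
PAGE_QUESTIONS = [
    "Would you trust the information presented here?",
    "Is this article written by an expert?",
    "Does this article have obvious errors?",
    "Does the article provide original content or information?",
    "Does this article contain insightful analysis?",
    "Would you consider bookmarking this page?",
    "Are there excessive adverts on this page?",
    "Would you expect to see this article in print?",
]


def build_sheet(urls, questions=PAGE_QUESTIONS, responses_per_url=1):
    """Return CSV text: a header row of questions, then one blank-answer row per task.

    responses_per_url is an assumption: it repeats each URL so that every
    row can become one independent task.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["URL"] + list(questions))
    for url in urls:
        for _ in range(responses_per_url):
            # Answer cells start empty; respondents fill in yes / no / don't know.
            writer.writerow([url] + [""] * len(questions))
    return buffer.getvalue()


# Placeholder URL; substitute a representative page from the site being tested.
print(build_sheet(["https://example.com/blog/some-post"], responses_per_url=3))
```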

Rand: Wow. So it’s just easy as pie.

Will: Yeah, it’s dead simple. This whole thing, putting together the questionnaire and gathering it the first time, took me 20 minutes.

Rand: Wow.

Will: I paid 50 cents an answer, which is probably slightly more than I would have had to, but I wanted answers quickly. I said, "I need them returned in an hour," and I said, "I want you to maybe have a quick look around the website, not just gut feel. Have a quick look around." I did it for 20, got it back in an hour, cost me 10 bucks.
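The arithmetic behind that figure is simple enough to reuse for budgeting a larger run. The helper below is a hypothetical sketch: `survey_cost` and its parameters are illustrative names, and it ignores any platform fees, which the transcript does not mention.

```python
from decimal import Decimal


def survey_cost(respondents, price_per_answer, urls=1):
    """Total payout: respondents x price per answer x number of pages tested.

    Platform fees are not included (an assumption; none are mentioned in the transcript).
    """
    return Decimal(str(price_per_answer)) * respondents * urls


# Will's run: 20 respondents at 50 cents each for one page.
print(survey_cost(20, "0.50"))  # 10.00
```

Testing, say, five representative pages with the same settings would multiply the payout by five.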

Rand: My God, this is the most dirt cheap form of market research for improving your website that I can think of.

Will: It’s simple but it’s effective.

Rand: It’s amazing, absolutely amazing. Wow. I hope lots of people adopt this philosophy. I hope, Will, you’ll jump into the Q&A if people have questions about this process.

Will: I will. I will post some extra information, yeah, definitely.

Rand: Excellent. And thank you so much for joining us.

Will: Anytime.

Rand: And thanks to all of you. We’ll see you again next week for another edition of Whiteboard Friday. Take care.

Will: Bye.

Video transcription by Speechpad.com



SEOmoz Daily SEO Blog

July 29, 2011  Posted in: SEO / Traffic / Marketing
