Published on February 28th, 2013 | by 0111001101100106
50 Shades of Spam
Disclaimer: the story is based on true events but all the names and website URLs have been changed. Any resemblance to actual people, agencies or sites is coincidental and accidental.
In which we set the scene and meet Jenny, the honest whitehat.
It was a warm, bright late spring morning. On a morning like this, nobody expects any bad news. But Jenny got a call from her boss telling her that traffic to the site of one of their main clients had dropped badly all of a sudden. She started digging into the logs and stats and blog posts…
Jenny had been working for a mid-sized online marketing agency for a few years already. She considered herself lucky to have landed the trainee position right after graduation. Her family was not particularly happy with her getting into this whole Internet thing after she had spent so many years studying serious stuff, but Jenny felt this was where the future was. Over time, she learned a lot and progressed in her career, and she liked her job. She was subscribed to all the popular SEO blog feeds in her RSS reader and usually read them in the morning during her commute to work.
… They named it Penguin. It was the “anti-spammer” update Cat Mutts had warned about weeks before. Jenny spent the whole weekend in the office with a few people from her team, digging through the stats, analysing the SERPs and trying to find some rhyme or reason to it – finding patterns only to see them overthrown completely by the next set of SERPs they analysed. It just did not make sense. There were too many signals involved, and the way they interacted in each case was totally unpredictable.
Jenny’s team was a squeaky clean whitehat operation. They never did anything sketchy that could put a client’s site at risk. And this was a “Spammer” update… what a kick in the butt! Never before had they lost a site; the horrible Panda had never hurt them in any of its incarnations – but this was something new, scary and inexplicable.
Rumours had it that the client was quite impulsive and, being pitched things, could act without much syncing with the agency. Could it be one of the side contractors doing something dirty, promising him quick returns? Jenny did not know for sure.
The common verdict on Penguin (much in line with Google’s official stance) was that spammy links were causing problems and had to be removed. It was a few weeks after the initial drop that Jenny’s team finally dug out some really weird links. That was when they started showing up in the site’s Webmaster Tools account. But it was not just a link or two – there were thousands of them all of a sudden. Jenny’s suspicions of a side contractor’s activity grew stronger. How does one go about removing links? They had always been building links, but removing them? The whole team was bewildered and unsure of how to approach the task.
Week after week, they tracked down site owners and sent out emails. Only this time, they were not begging for links, they were begging to have links removed. But hope was vanishing, as most of the emails got no reply, or webmasters promised to remove the links only to forget about the plea.
Eventually, they gave up. The client’s traffic stayed flat, despite a few bad links finally getting removed. Jenny’s new idea was to build a few quality links in the hope that this would shift the balance and help the site recover. More weeks passed, yet the evil Penguin didn’t seem to loosen its grip.
If only Bing had a bigger market share, Jenny sometimes caught herself thinking. In Bing, the client site did not drop. But even if it had, Bing had just announced this shiny new disavow tool where you could just drop all the bad links you had no control over and be done with it. Truth be told, Jenny had never used the Bing disavow tool and was not sure if using it actually resulted in sites ranking better, and there were only anecdotal references online to people testing it – but in all seriousness, who has ever cared as much about Bing penalties or bans as people did about those in Google? Yet the disavow tool was considered a great step forward and an example of a search engine open to working hand in hand with webmasters, so Jenny stuck to the same opinion. She even started dreaming of reading the news one morning of Google launching a disavow tool of its own.
And one day, that morning arrived…
In which using the Disavow Tool leads to more discoveries.
Despite the rain outside, the spirit in the office was elated. Banners reading “Welcome Disavow Tool” were hanging across the room. It was like somebody’s birthday party. People were smiling, many of them for the first time in the last few months.
Jenny read the official Google announcement twice on her mobile during her commute and re-read it once more when she got to the office. She watched the video, taking notes. She read the launch coverage in all the main industry blogs. The instructions seemed vague and the warning sinister, but that was better than nothing. Jenny was hopeful. She knew all the bad link URLs by heart by this point. They had managed to remove quite a few links; now it was time to deal with the stubborn ones that they could not manage to get rid of for various reasons. Just one last check in Webmaster Tools, and she would be ready to put together the disavow submission.
… This cannot be right, Jenny thought 20 minutes later, glancing over a CSV of links downloaded from Webmaster Tools. At the top of the list there was a domain she had never heard of before, with a total of over 1,500 links pointing to the client site from it! The dreaded sitewide links…
Every newbie in Jenny’s team knew from day one that sitewides were a no-no. Building a sitewide link was like confessing to stabbing your own grandma. Nobody would ever have done it for a client site. The gloomy shadow of the mysterious evil side contractor that had been rumoured about for the first few weeks after the site dropped became almost material again. Either that – or the dreaded N word, a negative campaign by a competitor or just some sick individual haters with too much time on their hands. The client, however, never had any real issues with online reputation or too many bad customer reviews – nothing outstanding in their niche anyway. Who could possibly hate their site so much?
But Jenny had to look at the actual site linking to their client. Maybe, just maybe it’s one of those links that somehow occur naturally, like because of a “Recent Comments” blog plugin – maybe their client commented on some blog that had this functionality in place? Hmm, not likely, their client didn’t really do much by himself, even when asked to do something specifically. Or maybe it’s some scraper – there haven’t been many of them in the SERPs lately but it doesn’t mean they ceased to exist completely. Maybe, just maybe it’s something that can be easily fixed, after all we now have the disavow tool, if nothing else works we will just disavow it, right? – thought Jenny.
Little did she know. When she looked at the site in question, what she saw was nothing like what she had expected. It was one single page, with no link to the client site anywhere! But where did all those 1,500 links in the Google Webmaster Tools report come from?
Hours later, after doing lots of digging, Jenny still had not managed to discover anything meaningful. Apparently the site was pretty new, judging by its whois data, and the Web Archive had no records for it. Jenny felt lost and did not know where else to look for clues. How can you do anything about links that do not exist? Yet she felt these links could be the ones causing trouble, at least partially. She was at her wits’ end.
With an uneasy heart, Jenny went home at the end of the work day. She couldn’t sleep that night, thinking of possible ways to uncover the mystery that was torturing her. By morning, she made a decision. She would ask for help. Luckily she knew someone who could probably solve this puzzle.
In which a blackhat consultant steps in and the mystery gets solved.
Shane woke up late. His head was hurting a bit after last night’s secret meet-up with some old school blackhat buddies. He wasn’t a very public person and didn’t go to many SEO conferences (after all, having been in the business for over a decade, what could he hear at those conferences that was new? And what valuable info did those sissy whitehats, with whom most conferences were infested, possess that they could share anyway?). But he loved an occasional informal get-together with fellow old schoolers. They had all been in SEO for ages, since before it even got the name SEO. Shane didn’t usually drink much, but last night one of the guys threw a party to celebrate the much-anticipated purchase of a really cool and old domain name that nobody was supposed to mention in public as belonging to him. Only the closest circle of old time friends knew and celebrated – and boy did they celebrate!
But it was time to get up. Shane noticed a few missed messages on Skype on his iPad. Somebody had been pinging him all morning. He looked at the name. It took him a while to remember who it was. Somebody called Jenny. Ah yes, that whitehat girl he ran into last year at the afterparty of that free conference one of his friends had tricked him into going to. The afterparty was ok – quite a few of his old spammer friends turned up, so it was fun catching up. And this girl, despite being a whitehat, at least wasn’t the close-minded kind of whitehat; she did not start any of the stupid “ethics” talk, just asked a few rather naive questions and clearly believed in Google’s good intentions. But what could she possibly want now?
Another hour later, after a shower and a coffee, Shane finally could be arsed to reply.
“What’s up”, he typed.
“Sorry to bother you Shane but I don’t know who else could answer a question I have”, Jenny typed back.
“And why would I want to answer your question?” asked Shane.
“Out of love for SEO maybe? Oh, and I’ll buy you a drink next time I see you”, said Jenny.
“How likely is that, huh! OK, go ahead with your question.”
The question was indeed quite interesting, thought Shane. It’s not every day that you see examples of Google being broken that are so vivid. He started digging.
Digging into this one was no easy task. No archive, apparently due to the site banning the archive bot in robots.txt (he guessed that from some Web Archive clues, as the actual robots.txt file was no longer available). No Google cache for a single page of the site, due to a nocache meta tag that had likely been in place. Yet, 1,500 pages indexed – for a site that no longer existed…
By searching for the bits and pieces he managed to recover from pages that no longer existed, Shane finally got some clues. The picture that unfolded before him looked magnificently evil and terribly stupid at the same time.
The next day, Shane pinged Jenny on Skype. “Ready to hear the weirdest story you’ll ever hear?” he asked, and proceeded to tell her what he had found.
… Once upon a time, there lived a spammer. He may appear smart to some, but those who know better would notice at least a few mistakes he made. His business model consisted of finding a competitive niche, finding a large site with plenty of content in that niche, copying all of that content and putting it on a newly registered random domain. Then he would spam some links to it, let it acquire minimal PageRank and proceed to sell links off his new site via one of the link selling sites that did not care much about the quality or background of their equity. Because he had plenty of sites, in different competitive niches and with lots of pages, and he sold sitewide links, the business was profitable. It went on for months without a glitch until he scraped a site in the same niche as the client that Jenny’s agency was working for.
The site he scraped belonged to people who did not like anyone messing with them. In fact, they were known for taking people to court for even lesser reasons – and winning in court. A few years earlier they had messed up the business of a certain competitor because that company’s representative was careless enough to get caught accusing them of things that could not be proven.
At first they did not notice the scraped site, but eventually it started ranking better than the original, biting off a good piece of their long tail traffic (which, because of the nature of their site, constituted the larger part of their traffic). That’s when they decided to show the spammer nobody could mess with them and filed a DMCA complaint. Only, instead of sending a copy of that DMCA complaint to Google, as is usually done when content gets stolen, which effectively leads to the offending pages’ removal from the SERPs and DMCA notices appearing in those SERPs instead, they forwarded it to the spammer’s host.
That’s when all hell broke loose. The host removed the site – what Jenny was seeing was the default Apache page. However, Google did not know about it. Over 1,500 pages were still in the index, still outranking the original scraped site for a bunch of long tail queries. The Google Webmaster Tools backlink report still showed them all as well, and they were still causing trouble for Jenny’s client, although it was now impossible to do anything about them. How do you remove links that do not exist? How do you disavow them? Things were ultimately broken.
“So”, Shane said then, “you have two options now. Either wait till Google actually catches up and reindexes the site and sees that those pages no longer exist – or go to Google’s URL removal tool and submit all the 1,500 pages, one by one, and pray that they take action fast.”
There was a pause in the conversation. Jenny was digesting what she had just heard. After a long silence, she said, “Thank you” and ended the conversation. She knew she had a very long day ahead.