AdWords Broken URL Checker That Won’t Ever Time Out

A common problem with broken URL checker scripts is that they tend to time out when run on large AdWords accounts. AdWords scripts are capped at 30 minutes of execution time; past a certain account size, Google would simply rather you use their API.

I’d love to start using the Google API and I even got my test MCC account up and running. The problem is that I’m not really at that level of coding comfort yet, so I’ve started playing with scripts. The following Broken URL Checker was my first time using JavaScript!

Don’t get me wrong, there are a few great link checker AdWords scripts out there, and some not-so-great ones.

Here’s one script that works perfectly fine on smaller AdWords accounts.

Google also has its own Link Checker script, which checks 800 URLs per execution. It looks pretty neat, but I haven’t tested it out.

In theory, Google’s Link Checker should also work if the Free AdWords Scripts version keeps timing out. I decided to just write my own little version of Russ’s script, one that lets me segment my account. In other words, instead of running one script to check every destination page for errors, I run multiple copies that each check their own set of campaigns (there’s a configuration sketch after the code).

It’s heavily inspired by (partially stolen from) Russ Savage’s script, but there are some key differences.

The most notable difference is that my version no longer looks at keyword destination URLs. I don’t have that many in my account, so I check those with a separate script that doesn’t use my custom campaign filter.

In fact, aside from a few comments I just added, the campaign-name filter in the ad selector is the only other difference.

My Broken URL Checker Code

 
/****************************
* Broken URL Checker w/ Segments
* Version 1.0
* Created by: Philip Tomlinson (@philtomm)
* AnotherMarketer.com
* Modified Version of Russ Savage's Broken URL Finder
* FreeAdWordsScripts.com
****************************/
 
function main() {
  // You can add more if you want: http://goo.gl/VhIX
  var BAD_CODES = [404,500];
  var TO = [''/* insert email in the quotes like this 'email_address@example.com'*/];
  var SUBJECT = 'Broken AdWords for Campaign URL Report - ' + _getDateString();
  // You may want to add the letter range this copy targets to the subject line
  var HTTP_OPTIONS = {
    muteHttpExceptions:true
  };
 
  // This is the start of my changes
  // Establishing variables for the alphabet filter
  var alphaIndex;
  var alphabet = ["A","B","C","D","E","F","G","H","I","J","K","L","M","N","O","P","Q","R","S","T","U","V","W","X","Y","Z"];
  var iters = [];
 
  // This filter selects all campaigns starting with the letters A to G.
  // To cover a different segment, change the start index and the upper bound below.
  for (alphaIndex = 0 /* index of the first letter */; alphaIndex < 7 /* index of the last letter + 1 */; ++alphaIndex) {

    var iterPush = [
      AdWordsApp.ads()
        .withCondition("Status = 'ENABLED'")
        .withCondition("AdGroupStatus = 'ENABLED'")
        .withCondition("CampaignStatus = 'ENABLED'")
        .withCondition("Type = 'TEXT_AD'")
        .withCondition("CampaignName STARTS_WITH '" + alphabet[alphaIndex] + "'")
        .get()
    ];

    // Append this letter's ad iterator to the list (no 'var' -- iters is declared above).
    iters = iters.concat(iterPush);
  }
 
  /* Everything else is from Russ Savage's post here: http://www.freeadwordsscripts.com/2013/04/report-on-broken-urls-in-your-account.html */
 
  var already_checked = {}; 
  var bad_entities = [];
  for(var x in iters) {
    var iter = iters[x];
    while(iter.hasNext()) {
      var entity = iter.next();
      if(entity.getDestinationUrl() == null) { continue; }
      var url = entity.getDestinationUrl();
      if(url.indexOf('{') >= 0) {
        //Let's remove the value track parameters
        url = url.replace(/\{[0-9a-zA-Z]+\}/g,'');
      }
      if(already_checked[url]) { continue; }
      var response_code;
      try {
        Logger.log("Testing url: "+url);
        response_code = UrlFetchApp.fetch(url, HTTP_OPTIONS).getResponseCode();
      } catch(e) {
        //Something is wrong here, we should know about it.
        bad_entities.push({e : entity, code : -1});
      }
      if(BAD_CODES.indexOf(response_code) >= 0) {
        //This entity has an issue.  Save it for later. 
        bad_entities.push({e : entity, code : response_code});
      }
      already_checked[url] = true;
    }
  }
  var column_names = ['Type','CampaignName','AdGroupName','Id','Headline/KeywordText','ResponseCode','DestUrl'];
  var attachment = column_names.join(",")+"\n";
  for(var i in bad_entities) {
    attachment += _formatResults(bad_entities[i],",");
  }
  if(bad_entities.length > 0) {
    var options = { attachments: [Utilities.newBlob(attachment, 'text/csv', 'bad_urls_'+_getDateString()+'.csv')] };
    var email_body = "There are " + bad_entities.length + " urls that are broken. See attachment for details.";
 
    for(var i in TO) {
      MailApp.sendEmail(TO[i], SUBJECT, email_body, options);
    }
  }  
}
 
//Formats a row of results separated by SEP
function _formatResults(entity,SEP) {
  var e = entity.e;
  if(typeof(e['getHeadline']) != "undefined") {
    //this is an ad entity
    return ["Ad",
            e.getCampaign().getName(),
            e.getAdGroup().getName(),
            e.getId(),
            e.getHeadline(),
            entity.code,
            e.getDestinationUrl()
           ].join(SEP)+"\n";
  } else {
    // and this is a keyword
    return ["Keyword",
            e.getCampaign().getName(),
            e.getAdGroup().getName(),
            e.getId(),
            e.getText(),
            entity.code,
            e.getDestinationUrl()
           ].join(SEP)+"\n";
  }
}
 
//Helper function to format today's date
function _getDateString() {
  return Utilities.formatDate((new Date()), AdWordsApp.currentAccount().getTimeZone(), "yyyy-MM-dd");
}
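
To run the segmented setup, schedule several copies of this script and change the loop bounds in each one. For example, a second copy covering campaigns that start with H through N would only need a different loop header (a sketch; the indices refer to the alphabet array defined in main):

// Segment 2: campaigns starting with H (index 7) through N (index 13)
for (alphaIndex = 7; alphaIndex < 14; ++alphaIndex) {
  // ...same ad selector code as in the main script...
}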

N.B. I did not test my code after adding some comments. If this doesn’t work, please tell me in the comments! I’m a known typo machine.

December 2014 KPI Report

I’m not going to share a graph. I haven’t updated this blog since I started diving deep into paid media. In the last few months, I’ve learned a ton about AdWords and Bing Ads, plus plenty of other neat tactics for driving traffic to eCommerce websites.

I’ve stopped working on building my first eCommerce website and my first app. However, I’ve gotten a lot better at creating AdWords scripts with my new JavaScript knowledge.

It’s not January yet, but I’m going to aim for the same objective as last year: keeping track of everything I learn, and of my ideas, on AnotherMarketer.com.

May 2014 Article Roundup

Here’s the first monthly roundup of digital marketing articles. This isn’t a top 10 list or anything close. Just a list of some of the most interesting links I put in Evernote during May.

YouTube SEO Tactics

http://searchenginewatch.com/article/2340726/5-Advanced-YouTube-SEO-Tactics-to-Drive-More-Traffic-to-Your-Videos-Website

Using Excel Fuzzy Lookup to Harvest Bulk Negative Keyword

http://www.portent.com/blog/ppc/bulk-negative-keywords-excel.htm

Why Site Speed is So Important for eCommerce

http://www.portent.com/blog/internet-marketing/research-site-speed-hurting-everyones-revenue.htm

What To Do After Writing A New Piece of Content

http://blog.kissmetrics.com/17-advanced-methods/

CEO of Bonobos Talks About the Future of eCommerce

https://medium.com/what-i-learned-building/

Improved Ranking after Moving to HTTPS

http://prosperitymedia.com.au/seo-move-http-https-39-increase-organic-traffic/

May 2014 KPI Report

I wanted to publish two quality posts this month, but I only ended up writing one.

I did start running 10 km every week. That’s no excuse but I’m still proud of that achievement.

[Images: May 2014 comment report and word count report]

Word count went up, which is nice. I’m only going to write one post, about AdWords, this month. I know it isn’t really ambitious, but I’m trying to be realistic!

In order to boost my word and post counts, I am going to do a roundup post by next Monday.

Steal backlinks from a competitor’s product pages

Last month, a guest post by Chris Laursen caught my eye. It was about link building tactics for eCommerce that don’t require quality content. One prospecting tactic Chris used was uncovering the backlink profiles of closed businesses. I’ve decided to test something slightly different: rather than looking at closed businesses, the goal will be to steal backlinks from an active competitor.

In theory, webmasters should want to refer their readers to a place where they can actually buy the mentioned product, right? That’s why I decided to uncover how many broken links to product pages, and links to out-of-stock products, I could find for one company. If the results are satisfactory, I might test outreach with a real competitor of mine.

I decided to run my prospecting test with SSENSE. Why them? I have a friend that works there and I don’t want to warn my competitors that I’m planning to steal backlinks from them.

Getting Those Dirty Leads

I was pleasantly surprised to see that SSENSE has a great URL structure. Because the URL itself identifies product pages, I’ll be able to search for backlinks using the prefix mode in Ahrefs.

[Image: SSENSE URL structure]

Of course, this also means I’ll only be checking the Men’s section. Limiting results to one backlink per domain, I got 1,510 results. Not bad!

If You Don’t Got That Prefix

If SSENSE had a flat URL structure (http://www.domain.com/product-name), I would have needed to figure out a unique footprint associated with the product pages, scrape the Link URLs for it, trim the fat, and continue to the next step.

Find Those 404 Errors

I’m going to assume that my fake eCommerce store carries an inventory identical to SSENSE’s. If that weren’t the case, I might be interested in cutting out various brands or item categories.

Some people might be tempted to do this:

[Image: HTTP status check, the wrong way]

Doing such a check with SeoTools for Excel isn’t 100% wrong. It’s just a waste of time, because there are definitely duplicate URLs in that Ahrefs export. In this case, I was able to reduce the list by a third by copying the Link URLs to another sheet and removing duplicates. Not too shabby.

If you know of any free tools that do reliable HTTP status checks really fast, I’d love to hear about them in the comments, because waiting for this check to finish is annoying.

Once that’s done, copy and overwrite the column by pasting the results as values rather than formulas. This is a habit I’ve developed when dealing with large columns of formulas, and it can really save you headaches.
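
For what it’s worth, the same status check can also be scripted instead of run in Excel. Here’s a minimal Google Apps Script sketch, assuming your deduplicated URLs sit in column A of the active sheet; UrlFetchApp.fetchAll batches the requests, which makes it noticeably faster than checking one cell at a time:

function checkUrlStatuses() {
  var sheet = SpreadsheetApp.getActiveSheet();
  var urls = sheet.getRange(1, 1, sheet.getLastRow(), 1).getValues()
    .map(function(row) { return row[0]; });

  // One request per URL; muteHttpExceptions keeps 404s/500s from throwing,
  // and followRedirects:false lets you see the 301s discussed below.
  var requests = urls.map(function(url) {
    return { url: url, muteHttpExceptions: true, followRedirects: false };
  });

  // fetchAll issues the requests in parallel batches.
  var responses = UrlFetchApp.fetchAll(requests);

  // Write each status code next to its URL in column B.
  var codes = responses.map(function(r) { return [r.getResponseCode()]; });
  sheet.getRange(1, 2, codes.length, 1).setValues(codes);
}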

[Image: SSENSE HTTP status results]

For some weird reason, SSENSE 301-redirects some URLs to their 404 page. These redirections account for less than 10% of all the 301 redirects; the rest were due to a change in the URL structure. Notably, they were not redirecting sold-out products to their home page or to related pages.

Using VLOOKUP, I was able to confirm that there was only one domain per 404 error. In any case, that’s still 9 potential links to steal if you’re carrying the product or something extremely similar!

Discover What’s Out of Stock

While I was waiting for the HTTP status check to finish, I confirmed that sold-out product pages aren’t redirected and are easily identifiable.

[Image: SSENSE sold-out product page]

Because SSENSE has implemented rich snippets for products, it’s really easy to scrape their product availability using XPath.

If you’re using SeoTools for Excel, don’t be tricked into using =XPathOnUrl(H2,"//meta[@itemprop='availability']/@content"). That function will not give you the content of the meta tag; it will only confirm that the attribute exists. You must use =XPathOnUrl(H2,"//meta[@itemprop='availability']","content") instead to see the actual contents.
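
If you’d rather script the scrape than run it in Excel, here’s a rough JavaScript sketch of the same idea using UrlFetchApp and a regex instead of XPath (regex scraping is brittle by nature, and this assumes the availability value lives in a meta tag exactly like SSENSE’s):

// Hypothetical helper: fetch a product page and pull the value out of its
// <meta itemprop="availability" content="..."> tag.
function getAvailability(url) {
  var html = UrlFetchApp.fetch(url, { muteHttpExceptions: true }).getContentText();
  var match = html.match(/<meta[^>]*itemprop="availability"[^>]*content="([^"]*)"/i);
  return match ? match[1] : 'not found';
}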

Once again, there’s a bit of a wait.

[Image: SSENSE out-of-stock results]

Out of all the valid URLs, over 75% were out of stock! If I were SSENSE, I’d be checking to see how much referral traffic the product pages are getting…

However, that’s good news for the people who still have those products in stock and want to steal those links.

Alternative Method

If product rich snippets aren’t implemented, you could always use ScrapeBox to check if “Sold Out” is present on the page.

What’s Left To Do

Before even beginning to harvest emails, you’ll just need to make sure you have the item in stock.

The only roadblock I can see is that some of these backlinks are in articles about SSENSE. Therefore, it may be hard to pitch a replacement link. However, if you have the product in stock and a good price, you may still be able to get a link on the same page if the webmaster is open to it.

I’m still amazed that over 1000 external links were to out of stock products.

If you’ve tried this method, I’d love to hear what your results were.

April 2014 KPI Report

As you can see, not much got done since the last KPI report. I would love to say it’s because I’ve been busy, but that would only be an excuse.

[Images: April 2014 word count report and comment report]

I’ll be trying to get at least two posts up in May. The topics will be as follows:

  • Prospecting for deep links to steal from competitors in eCommerce
  • How I dealt with updating a large amount of sitelinks in AdWords editor

If I’m motivated, I’ll even try finishing one of the many ideas sitting in my drafts!

Let’s hope I walk the talk in May!

March 2014 KPI Report

I’ve been a bit slow with this KPI report. I knew it wouldn’t be very good, as I didn’t write much in March. I’ve already done a few more posts this month.

I’ve just started a new job at Altitude-sports.com. While I may still be doing some SEO, I’m going to be focusing mainly on developing my PPC skills while I’m there. Keep your eyes open for some eCommerce PPC posts in the near future.

[Image: March 2014 word count report]

I only published an article about link reclamation for large brands, but it has become the most popular article on Another Marketer.

Sadly, this popularity did not bring in any extra comments.

[Image: March 2014 comment report]

Google+ profile links show up as nofollow when logged in

The second link building tactic from Backlinko’s blog post also ended up being a bit trickier than expected. This time it wasn’t because of a WordPress RSS feed issue; it’s because Google+ shows all profile links as nofollow when you’re logged in.

What is even more misleading is that selecting to view your public profile also shows a nofollow link.

[Image: Google+ profile link shown as nofollow]

I decided to double-check some other Google+ profiles. All of them actually had dofollow links in their story sections. So I logged out, checked my true public profile, and the link became dofollow!

[Image: dofollow Google+ profile link]

I have no idea why Google+ changes the follow attribute of profile links this way. Maybe Google is trying to limit SEO abuse?

How to fix invalid WordPress RSS feed errors

I recently read a great article by Brian Dean about untapped backlink sources. While some are definitely not untapped, I decided to try two of them on Another Marketer. The first source I tapped was blog aggregators. I had no problems submitting my blog feed until I hit Alltop. According to Alltop, I had submitted an invalid WordPress RSS feed.

I checked the RSS feed on FeedValidator.org and, to my surprise, my feed actually had some syntax errors. I’m not sure why this happened, as I hadn’t touched any of the source files for the WordPress RSS feed, and I doubt any of the Bone WordPress theme functions had anything to do with it.

Easy Invalid WordPress RSS Feed Fix

All I needed to do was this:

  1. Navigate to the WordPress Reading Settings
  2. Change how the articles were being displayed from full to summary
[Image: fixing invalid WordPress RSS feed errors by selecting a new RSS feed option]

I’m really not sure why this worked, but it did. If you’re having the same problem but your setting is already at summary, try switching it to full instead.

Two scalable branded link reclamation tactics

This post is a work in progress. It requires more images and maybe a video tutorial!

There are a lot of tutorials and bloggers that state branded link reclamation is really easy. According to these SEOs, all you need to do is follow these 4 simple steps:

  1. Navigate to Google
  2. Search for your brand
  3. Find a non-linking brand mention
  4. Send an email

The four-step process is a lie. Combing through your brand mentions like this takes time and isn’t scalable if you’re working for Toyota or any other large brand.

These types of businesses get mentioned multiple times every day and often own a large number of other brands. If you go through all these mentions manually, not only will you encounter a high number of linking mentions, but a high number of unusable ones will also show up in the search results.

This means countless hours wasted for a very low return, just to turn one mention into a link. In this post, I’ll show two branded link reclamation tactics that I have used with great success. I’ll also explain how to check for non-linking mentions, as well as an easy way to find the contact information you need to tell webmasters what’s up.

Reclaiming Branded Links with Scrapebox

If you aren’t using Scrapebox because you’ve heard it’s a blackhat tool, you need to wise up. While some people like to use it for comment spam, it’s actually a great tool for uncovering guest posting opportunities, eliminating dead links from your Ahrefs or Majestic backlink exports, verifying the DA of a huge number of domains, and a lot of other everyday SEO tasks. I’m going to assume readers of this blog know their Scrapebox basics or have at least read Jacob King’s Ultimate Guide.

Setting Up Your Scrape

Personally, I don’t think you need to do much here. If you have a list of the branded terms you’re looking for, that should be enough. The reason I don’t suggest using the Scrapebox Keyword Scraper is that you might end up wasting time cleaning up its output.

Because I work mainly with Canadian brands, I merge my list of footprints and site: queries with country, province, and city name mentions within my keyword list. Lastly, I make sure -site:brand.com is in my global footprint so the brand’s own site is excluded (see the example below).
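
For illustration only (the brand and footprints here are hypothetical, not from my actual list), the merged keyword list might contain entries like these, with -site:brand.com sitting in Scrapebox’s global footprint field:

"Jeep Grand Cherokee" Canada
"Jeep Grand Cherokee" Ontario
"Jeep Grand Cherokee" Toronto
intext:"Grand Cherokee" Montreal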

After that’s done, you really just need to start harvesting. It’s really that easy to get more than a thousand URLs that have potentially mentioned your brand.

Cleaning Your Results

One of the main issues with a Scrapebox harvest is that it tends to be very dirty. If you were looking for brand mentions of a Jeep Grand Cherokee, you’ll definitely get some sites that are actually about Native Americans instead. An easy way to eliminate those sites is to clean your harvest in Excel.

Using SeoTools for Excel, it’s extremely easy to check whether pages mention any keywords that would signify a scrape error. In our example, I’d most likely run a regex similar to, but more complex than, this: =IFERROR(RegexpFindOnUrl("http://www.brandmention.com","Native"), TRUE). I’d keep all the TRUE URLs and manually check the others if I had the time or motivation.

Monitoring Brand Mentions for Link Reclamation

Now if you aren’t comfortable getting your hands a bit dirty with some PHP code, you won’t be able to actually use this second link reclamation tactic. It’s a more scalable version of Moz’s Fresh Web Explorer and services like Mention.net.

Creating a Huge Number of Google Alerts

You’ll want to set up a lot of Google Alerts. I’m talking 20 to 50 different types of alerts. The easiest way to do this without losing your mind is to automate it. An easy and efficient solution is to create a ReMouse macro that drives your browser, fed by an Excel column listing each alert.

Setting Up SimplePie

SimplePie is a great tool for filtering out useless mentions. I’m working on getting it to do the actual link verification for me. Until then, here are my two introductory posts on SimplePie:

How to merge your Google Alerts RSS feeds and filter out items

Removing duplicate items and another method to filter out items

Once you’ve set this up, all you need to do is hit your page once every few days to find new brand mentions.

Finding Non-Linking Brand Mentions

After getting our leads using either Scrapebox or our custom monitoring system built with SimplePie, we need to start turning those non-linking brand mentions into links.

This is fairly easy. Just install the Link Checker add-on. You’ll need to create two text files. One will contain the list of sites you gathered using one of the two methods above. The other must contain the list of domains you are interested in checking.

I suggest running the link checker multiple times while cleaning out positive hits. Depending on the number of URLs on the same domain and other factors, I tend to get a few false negatives on my first link checker runs.

Scraping Emails for Mass Mailing

While I know there are other tools that will scrape email addresses from sites, I haven’t had the budget to invest in any of them yet. My current technique is to just use Scrapebox. I take the list of non-linking URLs and trim to root. I move this list to Excel and create one list where I concatenate “/about” and another where I concatenate “/contact”. These three lists, paired with the Scrapebox Email Grabber, tend to give me great results (see the sketch below).
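
If you’d rather build those URL variants in code than in Excel, a minimal JavaScript sketch (the domains are hypothetical) would be:

// Build the three lists: root URLs plus /about and /contact variants.
var roots = ['http://example-blog.com', 'http://another-site.org']; // hypothetical trimmed-to-root URLs

var candidates = [];
roots.forEach(function(root) {
  candidates.push(root);              // the root URL itself
  candidates.push(root + '/about');   // common about-page path
  candidates.push(root + '/contact'); // common contact-page path
});
// 'candidates' now holds every URL to feed into the Scrapebox Email Grabber.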

If you can suggest a great email scraper in the comments, I’d be enormously grateful.

Remember to have “Email Grabber: Save url with emails” enabled in the options.

Preventing Spam Accusations

Now that you’ve got your emails, you just need to start sending out some messages. The best way to prevent spam accusations is to not mass mail the entire list at once, and to be polite…