The Importance of Ad Testing with Brad Geddes [PODCAST]

In this episode of the Show, I had the opportunity to interview Brad Geddes, Co-Founder at Adalysis.

Geddes highlights the importance of ad testing and how to leverage it in order to improve your PPC campaign’s performance.

Why do PPC marketers need to focus on ad testing?

Brad Geddes (BG): Marketing is about connecting with people. That’s sort of rule number one when you are persuading someone: connect with them and, hopefully, they do business with you. The entire conversation around that is the ads.

  • What does your ad say?
  • What does your message say?

At the fundamental level, if you’re not thinking about testing ads, you’re ignoring all of this: you’re working on the targeting and saying, “We’ll just show some message and hope this works.”

How do you come up with the ads that you’re going to test?

BG: When you think of the paid search side, you’ve got two types of people.

  • You have the math people and they’re amazing at bidding and running reports and they’re terrible at ads.
  • And then you’ve got the ad people, the creative side who are terrible at analytics but they’re great at being creative.

Paid search is one of those disciplines that takes both sides.

First, you’re going to have to sit down and just brainstorm about what works. Do competitive research…

Think of the traditional Google search ads. We have three headlines. It’s kind of a formula almost.

Headlines show we’re relevant to the searcher. We’re echoing the search query.

Next, we need to show a benefit to someone.

  • What do they get out of this?
  • What’s in it for them?
  • Why are we better than someone else?

And now, we’ve got the call to action… Learn more, buy product – be more creative than that.

When working on your ads, you just need to:

  • Write something related to these keywords.
  • Think about the features. Why someone should do business with you.
  • Come up with a good call to action.

What are best practices for calls to action? Should they be specific to the audience that you’re targeting?

BG: That comes back almost to the funnel in itself.

If someone’s high in the funnel, you can’t ask them to buy something if they don’t know who you are first. Often that’s Learn and Compare type of stuff.

If someone searched for a competitive term, you’re much more likely to talk about comparisons or alternative options.

If someone is at the buy phase, they’re ready to check out, they’re part of your remarketing lists or whatnot, that’s going to be a much harder call to action.

What you don’t want to do is write your ads with like the eight standard calls to action on the web.

If you look at the majority of ads out there, they are:

  • Shop now
  • Buy now
  • Learn more
  • Call now
  • And a few others

Instead, do something interesting or related to your ad group, such as:

  • Shop now to get the best deals.
  • Call us for the best prices.
  • Call us to learn more about how you can take advantage of this benefit.

You want to be careful about being too boring with them.

What should people be looking for to determine a winner when testing ads?

BG: When you think about determining the winner, this goes back to why you are advertising. Why are you spending money?

  • Is it for leads?
  • Is it for conversions?
  • Is it for eyeballs?

Because each one of those is going to have a different metric you look for…

If you’re ecommerce, you’re probably looking at ROAS or revenue per impression comparisons.

If you just want the most eyeballs, you’d look at click-through rate, except that doesn’t mean they cared about your page. That’s why, even if someone wants eyeballs, we usually define what that looks like.

CTR is a terrible metric in general because it just means someone clicked on your ad and cost you money. It doesn’t mean they cared about what they saw afterward.

So you go back to the reason you’re marketing. That’s the primary metric you’re looking at. This is why you have to be careful about how Google serves those impressions, because Google makes money through clicks…
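The metrics named in this answer can be sketched in a few lines of Python. This is only an illustration: the `AdStats` structure, the `leader` helper, and all of the numbers are hypothetical, not taken from the interview or from any ad platform. It shows how the ad that leads on CTR can lose on ROAS or revenue per impression.

```python
from dataclasses import dataclass


@dataclass
class AdStats:
    """Raw totals for one ad over the test window (hypothetical structure)."""
    name: str
    impressions: int
    clicks: int
    conversions: int
    cost: float
    revenue: float

    @property
    def ctr(self) -> float:
        # Click-through rate: clicks per impression.
        return self.clicks / self.impressions if self.impressions else 0.0

    @property
    def conversion_rate(self) -> float:
        # Conversions per click.
        return self.conversions / self.clicks if self.clicks else 0.0

    @property
    def roas(self) -> float:
        # Return on ad spend: revenue per unit of cost.
        return self.revenue / self.cost if self.cost else 0.0

    @property
    def revenue_per_impression(self) -> float:
        return self.revenue / self.impressions if self.impressions else 0.0


def leader(ads, metric):
    """Return the ad with the highest value of the chosen primary metric."""
    return max(ads, key=lambda ad: getattr(ad, metric))


# Made-up sample data for two ads in one ad group.
ads = [
    AdStats("Ad A", impressions=10_000, clicks=500, conversions=20, cost=400.0, revenue=1_600.0),
    AdStats("Ad B", impressions=10_000, clicks=300, conversions=24, cost=240.0, revenue=1_920.0),
]

for metric in ("ctr", "roas", "revenue_per_impression"):
    print(metric, "->", leader(ads, metric).name)
```

With this sample data, Ad A gets more clicks but Ad B earns more per impression, so the choice of primary metric decides which ad is ahead. Being ahead is not the same as being a statistically significant winner.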

How many ads should you test and how long do you run them before making a decision?

BG: Usually, you’re talking 2-3 ads per ad group to start. If you have a lot of traffic, you can go up to 5 ads. When you go beyond that, Google’s ad serving just gets really bad at an ad group level.

And then what you’re doing is defining things such as minimum time. So at least a week at a time, because Sunday and Tuesday searches are so different. Usually, though, that’s the minimum. It’s OK to do every 3 or 4 weeks to really dig into the data because that’s roughly a month…

And there’s no single way to define minimum data. Usually, a few hundred impressions minimum. Obviously, the more the better.

And then you’re looking at statistical significance, which is the percentage chance that a result is not due to randomness. Because of the minimum data, the timeframe, and statistical significance, never ever run this manually.
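As a rough sketch of the kind of check a script could run before any significance math, the minimums described above might be encoded like this. The threshold values and the `ready_to_evaluate` name are assumptions for illustration. The whole-weeks rule in particular is an added assumption, so that every weekday is represented equally.

```python
from datetime import date

MIN_DAYS = 7            # at least one full week, as suggested in the interview
MIN_IMPRESSIONS = 300   # assumed stand-in for "a few hundred impressions"


def ready_to_evaluate(start: date, end: date, impressions_per_ad: list[int]) -> bool:
    """Return True only when the test window and data volume meet the minimums."""
    days = (end - start).days + 1  # count both the start and end dates
    # Assumption: only evaluate on whole weeks so no weekday is over-represented.
    covers_full_weeks = days >= MIN_DAYS and days % 7 == 0
    enough_data = all(n >= MIN_IMPRESSIONS for n in impressions_per_ad)
    return covers_full_weeks and enough_data


# Hypothetical test: one full week, two ads with enough impressions each.
print(ready_to_evaluate(date(2024, 1, 1), date(2024, 1, 7), [400, 350]))
```

A gate like this only says whether it is reasonable to look at the data yet. The significance check itself still has to follow.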

What about the tools that you use for this?

BG: Use a script or third-party software. At Adalysis, our software does all this automatically, but you can run a script for this or use some other third-party product.

Don’t do it by hand because it will not take long before you just start eyeballing the data and humans are terrible at eyeballing data and making decisions.

What do you consider when calculating statistical significance?

BG: Our tool is looking at minimum data amounts and statistical confidence levels, which measure the variation in data from two or more objects (in this case, ads), to understand whether one ad is statistically significantly better than the other.

Now, significance is really percentage chances. So when someone says it’s 90% significant, what they’re really saying is there’s a 10% chance this data is due to randomness and a 90% chance this is a true winner.
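For readers who want to see one way such a percentage can be computed, here is a minimal sketch using a pooled two-proportion z-test. This is a common textbook approach chosen for illustration; the interview does not say which test Adalysis uses. The function name and sample numbers are hypothetical. The value returned is one minus the two-sided p-value, which is the figure usually quoted as "90% significant."

```python
import math


def confidence_of_difference(successes_a: int, trials_a: int,
                             successes_b: int, trials_b: int) -> float:
    """Two-sided confidence (1 - p-value) that two rates differ.

    Works for any rate, e.g. clicks per impression or conversions per click.
    """
    rate_a = successes_a / trials_a
    rate_b = successes_b / trials_b
    # Pool both ads to estimate the rate under "no real difference".
    pooled = (successes_a + successes_b) / (trials_a + trials_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / trials_a + 1 / trials_b))
    if std_err == 0:
        return 0.0  # no variation at all, so nothing to distinguish
    z = abs(rate_a - rate_b) / std_err
    # For a standard normal Z, P(|Z| <= z) equals erf(z / sqrt(2)).
    return math.erf(z / math.sqrt(2))


# Hypothetical data: 50 vs. 80 conversions out of 1,000 clicks each.
print(f"{confidence_of_difference(50, 1000, 80, 1000):.1%}")
```

The normal approximation behind this test is only reasonable with enough data per ad, which is one more reason for the minimum data and minimum time rules.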

Now, not to get too heavy into math here, but statistical significance is based upon deviations from each other and so forth.

The problem with search results is that Tuesdays and Wednesdays and Saturdays are different from each other. And significance is based upon the assumption that the data that came before will also mirror the data that comes later.

So if you just test in one day, you may appear to have enough data for statistical significance, but you really don’t, because Wednesday is not like Tuesday. This is why we want some of these longer timeframes in looking at our information.

How do you approach continuing to test something?

BG: As long as people are changing or your competitors are changing, then you must at least be aware that it’s time to test again.

Let’s say you don’t have all the time in the world to spend on this.

What you need to watch out for is when your metrics start to decline a little bit or your competitors have completely changed their offer. Then you’d probably want to test again.

You may need to respond to that in your ads. Those are usually a couple of the trigger points.

Most large companies never stop testing. They always have two to three ads in an ad group running, testing, getting rid of losers, curating their ads. During a seasonal peak, they may pause down to their absolute best.

Now, large companies have more resources so they have more people who can think about this and write this and use tools that just mass create ads.

The smaller companies are going to do it based upon whether data is changing a little bit.

How do you do ad testing at scale?

BG: Usually, you’re breaking your account up into two types of ad groups.

First, you’ve got ad groups that are so important that you must control the ad testing in them, because those are the super important products, sales messaging, brand, etc. You usually only have a handful of them.

And then you have everything else. All that long tail type of keywords, which in many cases could be thousands of clicks a month.

Here, you’re doing what we call “multi-ad group testing”, where instead of testing ads per ad group, you’re just testing these two messages or CTAs in a thousand ad groups.

You’re often writing to the same pattern: a different call to action, a different benefit, or whatever it is, and then two different headlines, with that change in the headline replicated across a thousand ad groups. You’re aggregating that data by those different patterns.

You can aggregate it across the accounts. You’re not finding the best message for an individual targeting method (which is ad group level). Instead, you’re finding how consumers across a large variety of products or services would like to interact with you.

So if you need consumer-level data, not targeting-level data, then there are a lot of good insights you can pull out of here as well.
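The aggregation step in multi-ad group testing can be sketched as follows. The row layout, the pattern labels, and the numbers are all hypothetical; in practice the rows would come from an ad platform export in which each ad has been tagged with the pattern it follows.

```python
from collections import defaultdict

# Hypothetical rows: (ad_group, pattern_label, impressions, clicks, conversions)
rows = [
    ("running shoes", "CTA: compare", 1200, 60, 3),
    ("running shoes", "CTA: shop deals", 1100, 66, 6),
    ("trail shoes", "CTA: compare", 800, 32, 2),
    ("trail shoes", "CTA: shop deals", 900, 45, 4),
]


def aggregate_by_pattern(rows):
    """Sum the raw counts per pattern, ignoring which ad group each row came from."""
    totals = defaultdict(lambda: {"impressions": 0, "clicks": 0, "conversions": 0})
    for _ad_group, pattern, impressions, clicks, conversions in rows:
        bucket = totals[pattern]
        bucket["impressions"] += impressions
        bucket["clicks"] += clicks
        bucket["conversions"] += conversions
    return dict(totals)


totals = aggregate_by_pattern(rows)
for pattern, t in totals.items():
    print(pattern, t, f"conv. rate {t['conversions'] / t['clicks']:.2%}")
```

Summing raw counts first and computing rates afterward weights high-traffic ad groups more heavily than low-traffic ones, which is usually what you want for a consumer-level view. Averaging per-ad-group rates would answer a different question.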

Image Credits

Featured Image: Paulo Bobita
