SEO Ranking – A Correlation Study

The Data Collection Background

I have decided to publish some data based on an in-depth study of SEO ranking factors in the UK menswear category. The study uses recent data from a variety of sources, including Moz.com and SEMRush. Rather than simply using pure ranking data, I have used estimates of organic search volumes. These volumes are taken directly from SEMRush, which estimates volume for each keyword from Google’s search volume and a click-through rate based on ranking position. The correlation study looks at the performance of 10 selected menswear websites and compares them across 11 ranking factors.
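
As a rough illustration of the kind of estimate described above, the sketch below multiplies each keyword's monthly search volume by a click-through rate looked up from its ranking position. This is not SEMRush's actual model: the function name, the CTR table and the example keywords are all hypothetical placeholders chosen only to show the arithmetic.

```python
# Hypothetical click-through rates by ranking position.
# These values are illustrative assumptions, not SEMRush's figures.
ASSUMED_CTR_BY_POSITION = {1: 0.30, 2: 0.15, 3: 0.10}


def estimate_organic_visits(keywords, ctr_by_position=ASSUMED_CTR_BY_POSITION):
    """Sum searches * CTR over (monthly_searches, position) pairs.

    Positions missing from the CTR table are assumed to earn no clicks.
    """
    total = 0.0
    for monthly_searches, position in keywords:
        total += monthly_searches * ctr_by_position.get(position, 0.0)
    return total


# Made-up keywords for one imaginary site: (monthly searches, ranking position)
example_keywords = [(1000, 1), (500, 3), (2000, 12)]
estimate = estimate_organic_visits(example_keywords)
print(round(estimate))
```

Because the click-through rates are assumed rather than measured, the result is only as good as that assumption, which is one reason the volumes used in this study should be read as estimates.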

The Selected Correlation Factors

Compared to many other studies on ranking factors, I have attempted to use less esoteric sources of data. Most of these can be found without a subscription and should be accessible to people without a high level of SEO knowledge. The eleven correlation factors and their sources are as follows:

  1. The number of pages indexed by Google
  2. The number of high ranking keywords – SEMRush
  3. Moz Page authority
  4. Moz Domain authority
  5. Total Root links – Moz
  6. Total links – Moz
  7. Facebook Likes – Moz
  8. Facebook Shares – Moz
  9. Tweets – Moz
  10. Google+ clicks – Moz
  11. Google Adwords PPC spend

Correlation Values

Correlation values have been calculated using the “CORREL” function in Excel. For the uninitiated, correlation values range from +1 to -1. +1 indicates a perfect positive correlation and -1 indicates a perfect negative, or inverse, correlation. A value of 0 (zero) indicates no correlation. Scores whose magnitude is above 0.75, whether negative or positive, could be considered strong correlations. Likewise, scores whose magnitude is below 0.5 could be considered weak.
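
For readers who prefer code to spreadsheets, here is a minimal Python sketch of the Pearson correlation coefficient, which is what Excel's CORREL returns. The function names are my own, and the "moderate" label for scores between the two thresholds is an assumption added for completeness; only "strong" and "weak" come from the rule of thumb above.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return covariance / sqrt(spread_x * spread_y)


def strength(r):
    """Label a score using the thresholds above, applied to its magnitude."""
    magnitude = abs(r)
    if magnitude > 0.75:
        return "strong"
    if magnitude < 0.5:
        return "weak"
    return "moderate"  # assumed label for the in-between band


# Two tiny made-up series, one rising together and one moving in opposition
print(pearson([1, 2, 3], [2, 4, 6]), pearson([1, 2, 3], [6, 4, 2]))
```

The thresholds are applied to the absolute value, so a score of -0.8 counts as strong in the same way that +0.8 does.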

A caveat in all correlation studies is that not all correlations are the result of causation. Just because it rains on Tuesdays and Thursdays doesn’t mean that naming a day with a T causes it to rain.

The Correlation Results

Each of the correlation factors was tested across the 10 websites in relation to the organic search volumes they achieved. The correlation scores ranged from a negative 0.13 to a positive 0.98. The scores were then plotted and ranked. The chart of these results is shown below:

SEO Correlation Factors Men’s Clothing

The surprising result is how strongly PPC expenditure correlates with search volume, at 0.98. According to this study, that is much stronger than the number of root links (0.75), Domain Authority (0.7) or Page Authority (0.66) as measured by Moz.com.

At the other end of the scale, total links and most social media measures, even Google+, are only weakly correlated.
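
The scoring and ranking step in this section can be sketched in a few lines of Python. The factor names and numbers below are placeholders for four imaginary sites, not the study's data, and the helper names are my own.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient (the quantity Excel's CORREL returns)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


def rank_factors(factors, volumes):
    """Score each factor against volume, then sort from highest to lowest."""
    scores = {name: pearson(values, volumes) for name, values in factors.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Placeholder data: one value per site, in the same site order throughout
volumes = [100, 200, 300, 400]
factors = {
    "factor_a": [10, 20, 30, 40],  # rises in step with volume
    "factor_b": [40, 30, 20, 10],  # falls as volume rises
    "factor_c": [5, 9, 4, 8],      # little relationship to volume
}

ranked = rank_factors(factors, volumes)
for name, score in ranked:
    print(f"{name}: {score:+.2f}")
```

Each factor is compared with the same volume column, so the only thing that changes between rows of the ranking is the factor being tested.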

The Line of Best Fit

To show how PPC spend and organic search volume correlate, I also provide a graph with a line of best fit.

SEO Correlation Factors PPC Spend
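
A line of best fit is usually an ordinary least-squares line, which is also what a linear chart trendline in a spreadsheet draws. Assuming that is the method used here, the sketch below shows how the slope and intercept are found. The spend and volume figures are invented so that the answer is easy to check by eye; they are not the study's data.

```python
def best_fit_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line y = slope*x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    if spread_x == 0:
        raise ValueError("cannot fit a line when every x value is the same")
    co_movement = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = co_movement / spread_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented figures that sit exactly on the line volume = 2 * spend + 5
spend = [0, 10, 20, 30]
volume = [5, 25, 45, 65]

slope, intercept = best_fit_line(spend, volume)
print(slope, intercept)
```

With real data the points scatter around the line rather than sitting on it, and the correlation score indicates how tightly they cluster.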

Data Limitation

The first thing to establish is the accuracy of the data. Every measure used is an estimate based on a sample, so each is flawed to some degree. There is, firstly, a risk of sampling error. Secondly, the figures are themselves the output of algorithms applied to those samples. Most importantly, the measure that all the correlation factors are compared against is an estimate of organic search volume provided by SEMRush. As I know from studies of websites where I have access to the real figures, the SEMRush data can be wildly wrong. However, the SEMRush data is based on accurate rankings taken from the Google.co.uk database, and all the websites are given equal treatment in relation to rankings and projected volumes.

Thirdly, alternative sources of data, say Majestic rather than Moz as a source of link data or website authority, may produce different values and, therefore, different correlations. That said, all sites in the study are again treated equally. If there is a systematic error in any one measure, then that error is applied equally across all sites.

Market Power and Brand Strength

The last limitation is perhaps the most obvious: are we comparing a comparable set of websites? When you study the “Line of best fit” chart you will notice that the PPC expenditure of the ten sites can be clustered into four distinct groups. One site’s spend is recorded as zero. Four websites spend up to £20k per month. Two sites spend between £30k and £40k. And two large sites spend between £95k and £120k. In my opinion, the spread of PPC expenditure is effectively a surrogate for business scale and market power.

Based on a wider analysis of all of these factors, perhaps market power, or brand strength, is the real determinant of rankings and, therefore, volumes. In the case of the UK menswear market, what is effectively being measured is a bricks-and-mortar retail network, a presence in above-the-line marketing channels, digital PR, CRM and email marketing activities. If you were a challenger brand up against an online retailer such as Tesco or Marks and Spencer, would you expect to outspend and outrank them with your limited budgets and resources? Having a large PPC budget may be a reflection of how well you perform commercially. Logically, if you perform well in organic search you could create the financial resources to invest in PPC activity. So PPC spend may be a product of organic performance, rather than the other way around.

For now, the market power factors identified in the paragraph above are outside the scope of this study. However, I plan to return to them in a future study.
