Author Archive
Linkscape Update, New Stats and an API Dashboard
Posted by Nick Gerner
Update: Why did my Domain Authority change?
In this index update we re-calibrated our Domain Authority metric to better reflect the relationships between all domains on the Internet. This means that many websites’ Domain Authority (DA) changed.
Not to worry! If your Domain Authority went down, so did that of all the other domains that had similar link profiles before the index update. (Don’t think of it as something bad happening to your site; think of it as a change in how we view the entire Internet.) You can read more about why we did this in the section below.
The good news is, we have an index update for you! And it’s a couple of days sooner than previously announced. The bad news is, things were, as you might have noticed, a little rocky this morning. We had more traffic to the API than ever before, and, through the magic of being a scrappy startup, we all jumped into action. Fortunately, through the magic of Amazon Web Services, we’ve quickly increased our infrastructure and are serving better than ever. I do apologize for any issues this might have caused.
I’ve got a few things to say in this post, so you can skip forward if you like:
- Page Authority and Domain Authority Change
- New API Dashboard
- Stats on Nofollow vs Rel=Canonical
- Index Update Stats
Page Authority and Domain Authority Change
We’ve gotten a lot of feedback about Page Authority and Domain Authority. We’re excited about these metrics and are using them to power a lot of what we do: sorting links, crawl selection, keyword difficulty. But lately, it seemed as if things were getting a little… clumpy. We were packing our numbers too closely together to give a real sense of the spread in authority over the web. I’ll defer to Ben and Rand, who are working on this a lot, but just to give you a taste:
As you can see, we’ve pulled apart a lot of great sites. This spread, for example SEOmoz with a PA of 80 and Amazon.com with a PA of 89, better reflects different authorities.
This will have effects across tools, including Keyword Difficulty. So take a minute to check those out and make sure that what we’re showing you matches your intuition.
New API Dashboard
It’s actually very fitting that we should have more traffic than ever before, because we’ve been hard at work on better serving one of our biggest API consumers: You! Today we’re launching our SEOmoz API Dashboard.
This dashboard will be the place to go to manage your SEOmoz API account. Right now we’re including all of your API usage. This gives you visibility into your API consumption, critical if you’re on the paid plan. And if you’re on the free plan this gives you some idea of the usage of your tools. As we improve what we offer both in the API and to support application development, you’ll see more and more here.
Stats on Nofollow vs Rel=Canonical
Last week I had a great chat with Eric Enge at Stone Temple Consulting. We talked a little bit about the usage of nofollow and rel=canonical over the last year (a big year for both!), but I didn’t have anything concrete to share at the time. I dug into it and it’s pretty interesting:

As you can see, rel=canonical is really taking off. Since we started keeping really good stats on its usage in our data, it’s grown by about 50% in just six months! We see rel=canonical being used more than either internal or external nofollows. Internal nofollows fell off quite a bit about eight months ago, but have been reasonably stable since then.
My hypothesis (without supporting data at the moment) is that two mindsets are winning:
- use rel=canonical right away
- if rel=nofollow is working leave it
I’ll leave it to the expert SEOs to debate this (in the comments, please!), but that could well be sound advice.
Index Update Stats
Here are some charts and graphs of the data we’ve updated since last month.

We’re staying on course with our current update rate for pages. We’ve got updated information for about 43 billion pages.

And we have a corresponding update for links to and from those pages.
We’ve got two focuses for our data updates:
- Get those domains which we used to think about as niche
- Get deep on those domains that are highly authoritative
This is actually a big initiative we’ve been working on this year. And we’re already seeing great improvements to our data quality.
I hope you enjoy the data. As always, feedback is much appreciated!
Linkscape Index Update and a Peek Behind the Curtains
Posted by Nick Gerner
Last week we updated the Linkscape index, and we’ve been doing it again this week. As I’ve pointed out in the past, up-to-date data is critical. So we’re pushing everyone around here just about as hard as we can to provide that to you. This time we’ve got updated information on over 43 billion urls, 275 million sub-domains, 77 million root domains, and 445 billion links. For those keeping track, the next update should be around April 15.
I’ve got three important points in this post. So for your click-y enjoyment:
If you’ve been keeping track, you may have noticed a drop in pages and links in our index in the last two or three months. You’ll notice that I call these graphs "Fresh Index Size", by which I mean that these numbers by and large reflect only what we verified in the prior month. So what happened to those links?


Note: "March - 2" is the most recent update (since we had two updates this month!)
At the end of January, in response to user feedback, we changed our methodology around what we update and include. One of the things we hear a lot is, "awesome index, but where’s my site?" Or perhaps, "great links, but I know this site links to me, where is it?" Internally we also discovered a number of sites that generate technically distinct content, but with no extra value for our index. One of my favorite examples of such a site is tnid.org. So we cut pages like those, and made an extra effort to include sites which previously had been excluded. And the results are good:

I’m actually really excited about this because our numbers are now very much in line with Netcraft’s survey of active sites. But more importantly, I hope you are pleased too.
I’ve been spending time with Kate, our new VP of Engineering, bringing her up to speed about our technology. In addition to announcing the updated data, I also wanted to share some of our discussions. Below is a diagram of our monthly (well, 3-5 week) pipeline.

You can think of the open web as having essentially an endless supply of URLs to crawl, representing many petabytes of content. From that we select a much smaller set of pages to get updated content for on a monthly basis. In large part, this is due to politeness considerations: there’s about 2.6 million seconds in a month, and most sites won’t tolerate fetching one page a second by a bot. So we only can get updated content for so many pages in a month.
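To make the politeness arithmetic concrete, here is a minimal sketch. The function name and the 30-day month are assumptions for illustration; this is not how our crawler is actually configured.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def max_fetches_per_site(days=30, seconds_between_requests=1.0):
    """Upper bound on pages fetched from one site at a fixed request spacing.

    Hypothetical helper: assumes one steady stream of requests with no
    downtime, which a real crawler would not achieve.
    """
    return int(days * SECONDS_PER_DAY / seconds_between_requests)


# One page per second for a 30-day month: the "about 2.6 million" figure.
print(max_fetches_per_site())  # 2592000

# A gentler crawl, one page every ten seconds, cuts the budget tenfold.
print(max_fetches_per_site(seconds_between_requests=10))  # 259200
```

Even the aggressive one-page-per-second case caps a single site at roughly 2.6 million pages a month, which is why selection matters so much.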
From the updated content we get, we discover a very large amount of new content, representing a petabyte or more of new data. From this we merge non-canonical forms, and remove duplicates, as well as synthesize some powerful metrics like Page Authority, Domain Authority, mozRank, etc.
Once we’ve got that data prepared, we drop our old (by then out of date) data, and push the updated information to our API. On about a monthly basis we turn over about 50 billion urls, representing hundreds of terabytes of information.
What Happened To Last Week’s Update
In the spirit of TAGFEE, I feel like I need to take some responsibility for last week’s late update, and explain what happened.
One of the big goals we’ve got is to give fresh data. One way we can do that is to shorten the amount of time between getting raw content and processing it. That corresponds to the "Newly Discovered Content" section of the chart above. For the last update we doubled the size of our infrastructure. In addition to doubling the number of computers we have running around analyzing and synthesizing data, it actually increased the coordination between those computers. If everyone has to talk to everyone else, and you double the number of people, you actually quadruple the number of relationships. This caused lots of problems we had to deal with at various times.
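The "double the people, quadruple the relationships" point can be checked with a quick sketch. The machine counts below are made up for illustration, and it assumes every machine may need to talk to every other machine.

```python
def pairwise_links(machine_count):
    """Number of distinct machine-to-machine pairs in a fully connected cluster."""
    return machine_count * (machine_count - 1) // 2


before = pairwise_links(50)    # hypothetical cluster size
after = pairwise_links(100)    # the same cluster, doubled

print(before, after, after / before)
```

The ratio comes out slightly above four, and it gets closer to exactly four as the cluster grows.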
Another nasty side-effect of all of this was that it made machine failures even more common than we had experienced before. If you know anything about Amazon Web Services and Elastic Compute Cloud, then you know that those instances go down a lot.
So we needed an extra four days to get the data out.
Fortunately we’ve taken this as an opportunity to improve our infrastructure, fault tolerance and lots of other good tech start-up buzz words. Which is one of the reasons we’re able to get this update out so quickly after the previous one.
As always, we really appreciate feedback, so keep it coming!
Double Your Fun with Double the SEOmoz API
Posted by Nick Gerner
I know, I promised a Linkscape update by last week. And I missed it. But there’s an update today! Do you forgive me? No? Not enough? Well how about doubling the volume of data available in our free API? You might have gotten a totally awesome email last week announcing that the free SEOmoz API is now serving up to 1,000 links. This email was so awesome I just had to share it (nice work Scott!)
- Up to 1,000 links to a page, subdomain or root domain (sorted by Page Authority of the linking page)
- Anchor text for those 1,000
- Aggregate anchor text counts across all links in our index
- HTTP status code
- nofollow indicators
- Plenty of metrics for data junkies
We’ve got a community submissions page on our wiki, and we love to share neat apps. So if you build something on our API, send it our way and we’ll make sure the community hears about it.
The Freshest Linkscape Data Ever
Posted by Nick Gerner
Since the launch of Open Site Explorer and our API update, Chas, Ben and I have invested a lot of time and energy into improving the freshness and completeness of Linkscape’s data. I’m pleased to announce that we’ve updated the Linkscape index with crawl data that’s between two and five weeks old, the freshest it’s ever been. We’ve also changed how we select pages, in order to get deeper coverage on important domains and waste less time on prolific but unimportant domains.
You may recall Rand’s recent post about prioritizing the best pages to crawl, and mine about churn in the web. We’ve applied some of the principles from these posts to our own crawling and indexing. Rand discussed how crawlers might discover good content on a domain by selecting well-linked-to entry points:

In the past, we’ve selected pages to crawl based purely on mozRank. That turned out to favor some unsavory elements (you know who you are :P). Now, we look at each domain and determine how authoritative it is. From there we select pages using the principle illustrated above: Highly linked-to pages—the homepage, category pages, important pieces of deep content—link to other important pages we should crawl. From intuition and experience we believe this gives the right behavior to crawl like a search engine would.
In a past post, I discussed the importance of fresh data. After all, if 25% of pages on the web disappear after one month, data collected two or more months ago just isn’t actionable.

From now on, we’re focusing on that first bar in the graph above. By the time our data approaches that second bar (meaning most of it is out of date), we should have an index update for you. If and when we show you historical data, we’ll mark it as such.
What this means for you is that all our tools powered by Linkscape will provide fresher, more relevant data, and we’ll have better coverage than ever. This includes things like:
As well as products and tools developed outside SEOmoz using either the free or paid API:
There are plenty more. In fact, you could build one too!
Because I know how much everyone likes numbers, here are some stats from our latest index:
- URLs: 43,813,674,337
- Subdomains: 251,428,688
- Root Domains: 69,881,887
- Links: 9,204,328,536,611
Our last index update was on January 17th. You might recall some bigger numbers in the last update. Because of the changes to our crawl selection, our latest index should exclude a lot of duplicate content, spam pages, link farms, and spider traps while keeping high quality content.
Our next update is scheduled for March 11. But we’ll update the index before then if the data is ready early.
As always, keep the feedback coming. With our own toolset relying on this data, and dozens of partners using our API to develop their own applications, it’s critical that we hear what you guys think.
NOTE: we’re still updating the top 500 list at the moment. We’ll tweet when that’s ready.
Launching the SEOmoz Free API and Enough Power to Build Open Site Explorer
Posted by Nick Gerner
The launch of Open Site Explorer last week opens up a lot of link data, filters, and anchor text to a much wider audience than we’ve ever had before. In that same vein, today we’re announcing our new and improved SEOmoz Free API.
Any registered (it’s free) SEOmoz member can visit our API Portal and get an API key that gives you access to:
- Data for any URL in our index including
  - Domain and Page Authority
  - mozRank
  - total link count
  - external, followed link count
- The first 500 links to any page, sub domain or domain
- Filtering on those links: 301s, Follows, External, etc.
- The first 3 domains linking to any page, sub domain or domain
- The first 3 anchor text terms or phrases in links to any page, sub domain or domain
You’re welcome to use this data for private or publicly-facing purposes. We already have a variety of partners integrating this data including:
- Buzzstream
- Brandwatch
- HubSpot’s Grader Suite
- Quirk’s Search Status toolbar
Check out some sample code and applications on the wiki.
Our idea is that getting this data into the hands of webmasters makes everyone better off: we’re excited about our new authority scores, marketers are thirsty for metrics, and users of all kinds of tools are better off with a deeper look at real data. The free package will keep you covered up to a million links per month that you’re free to use for any purpose from consulting to building an SEO campaign management suite.

In addition to the free API (which I think is quite powerful already), we’re expanding our paid API offering. The paid API includes everything above, but also includes:
- Additional metrics:
  - number of domains that link to you
  - mozTrust
  - number of links to all pages on your domain
  - and more
- A deeper look at links, way beyond the first 500 (first 100k for each sort per page, domain or sub domain)
- Plenty of sorts on links:
  - domain authority
  - page authority
  - linking root domains
- Way more anchor text terms and phrases (up to 100k per page, domain or sub domain if you’ve got that many)
This is exactly the same API powering Open Site Explorer. So if you think OSE missed a feature, or should include other data sources, you can build it over again and do an even better job :) If you do, drop me a line and I’ll take a look. We’d love to share partner apps on our wiki, Twitter, the blog, and elsewhere.
We don’t even have an attribution requirement. Although we do have a tasty 15% discount if you cite us as a source.
To sign up, just contact us, and we’ll start the process.
EDIT: The paid API is available outside of a PRO membership. A PRO membership buys the tools, and content, and sweet sweet badge. The paid API is extra. Of course, the free API is both free and full of awesome.
Looking Back at Linkscape’s Trillion + URLs (and Announcing our Latest Index Update)
Posted by Nick Gerner
As we rapidly approach the end of 2009 and the opening of 2010, we’ve got a much anticipated index update ready to roll out, gang. Say it with me "twenty-ten". Oh yeah, I’m so gonna get a flying car and a cyberpunk android :) …Ahem. I thought this would be a great time to take a look back at the year and ask, "where did all those pages go?" Being a data-driven kind of guy, I want to take a look at some numbers about churn, freshness and what it means for the size of the web and web indexes over the last year, and the hundreds of billions, indeed trillion plus urls we’ve gotten our hands on.
This index update has a lot going on, so I’ve broken things out section by section:
- Analysis of the Web’s Churn (or why having ten trillion URLs isn’t very useful)
- Canonicalization, De-Duping & Choosing Which Pages to Keep
- Statistics on our December Linkscape Update
- New Updates to the FREE SEOmoz API (and a 90% price drop on the paid API)
An Analysis of the Web’s Churn Rate
Not too long ago, at SMX East, I heard Joachim Kupke (senior software engineer on Google’s indexing team) say that "a majority of the web is duplicate content". I made great use of that point at a Jane and Robot meet up shortly after. Now, I’d like to add my own corollary to that statement: "most of the web is short-lived".

After just a single month, a full 25% of the URLs are what we call "unverifiable". By that I mean that the content was either duplicate, included session parameters, or for some reason could not be retrieved (verified) again (404s, 500s, etc.). Six months later, 75% of the tens of billions of URLs we’ve seen are "unverifiable" and a year later, only 20% qualifies for "verified" status. As Rand noted earlier this week, Google’s doing a lot of verifying themselves.
To visualize this dramatic churn, imagine the web six months ago…
Using Joachim’s point, plus what we’ve observed, that six-month old content today looks something like this:

What this means for you as a marketer is that some of the links you build and content you share across the web is not permanent. If you engage heavily with high-churn portions of the web, the statistics you monitor over time can vary pretty wildly. It’s important to understand the difference between getting links (and republishing content) in places that will make a splash now, but fade away, versus engaging in lasting ways. Of course, both are important (as high-churn areas may drive traffic that turns into more permanent value), but the distinction shouldn’t be overlooked.
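As a rough illustration of what those churn numbers imply, here is a small sketch. The verified fractions are the ones quoted above (75% after one month, 25% after six, 20% after a year); applying them as an average to your own links is an assumption, since churn varies a lot between high-churn and lasting parts of the web.

```python
# Fraction of URLs still "verified" after N months, from the figures above.
VERIFIED_FRACTION = {1: 0.75, 6: 0.25, 12: 0.20}


def expected_verified(url_count, months):
    """Expected number of URLs still verifiable, assuming average web-wide churn."""
    return round(url_count * VERIFIED_FRACTION[months])


for months in (1, 6, 12):
    print(months, expected_verified(1000, months))
```

So of 1,000 average linking pages, you might expect only a few hundred to still be verifiable half a year on.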
Canonicalization, De-Duping & Choosing Which Pages to Keep
Regarding Linkscape’s indices, we capture both of these cases:
- We’ve got an up-to-date crawl including fresh content that’s making waves right now. Blogscape helps power this, monitoring 10 million+ feeds and sending those back to Linkscape for inclusion in our crawl.
- We include the lasting content which will continue to support your SEO efforts by analyzing which sites and pages are "unverifiable" and removing these from each new index. This is why our index growth isn’t cumulative — we re-crawl the web each cycle to make sure that the links + data you’re seeing are fresh and verifiable.
To put it another way, consider the quality of most of the pages on the web, as measured, for instance, by mozRank:
I think the graph speaks for itself. The vast majority of pages have very little "importance" as defined by a measure of link juice. So it doesn’t surprise me (now at least) that most of these junk pages are disappearing after not too long. Of course, there are still plenty of really important pages that do stick around.
But what does this say about the pages we’re keeping? First off, let’s take out any discussion of the pages that we saw over a year ago (as we’ve seen above, there’s likely less than 1/5th of them remaining on the web). In just the past 12 months, we’ve seen between 500 billion and well over 1 trillion pages depending on how you count it (via Danny at Search Engine Land).
So in just a year we’ve provided 500 billion unique urls through Linkscape and the Linkscape powered tools (Competitive Link Finder, Visualization, Backlink Analysis, etc.). And what’s more, this represents less than half of the URLs we’ve seen in total, as the "scrubbing" we do for each index cuts approx. 50% of the "junk" (including canonicalization, de-duping, and straight tossing for spam and other reasons). There’s likely many trillions of URLs out there, but the engines (and Linkscape) certainly don’t want anything close to all of these in an index.
Linkscape’s December Index Update:
From this latest index (compiled over approx. the last 30 days) we’ve included:
- 47,652,586,788 unique URLs (47.6 billion)
- 223,007,523 subdomains (223 million)
- 58,587,013 root domains (58.6 million)
- 547,465,598,586 links (547 billion)
We’ve checked that all of these URLs and links existed within the last month or so. And I call out this notion of "verified" because we believe that’s what matters for a lot of reasons:
- Our own research on how search engines rank documents
- Your impact on the web (as in traditional marketing) and ability to compare progress over time
- Sharing reliable, trust-worthy data with customers, both for self and competitive analysis
- Measuring progress and areas for improvement in search acquisition and SEO
I hope you’ll agree. Or, at least, share your thoughts.
New Updates to the Free & Paid Versions of our API
I also want to give a shout out to Sarah, who’s been hard at work on repackaging our site intelligence API suite. She’s got all kinds of great stuff planned for early in the coming year, including tons of data in our free APIs. Plus she’s dropped the prices on our paid suite by nearly 90%.
Both of these items are great news to some of our many partners, including:
- Buzzstream - a tool for social media, PR and link management
- Brandwatch - a reputation monitoring tool
- Grader.com - Hubspot’s popular site analysis tool
- Quirk’s Search Status Bar
- And at least three of these top "10 Link Building Tools for Tracking Inbound Links"
Thanks to these partners we’ve doubled the traffic to our APIs to over 4 million hits per day, more than half of which are from external partners! We’re really excited to be working with so many of you.
Competitive Link Research with the Linkscape Index
Posted by Nick Gerner
Just before the SEOmoz PRO Seminar, over the weekend, we updated the Linkscape index. This is great timing because we’re also unveiling (to PRO members only, sorry free members) the prototype for a new tool! We’re calling it our competitive link finder, powered by Linkscape. Tom Schmitz was good enough to explain things in a blog post some weeks back.
But before I dive into the new tool, as is traditional, some numbers:
- URLs: 39 billion
- Root Domains: 55 million
- Subdomains: 208 million
- Links: 443 billion
The sharp members of our audience will recognize that this index is, in fact, smaller than our last. Our index size is varying from update to update as we tune quality vs coverage. And this creates some issues around historical tracking. Believe me, we are working on the issue, stay tuned for more information around this scenario.
More interesting is an Index Quality Study we finished just before this update. From that study two things are immediately interesting to me.

First, we estimate that between 60 and 70% of what Y!SE might give you (including no follows, duplicate links) are in our index today (the small one, remember?). Moreover, we estimate that nearly 50% of what Y!SE will give you, we could too, but we filter out as duplicates, nofollows, or otherwise less important than other data we’ve got in our top 3000 links.
Next we’ve gotten a lot of feedback about how mozRank matches intuitive understanding. Sure it’s a 10 point scale, similar to Google Toolbar PageRank, but often people are finding it’s off from what they’re expecting. This is because of the data we’ve been optimizing our index for:
In the past we’ve been concentrating on a more or less random sample of pages users might care about (the red bars). As it turns out, you guys care a lot more about important pages and want mozRank to be focused at describing the authority of these pages (the blue bars). So we’ve dramatically shifted the focus of mozRank toward these pages. Hopefully you should get a better experience out of mozRank and mozTrust for these high authority pages and sites.
We have more data for partners and power users. PM me if you’re interested.
Finally, here’s the new competitive link tool. (I know you guys already took a peek at it!) The idea is to identify authoritative sites and communities you could get links from, but don’t already.
What we do is take your site, and up to five related sites (maybe competitors). From those we find all the links the related sites have, and find the common ones. From that we create a check-list. These are the big important sites your industry is engaging with, but you aren’t.
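The idea can be sketched in a few lines. This is only an illustration, not the tool’s actual implementation: the function name, the example domains, and the `min_shared` threshold (how many related sites must share a linking domain before it counts as "common") are all assumptions.

```python
from collections import Counter


def link_checklist(my_linkers, related_linkers, min_shared=2):
    """Linking domains shared by related sites that do not link to you yet.

    my_linkers: set of domains already linking to your site.
    related_linkers: dict mapping each related site to its set of linking domains.
    """
    shared = Counter()
    for linkers in related_linkers.values():
        shared.update(set(linkers))  # count each domain once per related site
    wanted = [
        (domain, count)
        for domain, count in shared.items()
        if count >= min_shared and domain not in my_linkers
    ]
    # Most widely shared first, then alphabetical for a stable order.
    return sorted(wanted, key=lambda item: (-item[1], item[0]))


related = {
    "competitor-a.example": {"bigblog.example", "news.example", "forum.example"},
    "competitor-b.example": {"bigblog.example", "news.example"},
    "competitor-c.example": {"bigblog.example", "directory.example"},
}
mine = {"news.example"}
checklist = link_checklist(mine, related)
print(checklist)
```

Here only bigblog.example makes the list: it links to all three related sites but not to you, while news.example is dropped because it already does.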
Of course, there’s no reason you shouldn’t be able to get some of these endorsements too. I mean, you’ve got great content, products, tools, and services. Users want that stuff. Google, et al. want to deliver those search results.
So go check out your latest updated data, our new tool, and stay tuned for a Linkscape FAQ adapted from my PRO training slides. That’s a little something for those of you who couldn’t make it to the seminar.
Linkscape Index Update: Now with More Visualization
Posted by Nick Gerner
Last night we rolled out our latest Linkscape index update (we call it "index13" internally). From a data perspective we’ve got a few things wrapped in here that might interest you guys:
- 53 billion urls (our biggest index to date)
- 500 billion links
- Everything crawled within the last two months
- We crawled every blog post pulled from Blogscape up until May 1
- We’re now counting # of Root Domains linking to a Subdomain instead of Subdomains linking to a Subdomain
In this update we’ve focused on a small amount of growth, up-to-date fresh data, and including the fast moving web, which has traditionally been difficult to capture for us. So go check out some reports, links, and top pages.
In light of the recent discussion across the SEO world about revelations about nofollow, here are some stats on nofollow usage we’ve observed:
- nearly 15 billion links (~3% of all links) using the nofollow attribute
- over 11 billion of those were internal (73% of instances of nofollow)
I don’t pretend to know what motivates this internal usage of nofollow, but this is certainly consistent with the hypothesis that nofollow is used extensively for internal architecture reasons. We’re looking into this issue a great deal. Be sure to check out WBF this week.
This update calculates mozRank as Rand describes as the "old" way. We’re working on changes to include the "new" behavior and when we get that out (in about a month) we’ll include some notes about correlations and changes.
We’ve also been keeping a close eye on adoption of rel=canonical. Our data shows a low, but growing level of adoption. We’ve got just over 38 million instances. From our anecdotal view we’ve seen it used pretty successfully on a few large sites. But I know there’s still a lot of skepticism about it in some cases, so your mileage may vary. Still, it’s not hard to include, so it might be a worthwhile investment regardless.
Shortly after each index update we also update our list of the Top Domains and Top Pages on the web. So be sure to keep an eye out for that data being updated very soon. Another thing we’ve been doing since launch is saving Linkscape reports. So if you’re looking for history of sites and pages you’ve run in the past, be sure to check those out.
In addition to the index update, some of you may have read about a new addition to our SEOmoz Labs offering, a Linkscape Visualization Tool, which we’re very happy to make available to you. As usual, this is a prototype, providing some advanced functionality we hope to include in future versions of our products. In the meantime head over to Labs and check out what else we’ve got.
The visualization tool itself provides a lot of neat features that make what we’re trying to do with Linkscape much more intuitive:
What we’ve done is to lay several key factors onto a radar graph to illustrate the comparison between these two sites. Radar graphs are a bit fancy, but the idea is pretty neat: each leg represents a different dimension. For instance to the upper right we’ve illustrated that shopstyle.com beats revolveclothing.com on a pure external link count, but revolveclothing.com beats shopstyle.com in terms of domain diversity of those links. Overall these two sites are competitive with each other, but the larger shopstyle.com area suggests that shopstyle has a slight edge from a pure link profile perspective.
We’ve powered a few consulting gigs of our own with this kind of visualization and it makes a great way for clients to see visually how they stand against competition, how internal pages compare to a site’s homepage, and where the greatest weaknesses between two pages lie. But we don’t just visualize the data. We also provide the raw data in a table:

So far it seems like that "Overall Score" is pretty well correlated with ranking for sites with similar content. So we feel like this is a pretty good view of a page’s link profile.
More importantly we’ve built some of our SEO experience and analysis into suggestions and next steps:

These suggestions are a good place to look if you want to know your biggest strengths and weaknesses. And we’ve got some contextual links to get some more information.
So try out the tool and the new index and let me know what you think!
mozRank and PageRank for Metrics Driven SEO
Posted by Nick Gerner
Judging by our analytics and the volume of Q+A about mozRank and PageRank, I’d say a lot of you are applying metrics to your SEO. And I don’t just mean search engine referrals. Given the economy, it’s great if you can lay out some hard numbers, connect that to results, and make a strong argument for the work that you do.
I know we’re seeing our SEO business continue to grow, not shrink, even in this harsh economic climate. I think a lot of that is because we’re able to provide good numbers to back up our strategy recommendations. But even with the expertise on our SEO team and out in the community, I still see a few common questions about mozRank and PageRank, and what these mean for real-world SEO:
- What’s the scale for mozRank? And why do I care?
- What does mozRank measure compared to Domain mozRank?
- How does mozRank compare to or differ from PageRank? Why should I use one or the other?
- What does PageRank really tell me about a page? How is it limited and what can I do with this knowledge?
These are great questions. We’ve got some discussion in our Linkscape help center around these, but it’s a little technical and product focused. And we’d prefer to tell everyone (not just PRO members) something about SEO here.
If you’d like, you can jump straight to the takeaways. By the way, when I say "PageRank" below I mostly mean Toolbar PageRank, the green fairy dust in your Google Toolbar.
First, on to scales. Both mozRank and PageRank (both academically and in the real world) have very few pages at the top (mR/PR 10) and many, many pages at the bottom (mR/PR 1). Because there’s such a big disparity here, both of them have a handy 10 point scale, as illustrated below.
You’ll notice the Y-axis is showing a hypothetical Link Juice metric on a log scale. So where we have mozRank 4, you’ll see that corresponds to a hypothetical link juice value of 599, and mozRank 5 corresponds to 5,000. This just reflects the relative effort to get these mozRanks, whether this be links, authoritative endorsements, etc.
Take a mozRank or PageRank 5 page. It sits one point above a 4 and one point below a 6, the same one-point gap in both directions. But the work you put in to go from 4 to 5 is quite different from the work you do to go from 5 to 6. For beginners it’s frustrating to hit a particular level of PR or mR and feel like you’ve plateaued. When this happens, it’s time to sharpen your pencils because you’ve got to break out some new techniques.
FYI, we set mozRank so that each level is ~8 times as much link juice as the prior level. To show this in another way, here’s the same graph, but we’ve taken out the fancy scaling and just show the gradations between mozRank 5 and 7.

Suddenly you can see the real difference in SEO effort between a mozRank 5.08 and a mozRank of 6.61, and the work left until mozRank 7. Show this the next time someone gives you a hard time about that link building effort you’re making. Or better yet, the next time you’re link building, be sure to measure where you are today, and choose the right link building tactics. What worked to get you to mozRank or PageRank 5 just isn’t going to cut it if you’re going after that elusive 7.
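To put rough numbers on that gap, here is a minimal sketch that assumes only the "~8 times per level" rule of thumb mentioned above. The function name and the exact base are illustrative assumptions, not the actual Linkscape calculation.

```python
# Assumption: each whole mozRank point is ~8x the link juice of the
# previous one, as described in the post. The real scaling is approximate.
MOZRANK_BASE = 8.0

def juice_ratio(mr_from: float, mr_to: float, base: float = MOZRANK_BASE) -> float:
    """Return how many times more link juice mr_to implies than mr_from."""
    return base ** (mr_to - mr_from)

# One full step up the scale is a factor of 8.
print(juice_ratio(4, 5))

# The two points called out in the graph: 5.08 versus 6.61.
# Under this assumption the second page has roughly 24x the link juice.
print(round(juice_ratio(5.08, 6.61), 1))
```

The same function works for any pair of scores, so you can estimate how far a page is from a target before picking link building tactics.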
When people say site PageRank, they’re really talking about the link profile of the whole domain: what other sites are linking back? mozRank and (the original academic form of) PageRank both measure only links between pages. This ignores any factors about content, anchor text, domain age, authority, or trust. Domain mozRank and the concept of site PageRank are both interested only in links between sites. This still ignores factors about content, anchor text, or domain age. While mozRank is scoped at the page level and measures reach by links to that page, Domain mozRank is scoped at the whole domain and measures how broadly the domain is referenced across many different domains. In this case, many links from a single domain don’t help, but a few links from each of many different domains do help.
But typically one uses PageRank of the homepage to measure this. PageRank doesn’t do a bad job of this, but it’s not directly measuring this effect. Inside Linkscape (and exposed on the mozBar), we show Domain mozRank, which does directly reflect this, on the same kind of a scale described above.

Each level of Domain mozRank is about five times the juice of the prior level. This reflects the fact that there are many more pages than domains. But you get the same issues trying to jump from DmR 5 to 6 compared to 4 to 5. Getting more links from the same domains already linking to you isn’t going to help your site-wide link profile. So if you’re stuck at DmR 5, it’s time to reach a little more broadly, engage in some new communities, and partner with some new sites.
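The "more links from the same domains won’t help" point can be checked with a quick count. The sketch below is not how Domain mozRank is computed. It only contrasts raw link counts with the number of distinct linking hosts, and it groups by hostname as a stand-in for root domain. The function name and the sample URLs are made up for illustration.

```python
from urllib.parse import urlparse

def linking_domain_counts(backlink_urls):
    """Return (total_links, distinct_linking_hosts) for a list of backlink URLs."""
    hosts = {urlparse(url).hostname for url in backlink_urls}
    return len(backlink_urls), len(hosts)

# Hypothetical profile A: 50 links, all from one forum.
many_from_one = [f"https://forum.example/thread/{i}" for i in range(50)]

# Hypothetical profile B: 5 links, each from a different site.
few_from_many = [f"https://site{i}.example/post" for i in range(5)]

print(linking_domain_counts(many_from_one))  # (50, 1)
print(linking_domain_counts(few_from_many))  # (5, 5)
```

Profile A has ten times the links, but profile B is referenced by five times as many hosts, which is the kind of breadth a domain-level metric rewards.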
So what about PageRank? Why do I keep talking about mozRank if Google isn’t using it in their algorithms? That’s a very valid question. We are confident, and plenty of expert SEOs agree, that Google cares about links. They care about links from authoritative domains more than links from non-authoritative domains. And once you’ve gotten those links, the links you give out count for more. This is exactly the intuition we capture in mozRank and Domain mozRank. In fact, we’ve done a lot of studying and comparing mozRank and PageRank and we’ve found something really encouraging, and something a bit surprising.

* I’ve included a small amount of noise in PageRank (+/- 0.5) because PageRank is only provided with 10 gradations (e.g., PR 5 or 6 but never 5.34). This causes bunching in the graph, which makes interpretation difficult.
This graph visually shows how mozRank compares to PageRank. The x-axis represents toolbar PageRank* of a page, and the y-axis represents mozRank for the same page. I’ve included the line y=x, which shows what perfect correlation would look like. For you stats junkies, the Pearson correlation coefficient is 0.48, which is good, but not a perfect correlation.
We’re pleased with this correlation. But by PR 4, mozRank starts to fall below PR, in some cases by at least a point. Our rule of thumb is that mozRank should be within a point or two of PageRank. This gets at data to support a belief many of you have had for a long time: Toolbar PageRank is correlated with site-wide authority and trust effects, beyond just page-level links. This can make things difficult for the metrics-driven SEO: how can you measure your current position, and progress against different ranking factors, when the metrics you’ve got combine effects?
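If you want to run the same kind of comparison on your own pages, here is a minimal sketch of the two steps described above: a Pearson correlation between the two scores, and a +/- 0.5 jitter applied to the integer PageRank values for plotting. The sample numbers are made up, and computing the correlation on the raw scores (with jitter used only for display) is an assumption about the workflow.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def jitter(values, spread=0.5, seed=0):
    """Add uniform noise in [-spread, +spread] so integer scores do not bunch up."""
    rng = random.Random(seed)
    return [v + rng.uniform(-spread, spread) for v in values]

# Hypothetical sample: toolbar PageRank and mozRank for the same eight pages.
toolbar_pr = [2, 3, 3, 4, 5, 5, 6, 7]
mozrank = [2.4, 2.9, 3.6, 3.8, 4.1, 5.2, 4.9, 6.0]

r = pearson(toolbar_pr, mozrank)   # correlation on the raw scores
plot_x = jitter(toolbar_pr)        # jittered x values, for the scatter plot only

print(f"Pearson r = {r:.2f}")
```

Plotting `plot_x` against `mozrank`, with the line y=x drawn on top, gives the same style of chart as the one shown here.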

The Pearson correlation between Domain mozRank and PageRank of the homepage of the domain is 0.71. This is a much more significant correlation than the page-level correlation between mozRank and PageRank. This time, we see much more significant clustering around the perfect correlation line y=x. And this time we see much less of the underestimating we saw with page-level mozRank. This suggests that Toolbar PageRank is showing several factors, including page-level linking, but also site authority and trust. And those factors are combined into a single score. Using PageRank alone can leave plenty of question marks about your strengths and weaknesses.
For the metrics-driven SEO, this implies a few things:
- A high Toolbar PageRank for a page might not indicate a widely popular page. In fact, the page might be very lightly linked to, but might instead be reaping the rewards of being on a strong domain (e.g., some Wikipedia pages).
- Analyze the profile of the whole domain during the initial audit process, and not just specific pages. A new or unknown page might receive a high PageRank just by being on a strong domain. PRO members can try out the labs backlinks analyzer and choose "root domain" or "just this page" to see these two profiles.
- Work on site-wide performance, and then focus it. Gain authority for your whole domain, then focus that strength through link sculpting, on-page keyword factors, and anchor text.
- Use fine-grained metrics. Where appropriate, metrics like mozRank and Domain mozRank along with some comparisons to the competition can give an audit some powerful, targeted conclusions about strengths and about what is missing. We’re certainly doing a lot of this in our own consulting.
The online marketing space is filled with measurements: analytics, conversions, cost-per-click. A lot of SEO is something of an art requiring high-level expertise. But there’s plenty of room for measurement here too. Check out your site profile, check out your strong and weak pages. Measure the authority of your site. Prioritize your work based on your known strengths and weaknesses. And show your stakeholders not just what you’re doing but why, and how that’s changing.
With the economy in the shape it’s in, the people who can measure their work and validate their assumptions are the people who are going to survive. And they’re not just going to survive, but they’ll thrive as they pick up the pieces the rest of us leave behind.
Top Pages On A Domain - Linkscape Index Update
Posted by Nick Gerner
Just a few short weeks ago we launched the last index update… and here we are to do it again. Ben’s gotten faster at his parts of the update, from four or five weeks to just two, with me dragging my feet for another week or so. So that’s right, we’re down to just three or four weeks between updates, with accompanying data freshness (which we still want to improve substantially).
But I’m not here to tell you about the same stuff you’ve already heard about. Instead, let me unveil another PRO-only SEOmoz Labs project: Top Pages.
We’re building Linkscape not just as a tool, but as a data source on which we and 3rd parties can build new and innovative tools. Top Pages is a showcase of some of the things we’re hoping to enable.
Enter a subdomain and this tool will show you the top pages on that subdomain, based on the same criteria we use for our Top Sites list. Oh, and did I mention that we’ve got a lot of these top pages?
Wow, that’s a lot of pages. We’ll go up as high as 10,000 in this tool. But if you get this far and want more, check out our custom reports. Or get involved with our developer community to stay up-to-date on getting access to more of our data.
So what can you do with this tool?
- Investigate which pages make a big link impact, even if they’re not generating substantial traffic
- See which pages have ranking potential with keyword tweaks, if they’re not already ranking
- See which competitor pages are getting traction
- See where competitors spend link-building energy
By coupling Top Pages with the Labs Backlinks Tool we launched a couple of weeks ago, you can get a pretty good idea of which pages on a domain have ranking potential and what links are powering that potential.
Enjoy, and be sure to send us feedback if you like this!

