Page Rank Explained
How is PageRank Used?
PageRank is one of the methods Google uses to determine a page's relevance or importance. It is only one part of the story when it comes to the Google listing, but the other aspects are discussed elsewhere (and are ever changing), and PageRank is interesting enough to deserve a paper of its own.

PageRank is also displayed on the toolbar of your browser if you've installed the Google toolbar ( http://toolbar.google.com/ ). But the Toolbar PageRank only goes from 0 to 10 and seems to be something like a logarithmic scale.
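As a rough illustration of what a logarithmic toolbar scale could look like, here is a minimal sketch. It assumes that the page with the highest actual PR gets a toolbar PR of 10 and that every other page is scaled by the logarithm of its PR relative to that maximum. The function name `toolbar_pr` and the sample numbers are made up for illustration, and nothing here is confirmed by Google.

```python
import math


def toolbar_pr(actual_pr, max_actual_pr, top=10):
    """Map an actual PR onto a 0..top toolbar value.

    Assumption (a guess, not Google's published method): the scale is
    logarithmic and the highest actual PR on the web maps to `top`.
    """
    # Values at or below 1 have a log of zero or less; show them as 0.
    if actual_pr <= 1 or max_actual_pr <= 1:
        return 0
    ratio = math.log10(actual_pr) / math.log10(max_actual_pr)
    return min(top, math.floor(top * ratio))


# Hypothetical numbers: pretend the highest actual PR on the web is 1e10.
for pr in (0.15, 5000, 1e10):
    print(pr, "->", toolbar_pr(pr, 1e10))
```

On a scale like this, each extra toolbar point needs a fixed multiple of actual PR rather than a fixed amount, which is why the top values would be so much harder to reach than the bottom ones.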
We can't know the exact details of the scale because, as we'll see later, the maximum PR of all pages on the web changes every month when Google does its re-indexing! If we presume the scale is logarithmic (although there is only anecdotal evidence for this at the time of writing) then Google could simply give the highest actual PR page a toolbar PR of 10 and scale the rest appropriately.

Also the toolbar sometimes guesses! The toolbar often shows me a Toolbar PR for pages I've only just uploaded and that cannot possibly be in the index yet! What seems to be happening is that the toolbar looks at the URL of the page the browser is displaying and strips off everything down to the last “/” (i.e. it goes to the “parent” page in URL terms). If Google has a Toolbar PR for that parent then it subtracts 1 and shows that as the Toolbar PR for this page. If there's no PR for the parent it goes to the parent's parent's page, but subtracting 2, and so on all the way up to the root of your site. If it can't find a Toolbar PR to display in this way, that is if it doesn't find a page with a real calculated PR, then the bar is greyed out.

Note that if the Toolbar is guessing in this way, the Actual PR of the page is 0, though its PR will be calculated shortly after the Google spider first sees it.

PageRank says nothing about the content or size of a page, the language it's written in, or the text used in the anchor of a link!

Definitions
I've started to use some technical terms and shorthand in this paper. Now's as good a time as any to define all the terms I'll use:
That's enough of that, let's get back to the meat…

So what is PageRank?
In short PageRank is a “vote”, by all the other pages on the Web, about how important a page is. A link to a page counts as a vote of support. If there's no link there's no support (but it's an abstention from voting rather than a vote against the page).

Quoting from the original Google paper, PageRank is defined like this:

We assume page A has pages T1...Tn which point to it (i.e., are citations). The parameter d is a damping factor which can be set between 0 and 1. We usually set d to 0.85. There are more details about d in the next section. Also C(A) is defined as the number of links going out of page A. The PageRank of a page A is given as follows:

PR(A) = (1-d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn))

Note that the PageRanks form a probability distribution over web pages, so the sum of all web pages' PageRanks will be one. PageRank or PR(A) can be calculated using a simple iterative algorithm, and corresponds to the principal eigenvector of the normalized link matrix of the web.

But that's not too helpful, so let's break it down into sections.
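To make the quoted formula a little more concrete, here is a minimal sketch of the "simple iterative algorithm" it mentions, applied to a made-up three-page site. The function name `pagerank`, the starting guess of 1.0 per page, the iteration count and the example link structure are all assumptions for illustration, and the sketch assumes every page has at least one outgoing link.

```python
def pagerank(links, d=0.85, iterations=100):
    """Repeatedly apply PR(A) = (1-d) + d * (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn)).

    `links` maps each page to the list of pages it links out to, so
    len(links[T]) plays the role of C(T) in the quoted formula.
    Assumption: every page has at least one outgoing link.
    """
    pr = {page: 1.0 for page in links}  # arbitrary starting guess
    for _ in range(iterations):
        new_pr = {}
        for page in links:
            # Each page T that links to `page` passes on PR(T) / C(T).
            votes = sum(pr[t] / len(links[t]) for t in links if page in links[t])
            new_pr[page] = (1 - d) + d * votes
        pr = new_pr
    return pr


# Hypothetical site: A links to B and C, B links to C, C links to A.
example = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(example)
for page, value in sorted(ranks.items()):
    print(page, round(value, 3))
```

In this toy example C ends up with the most PageRank because it collects votes from both A and B. With the formula exactly as quoted, the three values add up to the number of pages (3) rather than to one.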
How is PageRank calculated? Find out my guess...
© 2005 Copyright - NextPageRank.com