OpenWebSpider (Host/Page)Rank
}OpenWebSpider Host Rank
Reference Table:
|HostID|LinkedHostID|...(More info)...|
    A
   / \
  B   C
Ex.
A -> B (A links to B; B is linked from A)
B -> A (B links to A)
C -> A
|A|B|...
|B|A|...
|C|A|...
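
A minimal sketch of how the reference table above could be read, assuming that a host's rank is simply the number of distinct hosts linking to it (this is inferred from the real example later in this document, not from the related doc). The names `reference_table` and `inbound_hosts` are hypothetical, and ignoring self-links is also an assumption.

```python
from collections import defaultdict

# (HostID, LinkedHostID) rows from the example above; extra columns omitted.
reference_table = [("A", "B"), ("B", "A"), ("C", "A")]


def inbound_hosts(rows):
    """Map each linked host to the set of distinct hosts that link to it."""
    inbound = defaultdict(set)
    for host, linked_host in rows:
        if host != linked_host:  # assumption: self-links are ignored
            inbound[linked_host].add(host)
    return inbound


inbound = inbound_hosts(reference_table)

# Assumed host rank: the number of distinct hosts linking to each host.
host_rank = {h: len(inbound[h]) for h in ("A", "B", "C")}
print(host_rank)  # {'A': 2, 'B': 1, 'C': 0}
```

Here A is linked from B and C, B is linked from A only, and nothing links to C.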
}OpenWebSpider Page Rank
Host : test
HostRank : 5 (example)
MaxPRLev : 10 (definition)
Structure :
       p0           Level : 1    PR(p0)       : 15
     /  |  \
   p1   p2   p3           : 2    PR(p1,p2,p3) : 10
   |    |     \
   p4   p5    p6          : 3    PR(p4,p5,p6) : 8.3
PR(px)=HostRank+(MaxPRLev/Level)
PR(p0)=5+(10/1) = 15
PR(p1)=5+(10/2) = 10
...
PR(p5)=5+(10/3) = 8.3
Note:
- HostRank is calculated as explained in the related doc
- MaxPRLev is assumed to be the deepest level of the page tree that can
still raise the relevance of a page (as you can see, the home page (p0)
has the highest page rank in the tree)
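
The formula above can be sketched in a few lines. The function name `page_rank`, its parameter names, and the check on `level` are assumptions made for illustration; only the formula itself comes from this document.

```python
def page_rank(host_rank, level, max_pr_lev=10):
    """PR(px) = HostRank + (MaxPRLev / Level), with Level starting at 1."""
    if level < 1:
        raise ValueError("level must be >= 1")
    return host_rank + max_pr_lev / level


# Host "test" from the example: HostRank = 5, MaxPRLev = 10.
for level in (1, 2, 3):
    print(level, round(page_rank(5, level), 1))
# 1 15.0
# 2 10.0
# 3 8.3
```

Because the level is the divisor, the bonus shrinks quickly with depth, while HostRank is added unchanged at every level.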
A real situation:
> select * from hosts.hostrank where linkedhost="http://shen139.altervista.org"
host                          linkedhost
----------------------------  -----------------------------
http://www.openwebspider.org  http://shen139.altervista.org
In this case I assume HR(http://shen139.altervista.org)=1
> select * from hosts.hostrank where linkedhost="http://www.openwebspider.org"
host                           linkedhost
-----------------------------  ----------------------------
http://hacklab.altervista.org  http://www.openwebspider.org
http://shen139.altervista.org  http://www.openwebspider.org
http://www.eviltime.com        http://www.openwebspider.org
HR(http://www.openwebspider.org)=3
Now let's see how this is reflected in the page rank:
> select hostname,page,level,rank from pagelist order by rank DESC limit 100;
hostname                page                            level  rank
----------------------  ------------------------------  -----  ----
www.openwebspider.org   /                               1      14
shen139.altervista.org  /                               1      12
www.deliciousitaly.com  /                               1      11
www.rivieraligure.it    /                               1      11
www.starhotels.it       /                               1      11
www.viareggino.it       /                               1      11
...
www.marrosso.net        /                               1      11
hacklab.altervista.org  /                               1      11
herzog.splinder.com     /                               1      11
www.hurricane.it        /                               1      11
www.fumettopoli.com     /                               1      11
www.lupinthe3rd.net     /                               1      11
www.openwebspider.org   /feedback.php                   2      9
www.openwebspider.org   /search.php                     2      9
www.openwebspider.org   /screenshoots.php               2      9
www.openwebspider.org   /documentation.php              2      9
www.openwebspider.org   /openwebspider_download.php     2      9
www.openwebspider.org   /index.php                      2      9
shen139.altervista.org  /not_found.html                 2      7
shen139.altervista.org  /shllcdadv.c                    2      7
shen139.altervista.org  /ssss.html                      2      7
www.openwebspider.org   /advanced_search.php            3      7
www.openwebspider.org   /openwebspider_arguments.php    3      7
www.openwebspider.org   /configure_mysql_server.php     3      7
www.openwebspider.org   /openwebspider_php_example.php  3      7
www.viareggino.it       /cinema/                        2      6
www.viareggino.it       /principedipiemonte/            2      6
...
NOTE(1):
As you can see, www.openwebspider.org (HR=3) is at the top of the list,
followed by shen139.altervista.org (HR=1); all other hosts have HR=0.
Another interesting thing is that the level=3 pages of www.openwebspider.org
rank higher than the level=2 pages of hosts with HR=0 (e.g. www.viareggino.it),
and that the level=2 pages of shen139.altervista.org have the same rank as the
level=3 pages of www.openwebspider.org!
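
The two observations in NOTE(1) can be checked against the formula PR(px)=HostRank+(MaxPRLev/Level). This is only a sketch: the function name `page_rank` is hypothetical, and truncating to a whole number is an assumption based on the listing showing integer ranks. The absolute values in the listing appear to be one higher than the formula yields, so only the relative ordering is compared here.

```python
def page_rank(host_rank, level, max_pr_lev=10):
    """PR(px) = HostRank + (MaxPRLev / Level)."""
    return host_rank + max_pr_lev / level


ows_level3 = page_rank(3, 3)    # www.openwebspider.org, HR=3, level 3
shen_level2 = page_rank(1, 2)   # shen139.altervista.org, HR=1, level 2
other_level2 = page_rank(0, 2)  # any host with HR=0, level 2

# A deeper page on a well-linked host outranks a shallower page on an
# unlinked host.
assert ows_level3 > other_level2

# Assumption: ranks are stored as whole numbers, so these two tie.
assert int(ows_level3) == int(shen_level2)
```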
NOTE(2):
This may change in the future!