Badware URL Analysis



One of the projects I am affiliated with in an advisory capacity is the Berkman Center’s StopBadware.org project. Over the weekend (2007-03-25) I scraped and analyzed the 18328 badware URLs from StopBadware.org’s Badware Website Clearinghouse, “a collaborative effort to build a comprehensive list of websites that host, link to, or otherwise distribute badware”. The results are available here.

The source of all of the URLs (100%) was Google, one of the corporate sponsors of StopBadware.org. Although there are 18328 URLs, there were only 6856 distinct IP addresses. Only 0.4% of the URLs were given a decision of “Badware” (“Sites that StopBadware has tested itself and determined to contain or link to badware”), with the balance listed as “Undetermined”.

  • The top TLDs were .com: 10710, .org: 1550, .info: 1352 and .net: 1300, followed by the ccTLDs .cn: 1216, .ru: 352, .uk: 275, .ua: 226, .it: 129 and .pl: 118.
  • The top countries (based on IP allocation) in which badware URLs are hosted are US: 10037, CN: 4336, ?? (unknown): 1357, DE: 433, RU: 361, GB: 349, UA: 210, IT: 186, CA: 154 and NL: 81.
  • The top AS numbers are AS30380: 3435, AS4134: 1819, AS17233: 1537, ASNA (unknown): 1315 and AS21844: 734.
  • The top network names are IPOWER – iPowerWeb, Inc.: 3435, CHINANET-BACKBONE No.31,Jin-rong Street: 1819, ATT-CERFNET-BLOCK – AT&T Enhanced Network Services: 1537, NA (unknown): 1315 and THEPLANET-AS – THE PLANET: 734.
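
For readers who want to reproduce a tally like the TLD breakdown above, here is a minimal sketch in Python. It is not the script used for this analysis. The function name `top_tlds` and the sample URLs are placeholders of my own, and it simply takes the last dot-separated label of each hostname, skipping bare IPv4 addresses.

```python
from collections import Counter
from urllib.parse import urlsplit


def top_tlds(urls, n=10):
    """Count URLs per top-level domain and return the n most common.

    Hypothetical helper for illustration only.
    """
    counts = Counter()
    for url in urls:
        # urlsplit only fills in the hostname when "//" is present,
        # so prepend it for entries listed without a scheme.
        host = urlsplit(url if "//" in url else "//" + url).hostname
        if not host:
            continue
        # The TLD is taken as the last dot-separated label of the hostname.
        label = host.rsplit(".", 1)[-1]
        if label.isdigit():
            # A numeric last label means a bare IPv4 address, which has no TLD.
            continue
        counts["." + label] += 1
    return counts.most_common(n)


# Placeholder URLs, not entries from the Clearinghouse.
sample = [
    "http://example.com/a",
    "http://example.com/b",
    "example.org/page",
    "http://example.cn/",
    "http://192.0.2.1/x",
]
print(top_tlds(sample))
```

The country, AS number and network name counts would need an extra lookup step per IP address (for example against a WHOIS or routing database), which this sketch leaves out.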

An interesting note is that Google itself appears as the 13th network name (GOOGLE – Google Inc.: 169), with 169 badware URLs, all of which appear to be *.blogspot blogs.
