An index of virtually every English-language political site on the web. The index contains more than 1.8 million web sites, each crawled and classified by language (English or non-English) and by political content. Of these, roughly 800,000 are political sites.


This automated snowball census was conducted 8/1/2010.

The complete 2010 index (107 MB, zipped csv)
1% sample of the 2010 index
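
Because the index is distributed as a zipped csv, it can be inspected without unpacking it to disk. The following is a minimal sketch, not part of the census tooling. The function name `preview_zipped_csv`, the UTF-8 encoding, and the assumption that the archive holds one csv file are all illustrative guesses; the column layout is not described here, so the sketch returns raw rows.

```python
import csv
import io
import itertools
import zipfile


def preview_zipped_csv(zip_path, n_rows=5):
    """Return the first n_rows rows of the first .csv file inside a zip archive.

    Hypothetical helper: the archive layout and the encoding are assumptions.
    """
    with zipfile.ZipFile(zip_path) as archive:
        csv_names = [name for name in archive.namelist()
                     if name.lower().endswith(".csv")]
        if not csv_names:
            raise ValueError("no .csv file found in archive")
        with archive.open(csv_names[0]) as raw:
            # Decode the byte stream lazily so the full file is never loaded.
            text = io.TextIOWrapper(raw, encoding="utf-8",
                                    errors="replace", newline="")
            reader = csv.reader(text)
            return list(itertools.islice(reader, n_rows))
```

Calling `preview_zipped_csv(path)` on the downloaded archive (whatever its local file name is) returns a list of rows, each a list of strings. Starting with the 1% sample is a cheap way to check the layout before processing the full file.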


To conduct this census, we developed snowCrawl, an open-source Python library for directed web crawls. The snowCrawl code repository is hosted at http://code.google.com/p/snowcrawl/. Please visit that site to download the library and examples.
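
For readers unfamiliar with the term, the general idea of a directed (snowball) crawl can be sketched as follows. This is not the snowCrawl API. The function `snowball_crawl`, its parameters, and the toy link graph are hypothetical, and the rule of following links only out of sites classified as relevant is an assumption about how such a crawl is typically directed.

```python
from collections import deque


def snowball_crawl(seeds, get_links, is_relevant, max_sites=1000):
    """Breadth-first crawl that expands only sites judged relevant.

    seeds:       iterable of starting sites
    get_links:   function mapping a site to the sites it links to
    is_relevant: classifier function returning True or False for a site
    Returns a dict mapping every visited site to its classification.
    """
    queue = deque()
    seen = set()
    for seed in seeds:
        if seed not in seen:
            seen.add(seed)
            queue.append(seed)

    results = {}
    while queue and len(results) < max_sites:
        site = queue.popleft()
        relevant = is_relevant(site)
        results[site] = relevant
        if not relevant:
            continue  # assumed rule: do not follow links out of off-topic sites
        for target in get_links(site):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return results


# Toy, in-memory stand-ins for fetching pages and classifying them.
toy_links = {
    "a.example": ["b.example", "c.example"],
    "b.example": ["d.example"],
    "c.example": ["e.example"],  # c is off-topic, so e is never visited
    "d.example": [],
}
toy_political = {"a.example", "b.example", "d.example"}

census = snowball_crawl(
    seeds=["a.example"],
    get_links=lambda site: toy_links.get(site, []),
    is_relevant=lambda site: site in toy_political,
)
```

In this sketch the result records both relevant and non-relevant sites that were visited, which parallels the index above: more sites crawled and classified than are ultimately labeled political.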


For a description of the process used to generate this census, please see the working paper "An automated snowball census of the political web" at SSRN.