Self-Organization and Identification of Web Communities

Gary William Flake, Steve Lawrence, C. Lee Giles, Frans Coetzee. Self-Organization and Identification of Web Communities. IEEE Computer, 35(3):66-71, 2002.

Abstract

Despite the decentralized and unorganized nature of the web, we show that the web self-organizes such that communities of highly related pages can be efficiently identified based purely on connectivity. This discovery allows the identification of communities independent of, and unbiased by, the specific words used by authors. Applications include improved search engines, content filtering, and objective analysis of relationships within and between communities on the web.
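
As a rough illustration of what identifying a community "based purely on connectivity" can mean, the sketch below checks a candidate set of pages against one simple link-based criterion: every member has at least as many links to other members as to non-members. The criterion, the function name `is_community`, and the toy link graph are assumptions made for this example; the abstract does not spell out the authors' procedure, and this is not a reproduction of it. Note that no page text is consulted, only links.

```python
from collections import defaultdict


def is_community(links, members):
    """Check an assumed link-only criterion: each member has at least as
    many links to members as to non-members."""
    members = set(members)
    # Treat hyperlinks as undirected and build a symmetric neighbor map.
    neighbors = defaultdict(set)
    for src, dst in links:
        if src != dst:  # ignore self-links
            neighbors[src].add(dst)
            neighbors[dst].add(src)
    for page in members:
        inside = len(neighbors[page] & members)
        outside = len(neighbors[page] - members)
        if inside < outside:
            return False
    return True


# Hypothetical toy graph: two triangles of pages joined by the single link c-d.
links = [
    ("a", "b"), ("b", "c"), ("a", "c"),
    ("c", "d"),
    ("d", "e"), ("e", "f"), ("d", "f"),
]

print(is_community(links, {"a", "b", "c"}))  # True
print(is_community(links, {"c", "d"}))       # False
```

The set {a, b, c} passes because its only outward link is c-d, while {c, d} fails because each of its pages links mostly to pages outside the set.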