The Internet recently celebrated its thirty-sixth anniversary. Originally designed around 1969 to allow the exchange of packets of bits between computers, it was for a long time restricted to the exchange of scientific data among scientists and of secure information within governments. Electronic mail and bulletin boards then became increasingly popular among those with access to it. It was only in the 1990s, however, that the Internet became a popular means of communication: in 1993 the US federal government opened up the network to commerce, and the creation of the Hypertext Markup Language (HTML) laid the basis for universal accessibility.
Since then the growth has been phenomenal. Various surveys today suggest that fifteen percent of people worldwide use the Internet, or simply “the Net.” Daily use of the World Wide Web is gaining tremendous popularity among those who have the tools and means to explore it, and the number of users increases by the hour. In fact, the Internet has revolutionized the computer and communications world like nothing before. The invention of the telegraph, telephone, radio, and computer set the stage for this unprecedented integration of capabilities. The Internet is at once a worldwide broadcasting capability, a mechanism for information dissemination, and a medium for collaboration and interaction between individuals and their computers without regard for geographic location.
According to recent research published by Gulli and Signorini (2005), Google claims to index more than 8 billion pages, MSN Beta about 5 billion, Yahoo! at least 4 billion, and Ask/Teoma more than 2 billion. Estimating the size of the whole Web is quite difficult because of its dynamic nature. Nevertheless, it is possible to assess the size of the publicly indexable Web, which the two scholars define as “the part of the Web which is considered for indexing by the major engines.” In their short paper, Gulli and Signorini revise and update the estimated size of the indexable Web to at least 11.5 billion pages as of the end of January 2005. They also estimate the relative sizes of the largest Web search engines and the overlap among them. Specifically, Google was found to be the largest engine, followed by Yahoo!, Ask/Teoma, and MSN Beta. They adopt the methodology proposed in 1997 by two other scholars, Bharat and Broder, but extend the number of queries used for testing from 35,000 in English to more than 438,141 in 75 different languages. The two researchers remark that an estimate of the size of the Web is useful in many situations, such as when compressing, ranking, spidering, indexing, and mining the Web.
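To make the idea of relative size and overlap concrete, here is a minimal sketch of the arithmetic behind overlap-based estimates. It assumes that pages can be sampled from each engine and checked for presence in the other. It is not the exact procedure of the cited paper. The function name `estimate_overlap`, its parameters, and the numbers in the usage lines are hypothetical and chosen only for illustration.

```python
def estimate_overlap(size_a, frac_a_in_b, frac_b_in_a):
    """Estimate sizes and overlap of two search engine indexes, A and B.

    size_a      -- assumed known size of index A (any unit, e.g. millions of pages)
    frac_a_in_b -- fraction of pages sampled from A that are also found in B
    frac_b_in_a -- fraction of pages sampled from B that are also found in A

    Both fractions estimate the same intersection from two sides:
        |A and B| ~ frac_a_in_b * |A| ~ frac_b_in_a * |B|
    so the unknown size of B follows from the known size of A.
    """
    if not (0 < frac_a_in_b <= 1 and 0 < frac_b_in_a <= 1):
        raise ValueError("overlap fractions must lie in (0, 1]")

    intersection = frac_a_in_b * size_a          # pages indexed by both engines
    size_b = intersection / frac_b_in_a          # implied size of index B
    union = size_a + size_b - intersection       # pages indexed by at least one

    return {
        "size_b": size_b,
        "intersection": intersection,
        "union": union,
        "ratio_a_to_b": size_a / size_b,         # relative size of A versus B
    }


if __name__ == "__main__":
    # Illustrative inputs only; these are not measurements from any study.
    result = estimate_overlap(size_a=8000.0, frac_a_in_b=0.5, frac_b_in_a=0.8)
    for name, value in result.items():
        print(f"{name}: {value:.1f}")
```

The union of the two indexes is a lower bound on the indexable Web as seen through those two engines. Extending the same reasoning to more engines requires pairwise overlaps and a more careful combination step, which this sketch does not attempt.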
The Web, as it stands today, has allowed global interpersonal exchange on a scale unprecedented in human history. People separated by vast distances, or even by long stretches of time, can use the Web to exchange, or even jointly develop, their most intimate and extensive thoughts, or alternatively their most casual attitudes and moods. Emotional experiences, political ideas, cultural customs, musical idioms, business advice, artwork, photographs, and literature can all be shared and disseminated digitally with less individual investment than ever before.