The "deep web" (also called the invisible web or hidden web) is distinct from the "dark web". The "dark web" is the encrypted network that exists between Tor servers and their clients, whereas the "deep web" is simply the content of databases and other web services that, for one reason or another, cannot be indexed by conventional search engines.
The deep web includes many very common services, such as web mail and online banking, as well as paywalled services such as video on demand, among many others.
Computer scientist Mike Bergman is credited with coining the term "deep web" in 2000 as a search-indexing term.
The first conflation of the terms "deep web" and "dark web" came about in 2009 when the deep web search terminology was discussed alongside illegal activities taking place on the Freenet darknet.
Since then, particularly after media reporting on the Silk Road, many people and media outlets have taken to using "deep web" synonymously with the dark web or darknet, a comparison that Bright Planet rejects as inaccurate and that remains an ongoing source of confusion. Wired reporters Kim Zetter and Andy Greenberg recommend that the terms be used in distinct fashions.
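Content that conventional search engines do not index can be grouped into one or more of the following categories: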
- Contextual Web: pages with content varying for different access contexts (e.g., ranges of client IP addresses or previous navigation sequence).
- Dynamic content: dynamic pages which are returned in response to a submitted query or accessed only through a form, especially if open-domain input elements (such as text fields) are used; such fields are hard to navigate without domain knowledge.
- Limited access content: sites that limit access to their pages by technical means (e.g., using the Robots Exclusion Standard, CAPTCHAs, or the no-store directive, which prohibit search engines from browsing them and creating cached copies).
- Non-HTML/text content: textual content encoded in multimedia (image or video) files or specific file formats not handled by search engines.
- Private Web: sites that require registration and login (password-protected resources).
- Scripted content: pages that are only accessible through links produced by JavaScript, as well as content dynamically downloaded from Web servers via Flash or Ajax solutions.
- Software: certain content is intentionally hidden from the regular Internet, accessible only with special software, such as Tor, I2P, or other darknet software. For example, Tor allows users to access websites using the .onion host suffix anonymously, hiding their IP address.
- Unlinked content: pages which are not linked to by other pages, which may prevent web crawling programs from accessing the content. This content is referred to as pages without backlinks (also known as inlinks). Also, search engines do not always detect all backlinks from searched web pages.
- Web archives: Web archival services such as the Wayback Machine enable users to see archived versions of web pages across time, including websites that have become inaccessible; these archived copies are not indexed by search engines such as Google.
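Three of the categories above (limited access via the Robots Exclusion Standard, the no-store directive, and unlinked content) lend themselves to a simple programmatic check. The following is a minimal sketch in Python, not a description of how any particular search engine works. The robots.txt body, the `ExampleBot` user agent, the example URLs, and the helper names `blocked_by_robots`, `forbids_caching`, and `pages_without_backlinks` are all hypothetical and chosen for illustration; only `urllib.robotparser` comes from the standard library.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body for an example site (not taken from a real site).
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def blocked_by_robots(robots_txt: str, url: str, agent: str = "ExampleBot") -> bool:
    """Return True if the given robots.txt rules disallow `agent` from fetching `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, url)


def forbids_caching(headers: dict[str, str]) -> bool:
    """Return True if the Cache-Control header contains the no-store directive.

    Assumes the header name is spelled exactly "Cache-Control" in the dict.
    """
    cache_control = headers.get("Cache-Control", "")
    directives = {d.strip().lower() for d in cache_control.split(",")}
    return "no-store" in directives


def pages_without_backlinks(links: dict[str, list[str]]) -> set[str]:
    """Return the pages in a link graph that no page links to.

    `links` maps each known page to the list of pages it links to.
    """
    linked = {target for targets in links.values() for target in targets}
    return set(links) - linked


if __name__ == "__main__":
    print(blocked_by_robots(ROBOTS_TXT, "https://example.com/private/report"))
    print(forbids_caching({"Cache-Control": "private, no-store"}))
    print(pages_without_backlinks({"a": ["b"], "b": ["a"], "c": ["a"]}))
```

A crawler that discovers pages only by following links would never reach page `c` in the toy graph above, since nothing links to it, and a crawler that honours the exclusion rules would skip everything under `/private/`.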