Definition - What does Spider mean?

In the context of the Internet, a spider is a specialized program designed to systematically crawl the World Wide Web, usually to index Web pages so that they can be returned as results for user search queries. The best-known spider is Googlebot, Google's main crawler, which helps to ensure that relevant results are returned for search queries.

Spiders are also known as Web crawlers, search bots or simply bots.

Techopedia explains Spider

A spider is essentially a program used to harvest information from the World Wide Web. It crawls through the pages of websites, extracting information and indexing it for later use, usually for search engine results. The spider reaches websites and their pages by following the links to and from those pages, so a page with no links pointing to it is difficult for the spider to discover and index, and it may rank very low on the search results page. Conversely, many links pointing to a page are treated as a sign that the page is popular, which tends to move it higher in the search results.
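As a rough illustration of the link-popularity idea, the sketch below counts how many other pages link to each page in a tiny, made-up link graph. The page names and the `inbound_counts` helper are assumptions for illustration only; real search engines weigh many more signals than a raw link count.

```python
from collections import Counter

# Hypothetical link graph: each page maps to the pages it links to.
LINKS = {
    "home": ["about", "blog"],
    "about": ["home"],
    "blog": ["home", "about"],
    "orphan": ["home"],  # links out, but nothing links to it
}

def inbound_counts(links):
    """Return the number of distinct other pages linking to each page."""
    counts = Counter({page: 0 for page in links})
    for source, targets in links.items():
        for target in set(targets):   # count each source at most once
            if target != source:      # ignore self-links
                counts[target] += 1
    return counts

counts = inbound_counts(LINKS)
# Print pages from most-linked to least-linked.
for page, n in sorted(counts.items(), key=lambda item: -item[1]):
    print(page, n)
```

In this toy graph the "orphan" page has no inbound links, which is the situation described above: a spider that only follows links would never reach it.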

Steps involved in Web crawling:

  • The spider finds a site and starts crawling its pages.
  • The spider indexes the words and contents of the site.
  • The spider visits the links found on the site.
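The steps above can be sketched as a small breadth-first crawler. To keep the example self-contained, it reads pages from a hypothetical in-memory dictionary (`PAGES`) instead of fetching them over HTTP; the URLs, the `crawl` function and the `LinkAndTextParser` class are illustrative names, not part of any real spider.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical in-memory "web": URL -> HTML. A real spider would fetch over HTTP.
PAGES = {
    "http://example.test/": '<a href="/a">A</a> <a href="/b">B</a> spiders crawl',
    "http://example.test/a": '<a href="/b">B</a> crawl the web',
    "http://example.test/b": '<a href="/">home</a> index pages',
    "http://example.test/orphan": "nobody links here",
}

class LinkAndTextParser(HTMLParser):
    """Collects link targets and lowercase words from one page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.words = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        self.words.extend(data.lower().split())

def crawl(seed, fetch=PAGES.get):
    """Crawl from seed; return (word -> set of URLs, set of discovered URLs)."""
    index = {}
    seen = {seed}
    queue = deque([seed])
    while queue:
        url = queue.popleft()            # step 1: take the next page
        html = fetch(url)
        if html is None:                 # unknown page, nothing to index
            continue
        parser = LinkAndTextParser()
        parser.feed(html)
        parser.close()                   # flush any buffered trailing text
        for word in parser.words:        # step 2: index the words
            index.setdefault(word, set()).add(url)
        for href in parser.links:        # step 3: queue the links found
            target = urljoin(url, href)  # resolve relative links
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return index, seen

index, seen = crawl("http://example.test/")
print(sorted(seen))
print(sorted(index["crawl"]))
```

The `seen` set prevents the spider from revisiting pages when links form a loop. The page at `/orphan` is never discovered, because no crawled page links to it.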

Spiders, or Web crawlers, are just programs and, as such, they follow systematic rules set by their programmers. Website owners can also influence this by telling the spider which portions of the site to index and which to leave alone. This is done by creating a "robots.txt" file that contains instructions for the spider about which portions to index, which links to follow and which to ignore. The most significant spiders are those operated by major search engines such as Google, Bing and Yahoo, and those built for data mining and research. There are also malicious spiders, written to harvest email addresses that their operators sell to advertising companies, or to find vulnerabilities in Web security.
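As a minimal sketch of how a well-behaved spider might consult "robots.txt", the example below uses the `urllib.robotparser` module from Python's standard library. The file contents, the paths and the agent names "ExampleSpider" and "BadBot" are made up for illustration, and the rules are parsed from a string rather than downloaded from a site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; paths and bot names are illustrative.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: BadBot
Disallow: /
"""

def make_checker(robots_text):
    """Parse robots.txt text and return a parser that answers can_fetch()."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser

rules = make_checker(ROBOTS_TXT)

# An ordinary spider may fetch public pages but not the /private/ section.
print(rules.can_fetch("ExampleSpider", "http://example.test/index.html"))
print(rules.can_fetch("ExampleSpider", "http://example.test/private/data.html"))

# The agent named "BadBot" is disallowed from the whole site.
print(rules.can_fetch("BadBot", "http://example.test/index.html"))
```

Note that "robots.txt" is advisory: compliant spiders check it before fetching a page, but nothing in the file itself stops a malicious spider from ignoring it.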
