A spider trap is a set of web pages or links that catches a web crawler or bot in an infinite loop or other recursive structure, consuming its resources and effectively tying up the crawler indefinitely.
A spider trap is also known as a crawler trap.
Some spider traps are created intentionally to divert web crawlers. For example, someone may program an endlessly deep directory structure so that a crawler keeps descending into it instead of moving on to other areas of a site. Programmers can also overload a crawler’s lexical analyzer with malformed content, or flood a session with cookies, in order to drain the resources of spambots and other unwanted crawlers.
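The deep-directory technique above can be sketched in a few lines. This is a minimal, illustrative example, not a real framework API: the function name `make_trap_page` and the `/more/` path segment are assumptions chosen for the sketch. Each page a crawler requests links only to a page one directory deeper, so a naive crawler that follows every link descends forever.

```python
def make_trap_page(path: str) -> str:
    """Return HTML whose only link points one directory level deeper.

    A naive crawler following the link will request an ever-longer
    URL on each visit and never reach a terminal page.
    """
    deeper = path.rstrip("/") + "/more/"  # e.g. /trap/ -> /trap/more/
    return f'<html><body><a href="{deeper}">next</a></body></html>'


# Each generated page leads one level deeper than the last:
first = make_trap_page("/trap/")        # links to /trap/more/
second = make_trap_page("/trap/more/")  # links to /trap/more/more/
```

Because the "directory" is generated on the fly, the trap costs the site almost nothing to serve while the crawler burns time and memory tracking an unbounded URL space.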
Other spider traps arise unintentionally from programming errors. Online calendars are a classic example: a page with a "next month" link can generate an effectively infinite sequence of valid URLs, sending poorly made crawlers into loops that exhaust their resources or crash them.
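A crawler can defend itself against both kinds of trap with simple heuristics applied before fetching a URL. The sketch below, with assumed thresholds (`MAX_DEPTH`, `MAX_REPEATS`) chosen for illustration, flags URLs that are suspiciously deep or that repeat the same path segment, a common symptom of a crawl loop:

```python
from urllib.parse import urlparse

MAX_DEPTH = 8     # assumption: legitimate sites rarely nest this deep
MAX_REPEATS = 2   # assumption: a segment repeating 3+ times suggests a loop


def looks_like_trap(url: str) -> bool:
    """Heuristic check a crawler can run before fetching a URL."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    if len(segments) > MAX_DEPTH:
        return True  # suspiciously deep directory structure
    for seg in set(segments):
        if segments.count(seg) > MAX_REPEATS:
            return True  # same segment repeating: likely a crawl loop
    return False
```

Real crawlers combine checks like this with per-site page budgets and URL canonicalization, which is why well-built crawlers shrug off most accidental traps.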
The use of spider traps and similar designs to foil robotic web crawlers is likely to change with the emergence of new machine learning and artificial intelligence techniques. Designers have shown that crawlers can now respond to on-page challenges much as humans do, so defenses like CAPTCHA are becoming less effective against robotic users. Spider traps will probably follow the same path, because the parties operating the crawlers can equip them to recognize and avoid these traps.