Apache Lucene

Definition - What does Apache Lucene mean?

Apache Lucene is an open source, high-performance, full-featured text search engine library written entirely in Java.

It provides full-text search within documents, which makes it suitable for any application that requires this feature, especially cross-platform applications.

Lucene was first developed by Doug Cutting in 1999 and officially became part of the Apache Software Foundation's Jakarta family of open source Java projects in September 2001. It was promoted to a top-level Apache project in February 2005.

Techopedia explains Apache Lucene

Apache Lucene is a high-performance search engine whose logical architecture is built around the concept of "a document containing fields of text." This offers great flexibility and makes the Lucene API independent of any file format.
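
The "document containing fields of text" idea can be illustrated with a toy in-memory index. The sketch below is not Lucene's API; the class `FieldIndexSketch` and its methods `addDocument` and `search` are hypothetical names chosen for illustration, and it uses only the Java standard library. It assumes a document is simply a map from field name to text, and shows why the index never needs to know which file format the text came from.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeSet;

// Illustrative only: a toy inverted index, not Lucene's actual classes.
class FieldIndexSketch {
    // Maps "field:term" to the sorted ids of documents containing that term in that field.
    private final Map<String, TreeSet<Integer>> postings = new HashMap<>();
    private int nextDocId = 0;

    // Adds a document given as (field name -> text) and returns its id.
    int addDocument(Map<String, String> fields) {
        int docId = nextDocId++;
        for (Map.Entry<String, String> field : fields.entrySet()) {
            for (String term : tokenize(field.getValue())) {
                String key = field.getKey() + ":" + term;
                postings.computeIfAbsent(key, k -> new TreeSet<>()).add(docId);
            }
        }
        return docId;
    }

    // Returns the ids of documents whose given field contains the term.
    List<Integer> search(String field, String term) {
        TreeSet<Integer> hits = postings.get(field + ":" + term.toLowerCase(Locale.ROOT));
        return hits == null ? new ArrayList<>() : new ArrayList<>(hits);
    }

    // Lowercases the text and splits it on runs of non-alphanumeric characters.
    private static List<String> tokenize(String text) {
        List<String> terms = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        FieldIndexSketch index = new FieldIndexSketch();
        index.addDocument(Map.of("title", "Lucene in Action", "body", "Full-text search with Java"));
        index.addDocument(Map.of("title", "Search basics", "body", "An inverted index maps terms to documents"));
        System.out.println(index.search("title", "lucene")); // [0]
        System.out.println(index.search("body", "search"));  // [0]
        System.out.println(index.search("title", "search")); // [1]
    }
}
```

Because the index only ever sees field names and extracted text, the same structure would serve for a Word file, an HTML page, or a PDF once their text has been pulled out by some other tool.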

Text from formats such as MS Word, HTML, XML, PDF, and OpenDocument can be indexed as long as the textual information can be extracted. Content with no extractable text, such as images, cannot be indexed.

Lucene is suitable for any application that needs full-text indexing and search capability, and it is widely recognized as a great utility for implementing Internet search engines and local, single-site search.

Features include:

  • Scalable, high-performance indexing - it can process over 150 GB per hour on modern hardware and requires only about 1 MB of heap memory.
  • Powerful, accurate and efficient search algorithms - it offers many query types, such as phrase, wildcard, proximity, and range queries. It also supports fielded searching and sorting by any field.
  • Cross-platform - it is a pure Java implementation, and versions are also available in other programming languages.
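
To give a feel for what wildcard and range queries involve, here is a minimal sketch of both over a sorted set of terms. It is an assumption-laden illustration rather than Lucene's implementation: `TermQuerySketch`, `addTerm`, `prefix`, and `range` are hypothetical names, only trailing wildcards such as `luc*` are handled, and terms are compared as plain strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Illustrative only: prefix (trailing-wildcard) and range lookups over a sorted term set.
class TermQuerySketch {
    private final TreeSet<String> terms = new TreeSet<>();

    void addTerm(String term) {
        terms.add(term);
    }

    // Trailing-wildcard query such as "luc*": every term that starts with the prefix.
    List<String> prefix(String prefix) {
        List<String> matches = new ArrayList<>();
        // Terms sharing a prefix are contiguous in sorted order, so stop at the first miss.
        for (String term : terms.tailSet(prefix, true)) {
            if (!term.startsWith(prefix)) {
                break;
            }
            matches.add(term);
        }
        return matches;
    }

    // Range query: every term between lower and upper, both inclusive.
    // TreeSet.subSet throws IllegalArgumentException if lower sorts after upper.
    List<String> range(String lower, String upper) {
        return new ArrayList<>(terms.subSet(lower, true, upper, true));
    }

    public static void main(String[] args) {
        TermQuerySketch dictionary = new TermQuerySketch();
        for (String t : List.of("apache", "lucene", "lucid", "luck", "solr", "1999", "2001", "2005")) {
            dictionary.addTerm(t);
        }
        System.out.println(dictionary.prefix("luc"));         // [lucene, lucid, luck]
        System.out.println(dictionary.range("2000", "2005")); // [2001, 2005]
    }
}
```

Keeping terms in sorted order is the design choice that makes both lookups cheap: matching terms sit next to each other, so neither query has to scan the whole dictionary.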