Indexing and Searching with Apache Lucene 4.7 with Example

This article covers indexing and searching documents with Apache Lucene 4.7. Before jumping to the example and explanation, let's see what Apache Lucene is.

Introduction to Apache Lucene

Lucene is a high-performance, scalable information retrieval (IR) library. IR refers to the process of searching for documents, information within documents, or metadata about documents. Lucene lets you add searching capabilities to your application. [ref. Lucene in Action, Second Edition, which covers Apache Lucene v3.0]

The main reason for Lucene's popularity is its simplicity. You don't need in-depth knowledge of the indexing and searching process to get started with Lucene. You can start by learning a handful of classes that do the actual indexing and searching. The latest released version is 4.7, while books are available only for v3.0.

Important note

Lucene is not a ready-to-use application like a file-search program, web crawler, or search engine. It is a software t