Using ElasticSearch to drive extremely fast Search applications
More and more IT applications, transactional as well as reporting and analytics systems, must deal with textual data at a volume and velocity that push traditional text search technologies beyond their breaking point. A variety of solutions, both commercial and open-source, have emerged to address the growing need for high-performance, scalable, yet affordable text search.
Among the open-source options currently available, ElasticSearch appears to be the clear front-runner.
This article examines the architecture and features of ElasticSearch and outlines some use cases for which it seems particularly well suited.
ElasticSearch – Genesis, Architecture & Features
ElasticSearch began as an open-source project, hosted on GitHub, that builds a distributed search engine on top of Lucene, Apache's very popular Java library for full-text search. It does not depend on Hadoop or HDFS: each index is split into shards that are spread and replicated across the nodes of a cluster, which gives ElasticSearch the horizontal scalability and fault tolerance usually associated with Big Data technologies.
A typical ElasticSearch implementation consists of a cluster of nodes that together store and search the index. Incoming text is converted to JSON documents for ingestion into ES storage; where the source data already sits in Hadoop (HDFS), MapReduce jobs can perform that conversion.
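The conversion of incoming text into JSON documents can be sketched in a few lines of Python. This is a minimal illustration, not a description of any particular pipeline: the function name `to_bulk_ndjson`, the index name `call-logs` and the `message` field are assumptions made for the example. The output follows the newline-delimited layout that the ElasticSearch bulk API accepts, with one action line followed by one document line.

```python
import json


def to_bulk_ndjson(lines, index_name):
    """Turn raw text lines into a bulk-request body (newline-delimited JSON).

    `index_name` and the "message" field are hypothetical names chosen
    for this sketch.
    """
    rows = []
    for line in lines:
        text = line.strip()
        if not text:
            continue  # skip blank lines rather than indexing empty documents
        rows.append(json.dumps({"index": {"_index": index_name}}))  # action line
        rows.append(json.dumps({"message": text}))                  # document line
    if not rows:
        return ""
    # The bulk format expects every line, including the last, to end in a newline.
    return "\n".join(rows) + "\n"


if __name__ == "__main__":
    sample = ["caller reports login failure", "", "refund requested"]
    print(to_bulk_ndjson(sample, "call-logs"), end="")
```

The resulting string would be sent as the body of a bulk request; the transport (an HTTP client or an official client library) is deliberately left out of the sketch.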
Features of ElasticSearch:
Scale out simply by adding more nodes to the cluster.
Response times of a few milliseconds on searches across terabytes of data.
Robust, full-featured, intuitive REST API to support query applications built on top of your ElasticSearch cluster. Requests and responses are JSON over HTTP, which allows for language-independent querying.
Support for queries involving geographic bounding, wildcards, phrases, etc.
Support for flexible and heterogeneous schemas facilitates storage of content in native formats.
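To make the query features listed above more concrete, here is a rough Python sketch that assembles a JSON search body combining a phrase match, a wildcard and a geographic bounding box. The field names (`description`, `sku`, `location`) and the function `build_search_body` are assumptions for illustration only. The `bool`, `match_phrase`, `wildcard` and `geo_bounding_box` clauses are part of the ElasticSearch Query DSL in recent versions; older releases expressed filters differently.

```python
import json


def build_search_body(phrase, sku_pattern, top_left, bottom_right):
    """Build a search request body as a plain dict.

    top_left / bottom_right are (lat, lon) tuples. All field names are
    hypothetical and would have to match the actual index mapping.
    """
    return {
        "query": {
            "bool": {
                "must": [
                    # exact phrase in the free-text field
                    {"match_phrase": {"description": phrase}},
                    # wildcard pattern on an identifier field
                    {"wildcard": {"sku": {"value": sku_pattern}}},
                ],
                "filter": [
                    # restrict hits to a geographic rectangle
                    {
                        "geo_bounding_box": {
                            "location": {
                                "top_left": {"lat": top_left[0], "lon": top_left[1]},
                                "bottom_right": {"lat": bottom_right[0], "lon": bottom_right[1]},
                            }
                        }
                    }
                ],
            }
        }
    }


if __name__ == "__main__":
    body = build_search_body("wireless headphones", "WH-*", (41.0, -74.5), (40.0, -73.0))
    print(json.dumps(body, indent=2))
```

Because the body is plain JSON, the same request could be issued from any language that can make an HTTP call, which is the language independence the feature list refers to.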
Salient Use Cases
Call Center Log Analysis
- A typical call center handles X calls per minute per person, each producing X megabytes of text from audio conversion, which adds up to terabytes of logs per month
- Very useful search and text analytics apps can be implemented on top of call center logs
- E.g. finding the products or issues generating the most calls, or using ES support for geographic queries to identify support call hot spots
Web Log Analysis
- Powerful and scalable analysis of web logs
- Visualization and slicing and dicing of web log data in ES using Kibana (http://www.elasticsearch.org/overview/kibana/)
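For the log analysis use cases above, questions such as "which products generate the most calls" and "where are the support call hot spots" map naturally onto aggregations. The sketch below builds such a request body in Python. The field names `product` and `caller_location` and the function `build_hotspot_body` are assumptions for the example; `terms` and `geohash_grid` are aggregation types provided by ElasticSearch.

```python
import json


def build_hotspot_body(top_n=10, precision=4):
    """Build an aggregation-only request body.

    "product" and "caller_location" are hypothetical field names;
    "caller_location" would need to be mapped as a geo_point.
    """
    return {
        "size": 0,  # return no individual hits, only aggregation results
        "aggs": {
            # the top_n most frequent product values across all log entries
            "top_products": {"terms": {"field": "product", "size": top_n}},
            # bucket callers into geohash cells to reveal geographic clusters
            "call_hot_spots": {
                "geohash_grid": {"field": "caller_location", "precision": precision}
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_hotspot_body(top_n=5, precision=5), indent=2))
```

A higher `precision` yields smaller geographic cells, so it is a trade-off between map detail and the number of buckets returned.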
E-Commerce Site – Product Search
- Enabling full-text search across product descriptions for tens of thousands of SKUs on an e-commerce site
- Improved customer experience and conversion rates
- Faceted search
Search Apps for Media Companies
- Highly scalable search across media content
- Access to heretofore inaccessible historical content
- Stream social media feeds into ES and enable fast searches for mentions of specific brands and keywords across terabytes of data
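Faceted search, mentioned under the e-commerce use case above, can be sketched as a full-text query plus one terms aggregation per facet. The Python below is an illustration under stated assumptions: the field names (`description`, `brand`, `category`), the helper names and the sample response are all invented for the example, although the `aggregations` / `buckets` / `key` / `doc_count` layout mirrors the shape ElasticSearch uses for terms aggregation results.

```python
def build_faceted_body(text, facet_fields, bucket_count=5):
    """Full-text query plus one terms aggregation per facet field.

    "description" and the facet field names are hypothetical.
    """
    return {
        "query": {"match": {"description": text}},
        "aggs": {
            field: {"terms": {"field": field, "size": bucket_count}}
            for field in facet_fields
        },
    }


def facet_counts(response):
    """Flatten the aggregation part of a search response into
    {facet_name: {value: document_count}}."""
    return {
        name: {b["key"]: b["doc_count"] for b in agg.get("buckets", [])}
        for name, agg in response.get("aggregations", {}).items()
    }


if __name__ == "__main__":
    body = build_faceted_body("running shoes", ["brand", "category"])
    print(sorted(body["aggs"]))

    # Hand-written sample response, not real output from a cluster.
    sample_response = {
        "aggregations": {
            "brand": {"buckets": [{"key": "acme", "doc_count": 12},
                                  {"key": "zenith", "doc_count": 7}]},
            "category": {"buckets": [{"key": "footwear", "doc_count": 19}]},
        }
    }
    print(facet_counts(sample_response))
```

The counts returned by `facet_counts` are what a storefront would render next to each filter option, such as a brand or category checkbox.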