NoSQL and Big Data

With such vast volumes of data available, processing it and distilling it into useful, manageable information is a task well suited to NoSQL databases and specialist big data processors.

Hadoop

Hadoop is a collection of open-source software utilities that facilitate using a network of many computers to solve problems involving massive amounts of data and computation. It provides a software framework for distributed storage and processing of big data using the MapReduce programming model. Originally designed for computer clusters built from commodity hardware, which is still the common use, it has also found use on clusters of higher-end hardware. All the modules in Hadoop are designed with the fundamental assumption that hardware failures are common occurrences and should be handled automatically by the framework.
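
To make the MapReduce programming model more concrete, here is a minimal single-process sketch of a word count in plain Python. It does not use Hadoop or any of its APIs; the function names (map_phase, shuffle, reduce_phase, word_count) and the sample input are assumptions chosen for illustration. On a real cluster the framework would run the map and reduce steps on many machines and handle the grouping step and any hardware failures itself.

```python
from collections import defaultdict


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group emitted values by key, the step that sits between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reducer: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence on a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


# Hypothetical input: each string stands in for one line of a much larger file.
counts = word_count(["big data needs big clusters", "data moves to clusters"])
```

Because each call to map_phase depends only on its own line, and each call to reduce_phase only on its own key, both steps could in principle be spread across many machines, which is the property the model relies on.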

MongoDB

MongoDB is a scalable and flexible database that stores its data in JSON-like documents, where fields can vary from document to document and the structure of the data can be changed over time. Data can be accessed in a number of ways, including ad hoc queries, real-time aggregation, and indexing, and the document model maps naturally onto the objects in application code, which makes the data easier to work with. MongoDB counts Cisco, Amazon.com, Adobe, Bosch, KPMG, and Expedia amongst its users.
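
The idea of documents whose fields vary can be sketched in plain Python. This is not the MongoDB driver or its query language: the products list, its field names, and the find helper are all hypothetical, and find only mimics simple equality matching in the spirit of an ad hoc query.

```python
# Three documents in one hypothetical "products" collection.
# They share some fields (name, price) but not others (specs, format).
products = [
    {"_id": 1, "name": "laptop", "price": 1200, "specs": {"ram_gb": 16}},
    {"_id": 2, "name": "ebook", "price": 15, "format": "epub"},
    {"_id": 3, "name": "tablet", "price": 400, "specs": {"ram_gb": 8}},
]

_MISSING = object()  # sentinel so that an absent field never equals a value


def find(collection, criteria):
    """Return the documents whose fields equal every value in criteria."""
    return [
        doc
        for doc in collection
        if all(doc.get(field, _MISSING) == value for field, value in criteria.items())
    ]


# An ad hoc lookup on a field that only some documents carry.
epubs = find(products, {"format": "epub"})

# A simple aggregate over a field that every document carries.
total_price = sum(doc["price"] for doc in products)
```

No schema had to be declared before storing the ebook document with its extra format field, which is the flexibility the paragraph above describes.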

Elasticsearch

Elasticsearch is a search engine based on the Lucene library. It provides a distributed, multitenant-capable full-text search engine with an HTTP web interface and schema-free JSON documents. Elasticsearch is developed in Java and was originally released as open source under the terms of the Apache License; later releases are distributed under different licensing terms. Official clients are available in Java, .NET (C#), PHP, Python, Apache Groovy, Ruby and many other languages. According to the DB-Engines ranking, Elasticsearch is the most popular enterprise search engine, followed by Apache Solr, which is also based on Lucene.
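
As a rough illustration of the HTTP interface and JSON documents, the sketch below builds two requests with Python's standard library, one to store a document and one to run a full-text match query, without sending either. The host and port (localhost:9200), the index name articles, the helper names, and the sample document are assumptions; endpoint paths and authentication requirements can differ between versions and deployments, so treat this as a sketch rather than a reference.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:9200"  # assumption: a local node on the default port
INDEX = "articles"                  # hypothetical index name

JSON_HEADERS = {"Content-Type": "application/json"}


def index_request(doc_id, document):
    """Build a PUT request that would store one JSON document under doc_id."""
    body = json.dumps(document).encode("utf-8")
    url = f"{BASE_URL}/{INDEX}/_doc/{doc_id}"
    return Request(url, data=body, headers=JSON_HEADERS, method="PUT")


def search_request(field, text):
    """Build a POST request carrying a full-text match query on one field."""
    query = {"query": {"match": {field: text}}}
    body = json.dumps(query).encode("utf-8")
    url = f"{BASE_URL}/{INDEX}/_search"
    return Request(url, data=body, headers=JSON_HEADERS, method="POST")


# Hypothetical document: no schema is declared before it is built.
put = index_request(1, {"title": "Intro to Hadoop", "tags": ["big data"]})
search = search_request("title", "hadoop")

# Nothing is sent here. Passing either object to urllib.request.urlopen
# would send it to a running node.
```

The official client libraries wrap the same HTTP calls, so in practice one of them would usually be preferred over hand-built requests.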