Leading NoSQL Databases To Consider


Interactive applications have evolved greatly over the past fifteen years, and so have the database management needs of those applications. Today, a non-relational database type known as ‘NoSQL’ is gaining traction in the enterprise as an alternative model for database management. Although the earliest implementations were largely experiments, current NoSQL databases are more mature and ready for production use. The adoption of NoSQL technology is largely being driven by three interrelated trends: cloud computing, very large user bases (millions to billions of users), and big data.

NoSQL databases have different database models compared to their RDBMS (Relational Database Management Systems) counterparts. These systems can be divided into four distinct groups.

  1. Key/value based: These databases map each key to a specific value, much like a map or dictionary. They are efficient, fast, and easy to scale.
  2. Column based: These databases group one or more key/value pairs (columns) under a specific record. They are also referred to as extensible record stores, wide-column stores, or column-oriented stores.
  3. Document based: Key/value pairs are encapsulated in JSON or JSON-like documents. The keys within each document must be unique. Unlike key/value stores, the values are not opaque to the system and can be queried.
  4. Graph based: These databases specialize in the efficient management of heavily linked data.
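
To make the four groups concrete, here is a minimal sketch that models each one with plain Python structures. It is an illustration only: the names (kv_store, column_store, documents, graph) and the sample records are made up for this example and do not correspond to any particular product's API.

```python
# Illustrative only: plain Python structures standing in for each model.

# 1. Key/value: the value is opaque to the store; lookup is by key only.
kv_store = {"session:42": b"\x00\x01opaque-bytes"}

# 2. Column based: each row key maps to its own set of column/value pairs,
#    and rows need not share the same columns.
column_store = {
    "user:1": {"name": "Ada", "email": "ada@example.com"},
    "user:2": {"name": "Lin", "last_login": "2015-03-01"},
}

# 3. Document based: values are structured and can be queried by field.
documents = [
    {"_id": 1, "name": "Ada", "tags": ["admin", "dev"]},
    {"_id": 2, "name": "Lin", "tags": ["dev"]},
]
admins = [d["name"] for d in documents if "admin" in d["tags"]]

# 4. Graph based: nodes plus explicit relationships between them.
graph = {"Ada": {"Lin"}, "Lin": {"Ada", "Sam"}, "Sam": {"Lin"}}
friends_of_ada = graph["Ada"]
```

The practical difference lies in what the store can see: a key/value store can only fetch by key, whereas the document and graph structures allow queries on fields and relationships.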

Column-Family

Apache HBase

Known for running on top of HDFS (Hadoop Distributed File System), Apache HBase is secure, scalable, distributed, and highly available. HBase can handle large data tables containing millions of columns and billions of rows while using CPU, memory, and storage resources across multiple servers within a cluster. Hadoop’s MapReduce framework is well suited to complex computational jobs or queries that are farmed out to every node.
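
The map/reduce idea can be sketched in a few lines of plain Python. This is a single-process illustration of the pattern, not Hadoop or HBase code; the function names (map_phase, shuffle, reduce_phase, run_job) and the word-count task are assumptions chosen for the example.

```python
from collections import defaultdict


def map_phase(line):
    """Emit (word, 1) pairs for one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


def run_job(partitions):
    # Each partition stands in for data held on a different node.
    emitted = [
        pair
        for part in partitions
        for line in part
        for pair in map_phase(line)
    ]
    return dict(reduce_phase(k, v) for k, v in shuffle(emitted).items())


partitions = [["big data big"], ["data stores"]]
counts = run_job(partitions)
```

In a real cluster the map step runs where each partition lives and only the grouped intermediate results travel over the network, which is why the pattern suits work that must touch every node.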

Cassandra

The Apache Cassandra project emerged out of Facebook in 2008 and has since become a mature database that is widely used for large data stores. It offers high availability, fault tolerance, and scalability on cloud infrastructure, virtual systems, or physical hardware. Cassandra’s design is a hybrid of a key/value store and a column-oriented database. With log-structured updates, column indexing, materialized and denormalized views, and built-in caching, Cassandra is well suited to large-scale organizations that need to store data too large to fit on a single server.

Hypertable

Hypertable is modeled after Google’s Bigtable. It uses block and key-prefix data compression and has a flattened table structure. Aside from the fact that data is represented in tables with columns and rows, Hypertable bears little resemblance to a traditional RDBMS. Notable features include ‘realtime’ scaling, cell versioning, namespaces, and column qualifiers. Hypertable can be used as an alternative to HBase or Accumulo.
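
As a rough illustration of why key-prefix compression pays off when keys are stored in sorted order, the sketch below encodes each key as the number of leading characters it shares with the previous key plus the remaining suffix. This is a generic version of the technique, not Hypertable's actual on-disk format; compress_keys, decompress_keys, and the sample keys are hypothetical.

```python
def compress_keys(sorted_keys):
    """Encode each key as (shared_prefix_length, remaining_suffix)."""
    encoded, previous = [], ""
    for key in sorted_keys:
        shared = 0
        limit = min(len(key), len(previous))
        while shared < limit and key[shared] == previous[shared]:
            shared += 1
        encoded.append((shared, key[shared:]))
        previous = key
    return encoded


def decompress_keys(encoded):
    """Rebuild the original keys from (shared_length, suffix) pairs."""
    keys, previous = [], ""
    for shared, suffix in encoded:
        key = previous[:shared] + suffix
        keys.append(key)
        previous = key
    return keys


keys = ["user:1000", "user:1001", "user:2000"]
encoded = compress_keys(keys)
restored = decompress_keys(encoded)
```

Because neighbouring row keys in a sorted table tend to share long prefixes, most entries shrink to a small integer and a short suffix.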

Other NoSQL databases in the column-family category include: Accumulo, Amazon SimpleDB, Clouddata, Cloudera, HPCC, Apache Flink (formerly referred to as Stratosphere), and Splice Machine.

Document Store

CouchDB

CouchDB is built specifically for the database needs of web applications; it has no pre-defined schema or data structure. Data arrives in JavaScript’s JSON format, queries are written in JavaScript, and results are returned as JSON. CouchDB supports both mobile and web applications (it can be used offline in the background of mobile apps). Using JavaScript to describe views, CouchDB aggregates, joins, and reports on database documents without affecting the underlying structure of the documents. It is ideal for data that accumulates and changes only occasionally, and on which pre-defined queries are to be run.

MongoDB

MongoDB is an open source document database written in C++. It has the features commonly associated with NoSQL: JSON-style (JavaScript) document formatting, key/value storage, and flexible replication and sharding. Data is written following an approach known as multi-version concurrency control, in which older versions of the data are kept around to help maintain consistency in complex transactions. A major advantage of MongoDB is its support for embedded arrays and documents, which reduce the need for expensive joins. On top of that, its dynamic schema makes polymorphism easy to express, and documents correspond to native data types in most programming languages.
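
The point about embedded arrays and documents can be illustrated without a database at all. The sketch below uses plain Python dictionaries, not the MongoDB driver, and the customer and order records are invented for the example; it contrasts a relational-style layout that needs a join with a document-style layout that does not.

```python
# Relational-style layout: two "tables" that must be joined at read time.
customers = {1: {"name": "Ada"}}
orders = [
    {"customer_id": 1, "item": "keyboard"},
    {"customer_id": 1, "item": "monitor"},
]
joined_items = [o["item"] for o in orders if o["customer_id"] == 1]

# Document-style layout: the orders are embedded as an array, so a single
# lookup of the customer document returns everything needed.
customer_doc = {
    "_id": 1,
    "name": "Ada",
    "orders": [{"item": "keyboard"}, {"item": "monitor"}],
}
embedded_items = [o["item"] for o in customer_doc["orders"]]
```

Both layouts yield the same items, but the embedded form reads one record instead of scanning a second collection. The trade-off is that data shared between many documents may be duplicated.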

Other NoSQL databases in the Document Store category include: Elasticsearch, Couchbase Server, RethinkDB, RavenDB, MarkLogic Server, NeDB, Terrastore, JasDB, RaptorDB, djondb, EJDB, Amisa Server, densodb, SisoDB, and ThruDB.

Graph Based

Neo4j

Unlike other NoSQL databases that store flexible bundles of values and keys, Neo4j stores the relationships between objects, a structure commonly referred to as a ‘graph’ by mathematicians. Neo4j includes several algorithms for analyzing and searching these relationships, enabling users to search efficiently based on different relationships. The use of ‘graph traversal’ algorithms eliminates the trouble of chasing pointers. Neo4j is ideally suited to interconnected, rich or complex, graph-style data.
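
To give a feel for what a graph traversal does, here is a minimal breadth-first search over an adjacency list in plain Python. It is a generic textbook algorithm, not Neo4j's implementation or query language; shortest_path and the follows data are hypothetical names used only for this sketch.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Breadth-first search over an adjacency-list graph.

    Returns the first shortest path found as a list of nodes,
    or None when the goal cannot be reached.
    """
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in graph.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None


follows = {"ada": ["lin", "sam"], "lin": ["kim"], "sam": ["kim"], "kim": []}
path = shortest_path(follows, "ada", "kim")
```

A graph database keeps relationships as first-class records, so traversals like this follow stored links directly rather than reconstructing them through joins.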

Other graph based NoSQL databases include: OrientDB, FlockDB, Infinite Graph, DEX, TITAN, InfoGrid, HyperGraphDB, GraphBase, and Trinity.

Key-Value Based

Redis

Redis is an in-memory, networked key-value data store written in ANSI C. Its key features include fast performance through in-memory storage, master-slave replication, and a dictionary data model in which keys are mapped to values. Redis also provides alpha-stage clustering on PaaS and IaaS platforms. It can also be used as a managed service, without launching a VM instance for the database.

Riak

Riak can be viewed as both a distributed database and a cloud storage solution. It is geared towards offering cloud storage at any scale in both public and private clouds, providing eventual consistency for data stored on a collection of nodes that can grow whenever demand rises. Map/reduce queries in Riak can be written in either Erlang or JavaScript. Data stored in Riak is private by default; however, data visibility can be refined further using Access Control Lists.

Other key-value based NoSQL databases include: DynamoDB, LevelDB, Aerospike, FoundationDB, Berkeley DB, Oracle NoSQL Database, GenieDB, BangDB, and Scalaris.

Conclusion

NoSQL databases are progressively becoming a key component of the database landscape, especially as more organizations realize that operating at scale is better achieved on clusters of standard, commodity servers, and that a schema-less data model is better suited to the type and variety of data captured and processed today. When used well, NoSQL databases can provide several benefits; however, enterprises should make sure they are fully aware of the legitimate issues and limitations associated with NoSQL databases before adopting them.

 

Gabriel Lando