What is Apache Cassandra?


Key features, operations per second, history, and architecture of Apache Cassandra, a fault-tolerant, massively scalable NoSQL database.

1. What is Apache Cassandra in 50 Words

“Apache Cassandra is an open-source, distributed, decentralized, elastically scalable, highly available, fault-tolerant, tuneably consistent, row-oriented database that bases its distribution design on Amazon’s Dynamo and its data model on Google’s Bigtable. Created at Facebook, it is now used at some of the most popular sites on the Web.”

I took the 50 words above from Cassandra: The Definitive Guide, 2nd Edition (O’Reilly Media). I highly recommend reading this book if you really want to study Apache Cassandra in depth.

If you want to learn Apache Cassandra quickly, in a few days, you can go for Apache Cassandra Essential. I’m the lead reviewer of this book, and I know that once you pick it up you will enjoy learning Cassandra very quickly.

2. Key Features of Cassandra

To understand what Cassandra is, you must know its features. Cassandra is feature-rich and combines the best of Amazon’s Dynamo and Google’s Bigtable.

  1. High Availability
  2. NO SPOF (Single Point of Failure)
  3. Scale Horizontally (Linear Scalability / Scale Out)
  4. Peer-to-peer Architecture (no primary/secondary)
  5. Eventual Consistency
  6. Tunable trade-off between consistency and latency
  7. Minimum Administration
Fig: Apache Cassandra High Level Features
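
The tunable trade-off in item 6 is often explained with a rule of thumb from Dynamo-style systems: with a replication factor RF, a read that contacts R replicas is guaranteed to overlap a write acknowledged by W replicas whenever R + W > RF. The Python sketch below only illustrates that arithmetic. The function names are hypothetical, and nothing here talks to a real cluster.

```python
def quorum(replication_factor: int) -> int:
    """Smallest majority of replicas, e.g. 2 of 3."""
    return replication_factor // 2 + 1


def reads_see_latest_write(read_replicas: int, write_replicas: int,
                           replication_factor: int) -> bool:
    """True when every read set must overlap every write set (R + W > RF)."""
    return read_replicas + write_replicas > replication_factor


rf = 3
# Write to one replica, read from one: lowest latency, eventual consistency.
print(reads_see_latest_write(1, 1, rf))                    # False
# Quorum writes and quorum reads: more replicas to wait for, overlapping sets.
print(reads_see_latest_write(quorum(rf), quorum(rf), rf))  # True
```

Contacting fewer replicas lowers latency but allows stale reads for a while, and contacting more replicas does the opposite. That is the trade-off being tuned.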

3. Cassandra Operations per Sec

This graph shows how many read and write operations per second Cassandra can handle.

Fig: Apache Cassandra ops/sec

4. Who Is Using Cassandra?

Many companies worldwide use Apache Cassandra. A few of them are Netflix, Twitter, Cisco, Rackspace, Constant Contact, and Reddit. The largest known Cassandra cluster holds over 300 TB of data on over 400 machines.

5. Architecture of Cassandra

Apache Cassandra is a well-known wide-column database in the NoSQL space. You can find and learn more about its architecture here. To understand what Apache Cassandra is, you must understand its architecture, so this section walks through its main architectural properties.

5.1 Shared Nothing Architecture

Cassandra has a shared-nothing architecture: it has no central controller and no notion of master/slave. All of its nodes are the same, which makes it a peer-to-peer architecture.
Shared-nothing architecture was more recently popularized by Google, which has written systems such as its Bigtable database and its MapReduce implementation that do not share state and are therefore capable of near-infinite scaling.

For more details, see: http://db.cs.berkeley.edu/papers/hpts85-nothing.pdf

Cassandra is distributed, which means that it is capable of running on multiple machines.
The fact that Cassandra is decentralized means that there is no single point of failure. All of the nodes in a Cassandra cluster function exactly the same. Because they are all doing the same thing, this is sometimes referred to as “server symmetry.”

In short, because Cassandra is distributed and decentralized, there is NO SPOF (single point of failure), which supports high availability.
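
To make the "no single point of failure" idea concrete, here is a minimal sketch of Dynamo-style replica placement on a hash ring. It is a simplification under stated assumptions: the node names, the `token` and `replicas_for` helpers, and the use of MD5 are all illustrative choices, and Cassandra's real partitioner and replication strategies are more elaborate.

```python
import hashlib
from bisect import bisect_right


def token(value: str) -> int:
    """Hash a string to a position on the ring (MD5 is an illustrative choice)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


def replicas_for(key: str, nodes: list[str], replication_factor: int) -> list[str]:
    """Walk clockwise from the key's token and pick the next distinct nodes."""
    ring = sorted((token(n), n) for n in nodes)
    tokens = [t for t, _ in ring]
    start = bisect_right(tokens, token(key)) % len(ring)
    count = min(replication_factor, len(ring))
    return [ring[(start + i) % len(ring)][1] for i in range(count)]


nodes = ["node-a", "node-b", "node-c", "node-d"]  # hypothetical peers
owners = replicas_for("user:42", nodes, replication_factor=3)
print(owners)

# Whichever single node fails, at least two replicas of the key remain.
for failed in nodes:
    survivors = [n for n in owners if n != failed]
    assert len(survivors) >= 2
```

Every node runs the same placement logic, so no node is special: any peer can work out where a key lives, and losing one peer leaves other copies reachable.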

5.2 Elastic Scalability

Scalability is an architectural feature of a system that lets it keep serving a growing number of requests with little degradation in performance. Vertical scaling (simply adding more hardware capacity and memory to your existing machine) is the easiest way to achieve this. Horizontal scaling means adding more machines.
Elastic scalability is a special property of horizontal scalability: the cluster can seamlessly scale up and scale back down as nodes are added or removed.
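
One reason a ring-based design can scale out smoothly is that adding a machine only relocates the keys that the new machine takes over. The sketch below demonstrates that property with a toy consistent-hashing ring. The helper names, node names, and MD5 hashing are assumptions made for illustration and are not Cassandra's actual implementation.

```python
import hashlib
from bisect import bisect_right


def token(value: str) -> int:
    """Map a string to a ring position (illustrative hash choice)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


def owner(key: str, nodes: list[str]) -> str:
    """First node clockwise from the key's position on the ring."""
    ring = sorted((token(n), n) for n in nodes)
    tokens = [t for t, _ in ring]
    return ring[bisect_right(tokens, token(key)) % len(ring)][1]


keys = [f"row-{i}" for i in range(10_000)]
before = ["node-a", "node-b", "node-c", "node-d"]
after = before + ["node-e"]  # scale out by one machine

moved = [k for k in keys if owner(k, before) != owner(k, after)]

# Every key that changes owner moves to the new node; the rest stay put.
assert all(owner(k, after) == "node-e" for k in moved)
print(f"{len(moved)} of {len(keys)} keys moved to node-e")
```

With a single token per node the share taken by the new node can be quite uneven. Real clusters typically give each node many positions on the ring to even this out.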

5.3 High Availability in Cassandra

The availability of a system is measured by its ability to fulfill requests. It is commonly expressed as a number of nines; for example, four nines means 99.99% availability. See Understanding of High Availability for more details about availability.
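
As a quick worked example of what the nines mean in practice, the following sketch converts a number of nines into a yearly downtime budget. It assumes a 365-day year, and the function name is only illustrative.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600, ignoring leap years


def allowed_downtime_minutes(nines: int) -> float:
    """Yearly downtime budget for an availability of N nines (2 -> 99%)."""
    unavailability = 10 ** -nines
    return MINUTES_PER_YEAR * unavailability


for n in range(2, 6):
    availability = 100 * (1 - 10 ** -n)
    print(f"{n} nines ({availability:.3f}%): "
          f"{allowed_downtime_minutes(n):.2f} minutes/year")
```

Under these assumptions, four nines leaves a budget of roughly 52.56 minutes of downtime per year.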

5.4 Fault Tolerance in Cassandra

First of all, what is fault tolerance? It means the database remains operational even if a failure physically divides the network, splitting the database cluster into two groups of nodes. If you want to dive deeper into fault tolerance, read about the CAP theorem, in which P stands for partition tolerance.

6. History of Apache Cassandra

Cassandra was born from the marriage of Google’s Bigtable and Amazon’s Dynamo papers. Its development started at Facebook in 2006, in the Java language. The history of Cassandra helps you understand what Apache Cassandra is.

Fig: Apache Cassandra Born from Bigtable and Dynamo

7. References

  1. Apache Cassandra
  2. Apache Cassandra Essential

I hope you enjoyed this post on what Apache Cassandra is: the 50-word definition, key features, operations per second, architecture, and history of Cassandra. You can visit the Apache Cassandra tutorial for more blog posts.
Your comments and suggestions to improve this post are welcome. Cheers 🙂
