Tag: Cassandra

Overview of Apache Cassandra, part 1

Overview

A very nice overview video is available at http://cassandra.apache.org. A few points were insightful, so I decided to outline them here.

Prehistory

– General-purpose DB systems (Oracle, MS SQL, etc.) solve most of the problems, but the point is that they don’t solve them all.

– The most obvious problem is scalability, which relational databases address through vertical scaling (faster CPUs, faster disks, etc.). To make things worse, the more serious RDBMS systems cost a lot of money.

– The papers on BigTable and Dynamo present a new approach to these problems. Basically, the idea is a system that can handle huge amounts of data horizontally (distributed over many networked machines).

– Google’s BigTable depends entirely on the distributed file system they already had. In addition, they developed a very sophisticated sparse-table mechanism.

– Amazon’s Dynamo is strictly a “Distributed Hash Table”. Due to their high-availability requirement, it is also a one-hop DHT, meaning any node can route a request to the owning node directly, which requires advanced hashing schemes. I would guess that they don’t use hierarchical hashing, because if they did, it wouldn’t be called “one-hop” anymore (unless by “one hop” they just mean O(1)).
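To make the “one-hop” idea concrete, here is a minimal sketch of a consistent-hashing ring where every node knows the full membership, so any key resolves to its owner with a single local computation and no multi-hop routing. The class and node names are hypothetical, not from Dynamo itself:

```python
import hashlib
from bisect import bisect_right

def ring_position(value: str) -> int:
    # Map a node name or key to a position on the hash ring.
    return int(hashlib.md5(value.encode()).hexdigest(), 16)

class OneHopRing:
    """Toy one-hop DHT: full ring membership is known locally."""

    def __init__(self, nodes):
        self.ring = sorted((ring_position(n), n) for n in nodes)
        self.positions = [p for p, _ in self.ring]

    def owner(self, key: str) -> str:
        # Walk clockwise to the first node at or after the key's position;
        # wrap around the ring if the key hashes past the last node.
        idx = bisect_right(self.positions, ring_position(key))
        return self.ring[idx % len(self.ring)][1]

ring = OneHopRing(["node-a", "node-b", "node-c"])
print(ring.owner("user:42"))  # resolved locally, in one hop
```

A hierarchical scheme would instead route through intermediate nodes, which is presumably why the “one-hop” label would no longer apply.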

– Amazon also uses BASE (i.e., eventual consistency) because their primary concern is availability, at the expense of strong consistency.

Where does Cassandra fit today?

So-called NoSQL projects, and those that were specifically designed to scale and deal with huge amounts of data:

Name            Big Data
- HBase         [V]  - more like BigTable
- MongoDB       [x]
- Riak          [V]  - more like Dynamo
- Voldemort     [V]  - more like Dynamo
- Neo4j         [x]
- Cassandra     [V]  - combines BOTH
- Hypertable    [V]  - more like BigTable
- HyperGraphDB  [x]
- Memcached     [x]
- Tokyo Cabinet [x]
- Redis         [x]
- CouchDB       [x]

CAP Theorem – states that a distributed system can provide at most two of Consistency, Availability, and Partition tolerance; since network partitions cannot be ruled out, databases in practice end up either CP or AP. The term was coined by Eric Brewer (UC Berkeley).

The BigTable approach focuses more on CP (Consistency + Partition tolerance), while Dynamo provides more AP (Availability + Partition tolerance).

Cassandra is considered AP. However, I fully agree with the author’s objection to the “pick-two” approach, which implies that a system must be either “fully consistent and totally unavailable” or the other way around. There is, rather, a gradual trade-off between the two concepts; “eventual consistency” is, in fact, an example of that trade-off.
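Cassandra makes this gradual trade-off tunable per request: with N replicas, a write acknowledged by W of them and a read that consults R of them are guaranteed to overlap on at least one up-to-date replica whenever R + W > N. A tiny sketch of that arithmetic:

```python
def overlaps(n: int, r: int, w: int) -> bool:
    # With n replicas, a read quorum of r and a write quorum of w are
    # guaranteed to intersect (strongly consistent reads) iff r + w > n.
    return r + w > n

# N = 3 replicas:
print(overlaps(3, 2, 2))  # True  - QUORUM reads + QUORUM writes
print(overlaps(3, 1, 1))  # False - fastest, but only eventually consistent
```

Dialing R and W up or down moves the system along the consistency/availability spectrum rather than forcing an all-or-nothing choice.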

Now, more about Cassandra itself:

– It is symmetric: all nodes are exactly the same and there is no centralized control, hence no single point of failure.

– It scales linearly: increasing the size of a cluster increases both storage capacity and query capacity.

– Partitioning is flexible

– Easy to grow

– Highly available (due to ideas borrowed from Dynamo)

As I guessed, P2P routing is used for communication between nodes, which means no centralized bottlenecks. It is interesting to note that the P2P communication is structured as a ring, meaning each node passes messages to its successor only. I suspect that with many nodes this might introduce delays in message propagation and, hence, some inconsistency. It is interesting how they solved those consistency issues.
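Under that successor-only assumption (a toy model of the ring, not Cassandra’s actual gossip protocol), the propagation delay is easy to quantify: an update started at one node needs N − 1 hops to reach every other node, so the worst-case staleness window grows linearly with ring size. A small simulation:

```python
def propagate(num_nodes: int, start: int = 0) -> dict:
    """Simulate successor-only forwarding around a ring.

    Returns, for each node, the hop count at which it first
    sees an update originating at `start`.
    """
    seen = {start: 0}
    node, hop = start, 0
    while len(seen) < num_nodes:
        hop += 1
        node = (node + 1) % num_nodes
        seen.setdefault(node, hop)
    return seen

# The node farthest along the ring sees the update last:
print(max(propagate(8).values()))  # 7 hops for 8 nodes
```

This is exactly why a pure ring walk would get worse as clusters grow, and why a real system would need something faster than neighbor-to-neighbor forwarding for membership and state dissemination.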

For instance, if a node receives a read request while an update is still being propagated over the network, it might serve stale data, causing some inconsistency. However, since most related requests will come from the same location (and hence go to the same node), we might think this is not a big issue. But it is an issue after all…
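One common way such stale reads get reconciled (a sketch of a last-write-wins, read-repair-style merge, with hypothetical replica data; timestamps here are plain strings assumed to sort chronologically):

```python
# Three toy replicas, each storing key -> (timestamp, value).
# One replica has already applied the update; two still hold the old value.
replicas = [
    {"user:42": ("t1", "old")},
    {"user:42": ("t2", "new")},  # this replica saw the update
    {"user:42": ("t1", "old")},
]

def read_reconcile(key: str) -> str:
    # Collect every replica's version and keep the newest timestamp
    # (last-write-wins); a real system would also push the winner
    # back to the stale replicas ("read repair").
    versions = [r[key] for r in replicas if key in r]
    return max(versions)[1]

print(read_reconcile("user:42"))  # "new"
```

So a coordinator that consults enough replicas can hide the in-flight update from clients, at the cost of extra reads.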