Nematolah's shared items

Tuesday, July 20, 2010

Cassandra: Predicting the Future of NoSQL



When Twitter announced a few weeks ago that it would not be using Cassandra for tweet storage, there was a flurry of 'I told you so's' from NoSQL skeptics. The folklorist in me found that rather amusing, as Cassandra in Greek mythology was cursed with the ability to see the future. But poor Cassandra could convince no one to believe her predictions, including a rather grim one about a Trojan Horse. The tech blogger in me figured, however, that I should probably have a better grasp of Cassandra and NoSQL than just my knowledge of Homer.

SQL RDBMS BBQ

SQL and relational databases have long been the standard solution for data storage and retrieval. But the web applications being built today don't necessarily fit into this older schema. There are new demands on databases, not simply in terms of scalability, but also in terms of availability and unpredictability. In response, a number of new databases have been developed, loosely categorized as NoSQL.

Although the name sounds like a repudiation of SQL, it doesn't mean 'no SQL never ever.' It means 'not only SQL,' and offers a far more flexible and targeted response to database management.

NoSQL OMG


In a great summary of the NoSQL movement on Heroku's blog, Adam Wiggins gives the following examples of NoSQL usage:


  • Frequently-written, rarely read statistical data (for example, a web hit counter) should use an in-memory key/value store like Redis, or an update-in-place document store like MongoDB.
  • Big Data (like weather stats or business analytics) will work best in a freeform, distributed db system like Hadoop.
  • Binary assets (such as MP3s and PDFs) find a good home in a datastore that can serve directly to the user's browser, like Amazon S3.
  • Transient data (like web sessions, locks, or short-term stats) should be kept in a transient datastore like Memcache.
  • If you need to be able to replicate your data set to multiple locations (such as syncing a music database between a web app and a mobile device), you'll want the replication features of CouchDB.
  • High availability apps, where minimizing downtime is critical, will find great utility in the automatically clustered, redundant setup of datastores like Cassandra and Riak.
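To make the first and fourth items in that list a little more concrete, here is a minimal Python sketch of the two access patterns: a counter that is updated in place on every hit, and a store whose entries expire on their own. The `HitCounter` and `TransientStore` classes are hypothetical, in-process stand-ins written for illustration; they are not the Redis or Memcache client APIs, and a real deployment would talk to those servers over the network.

```python
import time


class HitCounter:
    """Toy in-memory key/value counter (a stand-in, not a Redis client)."""

    def __init__(self):
        self._counts = {}

    def incr(self, key, amount=1):
        # Update in place: each hit is a single write to one key.
        self._counts[key] = self._counts.get(key, 0) + amount
        return self._counts[key]

    def get(self, key):
        # Reads are rare; unknown keys simply count as zero.
        return self._counts.get(key, 0)


class TransientStore:
    """Toy store whose entries expire after `ttl` seconds (not Memcache)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._items = {}     # key -> (value, expires_at)

    def set(self, key, value, ttl):
        self._items[key] = (value, self._clock() + ttl)

    def get(self, key, default=None):
        entry = self._items.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Expired entries are dropped lazily, on the next read.
            del self._items[key]
            return default
        return value
```

Neither class persists anything to disk, which is the point of the sketch: data like hit counts and sessions can tolerate being lost, so it does not need the guarantees of a relational database.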

Oh, Grow Up


    In a recent article on NoSQL in SD Times, Forrester analyst Mike Gualtieri says that NoSQL is 'not a substitute for a database; it can augment a database. For transaction types of processing, you still need a database. You need integrity for those transactions. For storing other data, we don't need that consistency. NoSQL is a great way to store all that extra data.' This cautious sort of approach - use NoSQL for 'extra data' but use SQL for the real stuff - is pretty common.


    It does allow people to take small steps towards NoSQL implementation. 'Don't migrate your existing production data,' suggests Wiggins, 'instead, use one of these new datastores as a supplementary tool.'
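As a rough illustration of that "supplementary tool" approach, the Python sketch below keeps transactional records in a relational database and pushes the "extra data" into a separate, looser store. Everything here is an assumption made for the example: the `accounts` table, the `transfer` and `balance` helpers, and the use of SQLite with a plain `Counter` standing in for a NoSQL datastore. It is not taken from Wiggins's or Gualtieri's own material.

```python
import sqlite3
from collections import Counter

# Primary store: a relational database keeps the records that need integrity.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE accounts ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
db.executemany(
    "INSERT INTO accounts (id, balance) VALUES (?, ?)", [(1, 100), (2, 50)]
)
db.commit()

# Supplementary store: a hypothetical stand-in for a NoSQL datastore that
# holds "extra data" which does not need transactional consistency.
page_views = Counter()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst atomically; return True on success."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            # If this debit violates the CHECK constraint, the credit
            # above is rolled back together with it.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balance(conn, account_id):
    row = conn.execute(
        "SELECT balance FROM accounts WHERE id = ?", (account_id,)
    ).fetchone()
    return row[0]
```

A failed transfer leaves both balances untouched, while something like `page_views["/home"] += 1` can be recorded without any such guarantee. The existing production data stays where it is; only the new, low-stakes data goes to the new store.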


    This hesitation is understandable; the legacy of relational databases is substantial. In an interview with ReadWriteWeb, Nati Shalom, CTO of GigaSpaces, spoke of the history of databases, with the financial sector being among the first to hit a wall, so to speak, with scalability. The rise of social networking and the read/write web, alongside cloud technologies, has vastly reshaped our needs for and demands on database architecture as well as information retrieval.


    Shalom argues that the technology behind NoSQL is sound and will provide the solutions for addressing some of these issues. Nevertheless, he says, NoSQL still requires two things: better implementation and more maturity.

    What the future holds for NoSQL and for database management remains to be seen. There's a Cassandra joke to be made there, I'm sure.
