Monthly Archives: November 2011

The current CouchDB ecosystem

Over the last few weeks I have started to familiarise myself with the current state of the CouchDB ecosystem. I hope this will be the first of several posts detailing what I have learned about CouchDB so far, and what we learn later once we are finally able to put this database into production.

With all the different companies, similar-sounding project names and technology buzzwords surrounding CouchDB, it can be difficult to sift through the jargon and come away with a firm grasp of exactly who and what is behind CouchDB.

Let’s start off by naming some of the players and giving an overview of what each provides:

1) Apache CouchDB – is an open source, document-oriented database. It is part of the new breed of databases commonly referred to as NoSQL datastores. CouchDB stores documents as JSON, and you interact with the database through a RESTful JSON API.

2) Couchbase – is a company that provides software and enterprise support for several projects that are based on the CouchDB source code. Let’s take a look at a few of these projects in more detail:

  • Membase Server (Couchbase) – is an elastic, distributed, key-value database management system optimized for storing data for web applications. For anyone already familiar with memcached, Membase is basically memcached on steroids: it allows you to build a distributed cluster of memcached instances, and it provides options for both persistent and non-persistent storage. Another nice feature provided by Membase is a web GUI that displays all sorts of useful statistics that can help you understand exactly what is going on with your servers.
  • Couchbase Single Server – is the software package that you would download if you were looking for a replacement for (or an equivalent to) CouchDB. This is basically the stock Apache CouchDB source code, with the GeoCouch extension enabled, as well as some additional patches provided by the developers that work for Couchbase.
  • Couchbase Server 2.0 – represents the future of both the Couchbase and Membase codebases. Server 2.0 basically removes the SQLite backend that is currently used by Membase and replaces it with CouchDB. At this point Server 2.0 has been released with ‘Developer Preview’ status, so I do not believe it is quite ready for production use. Currently Couchbase Server 2.0 only allows you to access the data stored in the backend CouchDB database using the memcached protocol (you have to go through memcached to reach the data stored in CouchDB). Future versions (3.0, etc.) promise access via both the memcached protocol and the CouchDB RESTful JSON API, but this is currently not the case, and will most likely not be available for some time.
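To make "going through memcached" a little more concrete, here is a minimal sketch of how a store and a fetch are framed in the classic memcached text protocol. It assumes the server accepts that standard text protocol; the key name and JSON value are made up for illustration, nothing here is specific to Couchbase, and in practice you would use a client library rather than build the bytes by hand.

```python
# Sketch of the memcached text protocol framing (no network access).
# Assumption: the server speaks the standard memcached text protocol.


def encode_set(key, value, flags=0, exptime=0):
    """Frame a 'set' command: header line, data block, trailing CRLF."""
    payload = value.encode("utf-8")
    header = "set {} {} {} {}\r\n".format(key, flags, exptime, len(payload))
    return header.encode("ascii") + payload + b"\r\n"


def encode_get(key):
    """Frame a 'get' command for a single key."""
    return "get {}\r\n".format(key).encode("ascii")


def parse_get_response(raw):
    """Extract the value from a single-key 'get' reply, or None on a miss."""
    if raw == b"END\r\n":
        return None
    header, _, rest = raw.partition(b"\r\n")
    parts = header.split()
    if len(parts) < 4 or parts[0] != b"VALUE":
        raise ValueError("unexpected reply: {!r}".format(header))
    size = int(parts[3])  # byte length of the data block that follows
    return rest[:size].decode("utf-8")


# Hypothetical key and document, chosen only for this example.
set_command = encode_set("user:1", '{"name": "Ann"}')
get_command = encode_get("user:1")
```

The point of the sketch is that the application only ever sees opaque keys and values; whatever storage engine sits behind the memcached interface is invisible at this level.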

3) BigCouch – an open source version of CouchDB, written in Erlang, that allows scaling beyond a master/slave architecture via database sharding. From the application's perspective, a BigCouch deployment looks like a single large CouchDB instance.

4) Couchdb-lounge – an open source project which uses Nginx and Python to provide a proxy-based framework for scaling CouchDB beyond a master/slave architecture.

5) Cloudant – an enterprise software company which provides CouchDB hosting and enterprise support, and is the company behind BigCouch.
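Since nearly everything in the list above either exposes or sits in front of CouchDB's RESTful JSON API, a short sketch of what that API looks like from a client may help. It assumes a CouchDB instance on its default port 5984 on localhost; the database name `blog` and the document ID `first-post` are invented for illustration. The requests are only built here, not sent, so the snippet runs without a server.

```python
import json
import urllib.request

# Assumption: a local CouchDB listening on its default port.
BASE_URL = "http://localhost:5984"


def build_request(method, path, body=None):
    """Build (but do not send) an HTTP request for the CouchDB JSON API."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    request = urllib.request.Request(BASE_URL + path, data=data, method=method)
    request.add_header("Accept", "application/json")
    if data is not None:
        request.add_header("Content-Type", "application/json")
    return request


def send(request):
    """Send a prepared request and decode the JSON reply (needs a live server)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Hypothetical database and document names.
create_db = build_request("PUT", "/blog")  # create a database
save_doc = build_request(
    "PUT", "/blog/first-post", {"title": "Hello", "tags": ["couchdb"]}
)  # store a JSON document under a chosen ID
read_doc = build_request("GET", "/blog/first-post")  # fetch it back
```

With a server running, calling `send(create_db)`, then `send(save_doc)`, then `send(read_doc)` would exercise the three requests in order. Databases and documents are plain URLs, and every request and response body is JSON.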