Cloud Technologies: ZooKeeper

ZooKeeper coordinates the processes that run across the machines of a large cluster. It is one of the cloud-based technologies gaining notable acceptance; its biggest public adopters include Rackspace, Yahoo and Zynga.

Imagine maintaining configuration information that every machine in a 1,000-node cluster must know about and react to whenever it changes. ZooKeeper is designed to solve exactly this type of problem.
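To make the configuration scenario concrete, here is a minimal sketch in Python. It is a toy, in-memory model and not the ZooKeeper client API: `ToyConfigStore`, `Worker`, and the path `/app/db_url` are hypothetical names invented for illustration. It assumes the classic ZooKeeper behaviour where a watch fires once and must be re-registered on the next read.

```python
from typing import Callable, Dict, List, Optional


class ToyConfigStore:
    """In-memory stand-in for a coordination service (NOT the real ZooKeeper API)."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}
        self._watches: Dict[str, List[Callable[[str], None]]] = {}

    def set(self, path: str, value: str) -> None:
        self._data[path] = value
        # One-shot semantics: drop the registered watches before firing them.
        callbacks = self._watches.pop(path, [])
        for callback in callbacks:
            callback(path)

    def get(self, path: str, watch: Optional[Callable[[str], None]] = None) -> str:
        if watch is not None:
            self._watches.setdefault(path, []).append(watch)
        return self._data[path]


class Worker:
    """One machine in the cluster that keeps a local copy of a config value."""

    def __init__(self, store: ToyConfigStore, path: str) -> None:
        self.store = store
        self.path = path
        self.value: Optional[str] = None
        self.refresh()

    def refresh(self, _path: Optional[str] = None) -> None:
        # Re-read the value and re-register the watch in the same call.
        self.value = self.store.get(self.path, watch=self.refresh)
```

Each worker reads the value once and is then notified on every change, so nothing in the cluster has to poll. A real deployment would replace the toy store with a ZooKeeper client talking to a server ensemble.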

Teams building applications on cloud infrastructure often focus on functionality and business requirements and shortchange the small coordination details that large processes involve. For example, imagine a hotel company with a large cluster that processes business intelligence information about its customers and their hotel stays. Within this BI application, perhaps a Booking Demographics Report process depends on a sub-process that runs a statistical algorithm on customer and booking information. Perhaps that sub-process runs close to real time, so it is constantly processing new data. The Booking Demographics Report would then have to wait, or block, on the sub-process. If each of these processes involved the coordination of multiple machines, ZooKeeper would maintain that organization and make it possible to integrate a hierarchy of processes seamlessly.
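The "wait or block on the sub-process" idea can be sketched locally with threads. This is only an illustration of the blocking pattern, assuming a single machine: `CompletionMarker`, `run_statistics`, and `run_report` are hypothetical names, and the marker stands in for a node in ZooKeeper whose existence would signal that the sub-process has finished.

```python
import threading
from typing import List


class CompletionMarker:
    """Local stand-in for a node whose existence means 'sub-process finished'."""

    def __init__(self) -> None:
        self._created = threading.Event()

    def create(self) -> None:
        self._created.set()

    def wait_until_exists(self, timeout: float) -> bool:
        # Returns True once the marker exists, False if the timeout expires.
        return self._created.wait(timeout)


def run_statistics(marker: CompletionMarker, log: List[str]) -> None:
    # Hypothetical sub-process: do the work, then publish the marker.
    log.append("statistics ready")
    marker.create()


def run_report(marker: CompletionMarker, log: List[str], timeout: float = 5.0) -> None:
    # Hypothetical report process: block until the sub-process has finished.
    if not marker.wait_until_exists(timeout):
        raise TimeoutError("statistics sub-process did not finish in time")
    log.append("report built")
```

Starting `run_report` before `run_statistics` still produces the report last, because the report blocks on the marker. Across many machines, the marker would live in the coordination service instead of in process memory.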

Other uses of ZooKeeper include locks, barriers, leader election, configuration management, queues and more.
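As one example of these uses, leader election is commonly built on sequential nodes: each candidate creates a node under a shared parent, ZooKeeper appends a zero-padded, 10-digit counter to the name, and the candidate with the lowest number leads. The sketch below shows only the selection logic as pure Python functions. The `n_` prefix and the function names are assumptions for illustration, and no ZooKeeper server is contacted.

```python
from typing import List, Optional


def sequence_of(name: str) -> int:
    # Sequential node names end in a 10-digit, zero-padded counter.
    return int(name[-10:])


def elect(children: List[str]) -> str:
    """Return the node with the lowest sequence number, i.e. the leader."""
    return min(children, key=sequence_of)


def node_to_watch(children: List[str], me: str) -> Optional[str]:
    """Return the node just ahead of `me`, or None if `me` is the leader.

    Watching only the predecessor avoids waking every candidate
    when the current leader goes away.
    """
    ordered = sorted(children, key=sequence_of)
    index = ordered.index(me)
    return None if index == 0 else ordered[index - 1]
```

For instance, with children `n_0000000007`, `n_0000000003` and `n_0000000005`, the node `n_0000000003` leads; if it disappears, `n_0000000005` takes over.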

Cloudera has a good overview article on ZooKeeper, and DZone has a good primer with an overview diagram and a Getting Started section.
