Big Data’s first IPO

April 23, 2012

Big Data’s first IPO has hit NASDAQ, and I’m sure there will be plenty more to come. Why is this a big deal? Big Data, defined as crunching numbers at the gigabyte, terabyte and petabyte scale, is what is taking business […]

Overview of Hive for Hadoop

March 23, 2012

Hive is a data warehousing system that sits on top of Hadoop and lets users who are not fluent in MapReduce query data. Hive was originally developed by Facebook and now enjoys support from many companies after Facebook donated the software […]

Cloud Technologies: ZooKeeper

March 9, 2012

ZooKeeper coordinates processes across the machines of a large cluster. It is one of the cloud-based technologies gaining notable acceptance. The biggest public adopters include Rackspace, Yahoo and Zynga. Imagine maintaining configuration information that each machine in a […]

ETL for Hadoop: Sqoop

October 6, 2011

Enter Sqoop, the ETL (Extract, Transform, Load) tool for Hadoop. Hadoop runs on data. The bulk of it might sit in flat files, but it must include data from across a business’s entire platform. In a classic data warehouse, ETL tools are […]

Oracle’s new Hadoop product??

October 5, 2011

Oracle, king of normalized, structured data, has announced its entry into the real Big Data field. Oracle’s no fan of open source projects and has “sought to expose their limitations and sow some serious doubt over their open-source roots” of […]

Improving Hadoop documentation and configuration

August 10, 2011

Ari Rabkin, an intern with Cloudera and a PhD student at UC Berkeley, came up with a tool to improve documentation of configuration options in Hadoop. Many open source projects struggle with documentation. With limited time available, open source developers […]
