What is it all about?
Apache HBase is the Hadoop database, a distributed, scalable, big data store. Use Apache HBase when you need random, real-time read/write access to your Big Data. This project's goal is the hosting of very large tables (billions of rows by millions of columns) atop clusters of commodity hardware. Apache HBase is an open-source, distributed, versioned, non-relational database modeled after Google's Bigtable, as described in "Bigtable: A Distributed Storage System for Structured Data" by Chang et al.
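To make "versioned, non-relational" more concrete, here is a minimal in-memory sketch of the Bigtable-style data model, in which a value is addressed by row key, column, and timestamp. It is a toy illustration only, not HBase's API: the class `ToyVersionedTable`, its methods, and the sample row keys and column names are all hypothetical.

```python
from collections import defaultdict


class ToyVersionedTable:
    """Toy model of a versioned table: (row, column, timestamp) -> value.

    Hypothetical illustration only; this is not the HBase client API.
    """

    def __init__(self):
        # row key -> column name -> {timestamp: value}
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row, column, value, timestamp):
        """Store one version of a cell."""
        self._rows[row][column][timestamp] = value

    def get(self, row, column, max_versions=1):
        """Return up to max_versions (timestamp, value) pairs, newest first."""
        versions = self._rows.get(row, {}).get(column, {})
        newest_first = sorted(versions.items(), key=lambda item: item[0], reverse=True)
        return newest_first[:max_versions]

    def scan(self, start_row, stop_row):
        """Return row keys in lexicographic order within [start_row, stop_row)."""
        return [row for row in sorted(self._rows) if start_row <= row < stop_row]


# Example usage with made-up data.
table = ToyVersionedTable()
table.put("user#42", "info:email", "old@example.com", timestamp=1)
table.put("user#42", "info:email", "new@example.com", timestamp=2)
table.put("user#7", "info:email", "x@example.com", timestamp=1)

latest = table.get("user#42", "info:email")
history = table.get("user#42", "info:email", max_versions=2)
in_range = table.scan("user#", "user#5")
```

Rows that never receive a value for a column simply have no entry for it, which is why such a layout can stay sparse even with a very large number of possible columns. Note that the scan compares row keys as strings, so `"user#42"` sorts before `"user#7"`.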
* Linear and modular scalability.
* Strictly consistent reads and writes.
* Automatic and configurable sharding of tables.
* Automatic failover support between RegionServers.
* Convenient base classes for backing Hadoop MapReduce jobs with Apache HBase tables.
* Easy-to-use Java API for client access.
* Block cache and Bloom Filters for real-time queries.
* Query predicate push down via server-side Filters.
* Thrift gateway and a REST-ful Web service that supports XML, Protobuf, and binary data encoding options.
* Extensible JRuby-based (JIRB) shell.
* Support for exporting metrics via the Hadoop metrics subsystem to files or Ganglia; or via JMX.
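As background on the Bloom Filters mentioned above, the following is a minimal sketch of the general technique: a compact bit array that can answer "definitely not present" or "possibly present" for a key, so a read can skip data that cannot contain the requested row. This is a generic illustration under simple assumptions, not HBase's implementation; the class `ToyBloomFilter`, its parameters, and the SHA-256-based hashing scheme are chosen here purely for demonstration.

```python
import hashlib


class ToyBloomFilter:
    """Generic Bloom filter sketch (hypothetical; not HBase's implementation)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # One byte per bit keeps the sketch simple; a real filter packs bits.
        self.bits = bytearray(num_bits)

    def _positions(self, key):
        """Derive num_hashes bit positions by salting the key with an index."""
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        """Mark every position derived from the key."""
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        """False means definitely absent; True means possibly present."""
        return all(self.bits[pos] for pos in self._positions(key))


# Example usage with made-up row keys.
bloom = ToyBloomFilter()
bloom.add("row-0001")
bloom.add("row-0002")
```

A filter never reports a false negative: once a key has been added, `might_contain` returns `True` for it. It can report false positives, so a `True` answer still requires checking the underlying data.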