The core of our technology is HC2, a massively scalable distributed object-store. HC2 contains BSD-licensed code from Sun’s Project Honeycomb. Project Honeycomb went through two product releases and won internal and external awards (Sun Chairman’s Award, InfoWorld 2009 Technology of the Year Award). TierraCloud has “modernized” the code to make it relevant for the next decade.

HC2 runs on a cluster of commodity servers, creating a highly reliable storage system that appears as one management entity with a flat name space. It provides HTTP APIs for programmatic access. Stored objects are immutable, which guarantees the authenticity and integrity of each object. HC2 also has a built-in storage app for background data integrity checking called Data Doctor. The object-store slices up an object, spreads the chunks across storage nodes and uses erasure coding to protect against disk or node failure. Storage protected by erasure coding is at least twice as dense as storage systems that rely on replication for data protection. The system is highly reliable; for example, two nodes out of sixteen can be lost with the system still fully functional. HC2 is completely distributed and masterless, which is the only way to get true scalability. The software automates load balancing, self-healing and other routine management tasks that previously required administrator intervention.
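The density claim can be checked with simple arithmetic. The sketch below assumes a hypothetical layout of 5 data chunks plus 2 parity chunks per object (the text does not state HC2's actual chunk counts) and compares it with three-way replication, which tolerates the same two failures. The function names are illustrative only.

```python
from fractions import Fraction


def erasure_overhead(data_chunks: int, parity_chunks: int) -> Fraction:
    """Raw bytes stored per byte of user data for a k+m erasure code."""
    return Fraction(data_chunks + parity_chunks, data_chunks)


def replication_overhead(failures_tolerated: int) -> Fraction:
    """Raw bytes stored per user byte when full copies must survive the given losses."""
    return Fraction(failures_tolerated + 1, 1)


# Assumed layout (not from the text): 5 data chunks + 2 parity chunks.
ec = erasure_overhead(5, 2)      # 7/5 of the user data is stored
rep = replication_overhead(2)    # 3 full copies are stored
density_gain = rep / ec          # 15/7, a little over 2x

print(f"erasure coding: {float(ec):.2f}x raw capacity")
print(f"replication:    {float(rep):.2f}x raw capacity")
print(f"density gain:   {float(density_gain):.2f}x")
```

Under these assumptions both schemes survive the loss of any two chunks or copies, but replication consumes more than twice the raw capacity.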
The HC2 object-store architecture has the following properties. These architectural capabilities will be rolled out in different product instantiations over time. For an exact list of product features, please review the product section.
- Massively Scalable
- HC2 is designed to scale to petabytes of capacity and beyond. It is also designed to store billions of objects.
- Fully distributed and masterless
- The design is fully distributed without any master node(s). This is the only way to get true scalability.
- Flat Name Space
- With billions of objects, the tree-like name space of a file-system is unworkable.
- Single management entity
- HC2 is designed in two levels of hierarchy. Storage nodes combine to create a cell. Multiple cells combine to create a hive. The entire hive, however, is a single management entity.
- Multi-cell architecture
- To achieve massive scalability, HC2 uses two levels of hierarchy. Individual storage nodes combine to create a cell. Multiple cells combine to create a hive. This architecture is very useful for data mobility, cloud-bursting, ILM etc. as shown below.
- Immutable
- Objects are immutable. This simplifies management and guarantees that an object cannot be changed by mistake or malice.
- Support for large objects
- Most file-systems have trouble with large objects. HC2 is designed to support terabyte-size objects. In fact, there is no artificial limit on object size, even beyond terabytes.
- Meta-data
- Multiple meta-data objects can be attached to an object. Meta-data may contain any arbitrary information. It could contain extended user meta-data storing attributes of the object. It could also contain administrative meta-data describing the life-cycle of the object e.g. “delete after 7 years”.
Extended user meta-data will also be used in the future to create malleable file-system views where the objects are available through a conventional file-system API. The only twist is that the directory tree structure can change depending on the application. One application could view x-rays with date as the main directory and hospital as the sub-directory; another could view the same data with doctor as the main directory and insurance company as the sub-directory.
- Storage Apps
- Storage apps are applications that operate on the data. HC2 is unique in that it allows 3rd party storage apps to run on the same physical hardware as the storage. Examples are data integrity checking ("Data Doctor"), data transformations, search, classification, meta-data extraction, virus scans etc.
- Programmatic interfaces
- Like all cloud storage, HC2 provides a programmatic interface, which eliminates manual management functions such as provisioning.
- Self healing
- Unlike traditional RAID systems, HC2 provides self-healing. If a disk or node fails, all the affected object chunks are recreated on the remaining nodes. HC2 is sophisticated enough to spread the chunks out so that the system is extremely reliable. An added benefit is that healing is extremely quick compared to, say, a RAID rebuild. For example, if a disk fails in a system with 16 nodes each with 4 disks, then the 63 remaining disks work in parallel to heal the failure.
- Deferred maintenance model
- A side-effect of self-healing is that disk or node failures need not be attended to right away. This can dramatically reduce management costs because you no longer need people with pagers responding at 2 a.m.
- Load balancing
- HC2 intelligently spreads the data chunks out, and rebalances data when a node is added or repaired.
- Transparent node upgrades
- A major expense in existing systems is system upgrade. Data migration has to be planned and generally turns out to be a painful task. With HC2, you simply remove an old node and put in a new one.
- Software only
- To truly implement private cloud storage, users need to decouple hardware from software so that they can procure the most suitable industry-standard x86 servers. If cost is a concern, “thick” servers with lots of disks may be chosen. If availability is a concern, “thin” servers with fewer disks may be chosen. If performance is a concern, servers with additional compute and memory or flash disks may be chosen. Moreover, this approach also lets users ride the x86 cost/performance curve. Proprietary systems are unable to track the Intel roadmap that quickly and are at least a generation or so behind the latest available server hardware.
- Open-Source development
- Open-source is critical for private cloud storage for multiple reasons. Beyond the usual open-source benefits of community development, transparency etc., open source also provides for product availability when a company fails or chooses to cancel a product built around open-source code.
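The malleable file-system views described under Meta-data above can be illustrated with a minimal sketch. It assumes each object carries a flat dictionary of user meta-data and that a view is simply an ordered list of meta-data fields to use as directory levels. The function `build_view`, the field names and the sample objects are all hypothetical and are not part of any HC2 API.

```python
def build_view(objects, levels):
    """Group object IDs into a nested directory tree keyed by meta-data fields.

    `levels` lists the meta-data fields from the top directory downward.
    """
    tree = {}
    for obj in objects:
        node = tree
        # Walk (or create) one directory per level except the last.
        for field in levels[:-1]:
            node = node.setdefault(obj["meta"][field], {})
        # The last level holds the list of object IDs.
        node.setdefault(obj["meta"][levels[-1]], []).append(obj["id"])
    return tree


# Hypothetical x-ray objects with extended user meta-data.
objects = [
    {"id": "xray-001", "meta": {"date": "2011-03-01", "hospital": "St. Mary",
                                "doctor": "Dr. Lee", "insurer": "Acme Health"}},
    {"id": "xray-002", "meta": {"date": "2011-03-01", "hospital": "County General",
                                "doctor": "Dr. Lee", "insurer": "Umbrella Mutual"}},
    {"id": "xray-003", "meta": {"date": "2011-03-02", "hospital": "St. Mary",
                                "doctor": "Dr. Patel", "insurer": "Acme Health"}},
]

# Two applications see the same immutable objects through different trees.
by_date = build_view(objects, ["date", "hospital"])
by_doctor = build_view(objects, ["doctor", "insurer"])
```

The objects themselves never move or change; only the tree derived from their meta-data differs between the two views.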
The following diagram shows the hardware that a customer needs to install to run HC2.

Hardware Required to run Private Cloud Storage
The following diagram shows the HC2 architecture.

TierraCloud HC2 Private Cloud Storage Architecture
The above architecture provides three major customer benefits:
- Automated Data Management
- Extreme Data Mobility
- Ability to run 3rd party storage apps
Automated Data Management
Automated data management is enabled by the below architectural features:
These features automate or simplify manual storage practices that have been in place since the 1980s, such as provisioning, load-balancing of data, fail-over, fail-back, upgrades, data migration, ILM, deferred hardware maintenance, capacity growth, replication etc.
Extreme Data Mobility
The two levels of hierarchy in HC2 allow for some unique data mobility applications. Nodes make up cells, and cells make up hives. Different cells in a hive can be in different locations or of different types (cloud-cell, SATA-cell, tape-cell, Flash-cell etc.).
Data-mobility applications possible are:
- Cloud-bursting (hybrid cloud)
- ILM
- Replication
- Dispersal (community cloud)
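The node, cell and hive hierarchy, together with one of the data-mobility applications (ILM), can be sketched as a small data model. This is an illustration under stated assumptions, not HC2's actual design: the class names, the `kind` labels, the capacities and the age thresholds in `cell_for_age` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    capacity_tb: float


@dataclass
class Cell:
    name: str
    kind: str  # e.g. "flash", "sata", "tape", "cloud"
    nodes: list = field(default_factory=list)

    def capacity_tb(self) -> float:
        return sum(n.capacity_tb for n in self.nodes)


@dataclass
class Hive:
    cells: list = field(default_factory=list)

    def capacity_tb(self) -> float:
        # The hive is managed as one entity, so capacity is reported as a whole.
        return sum(c.capacity_tb() for c in self.cells)

    def cell_for_age(self, age_days: int) -> Cell:
        """Hypothetical ILM rule: newer objects live on faster cells."""
        if age_days < 30:
            wanted = "flash"
        elif age_days < 365:
            wanted = "sata"
        else:
            wanted = "tape"
        return next(c for c in self.cells if c.kind == wanted)


hive = Hive(cells=[
    Cell("cell-a", "flash", [Node(f"a{i}", 10.0) for i in range(2)]),
    Cell("cell-b", "sata", [Node(f"b{i}", 48.0) for i in range(4)]),
    Cell("cell-c", "tape", [Node("c0", 500.0)]),
])

print(hive.capacity_tb())            # total across all cells
print(hive.cell_for_age(400).name)   # cell chosen for an object older than a year
```

An ILM policy in this model only decides which cell should hold an object; moving the object between cells stays inside the same hive and therefore inside the same management entity.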
Ability to run 3rd party storage apps
Apps have now become commonplace. Social networking sites, smart phones, televisions and set-top boxes all have apps. The basic idea is that a community of developers can always develop more functionality than one company, no matter how big that company is. A storage platform is no different than these other platforms. Storage apps are light-weight applications that act directly on the data. One company can create a set of vertically integrated storage apps, some good, some not so good. A community, on the other hand, can create a lot more, and customers can choose best-of-breed storage apps. Furthermore, storage apps promise to dramatically cut down server-to-storage bandwidth for routine applications. Instead of moving petabytes of data back and forth, a light storage application can move to the storage platform instead.
HC2 comes with built-in storage apps and also allows for 3rd party storage apps. The first built-in application, called "Data Doctor", performs long-term data integrity checking. There are three different frameworks for running apps. The first is server virtualization, where a storage app runs in a separate VM from HC2 but on the same physical hardware. The second uses a Hadoop map mechanism. The third uses a Java framework.
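The text does not describe how Data Doctor works internally. As an illustration of background integrity checking in general, the sketch below assumes a digest is recorded for each chunk at write time and that a periodic scan recomputes it to detect silent corruption. The functions `store` and `scan` and the record layout are hypothetical, not HC2's implementation.

```python
import hashlib


def digest(chunk: bytes) -> str:
    """SHA-256 digest of a chunk, as a hex string."""
    return hashlib.sha256(chunk).hexdigest()


def store(chunks):
    """Record each chunk together with the digest computed at write time."""
    return [{"data": c, "sha256": digest(c)} for c in chunks]


def scan(stored):
    """Return the indices of chunks whose contents no longer match their digest."""
    return [i for i, rec in enumerate(stored)
            if digest(rec["data"]) != rec["sha256"]]


stored = store([b"chunk-0", b"chunk-1", b"chunk-2"])
assert scan(stored) == []          # a fresh store verifies cleanly

stored[1]["data"] = b"chunk-X"     # simulate silent corruption of one chunk
damaged = scan(stored)             # indices that would need to be healed
print(damaged)
```

Because a check like this runs next to the data, the scan reads chunks locally and only the list of damaged chunks needs to leave the storage node.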