Our initial product, HC2 Preservation Edition (HC2 PE), targets private cloud storage customers who want to preserve large amounts of data. HC2 PE is by far the industry’s best storage solution for preservation archival, especially long-term preservation.

Organizations such as digital libraries, life science institutions, photo sites, film studios, Hadoop users, governments, law firms and mapping organizations are generating more digital data every year. In fact, according to SNIA, 68% of customers surveyed indicated they need 100-year archives. These organizations need to preserve a tsunami of new data. Preservation archival differs from compliance archival, a commonplace practice in enterprises, in its objective (access, not compliance), amount of data (much larger data sets) and longevity of data (much longer periods).
Preservation requires the ability to store lots of data cheaply and reliably. Specifically:
- Reduction in total cost of ownership: With storage growing at an exponential pace, the current cost structures are untenable
- Data guarantees: The storage system must guarantee availability, immutability and data integrity over time
- Scalability: The system must scale to multiple petabytes and thousands of storage nodes
HC2 PE meets these requirements. Each is addressed in more detail below.
Reduction in total cost of ownership
HC2 PE has a rich set of features that reduce cost in every phase of the storage system’s life cycle. See figure below.
- TierraCloud PE can be evaluated on Amazon EC2, simplifying the process and cutting cost
- TierraCloud PE training can also be obtained on Amazon EC2, cutting cost
- Industry-standard (x86) servers can be used, dramatically cutting cost
- Erasure coding provides reliability with extremely high density (nearly twice that of replication)
- Practices in use since the mid-1980s in block and file storage are eliminated or simplified by automated data management, e.g. provisioning, load balancing, failover, migration and capacity growth
- Combination of meta-data and third-party storage apps (e.g. data integrity checking "Data Doctor", transformations, search, classification) can prevent your storage from becoming a digital dumpster
- Self-healing allows for deferred maintenance, thus dramatically cutting costs
- Self-healing allows for a transparent upgrade process: simply remove an aged node and put in a new one!
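
The density claim for erasure coding can be checked with a little arithmetic. The sketch below is illustrative only. It assumes a layout of 14 data fragments plus 2 parity fragments spread over 16 nodes, which this page does not state; the page only says that data remains available with 2 of 16 nodes lost. The function names are hypothetical and are not part of any HC2 API.

```python
from fractions import Fraction


def usable_fraction_erasure(data_fragments: int, parity_fragments: int) -> Fraction:
    """Share of raw capacity that holds user data under a k+m erasure code."""
    return Fraction(data_fragments, data_fragments + parity_fragments)


def usable_fraction_replication(copies: int) -> Fraction:
    """Share of raw capacity that holds user data when every object is stored `copies` times."""
    return Fraction(1, copies)


# ASSUMPTION: 14 data + 2 parity fragments (tolerates any 2 lost fragments).
ec = usable_fraction_erasure(14, 2)
two_copies = usable_fraction_replication(2)    # tolerates 1 lost copy
three_copies = usable_fraction_replication(3)  # tolerates 2 lost copies

density_gain_vs_two_copies = ec / two_copies
density_gain_vs_three_copies = ec / three_copies

print(f"erasure coding usable share: {float(ec):.3f}")
print(f"density gain vs 2 copies:    {float(density_gain_vs_two_copies):.2f}x")
print(f"density gain vs 3 copies:    {float(density_gain_vs_three_copies):.2f}x")
```

Under this assumed layout, the gain over two-copy replication is 1.75x, in line with "nearly twice". Compared with three-copy replication, which tolerates the same number of failures, the gain is larger.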
Data Guarantees
HC2 PE provides the data guarantees required by this use-case as per the table below.
- Erasure Coding & Self Healing
  - Provides reliability, e.g. data stays available with 2 nodes lost out of 16
  - Very high mean-time-to-data-loss (MTTDL)
  - Seamless migration from old servers to new ensures data longevity
- Open Source
  - Companies and projects come and go; open source ensures continued code development. TierraCloud is an example
- Fixed content with integrity checks
  - Architecture ensures immutable data
  - Hashes on data chunks combined with background integrity checks (Data Doctor) ensure data integrity
- Meta-data
  - Meta-data can be used to make the data self-describing, ensuring that the data is still usable over time
  - Meta-data may also be used to store digital signatures for the object
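
The idea of hashing data chunks and re-checking them in the background can be sketched as follows. This is a minimal illustration, not the Data Doctor implementation. The chunk size, the choice of SHA-256 and the function names are all assumptions made for the example.

```python
import hashlib
from itertools import zip_longest

CHUNK_SIZE = 4  # ASSUMPTION: tiny chunk size so the demo is easy to follow


def chunk_digests(data: bytes, chunk_size: int = CHUNK_SIZE) -> list:
    """Return one SHA-256 hex digest per fixed-size chunk of `data`."""
    return [
        hashlib.sha256(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]


def find_corrupt_chunks(data: bytes, expected: list, chunk_size: int = CHUNK_SIZE) -> list:
    """Re-hash `data` and return the indexes of chunks whose digest no longer matches.

    A missing or extra chunk also counts as a mismatch.
    """
    actual = chunk_digests(data, chunk_size)
    return [
        index
        for index, (got, want) in enumerate(zip_longest(actual, expected))
        if got != want
    ]


# Digests are recorded once, when the immutable object is stored.
original = b"preserve this object"
stored_digests = chunk_digests(original)

# Simulate silent corruption: flip one bit in byte 5 (which lies in chunk 1).
damaged = bytearray(original)
damaged[5] ^= 0x01

clean_report = find_corrupt_chunks(original, stored_digests)
damaged_report = find_corrupt_chunks(bytes(damaged), stored_digests)
print(clean_report, damaged_report)
```

Because the content is fixed, any digest mismatch signals corruption rather than a legitimate update. A real system would then presumably repair the damaged chunk from redundant fragments.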
Scalability
The underlying technology was purpose-built for scalability. Its architectural foundations, a fully distributed design with no single master and a multi-cell architecture, allow the system to scale to petabytes and on to exabytes.
Detailed Specifications for HC2 PE, Release Name “Replevin”
- Minimum Hardware Requirements
  - 8 or 16 servers
  - Single socket, single core Intel or AMD Opteron Processor
  - Processor frequency 2.2 GHz
  - 1MB level 2 cache
  - 3GB memory
  - 4 x 500GB SATA disks (during evaluation this may be reduced to 2 disks)
  - Two Gigabit or 10GE network switches (during evaluation this may be reduced to 1 network switch)
- Features
  - Object store & retrieve
  - Full featured: load balancing, self-healing, hardware migration etc. supported
  - Meta-data store & retrieve; query not supported
- Storage Apps
  - Background data integrity checker “Data Doctor” included
  - Framework for additional 3rd party VM based storage apps
- Plug-ins/ APIs
  - Native HC2 client API
  - S3 API (experimental)
  - Duraspace Fedora Commons plug-in (experimental)
  - EPrints plug-in (experimental)
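
For sizing purposes, the raw capacity implied by the minimum hardware requirements can be worked out as below. This is a rough sketch with a hypothetical helper name. It counts raw disk space only, in decimal terabytes. Usable capacity will be lower once erasure-coding redundancy is accounted for, and that overhead is not specified here.

```python
def raw_capacity_tb(servers: int, disks_per_server: int = 4, disk_gb: int = 500) -> float:
    """Raw disk capacity in decimal terabytes (1 TB = 1000 GB), before any redundancy."""
    return servers * disks_per_server * disk_gb / 1000


raw_8_servers = raw_capacity_tb(8)    # minimum 8-server configuration
raw_16_servers = raw_capacity_tb(16)  # minimum 16-server configuration
raw_8_servers_eval = raw_capacity_tb(8, disks_per_server=2)  # evaluation setup with 2 disks

print(raw_8_servers, raw_16_servers, raw_8_servers_eval)
```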