Cleversafe theoretical performance has limits?
Hi.
I'm researching distributed storage solutions for a project with large amounts of data (hundreds of TB), and I'm evaluating Cleversafe as a potential candidate.
After searching the Internet a bit, I found the following article on StorageMojo. What was most interesting were the comments by storage professionals, who claimed that Cleversafe has significant performance limits because of its Reed-Solomon approach.
http://storagemojo.com/2008/03/03/cleversafes-dispersed-storage-network/
Can someone from the Cleversafe team provide information about this matter?
Regards.
We are migrating our forums to a new site. If you don't mind, please re-post your question on http://dev.cleversafe.org/forums.
Our new forums are powered by phpBB. It has more features, performs better, and is generally easier for us to manage.
Our new site supports OpenID. Many online email providers also provide OpenID identities. If yours doesn't, you can get an OpenID at http://myopenid.com. If you don't feel like getting an OpenID, you can still register on our forums the old-fashioned way.
Sorry for the inconvenience,
Wesley Leggette
Open Source Manager
Cleversafe, Inc.
Previously Stas Oskin wrote: (original post, quoted in full above)
Hi Stas,
Thank you for your inquiry. I will address your questions and concerns. Cleversafe uses a form of Reed-Solomon called Cauchy Reed-Solomon for its core Information Dispersal Algorithm (IDA). This is a faster form of Reed-Solomon, and Cleversafe has improved the algorithm to perform the calculations even faster. With ample processor capacity, the IDA is not the limiting factor in the performance of a Cleversafe Dispersed Storage system.
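For readers unfamiliar with the idea, here is a minimal sketch of Cauchy-matrix dispersal in Python. It is not Cleversafe's implementation: as a simplifying assumption it works over the prime field GF(257) instead of the GF(2^w) arithmetic that production Cauchy Reed-Solomon codes typically use, and the names `encode`, `decode`, `n` and `k` are hypothetical. It only illustrates the property the IDA relies on: data is turned into n slices, and any k of them are enough to rebuild it.

```python
P = 257  # prime modulus; an assumption for brevity, not what real systems use


def cauchy_matrix(n, k):
    """n x k Cauchy matrix with entries 1 / (x_i - y_j) mod P.

    x_i and y_j are drawn from disjoint ranges, so every entry exists and
    every k x k submatrix is invertible. Requires n + k <= P.
    """
    xs = range(n)
    ys = range(n, n + k)
    return [[pow((x - y) % P, -1, P) for y in ys] for x in xs]


def encode(data, n, k):
    """Turn k data symbols (0..255) into n slices (values 0..256)."""
    assert len(data) == k
    matrix = cauchy_matrix(n, k)
    return [sum(c * d for c, d in zip(row, data)) % P for row in matrix]


def solve(a, b):
    """Solve a * x = b over GF(P) by Gauss-Jordan elimination."""
    k = len(a)
    aug = [row[:] + [v] for row, v in zip(a, b)]
    for col in range(k):
        # Find a row with a non-zero entry in this column and swap it up.
        pivot = next(r for r in range(col, k) if aug[r][col] % P)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = pow(aug[col][col], -1, P)
        aug[col] = [v * inv % P for v in aug[col]]
        for r in range(k):
            if r != col and aug[r][col]:
                f = aug[r][col]
                aug[r] = [(v - f * w) % P for v, w in zip(aug[r], aug[col])]
    return [row[k] for row in aug]


def decode(slices, n, k):
    """Rebuild the data from a dict {slice_index: value} with >= k entries."""
    idx = sorted(slices)[:k]
    matrix = cauchy_matrix(n, k)
    return solve([matrix[i] for i in idx], [slices[i] for i in idx])


# Usage: 4 data symbols dispersed into 6 slices; slices 1 and 4 are lost.
data = [72, 105, 33, 0]
coded = encode(data, n=6, k=4)
survivors = {i: coded[i] for i in (0, 2, 3, 5)}
recovered = decode(survivors, n=6, k=4)
```

The prime field keeps the arithmetic readable with plain integers. The cost is that slice values can reach 256 and no longer fit in a byte, which is one reason real codes use GF(2^w).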
Cleversafe's technology works best in applications with large digital content objects that must be stored and protected for extended periods of time. Digital media, audio, images, log files, etc. are very good candidates for dispersed storage. The larger the data object, the better. The properties of dispersal allow you to store and protect these objects very efficiently without the need to copy the data, which reduces your storage and management costs. We have several customers that are successfully using our technology for storing, protecting and potentially distributing large digital content libraries.
One of the respondents to the StorageMojo article claims that RAID 6 or RAID 5 is "sufficient" for protecting critical digital content and that the performance of these storage subsystems is far superior to dispersed storage. While it's true that RAID-based systems offer enhanced protection, most customers also mirror or copy their critical digital assets to protect them from a full subsystem failure, and may copy them to another site to protect against an entire site failure. This functionality is built into a dispersed storage system.
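To put rough numbers on the efficiency argument, the sketch below compares raw capacity for the two approaches. Every figure in it is a hypothetical assumption chosen for illustration (a 10-of-16 dispersal, an 8+2 RAID 6 group mirrored to a second site, 300 TB of data); none of them comes from Cleversafe or from the StorageMojo article.

```python
from fractions import Fraction


def dispersal_overhead(n, k):
    """Raw-to-usable ratio when data is dispersed into n slices, any k of which rebuild it."""
    return Fraction(n, k)


def mirrored_raid6_overhead(data_disks, copies):
    """Raw-to-usable ratio for RAID 6 (two parity disks per group) kept in `copies` full copies."""
    return Fraction(data_disks + 2, data_disks) * copies


usable_tb = 300  # hypothetical data set size

# Hypothetical configurations, for illustration only.
dispersed = dispersal_overhead(n=16, k=10)                   # 8/5 = 1.6x
mirrored = mirrored_raid6_overhead(data_disks=8, copies=2)   # 5/2 = 2.5x

print(f"dispersal 10-of-16:     {float(usable_tb * dispersed):.0f} TB raw")
print(f"RAID 6 (8+2) x 2 sites: {float(usable_tb * mirrored):.0f} TB raw")
```

Under these assumptions the dispersed layout tolerates the loss of any 6 of its 16 slices while using less raw capacity than the mirrored RAID 6 layout. Different parameters would change the ratio.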
From a performance perspective, small-block reads and writes will perform better on a single drive or across a small number of drives in a RAID system, but that's not necessarily the case with large digital content objects. Dispersed storage systems perform write and read operations to and from a network of storage nodes in parallel, and they operate on smaller pieces of the object rather than on the entire object. Assuming sufficient network bandwidth, moving large objects to and from a network of storage nodes is very efficient due to the parallel I/O and processor operations.
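The parallel read pattern can be sketched as follows. This is a simplified simulation, not Cleversafe's client code: the storage nodes are a plain dictionary, network latency is faked with `time.sleep`, and `fetch_slice` and `read_first_k` are hypothetical names. It shows a reader requesting all slices at once and proceeding as soon as any k have arrived.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def fetch_slice(node_id, store, delay):
    """Simulated remote read: wait for the node's latency, then return its slice."""
    time.sleep(delay)
    return node_id, store[node_id]


def read_first_k(store, k, delays):
    """Request every slice in parallel and keep the first k that arrive."""
    got = {}
    with ThreadPoolExecutor(max_workers=len(store)) as pool:
        futures = [
            pool.submit(fetch_slice, node_id, store, delays[node_id])
            for node_id in store
        ]
        for future in as_completed(futures):
            node_id, payload = future.result()
            got[node_id] = payload
            if len(got) == k:
                break  # enough slices to rebuild the object
    # Note: leaving the `with` block still waits for the slower requests to
    # finish; a real client would cancel or ignore them instead.
    return got


# Usage: six simulated nodes with different latencies, any four slices suffice.
store = {i: f"slice-{i}" for i in range(6)}
delays = {i: 0.01 * i for i in range(6)}
slices = read_first_k(store, k=4, delays=delays)
```

Because the requests overlap, the time to collect k slices is governed by the k-th fastest node and not by the sum of all node latencies.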
I hope this addresses your concerns. Please respond if you need additional information.
Russ

