How does the self-healing/rebuilding work?


Posted by kaspars at July 02. 2008

Hello,


I've installed Cleversafe 1.0 RC1 on 8 servers, with 6 needed to preserve data. I turned off 2 of them and copied a large file onto the Cleversafe iSCSI disk. Then I turned those 2 slicestors back on, hoping that Cleversafe would "self-heal" by populating data onto them, but I have not seen any activity in the several hours since.

So how does the self-healing/rebuilding work? Do I have to run some script manually? Or do I have to be more patient?


 


Thank you,


Kaspars.


Re: How does the self-healing/rebuilding work?

Posted by vthornton at July 02. 2008

Rebuilding is configured to scan the dsNet for data to rebuild once per week.  Rebuilding also occurs when data to rebuild is found during read/write operations. 
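
To make the idea of "scanning for data to rebuild" concrete, here is a minimal Python sketch of what such a scan might look for. It is not Cleversafe's implementation: `find_rebuild_work`, `slice_map`, and the `ss1`..`ss8` names are hypothetical, and the width-8, threshold-6 layout is borrowed from the first post.

```python
from typing import Dict, Set


def find_rebuild_work(
    slice_map: Dict[str, Set[str]],
    all_stores: Set[str],
    threshold: int,
) -> Dict[str, Set[str]]:
    """Return, per object, the slicestors that are missing a slice.

    Only objects that still have at least `threshold` slices are listed,
    because fewer slices than that cannot reconstruct the data.
    """
    work: Dict[str, Set[str]] = {}
    for name, holders in slice_map.items():
        missing = all_stores - holders
        if missing and len(holders) >= threshold:
            work[name] = missing
    return work


# Hypothetical vault: 8 slicestors, any 6 slices recover the data.
stores = {f"ss{i}" for i in range(1, 9)}
slice_map = {
    "old-file": set(stores),                # written while all 8 were up
    "large-file": stores - {"ss7", "ss8"},  # written while 2 were down
}
rebuild_work = find_rebuild_work(slice_map, stores, threshold=6)
print(rebuild_work)
```

In this sketch only `large-file` is reported, with `ss7` and `ss8` as the slicestors that would need slices regenerated.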


Re: How does the self-healing/rebuilding work?

Posted by kaspars at July 03. 2008

Thank you, vthornton!


 


Is there a way to change the rebuilding frequency, say to once a day?


Is there a way to run rebuilding manually if I know that there has been downtime of some slice servers and now they're ok?


 


Thanks,


Kaspars


Re: How does the self-healing/rebuilding work?

Posted by vthornton at July 08. 2008

The rebuilding frequency can be changed by modifying the rebuild-interval and rebuild-detection-interval properties in the properties.xml file in the conf directory.  There is no way currently to run rebuilding manually, but this will be possible in an upcoming release.


Re: How does the self-healing/rebuilding work?

Posted by kaspars at July 09. 2008

Thanks, vthornton!


I guess you mean changing properties.xml on the accesser machine, right? My properties.xml file has neither a rebuild-interval nor a rebuild-detection-interval property in it. Should I add them in a form like this:


<rebuild-interval>
   <value>some_value</value>
</rebuild-interval>


If so, what is the unit of 'some_value'? Days, hours, minutes, seconds? And in which section of properties.xml should I put it?


 


Thanks,


Kaspars.


Re: How does the self-healing/rebuilding work?

Posted by vthornton at July 11. 2008

Sorry, I forgot that rebuild functionality was not included in that release.  Rebuild functionality will be available starting with the 1.1 release.


Re: How does the self-healing/rebuilding work?

Posted by nick at August 05. 2008

does this mean that there is no (automated) rebuilding capability at all in the 1.0 RC?


after creating a minimal dsnet of width 2 and threshold 1, i shut down 1 slicestor and copied a movie file into the dsnet with the other slicestor still working.  i then played back the file, and it worked fine.  while the playback was happening, i brought the 2nd slicestor back online, and the movie stopped playing, and i did not see the used space increasing on the 2nd slicestor.  as soon as i stopped the 2nd slicestor, the playback continued.


 


shouldn't the 1st slicestor be the definitive one since it has the latest version of files, and shouldn't the playback continue even after bringing the 2nd slicestor back online?


i do also notice that the data being sent by the accesser increases significantly, going from 500K to 20MB/s, but it is not reading any data from the slicestors.  at first i thought it was reading data and rebuilding the 2nd slicestor, but that was not the case.


any idea of what is going on, and how to get the 2nd slicestor to rebuild the data?


 


Re: How does the self-healing/rebuilding work?

Posted by vthornton at August 05. 2008

There is no rebuild capability in 1.0 RC.


 


I don't know what could be causing the behavior you are experiencing.  I know of a defect affecting configurations where the threshold was half of the width, which has been fixed in later versions of the software, but I do not know whether it is related to your issue.
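
As general background on threshold schemes, and not as a diagnosis of the defect mentioned above: when the threshold is at most half of the width, two non-overlapping groups of slicestors can each be large enough to satisfy the threshold on their own. The Python sketch below checks for that condition; `disjoint_groups_possible` is a hypothetical helper name, and the configurations are the ones mentioned in this thread.

```python
def disjoint_groups_possible(width: int, threshold: int) -> bool:
    """True if two groups of slicestors with no member in common
    can each contain at least `threshold` slicestors."""
    return 2 * threshold <= width


# (width, threshold) pairs mentioned in this thread.
for width, threshold in [(2, 1), (3, 2), (8, 6)]:
    print(width, threshold, disjoint_groups_possible(width, threshold))
```

Of the three, only width 2 with threshold 1 allows two such groups. With width 3 and threshold 2, or width 8 and threshold 6, any two groups that meet the threshold must share at least one slicestor.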


Re: How does the self-healing/rebuilding work?

Posted by nick at August 06. 2008

okay.  so just to be clear, there is absolutely no rebuilding and/or repairing available in this release, correct?  if a slicestor is down, and new data is placed into the vault, then the slicestors that hold that data need to be available for that data to be available, correct?


i tried a width 3, threshold 2 setup, and things appear to be working even after bringing the 3rd slicestor back online, but of course the 3rd slicestor does not have the blocks/files that were put into the vault while it was down.
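
The availability question above can be worked through with a small Python sketch. It assumes only the threshold rule described in this thread (data is readable when at least `threshold` of the slicestors holding its slices are online) and no rebuilding; `is_readable` and the `ss1`..`ss3` names are hypothetical.

```python
from typing import Set


def is_readable(holders: Set[str], online: Set[str], threshold: int) -> bool:
    """Data is readable if enough of the slicestors that actually hold
    its slices are currently online."""
    return len(holders & online) >= threshold


# Width 3, threshold 2; the file was written while ss3 was down,
# so only ss1 and ss2 hold slices of it.
holders = {"ss1", "ss2"}

all_up = is_readable(holders, {"ss1", "ss2", "ss3"}, threshold=2)
ss1_down = is_readable(holders, {"ss2", "ss3"}, threshold=2)
print(all_up, ss1_down)
```

Without a rebuild, bringing `ss3` back online adds no redundancy for that file: under these assumptions it stays readable only while both `ss1` and `ss2` are up.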


will the release version of 1.0 have a rebuild/repair facility?


thanks.


 


Re: How does the self-healing/rebuilding work?

Posted by stoledano at August 07. 2008

Hi Nick,


We will be releasing a new version of the software that includes the rebuilder in a couple of weeks.

