Re-writing vs. Modifying Software

You may have noticed that the open source release we posted last month was a complete re-write of the code base we published last year.  Not a single line of code was unchanged.  This wasn’t just due to the move from C++ to Java; it was the result of a decision to completely re-architect, re-design and re-write the Dispersed Storage software.

I’ve spoken a few times with Joe Jablonski of Acumence and others about the merits of re-writing software vs. enhancing an existing code base.  Joe and I agree that software development organizations too often make the mistake of continuing to enhance a weakening code base instead of re-writing it from scratch.

I think this misjudgment comes from a fundamental misunderstanding of the nature of modern software development.  Modern software development tools in the hands of capable developers can quickly produce complex software.  (We are using Scrum as our development methodology, by the way, and have been quite pleased with the results.)  But software development does not just mean the act of writing software, and the time required to write complex software is not just the time required to type the code.  Completing a complex software development project requires dynamic coordination of requirements definition, architecture, design, development, testing, validation, tuning, and enhancement.  If done correctly, writing code accounts for only a portion of the time and effort required for software development, especially for complex software and especially for a new type of complex software.

Our goal in the initial production release of Dispersed Storage software is to create an outstanding software foundation on which we and others can build Dispersed Storage solutions.  The work we did in 2005 and 2006 provided many insights into how to build a Dispersed Storage system, and by around the beginning of this year that know-how made it clear to us that we needed to re-write our software.  That know-how also included knowing how to proceed toward an outstanding initial production release.

Whether we realize that goal will ultimately be determined by market acceptance of our software, and specifically by whether it provides the reliability, security, performance, scalability, longevity and cost-effectiveness benefits we envision.  But the preliminary results we are seeing now from our re-write over the past year exceed last year’s results by such a margin that we know re-writing was a necessary step.

re: rewrite vs modify

Chris,

Fortunately for Cleversafe, the decision to rewrite paid off. Of course, I think you and I can agree that such a decision would have been far more costly and difficult [certainly more painful] down the road with a larger install base and more moving parts. I'm glad you decided to tackle the project upstream while you still can.

It'll be interesting to see if the company is as willing [and able] to rewrite the code base once a substantial number of users have developed application- and system-specific dependencies on it. Cleversafe will then have to take into consideration the potentially costly impact on thousands of customer environments as well. It's certainly what keeps Microsoft engineers up at night.

On a different note...

One of the dangers of software development lies in the psyche of engineers and their tendency to believe they can build a better mousetrap than their predecessors. Obviously, this isn't always true. I'm particularly wary of engineers who carry the dangerous [design] elegance or [platform] bias genes.

In that context, I'm curious about Cleversafe's decision to move from C++ to Java. Would you provide readers with some insight into the design requirements that motivated Cleversafe to move away from C++, and, of the available alternatives, to choose Java? I'm sure they'll find the insight helpful when they face similar requirements in their own projects.
Posted by joseph martins | Dec 13, 2007 09:10 AM

Java vs. C++

Joseph,

Great comments. You are right: it is much easier to make the decision to rewrite before you have an installed base, which makes it all the more critical to get your software right BEFORE the initial customer release.

As to why we made the switch from C++ to Java, there were two main reasons. First, we found it much harder to find developers who could write great C++ than to find developers who could write great Java. This was especially true for complex software like Dispersed Storage.

Second, we found Java to have a much deeper set of existing software, such as tools, libraries and environments, so we ended up having to write a lot less software ourselves in Java than in C++. In Java, we only had to write the unique new elements of the Dispersed Storage software ourselves, and we were able to find existing Java libraries for everything else. If you look at the .jar files in the open source release, you'll see that we used a lot of existing Java libraries which I think are...

bcprov-jdk15-136.jar
JSAP
bzip2.jar
commons-logging-1.0.4.jar
commons-math-1.1.jar
commons-net-1.4.1.jar
je-3.2.23.jar
junit-4.1.jar
log4j-1.2.14.jar
mina-core-1.1.0.jar
mina-core-1.1.0-sources.jar
mina-filter-ssl-1.1.0.jar
slf4j-api-1.4.0.jar
slf4j-log4j12-1.4.0.jar
svnkit.jar
wrapper.jar
ws-commons-java5-1.0.1.jar
ws-commons-util-1.0.1.jar
Java Service Wrapper shell scripts
jscsi-modified-070601.jar

Incorporating these existing libraries, along with being able to find great Java developers more easily, ended up increasing our pace of development. My subjective estimate is that, at equal cost, we produced functionality in Java at twice the rate we did in C++.

Yes, C++ can provide faster performance in some cases, but we didn't find the differences to be that significant. Also, the performance bottleneck of a Dispersed Storage system is typically its network bandwidth, so slightly slower execution of our software doesn't typically translate into a material change in overall system performance. As the use of Dispersed Storage scales up, it may make sense to rewrite some or all of the Dispersed Storage code in C++, but for now our focus is on getting functionality completed and Java has been the right approach.

Chris
Posted by cgladwin | Dec 13, 2007 07:48 PM

thank you

Chris,

Thank you for taking the time to write a clear, concise and [above all] candid response.

While C++ has its performance advantages, I think most customers would prefer to have a usable, stable product they can deploy today - with more capabilities and at a lower cost - than to wait several months more [or longer] for something questionably different. As you pointed out, this is especially true in environments where the application isn't the bottleneck.

Well, I'm glad to see that you're able to keep the team focused, and mindful of time-to-market. I look forward to taking a deeper look at the product and speaking with you soon - I believe Bill Peterson is supposed to set up a briefing.
Posted by joseph martins | Dec 13, 2007 09:27 PM
Weblog Authors

cgladwin

Location: Cleversafe Chicago
Chris Gladwin wrote the first Dispersed Storage prototype and is the Founder, President and CEO of Cleversafe, a company commercializing this technology.

jbellanca

Location: Chicago
Cleversafe founder. MIT Graduate, history of working for technology startups. Areas of expertise: product design, interaction design, requirements.

rkennedy

Location: Chicago
VP of Product Management and Strategic Alliances for Cleversafe. Responsible for product management and product marketing, and for ensuring that the product roadmap and features meet the demands of the marketplace.