In tomorrow’s CS889 class (June 2nd) I am again leading the second half of the discussion. This time my focus is on the work of Audris Mockus et al. Their article, “Two case studies of open source software development: Apache and Mozilla,” compares the very different development processes of these two projects. I found the article interesting, and the hypotheses the authors develop (and later modify) are informative. In this post, I list some of the more interesting findings:

Apache

  • Apache was, for the most part, entirely volunteer driven. All the core project members had other “day jobs”.
  • Coordination between core members was done almost exclusively through the developer mailing list.
  • The core Apache Group (AG) has between 8 and 25 members. At the time of the study, the AG had 12 members.
  • At any given time, 4 to 6 members of the AG were actively writing code alongside 2 to 3 AG nominees (highly active, non-AG developers).
  • There was little evidence of code ownership. AG members were trusted to modify any parts of the code.
  • Over 400 developers contributed some code.
  • 15 developers (the “core developers”) contributed 83% of the CVS commits for new functionality.
  • 15 developers produced 66% of the code related to bug fixes.
  • Over 3000 contributors submitted about 4000 problem reports (PRs). 591 of those problem reports resulted in a code (or documentation) change.
  • The top 15 problem report submitters contributed only 5% of the total PRs.
  • Reported problems are typically addressed very quickly (depending on priority), with half of the problem reports being corrected within a single day.
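
To make the "15 developers contributed 83%" style of figure concrete, here is a minimal sketch of how such a share could be computed from a commit log. The function name `top_k_share` and the commit data are my own made-up illustration, not the authors' method or their data.

```python
from collections import Counter


def top_k_share(authors, k):
    """Return the fraction of commits made by the k most active authors."""
    counts = Counter(authors)
    if not counts:
        return 0.0
    top = sum(n for _, n in counts.most_common(k))
    return top / sum(counts.values())


# Made-up commit log: one author name per commit (not data from the paper).
commits = ["ann"] * 50 + ["bob"] * 30 + ["cy"] * 10 + ["dee"] * 6 + ["eve"] * 4

print(f"top-2 share: {top_k_share(commits, 2):.0%}")  # prints "top-2 share: 80%"
```

With a real project, the `authors` list would presumably come from the version control history, one entry per commit.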

Mozilla

  • Mozilla had a number of paid employees, many of them originally from Netscape.
  • Code ownership was strongly enforced. For each of Mozilla’s modules, one developer (the module owner) was responsible for reviewing and committing changes.
  • Mozilla modules had 22-35 core members. I.e., 22 to 35 people wrote 83% of the code for new functionality.
  • Mozilla had 76 or so modules. For each module, between 87 and 174 people contributed code.
  • 6837 contributors submitted about 58,000 problem reports. 11,616 of those problem reports resulted in a code (or documentation) change.
  • The top 113 submitters reported 50% of all problems. Only 25 of these were external. There were formal testing teams tasked with finding problems.
  • Reported problems required longer to fix. This was mostly because all changes needed to be approved by the module owners.
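
As a rough back-of-the-envelope comparison, the problem report figures listed for the two projects can be turned into the share of reports that led to a change. This is only a sketch using the numbers in this post; both totals are approximate ("about 4000" and "about 58,000"), so the percentages are approximate too, and the function name `change_rate` is mine.

```python
def change_rate(reports_with_change, total_reports):
    """Fraction of problem reports that resulted in a code or documentation change."""
    return reports_with_change / total_reports


# Figures as listed in this post; the totals are rounded in the source.
apache = change_rate(591, 4000)
mozilla = change_rate(11_616, 58_000)

print(f"Apache:  {apache:.1%}")   # prints "Apache:  14.8%"
print(f"Mozilla: {mozilla:.1%}")  # prints "Mozilla: 20.0%"
```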

Hypotheses

From these two studies, the authors made the following 7 hypotheses for OSS projects in general (quoted directly, and included in this summary for completeness):

“Hypothesis 1a: Open source developments will have a core of developers who control the code base, and will create approximately 80% or more of the new functionality. If this core group uses only informal, ad hoc means of coordinating their work, it will be no larger than 10-15 people.

Hypothesis 2a: If a project is so large that more than 10-15 people are required to complete 80% of the code in the desired time frame, then other mechanisms, rather than just informal, ad hoc arrangements, will be required in order to coordinate the work. These mechanisms may include one or more of the following: explicit development processes, individual or group code ownership, and required inspections.

Hypothesis 3: In successful open source developments, a group larger by an order of magnitude than the core will repair defects, and a yet larger group (by another order of magnitude) will report problems.

Hypothesis 4: Open source developments that have a strong core of developers but never achieve large numbers of contributors beyond that core will be able to create new functionality but will fail because of a lack of resources devoted to finding and repairing defects.

Hypothesis 5: Defect density in open source releases will generally be lower than commercial code that has only been feature-tested, i.e., received a comparable level of testing.

Hypothesis 6: In successful open source developments, the developers will also be users of the software.

Hypothesis 7: OSS developments exhibit very rapid responses to customer problems.” [Mockus et al.]