Saturday, August 7, 2010

The Cathedral and the Bazaar

Musings on Linux and Open Source by an Accidental Revolutionary

This book was published in 1999 by Eric Raymond as a collection of essays. It was revised in 2001 after suggestions from readers and to reflect recent developments. And here I am in 2010 reading the book and writing about it. Of course I knew about the book earlier, but as I’m not a software developer, much less a hacker, I never thought I would find much of value to me in it.

By the way, hacker is a term used by the author that is not negative, but positive. Wikipedia describes this type of hacker as, “…a member of the computer programmer subculture originated in the 1960s in the United States academia, in particular around the Massachusetts Institute of Technology (MIT)'s Tech Model Railroad Club (TMRC) and MIT Artificial Intelligence Laboratory. Nowadays, this subculture is mainly associated with the free software movement. Hackers follow a spirit of creative playfulness and anti-authoritarianism, and sometimes use this term to refer to people applying the same attitude to other fields.” Understanding the hacker culture is critical to understanding the success of the “open source” model.

However, I found the book thoroughly enjoyable in spite of some repetition and structural issues caused by the way it was written. It was insightful and helpful to my thinking about an innovation commons and complexity.

It is amazing to me that someone so close to the open source revolution was able to step back and observe what was going on, and then draw out some principles and rules. Several people told me that I needed to read the book. I’m glad I finally listened.

Bob Young, CEO, Red Hat, writes in the foreword, “Freedom is not an abstract concept in business. The success of any industry is almost directly related to the degree of freedom the suppliers and the customers of that industry enjoy.” I couldn’t agree more.

Raymond comments on hackers in the preface, “I just referred to the "open-source movement". That hints at other and perhaps more ultimately interesting reasons for the reader to care. The idea of open source has been pursued, realized, and cherished over those thirty years by a vigorous tribe of partisans native to the Internet. These are the people who proudly call themselves "hackers" – not as the term is now abused by journalists to mean a computer criminal, but in its true and original sense of an enthusiast, an artist, a tinkerer, a problem solver, an expert.”

The title, The Cathedral and the Bazaar, contrasts two different styles of operating. A cathedral is designed and craftsmen work on realizing their parts of the cathedral. A bazaar, according to Wikipedia, “is a permanent merchandising area, marketplace, or street of shops where goods and services are exchanged or sold. The word derives from the Persian word bāzār, the etymology of which goes back to the Middle Persian word baha-char, meaning "the place of prices". Although the current meaning of the word is believed to have originated in Persia, its use has spread and now has been accepted into the vernacular in countries around the world.”

“Linux is subversive,” the author writes in an essay in 1996. “Who would have thought even five years ago (1991) that a world-class operating system could coalesce as if by magic out of part time hacking by several thousand developers scattered all over the planet, connected only by the tenuous threads of the Internet?” “Linus Torvalds’s style of development,” he continues, “release early and often, delegate everything you can, be open to the point of promiscuity – came as a surprise. No quiet, reverent cathedral building here – rather, the Linux community seemed to resemble a great babbling bazaar of different agendas and approaches (aptly symbolized by the Linux archive sites, which would take submissions from anyone) out of which a coherent and stable system could seemingly emerge only by a succession of miracles.”

The lessons Raymond extracts from this open source experience are:

  1. Every good work of software starts by scratching a developer’s personal itch
  2. Good programmers know what to write. Great programmers know what to rewrite (and reuse)
  3. Plan to throw one away; you will anyhow.
  4. If you have the right attitude, interesting problems will find you.
  5. When you lose interest in a program, your last duty is to hand it off to a competent successor.
  6. Treating your users as co-developers is your least-hassle route to rapid improvement and effective debugging.
  7. Release early. Release often. And listen to your customers.
  8. Given a large enough beta-tester and co-developer base, almost every problem will be characterized quickly and the fix will be obvious to someone. (Or less formally, given enough eyeballs, all bugs are shallow.)
  9. Smart data structures and dumb code works a lot better than the other way around.
  10. If you treat your beta-testers as if they’re your most valuable resource, they will respond by becoming your most valuable resource.
  11. The next best thing to having good ideas is recognizing good ideas from your users. Sometimes the latter is better.
  12. Often the most striking and innovative solutions come from realizing that your concept of the problem was wrong.
  13. Perfection in design is achieved not when there is nothing more to add, but rather when there is nothing more to take away.
  14. Any tool should be useful in the expected way, but a truly great tool lends itself to uses you never expected.
  15. When writing gateway software of any kind, take pains to disturb the data stream as little as possible – and never throw away information unless the recipient forces you to.
  16. When your language is nowhere near Turing-complete, syntactical sugar can be your friend.
  17. A security system is only as secure as its secret. Beware of pseudo-secrets.
  18. To solve an interesting problem, start by finding a problem that is interesting to you.
  19. Provided the development coordinator has a communication medium at least as good as the Internet, and knows how to lead without coercion, many heads are inevitably better than one.

“Brooks’s Law predicts that the complexity and communication costs of a project rise with the square of the number of developers, while work done rises only linearly,” the author summarizes. “The Brooks’s Law analysis and the resulting fear of large numbers in development groups rests on a hidden assumption: that the communications structure of the project is necessarily a complete graph, that everybody talks to everybody else. But on open source projects, the halo developers work on what are in effect separable parallel subtasks and interact with each other very little; code changes and bug reports stream through the core group, and only within that small core group do we pay full Brooksian overhead.”
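
The arithmetic behind this argument can be sketched in a few lines of Python. This is only an illustration and is not taken from the book: the function names, the core-group size of 5, and the assumption that each halo developer has exactly one channel into the core group are all hypothetical simplifications. A complete graph of n developers has n(n-1)/2 pairwise channels, which grows with the square of n, while a core-plus-halo structure grows roughly linearly.

```python
def complete_graph_channels(n: int) -> int:
    """Pairwise channels when everybody talks to everybody else: n(n-1)/2."""
    return n * (n - 1) // 2


def core_and_halo_channels(core: int, halo: int) -> int:
    """Channels when only the core group is fully connected.

    Assumption for illustration: each halo developer keeps a single
    channel into the core group and none to other halo developers.
    """
    return complete_graph_channels(core) + halo


if __name__ == "__main__":
    core = 5  # hypothetical core-group size
    for total in (10, 100, 1000):
        halo = total - core
        print(
            f"{total:>5} developers: "
            f"complete graph = {complete_graph_channels(total):>7}, "
            f"core + halo = {core_and_halo_channels(core, halo):>5}"
        )
```

Under these assumptions, 100 developers need 4,950 channels as a complete graph but only 105 in the core-plus-halo arrangement, which is the gap Raymond's argument relies on.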

There are two other major areas that Raymond discusses – the culture of open source efforts (the chapter titled Homesteading the Noosphere) and why open source software was a good candidate for this new way of collaborating (the chapter titled The Magic Cauldron). Finally, I will close with my observation about the “bazaar” and complexity.

This book, but particularly the chapter Homesteading the Noosphere, has been difficult for me to summarize. I think that this is because, as the author states in the subtitle, the essays are musings. In order to pull out of this chapter some thoughts useful to pass on, I’m going to impose my structure on the chapter. I look at culture as shown in the graphic below taken from Organizational Development.

The core of any culture is its philosophy (although it is sometimes difficult to describe). Built on that philosophy is a set of beliefs (things that are taken as fact). Values are built from the beliefs (in this model, values are priorities). Behavior follows values (in some cases people subdivide this into norms). And the output results from the behaviors. Using this as a model, I will extract what the author has to say about each of these topics.

Raymond doesn’t really identify a philosophy. However, his discussion of why hackers get involved is close:

“All members agree that open source (that is, software that is freely redistributable and can readily evolve and be modified to fit changing needs) is a good thing and worthy of significant and collective effort.”

Beliefs vary depending upon two parameters – zealotry and hostility to commercial software:
“One degree of variation is zealotry; whether open source development is regarded merely as a convenient means to an end (good tools and fun toys and an interesting game to play) or as an end in itself.

A person of great zeal might say, "Free software is my life! I exist to create useful, beautiful programs and information resources, and then give them away." A person of moderate zeal might say, "Open source is a good thing, which I am willing to spend significant time helping happen." A person of little zeal might say, "Yes, open source is okay sometimes. I play with it and respect people who build it."

Another degree of variation is in hostility to commercial software and/or the companies perceived to dominate the commercial software market.

A very anticommercial person might say, "Commercial software is theft and hoarding. I write free software to end this evil." A moderately anticommercial person might say, "Commercial software in general is okay because programmers deserve to get paid, but companies that coast on shoddy products and throw their weight around are evil." An un-anticommercial person might say, "Commercial software is okay; I just use and/or write open-source software because I like it better." (Nowadays, given the growth of the open-source part of the industry since the first public version of this essay, one might also hear, "Commercial software is fine, as long as I get the source or it does what I want it to do.")”

The author views these two categories as orthogonal, so that three levels of zeal combined with three levels of hostility to commercial software give nine different beliefs, all under the umbrella of open source.

The author writes that there are three basic ways that humans organize to deal with scarcity and want:
  1. Command hierarchy – scarce goods are allocated by one central authority
  2. Exchange economy – scarce goods are allocated through trade and voluntary cooperation
  3. Gift culture – adaptations to abundance
“For examined in this way, it is quite clear that the society of open-source hackers is in fact a gift culture. Within it, there is no serious shortage of the 'survival necessities' – disk space, network bandwidth, computing power. Software is freely shared. This abundance creates a situation in which the only available measure of competitive success is reputation among one's peers.”

Within the gift culture, certain types of gifts are valued more than others, and therefore the giver is held in higher esteem:
  • Accurate and truthful representation of the gift
  • Work that extends the noosphere
  • Work that makes it into a major distribution
  • Work that is utilized by others
  • Continued devotion to hard, boring work
  • Nontrivial extensions of function
“In fact (and in contradiction to the anyone-can-hack-anything consensus theory) the open-source culture has an elaborate but largely unadmitted set of ownership customs.
These customs regulate who can modify software, the circumstances under which it can be modified, and (especially) who has the right to redistribute modified versions back to the community.

The taboos of a culture throw its norms into sharp relief. Therefore, it will be useful later on if we summarize some important ones here:

  • There is strong social pressure against forking projects. It does not happen except under plea of dire necessity, with much public self-justification, and requires renaming.
  • Distributing changes to a project without the cooperation of the moderators is frowned upon, except in special cases like essentially trivial porting fixes.
  • Removing a person's name from a project history, credits, or maintainer list is absolutely not done without the person's explicit consent.”
Nothing in the licenses used for open source projects prevents “forking”:

“Nothing prevents half a dozen different people from taking any given open-source product (such as, say, the Free Software Foundation's gcc C compiler), duplicating the sources, running off with them in different evolutionary directions, but all claiming to be the product.
This kind of divergence is called a fork. The most important characteristic of a fork is that it spawns competing projects that cannot later exchange code, splitting the potential developer community. (There are phenomena that look superficially like forking but are not, such as the proliferation of different Linux distributions. In these pseudo-forking cases there may be separate projects, but they use mostly common code and can benefit from each other's development efforts completely enough that they are neither technically nor sociologically a waste, and are not perceived as forks.)”

Ownership is an important issue in this culture:
“What does 'ownership' mean when property is infinitely reduplicable, highly malleable, and the surrounding culture has neither coercive power relationships nor material scarcity economics?
Actually, in the case of the open-source culture this is an easy question to answer. The owner of a software project is the person who has the exclusive right, recognized by the community at large, to distribute modified versions.”

A person can achieve ownership in three ways:
  1. Found the project
  2. Inherit ownership from the previous owner
  3. Take over an abandoned project (with appropriate notifications and permission of the previous owner if he or she can be found, and no objections from members of the project)
I think that the conditions of the output from this culture are best described in the open source definition on the open source web site:

“Open source doesn't just mean access to the source code. The distribution terms of open-source software must comply with the following criteria:
  1. Free Redistribution: The license shall not restrict any party from selling or giving away the software as a component of an aggregate software distribution containing programs from several different sources. The license shall not require a royalty or other fee for such sale.
  2. Source Code: The program must include source code, and must allow distribution in source code as well as compiled form. Where some form of a product is not distributed with source code, there must be a well-publicized means of obtaining the source code for no more than a reasonable reproduction cost, preferably downloading via the Internet without charge. The source code must be the preferred form in which a programmer would modify the program. Deliberately obfuscated source code is not allowed. Intermediate forms such as the output of a preprocessor or translator are not allowed.
  3. Derived Works: The license must allow modifications and derived works, and must allow them to be distributed under the same terms as the license of the original software.
  4. Integrity of The Author's Source Code: The license may restrict source-code from being distributed in modified form only if the license allows the distribution of "patch files" with the source code for the purpose of modifying the program at build time. The license must explicitly permit distribution of software built from modified source code. The license may require derived works to carry a different name or version number from the original software.
  5. No Discrimination Against Persons or Groups: The license must not discriminate against any person or group of persons.
  6. No Discrimination Against Fields of Endeavor: The license must not restrict anyone from making use of the program in a specific field of endeavor. For example, it may not restrict the program from being used in a business, or from being used for genetic research.
  7. Distribution of License: The rights attached to the program must apply to all to whom the program is redistributed without the need for execution of an additional license by those parties.
  8. License Must Not Be Specific to a Product: The rights attached to the program must not depend on the program's being part of a particular software distribution. If the program is extracted from that distribution and used or distributed within the terms of the program's license, all parties to whom the program is redistributed should have the same rights as those that are granted in conjunction with the original software distribution.
  9. License Must Not Restrict Other Software: The license must not place restrictions on other software that is distributed along with the licensed software. For example, the license must not insist that all other programs distributed on the same medium must be open-source software.
  10. License Must Be Technology-Neutral: No provision of the license may be predicated on any individual technology”
The last thing I want to write about from Raymond’s book is why the open-source project worked for software. Hopefully this will shed some light on the characteristics of other potential projects that might be successful.

One of the key reasons why open-source software was successful has to do with the value structure of the software. Computer programs, like all other kinds of capital goods, have two kinds of economic value – use value and sale value.

According to the author, the software industry has deluded itself into believing that it follows a “manufacturing model”, i.e. one based on sale value. However, he writes, “…approximately 95% of code is still written in-house”. This code has little or no sale value but a lot of use value. He believes, “…that only 5% of the industry is sale-value-driven.” This implies that collaborative development models, which could improve the 95% that is not sale-value-driven, should find wide acceptance.

Raymond writes that we can expect a high payoff from an open source project when:
  • Reliability, stability, and scalability are critical
  • Correctness of design and implementation are not readily verifiable by means other than independent peer review
  • The software is a business-critical capital good
  • It establishes or enables a common computing and telecommunications infrastructure
  • Key methods are part of common engineering knowledge
And, he writes that open source seems to make the least sense when:
  • You have unique possession of value-generating software
  • It is relatively insensitive to failure
  • It can be verified by means other than peer review
  • It is not business critical
  • It would not have its value increased by network effects or ubiquity

This is a very valuable book to read and study, especially if you’re interested in learning when and how to apply open-source approaches to projects. Even though I’ve written eight pages on the book, I have not addressed all of the topics he covers.

The most important insight I gained from studying The Cathedral and the Bazaar was not in the book, but drawn from complexity science. Using the terminology of complexity science, the open source approach uses a complex system to solve a complicated problem. (See 1,2, a Few, Many for more information.) The type of complex system being created is one composed of many independent intelligent agents with emergent properties.

Raymond likened the open source model to the magic cauldron. “In Welsh myth, the goddess Ceridwen owned a great cauldron that would magically produce nourishing food – when commanded by a spell known only to the goddess.”

We may be getting close to understanding how to create complex systems that produce abundance.

The Cathedral & the Bazaar: Musings on Linux and Open Source by an Accidental Revolutionary
Eric Raymond, O’Reilly, 2001, 241 p
