Conway’s Law Revisited
©Alan MacCormack, John Rusnak, Carliss Baldwin 2009 1
Conway’s Law Revisited: Do Modular Organizations Develop Modular Products?
Alan MacCormack (MIT)
John Rusnak, Carliss Baldwin (HBS)
Drexel University
Philadelphia, May 2009
Broad Research Context
• Increasing importance of Architecture/Modularity in literature
  – Industry level: Baldwin and Clark, 2000
  – Firm level: Henderson and Clark, 1992; Schilling, 2000
  – Product Line level: Sanderson and Uzumeri, 1995
  – Project level: Thomke and Reinertsen, 1998; MacCormack, 2001
• Little empirical work that develops robust, repeatable measures of Architecture/Modularity and highlights predictive power
  – Categorical and theoretical work: Ulrich 1995; Schilling 2000
  – Empirical studies use very different measures and very different levels of analysis, e.g., outsourcing (Schilling, 2000); patents (Fleming, 2004)
Our Research Tackles this Gap
The Mirroring Hypothesis
[Diagram elements: Functional Requirements; Organizational Structure; Choice of Product Architecture. Source: Adapted from Ulrich, 1995]
Do the Designs that Emerge from Distributed Organizations differ Systematically from the Designs that Emerge from Firms?
The Opportunity: Software
• Software = information-based product: design consists of instructions (source code) that tell the computer what to do
  – Designs can be processed automatically to capture dependencies
• Can track the “living history” of a design over time
  – Software tools track versions; open source versions freely available
• Software architecture work has a long history, yet few metrics
  – Parnas, 1972: proposed the concept of information hiding for dividing code into modular units; separate internal design from interfaces
• Natural Experiment: Different modes of organization; Open Source (large, distributed teams) vs Closed (small, collocated)
H1: Open source products are more modular than proprietary products
Measuring Modularity: Design Structure Matrices
[Figure: two 6×6 Design Structure Matrices over elements A to F, one in the original ordering (A, B, C, D, E, F) and one reordered (B, E, A, C, D, F)]
• Highlights the structure of a design by examining dependencies between its component tasks/elements in matrix form (Steward; Eppinger et al.; Pimmler and Eppinger; Sosa et al.; ...)
• Prior work identifies dependencies between Design Tasks; We map dependencies across Design Elements for existing designs
Choice of Dependency: “Function Calls” between “Source Files”
• Choice 1: Unit of Analysis
  – Source File = Collection of related Functions
  – Work is allocated at this level; development and version control tools work at this level
  – History in Software Literature
• Choice 2: Dependencies
  – “Function Calls” (Request to perform a specific task)
  – One important type of dependency (can extend methods to other types)
  – Use commercial “Extractor”
Definitions:
- A calls B
- A “uses” B
- A “depends on” B
Ultimately, A “needs to know” about B
[Figure: “Architectural View” of Linux Version 0.01]
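
To make the "A calls B" definition concrete, here is a minimal sketch of turning file-level function calls into a binary Design Structure Matrix. It assumes the call pairs have already been extracted by some tool. The function name `build_dsm`, the file names, and the row-depends-on-column convention are illustrative assumptions, not the authors' tooling.

```python
def build_dsm(calls):
    """Build a binary DSM from (caller_file, callee_file) pairs.

    Assumed convention: dsm[i][j] == 1 means file i calls
    (and so "needs to know" about) file j.
    """
    # Collect every file that appears as a caller or a callee.
    names = sorted({f for pair in calls for f in pair})
    index = {name: i for i, name in enumerate(names)}
    n = len(names)
    dsm = [[0] * n for _ in range(n)]
    for caller, callee in calls:
        # Calls within a single file are not cross-file dependencies.
        if caller != callee:
            dsm[index[caller]][index[callee]] = 1
    return names, dsm


# Hypothetical call pairs, as an extractor might report them.
example_calls = [("a.c", "b.c"), ("b.c", "c.c"), ("a.c", "a.c")]
names, dsm = build_dsm(example_calls)
for name, row in zip(names, dsm):
    print(name, row)
```

Repeated calls between the same two files collapse into a single mark, so the matrix records whether a dependency exists, not how strong it is.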
Measuring Modularity: “Propagation Cost”
M (direct dependencies):
     A B C D
  A  0 1 0 0
  B  0 0 1 0
  C  0 0 0 1
  D  0 0 0 0

M2:
     A B C D
  A  0 0 1 0
  B  0 0 0 1
  C  0 0 0 0
  D  0 0 0 0

M3:
     A B C D
  A  0 0 0 1
  B  0 0 0 0
  C  0 0 0 0
  D  0 0 0 0

M4:
     A B C D
  A  0 0 0 0
  B  0 0 0 0
  C  0 0 0 0
  D  0 0 0 0

V = I + M + M2 + M3 + M4 (entries set to 1 where nonzero):
     A B C D
  A  1 1 1 1
  B  0 1 1 1
  C  0 0 1 1
  D  0 0 0 1

Visibility Matrix = all Direct/Indirect Connections
Propagation Cost = Density of this Matrix = 10/16 = 62.5%
“On average, a change to a single system element potentially affects 62.5% of all system elements.”
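
The four-element example can be checked with a short sketch. It computes the visibility matrix as a transitive closure (Warshall's algorithm) instead of summing matrix powers. For a binary dependency matrix the two give the same result, but the choice of algorithm and the function names are assumptions made for illustration.

```python
def visibility_matrix(m):
    """Return the visibility matrix of a binary dependency matrix.

    Each element is visible to itself (the identity term), and
    v[i][j] == 1 whenever j is reachable from i by any chain of
    direct dependencies.
    """
    n = len(m)
    v = [[1 if (i == j or m[i][j]) else 0 for j in range(n)] for i in range(n)]
    # Warshall's algorithm: allow k as an intermediate element.
    for k in range(n):
        for i in range(n):
            if v[i][k]:
                for j in range(n):
                    if v[k][j]:
                        v[i][j] = 1
    return v


def propagation_cost(m):
    """Density of the visibility matrix: filled cells / all cells."""
    v = visibility_matrix(m)
    n = len(v)
    return sum(map(sum, v)) / (n * n)


# The slide's example: A -> B -> C -> D.
M = [
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
print(propagation_cost(M))  # 0.625
```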
Systems Differ Dramatically
Postgresql (database)  | Linux (operating system)
Propagation cost = 32% | Propagation cost = 9.7%
Architects do make a Difference
Propagation cost = 5.8% | Propagation cost = near 100%
Both DSMs have same size and dependency density…
Research Approach: Matched Pair Products
• Compare Products of Similar Size and Function
  – Open Source Software: globally distributed teams of volunteer developers (e.g., Raymond; von Hippel and von Krogh)
  – Closed Source Software: co-located teams in firms; sharing of information about different parts of the design easier, encouraged
• Problem 1: Many Open source projects are tiny, no community
  – Choose only those that are widely used and have a minimum size (~300 source files)
• Problem 2: Difficult to access Closed (proprietary) code
  – 1: “Ideal” Pair: Open and Closed equivalents can be found
  – 2: “Proxy” for Closed Source Product: First release of Opened Version
  – 3: “Implied”: Open project has limited source commit; small team
Sample of Product Pairs*
Pair | Open | Closed | Comments
1 | Gnucash | MyBooks | Financial management software. MyBooks is a commercial product that has been disguised. It is what we call an “ideal” pair.
2 | Abiword | OpenWrite | Word processing software. OpenWrite comes from Star Office, a closed commercial product that was released as OpenOffice in 2000.
3 | Gnumeric | OpenCalc | Spreadsheet software. OpenCalc comes from Star Office, a closed commercial product that was released as OpenOffice in 2000.
4 | Linux | a) OpenSolaris; b) XNU | Operating system software. Solaris is an operating system developed by Sun. Its source code was released in 2004. XNU is the kernel from Apple’s Darwin operating system.
5 | MySQL | Berkeley DB | Database software. Berkeley DB is developed by a team of fewer than 10 people. MySQL is developed by a large, distributed (global) team.
* Statistical Tests first conducted Within Pairs using a Mann-Whitney U Test of Differences in Component Visibility; then Across Pairs: if either outcome were equally likely, the chance of finding that the open product is more modular than the closed product in all five pairs is (.5)^5 = 0.03125, so P<.05
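
The across-pairs figure in the footnote is a one-sided sign test. The sketch below reproduces that arithmetic. The function name `sign_test_p` is an assumption, and the within-pair Mann-Whitney U tests are not reproduced here.

```python
from math import comb


def sign_test_p(successes, n, p=0.5):
    """One-sided sign test: probability of at least `successes`
    favourable outcomes in `n` independent pairs, when each pair
    is favourable with probability `p` under the null hypothesis."""
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(successes, n + 1)
    )


# Open product more modular in all 5 of 5 pairs.
print(sign_test_p(5, 5))  # 0.03125
```

With five pairs, a one-sided p below .05 requires all five to agree; four of five would give 0.1875.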
Pairs: Financial Mgmt (“Ideal”)
[Figure: Financial Management Software]
Pairs: Word Processing (“Proxy”)
[Figure: Word Processing Software]
Pairs: Spreadsheet (“Proxy”)
[Figure: Spreadsheet Software]
Pairs: Operating System (“Ideal”)
[Figure: Linux versus Open Solaris (NB different product scope)]
Pairs: Operating System (“Ideal”)
[Figure: Linux versus Apple Darwin XNU (XNU origins: Mach kernel at CMU)]
Pairs: Database (“Implied”)
[Figure: MySQL versus Berkeley DB (BDB)]
Summary: Hypothesis Holds in all Cases
Even so, there is still a puzzle: WHY IS GNUMERIC SO DIFFERENT?
Exploring Contributor Data: Credits File
[Figure: Number of Credits File Entries versus System Size. x-axis: Size of System (Source Files), 0 to 1500; y-axis: Entries in Credits File, 0 to 250; points: Abiword, GnuCash, GnuMeric, Linux]
Gnumeric = Open; but not Distributed
Method: Count appearances in the feature/change log (NB not all projects have it)
One person does ~40% of the work
Four people do ~90% of the work

Contributor | 2007  | 2006  | 2005  | 2004  | 2003  | 2002  | 2001  | 2000  | 1999  | 1998  | Total
Goldberg    | 29.2% | 32.2% | 35.4% | 48.7% | 43.9% | 41.2% | 55.0% | 46.2% | 14.2% |       | 38.1%
Welinder    | 37.9% | 46.4% | 32.8% | 25.6% | 22.0% | 18.8% | 10.8% |  9.5% | 11.9% |  1.0% | 18.4%
Guelzow     | 18.5% |  3.8% |  0.3% |  9.0% | 21.8% | 17.9% |  3.4% |       |       |       |  6.9%
Hellan      |  2.0% |  5.0% |  6.8% |  5.3% |  5.9% | 13.0% |  4.4% |  8.2% |  0.9% |       |  5.9%
Iivonen     |       |       |       |       |       |  0.9% |  0.6% |  3.7% | 10.2% |       |  2.4%
Tigelaar    |       |       |       |       |       |  0.2% | 11.5% |  4.4% |       |       |  2.6%
Meeks       |       |       |       |       |       |  0.1% |  0.2% | 10.8% | 30.6% |  7.7% |  6.9%
Icaza       |       |       |       |       |       |       |  0.1% |  5.3% | 20.7% | 58.8% |  6.0%
Kasal       |       |  0.3% | 11.9% |  2.5% |       |       |       |       |       |       |  1.0%
Others      | 12.4% | 12.3% | 12.9% |  8.8% |  6.3% |  7.9% | 14.1% | 12.0% | 11.5% | 32.5% | 11.7%
Total       |  100% |  100% |  100% |  100% |  100% |  100% |  100% |  100% |  100% |  100% |  100%

[Figure: pie chart of total shares by contributor: Goldberg, 38.1%; Welinder, 18.4%; Guelzow, 6.9%; Hellan, 5.9%; Iivonen, 2.4%; Tigelaar, 2.6%; Meeks, 6.9%; Icaza, 6.0%; Kasal, 1.0%; Others, 11.7%]
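
The counting method described on this slide can be sketched as follows, assuming the change log has already been reduced to one contributor name per entry. The function names and the sample names are hypothetical, and parsing a real log file is not shown.

```python
from collections import Counter


def contributor_shares(names):
    """Share of log entries per contributor, largest first."""
    counts = Counter(names)
    total = sum(counts.values())
    return {name: count / total for name, count in counts.most_common()}


def top_k_share(names, k):
    """Combined share of the k most frequent contributors."""
    counts = Counter(names)
    total = sum(counts.values())
    return sum(count for _, count in counts.most_common(k)) / total


# Hypothetical log: one name per feature/change entry.
entries = ["ann"] * 4 + ["bob"] * 3 + ["cy"] * 2 + ["di"]
print(contributor_shares(entries))
print(top_k_share(entries, 1), top_k_share(entries, 2))
```

A high top-k share for small k indicates concentrated development, which is the pattern the slide reports for Gnumeric.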
Comparison with Apache
Conclusion: Gnumeric much more concentrated
In fact, Contributors Decline over Time
[Figure: Number of Unique Names in Feature Log by Year. x-axis: 1997 to 2008; y-axis: 0 to 60]
Key Conclusions
• Substantial differences in levels of Modularity between Software Systems of similar Size and Function
  – Systems vary by a factor of 8; implications for performance
• Consistent with a view that Product Designs “Mirror” the Structure of the Organizations that develop them (Conway)
  – Result holds across ALL pairs; Gnumeric is not distributed
• Rival Hypotheses for why these Dynamics Occur
  – Designs evolve to reflect their environments
    • Closed teams naturally share information and leverage access to parts of the design in other modules; even if not explicit, design becomes tightly coupled
  – Purposeful choices made by system architects
    • Open teams require modularity to succeed; smaller pieces ease understanding and reduce the cost of contributing; Closed teams focus only on maximizing performance
Some Limitations
• Small Sample of Pairs (reflects limited access to code)
• Software Industry a Unique Context (Design = Information)
  – Dominant design (Utterback, 1994) not as constraining?
• Pairs were not Developed Contemporaneously
  – In general, open source products developed after closed products
  – Might be “learning” that happens; allows for more modularity
• Only look at Large open source projects; what about small?
  – Evidence from SourceForge: no pattern in small projects
  – Suggests modularity is a necessary condition for success…
Case Study: Changing the Organization
E.g., The Redesign of Mozilla (Source: Management Science, July 2006)
BEFORE: Propagation Cost = 17.3%
AFTER: Propagation Cost = 2.8%
Architecture must be Managed!
[Figure: Propagation Cost]
Implications for Managers
• Understand Organizational Biases in Design Choices
  – Search space is constrained by organizational characteristics
  – Challenge: Seldom are these influences Explicit
• When you Design Organizations, you also Design Products
  – Managers must consider potential path dependencies
  – Consider short-run “sub-optimal” modes (e.g., restrict interaction?)
• Implications for a move to “Open Innovation”
  – These arrangements have a distinct impact on the design
  – Performance will differ, for better or worse