Via Wired magazine, I came across this fascinating online experiment: Stanford researcher Erik Steiner is soliciting guesses from the Internet about how many coins are in the pictured coin jar.
I’ll be curious to see how this experiment pans out. His early update on the findings is interesting:
First, thanks for your participation. Second, some early returns…
So far, it turns out that the most accurate guessers are the people who spend the least amount of time thinking about it. Somewhat surprisingly, those that answered “I actually did some math” are the least accurate, on average.
At the risk of exposing my own confirmation bias, I’m not that surprised by the early findings as I suspect it is intuition – gut feel or what Kahneman calls System 1 thinking – at work. System 2 probably fails because there isn’t enough information to analytically come up with a solution.
My interest in the wisdom of the crowd is more than pop-science fascination: I’ve always wondered about its applicability to forecasting large software projects. In a way, the agile world already adopts crowd-sourced estimates with techniques like sprint poker and story point estimation. However, those are typically analytical exercises (System 2) and finer-grained, i.e., at the story level. Story point estimates can of course then be aggregated to come up with an estimate for the entire project. But for very large projects – think Obamacare or larger – getting a backlog with enough detail and estimating each story can itself be a significant undertaking. That is where I would be curious to look at research on crowd-sourcing estimates for large software projects.
This is how I picture the experiment being structured: engineers, product managers and program managers in an organization are provided with the project description and a way to anonymously provide a guesstimate. Maybe they are even instructed not to discuss the project amongst themselves before providing an estimate, so as not to bias[1] their individual estimates. Perhaps a control question to reveal their biases[2] would also be in order. This would not work in small organizations, as you wouldn’t have enough of a “crowd” to crowd-source from. The aggregated estimate (mean, geometric mean[3]?) would then have to be compared against the traditionally calculated estimate or tracked against actual project completion.
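To make the aggregation step concrete, here is a minimal sketch of how a set of anonymous guesstimates could be combined. The function name `aggregate_estimates`, the unit (person-months) and the sample guesses are all made up for illustration; none of them come from a real project or study. The median is included alongside the mean and geometric mean only as an extra point of comparison.

```python
import statistics


def aggregate_estimates(estimates):
    """Combine individual guesses into a few candidate crowd estimates.

    `estimates` is a list of positive numbers (hypothetically, person-months).
    """
    if not estimates or any(e <= 0 for e in estimates):
        raise ValueError("estimates must be a non-empty list of positive numbers")
    return {
        "mean": statistics.mean(estimates),
        "median": statistics.median(estimates),
        # The geometric mean damps the effect of a few wildly large guesses.
        "geometric_mean": statistics.geometric_mean(estimates),
    }


# Hypothetical guesses, with one extreme outlier (200).
guesses = [10, 12, 15, 18, 20, 200]
result = aggregate_estimates(guesses)

for name, value in result.items():
    print(f"{name}: {value:.1f}")
```

With a single extreme guess in the mix, the arithmetic mean is pulled far above what most participants guessed, while the geometric mean and median stay much closer to the bulk of the crowd. Which aggregate (if any) tracks actual project completion is exactly what the experiment would have to show.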
Even if unsuccessful, these experiments could have interesting results: do engineers tend to be more or less accurate than program managers? Do experienced engineers forecast better or worse than less experienced ones? Software estimation is notoriously hard and error-prone, and if successful, a crowd-sourced estimate could provide another useful data point to aid long-term planning.
[1] The wisdom of the crowd has been shown to break down in the face of shared biases and social influence, which makes it particularly tricky to apply to an organization where everyone typically shares the same biases. Kahneman talks about this some more in this interview.
[2] Another study shows that the bias can be eliminated by identifying the independent thinkers, i.e., the not-so-easily influenced, and aggregating their estimates. The authors even propose a way to identify the independent thinkers.
[3] Studies on the validity of the wisdom of the crowd are contradictory: this post discusses a study where the geometric mean was used to massage away the wildly inaccurate guesses of the crowd.