Crowdsourced Data Management: Industry and Academic Perspectives
A few years ago, Aditya and I were catching up at Voltage cafe in Kendall Square when he asked if I’d be interested in writing a book on crowd-powered data processing systems. At the time, he was a postdoc at MIT and I was in startupland at Locu. In the time it took him to become a professor at UIUC and me to co-found a company, we went ahead and finished the book.
The book, which is freely available as a PDF, has two parts. In the first half, we review the state of academic research in crowdsourcing, with a special eye for data processing. The first half was a natural follow-on to our research in grad school. The second half of the book features summaries of 13 interviews with industry users of crowd work and 4 operators of crowdsourcing marketplaces. This half is filled with summary statistics and rich quotes from folks at companies like Google, Facebook, and Microsoft on how they manage large crowd workforces, what their use cases are, which aspects of the research literature they benefit from, and where they could use a little more help from researchers.
I really enjoyed two aspects of working on this book. First, it was wonderful to work with Aditya, who I never got to collaborate with in grad school. Second, the experience opened Aditya and me up to just how much you can learn from qualitative work like the interviews and surveys in the second part of the book. Both of us felt that this second lesson would have a lasting impact on how we approach learning a new topic and on how we keep industry and academia in sync on the most important problems in a field.
My only regret with the book is that, due to the formatting guidelines of our publisher, the Acknowledgements section is at the end of the book. One of my not-so-secret delights is reading the acknowledgements that people put in their Ph.D. theses, and I like it when they can be front and center. Nonetheless, it’s there, and I’m grateful!
To make the book more accessible, we’ll be putting together summaries of our favorite sections as blog posts. You can read the first one on Aditya’s blog.