Keeping science reproducible in a world of custom code and data

Read the full story at Ars Technica.

Since the mid-1600s, the output from a typical scientific study has been an essay-style journal article describing the results. But today, in fields ranging from astronomy to microbiology, much of the technical work for a journal article involves writing code to manipulate data sets. If the data and code are not available, other researchers can’t reproduce the original authors’ work and, more importantly, may not be able to build upon the work to explore new methods and discoveries.

Thanks to cultural shifts and funding requirements, more researchers are warming up to open data and open code. Even 100-year-old journals like the Quarterly Journal of Economics or the Journal of the Royal Statistical Society now require authors to provide replication materials—including data and code—with any quantitative paper. Some researchers welcome the new paradigm and see the value in pushing science forward via deeper collaboration. But others feel the burden of learning to use distribution-related tools like Git, Docker, Jupyter, and other not-quite words.
