Stanford researchers champion open and reproducible science
Stanford’s Center for Open and Reproducible Science aims to make science – and research in general – more effective and accessible.
Open science is a broad goal that includes making data, data analysis, scientific processes and published results easier to access, understand and reproduce. It’s an appealing concept, but in practice open science is difficult, and its costs often seem to exceed its benefits. Recognizing both the shortfalls and the promise of open science, Stanford University’s Center for Open and REproducible Science (CORES) – which is part of Stanford Data Science – hopes to make the practice of open science easier, more accessible and more rewarding.
Since its launch in September 2020, CORES has been hard at work on the center’s first major efforts. These include developing a guide for open science practices at Stanford – called the “Open by Design” handbook – and producing workshops and a lecture series to help people learn about and contribute to open science across the university.
“Stanford is absolutely the right place to have a Center like CORES because we have such a strong tradition of data science,” said Michelle Mello, professor of law in Stanford Law School and professor of medicine in Stanford School of Medicine, who is a member of the CORES executive committee. “And part of that leadership should be helping lead the culture change around how data are used and shared.”
For now, CORES is focusing its work on Stanford as a case study, but the center aspires to help researchers more broadly.
Defensive science
In championing open science, the members of CORES hope to address some of the perceived shortcomings of modern science.
“Part of the motivation behind CORES is the desire to make science actually answer questions as effectively as it should be able to,” said Russell Poldrack, the Albert Ray Lang Professor of Psychology in the School of Humanities and Sciences and director of CORES. “I’m a scientist because science offers us the best way to answer a subset of the questions that humans have. But we’ve learned that the way science is actually done often doesn’t lead to those kinds of answers.” As an example, Poldrack points to the challenges of reproducing and successfully applying findings from the field of cancer biology.
At some level, Poldrack thinks of open science as “defensive science” because being more open – documenting and displaying data, methods and results in a transparent way – is a ward against mistakes that might otherwise go unnoticed. In this way, open science also supports one of science’s fundamental features: that it should be reproducible.
“The basic premise of science is that you do it in such a way that others can reproduce the experiment and come up with the same result,” said Chris Mentzel, executive director of Stanford Data Science. “So open science is like putting an exclamation point on science by trying to really make modern science – the complex data, computation, analysis – available to everyone and reproducible by anyone.”
In addition to promoting more credible and justifiable findings, open science also means a richer, fuller collection of data, which can in turn promote collaboration and efficiency.
“Two important goals of CORES, from a scientific perspective, are to expand the number and increase the pace of analyses that we do on the data that we already have,” said Mello.
A culture change
In defining its role in supporting and encouraging open science, CORES also strongly emphasizes the need to make science and research more inclusive.
“In my own work, I’ve personally thought of ‘open science’ as releasing data and releasing code,” said Maya Mathur, assistant professor (research) of pediatrics and of biomedical informatics research, who is associate director of CORES. “But at CORES, we’ve also construed open science to involve openness to people and to diverse voices being involved in science.”
Although research is often supported by money from taxpayers and motivated by some desire to improve our world, it can be surprisingly difficult to connect the average person with scientific data, processes and results. This not only affects our general understanding of science and research but also who can join scientific communities.
“There have been various limitations on who can contribute to science in the past. Some of them are based around personal characteristics, some of them are based around knowledge and skills,” said Poldrack. “And we want to make clear that science should be open to the broadest set of people possible – both because we think it’s the right thing to do, humanistically, and because we think it makes science work better.”
The “closed” status quo of science also plays out within research communities, adding another layer of inequality and inefficiency.
“There are data haves and data have-nots,” said Mello. “And in a world where you have to buy data or where people who already have connections in elite universities have more privileged ability to access data, there’s an obvious barrier to diversifying and expanding the field.”
Expanding data access will be no easy task and will require a culture change among scientists, who are incentivized to keep their data for themselves and their lab members. Awards and funding, for example, tend to center around the idea of superstar scientists making transformative discoveries on their own or in small teams. Sharing your hard-earned data increases the chances that you’ll be scooped on a significant finding.
“Incentive structures drive behavior more than is ideal,” said Mentzel. “And that’s why CORES is not just saying that open science is something that we should do. We hope to make open science something that you’re better off doing.”
To encourage change, CORES conferred its inaugural Champion and Innovator awards in February, with Mello among the recipients. Next, the center hopes the Open by Design guide will enable realistic, reasonable and lasting adoption of open science.
Resources for openness
The Open by Design guide aims to address another recognized obstacle to open science: Making data transparent, accessible and understandable to a broad audience requires time and special skill, and it’s often easier to design data only for your own use.
Different fields of research often report their data, methods and results in different ways. So, the CORES team has been consulting with researchers around the university to curate a set of examples that show best practices from different fields. The guide will focus on how to organize and share research elements – including data, code and analysis platforms – along with open access publication and reproducible approaches to data science and manuscript generation. Ideally, the guide will be applicable across all domains of scholarly inquiry, including the humanities and arts. The team hopes to have it available by the end of 2021.
CORES is also working with Stanford researchers to develop additional tools, workshops, a lecture series and open science office hours. Although the efforts are all very Stanford-focused right now, the ultimate goal is to make CORES tools and resources useful far beyond The Farm.
“There have been some excellent open science efforts at other universities, like Berkeley and Harvard. So, we see CORES as trying to build a culture at Stanford that is up to the standards of this institution,” said Mathur. “We hope that will have effects that percolate to other institutions as well when they see how we value being a leader in open and reproducible science.”
CORES is funded by Stanford Data Science.