The Buffet Approach to Open Science

We have written before about a few open science practices, some of which are becoming the norm, such as preregistration (and whether it prevents creativity). I’ve also been invited a few times to give classes and workshops introducing various audiences to open science and to implementing the practices associated with the overall term (which covers so much more than changing our experimental and publishing habits, but that’s another blog post). Doing so means engaging with researchers from various disciplines, who conduct dramatically different types of studies and approach science from different angles than the prototypical theory-testing experimenter. The discussions around open science I had in these contexts have been extremely useful for me, and they led me to promote what I call the “buffet approach” to open science. In short, I think it makes the most sense to pick and choose those components and practices of open science that fit a specific project, career stage, personal skills, and institutional support.

To illustrate the buffet approach, I’ll use examples from my own research, but your own constraints will be different. I also need to preface this post by making clear that I have been very lucky to work in very supportive environments (big shoutout to Alex Cristia and Caroline Rowland, who recognized the benefit and necessity of open science very early). With less support, adopting open science practices becomes much harder and makes it necessary to find your support community elsewhere (check out, for example, R-Ladies, ManyBabies, PSA, FORRT, …).

Buffet rule 1: You might not be able to try everything 

At a buffet, what you try also has to fit your diet – and not all open science practices are even available to everyone; think of preregistration for iterative theory building, or data sharing for video recordings. To stick with the metaphor, some people are allergic to milk, so they will not try the yoghurt dressing with their salad. But balsamic is a good alternative.

So what if I can’t share my raw video data? Is data sharing completely out of the question then? I know we like simple distinctions (think significant vs. non-significant), but open data, for example, comes in more shades than just black and white. It is possible to share everything from raw data, through derived data (in my own work, for example, I collect video recordings, which I often cannot share, but I can share annotations, summary statistics, etc.), to metadata (data about the data: where the videos were recorded, for what purpose, with what equipment, …). But it is also possible to keep the parts of this pipeline that contain sensitive information (here: the videos, and names and personal information in the annotations) private and share only the cleaned annotations. This is exactly what the CHILDES database of child language has been doing since 1984, and it is still a key resource for understanding language acquisition. In other words, even the transcripts alone are incredibly useful to researchers when the video or audio recordings cannot be shared.
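To make this gradient concrete, here is a minimal sketch in Python (with pandas) of how one might derive a shareable table and per-participant summaries from raw, sensitive annotations. The file and column names are made up for illustration and will of course differ for your own coding scheme.

    import pandas as pd

    # Hypothetical raw annotation export from a video coding tool; the column
    # names (child_name, birth_date, participant_code, utterance, ...) are
    # invented for this example.
    raw = pd.read_csv("annotations_raw.csv")

    # Derived, shareable version: drop directly identifying columns and keep
    # only the coded content plus an anonymous participant ID.
    shareable = (
        raw.drop(columns=["child_name", "birth_date", "address"])
           .rename(columns={"participant_code": "anon_id"})
    )
    shareable.to_csv("annotations_shareable.csv", index=False)

    # Summary statistics can often be shared even when the annotations themselves cannot.
    summary = shareable.groupby("anon_id")["utterance"].count().rename("n_utterances")
    summary.to_csv("per_participant_summary.csv")

The point is not this particular script, but that each step of the pipeline (raw, derived, summary, metadata) is a separate sharing decision.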

Hearing that you don’t have to go all in with data sharing can, in my experience, take a great burden off researchers’ shoulders. Especially when working with sensitive data, it is important to stress that open science does not have to mean access to everything for everyone. We also need to consider our participants’ rights and our obligation to behave ethically. 

I would say such gradedness exists for most open science practices. Think of preregistration: you can preregister before data collection (as I did here), or write a registered report – the two (usually) differ quite a bit in their level of detail and commitment. Or you can preregister before data analysis (as done here). Or you can preregister twice, both before data collection and before data analysis, for example when you need to amend something because you realized you knew less about your data than you thought (like how many participants from a certain population you could recruit, or whether children would make it through your wonderfully balanced but rather long experiment). We discuss the topic in more depth in this preregistration primer (open access version), which your favorite bloggers co-wrote with the fantastic Naomi Havron.

Buffet rule 2: You don’t have to try everything at once

That’s basically the heart of the buffet approach: you eat what you like and feel like eating, but you don’t have to go for everything that is on the table. So if, for a specific project, you want to share materials and data but did not preregister (maybe because it is an exploratory study, or because you simply lacked time and resources), that’s already a great step forward. Open science practices are not all or none: you can pick and choose, mix and match, and do what’s most suitable for your career stage, project, lab, and level of support.

I think of trying too many things at once as stuffing yourself: you don’t get to enjoy the individual dishes, and it won’t feel good. I’ve made this mistake myself, both at buffets and with open science. For preregistration, this post shares what I learned during my first forays – which was a lot, because I made a lot of mistakes. I also think that my past data sharing efforts could have been better, because I didn’t have the time to really think through what other researchers might find useful. That happened because I tried to do too much at once with little support (which, by the way, has increased a lot in the past years – but still, sharing data is hard to do right; see the next rule…).

This aspect of the buffet approach might be particularly useful for those who feel overwhelmed by the host of new things that seem to become the norm faster than you can read up on them. Just stick with what you know, but try to sample one new dish (= practice) on every visit (= paper; thanks to the wonderful Elika Bergelson for that very practical suggestion). By picking one new practice and figuring it out properly, instead of scrambling to hit a number of targets at once, it is much easier to do open science well.

Buffet rule 3: Label everything

For the graded, needs- and skills-based open science approach to work, we need documentation. Only this way can we be transparent about the steps we did and did not take, and about the reasoning behind the decisions that led to the final product – usually a paper. Think of labels that list all the ingredients of each dish, or at least whether it is compatible with specific diets and allergies. Say you’re vegan: then you want to avoid the brioche, but you can go for the baguette.

The need for documentation has two sides: one is institutional, and one rests on the lab’s shoulders.

On the institutional side, we need to know what each open science practice should entail for it to be useful, and such guides, in my opinion, do not currently exist for most use cases. Funders, for example, mandate Data Management Plans and therein require that you follow field-specific standards. That’s a bit circular, because you already have to know the standards (sometimes a question of having been at the right conference or knowing the right people), so it can become an in-group/out-group thing. Much better would be, for example, a link to known and reviewed field-specific standards. I know of BIDS for neuroscientific data and Psych-DS for behavioral human data, but who knows what other communities use? Should we cross-reference with anthropology? And what about interdisciplinary research…? In short, I usually end up with more questions after reading guides to open science practices.

Now, what can we still do at the individual or lab level? We do need to know exactly what we’ve done. An emerging focus on good documentation is probably the most useful thing coming out of many open science practices. Whether this documentation is formally preregistered, added to openly shared or privately (securely, I hope) archived materials and data, and/or available as a commit history is, for me, secondary to the key change in our scientific habits: that we no longer focus on the “end product” – a paper or thesis – but on the process.

Documentation, like commenting code and describing data, does not have to be a lonely task. As a lab or community, it’s probably a good idea to develop standards or templates, not just for data but for all aspects of a study. As a consequence, the same well-documented code runs on all datasets and can be re-used together with its documentation and metadata (because column names and content will also be the same). So we need to stop reinventing the wheel so much…
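As a minimal sketch of what such a shared template can buy you – the column names below are hypothetical, not an existing lab standard – a small check like this lets the same documented analysis code run on any dataset that follows the template:

    import pandas as pd

    # Hypothetical lab-wide template: every dataset is expected to use these
    # column names, so the same analysis script and documentation apply to all.
    TEMPLATE_COLUMNS = ["anon_id", "age_months", "condition", "looking_time"]

    def load_with_template_check(path):
        """Load a dataset and report deviations from the shared template."""
        df = pd.read_csv(path)
        missing = set(TEMPLATE_COLUMNS) - set(df.columns)
        extra = set(df.columns) - set(TEMPLATE_COLUMNS)
        if missing:
            print(f"{path}: missing template columns {sorted(missing)}")
        if extra:
            print(f"{path}: extra, undocumented columns {sorted(extra)}")
        return df

    # The same documented script can then be pointed at any study in the lab, e.g.:
    # df = load_with_template_check("study1_looking_times.csv")

The check itself is trivial; the value lies in agreeing on the template once, documenting it once, and then re-using both.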

Looking at my own papers, I would say none of them is perfect. Some are great examples where I think the authorship team did a lot of things right, for example ManyBabies 1: Infant-directed Speech Preference. We preregistered, and we shared (derived and anonymized!) data, scripts, and even walkthrough videos to make the procedure more transparent. But even with so much effort, we still find gaps in our documentation (recently fixed: the stimulus files were not all in the same folder for some reason).

But is the buffet approach the right way?

Some might worry that allowing researchers to sample and take their time means we do not improve at all, and that questionable research practices – such as HARKing, p-hacking, or even outright falsification to comply with current incentives – will continue. I have three responses to this worry. First, incentives are (admittedly slowly) changing, as shown by this collection of job ads requiring open science statements, updated open science requirements from key funders such as the ERC, and the Recognition and Rewards initiative in the Netherlands. With such incentives in place, the move towards changed practices will not grind to a halt just because not everyone jumps on board right away and with all they have.

Second, one-size-fits-all approaches basically never work (happy to hear your counterexample that is not breathing air…). For starters, as I mentioned before, not all researchers operate within the specific theory-testing framework for which many of these practices were developed, or generate the type of data that is easily shareable. Asking them to squeeze themselves into a mold might actually harm science, as diverse approaches are beneficial if we care about expanding knowledge.

Third, new targets to game (i.e., replacing citation counts and significant p-values with open science badges without any quality control) might make the situation even worse. Indeed, “open-washing” is a new term for innocently or deliberately mimicking open science practices without actually increasing transparency.

For me, the goal of open science practices such as open data and preregistration is to keep ourselves honest. This does not mean that certain practices, such as data exploration, have to stop – but every decision, not just the final outcome, should be clear to everyone. Storing everything in the experimenter’s head is not very efficient because, let’s face it, even the experimenter is only human and will forget things or change the story in their mind. You’ll notice this when you try to dig up old data and figure out which of the many versions you saved was the one the paper was actually based on. So, be nice to future you and start somewhere…

Image credit: Igor Ovsyannykov via pexels.com, remixed with Open Science Badges retrieved from OSF.io

License: CC BY-SA 4.0

2019 – What a year!

Snippets from the cover images of the top 5 posts of 2019

As you might have noted, this blog was rather quiet in 2019, and for good reason: It was quite the exciting year for your two favorite bloggers, so exciting in fact that even our traditional review post is a month late. We’ve already shared the major changes on social media, and we’ll tell you here as well why our lives kept us from blogging:


A Beginner’s Guide to Conferencing

It was just over 10 years ago that I was preparing to attend my first conference, a workshop in the very pretty Dutch city of Groningen. I presented preliminary data from my Master’s thesis as a poster, and was appropriately nervous and excited. Just a few months later, I even travelled overseas to Boston for another poster presentation.

Looking back, I realize that there are many aspects of going to conferences that nobody thought to explain to me. As a consequence, I had to learn to swim pretty fast, and in lukewarm to icy water. So, to make life a little easier for future generations, here are a few questions I remember having before attending my first conference, and a bunch of “conference hacks” I learned along the way. Of course, they are all based on my personal experiences and viewpoints; depending on your field and personality, your mileage may vary, as they say.


Repost – A scientist’s path to a non-academic job: Tips and tricks

Dr. Marlieke van Kesteren is a neuroscientist with a pretty impressive CV. She discusses science and life on Twitter and in various blog posts. This one is particularly relevant for anyone looking at the academic job market, so Marlieke kindly allowed us to share it on this blog.


Deal or no deal: Brexit hurts scientists

Two and a half years ago, just days after the referendum, we asked colleagues to share their thoughts on Brexit. Back then the heartbreak and shock were fresh, but the far-reaching consequences became apparent quite quickly. Those who hoped that, given the very close result (I would not advise any policy maker to even consider making decisions based on such a study result, nor would we build theories on such weak and inconclusive grounds!), Article 50 would never be triggered were disappointed almost two years ago. That means the deadline for leaving the EU, with or without a deal, is almost here. We, as fans and beneficiaries of the EU, thus asked ourselves what it is like right now to be a researcher in the UK, and are grateful to our anonymous friend, Remaining researcher, who shares their story and viewpoint from within the UK. Our hearts go out to all our colleagues, who only stand to lose, be it with or without a deal…

2018 in Review

A happy 2019, dear CogTales reader! The time around the change of years is, as is now tradition, a time to look back to 2018, which was an exciting and busy year for your two bloggers, Sho and Christina! (This might also explain the slightly less frequent occurrence of posts, please excuse us, but we’re planning to share what we are learning here, of course).


What happens when you stand up to the big wigs? A follow-up interview with Anne Scheel

Two years ago, Team CogTales (Sho and Christina) interviewed Anne Scheel. We were impressed by how she stood up to ask a tough question after a keynote presentation at Germany’s largest psychology conference (the DGPs Kongress). Two years later, Christina and Anne met up at the next installment of the very same conference, and a lot has changed in that short time span. So it seemed like a perfect moment to catch up and take stock.

Screen, Baby – Let’s Look at Evidence, not Trends

Joint post by Nawal Abboub, PhD, & Sho Tsuji, PhD

French version here

Who hasn’t heard that “Children under two years of age should absolutely not be exposed to screen media, no matter what!” – maybe accompanied by the reasoning that “Screens will hinder the development of children’s intelligence.”

Why does the topic of the effect of screens on young children, especially with regard to their brain development, evoke so much controversy and fear? And should we actually think of all types of screen media as equal? What can scientific research teach us?

Data Visualization – the Why and How

Sho: I recently gave a talk on data visualization at the International Conference on Infant Studies (you can find my slides, along with the other wonderful talks on power, preregistration, and ethical data peeking, here on the OSF). I also played the German Cats and Dogs Scientist in the barbarplots campaign on better data visualization (on the same topic, see Article 1 and Article 2 on why bar and line plots hide differences in underlying distributions). In fact, being part of the barbarplots team was my entry point into thinking more about the importance of visualizing your data in a maximally informative and honest way. Informative means finding a good balance between simplifying/summarizing and showing the underlying data structure. Honest means not (accidentally) hiding important aspects of your data. Mahiko is my office mate and – something I discovered while preparing the above talk – an enthusiastic data visualizer. That’s why I asked him to put together our (mostly, his) favorite data viz resources.
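As a rough illustration of that “informative and honest” idea – using simulated data, not taken from any real study, and certainly not the only way to do it – plotting the individual observations together with a summary reveals distributional differences that a bar plot would hide:

    import numpy as np
    import matplotlib.pyplot as plt

    # Simulated data for two hypothetical conditions: similar means, very different shapes.
    rng = np.random.default_rng(42)
    cond_a = rng.normal(loc=10, scale=2, size=30)
    cond_b = np.concatenate([rng.normal(7, 1, 15), rng.normal(13, 1, 15)])  # bimodal

    fig, ax = plt.subplots()
    for i, values in enumerate([cond_a, cond_b]):
        # Show every observation (with a little horizontal jitter) ...
        x = np.full(values.size, float(i)) + rng.uniform(-0.08, 0.08, values.size)
        ax.scatter(x, values, alpha=0.6)
        # ... plus a summary (the mean), instead of a bar that hides the spread.
        ax.hlines(values.mean(), i - 0.2, i + 0.2, color="black")
    ax.set_xticks([0, 1])
    ax.set_xticklabels(["Condition A", "Condition B"])
    ax.set_ylabel("Outcome")
    plt.show()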
