I spent four years writing my thesis, four years of ups and downs, p values smaller than .05, but also some t values smaller than .05. At times, I felt confident and optimistic, at times less so – this was somewhat correlated with the p values, but not significantly so.
Then it came time to start gathering everything I did into an article thesis. At the Hebrew University, at least in my year, not all of these articles had to be published articles, and you could also include manuscripts you did not plan to submit for publication. I thought I would write up everything, even my null results, because whoever reads my dissertation (well, at least all those first-year PhD students and my grandfather) could probably learn just as much from my failed attempts as from my success stories.
For example, my supervisor and I tried to develop an online method to measure children’s lexical segmentation. We wanted to know what units they divide their input into – is it into words, like adults do, or into larger units, like “gimme” [give me]. I created a list of bigrams (two-word sequences) consisting of monosyllabic words that varied in their frequency of co-occurrence (how often they occur together). The second word was always the same – ‘li’ (“to me” in Hebrew) – but the first word changed – at times it was a word that is very likely to be followed by ‘li’ in normal speech, and at times it was a word that would never be followed by ‘li’. We asked children to listen to a string of nonsense syllables and press a key when they heard the “sound li” (we didn’t tell them there would be real words embedded in the stream). We hypothesized that when ‘li’ is part of a very cohesive chunk (e.g., “ten li” – Hebrew for “gimme”), children would have a hard time separating the word ‘li’ from the chunk, and would respond more slowly than when the word before ‘li’ is unrelated to it (e.g., “shan li” – “sleep to me*”). We did not find any sign that this was the case – children’s response times were not affected by the frequency of the two-word units. We had put so much work into this task: reading and thinking, planning, creating the lists of words and matching their frequency and their frequency of co-occurrence with ‘li’, recording, programming and running the task with children – it seemed to me that all this effort had to go into the text.
Well-intentioned people then explained to me that unlike an MA thesis, which is an exercise in science-making, a dissertation will be judged like any manuscript submitted to a scientific journal. That meant that reviewers didn’t care about my failed attempts; they wanted to hear about significant effects. Null results mean I did not plan my experiments well enough, did not think everything through, was basically not a good scientist – and I had better not mention them. So that’s what I did: generally, I did not write chapters about these studies, and this meant that even my grandfather would never be able to learn from my failures, let alone any first-year PhD student.
However, I did find a nice way to mention (some of) them somewhere in my dissertation. In the discussion, I wrote a little sub-section about the limitations of my studies, where I also explained that I tried several other methods, but that these were unsuccessful. Most of all, I wrote these paragraphs for myself. My supervisor and I had spent a long time developing these methods, and had some clear ideas about why they didn’t work. For example, with the ‘li’ task, I thought the task might work better if the two-word units were embedded in real sentences rather than in strings of nonsense syllables. I thought my readers could learn from experiments that didn’t work – but I also knew my grandfather was not planning on doing any studies on language processing in the near future (he is a biologist). After submitting my dissertation, I was happy to move on to an exciting post-doc project and put my student discounts behind me for a chance to do something completely different.
I had not thought about the fact that my dissertation would have a few more readers – my reviewers. And one of these reviewers made me feel that failures were nothing to be ashamed of, nothing we should hide. He commented that the fact that I wrote about these failures, and discussed the limitations of my project openly, was primarily an index of my integrity as a scholar. This made me so proud of the choice to write about these failed experiments, and about the limitations of my studies, that I was not disappointed at all when my grandfather said he had read “most of” the thesis. I wrote a good piece of text there, I made a great effort, and my effort (not only my successes) was appreciated by someone other than my supervisor and my grandfather. Appreciated by someone – as far as I know – not related to me.
I personally feel that every experiment that was well thought out and well performed should be made public somehow, either in a repository (e.g., the Open Science Framework) or, if feasible, as a publication (for example in PLOS ONE). I still hope to be able to publish one of my “failed attempts” that way (it is another study that I did not write about here). When it is not feasible or cost-effective to do so, then finding a way to mention it in the thesis – for example, in the discussion – is also a possibility. For the ‘li’ study, I feel like this was a good option. Maybe I am wrong, but I imagine that the main audience that could take interest in a new method that didn’t work is the first-year PhD students who will read my thesis when trying to come up with their own new methods to measure online lexical segmentation.