
What is (good) science?


Part 1: My quest for science

26.04.2019
My latest article has raised a discussion on philosophy and methodology of cognitive science among the reviewers and editor. I use this opportunity to continue my own pondering on what science—and especially good science—is.

My latest piece of work has finally been published in Cognitive Processing [1]! And the icing on the cake: this paper raised a debate in the form of opinion papers by the handling editor and the reviewers with an invitation to the whole readership to participate!

The longer story: after I had gone through the first review round ("major revisions"), one reviewer evaluated my paper with "accept", the other with "reject"—an uncomfortable situation for an editor. Don Ross, the handling editor, therefore asked a third reviewer, who answered along the lines "it depends on what the journal wants". Consequently, the editors of Cognitive Processing started to discuss what the journal wanted, and decided that this discussion was too interesting to have it behind closed doors. The result are three discussion papers by Don Ross and the disagreeing reviewers Konstantinos Katsikopoulos and Marc Jekel in the current issue of Cognitive Processing.

All this brings me back to a question that has been haunting me for more than seven years: What is good science? I was asked this question when I applied as an assistant professor in Tübingen and I was shocked that I could not give any good answer (I suppose other candidates did not do much better, because I got the job in the end). I was applying for a position as a scientist and could not define what good science is. So what was I doing all day long? How could I know whether I was doing a good job? And the more I thought about it: I could not even define what science is, let alone a good one.

After years of pondering and reading, I am relieved to know that I am not alone and I may never be satisfied with any answer I come up with. What I am sure about is that a good scientist should think about this question now and then. My last major attempt ended in a rainbow course I gave at the Interdisciplinary College 2017 entitled "Really Bad Science". The approach was to turn the question around: if I don't know what good science is, maybe I can characterize bad science (and it mirrors my bad mood regarding science at that time). Some of the content was inspired by the wonderful journal club at Felix Wichmann's lab in Tübingen in the winter term 2016/17 with the telling title Everything is fucked.

I am very grateful to Cognitive Processing for publishing the debate my paper has initiated, showing that there may still be some hope for science. I use this opportunity to overhaul my concept of science, this time in a more positive and constructive light as a series of blog posts.

 

Part 2: The question of generality

30.04.2019
Generalization is one of the defining criteria of science—at least for me. In the discussion about my paper I discovered that not everybody shares this opinion. I try to understand the origins of these different viewpoints.

I have always assumed that the goal of science is generalization. The debate in Cognitive Processing reveals that this assumption is not generally accepted. This came as rather a shock for me, since I had never thought about questioning my assumption. So this is a good opportunity to do so.

The process of scientific work

Every human being performs experiments. When you try a new muffin recipe, you are experimenting. You may find that you like the recipe and will keep on using it, or you may discard it, or you may make notes such as "next time use less sugar". In the latter case, your experiment may lead to a whole series of experiments where you modify the recipe slightly in each step until you have your perfect muffin recipe. Depending on how accurately you measure your ingredients and how well you note down the changes in the recipe, you employ scientific methodology to some degree. How scientific your methodology is can be measured by how well others can reproduce your experiment, i.e. whether they obtain the same kind of muffins when they follow your notes.

Is this a useful approach? Yes. Is this science? No.

Why not? Well, if it were, everybody would be doing science, so there would be no need to pay scientists. Science is more than just performing one experiment after another. It starts when you extract general principles from your experiments that you can apply to a wider class of cases than you have experimented on. Let's say you have experimented with different cake recipes and found that for your taste you should use less sugar. So you could extract the general rule "in cake recipes use half the sugar as indicated in the original recipe". This rule will probably save you time in the future, because you can focus your experiments on the specifics of each cake, while not having to waste an experiment on the quantity of sugar. But you must also be aware that this rule is not completely backed up by empirical evidence, since you have only observed it on a handful of recipes. The moment you use a recipe written by someone who shares your taste preferences, the rule will be wrong. But this is what abstraction is about: you have to make reasonable assumptions to generalize your observations. This brings the risk of overgeneralizing, for the benefit of saving time in future experiments.

You can carry this process to more abstract levels. For example, you could note how other people like your cakes and also record properties of your participants. Over time and with a bit of statistics you might observe that dark-haired people prefer sweet cakes while red-haired ones love chocolate cakes. If this were the case and you wrote your findings down in a book, this would help others to find cake variations that suit their and their guests' taste. The more you abstract your findings, the more your rules would deserve to be called a theory.
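The step from raw observations to a rule can be sketched in a few lines of code. The tasting data and the "theory extraction" below are entirely invented for illustration, assuming we simply take the most frequent preference per group as our rule:

```python
from collections import Counter, defaultdict

# Purely invented toy data: (hair colour, preferred cake) per tasting participant.
observations = [
    ("dark", "sweet"), ("dark", "sweet"), ("dark", "chocolate"),
    ("red", "chocolate"), ("red", "chocolate"), ("red", "sweet"),
    ("dark", "sweet"), ("red", "chocolate"),
]

# Count preferences per hair colour.
counts = defaultdict(Counter)
for hair, cake in observations:
    counts[hair][cake] += 1

# The "theory" is the most frequent preference in each group -- an
# abstraction that discards the counterexamples in the raw data.
theory = {hair: prefs.most_common(1)[0][0] for hair, prefs in counts.items()}
print(theory)  # {'dark': 'sweet', 'red': 'chocolate'}
```

Note that the extracted rule is already an overgeneralization: two of the eight observations contradict it, which is exactly the trade-off described above.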

It is clear that the more you abstract, the less detailed your theory will become. While for your own personal taste you may develop the perfect muffin recipe, for a more general rule you may only be able to say that redheads tend to prefer chocolate cakes, but you may not be able to provide a recipe for a cake that every redhead in the world will like. To use your theory, others will have to make their own experiments to fit it to their specific needs. But instead of starting from scratch, your theory gives them a reasonable starting point, saving them time and effort.

Note also that the more you abstract, the more speculative your theory will be, since you have to add more assumptions to your original observations, or else you will have to perform far more experiments to give your theory the same empirical backup as your original less-sugar theory. Normally, both aspects come into play: for a more abstract theory you generate more data than for a narrow theory, but you will still never reach the same level of detail and empirical backup.

Taking all this together, the scientific process is an interplay of

  1. performing experiments to discover general principles,
  2. formulating theories,
  3. performing more experiments to fit theories to specific situations with the goal of testing and refining the theory.

The misunderstanding

This is my theory of science, but the reality looks different. I think only a small minority of scientists even thinks about creating theories, and this shows how broken the system is. Here are some reasons why theories have lost their appeal:
  • Confusion between experiments as a means and experiments as an end.
    Important for step 1 and 3 in my process is that scientists perform experiments in order to build and verify theories. Lots of other people perform experiments, too. They will use a theory as a short-cut to design the experiments for their specific purpose. As stated before, the more abstract a theory gets, the more experiments you need as a basis (or the more you have to speculate), and the more experiments you can think of for testing and refining your theory. So, naturally, the bulk of scientists' time has to go into experiments. What is more, for experiments you can establish a set of standards that you can teach young scientists, whereas theory building is more of an unordered pondering that is never trained or formalized (and that is probably as it should be). The combination of a lot of time spent on performing and publishing experiments compared to no training and little reward for theory construction, seems to lead many to believe that science is about isolated experiments. But what would be the value to society if science is about answering small, well-specified questions that nobody has asked?
  • The lack of courage.
    Abstraction is impossible without speculation. A theory that just explains one experiment is not a theory, and as soon as you abstract from several experiments, you have to add assumptions that were not tested explicitly in any of the experiments. Does this give scientists the right to pure speculation, without any empirical evidence? I would go so far as to say yes. For a theory to be used and accepted by other scientists and practitioners, it must be backed up with good reasons. One reason could be a solid empirical underpinning. Another could be an analogy to other fields, establishing a similarity of processes and thereby possibly a re-use of tools and thinking patterns. In a (hypothetical, because currently nonexistent) system of critical scientists and practitioners, people would use and nourish theories that work and abandon those that don't. The simple measure for practitioners would be whether the theory helps them guide their own experiments, and for scientists how well it fits observations and other theories. The problem is not speculative theories; the problem is people blindly continuing to follow wrong theories just because everybody else does. For an example I recommend Lakoff's wonderful book Women, Fire and Dangerous Things [1], where he disproves the philosophical view of objectivism with strong empirical evidence, while objectivism continues to be used as an implicit assumption in practically all of science. The scientific system encourages small steps. You make an experiment, do some statistics on the outcome, and write it all down in a paper. The process follows a certain pattern that any reviewer recognizes, and your paper will be published. If you develop a theory, you have to rely on the common sense and judgement of the reviewers. Sometimes this does work, but most reviewers are not used to building theories or using their judgement. They are trained in doing experiments and following that pattern like little science robots. A paper not following this pattern is very easily rejected. Add to this the publish-or-perish culture and you know why people are reluctant to develop abstract theories.
  • The impossibility of reflection.
    Since a theory cannot be mechanically derived from experimental results, it needs deep reflection. A theory has to grow in one's mind; you have to allow yourself to follow wrong paths, to reformulate a theory, to make assumptions that may or may not be established later on by experiments, until it all falls into place and feels right. But the luxury of reflection is only for hobby scientists or those who accept the end of their career as a consequence. I have seen publication lists with more than 24 publications per year, every year! That is a (published, not just written) paper every two weeks! Even assuming a big research group behind such people, just reading these papers (and that is strictly speaking less than you should invest to have the right to be listed as an author) together with other duties such as teaching and the acquisition of funds means that there cannot be much thinking going on (unless those people employ highly parallel thinking processes still to be discovered in experiments). But these are the publication lists you compete with in funding decisions and job applications. And the more decision makers are pressured by reporting criteria and time constraints, the more they opt for the easy choice of the higher numbers. So by a simple evolutionary process, people who take the time to think are eliminated from the system.
I think those are the main reasons why the scientific process stated above is not working. People are doing experiments, which is justified by steps 1 and 3, but leave out step 2, rendering steps 1 and 3 meaningless. Now and then people seem to feel the awkwardness of science not generalizing and a common argument is that it is just too difficult. Ross puts it like this:
In this teeming research ecosystem, prospects for a general theory of cognition must now be regarded as exceedingly remote. [...] In one respect, the picture of science, and cognitive science in particular, as an archipelago instead of a tower makes the Popperian attitude more plausible, since only relatively isolated theories can face the tribunal of experience straightforwardly. [2]
Of course we are far from a grand unifying theory. And as we have learned from Douglas Adams, the only answer such a theory might give is "42". Maybe the one and only theory to explain everything does not even exist. But that is a lame excuse for accepting isolated experiments as science. We may never establish one overarching theory, but the process of formulating smaller theories and along the way discovering the right questions to ask for such theories, is a valuable process, and the only purpose of science.

My own little theory

As an afterword I want to put some perspective on my own little "Unifying Model of Decision-Making". It is far from being a surprising, general new theory. The brave assumptions that abstract from single observations to a general mechanism had mostly been made by others before (see references in [3]). What I did was to combine three theories that had been formulated qualitatively, i.e. expressed in text, into a formal model that is expressed in an algorithm. I provide something similar to a software library of decision making: a general algorithm and some functions that can serve to fill in some details of that algorithm.
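To illustrate the software-library analogy (not the actual model from the paper, whose components are far more elaborate), here is a minimal sketch: a general decision loop with interchangeable plug-in functions. All names and the two example stopping rules are invented here:

```python
from typing import Callable, Sequence

def decide(options: Sequence[str],
           evaluate: Callable[[str], float],
           stop: Callable[[float], bool]) -> str:
    """General algorithm: evaluate options until the stopping rule fires."""
    best, best_score = options[0], evaluate(options[0])
    for option in options[1:]:
        if stop(best_score):          # plug-in decides when "good enough"
            break
        score = evaluate(option)
        if score > best_score:
            best, best_score = option, score
    return best

# Two interchangeable plug-ins: maximising never stops early, satisficing does.
maximise = lambda score: False
satisfice = lambda score: score >= 0.8

scores = {"muffin": 0.9, "brownie": 0.7, "cheesecake": 0.95}
print(decide(list(scores), scores.get, maximise))   # cheesecake
print(decide(list(scores), scores.get, satisfice))  # muffin
```

The point of the library structure is that the general loop stays fixed while the component functions are swapped to fit a specific situation, which is what "filling in the details of the algorithm" means here.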

Even this little step needed a lot of reading, pondering, coding, re-reading, re-pondering, re-coding, ... I am happy and proud to have done it, and grateful that this work was published, showing that my views on the scientific system may be a bit too gloomy after all. My first feeling after having come up with the model was "wow, this theory explains decision-making". The process of writing the paper and some sleepless nights were enough to convince me that the model actually raises and leaves unanswered more questions than it answers. But this is how it should be. Maybe a (general and informal) rule for recognizing good science is this:

A piece of good science raises more questions than it answers.
Maybe journals and conferences should add this review criterion: "What is the proportion of questions raised to questions answered by this paper?"

 

Part 3: How to define science

17.05.2019
The question of good scientific methodology is interwoven with the definition of science. I discuss traditional definitions of science and show how the superficial application of principles and methods leads to the justification of inappropriate methodology and narrow research questions.

Defining science is not just a philosophical exercise. On the one hand it is a societal question of attributing authority and trust to scientific findings. On the other hand it is a question of scientific methodology: only work that adheres to accepted scientific methodology should be published in scientific journals. Methodological disagreement is almost standard in the reviews of my papers. In the debate on my latest paper in Cognitive Processing, Ross [1] points out the disagreement over scientific methods between the disciplines inside cognitive science. In the following I discuss different approaches to defining science in general, the wrong assumptions underlying such attempts, and a possible remedy.

Dictionary definitions

Dictionaries and encyclopedias are usually a good starting point for definitions:

  • Collins Free Online Dictionary: Science is the study of the nature and behaviour of natural things and the knowledge that we obtain about them.
  • Cambridge Dictionary: (knowledge from) the careful study of the structure and behaviour of the physical world, especially by watching, measuring, and doing experiments, and the development of theories to describe the results of these activities
  • Oxford Living Dictionaries: The intellectual and practical activity encompassing the systematic study of the structure and behaviour of the physical and natural world through observation and experiment.
  • Wikipedia: Science (from the Latin word scientia, meaning "knowledge") is a systematic enterprise that builds and organizes knowledge in the form of testable explanations and predictions about the universe.

All definitions agree that science has to do with obtaining knowledge, possibly also the knowledge itself, as the Latin origin suggests. And this knowledge is about natural and/or physical things in the universe. The German understanding of scientific disciplines is wider, as we include the humanities (Geistes-Wissenschaften):

  • Duden: research activity in a particular field that produces (well-founded, ordered, reliably established) knowledge
  • Wikipedia: The word Wissenschaft (Middle High German wizzen[t]schaft = knowledge, foreknowledge; for Latin scientia) denotes the totality of human knowledge, insights and experiences of an epoch, which is systematically extended, collected, preserved, taught and handed down.

The German dictionaries say nothing about the subject of the knowledge to be gathered. They offer more alternatives concerning the method: it has to be systematic, justified and orderly, and it includes gathering, storing and teaching things in order to obtain knowledge, while the English definitions mention observation and experiment.

Popper's legacy

For the moment, let us take the narrow view that we have to obtain knowledge by observation and experiment.

The most famous philosopher of science among scientists is certainly Sir Karl Popper (1902–1994). One reason for this is that Popper was particularly concerned to demarcate science from other activities, especially from 'pseudo-sciences' that try to illegitimately borrow the epistemic authority that is institutionally conferred upon scientists.[1]

In brief, Popper's theory says that

  • The scientific objective is to build theories.
  • A theory must be formulated in a way that it is falsifiable by experiments.
  • A falsified theory must be abandoned or needs to be adapted to the new observations.

So take the statement "everybody with red hair loves chocolate cake". We could design an experiment where we invite red-haired participants to taste chocolate muffins and ask them whether they liked them. If any of the participants answers that they did not, the statement would be proven wrong. We would either have to find a completely new hypothesis on the relation of hair color and cake preferences or adjust our statement to something like "the majority of people with red hair love chocolate cake".
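The falsification logic of this example can be written down directly. The tasting data and all names below are invented for illustration:

```python
def falsified(hypothesis, observations):
    """Return the first counterexample to a universal hypothesis, or None."""
    for obs in observations:
        if not hypothesis(obs):
            return obs
    return None

# Hypothetical tasting results: (hair colour, liked the chocolate muffin?)
tastings = [("red", True), ("red", True), ("red", False)]

# Universal claim: "everybody with red hair loves chocolate cake".
claim = lambda obs: obs[1]

counterexample = falsified(claim, tastings)
print(counterexample)  # ('red', False) -- one disliking participant falsifies the claim
```

The asymmetry Popper stresses is visible in the code: no number of `True` entries can prove the universal claim, but a single `False` entry refutes it.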

Popper's theory is basically an attempt to overcome the confirmation bias—a natural human tendency to look for confirmation of one's beliefs. By forbidding the use of experiments as confirmation of a theory, Popper forces scientists to think about ways of dismantling their own theories.

I will not repeat the many criticisms that have been raised against Popper's argument. More important is its influence on current scientific practices:

Notwithstanding the complexities involved in what preoccupies philosophers, namely trying to exactly state a viable formulation of falsificationism, contemporary scientists often defend and impress upon their students a general ‘Popperian attitude’ that is unarguably a core part of scientific culture. Setting aside pure mathematics as a domain that receives its own extensive philosophical consideration, theory or speculation that floats free of empirical implications is not scientific.[1]

And here is the problem: I think this general 'Popperian attitude' has focused on the wrong part of Popper's argument, (mis-)using it to justify narrow research questions and answers by idealized experimental settings.

A more useful (and, I think, truer to Popper's intentions) approach is teaching students to be critical of their own work. We all get carried away by our ideas and try to see the world fitting them. We need to train ourselves to see the world through other people's eyes and to find faults with our own ideas. In my time at the university I observed extremely intelligent students producing mediocre theses, and I wondered what you need to produce excellent ones. The really successful students were constantly questioning their own work (without being shy or unsure about their abilities) and were happy to receive criticism from others. This attitude requires self-awareness, confidence and courage. Of course this is much harder to teach than mechanical procedures of hypothesis testing.

Science as a radial category

The dictionary definitions and Popper all treat science like a mathematical set, trying to provide conditions of what belongs into this set. The elements of the set can either be theories (as the knowledge contained in science) or methods (as the way to obtain the knowledge).

Lakoff [2] shows (based on empirical evidence!) that human categorization differs from set theory in fundamental ways. One common representation of categories is the radial category, in which a concept is defined by a central element together with reasons that justify the inclusion of other elements, e.g. by shared properties.

Talk to different people and ask them to characterize a "cup". You will find that this simple concept is not as simple as you may have thought. Does a cup have a certain minimum and maximum size? Does a cup need a handle? If so, is a cup with a broken handle still a cup?

Instead of trying to find conditions that include objects in the set of cups, we can consider a radial category, starting with an uncontroversial typical cup: a small drinking vessel with a handle. Each of us may have a slightly different image in our minds of what such a cup would look like, but I suppose we would be able to find one cup that we could all agree is a cup. We can then accept as cups objects that share properties with our central notion: a large mug could still be called a cup, a bowl of the right size could be considered a cup, a tiny espresso cup misused as a container for paper-clips could still be a cup. They all have some (but not necessarily all) important aspects in common with our central idea of a cup: you can drink from them, you can somehow lift them to your mouth without getting burnt, they have the right form. But this does not mean that anything can be considered a cup. If my central cup concept is white and made of porcelain, this does not turn a white porcelain plate into a cup (as a side observation: in German you can call a small plate you put underneath a cup "Untertasse"—literally "under-cup", so we actually turn a plate into a cup by its function and fitting material, but not by its material alone).
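A radial category can be sketched as a simple data structure: a central prototype plus a graded membership score based on shared properties, instead of necessary-and-sufficient conditions. The property sets and the scoring rule below are invented for illustration:

```python
# Central prototype: the uncontroversial typical cup.
PROTOTYPE_CUP = {"drinkable", "liftable", "handle", "small", "porcelain"}

def membership(properties, prototype=PROTOTYPE_CUP):
    """Fraction of prototype properties the object shares (graded, not yes/no)."""
    return len(properties & prototype) / len(prototype)

mug = {"drinkable", "liftable", "handle", "large", "ceramic"}
bowl = {"drinkable", "liftable", "small", "ceramic"}
plate = {"porcelain", "small"}

print(membership(mug))    # 0.6 -- not identical to the prototype, but clearly cup-like
print(membership(plate))  # 0.4 -- shares material and size, but is not a cup
```

The contrast with set theory is that membership is a matter of degree and of which properties are shared, so there is no single condition that cleanly separates cups from non-cups.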

Let's come back to science. What if we define science as a radial category? Philosophy of science clearly defines physics as the central element. Physics studies physical processes in the universe by building theories based on experiments. Physical processes are also active in substances and living beings, so chemistry and biology can be justified as science, too. Humans are living beings, and since they happen to be us, the study of humans seems important. Thus we can include medicine, psychology and neuroscience. Humans live in societies that have evolved over time, thus we include history, archaeology, political science, and social sciences of any kind. Societies have been shaped by human beliefs and thoughts, so we can include philosophy, theology, literature studies, etc. Getting back to the physical objects studied by physics, we also have an interest in exploiting this knowledge to build things. With this argument we can include all the engineering disciplines in science.

So by simple extensions of the type of knowledge we are interested in, we can easily derive any discipline that is usually taught and practiced at universities and call it science. But what about the methods? And here is the problem: at least the English definitions above, as well as the 'Popperian attitude' generally spread throughout science, take over the methods of physics, even though the subject matters of other disciplines differ greatly from the questions in physics. I am not the first to note this:

Lo and Mueller (2010) complain about how economics models cannot grapple with uncertainty which cannot be reduced to probability. The authors ‘blame’ an attitude of over-reliance on the modeling approach of physics and of the natural sciences more generally. Each discipline, while it should definitely be informed by ideas in other disciplines—remember Simon?—must ultimately develop its own models, catering to its own unique needs.[3]

The German definitions of science, which include the humanities, try to generalize the motivation of scientific methodology. The Duden says that the goal of science is to achieve justified, orderly and trusted knowledge. These are good reasons to base theories on experiments, as physics and related disciplines do. A structured argument in ethics, a documented literary review, or an ordered collection of language samples in a dictionary fulfills these claims in the context of other disciplines. Instead of badly copying methods from physics, we should rather try to understand the motivation behind the methods, copy the motivation, and use it to build reasonable methods for each discipline.

 

Part 4: Science and reality

30.05.2019
The public demand for science to produce absolute truths limits the topics and methods of science. Some fields are starting to liberate themselves from the standard objectivist claim of truth and I hope others, and the whole of society, are going to follow.

In the discussion in Cognitive Processing, Katsikopoulos [1] raises the important question of how to build models "for the wild". I wonder how many scientists have ever asked themselves whether their work is of any use outside their laboratory. I suspect that most researchers are primarily concerned with doing things "right" by using "correct" methodology. This attitude comes not only from the pressure of publishing and acquiring funds; it is also connected with the almost religious reverence for science in public opinion. The expectation is that science delivers the Truth. It is time our society thought about what we can realistically know about the world. I am sure that philosophers have dealt with this topic at length. But since I am not into philosophy, I present my own naive view.

My impression is that most people hold an implicit belief in objectivism [2], especially in the detail that things are objectively true or false. As limited humans we may not be able to differentiate between truth and falsehood at any time, but the belief goes that there is an underlying truth in the world, and we (especially scientists) have to find it.

I have only encountered one complete truth in this world: Nothing is completely true or false.

With a little thought about our everyday life, I think most people would agree with me. I am fascinated by the dual nature of our thinking: on the one hand we cope with uncertainty and ambiguity all day long without even noticing, but when we plug into the more "conscious" thinking mode we start to make unrealistic simplifications. Maybe this is just how our brains work.

But it becomes a problem for science. With the high expectations of delivering only ultimate truths, research focuses on questions that are so tiny and/or artificial that they can indeed be (relatively) well defined and measured. In this way science builds itself an artificial world of imaginary research questions. With the exception of mathematics, whose only purpose is to build artificial worlds with imaginary rules and questions, science makes itself meaningless.

An especially sad example is linguistics. Language is for me one of the most fascinating objects of study, from the development of languages to their structure, and of course how to extract meaning from a sequence of words. Sadly, (computational) linguistics has been stuck on Chomsky's assumption that natural languages are the same as formal languages. Recently I attended a lecture on computational linguistics. After it had been explained how the meaning of sentences is defined by truth values, someone in the audience asked "What about sentences that have no truth value?". The presenter answered calmly and smilingly "Linguists disregard them".

And this is the reason why the practical approach of language processing by Google, Apple, etc. is so much more effective than the language theories of the last 50 years. Statistical methods for machine translation tell us nothing about how language works, so from a scientific perspective they are almost useless. But from a practical perspective they are extremely useful. What makes me sad is that I think this level of precision in speech-to-text or automatic translation systems can also be achieved by more informed methods than statistics, but now that statistics works so well it is hard to justify basic research into more interesting methods.

My gloomy view on science is of course limited by my own area of research. Katsikopoulos [1] describes how the field of operations research has found a way to allow less specific, but more realistic models that are useful in real life. This gives me some hope that other communities might follow this example and scientists get the freedom to explore things that are really relevant.

But this is not just a problem of science, it is a problem of our culture. We must learn to recognize that the world is complicated (which, by the way, also makes it interesting) and deal with it accordingly. Instead of trying to predict the future based on unrealistic models, we had better recognize the implicit limitations of understanding imposed by our environment. Luckily, Gerd Gigerenzer and his collaborators [3] have started to draw attention to the fact that we mostly deal with large-world situations, where we have limited information, and to how simple decision methods work better in such situations than "clean" mathematical models. I hope that over time this view will be generally adopted, both in the public view and in science.
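One of the simple heuristics studied in this tradition is "take-the-best": compare two options cue by cue, in order of cue validity, and decide on the first cue that discriminates, ignoring all the rest. The sketch below is a minimal illustration with invented cities and cue values, not the exact formulation from the literature:

```python
def take_the_best(a, b, cues):
    """Pick the option favoured by the first discriminating cue.
    cues: functions returning a boolean cue value for an option,
    ordered from most to least valid."""
    for cue in cues:
        va, vb = cue(a), cue(b)
        if va != vb:                 # first discriminating cue decides
            return a if va else b
    return a                         # no cue discriminates: guess (here: first)

# Question: which of two (fictitious) cities is larger?
cities = {
    "Adorf":  {"capital": False, "airport": True},
    "Bedorf": {"capital": False, "airport": False},
}
cues = [
    lambda c: cities[c]["capital"],   # most valid cue first
    lambda c: cities[c]["airport"],
]
print(take_the_best("Adorf", "Bedorf", cues))  # Adorf -- decided by the airport cue
```

Unlike a regression over all cues, the heuristic needs no weighting and ignores most of the available information, which is exactly why it can remain robust when information is limited.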

Postscript: Interdisciplinarity

Katsikopoulos identifies the lack of interdisciplinary work (and respective diversity of methods) as a reason for the bind in cognitive science:

If we cognitive scientists today take seriously Herbert Simon’s example of studying decision making in an interdisciplinary way, then we should also try to do the same and move outside of our comfort zone.[1]

I have been lucky to work in interdisciplinary environments for much of my scientific career and this is a good place to thank the institutions that made it possible (in order of appearance):
