Monday, 27 February 2012

Embodied Cognition: How not to write a paper...or design an experiment...or analyse results or...

As part of my Embodied Cognition module, we have to practise reviewing papers and provide a recommendation on whether the article should be accepted, revised or rejected. If you're sad, like me, this is quite a fun task. Picking out methodological and conceptual errors is a great way to test whether you actually understand what you are reading. It is also an essential component of the scientific method: reviewing demands a rigorous inspection of the experimental design, the statistical analysis and the conclusions drawn. The particular paper (published in Brain Research) up for review this week was...

Scorolli, C., & Borghi, A.M. (2007). Sentence comprehension and action: Effector specific modulation of the motor system. Brain Research.
... unfortunately, they fell at the first hurdle and carried on tumbling. I thought this was a nice illustration of how not to conduct and write up an experiment and would provide a good template of things to avoid.

The paper tries to provide evidence for the weak form of embodied cognition, which basically argues that concepts are not transduced into amodal symbols but re-enacted in the same or similar modality specific systems used in perception and action. This is often demonstrated as bodily states influencing mental states.

The authors hypothesise that "reading sentences related to actions to be performed with different effectors (mouth and foot) [would] activate the same neural systems activated during the effective execution of these actions". In other words, the same neurons should be activated while understanding an action sentence as while performing an action with that specific effector. They therefore predict that participants will respond more quickly when the action in the sentence matches the response modality: mouth sentences such as 'sucking the sweet' will be recognised as meaningful more quickly than hand or foot sentences when responding with a microphone (mic), and likewise foot sentences when responding with a pedal.

There is actually a huge conceptual issue with this weaker form of embodied cognition, which still relies on representations and classical computation, but I shall leave that in the capable hands of Andrew and Sabrina. I shall focus on the more standard, bread-and-butter problems of the paper: bad methodology, bad analyses and a bad conclusion.

Think carefully about the design and implications for the necessary statistics to analyse it
According to my lecturer, the design the researchers chose was a '2x2x3 partially nested design'. Participants had two response modalities (mic vs pedal), and there were two between-participant blocks: one compared mouth vs hand sentences and the other foot vs hand sentences, each crossed with the two response modalities.

If this sounds complicated, it is; it's a joke. I was going to attempt to explain the methodology further, but the risk of boring myself and you is just not worth it, and the paper is free to marvel at here. Besides, the fact that the experiment is such a bloody pain to explain and understand serves to illustrate just how overly and unnecessarily complicated it is.

The design of an experiment is crucial, and if you get it wrong, no amount of complex post hoc analysis will cover it up. As a general rule, the simpler and more elegant the better; a ridiculously complicated design leads to ridiculously uninterpretable results.

Know your stats - p values, effect sizes, interactions and the difference between the differences
Faced with this convoluted design, the authors were forced to conduct a separate mixed-factor ANOVA for each block. Straight away, this meant they couldn't test any interaction across blocks and certainly couldn't directly compare results from the two ANOVAs.

The results, however, weren't too convincing. As expected, they found that mouth sentences were responded to more quickly than hand sentences when responding with the mic, but foot sentences were also quicker than hand sentences with the mic, which does not support their hypothesis. To justify their findings, the authors argued that "the marked difference between the effect sizes (p < 0.009 vs p < 0.05) confirms that the simulation is effector specific".

There is so much wrong with this statement it is almost comical.

The authors do not report the effect sizes, so we don't know what the difference between them is.

Instead they quote the difference in the p values which aren't defined (p less than...well how much less than!? We need the exact values.)

But don't worry, because even if we did have the exact values, they would tell us nothing about the difference in effect sizes. The p value tells us how likely we would be to see data at least this extreme if the null hypothesis were true; it tells us nothing about how large the effect is. In other words, we could have a 'really good' p value of < 0.0001 with a really small effect size; conversely, we could have a larger p value, say 0.05, with a large effect size.
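To make this concrete, here is a minimal sketch (the effect sizes and sample sizes are invented for illustration, not taken from the paper) showing that a tiny effect measured on a huge sample can yield a far smaller p value than a large effect measured on a small sample:

```python
import math

def p_two_tailed(z):
    # Two-tailed p value from a z statistic (normal approximation,
    # close enough to the t distribution for illustration).
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

def one_sample_test(cohens_d, n):
    # For a one-sample design, t = d * sqrt(n): the p value depends
    # on sample size just as much as on effect size.
    t = cohens_d * math.sqrt(n)
    return t, p_two_tailed(t)

# Tiny effect (d = 0.1) with a huge sample: a "really good" p value.
_, p_tiny = one_sample_test(0.1, 2000)   # p < 0.001

# Large effect (d = 0.8) with a small sample: a marginal p value.
_, p_large = one_sample_test(0.8, 10)    # p > 0.01
```

So the smaller p value here belongs to the smaller effect; p values and effect sizes simply aren't interchangeable.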

Most importantly, even if the authors had reported the effect sizes and they looked quite different, they could not conclude that there is a significant or 'marked difference' between them, because the results came from two separate ANOVAs. The authors have correctly found a significant difference between hand and mouth sentences (for mic) and between hand and foot sentences (for mic), but it is incorrect to claim a significant 'difference between the differences' without testing it. Any claimed difference requires its own analysis to show it did not simply occur by chance, and no such analysis was run here.
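What would that analysis look like? A minimal sketch (with invented summary statistics, since the paper reports none) of directly testing a 'difference between the differences' for two independent mean differences:

```python
import math

def p_two_tailed(z):
    # Two-tailed p value from a z statistic (normal approximation).
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

# Invented RT advantages (ms) over hand sentences, with standard errors.
diff_mouth, se_mouth = 30.0, 10.0   # mouth vs hand sentences (mic)
diff_foot,  se_foot  = 18.0,  9.0   # foot vs hand sentences (mic)

# Each difference is 'significant' on its own...
p_mouth = p_two_tailed(diff_mouth / se_mouth)   # p < 0.01
p_foot  = p_two_tailed(diff_foot / se_foot)     # p < 0.05

# ...but the difference between the differences is nowhere near it.
se_dd = math.sqrt(se_mouth**2 + se_foot**2)
p_dd = p_two_tailed((diff_mouth - diff_foot) / se_dd)  # p > 0.3
```

One effect at p < 0.01 and another at p < 0.05 tells you nothing about whether the two effects differ; that comparison needs its own test, which the two-ANOVA design never provided.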

Be critical and tentative when drawing conclusions from your results
Even in the most straightforward experiment, with large effect sizes, it is important to consider what else might be happening. It is commonplace to see "our results support our hypothesis but [insert potential confounds and limitations]". Results almost never prove a hypothesis.

Having found that half their results don't support their hypothesis, and following a botched attempt to salvage them, Scorolli and Borghi conclude that their results "support the view that the act of comprehending sentences leads to the creation of internal simulation of the action read".

No, they don't. At a basic level, half the results do not support the hypothesis, and there is no way to investigate further interactions, since the poor design forced the use of two separate ANOVAs.


There's so much more wrong with this paper, but I want to talk a bit about the implications of bad research. If anyone is interested, I welcome comments on further problems!

Back to Science and Faith

The main point of this blog was to give a few important pointers with regards to a scientific approach in psychology. But I do think there is another message here. I wrote a previous blog on the importance of faith in science: whilst (arguably?) scientists can be confident of the scientific method, in order to move forward we must have faith that others have successfully implemented it. This may sound obvious or even trivial, but I think it has stark implications. If research that makes errors basic enough to be trashed by a bunch of 3rd-year psych students can get published, how much other dodgy research is out there?

Of course, it isn't just the job of the journal (ahem, sorry, reviewers) to critique the paper; any scientist should be reading through skeptical lenses. But this just doesn't seem to happen. This particular paper has been cited 51 times and, after a brief gander, not by papers pointing out how crap it is. It's easy to see how a completely unscientific and unsupported model or theory could become popular to the point of becoming a given.

A quick caveat

I think it's important to note that the aim of a review is not to metaphorically (or literally) obliterate the article in front of you. As Robert Sternberg points out in his book, Reviewing Scientific Works in Psychology, whilst it is tempting, especially for young reviewers, to measure their own competence by how many faults they can find, a good reviewer balances the errors in a paper against the reasons the paper might be beneficial or insightful. This is termed the 'gatekeeper vs generative' mindset.

In other words, there is nothing impressive about pointing out the many faults in a paper if you fail to spot that, with minor or major revision, the paper actually has something very important, insightful, surprising or even paradigm-shifting to contribute. I also think it is much easier to spot the flaws in someone else's experiment than to critically design your own. This blog is not a quick-fix ego boost in the form of slating someone else's hard work.

Scorolli and Borghi make some classic schoolboy (or schoolgirl, rather) errors that even students should not make, let alone professional psychologists. What's worse, the paper isn't even particularly insightful or conceptually well grounded. I suppose at least it has served here to provide us with a few dos and don'ts in experimental design.

Oh and in case you're wondering the recommendation was a unanimous REJECT.


  1. When reviewing papers, I always try to find a path to publication for the paper, and have luckily only had to recommend rejection in a few cases. Only once, that I remember, was this because of an unsalvageable research design. More typically, it was because of a mismatch between an article's topic and the journal's scope, and even then I try to give suggestions for improvement.

    With only a few exceptions, I have also thought I was treated fairly by those who reviewed my article submissions. Exceptions include A) the overly aggressive nit-picker, who seems to think there is such a thing as a perfectly definitive experiment, and is intolerant of incremental work, B) the self-centered navel-gazer, who only comments on what he wishes your experiment would have been, and C) the speed reviewer, who either writes nothing of substance, or gives a list of things you did not address, all of which were clearly addressed.

    All in all though, as I said, my experiences have been very positive (even when frustrating). I have greatly appreciated the care most reviewers used in giving me feedback.

  2. I'm glad you have found your experience of the peer review process to be a positive one.

    Do you think there is a problem with a) poor research getting published in the first place, b) a lack of critical thinking when reading a paper, and c) sloppy secondary citing, such that unsupported trivia, claims, or even whole theories and models become the generally accepted norm? Or would you disagree?

    One recent (although relatively trivial) example is the commonly quoted '1 in 4 will get mental illness' here

    What shocked me was not so much that the Scorolli paper got published in the state it was in (I guess from time to time this will happen, and the whole point of the exercise was to have us review a problematic paper) but that it had been cited 51 times by people who clearly hadn't read it. Then they get cited, and so on and so on, and what started out as a crappy, misleading paper that should never have been published is now being cited regularly, to the point that it is assumed it must be correct.

    I'm speculating here, but if this kind of bad research took place in medicine the outcomes could be fatal (and at times it does, and they are; Wakefield and MMR springs to mind!). But I get the impression it happens a lot more in psychology, perhaps because the implications are less, well, deadly. But it's still bad science, and it just makes me wonder how many avoidable myths, misconceptions, errors and plain bad research papers are out there...

  3. "Do think there is a problem with...."

    Yes, yes, and yes.

    I'll take a look at the frontier article when I get a chance. I'd be thrilled to have 50 citations to any of my papers. Maybe I should copy them ;- )

    When the pressure to publish is high, and the outlets many, a lot will get through that shouldn't. That is not even taking into account the strong effects of academic politics and trendiness.
