Peer review in public: Rise of the cyber-bullies?


Originally published at physicsfocus.

A week ago in a news article in Science – and along with my colleagues and collaborators, Julian Stirling and Raphael Levy – I was accused of being a cyber-bully. This, as you might imagine, was not a particularly pleasant accusation to face. Shortly following publication of the piece in Science, one of the most popular and influential science bloggers on the web, Neuroskeptic, wrote an insightful and balanced blog post on what might be best described as the psychology underpinning the accusation. This prompted a flow of tweets from the Twitterati…


As one of the scientists at the eye of the storm, I wanted to take some time to explain in this blog post just how this unfortunate and distressing situation (for all involved) arose because it has very important implications for the future of peer review. I’ll try to do this as dispassionately and magnanimously as possible, but I fully realise that I’m hardly a disinterested party.

The science and the censure

The back-story to the claim of cyber-bullying is lengthy and lively. It spans almost 30 published papers (very many in the top tier of scientific journals – see the list here), repeated refusals to provide raw data and samples to back up those published claims, apathetic journal editors (when it comes to correcting the scientific record), strong public criticism of the research from a PhD student initially involved in the contested work, years of traditional peer review before a critique could make it into the literature, a bevy of blog posts, a raft of tweets, and, most recently, the heaviest volume of comments on a PubPeer paper to date.

For those of you who have the stamina to follow the entire, exhausting story, Raphael Levy has recently put together a couple of very helpful compendia of blog posts and articles. I’ve given myself the challenge here at physicsfocus of condensing all of that web traffic down into a short(-ish) Q&A to provide a summary of the controversy and to address the questions that crop up repeatedly. As a case study in post-publication peer review (PPPR), there is an awful lot to learn from this controversy.

Q. What scientific results are being challenged?

In 2004, Francesco Stellacci and co-workers published a paper in Nature Materials in which they interpreted scanning tunnelling microscopy (STM) images of nanoparticles covered with two different types of molecule as showing evidence for stripes in the molecular ‘shell’. They followed this paper up with a large number of other well-cited publications which built on the claim of stripes to argue that, for example, the (bio)chemistry and charge transport properties of the particles are strongly affected by the striped morphology.

Q. How has the work been criticised?

In a nutshell, the key criticism is that imaging artefacts have been interpreted as molecular features.

In slightly more detail…

  • The stripes in the images arise from a variety of artefacts due to poor experimental protocols and inappropriate data processing/analysis.
  • The strikingly clear images of stripes seen in the early work are irreproducible (both by Stellacci’s group and their collaborators) when the STM is set up and used correctly.
  • The data are cherry-picked; there is a lack of appropriate control samples; noise has been misinterpreted, and there is a high degree of observer bias throughout.
  • Experimental uncertainties and error bars are estimated and treated incorrectly, from which erroneous conclusions are reached.

That’s still only a potted summary. For all of the gory detail, it’s best to take a look at a paper we submitted to PLOS ONE at the end of last year, and uploaded at the same time to the Condensed Matter arXiv and to PubPeer.

Q. …but that’s just your opinion. You, Levy, and Stirling could be wrong. Indeed, didn’t leading STM groups co-author papers with Francesco Stellacci last year? Don’t their results support the earlier work?

First, I am not for one second suggesting that I don’t get things wrong sometimes. Indeed, we had to retract a paper from Chem. Comm. last year when we found that the data suffered from an error in the calibration of the oscillation amplitude of a scanning probe sensor. Embarrassing and painful, yes, but it had to be done: errare humanum est sed perseverare diabolicum.

The bedrock of science is data and evidence, however, not opinion (although, as Neuroskeptic highlighted, the interpretation of data is often not cut-and-dried). It took us many months to acquire (some of) the raw data for the early striped nanoparticle work from the authors, but when it finally arrived, it incontrovertibly showed that STM data in the original work suffered from extreme feedback loop instabilities which are very well-known to produce stripes aligned with the (slow) scan direction. This is exactly what is seen in this (from the very first paper on striped nanoparticles):


What is remarkable is that Francesco Stellacci’s work with those leading STM groups last year not only doesn’t support the earlier data/analysis, it clearly shows that images like that above can’t be reproduced when the experiment is done correctly. (Note that I contacted those groups by e-mail more than a week in advance of writing this post. They did not respond.)

But that’s more than enough science for now. The technical aspects of the science aren’t the focus of this post (because they’ve been covered at tedious length previously).

Q. Why do you care? For that matter, why the heck should I care?

I care because the flaws in the striped nanoparticle work mislead other researchers who may not have a background in STM and scanning-probe techniques. I care because funding of clearly flawed work diverts limited resources away from more deserving science. I care because errors in the scientific record should not stand uncorrected – this severely damages confidence in science. (If researchers in the field don’t correct those errors, who will?) And I care because a PhD student in the Stellacci research group was forced into the unfortunate position of having to act as a whistleblower.

If you’re a scientist (or, indeed, a researcher in any field of study), you should care because this case highlights severe deficiencies in the traditional scientific publishing and peer review systems. If you’re not, then you should care because, as a taxpayer, you’re paying for this stuff.

Q. But can’t you see that by repeatedly describing Francesco Stellacci’s work as “clearly flawed” online, he may well have a point about cyber-bullying?

Can I understand why Francesco might feel victimised? Yes. Can I empathise with him? Yes, to an extent. As a fellow scientist, I can entirely appreciate that our work tends to be a major component of our self-identity and, as Neuroskeptic explains, a challenge to our research can feel like a direct criticism of ourselves.

But as I said in response to the Science article, to describe criticism of publicly-funded research results published in the public domain as cyber-bullying is an insult to those who have had to endure true cyber-bullying. If public criticism of publicly-funded science is going to be labelled as cyber-bullying, then where do we draw the line? Should we get rid of Q&A sessions at scientific conferences? Should we have a moratorium on press releases and press conferences in case the work is challenged? Should scientists forgo social media entirely?

Q. Don’t you, Levy, and Stirling have better things to do with your time? Aren’t you just a little, ahem, obsessive about this?

Yes, we all have other things to do with our time. Julian recently submitted his thesis, had his viva voce examination, passed with flying colours, and is off to NIST in March to take up a postdoctoral position. Raphael was recently promoted and is ‘enjoying’ the additional work-load associated with his step up the career ladder. And I certainly could find other things to do.

I can only speak for myself here. I’ve already listed above a number of the many reasons why I care about this striped nanoparticle issue. If the work was restricted to one paper in an obscure journal that no-one had read then I might be rather less exercised. And I certainly don’t make a habit of critiquing other groups’ work in such forensic detail. (Nor have I got a particular axe to grind with Francesco – I have never met the man and am certainly not pursuing this in order to “tarnish his reputation”.)

But the striped nanoparticle ‘oeuvre’ is riddled with basic errors in STM imaging and analysis – errors that I wouldn’t expect to find in an undergraduate project report, let alone in Nature Publishing Group and American Chemical Society journals. This is why we won’t shut up about it! That this research has been published time and time again when there are gaping holes in the methodology, the data, and the analyses is a shocking indictment of the traditional peer review system.

Q. But then surely the best way to deal with this is through the journals, rather than scrapping it out online?

Raphael Levy spent more than three years getting a critique of the striped nanoparticle data into print before he started to blog about it. I’ve seen the exchange of e-mails with the editors for just one of the journals to which he submitted the critique – taken together, it runs to thirty pages (over ninety e-mails) over three years. While this was going on, other papers based on the same flawed data acquisition and analysis processes were regularly being published by Francesco and co-workers. There is no question that traditional peer review and the associated editorial processes failed very badly in this case.

But is PPPR via sites such as PubPeer the way forward? I have previously written about the importance of PPPR (in this article for Times Higher Education), and some of my heroes have similarly sung the praises of online peer review. I remain of the opinion that PPPR will continue to evolve such that it will be de rigueur for the next generation of scientists. However, the protracted and needlessly tortuous discussion of our paper over at PubPeer has made me realise that there’s an awful lot of important work left to do before we can credibly embed post-publication peer review in the scientific process.

Although PubPeer is an extremely important – indeed, I’d go so far as to say essential and inevitable – contribution to the evolution of the peer review system, the approach as it stands has its flaws. Moderation of comments is key, otherwise the discussion can rapidly descend into a series of ad hominem slurs (as we’re seeing in the comments thread for our paper). But even if those ad hominems are sifted out by a moderator, those with a vested interest in supporting a flawed piece of work – or, indeed, those who may want to attack a sound paper for reasons which may not be entirely scientific – can adopt a rather more subtle approach, as Peer 7 points out in response to a vociferous proponent of Stellacci et al.’s work:

“You are using a tactic[al] which is well known by online activists which consists of repeating again and again the same series of arguments. By doing so you discourage the reasonable debaters who do not have the time/energy to answer these same arguments every day. In the same time, you instil doubt in less knowledgeable people’s mind who could think that, considering the number of your claims, some might be at least partly true.”

Moderation to identify this type of ‘filibustering’ will not come cheap and it will not be easy – there will always be the issue of finding truly disinterested parties to act as moderators. A colleague (not at Nottingham, nor in the UK) who wishes to remain anonymous – the issue of online anonymity is certainly vexed – and who has been avidly following the striped nanoparticle debate at PubPeer, put it like this in an e-mail to me:

“The way this thing is panning out makes me actually more convinced that a blog is not a proper format for holding scientific debates. It might work to expose factually proven fraud. The peer-reviewed, one-argument-at-a-time format does one fundamental thing for the sanity of the conversation which is that it ‘truncates’ it. It serves the same purpose of the clock on politicians’ debates. And protects, at least to an extent the debater from Gish gallop[s]… and the simple denial techniques. Just because you cannot just say that somebody is wrong on a paper and get away with it. At least it is harder than on a blog.”

As I said in that Times Higher article, much of the infrastructure to enable well-moderated online commentary is in principle already in place for the traditional journal system. We need to be careful not to throw the baby out with the bathwater in our efforts to fix the peer review system: PPPR should be facilitated by the journals – in, of course, as open a fashion as possible – and embedded in their processes instead of existing in a parallel online universe. When it takes more than three years to get criticism of flawed research through traditional peer review channels, the journal system has to change.


P.S. The image we wanted to use for this post was this, which, as the Whovians amongst you will realise, would have rather neatly tied in with the title. The BBC refused permission to use the image. If they’re going to be like that, they’re not getting their Tardis back.

Image: Scientists online want your clothes, your boots and your motorcycle. Or maybe just to correct the scientific record. Credit: DarkGeometryStudios/Shutterstock