From the peer-reviewed pages of Springer Nature…a theory more bonkers than a conference of Flat Earthers.

“Wow. Just wow. What the f**k?!?!”

That was the opening line of my e-mail reply to Ivan Oransky, MD, and co-founder of Retraction Watch, when I’d picked myself up off the floor after reading the paper he sent me earlier this week. Ivan wanted my reaction to…deep breath…“Development of a safe antiparasitic against scuticociliates (Miamiensis avidus) in olive flounders: new approach to reduce the toxicity of mebendazole by material remediation technology using full-overlapped gravitational field energy”, Parasitology Research https://doi.org/10.1007/s00436-018-6010-8 (2019).

That paper has now been retracted for reasons that will become very clear, very soon.

The more puzzling question is how the hell it got accepted in the first place.

Scroll down to page 5 of the paper (linked above) and find the section headed “Production of material remediated MBZ using full-overlapped gravitational field energy”. Actually, I’ll save you the bother. The section is reproduced in all its glory below. Are you sitting comfortably? Then we’ll begin…

[Image: SilkwormNonsense_1]

…wait, there’s more…

[Image: SilkwormNonsense_2]

And in case that didn’t make sense, there’s a helpful figure to explain everything:

[Image: SilkwormNonsense_3]

I haven’t read anything quite as superbly crackpot as this since Jordan Peterson’s “Maps Of Meaning”.

As Ivan discusses over at the Retraction Watch blog, this, um, seminal example of truly innovative scientific reasoning was submitted on March 18. The editors and reviewers then took four months to consider the paper. And subsequently accepted it for publication.

Peer review. The gold standard on which all of science stands or falls.

Sloppy Science: Still Someone Else’s Problem?

“The Somebody Else’s Problem field is much simpler and more effective, and what’s more can be run for over a hundred years on a single torch battery… An SEP is something we can’t see, or don’t see, or our brain doesn’t let us see, because we think that it’s somebody else’s problem…. The brain just edits it out, it’s like a blind spot”.

Douglas Adams (1952 – 2001) Life, The Universe, and Everything

The very first blog post I wrote (back in March 2013), for the Institute of Physics’ now sadly defunct physicsfocus project, was titled “Are Flaws in Peer Review Someone Else’s Problem?” and cited the passage above from the incomparable, and sadly missed, Mr. Adams. The post described the trials and tribulations my colleagues and I were experiencing at the time in trying to critique some seriously sloppy science, on the subject of ostensibly “striped” nanoparticles, that had been published in very high profile journals by a very high profile group. Not that I suspected it at the time of writing the post, but that particular saga ended up dragging on and on, involving a litany of frustrations in our attempts to correct the scientific record.

I’ve been put in mind of the stripy saga, and that six-year-old post, for a number of reasons lately. First, the most recent stripe-related paper from the group whose work we critiqued makes absolutely no mention of the debate and controversy. It’s as if our criticism never existed; the issues we raised, and the surrounding controversy, are simply ignored by that group in their most recent work.

More importantly, however, I have been following Ken Rice’s (and others’) heated exchange with the authors of a similarly fundamentally flawed paper very recently published in Scientific Reports [Oscillations of the baseline of solar magnetic field and solar irradiance on a millennial timescale, VV Zharkova, SJ Shepherd, SI Zharkov, and E Popova, Sci. Rep. 9 9197 (2019)]. Ken’s blog post on the matter is here, and the ever-expanding PubPeer thread (225 comments at the time of writing, and counting) is here. Michael Brown’s take-no-prisoners take-down tweets on the matter are also worth reading…

The debate made it into the pages — sorry, pixels — of The Independent a few days ago: “Journal to investigate controversial study claiming global temperature rise is due to Earth moving closer to Sun.”

Although the controversy in this case is related to physics happening on astronomically larger length scales than those at the heart of our stripy squabble, there are quite a number of parallels (and not just in terms of traffic to the PubPeer site and the tenor of the authors’ responses). Some of these are laid out in the following Tweet thread by Ken…

The Zharkova et al. paper makes fundamental errors that should never have passed through peer review. But then we all know that peer review is far from perfect. The question is what should happen to a paper that is not fraudulent but still makes it to publication containing misleadingly sloppy and/or incorrect science? Should it remain in the scientific record? Or should it be retracted?

It turns out that this is a much more contested issue than it might appear at first blush. For what it’s worth, I am firmly of the opinion that a paper containing fundamental errors in the science and/or based on mistakes due to clearly definable f**k-ups/corner-cutting in experimental procedure should be retracted. End of story. It is unfair on other researchers — and, I would argue, blatantly unethical in many cases — to leave a paper in the literature that is fundamentally flawed. (Note that even retracted papers continue to accrue citations.) It is also a massive waste of taxpayers’ money to fund new research based on flawed work.

Here’s one example of what I mean, taken from personal, and embarrassing, experience. I screwed up the calibration of a tuning fork sensor used in a set of atomic force microscopy experiments. We discovered this screw-up after publication of the paper that was based on measurements with that particular sensor. Should that paper have remained in the literature? Absolutely not.

Some, however, including my friend and colleague Mike Merrifield, who is also Head of School here and with whom I enjoy the ever-so-occasional spat, have a slightly different take on the question of retractions:

Mike and I discussed the Zharkova et al. controversy both briefly at tea break and via an e-mail exchange last week, and it seems that there are distinct cultural differences between different sub-fields of physics when it comes to correcting the scientific record. I put the Gedankenexperiment described below to Mike and asked him whether we should retract the Gedankenpaper. The particular scenario outlined in the following stems from an exchange I had with Alessandro Strumia a few months back, and subsequently with a number of my particle physicist colleagues (both at Nottingham and elsewhere), re. the so-called 750 GeV anomaly at CERN…

“Mike, let’s say that some of us from the Nanoscience Group go to the Diamond Light Source to do a series of experiments. We acquire a set of X-ray absorption spectra that are rather noisy because, as ever, the experiment didn’t bloody well work until the last day of beamtime and we had to pack our measurements into the final few hours. Our signal-to-noise ratio is poor but we decide to not only interpret a bump in a spectrum as a true peak, but to develop a sophisticated (and perhaps even compelling) theory to explain that “peak”. We publish the paper in a prestigious journal, because the theory supporting our “peak” suggests the existence of an exciting new type of quasiparticle. 

We return to the synchrotron six months or a year later, repeat the experiment over and over but find no hint of the “peak” on which we based our (now reasonably well-cited) analysis. We realise that we had over-interpreted a statistical noise blip.

Should we retract the paper?”

I am firmly of the opinion that the paper should be retracted. After all, we could not reproduce our results when we did the experiment correctly. We didn’t bend over backwards in the initial experiment to convince ourselves that our data were robust and reliable and instead rushed to publish (because we were so eager to get a paper out of the beamtime). So now we should eat humble pie for jumping the gun — the paper should be retracted and the scientific record should be corrected accordingly.
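To put a rough number on how easily a noise blip can masquerade as a peak, here is a minimal Python sketch. It assumes, purely for illustration, a spectrum made up of independent channels with Gaussian noise and no real signal at all; the channel count (200) and the "3 sigma" threshold are hypothetical choices, not values from any actual beamtime, and the function names are my own.

```python
import math

def tail_probability(n_sigma: float) -> float:
    """One-sided probability that a single Gaussian noise channel
    fluctuates upward by at least n_sigma standard deviations."""
    return 0.5 * math.erfc(n_sigma / math.sqrt(2.0))

def chance_of_spurious_bump(n_channels: int, n_sigma: float) -> float:
    """Probability that at least one of n_channels independent,
    signal-free channels exceeds n_sigma purely by chance."""
    p_single = tail_probability(n_sigma)
    return 1.0 - (1.0 - p_single) ** n_channels

# Hypothetical numbers: a 200-channel spectrum and a "3 sigma" bump.
p_one = tail_probability(3.0)
p_any = chance_of_spurious_bump(200, 3.0)
print(f"single channel: {p_one:.4f}, anywhere in spectrum: {p_any:.2f}")
```

Under these assumptions, a fluctuation that looks very unlikely in any one channel is far from unlikely somewhere in the spectrum, which is exactly why repeating the measurement before publishing matters.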

Mike, and others, were of a different opinion, however. They argued that the flawed paper should remain in the scientific literature, sometimes for the reasons to which Mike alludes in his tweet above [1]. In my conversations with particle physicists re. the 750 GeV anomaly, which arose from a similarly over-enthusiastically interpreted bump in a spectrum that turned out to be noise, there was a similarly strong reluctance to correct the scientific record. There appeared to be a feeling that only if the data were fabricated or fraudulent should the paper be retracted.

During the e-mail exchanges with my particle physics colleagues, I was struck on more than one occasion by a disturbing disconnect between theory and experiment. (This is hardly the most original take on the particle physics field, I know. I’ll take a moment to plug Sabine Hossenfelder’s Lost In Math once again.) There was an unsettling (for me) feeling among some that it didn’t matter if experimental noise had been misinterpreted, as long as the paper led to some new theoretical insights. This, I’ll stress, was not an opinion universally held — some of my colleagues said they didn’t go anywhere near the 750 GeV excess because of the lack of strong experimental evidence. Others, however, were more than willing to enthusiastically over-interpret the 750 GeV “bump” and, unsurprisingly, baulked at the suggestion that their papers should be retracted or censured in any way. If their sloppy, credulous approach to accepting noise in lieu of experimental data had advanced the field, then what’s wrong with that? After all, we need intrepid pioneers who will cross the Pillars of Hercules.

I’m a dyed-in-the-wool experimentalist; science should be driven by a strong and consistent feedback loop between experiment and theory. If a scientist mistakes experimental noise (or well-understood experimental artefacts) for valid data, or if they get fundamental physics wrong à la Zharkova et al., then there should be — must be — some censure for this. After all, we’d censure our undergrad students under similar circumstances, wouldn’t we? One student carries out an experiment for her final year project carefully and systematically, repeating measurements, bringing her signal-to-noise ratio up, putting in the hours to carefully refine and redefine the experimental protocols and procedures, refusing to make claims that are not entirely supported by the data. Another student instead gets over-excited when he sees a “signal” that chimes with his expectations, and instead of doing his utmost to make sure he’s not fooling himself, leaps to a new and exciting interpretation of the noisy data. Which student should receive the higher grade? Which student is the better scientist?

As that grand empiricist Francis Bacon put it centuries ago,

The understanding must not therefore be supplied with wings, but rather hung with weights, to keep it from leaping and flying.

It’s up to not just individual scientists but the scientific community as a whole to hang our collective understanding with weights. Sloppy science is not just someone else’s problem. It’s everyone’s problem.

[1] Mike’s suggestion in his tweet that the journal would like to retract the paper to spare their blushes doesn’t chime with our experience of journals’ reactions during the stripy saga. Retraction is the last thing they want because it impacts their brand.

 

At sixes and sevens about 3* and 4*

The post below appears in today’s Times Higher Education under the title “The REF’s star system leaves a black hole in fairness.” My original draft was improved immensely by Paul Jump’s edits (but I am slightly miffed that my choice of title, above, was rejected by the sub-editors). I’m posting the article here for those who don’t have a subscription to the THE. (I should note that the interview panel scenario described below actually happened. The question I asked was suggested in the interview pack supplied by the “University of True Excellence”.)


“In your field of study, Professor Aspire, just how does one distinguish a 3* from a 4* paper in the research excellence framework?”

The interviewee for a senior position at the University of True Excellence – names have been changed to protect the guilty – shuffled in his seat. I leaned slightly forward after posing the question, keen to hear his response to this perennial puzzler that has exercised some of the UK’s great and not-so-great academic minds.

He coughed. The panel – on which I was the external reviewer – waited expectantly.

“Well, a 4* paper is a 3* paper except that your mate is one of the REF panel members,” he answered.

I smiled and suppressed a giggle.

Other members of the panel were less amused. After all, the rating and ranking of academics’ outputs is serious stuff. Careers – indeed, the viability of entire departments, schools, institutes and universities – depend critically on the judgements made by peers on the REF panels.

Not only do the ratings directly influence the intangible benefits arising from the prestige of a high REF ranking, they also translate into cold, hard cash. An analysis by the University of Sheffield suggests that in my subject area, physics, the average annual value of a 3* paper for REF 2021 is likely to be roughly £4,300, whereas that of a 4* paper is £17,100. In other words, the formula for allocating “quality-related” research funding is such that a paper deemed 4* is worth four times one judged to be 3*; as for 2* (“internationally recognised”) or 1* (“nationally recognised”) papers, they are literally worthless.

We might have hoped that before divvying up more than £1 billion of public funds a year, the objectivity, reliability and robustness of the ranking process would be established beyond question. But, without wanting to cast any aspersions on the integrity of REF panels, I’ve got to admit that, from where I was sitting, Professor Aspire’s tongue-in-cheek answer regarding the difference between 3* and 4* papers seemed about as good as any – apart from, perhaps, “I don’t know”.

The solution certainly isn’t to reach for simplistic bibliometric numerology such as impact factors or SNIP indicators; anyone making that suggestion is not displaying even the level of critical thinking we expect of our undergraduates. But every academic also knows, deep in their studious soul, that peer review is far from wholly objective. Nevertheless, university senior managers – many of them practising or former academics themselves – are often all too willing, as part of their REF preparations, to credulously accept internal assessors’ star ratings at face value, with sometimes worrying consequences for the researcher in question (especially if the verdict is 2* or less).

Fortunately, my institution, the University of Nottingham, is a little more enlightened – last year it had the good sense to check the consistency of the internal verdicts on potential REF 2021 submissions via the use of independent reviewers for each paper. The results were sobering. Across seven scientific units of assessment, the level of full agreement between reviewers varied from 50 per cent to 75 per cent. In other words, in the worst cases, reviewers agreed on the star rating for no more than half of the papers they reviewed.

Granted, the vast majority of the disagreements were by a single star; very few pairs of reviewers were “out” by two stars, and none disagreed by more. But this is cold comfort. The REF’s credibility is based on an assumption that reviewers can quantitatively assess the quality of a paper with a precision better than one star. As our exercise shows, the effective error bar is actually ± 1*.
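For anyone who wants to run the same kind of consistency check on their own internal ratings, the bookkeeping is simple. The Python sketch below is only an illustration: the function name is my own and the star ratings are invented for the example. They are not the Nottingham data.

```python
from collections import Counter

def agreement_summary(pairs):
    """Return (fraction of papers with identical ratings,
    Counter of absolute star differences) for (reviewer_a, reviewer_b) pairs."""
    diffs = Counter(abs(a - b) for a, b in pairs)
    full_agreement = diffs[0] / len(pairs)
    return full_agreement, diffs

# Invented star ratings for eight papers, for illustration only;
# these are NOT the Nottingham data.
hypothetical_pairs = [(4, 4), (3, 4), (3, 3), (2, 3), (4, 3), (3, 3), (4, 2), (3, 3)]

fraction, differences = agreement_summary(hypothetical_pairs)
print(f"full agreement: {fraction:.0%}")
print("papers by size of disagreement (in stars):", dict(sorted(differences.items())))
```

The "level of full agreement" is just the share of papers on which both reviewers gave the same star rating; the counter shows how far apart they were on the rest.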

That would be worrying enough if there were a linear scaling of financial reward. But the problem is exacerbated dramatically by both the 4x multiplier for 4* papers and the total lack of financial reward for anything deemed to be below 3*.
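A few lines of Python make the size of the problem concrete. This is a rough sketch only: it uses the approximate Sheffield figures for physics quoted above, assumes a rating is reliable to no better than one star either way, and the helper name is hypothetical.

```python
# Approximate annual value of a physics paper by star rating, using the
# Sheffield estimates quoted above; 1* and 2* papers attract no funding.
ANNUAL_VALUE_GBP = {4: 17_100, 3: 4_300, 2: 0, 1: 0}

def value_range(assigned_stars: int, error: int = 1):
    """Lowest and highest annual value (GBP) consistent with a rating that is
    only reliable to within +/- `error` stars (clamped to the 1*-4* scale)."""
    candidates = range(max(1, assigned_stars - error), min(4, assigned_stars + error) + 1)
    values = [ANNUAL_VALUE_GBP[s] for s in candidates]
    return min(values), max(values)

for stars in (4, 3, 2):
    low, high = value_range(stars)
    print(f"rated {stars}*: worth anywhere from £{low:,} to £{high:,} a year")
```

On these figures, a paper rated 3* could be worth anything from nothing to £17,100 a year once the one-star uncertainty is taken into account.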

The Nottingham analysis also examined the extent to which reviewers’ ratings agreed with authors’ self-scoring (let’s leave aside any disagreement between co-authors on that). The level of full agreement here was similarly patchy, varying between 47 per cent and 71 per cent. Unsurprisingly, there was an overall tendency for authors to “overscore” their papers, although underscoring was also common.

Some argue that what’s important is the aggregate REF score for a department, rather than the ratings of individual papers, because, according to the central limit theorem, any wayward ratings will “wash out” at the macro level. I disagree entirely. Individual academics across the UK continue to be coaxed and cajoled into producing 4* papers; there are even dedicated funding schemes to help them do so. And the repercussions arising from failure can be severe.

It is vital in any game of consequence that participants be able to agree when a goal has been scored or a boundary hit. Yet, in the case of research quality, there are far too many cases in which we just can’t. So the question must be asked: why are we still playing?

Blast from the past

While searching my e-mail archive for a message from years ago, I stumbled across this unpublished submission to the letters page of the Times Higher Education. More than a decade later, I’m still smarting a little that they didn’t accept it for publication…

From: Moriarty Philip
Sent: 30 November 2008 20:48
To: letters@tsleducation.com
Subject: Comment on “Clever crazies quitting science” (THE 27 Nov)

Bruce Charlton of the University of Buckingham argues that modern scientists are boring because they are mild-mannered, agreeable, and socially inoffensive (News, 27 November).

What a dickhead.

Philip Moriarty, Condensed Matter Scientist

School of Physics & Astronomy
University of Nottingham
Nottingham NG7 2RD

 

How Not To Do Spectral Analysis 101

I will leave this here without further comment…

[Image: JesusHChrist]

*bangs head gently on desk and sobs quietly to himself*

Source (via Sam Jarvis. Thanks, Sam.):

The original ‘peer-reviewed’ paper is this: Găluşcă et al., IOP Conf. Ser. Mater. Sci. Eng. 374 012020 (2018)

 

 

Bullshit and Beyond: From Chopra to Peterson

Harry G Frankfurt’s On Bullshit is a modern classic. He highlights the style-over-substance tenor of the most fragrant and flagrant bullshit, arguing that

It is impossible for someone to lie unless he thinks he knows the truth. Producing bullshit requires no such conviction. A person who lies is thereby responding to the truth, and he is to that extent respectful of it. When an honest man speaks, he says only what he believes to be true; and for the liar, it is correspondingly indispensable that he considers his statements to be false. For the bullshitter, however, all these bets are off: he is neither on the side of the true nor on the side of the false. His eye is not on the facts at all, as the eyes of the honest man and of the liar are, except insofar as they may be pertinent to his interest in getting away with what he says. He does not care whether the things he says describe reality correctly. He just picks them out, or makes them up, to suit his purpose.

In other words, the bullshitter doesn’t care about the validity or rigour of their arguments. They are much more concerned with being persuasive. One aspect of BS that doesn’t quite get the attention it deserves in Frankfurt’s essay, however, is that special blend of obscurantism and vacuity that is the hallmark of three world-leading bullshitters of our time: Deepak Chopra, Karen Barad (see my colleague Brigitte Nerlich’s important discussion of Barad’s wilfully impenetrable language here), and Jordan Peterson. In a talk for the University of Nottingham Agnostic, Secularist, and Humanist Society last night (see here for the blurb/advert), I focussed on the intriguing parallels between their writing and oratory. Here’s the video of the talk.

Thanks to UNASH for the invitation. I’ve not included the lengthy Q&A that followed (because I stupidly didn’t ask for permission to film audience members’ questions). I’m hoping that some discussion and debate might ensue in the comments section below. If you do dive in, try not to bullshit too much…

 

 

The war on (scientific) terror…

I’ve been otherwise occupied of late so the blog has had to take a back seat. I’m therefore coming to this particular story rather late in the day. Nonetheless, it’s on an exceptionally important theme that is at the core of how scientific publishing, scientific critique, and, therefore, science itself should evolve. That type of question doesn’t have a sell-by date so I hope my tardiness can be excused.

The story involves a colleague and friend who has courageously put his head above the parapet (on a number of occasions over the years) to highlight just where peer review goes wrong. And time and again he’s gotten viciously castigated by (some) senior scientists for doing nothing more than critiquing published data in as open and transparent a fashion as possible. In other words, he’s been pilloried (by pillars of the scientific community) for daring to suggest that we do science the way it should be done.

This time, he’s been called a…wait for it…scientific terrorist. And by none other than the most cited chemist in the world over the last decade (well, from 2000 – 2010): Chad A Mirkin. According to his Wiki page, Mirkin “was the first chemist to be elected into all three branches of the National Academies. He has published over 700 manuscripts (Google Scholar H-index = 163) and has over 1100 patents and patent applications (over 300 issued, over 80% licensed as of April 1, 2018). These discoveries and innovations have led to over 2000 commercial products that are being used worldwide.”

With that pedigree, this guy must really have done something truly appalling for Mirkin to call him a scientific terrorist (oh, and a zealot, and a narcissist), right? Well, let’s see…

The colleague in question is Raphael Levy. Raphael (pictured to the right) is a Senior Lecturer — or Associate Professor to use the term increasingly preferred by UK universities and traditionally used by our academic cousins across the pond — in Biochemistry at the University of Liverpool. He has a deep and laudable commitment to open science and the evolution of the peer review system towards a more transparent and accountable ethos.

Along with Julian Stirling, who was a PhD student here at Nottingham at the time, and a number of other colleagues, I collaborated closely with Raphael and his team (from about 2012 – 2014) in critiquing and contesting a body of work that claimed that stripes (with ostensibly fascinating physicochemical and biological properties) formed on the surface of suitably functionalised nanoparticles. I’m not going to revisit the “stripy” nanoparticle debate here. If you’re interested, see Refs [1-5] below. Raphael’s blog, which I thoroughly recommend, also has detailed bibliographies for the stripy nanoparticle controversy.

More recently, Raphael and his co-workers at Liverpool have found significant and worrying deficiencies in claims regarding the efficacy of what are known as SmartFlares. (Let me translate that academically-nuanced wording: Apparently, they don’t work.) Chad Mirkin played a major role in the development of SmartFlares, which are claimed to detect RNA in living cells and were sold by SigmaMilliPore from 2013 until recently, when they were taken off the market.

The SmartFlare concept is relatively straightforward to understand (even for this particular squalid state physicist, who tends to get overwhelmed by molecules much larger than CO): each ‘flare’ probe comprises a gold nanoparticle attached to an oligonucleotide (that encodes a target sequence) and a fluorophore, which does not emit fluorescence as long as it’s near to the gold particle. When the probe meets the target RNA, however, this displaces the fluorophore (thus reducing the coupling to, and quenching by, the gold nanoparticle) and causes it to glow (or ‘flare’). Or so it’s claimed.

As described in a recent article in The Scientist, however, there is compelling evidence from a growing number of sources, including, in particular, Raphael’s own group, that SmartFlares simply aren’t up to the job. Raphael’s argument, for which he has strong supporting data (from electron-, fluorescence- and photothermal microscopy), is that the probes are trapped in endocytic compartments and get nowhere near the RNA they’re meant to target.

Mirkin, as one might expect, vigorously claims otherwise. That’s, of course, entirely his prerogative. What’s most definitely not his prerogative, however, is to launch hyperbolic personal attacks at a critic of his work. As Raphael describes over at his blog, he asked the following question at the end of a talk Mirkin gave at the American Chemical Society meeting in Boston a month ago:

In science, we need to share the bad news as well as the good news. In your introduction you mentioned four clinical trials. One of them has reported. It showed no efficacy and Purdue Pharma which was supposed to develop the drug decided not to pursue further. You also said that 1600 forms of NanoFlares were commercially available. This is not true anymore as the distributor has pulled the product because it does not work. Finally, I have a question: what is the percentage of nanoparticles that escape the endosome?

According to Raphael’s description (which is supported by others at the conference — see below), Mirkin’s response was ad hominem in the extreme:

[Mirkin said that]…no one is reading my blog (who cares), no one agrees with me; he called me a “scientific zealot” and a “scientific terrorist”.

Raphael and I have been in a similar situation before with regard to scientific critique not exactly being handled with good grace. We and our colleagues have faced accusations of being cyber-bullies and, worse, the use of fake blogs and identity theft, all in an attempt to discredit our (purely scientific) criticism.

Science is in a very bad place indeed if detailed criticism of a scientist’s work is dismissed aggressively as scientific terrorism/zealotry. We are, of course, all emotional beings to a greater or lesser extent. Therefore, and despite protestations to the contrary from those who have an exceptionally naive view of The Scientific Method, science is not some wholly objective monolith that arrives at The Truth by somehow bypassing all the messy business of being human. As Neuroskeptic described so well in a blog post about the stripy nanoparticle furore, often professional criticism is taken very personally by scientists (whose self-image and self-confidence can be intimately connected to the success of the science we do). Criticism of our work can therefore often feel like criticism of us.

But as scientists we have to recognise, and then always strive to rise above, those very human responses; to take on board, rather than aggressively dismiss out of hand, valid criticisms of our work. This is not at all easy, as PhD Comics among others has pointed out:

One would hope, however, that a scientist of Mirkin’s calibre would set an example, especially at a conference with the high profile of the annual ACS meeting. As a scientist who witnessed the exchange between Raphael and Mirkin put it,

I witnessed an interaction between two scientists. One asks his questions gracefully and one responding in a manner unbecoming of a Linus Pauling Medalist. It took courage to stand in front of a packed room of scientists and peers to ask those questions that deserved an answer in a non-aggressive manner. It took even more courage to not become reactive when the respondent is aggressive and belittling. I certainly commended Raphael Levy for how he handled the aggressive response from Chad Mirkin.

Or, as James Wilking put it somewhat more pithily:

An apology from Mirkin doesn’t seem to be forthcoming. This is a shame, to put it mildly. What I found rather more disturbing than Mirkin’s overwrought accusation of scientific terrorism, however, was the reaction of an anonymous scientist in that article in The Scientist:

“I think what everyone has to understand is that unhealthy discussion leads to unsuccessful funding applications, with referees pointing out that there is a controversy in the matter. Referee statements like these . . . in a highly competitive environment for funding, simply drain the funding away of this topic,” he writes in an email to The Scientist. He believes a recent grant application of his related to the topic was rejected for this reason, he adds.

This is a shockingly disturbing mindset. Here we have a scientist bemoaning that (s)he did not get public funding because of what is described as “unhealthy” public discussion and controversy about an area of science. Better that we all keep schtum about any possible problems and milk the public purse for as much grant funding as possible, right?

That attitude stinks to high heaven. If it takes some scientific terrorism to shoot it down in flames then sign me up.


[1] Stripy Nanoparticle Controversy Blows Up

[2] Peer Review In Public: Rise Of The Cyber-Bullies? 

[3] Looking At Nothing, Seeing A Lot

[4] Critical Assessment of the Evidence for Striped Nanoparticles, Julian Stirling et al, PLOS ONE 9 e108482 (2014)

[5] How can we trust scientific publishers with our work if they won’t play fair?