Guilty Confessions of a REFeree

#4 of an occasional series

At the start of this week I spent a day in a room in a university somewhat north of Nottingham with a stack of research papers and a pile of grading sheets. Along with a fellow physicist from a different university (located even further north of Nottingham), I had been asked to act as an external reviewer for the department’s mock REF assessment.

I found it a deeply uncomfortable experience. My discomfort had nothing to do, of course, with our wonderfully genial hosts — thank you all for the hospitality, the conversation, the professionalism, and, of course, lunch. But I’ve vented my spleen previously on the lack of consistency in mock REF ratings (it’s been the most-viewed post at Symptoms… since I resurrected the blog in June last year) and I agreed to participate in the mock assessment so I could see for myself how the process works in practice.

Overall, I’d say that the degree of agreement on “star ratings” between my co-marker’s grading and mine, before moderation, was at the 70% level, give or take. This is in line with the consistency we observed at Nottingham for independent reviewers in Physics and is therefore at least somewhat encouraging. (Other units of assessment for Nottingham’s mock REF review had only 50% agreement.) But what set my teeth on edge for a not-insignificant number of papers — including quite a few of those on which my gradings agreed with my co-marker’s — was that I simply did not feel at all qualified to comment.
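(A quick aside for the statistically minded: raw percentage agreement flatters markers a little, because two reviewers who both hand out mostly 3*s will agree fairly often by chance alone. Cohen’s kappa corrects for that. The sketch below uses entirely invented ratings — these are not the real gradings — just to show how a 70% raw agreement shrinks once chance agreement is factored out.)

```python
from collections import Counter

def percent_agreement(a, b):
    """Fraction of papers on which two markers give the same star rating."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cohens_kappa(a, b):
    """Cohen's kappa: agreement corrected for the agreement expected by
    chance, given each marker's own distribution of ratings."""
    n = len(a)
    p_obs = percent_agreement(a, b)
    ca, cb = Counter(a), Counter(b)
    # Chance agreement: probability both markers pick the same rating
    # if each rated independently according to their own marginals.
    p_chance = sum(ca[r] * cb[r] for r in set(a) | set(b)) / (n * n)
    return (p_obs - p_chance) / (1 - p_chance)

# Entirely invented star ratings for ten papers, for illustration only.
marker_1 = [4, 3, 3, 2, 4, 3, 2, 3, 4, 3]
marker_2 = [4, 3, 2, 2, 4, 4, 2, 3, 3, 3]

print(f"raw agreement: {percent_agreement(marker_1, marker_2):.0%}")  # 70%
print(f"Cohen's kappa: {cohens_kappa(marker_1, marker_2):.2f}")       # 0.54
```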

Even though I’m a condensed matter physicist and we were asked to assess condensed matter physics papers, I simply don’t have the necessary level of hubris to pretend that I can expertly assess any paper in any CMP sub-field. The question that went through my head repeatedly was “If I got this paper from Physical Review Letters (or Phys. Rev. B, or Nature, or Nature Comms, or Advanced Materials, or J. Phys. Chem. C…etc…) would I accept the reviewing invitation or would I decline, telling them it was out of my field of expertise?” And for the majority of papers the answer to that question was a resounding “I’d decline the invitation.”

So if a paper I was asked to review wasn’t in my (sub-)field of expertise, how did I gauge its reception in the relevant scientific community?

I can’t quite believe I’m admitting this, given my severe misgivings about citation metrics, but, yes, I held my nose and turned to Web of Science. And citation metrics also played a role in the decisions my co-marker made, and in our moderation. This, despite the fact that we had no way of normalising those metrics to the prevailing citation culture of each sub-field, nor of ranking the quality, as distinct from the impact, of each paper. (One of my absolute favourite papers of all time — a truly elegant and pioneering piece of work — has picked up a surprisingly low number of citations compared to much more pedestrian work in the field.)
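(For what it’s worth, the normalisation we lacked is simple enough in principle — divide a paper’s citation count by the average for papers in the same sub-field and publication year, which is roughly what “field-weighted” indicators do. The sketch below uses baseline figures I have invented purely for illustration; the genuinely hard part in practice is defining the sub-fields and compiling credible baselines.)

```python
# Invented sub-field/year citation baselines, purely for illustration.
FIELD_YEAR_BASELINE = {
    ("surface science", 2012): 18.0,
    ("topological insulators", 2012): 55.0,
}

def normalised_citation_score(citations, field, year):
    """Field-normalised citation impact: a score of 1.0 means 'cited
    exactly as much as the average paper in that sub-field and year'."""
    return citations / FIELD_YEAR_BASELINE[(field, year)]

# Two papers with identical raw counts look very different once the
# citation culture of their sub-fields is taken into account.
print(normalised_citation_score(30, "surface science", 2012))         # ~1.67
print(normalised_citation_score(30, "topological insulators", 2012))  # ~0.55
```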

Only when I had to face a stack of papers and grade them for myself did I realise just how exceptionally difficult it is to pass numerical judgment on a piece of work in an area that lies outside my rather small sphere of research. I was, of course, asked to comment on publications in condensed matter physics, ostensibly my area of expertise. But that’s a huge field. Not only is no-one a world-leading expert in all areas of condensed matter physics, it’s almost impossible to keep up with developments in our own narrow sub-fields of interest let alone be au fait with the state of the art in all other sub-fields.

We therefore turn to citations to try to gauge the extent to which a paper has made ripples — or perhaps even sent shockwaves — through a sub-field in which we have no expertise. My co-marker and I are hardly alone in adopting this citation-counting strategy. But that’s of course no excuse — we were relying on exactly the type of pseudoquantitative heuristic that I have criticised in the past, and I felt rather “grubby” at the end of the (rather tiring) day. David Colquhoun made the following point time and again in the run-up to the last REF (and well before):

All this shows what is obvious to everyone but bone-headed bean counters. The only way to assess the merit of a paper is to ask a selection of experts in the field.

Nothing else works.

Nothing.

Bibliometrics are a measure of visibility and “clout” in a particular (yet often nebulously defined) research community; they’re not a quantification of scientific quality. Therefore, very many scientists, and this most definitely includes me, have deep misgivings about using citations to judge a paper’s — let alone a scientist’s — worth.

Although I agree with that quote from David above, the problem is that we need to somehow choose the correct “boundary conditions” for each expert; I can have a reasonable level of expertise in one sub-area of a field — say, scanning probe microscopy or self-assembly or semiconductor surface physics — and a distinct lack of working knowledge, let alone expertise, in another sub-area of that self-same field. I could list literally hundreds of topics where I would, in fact, be winging it.

For many years, and because of my deep aversion to simplistic citation-counting and bibliometrics, I’ve been guilty of the type of not-particularly-joined-up thinking that Dorothy Bishop rightly chastises in this tweet…

We can’t trust the bibliometrics in isolation (for all the reasons (and others) that David Colquhoun lays out here), so when it comes to the REF the argument is that we have to supplement the metrics with “quality control” via another round of ostensibly expert peer review. But the problem is that it’s often not expert peer review; I was certainly not an expert in the subject areas of very many of the papers I was asked to judge. And I’ll hold that no-one can be a world-leading expert in every sub-field of a given area of physics (or any other discipline).

So what are the alternatives?

David has suggested that we should, in essence, retire what’s known as the “dual support” system for research funding (see the video embedded below): “…abolish the REF, and give the money to research councils, with precautions to prevent people being fired because their research wasn’t expensive enough.” I have quite some sympathy with that view. The common argument — that the so-called QR funding awarded via the REF supports “unpopular” areas of research that wouldn’t necessarily be funded by the research councils — is not at all compelling (to put it mildly). Universities demonstrably align their funding priorities and programmes very closely with research council strategic areas; they don’t hand out QR money for research that doesn’t fall within their latest Universal Targetified Globalised Research Themes.

Prof. Bishop has a different suggestion for revamping how QR funding is divvied up, which initially (and naively, for the reasons outlined above) I found a little unsettling. My first-hand experience earlier this week with the publication grading methodology used by the REF — albeit in a mock assessment — has made me significantly more comfortable with Dorothy’s strategy:

“…dispense with the review of quality, and you can obtain similar outcomes by allocating funding at institutional level in relation to research volume”.

Given that grant income is often taken as yet another proxy for research quality, and that there’s a clear Matthew effect (rightly or wrongly) at play in science funding, this correlation between research volume and REF placement is not surprising. As the Times Higher Education article on Dorothy’s proposals went on to report,

The government should, therefore, consider allocating block funding in proportion to the number of research-active staff at a university because that would shrink the burden on universities and reduce perverse incentives in the system, [Prof Bishop] said.
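(As I read it, Dorothy’s allocation rule amounts to nothing more complicated than a pro rata split of the block grant by research-active headcount. A sketch, with invented headcounts and an invented pot:)

```python
def allocate_block_grant(total_funding, staff_counts):
    """Split a block grant pro rata by research-active staff headcount —
    my reading of the rule Prof. Bishop proposes, not her own code."""
    total_staff = sum(staff_counts.values())
    return {uni: total_funding * n / total_staff
            for uni, n in staff_counts.items()}

# Invented universities, headcounts, and pot, purely for illustration.
shares = allocate_block_grant(1_000_000, {"A": 300, "B": 150, "C": 50})
print(shares)  # {'A': 600000.0, 'B': 300000.0, 'C': 100000.0}
```

No paper grading, no panels, no grading sheets — just a headcount and a division.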

Before reacting one way or the other, I strongly recommend that you take the time to listen to Prof. Bishop eloquently detail her arguments in the video below.

Here’s the final slide of that presentation:

[Image: final slide of Dorothy Bishop’s presentation (her recommendations)]

So much rests on that final point. Ultimately, the immense time and effort devoted to/wasted on the REF boils down to a lack of trust — by government, funding bodies, and, depressingly, often university senior management — in academics’ ability to motivate themselves without perverse incentives like aiming for a 4* paper. That would be bad enough if we could all agree on what a 4* paper looks like…

Pressure vessels: the epidemic of poor mental health among academics

This post takes its title from a talk that will be given by Liz Morrish here at UoN next week. (5:00 pm on May 21 in The Hemsley.) Here’s the outline:

Liz Morrish will present findings that show how staff employed at Higher Education Institutions/Universities are accessing counselling and occupational health services at an increasing rate. Between 2009 and 2015, counselling referrals have risen by 77 per cent, while staff referrals to Occupational Health services during the same period have risen by 64 per cent. This attests to an escalating epidemic of poor mental health among the sector’s employees. I will consider some of the factors which weigh on the mental health of academic staff: escalating and excessive workloads; the imposition of metric surveillance; outcomes-based performance management; increasing precarity and insecure contracts. Universities have been characterised as ‘anxiety machines’ which purposefully flout legal requirements to prevent stress in the workplace. Given the urgency of the situation, I will propose some recommendations which, if institutions were to follow them, might alleviate some of the pressures.

…and here’s Liz’s bio:

Liz Morrish is an independent scholar and activist for resistance to managerial appropriation of the university. She is a visiting fellow at York St John University. She was principal lecturer and subject leader of linguistics at Nottingham Trent University until speaking out and writing about the mental health of academics brought about her resignation in 2016. She is completing a co-authored book on managerial discourse in the neoliberal academy, entitled Academic Irregularities (Routledge forthcoming) and she also writes a blog with the same name: https://academicirregularities.wordpress.com/. Having exited the academy, Liz now has more time for other activities, and she now spends time as a marathon swim observer.

I met Liz a number of years ago, when she was principal lecturer at Nottingham Trent University. Not long after we met, NTU disgracefully brought disciplinary proceedings against Liz when she spoke out about the mental health of academics, ultimately causing her to resign. For the full story on NTU’s shocking behaviour — driven, of course, by its metrics-and-league-table-infected management ‘strategy’ — the exceptionally important article that Liz wrote for the Times Higher Education shortly after her resignation is a must-read. Here’s a taster, but you should read the entire article for deep insights into just how low a university will go in its attempts to protect its reputation and pressure its staff:

In March last year [2016], Times Higher Education republished a blog piece that I wrote on the causes of stress and threats to mental health in academic life. The piece recounted how, on University Mental Health Day, I opened up to students about some of the pressures their lecturers were under. Many readers were kind enough to retweet the link, respond under the line or email me personally to let me know that my article resonated for colleagues around the world. But after it had received 10,000 hits on my own blog and spent four days trending on THE’s website, my previous employer objected to it and I was obliged to ask for it to be taken down. This inaugurated a disciplinary process that I felt curbed my ability to write further on the topic, or to have a frank dialogue with students on mental health in universities.

I feel very fortunate indeed that I am employed by the “other” university in Nottingham. Although I have had, and continue to have, my spats with senior management here, they have not once asked me to constrain or curtail my criticism of university (and University) culture; there’s been not so much as a quiet word in my ear following even rather scathing public critiques. Thank you, UoN, for your commitment to academic freedom.

I’d very much appreciate it if those of you who are Twitter-enabled UoN academics could spread the word about Liz’s talk. (I’ve forgone that particular form of communication.) I hope to see you there on May 21.


20,000 Leagues under the THE

This monstrous tome arrived yesterday morning…

[Image: the THE World University Rankings supplement]

I subscribe to the Times Higher Education and generally look forward to the analogue version of the magazine arriving each week. Yesterday, however, it landed with a house-rattling thud as it hit the floor, prompting Daisy, the eight-year-old miniature dachshund whose duty it is to ward off all visitors (friend, foe, or pizza), to attempt to shred both the magazine and the 170-page glossy World University Ranking ‘supplement’ pictured above that accompanied it.

I should have smeared the latter with a generous helping of Cesar dog food [1] and set her at it.

Yes, it’s yet another rant about league tables, I’m afraid. I’ve never been one to hold back on the piss and vinegar when it comes to bemoaning the pseudostatistics underpinning education league tables (be they primary school OFSTED placements or the leaderboards for august higher education institutions). I’m lucky to be in very good company. Peter Coles’ annual slamming of the THE rankings is always worth reading. (He’s on especially good form for the 2019 season.) And our very own Head of School, Mike Merrifield, has described in no uncertain terms just why university league tables are bad for you.

But this time round, and notwithstanding that WB Yeats quote I love so much [2], there’s going to be a slightly more upbeat message from yours truly. We need to give students rather more credit when it comes to seeing through the league table guff. They’re a damn sight more savvy than some imagine. Before I describe just why I have this degree of faith in the critical thinking capabilities of the next generation of undergrads, let’s take a look at a few representative (or not, as the case may be) league tables.

I’ve got one more year to go (of a five-year ‘gig’) as undergraduate admissions tutor for the School of Physics & Astronomy at Nottingham. Throughout that time, I have enjoyed the healthy catharsis of regularly lambasting league tables, not only during my University open day talks (in June and September) but also during every week of our UCAS visit/interview days (which kick off again in mid-November).

I routinely point to tables like this, taken from the annual Graduate Market report [3]:

[Image: The Graduate Market 2017-2018 table]

Tsk. Nottingham languishing at #8. Back in 2014-2015 we were at #2:

[Image: The Graduate Market 2014-2015 table]

Clearly there’s been a drop in quality to have slipped six places, right?

No. There’s nothing “clear” about that supposition at all. Universities and university departments are not football teams: it’s ludicrous to judge any institution (or department therein) on the basis of a single number.

Not convinced? Just sour grapes because Nottingham has ‘slipped’?

Well, take a slightly closer look at Table 5.8 directly above. Let’s leave the Nottingham “also-ran”s to one side, and focus on the top of the pops, Manchester. They’re an impressive #1 when it comes to employer perception…yet #28 in the Good University Guide. So which number do you prefer? Which has more credibility? Which is more robust?

Still have residual doubts? OK, let’s instead focus on individual schools/departments rather than entire universities. (And don’t get me started on the university-wide Teaching Excellence Framework (TEF)’s gold, silver, and bronze medals…) Here’s where Nottingham stands in The Times’ Physics and Astronomy league table:

[Image: The Times Physics and Astronomy top ten table]

Yay! Go Nottingham! In at #5 with a bullet. Up a whopping thirteen places compared to last year. (Incidentally, our undergraduate applications were also up by over 20%. This correlation between league table placement and application numbers may not be entirely coincidental…)

Wow. We must really have worked hard in the intervening year. Or perhaps we brought in “star world-class players” on the academic transfer market to “up our game”?

Nope.

So what was radically different about our teaching and/or research compared to the previous year that led to this climb into the Top Ten?

Nothing. Zilch. Nada.

Feck all.

Indulge me with one last example. Here’s the most recent (2014) Research Excellence Framework ranking for physics…

[Image: REF 2014 physics league table]

Nottingham is the only school/department to remain in the Top 5 over two rounds of this national research assessment exercise. (Last time round, in 2008, we were joint second with Bath and Cambridge.) Again, Yay Nottingham!, right? Or does it perhaps speak rather more to a certain volatility in the league table placements, because any peer review process like the REF is very far from entirely objective?

Both Peter Coles and Mike Merrifield (among many others) have pointed out key reasons underpinning league table volatility. I’m not about to rehearse those arguments here. Instead, I’ll highlight a couple of rather encouraging Reddit threads I’ve read recently — and that’s not something I tend to write too often — related, at least partially, to Nottingham’s open days. The first of these Mike has very helpfully highlighted via Twitter:

[Embedded tweet]

There is indeed a lot to be said for brutal honesty and I am delighted that the pseudostats of league table placements are being questioned by open day audiences.

The responses to this rather snobbishly overwrought comment elsewhere on Reddit also made my heart sing:

[Image: screenshot of the Reddit comment]

You can read the responses at the thread itself but I especially liked this, from ‘Matthew3_14’:

[Image: screenshot of the Reddit response from ‘Matthew3_14’]

I’d quibble with the “outside of the top 5ish” proviso (as you might expect), but otherwise “Matthew3_14” echoes exactly what I’ll be telling visiting applicants for our courses in the coming months…

If you like Nottingham, the rankings are irrelevant.

If you don’t like Nottingham, the rankings are still irrelevant.

Go to the place where you feel best.


[1] …for small, yappy-type dogs.

[2] “Being Irish, he had an abiding sense of tragedy that sustained him through temporary periods of joy.”

[3] Yes, it’s irritating that we now unblinkingly refer to students as a market. That’s a whole other blog post or five.