PeerJ’s peer-review problem

Of all the scientific journals in the wild, there are a few I keep a closer eye on: they publish interesting results but more importantly they have been forward-thinking on matters of scientific publishing and they’ve also displayed a tendency to think out loud (through blog posts, say) and actively consider public feedback. Reading what they publish in these posts, and following the discussions that envelop them, has given me many useful insights into how scientific publishing works and, perhaps more importantly, how the perceptions surrounding this enterprise are shaped and play out.

One such journal is eLife. All their papers are open access, and they also publish the papers’ authors’ notes and reviewers’ comments with each paper. They also have a lively ‘magazine’ section in which they publish articles and essays by working scientists – especially younger ones – relating to the extended social environments in which knowledge-work happens. Now, for some reason, I’d cast PeerJ in a similarly progressive light, even though I hadn’t visited their website in a long time. But on August 16, PeerJ published the following tweet:

It struck me as a weird decision (not that anyone cares). Since the article explaining the journal’s decision appears to be available under a Creative Commons Attribution license, I’m reproducing it here in full so that I can annotate my way through it.

Since our launch, PeerJ has worked towards the goal of publishing all “Sound Science”, as cost effectively as possible, for the benefit of the scientific community and society. As a result we have, until now, evaluated articles based only on an objective determination of scientific and methodological soundness, not on subjective determinations of impact, novelty or interest.

At the same time, at the core of our mission has been a promise to give researchers more influence over the publishing process and to listen to community feedback over how peer review  should work and how research should be assessed.


In recent months we have been thinking long and hard about feedback, from both our Editorial Board and Reviewers, that certain articles should no longer be considered as valid candidates for peer review or formal publication: that whilst the science they present may be “sound”, it is not of enough value to either the scientific record, the scientific community, or society, to justify being peer-reviewed or be considered for publication in a peer-reviewed journal. Our Editorial Board Members have asked us that we do our best to identify such submissions before they enter peer review.

This is the confusing part. To the uninitiated: One type of the scientific publishing process involves scientists writing up a paper and submitting it to a journal for consideration. An editor, or editors, at the journal checks the paper and then commissions a group of independent experts on the same topic to review it. These experts are expected to provide comments to help the journal decide whether it should publish the paper and, if yes, how the paper can be improved. Note that they are usually not paid for their work or time.

Now, if PeerJ’s usual reviewers are unhappy with how many papers the journal’s asking them to review, how does it make sense to impose a new, arbitrary and honestly counterproductive sort of “value” on submissions instead of increasing the number of reviewers the journal works with?

I find the journal’s decision troublesome because some important details are missing – details that encompass borderline-unethical activities by some other journals that have only undermined the integrity and usefulness of the scientific literature. For example, the “high impact factor” journal Nature has asked its reviewers in the past to prioritise sensational results over unglamorous ones, overlooking the fact that such results are also likelier to be wrong. For another example, the concept of pre-registration has started to become more popular recently simply because most journals used to refuse (and still do) negative results. That is, if a group of scientists set out to check if something was true – and it’d be amazing if it was true – and found that it was false instead, they’d have a tough time finding a journal willing to publish their paper.

And third, preprint papers have started to become an acceptable way of publishing research only in the last few years, and that too only in a few branches of science (especially physics). Most grant-giving and research institutions still prefer papers being published in journals, instead of being uploaded on preprint repositories, not to mention a dominant research culture in many countries – including India – still favouring arbitrarily defined “prestigious journals” over others when it comes to picking scientists for promotions, etc.

For these reasons, any decision by a journal that says sound science and methodological rigour alone won’t suffice to ‘admit’ a paper into their pages risks reinforcing – directly or indirectly – a bias in the scientific record that many scientists are working hard to move away from. For example, if PeerJ rejects a solid paper, so to speak, because it ‘only’ confirms a previous discovery, improves its accuracy, etc. and doesn’t fill a knowledge gap, per se, in order to ease the burden on its reviewers, the scientific record still stands to lose out on an important submission. (It pays to review journals’ decisions assuming that each journal is the only one around – à la the categorical imperative – and that other journals don’t exist.)

So what are PeerJ’s new criteria for rejecting papers?

As a result, we have been working with key stakeholders to develop new ways to evaluate submissions and are introducing new pre-review evaluation criteria, which we will initially apply to papers submitted to our new Medical Sections, followed soon after by all subject areas. These evaluation criteria will define clearer standards for the requirements of certain types of articles in those areas. For example, bioinformatic analyses of already published data sets will need to meet more stringent reporting and data analysis requirements, and will need to clearly demonstrate that they are addressing a meaningful knowledge gap in the literature.

We don’t know yet, it seems.

At some level, of course, this means that PeerJ is moving away from the concept of peer reviewing all sound science. To be absolutely clear, this does not mean we have an intention of becoming a highly-selective “glamour” journal publisher that publishes only the most novel breakthroughs. It also does not mean that we will stop publishing negative or null results. However, the feedback we have received is that the definition of what constitutes a valid candidate for publication needs to evolve.

To be honest, this is a laughable position. The journal admits in the first sentence of this paragraph that no matter where it goes from here, it will only recede from an ideal position. In the next sentence it denies (vehemently, considering this sentence was in bold in the article on its website) that its decision will transform it into a “glamour” journal – like Nature, Science, NEJM, etc. have been – and, in the third sentence, that it will stop publishing “negative or null results”. Now I’m even more curious what these heuristics could be which specify that a) submissions have to have “sound science”, b) they have to “address a meaningful knowledge gap”, and c) negative/null results aren’t excluded. It’s possible to see some overlap between these requirements that some papers will occupy – but it’s also possible to see many papers that won’t tick all three boxes yet still deserve to be published. To echo PeerJ itself, being a “glamour” journal is only one way to be bad.

We are being influenced by the researchers who peer review our research articles. We have heard from so many of our editorial board members and reviewers that they feel swamped by peer review requests and that they – and the system more widely – are close to breaking point. We most regularly hear this frustration when papers that they are reviewing do not, in their expert opinion, make a meaningful contribution to the record and are destined to be rejected; and should, in their view, have been filtered out much sooner in the process.

If you ask me (as an editor), the first sentence’s syntax seems to suggest PeerJ is being forced by its reviewers, and not influenced. More importantly, I haven’t seen these bespoke problematic papers that are “sound” but at the same time don’t make a meaningful contribution. An expert’s opinion that a paper on some topic should be rejected (even though, again, it’s “sound science”) could be rooted either in an “arrogant gatekeeper” attitude or in valid reasons, and PeerJ’s rules should be good enough to be able to differentiate between the two without simultaneously allowing ‘bad reviewers’ to over-“influence” the selection process.

More broadly, I’m a science journalist looking into science from the outside, seeing a colossal knowledge-producing machine that’s situated on the same continuum on which I see myself to be located. If I receive too many submissions at The Wire Science, I don’t make presumptuous comments about what I think should and shouldn’t belong in the public domain. Instead, I first pitch my boss about hiring one more person on my team and, second, I’m honest with each submission’s author about why I’m rejecting it: “I’m sorry, I’m short on time.”

Such submissions, in turn, impact the peer review of articles that do make a very significant contribution to the literature, research and society – the congestion of the peer review process can mean assigning editors and finding peer reviewers takes more time, potentially delaying important additions to the scientific record.

Gatekeeping by another name?

Furthermore, because it can be difficult and in some cases impossible to assign an Academic Editor and/or reviewers, authors can be faced with frustratingly long waits only to receive the bad news that their article has been rejected or, in the worst cases, that we were unable to peer review their paper. We believe that by listening to this feedback from our communities and removing some of the congestion from the peer review process, we will provide a better, more efficient, experience for everyone.

Ultimately, it comes down to the rules by which PeerJ’s editorial board is going to decide which papers are ‘worth it’ and which aren’t. And admittedly, without knowing these rules, it’s hard to judge PeerJ – except on one count: “sound science” is already a good enough rule by which to determine the quality of a scientist’s work. To say it doesn’t suffice for reasons unrelated to scientific publishing, and the publishing apparatus’s dangerous tendency to gatekeep based on factors that have little to do with science, sounds at least precarious.


India’s missing research papers

If you’re looking for a quantification (although you shouldn’t) of the extent to which science is being conducted by press releases in India at the moment, consider the following list of studies. The papers for none of them have been published – as preprints or ‘post-prints’ – even as the people behind them, including many government officials and corporate honchos, have issued press releases about the respective findings, which some sections of the media have publicised without question and which have quite likely gone on to inform government decisions about suitable control and mitigation strategies. The collective danger of this failure is only amplified by a deafening silence from many quarters, especially from the wider community of doctors and medical researchers – almost as if it’s normal to conduct studies and publish press releases in a hurry and take an inordinate amount of time to upload a preprint manuscript or conduct peer review, instead of the other way around. By the way, did you know India has three science academies?

  1. ICMR’s first seroprevalence survey (99% sure it isn’t out yet, but if I’m wrong, please let me know and link me to the paper?)
  2. Mumbai’s TIFR-NITI seroprevalence survey (100% sure. I asked TIFR when they plan to upload the paper, they said: “We are bound by BMC rules with respect to sharing data and hence we cannot give the raw data to anyone at least [until] we publish the paper. We will upload the preprint version soon.”)
  3. Biocon’s phase II Itolizumab trial (100% sure. More about irregularities here.)
  4. Delhi’s first seroprevalence survey (95% sure. Vinod Paul of NITI Aayog discussed the results but no paper has pinged my radar.)
  5. Delhi’s second seroprevalence survey (100% sure. Indian Express reported on August 8 that it has just wrapped up and the results will be available in 10 days. It didn’t mention a paper, however.)
  6. Bharat Biotech’s COVAXIN preclinical trials (90% sure)
  7. Papers of well-designed, well-powered studies establishing that HCQ, remdesivir, favipiravir and tocilizumab are efficacious against COVID-19 🙂

Aside from this, there have been many disease-transmission models whose results have been played up without discussing the specifics as well as numerous claims about transmission dynamics that have been largely inseparable from the steady stream of pseudoscience, obfuscation and carelessness. In one particularly egregious case, the Indian Council of Medical Research announced in a press release in May that Ahmedabad-based Zydus Cadila had manufactured an ELISA test kit for COVID-19 for ICMR’s use that was 100% specific and 98% sensitive. However, the paper describing the kit’s validation, published later, said it was 97.9% specific and 92.37% sensitive. If you know what these numbers mean, you’ll also know what a big difference this is, between the press release and the paper. After an investigation by Priyanka Pulla followed by multiple questions to different government officials, ICMR admitted it had made a booboo in the press release. I think this is a fair representation of how much the methods of science – which bridge first principles with the results – matter in India during the pandemic.
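To make the size of that gap concrete, here is a minimal sketch in Python. The cohort of 10,000 people and the 1% prevalence are hypothetical values picked purely for illustration, and the function name `expected_errors` is made up; only the sensitivity and specificity figures come from the press release and the paper as quoted above.

```python
def expected_errors(n_tested, prevalence, sensitivity, specificity):
    """Return expected (false_negatives, false_positives, ppv) for a test."""
    with_condition = n_tested * prevalence
    without_condition = n_tested - with_condition
    true_positives = with_condition * sensitivity
    false_negatives = with_condition - true_positives
    false_positives = without_condition * (1 - specificity)
    all_positives = true_positives + false_positives
    # PPV: the share of positive results that are actually correct.
    ppv = true_positives / all_positives if all_positives else float("nan")
    return false_negatives, false_positives, ppv


# Hypothetical cohort, assumed only for this illustration.
N_TESTED = 10_000
PREVALENCE = 0.01

press_release = expected_errors(N_TESTED, PREVALENCE, 0.98, 1.00)
paper = expected_errors(N_TESTED, PREVALENCE, 0.9237, 0.979)

for label, (fn, fp, ppv) in [("press release", press_release), ("paper", paper)]:
    print(f"{label}: ~{fn:.0f} missed cases, ~{fp:.0f} false alarms, PPV {ppv:.0%}")
```

Under these assumed numbers, the press-release figures imply no false positives at all, while the paper’s figures imply roughly 200 among the 10,000 people tested, so that fewer than a third of the positive results would be correct. A different assumed prevalence would change the exact counts but not the direction of the gap.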


The costs of correction

I was slightly disappointed to read a report in the New York Times this morning. Entitled ‘Two Huge COVID-19 Studies Are Retracted After Scientists Sound Alarms’, it discussed the implications of two large studies of COVID-19 recently being retracted by two leading medical journals they were published in, the New England Journal of Medicine and The Lancet. My sentiment stemmed from the following paragraph and some after:

I don’t know if just these two retractions raise troubling questions as if these questions weren’t already being asked well before these incidents. The suggestion that the lack of peer-review, or any form of peer-review at all in its current form (opaque, unpaid) could be to blame is more frustrating, as is the article’s own focus on the quality of the databases used in the two studies instead of the overarching issue. Perhaps this is yet another manifestation of the NYT’s crisis under Trump? 😀

One of the benefits of the preprint publishing system is that peer-review is substituted with ‘open review’. And one of the purposes of preprints is that the authors of a study can collect feedback and suggestions before publishing in a peer-reviewed journal instead of accruing a significant correction cost post-publication, in the form of corrections or retractions, both of which continue to carry a considerable amount of stigma. So as such, the preprints mode ensures a more complete, a more thoroughly reviewed manuscript enters the peer-review system instead of vesting the entire burden of fact-checking and reviewing a paper on a small group of experts whose names and suggestions most journals don’t reveal, and who are generally unpaid for their time and efforts.

In turn, the state of scientific research is fine. It would simply be even better if we reduced the costs associated with correcting the scientific record instead of heaping more penalties on that one moment, as the conventional system of publishing does. ‘Conventional’ – which in this sphere seems to be another word for ‘closed-off’ – journals also have an incentive to refuse to publish corrections or perform retractions because they’ve built themselves up on claims of being discerning, thorough and reliable. So retractions are a black mark on their record. Elisabeth Bik has often noted how long journals take to even acknowledge entirely legitimate complaints about papers they’ve published, presumably for this reason.

There really shouldn’t be any debate on which system is better – but sadly there is.


Poor journalism is making it harder for preprints

There have been quite a few statements by various scientists on Twitter who, in pointing to some preprint paper’s untenable claims, point to the manuscript’s identity as a preprint paper as well. This is not fair, as I’ve argued many times before. A big part of the problem here is bad journalism. Bad preprint papers are a problem not because their substance is bad but because people who are not qualified to understand why it is bad read it and internalise its conclusions at face value.

There are dozens of new preprint papers uploaded onto arXiv, medRxiv and bioRxiv every week making controversial arguments and/or arriving at far-fetched conclusions, often patronising to the efforts of the subject’s better exponents. Most of them (at least according to what I know of preprints on arXiv) are debated and laid to rest by scientists familiar with the topics at hand. No non-expert is hitting up arXiv or bioRxiv every morning looking for preprints to go crazy on. The ones that become controversial enough to catch the attention of non-experts have, nine times out of ten, been amplified to that effect by a journalist who didn’t suitably qualify the preprint’s claims and simply published it. Suddenly, scores (or more) of non-experts have acquired what they think is refined knowledge, and public opinion thereafter goes against the scientific grain.

Acknowledging that this collection of events is a problem on many levels, which particular event would you say is the deeper one?

Some say it’s the preprint mode of publishing, and when asked for an alternative, demand that the use of preprint servers be discouraged. But this wouldn’t solve the problem. Preprint papers are a relatively new development while ‘bad science’ has been published for a long time. More importantly, preprint papers improve public access to science, and preprints that contain good science do this even better.

Making sweeping statements against the preprint publishing enterprise because some preprints are bad is not fair, especially to non-expert enthusiasts (like journalists, bloggers, students) in developing countries, who typically can’t afford the subscription fees to access paywalled, peer-reviewed papers. (Open-access publishing is a solution too but it doesn’t seem to feature in the present pseudo-debate nor does it address important issues that beset itself as well as paywalled papers.)

Even more, if we admit that bad journalism is the problem, as it really is, we achieve two things: prevent ‘bad science’ from reaching the larger population and retain access to ‘good science’.

Now, to the finer issue of health- and medicine-related preprints: Yes, acting based on the conclusions of a preprint paper – such as ingesting an untested drug or paying too much attention to an irrelevant symptom – during a health crisis in a country with insufficient hospitals and doctors can prove deadlier than usual. But how on Earth could a person have found that preprint paper, read it well enough to understand what it was saying, and acted on its conclusions? (Put this way, a bad journalist could be even more to blame for enabling access to a bad study by translating its claims to simpler language.)

Next, a study published in The Lancet claimed – and thus allowed others to claim by reference – that most conversations about the novel coronavirus have been driven by preprint papers. (An article in Ars Technica on May 6 carried this provocative headline, for example: ‘Unvetted science is fuelling COVID-19 misinformation’.) However, the study was based on only 11 papers. In addition, those who invoke this study in support of arguments directed against preprints often fail to mention the following paragraph, drawn from the same paper:

“… despite the advantages of speedy information delivery, the lack of peer review can also translate into issues of credibility and misinformation, both intentional and unintentional. This particular drawback has been highlighted during the ongoing outbreak, especially after the high-profile withdrawal of a virology study from the preprint server bioRxiv, which erroneously claimed that COVID-19 contained HIV “insertions”. The very fact that this study was withdrawn showcases the power of open peer-review during emergencies; the withdrawal itself appears to have been prompted by outcry from dozens of scientists from around the globe who had access to the study because it was placed on a public server. Much of this outcry was documented on Twitter and on longer-form popular science blogs, signalling that such fora would serve as rich additional data sources for future work on the impact of preprints on public discourse. However, instances such as this one described showcase the need for caution when acting upon the science put forth by any one preprint.”

The authors, Maimuna Majumder and Kenneth Mandl, have captured the real problem. Lots of preprints are being uploaded every week and quite a few are rotten. Irrespective of how many do or don’t drive public conversations (especially on social media), it’s disingenuous to assume this risk by itself suffices to cut access.

Instead, as the scientists write, exercise caution. Instead of spoiling a good thing, figure out a way to improve the reporting habits of errant journalists. Otherwise, remember that nothing stops an irresponsible journalist from sensationalising the level-headed conclusions of a peer-reviewed paper either. All it takes is to quote from a grossly exaggerated university press-release and to not consult with an independent expert. Even opposing preprints with peer-reviewed papers only advances a false balance, comparing preprints’ access advantage to peer-review’s gatekeeping advantage (and even that is on shaky ground).


Distracting from the peer-review problem

From an article entitled ‘The risks of swiftly spreading coronavirus research’ published by Reuters:

A Reuters analysis found that at least 153 studies – including epidemiological papers, genetic analyses and clinical reports – examining every aspect of the disease, now called COVID-19 – have been posted or published since the start of the outbreak. These involved 675 researchers from around the globe. …

Richard Horton, editor-in-chief of The Lancet group of science and medical journals, says he’s instituted “surge capacity” staffing to sift through a flood of 30 to 40 submissions of scientific research a day to his group alone.

… much of [this work] is raw. With most fresh science being posted online without being peer-reviewed, some of the material lacks scientific rigour, experts say, and some has already been exposed as flawed, or plain wrong, and has been withdrawn.

“The public will not benefit from early findings if they are flawed or hyped,” said Tom Sheldon, a science communications specialist at Britain’s non-profit Science Media Centre. …

Preprints allow their authors to contribute to the scientific debate and can foster collaboration, but they can also bring researchers almost instant, international media and public attention.

“Some of the material that’s been put out – on pre-print servers for example – clearly has been… unhelpful,” said The Lancet’s Horton.

“Whether it’s fake news or misinformation or rumour-mongering, it’s certainly contributed to fear and panic.” …

Magdalena Skipper, editor-in-chief of Nature, said her group of journals, like The Lancet’s, was working hard to “select and filter” submitted manuscripts. “We will never compromise the rigour of our peer review, and papers will only be accepted once … they have been thoroughly assessed,” she said.

When Horton or Sheldon say some of the preprints have been “unhelpful” and that they cause panic among the people – which people do they mean? No non-expert person is hitting up bioRxiv looking for COVID-19 papers. They mean some lazy journalists and some irresponsible scientists are spreading misinformation, and frankly, addressing their habits would be more responsible than pointing fingers at preprints.

The Reuters analysis also says nothing about how well preprint repositories as well as scientists on social media platforms are conducting open peer-review, instead cherry-picking reasons to compose a lopsided argument against greater transparency in the knowledge economy. Indeed, crisis situations like the COVID-19 outbreak often seem to become ground zero for contemplating the need for preprints but really, no one seems to want to discuss “peer-reviewed” disasters like the one recently publicised by Elisabeth Bik. To quote from The Wire (emphasis added),

[Elisabeth] Bik, @SmutClyde, @mortenoxe and @TigerBB8 (all Twitter handles of unidentified persons), report – as written by Bik in a blog post – that “the Western blot bands in all 400+ papers are all very regularly spaced and have a smooth appearance in the shape of a dumbbell or tadpole, without any of the usual smudges or stains. All bands are placed on similar looking backgrounds, suggesting they were copy-pasted from other sources or computer generated.”

Bik also notes that most of the papers, though not all, were published in only six journals: Artificial Cells Nanomedicine and Biotechnology; Journal of Cellular Biochemistry; Biomedicine & Pharmacotherapy; Experimental and Molecular Pathology; Journal of Cellular Physiology; and Cellular Physiology and Biochemistry, all maintained by reputed publishers and – importantly – all of them peer-reviewed.


The scientist as inadvertent loser

Twice this week, I’ve had occasion to write about how science is an immutably human enterprise and therefore some of its loftier ideals are aspirational at best, and about how transparency is one of the chief USPs of preprint repositories and post-publication peer-review. As if on cue, I stumbled upon a strange case of extreme scientific malpractice that offered to hold up both points of view.

In an article published January 30, three editors of the Journal of Theoretical Biology (JTB) reported that one of their handling editors had engaged in the following acts:

  1. “At the first stage of the submission process, the Handling Editor on multiple occasions handled papers for which there was a potential conflict of interest. This conflict consisted of the Handling Editor handling papers of close colleagues at the Handling Editor’s own institute, which is contrary to journal policies.”
  2. “At the second stage of the submission process when reviewers are chosen, the Handling Editor on multiple occasions selected reviewers who, through our investigation, we discovered was the Handling Editor working under a pseudonym…”
  3. Many forms of reviewer coercion
  4. “In many cases, the Handling Editor was added as a co-author at the final stage of the review process, which again is contrary to journal policies.”

On the back of these acts of manipulation, this individual – whom the editors chose not to name for unknown reasons but one of whom all but identified on Twitter as a Kuo-Chen Chou (and backed up by an independent user) – proudly trumpets the following ‘achievement’ on his website:

The same webpage also declares that Chou “has published over 730 peer-reviewed scientific papers” and that “his papers have been cited more than 71,041 times”.

Without transparency[a] and without the right incentives, the scientific process – which I use loosely to denote all activities and decisions associated with synthesising, assimilating and organising scientific knowledge – becomes just as conducive to misconduct and unscrupulousness as any other enterprise if only because it allows people with even a little more power to exploit others’ relative powerlessness.

a. Ironically, the JTB article lies behind a paywall.

In fact, Chou had also been found guilty of similar practices when working with a different journal, called Bioinformatics, and an article its editors published last year has been cited prominently in the article by JTB’s editors.

Even if the JTB and Bioinformatics cases are exceptional for their editors having failed to weed out gross misconduct shortly after its first occurrence – it’s not; but although there are many such exceptional cases, they are still likely to be in the minority (an assumption on my part) – a completely transparent review process eliminates such possibilities as well as, and more importantly, naturally renders the process trustless[b]. That is, you shouldn’t have to trust a reviewer to do right by your paper; the system itself should be designed such that there is no opportunity for a reviewer to do wrong.

b. As in trustlessness, not untrustworthiness.

Second, it seems Chou accrued over 71,000 citations because the number of citations has become a proxy for research excellence irrespective of whether the underlying research is actually excellent – a product of the unavoidable growth of a system in which evaluators replaced a complex combination of factors with a single number. As a result, Chou and others like him have been able to ‘hack’ the system, so to speak, and distort the scientific literature (which you might’ve seen as the stack of journals in a library representing troves of scientific knowledge).

But as long as the science is fine, no harm done, right? Wrong.

If you visualised the various authors of research papers as points and the lines connecting them to each other as citations, an inordinate number would converge on the point of Chou – and they would be wrong, led there not by Chou’s prowess as a scientist but misled there by his abilities as a credit-thief and extortionist.

This graphing exercise isn’t simply a form of visual communication. Imagine your life as a scientist as a series of opportunities, where each opportunity is contested by multiple people and the people in charge of deciding who ‘wins’ at each stage aren’t some or all of well-trained, well-compensated or well-supported. If X ‘loses’ at one of the early stages and Y ‘wins’, Y has a commensurately greater chance of winning a subsequent contest and X, lower. Such contests often determine the level of funding, access to suitable guidance and even networking possibilities, so over multiple rounds, by virtue of the evaluators at each step having more reasons to be impressed by Y’s CV because, say, they had more citations, and fewer reasons to be impressed with X’s, X ends up with more reasons to exit science and switch careers.

Additionally, because of the resources that Y has received opportunities to amass, they’re in a better position to conduct even more research, ascend to even more influential positions and – if they’re so inclined – accrue even more citations through means both straightforward and dubious. To me, such prejudicial biasing resembles the evolution of a Lorenz attractor: the initial conditions might appear to be the same to some approximation, but for a single trivial choice, one scientist ends up being disproportionately more successful than another.
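The compounding described in the last two paragraphs can be sketched as a toy simulation in Python. Everything in it is a hypothetical modelling assumption rather than anything measured: X and Y start out identical, and each contest is won with a probability proportional to the wins accumulated so far (a Pólya-urn-style rule). The names `run_career`, `rounds` and `seed` are made up for the illustration.

```python
import random


def run_career(rounds, seed):
    """Simulate a sequence of contests between two equally placed scientists."""
    rng = random.Random(seed)
    wins = {"X": 1, "Y": 1}  # identical starting 'credit' (an assumption)
    for _ in range(rounds):
        total = wins["X"] + wins["Y"]
        # The chance of winning this round grows with past wins.
        winner = "X" if rng.random() < wins["X"] / total else "Y"
        wins[winner] += 1
    return wins


# Same starting conditions every time; only the sequence of coin flips differs.
for seed in range(5):
    print(seed, run_career(rounds=50, seed=seed))
```

Running this with different seeds tends to produce very different final splits even though nothing distinguishes X from Y at the start, which is the point: early, essentially arbitrary outcomes get amplified in later rounds.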

The answer of course is many things, including better ways to evaluate and reward research, and two of them in turn have to be to eliminate the use of numbers to denote human abilities and to make the journey of a manuscript from the lab to the wild as free of opaque, and therefore potentially arbitrary, decision-making as possible.

Featured image: A still from an animation showing the divergence of nearby trajectories on a Lorenz system. Caption and credit: MicoFilós/Wikimedia Commons, CC BY-SA 3.0.