A trumpet for Ramdev

The Print published an article entitled ‘Ramdev’s Patanjali does a ‘first’, its Sanskrit paper makes it to international journal’ on February 5, 2020. Excerpt:

In a first, international science journal MDPI has published a research paper in the Sanskrit language. Yoga guru Baba Ramdev’s FMCG firm Patanjali Ayurveda had submitted the paper. Switzerland’s Basel-based MDPI … published a paper in Sanskrit for the first time. Biomolecules, one of the peer-reviewed journals under MDPI, has carried video abstracts of the paper on a medicinal herb, but with English subtitles. … The Patanjali research paper, published on 25 January in a special issue of the journal titled ‘Pharmacology of Medicinal Plants’, is on medicinal herb ‘Withania somnifera’, commonly known as ‘ashwagandha’.

This article is painfully flawed.

1. MDPI is a publisher, not a journal. It featured on Beall’s list (with the customary caveats) and has published some obviously problematic papers. I’ve heard good things about some of its titles and bad things about others. The journalist should have delineated this instead of taking the simple fact of publication in a journal at face value. Even then, qualifying a journal as “peer-reviewed” doesn’t cut it anymore. At a time when peer-review can be hacked (thanks to its relative opacity) and the whole publishing process subverted for profit, all journalists writing on matters of science – as opposed to just science journalists – need to perform their own checks to verify the provenance of a published paper, especially if the name of the journal(s) and its exercise of peer-review are being employed in the narrative as markers of authority.

2. People want to publish research in English so others can discover and build on it. A paper written in Sanskrit is a gimmick. The journalist should have clarified this point instead of letting Ramdev’s minions (among the authors of the paper) claim brownie points for their feat. It’s a waste of effort, time and resources. More importantly, The Print has conjured a virtue out of thin air and broadcast asinine claims like “This is the first step towards the acceptance of ‘Sanskrit language’ in the field of research among the international community.”

3. The article has zero critique of the paper’s findings, no independent comments and no information about the study’s experimental design. This is the sort of nonsense that an unquestioning commitment to objectivity in news allows: reporters can’t just write that someone said something if what they said is wrong, misleading, harmful or all three. Magnifying potentially indefensible claims relating to scientific knowledge – or knowledge that desires the authority of science’s approval – without contextualising them and fact-checking them where necessary may be objective, but it is also a public bad. It pays to work with the assumption (even when it doesn’t apply) that at least 50% of your readers don’t know better. That way, even if only 1% of them actually don’t know better (an extremely conservative estimate for audiences in India) – a group that can easily run into the thousands – you avoid misinforming them by communicating too little.

4. A worryingly tendentious statement appears in the middle of the piece: “The study proves that WS seeds help reduce psoriasis,” the journalist writes, without presenting any evidence that she checked. It seems possible that the journalist believes she is simply reporting the occurrence of a localised event – in the form of the context-limited proof published in a paper – without acknowledging that proving a hypothesis is a process, not an event, in that it is ongoing. This holds regardless of how certain the experiment’s conclusions are: even if one scientist has established with 100% confidence that the experiment they designed has sustained their hypothesis, and has published their results in a legitimate preprint repository and/or a journal, other scientists will need to replicate the test, and still others are likely to have questions they’ll need answered.

5. The experiment was conducted in mice, not humans. Cf. @justsaysinmice

6. “‘We will definitely monetise the findings. We will be using the findings to launch our own products under the cosmetics and medicine category,’ Acharya [the lead author] told ThePrint.” It’s worrying to discover that the authors of the paper, and Baba Ramdev, who funded them, plan to market a product based on just one study, in mice, in a possibly questionable paper, without any independent comments about the findings’ robustness or tenability, to many humans who may not know better. But the journalist hasn’t pressed Acharya or any of the other authors on questions about the experiment or their attempt to grab eyeballs by writing and speaking in Sanskrit, or on how they plan to convince the FSSAI to certify a product for humans based on a study in mice.

The cycle

Is it just me or does everyone see a self-fulfilling prophecy here?

For a long time, and assisted ably by the ‘publish or perish’ paradigm, researchers sought to have their papers published in high-impact-factor journals – a.k.a. prestige journals – like Nature.

Such journals in turn, assisted ably by parasitic strategies, made these papers highly visible to other researchers around the world and, by virtue of being high-IF journals, tainted the results in the papers with a measure of prestige, ergo importance.

Evaluation and awards committees in turn were more aware of these papers than of others, and picked their authors for rewards over others, further amplifying their work, increasing the opportunity cost incurred by the researchers who lost out, and increasing the prestige attached to the high-IF journals.

Run this cycle a few million times and you end up with the impression that there’s something journals like Nature get right – when in fact it’s mostly just a bunch of business practices designed to ensure they remain profitable.

The case for preprints

Daniel Mansur, the principal investigator of a lab at the Universidade Federal de Santa Catarina that studies how cells respond to viruses, had this to say in an interview with eLife about why preprints are useful:

Let’s say the paper that we put in a preprint is competing with someone and we actually have the same story, the same set of data. In a journal, the editors might ask both groups for exactly the same sets of extra experiments. But then, the other group that’s competing with me works at Stanford or somewhere like that. They’ll order everything they need to do the experiments, and the next day three postdocs will be working on the project. If there’s something that I don’t have in the lab, I have to wait six months before starting the extra experiments. At least with a preprint the work might not be complete, but people will know what we did.

Preprints level the playing field by eliminating one’s “ability to publish” in high-IF journals as a meaningful measure of the quality of one’s work.

While this makes it easier for scientists to compete with their better-funded peers, my indefatigable cynicism suggests there must be someone out there who’s unhappy about this. Two kinds of people come immediately to mind: journal publishers and some scientists at highfalutin universities like Stanford.

Titles like Nature, Cell, the New England Journal of Medicine and Science, and especially those published by the Elsevier group, have ridden the impact factor (IF) wave to great profit through many decades. In fact, IF continues to be the dominant mode of evaluating research quality because it’s easy and not time-consuming, so – given how IF is defined – these journals continue to be important for being important. They also provide a valuable service – the double-blind peer review, which Mansur thinks is the only thing preprints currently lack. But other than that (and with post-publication peer-review being largely suitable), their time of obscene profits is surely running out.
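
Since the IF keeps coming up, here is a minimal sketch of how the standard two-year impact factor is computed – the figures in it are made up for illustration, not any journal’s actual numbers – which also shows why the metric rewards journals for already being highly cited.

```python
# Rough two-year journal impact factor: citations received this year to items
# the journal published in the previous two years, divided by the number of
# citable items it published in those two years. All figures below are
# hypothetical.

def impact_factor(citations_to_last_two_years: int,
                  citable_items_last_two_years: int) -> float:
    return citations_to_last_two_years / citable_items_last_two_years

# A journal that published 1,600 citable items over two years and drew 60,000
# citations to them this year would have an IF of 37.5. Because the metric
# counts citations to the journal as a whole, a highly cited journal keeps
# attracting the papers that keep the number high.
print(impact_factor(60_000, 1_600))   # 37.5
```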

The pro-preprint trend in scientific publishing is also bound to have jolted some scientists whose work received a leg-up by virtue of their membership in elite faculty groups. Like Mansur says, a scientist from Stanford or a similar institution can no longer claim primacy, or uniqueness, by default. As a result, preprints definitely improve the forecast for good scientists working at less-regarded institutions – but an equally important consideration would be whether preprints also diminish the lure of fancy universities. They do have one less thing to offer now, or at least in the future.

English as the currency of science’s practice

K. VijayRaghavan, the secretary of India’s Department of Biotechnology, has written a good piece in Hindustan Times about how India must shed its “intellectual colonialism” to excel at science and tech – particularly by shedding its obsession with the English language. This, as you might notice, parallels a post I wrote recently about how English plays an overbearing role in our lives, and particularly in the lives of scientists, because it remains a language most Indians don’t need in order to get through their days. Having worked closely with the government in drafting and implementing many policies related to the conduct and funding of scientific research in the country, VijayRaghavan is able to take a more fine-grained look at what needs changing and whether that’s possible. Most hearteningly, he says it is – only if we have the will to change. As he writes:

Currently, the bulk of our college education in science and technology is notionally in English whereas the bulk of our high-school education is in the local language. Science courses in college are thus accessible largely to the urban population and even when this happens, education is effectively neither of quality in English nor communicated as translations of quality in the classroom. Starting with the Kendriya Vidyalayas and the Nayodya Vidyalayas as test-arenas, we can ensure the training of teachers so that students in high-school are simultaneously taught in both their native language and in English. This already happens informally, but it needs formalisation. The student should be free to take exams in either language or indeed use a free-flowing mix. This approach should be steadily ramped up and used in all our best educational institutions in college and then scaled to be used more widely. Public and private colleges, in STEM subjects for example, can lead and make bi-lingual professional education attractive and economically viable.

Apart from helping students become more knowledgeable about the world through a language of their choice (for the execution of which many logistical barriers spring to mind, not the least of which is finding teachers), it’s also important to fund academic journals that allow these students to express their research in their language of choice. Without this component, they will be forced to fall back on English, which is bound to be counterproductive to the whole enterprise. This form of change will require material resources as well as a shift in perspective that could be harder to attain. Additionally, as VijayRaghavan mentions, there also need to be good-quality translation services so that research in one language can be expressed in another and cross-disciplinary and/or cross-linguistic tie-ups are not hampered.

Featured image credit: skeeze/pixabay.

The language and bullshitness of ‘a nearly unreadable paper’

Earlier today, the Retraction Watch mailing list highlighted a strange paper written by a V.M. Das disputing the widely accepted fact that our body clocks are regulated by the gene-level circadian rhythm. The paper is utter bullshit. Sample its breathless title: ‘Nobel Prize Physiology 2017 (for their discoveries of molecular mechanisms controlling the circadian rhythm) is On Fiction as There Is No Molecular Mechanisms of Biological Clock Controlling the Circadian Rhythm. Circadian Rhythm Is Triggered and Controlled By Divine Mechanism (CCP – Time Mindness (TM) Real Biological Clock) in Life Sciences’.

The use of language here is interesting. Retraction Watch called the paper ‘unreadable’ in the headline of its post because that’s obviously a standout feature of this paper. I’m not sure why Retraction Watch is highlighting nonsense papers on its pages – watched by thousands every day for intriguing retraction reports informed by the reporting of its staff – but I’m going to assume its editors want to help all their readers set up their own bullshit filters. And the best way to do this, as I’ve written before, is to invite readers to participate in understanding why something is bullshit.

However, to what extent do we think unreadability is a bullshit indicator? And from whose perspective?

There’s no exonerating the ‘time mindness’ paper because those who get beyond the language are able to see that it’s simply not even wrong. But if you had judged it only by its language, you would’ve landed yourself in murky waters. In fact, no paper should be judged by how it exercises the grammar of the language its authors have decided to write it in. Two reasons:

1. English is not the first language for most of India. Those who’ve been able to afford an English-centred education growing up or hail from English-fluent families (or both) are fine with the language, but I remember most of my college professors preferring Hindi in the classroom. And I assume that’s the picture in most universities, colleges and schools around the country. You only need access to English if you’ve also had the opportunity to afford a certain lifestyle (a cosmopolitan one, for example).

2. There are not enough good journals publishing in vernacular languages in India – at least not that I know of. The ‘best’ is automatically the one in English, among other factors. Even the government thinks so. Earlier this year, the University Grants Commission published a ‘preferred’ list of journals; only papers published herein were to be considered for career advancement evaluations. The list left out most major local-language publications.

Now, imagine the scientific vocabulary of a researcher who prefers Hindi over English, for example, because of her educational upbringing as well as to teach within the classroom. Wouldn’t it be composed of Latin and English jargon suspended from Hindi adjectives and verbs, a web of Hindi-speaking sensibilities straining to sound like a scientist? Oh, that recalls a third issue:

3. Scientific papers are becoming increasingly hard to read, with many scientists choosing to actively include words they wouldn’t use around the dinner table because they like how the ‘sciencese’ sounds. In time, to write like this becomes fashionable – and to not write like this becomes a sign of complacency, disinterest or disingenuousness.

… to the mounting detriment of those who are not familiar with even colloquial English in the first place. To sum up: if a paper shows other, more ‘proper’ signs of bullshit, then it is bullshit no matter how much its author struggled to write it. On the other hand, a paper can’t be suspected of badness just because its language is off – nor can it be called bad as such if that’s all that is off about it.

This post was composed entirely on a smartphone. Please excuse typos or minor formatting issues.

A conference’s peer-review was found to be sort of random, but whose fault is it?

It’s not a good time for peer-review. Sure, if you’ve been a regular reader of Retraction Watch, it’s never been a good time for peer-review. But aside from that, the process has increasingly been taking the blame for failing to stem the publication of results that – after publication – have been found to be the product of bad research practices.

The problem may be that the reviewers are letting the ‘bad’ papers through but the bigger issue is that, while the system itself has been shown to have many flaws – not excluding personal biases – journals rely on the reviewers and naught else to stamp accepted papers with their approval. And some of those stamps, especially from Nature or Science, are weighty indeed. Now add to this muddle the NIPS wrangle, where researchers may have found that some peer-reviews are just arbitrary.

NIPS stands for the Neural Information Processing Systems (Foundation), whose annual conference was held in the second week of December 2014, in Montreal. It’s considered one of the few main conferences in the field of machine-learning. Around the time, two attendees – Corinna Cortes and Neil Lawrence – performed an experiment to judge how arbitrary the conference’s peer-review could get.

Their modus operandi was simple. All the papers submitted to the conference were peer-reviewed before they were accepted. Cortes and Lawrence then routed a tenth of all submitted papers through a second peer-review stage, and observed which papers were accepted or rejected in the second stage (according to Eric Price, NIPS ultimately accepted a paper if either group of reviewers accepted it). Their findings were distressing.

About 57%* of all papers accepted in the first review were rejected during the second review. To be sure, each stage of the review was presumably equally competent – it wasn’t as if the second stage was more stringent than the first. That said, 57% is a very big number. More than five times out of 10, peer-reviewers disagreed on what could be published. In other words, in an alternate universe, the same conference but with only the second group of reviewers in place was generating different knowledge.

Lawrence was also able to eliminate a possibly redeeming confounding factor, which he described in a Facebook discussion on this experiment:

… we had a look through the split decisions and didn’t find an example where the reject decision had found a ‘critical error’ that was missed by the accept. It seems that there is quite a lot of subjectivity in these things, which I suppose isn’t that surprising.

It doesn’t bode well that the NIPS conference is held in some esteem among its attendees for having one of the better reviewing processes. Including the 90% of the papers that did not go through a second peer-review, the predetermined acceptance rate was 22%, i.e. reviewers were tasked with accepting 22 papers out of every 100 submitted. Put another way, the reviewers were rejecting 78%. And this casts their actions in a more troubling light.

If the reviewers had been rejecting papers at random, they would’ve done so at the tasked rate of 78%. At NIPS, one can only hope that they weren’t – so the second group was purposefully rejecting 57% of the papers that the first group had accepted. In an absolutely non-random, logical world, this number should have been 0%. So the fact that 57% is closer to 78% than to 0% implies that some of the rejection was effectively random. Hmm.
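
To make those reference points concrete, here is a minimal sketch of the arithmetic – my own illustration, not anything from Cortes and Lawrence: the 22% and 57% figures are the ones reported for the experiment, and the final ratio is just one crude way of reading them.

```python
# How often should a second, independent committee reject a paper the first
# committee accepted? The NIPS figures are as reported; the rest is elementary
# conditional probability.
accept_rate = 0.22

# Fully random second review (decisions independent of the first committee):
# the chance of rejecting an already-accepted paper is just the overall
# rejection rate.
p_random = 1 - accept_rate      # 0.78

# Fully consistent review: identical criteria, so no accepted paper gets
# rejected the second time around.
p_consistent = 0.0

# What the experiment observed.
p_observed = 0.57

# One crude reading: how far the observed value sits along the line from
# "fully consistent" (0.0) to "fully random" (0.78).
print(f"{(p_observed - p_consistent) / (p_random - p_consistent):.2f}")   # ~0.73
```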

While this is definitely cause for concern, forging ahead on the basis of arbitrariness – which machine-learning theorist John Langford defines as the probability that the second group rejects a paper that the first group has accepted – wouldn’t be the right way to go about it. This is similar to the case with A/B-testing: we have a test whose outcome can be used to inform our consequent actions, but using the test itself as a basis for the solution wouldn’t be right. For example, the arbitrariness can be reduced to 0% simply by having both groups accept every nth paper – a meaningless exercise.

Is our goal to reduce the arbitrariness to 0% at all? You’d say ‘Yes’, but consider the volume of papers being submitted to important conferences like NIPS and the number of reviewer-hours available to evaluate them. In the history of conferences, surely some judgments must have been arbitrary for the reviewer to have fulfilled his/her responsibilities to his/her employer. So you see the bigger issue: it’s not the reviewers alone so much as the so-called system that’s flawed.

Langford’s piece raises a similarly confounding topic:

Perhaps this means that NIPS is a very broad conference with substantial disagreement by reviewers (and attendees) about what is important? Maybe. This even seems plausible to me, given anecdotal personal experience. Perhaps small highly-focused conferences have a smaller arbitrariness?

Problems like these are necessarily difficult to solve because of the number of players involved. In fact, it wouldn’t be entirely surprising if we found that nobody or no institution was at fault except how they were all interacting with each other, and not just in fields like machine-learning. A study conducted in January 2015 found that minor biases during peer-review could result in massive changes in funding outcomes if the acceptance rate was low – such as with the annual awarding of grants by the National Institutes of Health. Even Nature is wary about the ability of its double-blind peer-review to solve the problems ailing normal ‘peer-review’.
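
The mechanism behind that finding is easy to illustrate with a toy simulation – this is my own sketch with made-up numbers, not the January 2015 study’s model: the lower the acceptance rate, the larger the relative advantage that the same small bias in review scores confers.

```python
# Toy simulation (illustrative assumptions only): award funding to everything
# above a score cutoff, and give one group of applicants a small, uniform
# bump in their review scores.
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000                      # hypothetical applications per group
bias = 0.1                         # small score bump for the favoured group

unbiased = rng.normal(0.0, 1.0, n)          # scores without bias
favoured = rng.normal(0.0, 1.0, n) + bias   # scores with the small bias

for accept_rate in (0.5, 0.2, 0.1):
    # Cutoff set so that the unbiased pool is funded at `accept_rate`.
    cutoff = np.quantile(unbiased, 1 - accept_rate)
    advantage = (favoured > cutoff).mean() / (unbiased > cutoff).mean()
    print(f"acceptance rate {accept_rate:.0%}: favoured group funded "
          f"{advantage:.2f}x as often")
# The same 0.1-point bump buys a bigger relative advantage as the acceptance
# rate falls, which is the effect the post attributes to the study.
```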

For the near future, the only takeaway is likely to be that ambitious young scientists will have to remember, first, that acceptance – just as well as rejection – can be arbitrary and, second, that the impact factor isn’t everything. On the other hand, it doesn’t seem possible in the interim to keep from lowering our expectations of peer-reviewing itself.

*The number of papers routed to the second group after the first was 166. The overall disagreement rate was 26%, so they would have disagreed on the fates of 43. And because they were tasked with accepting 22% – which is 37 or 38 – group 1 could be said to have accepted 21 that group 2 rejected, and group 2 could be said to have accepted 22 that group 1 rejected. Between 21/37 (56.7%) and 22/38 (57.8%) is 57%.
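
For anyone retracing the footnote, the arithmetic looks like this in a small sketch – it only re-derives the numbers quoted above and adds nothing new.

```python
# Retracing the footnote: 166 papers went through both committees, the overall
# disagreement rate was 26% and the target acceptance rate was about 22%.
papers_reviewed_twice = 166
disagreement_rate = 0.26
accept_rate = 0.22

disagreements = round(disagreement_rate * papers_reviewed_twice)
accepted_per_committee = accept_rate * papers_reviewed_twice
print(disagreements)            # 43
print(accepted_per_committee)   # 36.52, i.e. roughly the 37-38 the footnote uses

# The 43 disagreements split roughly evenly: group 2 rejected 21 of the ~37
# papers group 1 accepted, and group 1 rejected 22 of the ~38 papers group 2
# accepted.
print(round(21 / 37, 3), round(22 / 38, 3))   # 0.568 0.579, i.e. roughly 57%
```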

Hat-tip: Akshat Rathi.

Some research misconduct trends by the numbers

A study published in eLife on August 14, 2014, looked at data pertaining to papers published between 1992 and 2012 that the Office of Research Integrity had determined contained research misconduct. From the abstract:

Data relating to retracted manuscripts and authors found by the Office of Research Integrity (ORI) to have committed misconduct were reviewed from public databases. Attributable costs of retracted manuscripts, and publication output and funding of researchers found to have committed misconduct were determined. We found that papers retracted due to misconduct accounted for approximately $58 million in direct funding by the NIH between 1992 and 2012, less than 1% of the NIH budget over this period. Each of these articles accounted for a mean of $392,582 in direct costs (SD $423,256). Researchers experienced a median 91.8% decrease in publication output and large declines in funding after censure by the ORI.

While the number of retractions worldwide is on the rise – also because the numbers of papers being published and of journals are on the rise – the study addresses a subset of these papers and only those drawn up by researchers who received funding from the National Institutes of Health (NIH).

Among them, there is no discernible trend in terms of impact factors and attributable losses. In the chart below, the size of each datapoint corresponds to the direct attributable loss and its color, to the impact factor of the journal that published the paper.

[Chart: retracted papers, with datapoint size showing direct attributable loss and colour showing journal impact factor]

However, is the time to retraction dropping?

The maximum time to retraction has been on the decline since 1997. However, on average, the time to retraction is still fluctuating, influenced as it is by the number of papers retracted and the nature of misconduct.

[Chart: trend in time to retraction]

No matter the time to retraction or the impact factors of the journals, most scientists experience a significant difference in funding before and after the ORI report comes through, as the chart below shows, sorted by quanta of funds. The right axis displays total funding pre-ORI and the left, total funding post-ORI.

[Chart: total funding received by each researcher before (right axis) and after (left axis) censure by the ORI]

As the study’s authors summarize in their abstract: “Researchers experienced a median 91.8% decrease in publication output and large declines in funding after censure by the ORI,” while total funding toward all implicated researchers went from $131 million to $74.5 million.

There could be some correlation between the type of misconduct and the decline in funding, but there’s not enough data to determine that. Nonetheless, there are eight instances in 1992-2012 when the amount of funding increased after the ORI report; the lowest such rise is seen for John Ho, who committed fraud, and the highest for Alan Landay, implicated for plagiarism, a ‘lesser’ charge.

[Chart: instances of funding increasing after the ORI report]

From the paper:

The personal consequences for individuals found to have committed research misconduct are considerable. When a researcher is found by the ORI to have committed misconduct, the outcome typically involves a voluntary agreement in which the scientist agrees not to contract with the United States government for a period of time ranging from a few years to, in rare cases, a lifetime. Recent studies of faculty and postdoctoral fellows indicate that research productivity declines after censure by the ORI, sometimes to zero, but that many of those who commit misconduct are able to find new jobs within academia (Redman and Merz, 2008, 2013). Our study has found similar results. Censure by the ORI usually results in a severe decrease in productivity, in many cases causing a permanent cessation of publication. However the exceptions are instructive.

Retraction Watch reported the findings with a special focus on the cost of research misconduct. They spoke to Daniele Fanelli, one part of whose quote is notable – albeit no less than the rest.

The question of collateral damage, by which I mean the added costs caused by other research being misled, is controversial. It still has to be conclusively shown, in other words, that much research actually goes wasted directly because of fabricated findings. Waste is everywhere in science, but the role played by frauds in generating it is far from established and is likely to be minor.

References

Stern, A.M., Casadevall, A., Steen, R.G. and Fang, F.C. (2014), ‘Financial costs and personal consequences of research misconduct resulting in retracted publications’, eLife 3:e02956, August 14, 2014.

Plagiarism is plagiarism

In a Nature article, Praveen Chaddah argues that textual plagiarism entails that the offending paper only carry a correction and not be retracted because that makes the useful ideas and results in the paper unavailable. On the face of it, this is an argument that draws a distinction between the writing of a paper and the production of its technical contents.

Chaddah proposes to preserve the distinction for the benefit of science by punishing plagiarists only for what they plagiarized. If they pinched text, then issue a correction and an apology but let the results stay. If they pinched the hypothesis or results, then retract the paper. He thinks this is justifiable because, this way, one does not retard the introduction of new ideas into the pool of knowledge, and because it does not harm the notion of “research as a creative enterprise” as long as the hypothesis, method and/or results are original.

I disagree. Textual plagiarism is also the violation of an important creative enterprise that, in fact, has become increasingly relevant to science today: communication. Scientists have to use communication effectively to convince people that their research deserves tax-money. Scientists have to use communication effectively to make their jargon understandable to others. Plagiarizing the ‘descriptive’ part of papers, in this context, is to disregard the importance of communication, and copying the communicative bits should be tantamount to copying the results, too.

He goes on to argue that if textual plagiarism has been detected but the hypothesis/results are original, the latter must be allowed to stand. His argument appears to assume that scientific journals are the same as specialist forums that prioritize results over the full package: introduction, formulation, description, results, discussion, conclusion, etc. Scientific journals are not just the “guarantors of the citizen’s trust in science” (The Guardian) but also resources that people like journalists, analysts and policy-makers use to understand the extent of that guarantee.

What journalist doesn’t appreciate a scientist who’s able to articulate his/her research well, let alone begrudge him/her the publicity it will bring?

In September 2013, the journal PLoS ONE retracted a paper by a group of Indian authors for textual plagiarism. The incident exemplified a disturbing attitude toward plagiarism. One of the authors of the paper, Ram Dhaked, complained that it was PLoS ONE’s duty to detect their plagiarism before publishing it, glibly abdicating responsibility.

Like Chaddah argues, authors of a paper could be plagiarizing text for a variety of reasons – but somehow they believe lifting chunks of text from other papers during the paper-production process is allowable or will go unchecked. As an alternative to this, publishers could consider – or might already be considering – the ethics of ghost-writing.

He finally posits that papers with plagiarized text should be made available along with the correction, too. That would increase the visibility of the offense and, over time, presumably shame scientists into not plagiarizing – but that’s not the point. The point is to get scientists to understand why it is important to think about what they’ve done and to communicate their thoughts. That journals retract both the text and the results even when only the text was plagiarized is an important way to reinforce that point. If anything, Chaddah could instead have argued for reducing the implications of having a retraction against one’s bio.

R&D in China and India

“A great deal of the debate over globalization of knowledge economies has focused on China and India. One reason has been their rapid, sustained economic growth. The Chinese economy has averaged a growth rate of 9-10 percent for nearly two decades, and now ranks among the world’s largest economies. India, too, has grown steadily. After years of plodding along at an average annual increase in its gross domestic product (GDP) of 3.5 percent, India has expanded by 6 percent per annum since 1980, and more than 7 percent since 1994 (Wilson and Purushothaman, 2003). Both countries are expected to maintain their dynamism, at least for the near future.”

– Gereffi et al, ‘Getting the Numbers Right: International Engineering Education in the United States, China and India’, Journal of Engineering Education, January 2008

A June 16 paper in Proceedings of the National Academy of Sciences, titled ‘China’s Rise as a Major Contributor to Science and Technology’, analyses the academic and research environment in China over the last decade or so, and discusses the factors involved in the country’s increasing fecundity in recent years. It concludes that four factors have played an important role in this process:

  1. Large human capital base
  2. A labor market favoring academic meritocracy
  3. A large diaspora of Chinese-origin scientists
  4. A centralized government willing to invest in science

A simple metric they cite to make their point is the publication trends by country. Between 2000 and 2010, for example, the number of science and engineering papers published by China has increased by 470%. The next highest climb was for India, by 234%.

[Interactive chart: growth in science and engineering papers published, by country]

“The cheaters don’t have to worry they will someday be caught and punished.”

This is a quantitative result. A common criticism of the rising volume of Chinese scientific literature in the last three decades is the quality of research coming out of it. Dramatic increases in research output are often accompanied by a publish-or-perish mindset that fosters a desperation among scientists to get published, leading to padded CVs, falsified data and plagiarism. Moreover, it’s plausible that since R&D funding in China is still controlled by a highly centralized government, flow of money is restricted and access to it is highly competitive. And when it is government officials that are evaluating science, quantitative results are favored over qualitative ones, reliance on misleading performance metrics increases, and funds are often awarded for areas of research that favor political agendas.

For this, the PNAS paper cites the work of Shi-min Fang, a science writer who won the inaugural John Maddox Prize in 2012 for exposing scientific fraud in Chinese research circles. In an interview with New Scientist in November of that year, he explained the source of the widespread misconduct:

It is the result of interactions between totalitarianism, the lack of freedom of speech, press and academic research, extreme capitalism that tries to commercialise everything including science and education, traditional culture, the lack of scientific spirit, the culture of saving face and so on. It’s also because there is not a credible official channel to report, investigate and punish academic misconduct. The cheaters don’t have to worry they will someday be caught and punished.

At this point, it’s tempting to draw parallels with India. While China has seen increased funding for R&D…

[Interactive chart: R&D funding in China]

… India has been less fortunate.

[Interactive chart: R&D funding in India]

The issue of funding is slightly different in India, in fact. While Chinese science is obstinately centralized and publicly funded, Indian science is centralized in some parts and decentralized in others; public funding is not high enough, presumably because we lack the meritocratic academic environment, and private funding is not as high as it needs to be.

[Interactive chart]

Even though the PNAS paper’s authors say their breakdown of what has driven scientific output from China could inspire changes in other countries, India is faced with different issues, as the charts above show. Indeed, the very first chart shows how, despite the number of published papers having doubled in the last decade, we have only jumped from one small number to another small number.

“Scientific research in India has become the handmaiden of defense technology.”

There is also a definite lack of visibility: little scientific output of any kind is accessible to 1) the common man, and 2) the world outside. Apart from minimal media coverage, there is a paucity of scientific journals – or they exist but are not well known, accessible or both. This Jamia Millia collection lists a paltry 226 journals – including those in regional languages – but it’s likelier that there are hundreds more, both credible and dubious. A journal serves as an aggregation of reliable scientific knowledge not just for scientists but also for journalists and other reliant decision-makers. It is one place to find the latest developments.

In this context, Current Science appears to be the most favored journal in the country, not to mention the loneliest. Then again, a couple of fingers can be pointed at years of reliance on quantitative performance metrics, which drive many Indian researchers to publish in journals with very high impact factors, such as Nature or Science, which are often based outside the country.

In the absence of lists of Indian and Chinese journals, let’s turn to a table used in the PNAS paper showing the average number of citations per article as a percentage of the US figure. It shows both India and China close to 40% in 2010-2011.

The poor showing may not be a direct consequence of low quality. For example, a paper may have detailed research conducted to resolve a niche issue in Indian defense technology. In such a case, the quality of the article may be high but the citability of the research itself will be low. Don’t be surprised if this is common in India given our devotion to the space and nuclear sciences. And perhaps this is what a friend of mine referred to when he said “Scientific research in India has become the handmaiden of defense technology”.

To sum up: although India and China both lag behind the USA and the EU in the productivity and value of research (at least by quantitative metrics), China is facing problems associated with the maturity of a voluminous scientific workforce, whereas India is quite far from that maturity. The PNAS paper is available here. If you’re interested in an analysis of engineering education in the two countries, see this paper (from which the opening lines of this post were borrowed).