The overlay bias

I’m not very fond of some highly popular pieces of writing (I won’t name them because I’m nervous about backlash from authors and/or their supporters) because a part of their popularity is undeniably rooted in technological ‘solutions’ that asymmetrically promote work published in the solution’s country of origin.

My favourite example is Pocket, the app that allows users to save copies of articles to read later, offline if required. Not long ago, Pocket introduced an extension for the Google Chrome browser (which counts hundreds of millions of users) such that every time you opened a new tab, it would show you three articles that lots of other Pocket users had read and liked. It’s fairly brainless, ergo presumably non-malicious, and you’d expect the results to be drawn evenly from magazines, journals, etc. published around the world.

However, nine times out of ten, if not more often, I’d find articles by NYT, The Atlantic, The Baffler, etc. there. I was reluctant to blame Pocket at first, considering their algorithm seemed too simple, but then I realised Pocket was just the last in a long line of apps and algorithms that simply amplified existing biases.

Before Pocket, for example, there might have been Twitter, Facebook or some other platform that allowed stories from some domains to persist for longer on users’ feeds because they were more easily perceived to be legitimate than articles from other sources, say, a Venezuelan newspaper, a Kenyan blog, a Pakistani magazine or a Vietnamese journal. Or there might have been Nuzzle, which auto-compiles a digest of the articles that your friends on social media have shared most – likely unmindful of the fact that people quite often share headlines, or domains they’d like to be known to be reading, instead of the articles themselves.

This is a social magnification like the biological magnification in nature, whereby toxic substances pile up in greater quantities in the gizzards of animals higher up in the food chain. Here, perceptions of legitimacy and quality accumulate in greater quantities in the feeds and timelines of people who consume, or even glance through, the most information. And this way, a general consciousness of what’s considered desirable erects itself without anything drastic, with just the more fleeting and mindless actions of millions of people, into a giant wheel of information distribution that constantly feeds itself its own momentum.

As the wheel turns, and The Atlantic publishes an article, it doesn’t just publish a good article that draws hundreds of thousands of readers. It also rides a wheel set in motion by American readers, American companies, American developers, American interests and American dollars, with a dollop of historical imperialism, that quietly but surely brings the world a good article plus a good-natured reminder that The Atlantic is good and that readers needn’t go looking for anything else because The Atlantic has them covered.

As I wondered in 2017, and still do: “Will my peers in India have been farther along in their careers had there been an equally influential Indian for-publishers tech stack?” Then again, how much is one more amplifier, Pocket or anything else, going to change?

I went into this tirade because of this Twitter thread, which describes a similar issue with arXiv – the popular preprint repo for physical sciences, computer science and applied mathematics papers (don’t @ me to quibble over arXiv’s actual remit). As the tweeter Jia-Bin Huang writes, the manuscripts that were uploaded last – i.e. most recently – to arXiv are displayed on top of the output stack, and what’s displayed on top of the stack gets more citations and readership.

This is a very simple algorithm, quite like Pocket’s algorithm, but in both cases they’re algorithms overlaid on existing bias-amplifying architectures. In a sense, they’re akin to the people who might stand by and watch a lynching, neither egging the perpetrators on nor stopping them. If the metaphor is brutal, remember that the effects on any publication or scientist that can’t infiltrate or ‘hack’ social biases are brutal as well. While their contents and their ideas might deserve international readership, these publications and scientists will need to spend more – energy, resources, effort – to grab international attention again and again.

The example Jia-Bin Huang cites is of scientists in Asia, who – unlike their American counterparts – can’t upload a paper to arXiv just before the deadline so that it sits on top of the stack, because 2 pm in New York is 3 am in Taipei.

As some replies to the thread indicated, the people maintaining arXiv can easily solve the problem by waiting for the deadline to pass, then randomising the order of papers displayed in its email blast – but as Jia-Bin Huang notes, doing that would mean negating the just-in-time advantage that arXiv’s American users enjoy. So here we are.
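The fix those replies propose is small enough to sketch. What follows is only an illustration under my own assumptions, not a description of how arXiv actually works: the Submission record, the announcement_order function and the seed parameter are all hypothetical names. It collects everything received up to the deadline and shuffles the batch, so the timestamp no longer decides who sits on top.

```python
import random
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Submission:
    # Hypothetical record; a real system would carry far more metadata.
    paper_id: str
    submitted_at: datetime


def announcement_order(submissions, deadline, seed):
    """Return the papers received up to the deadline, in random order."""
    # Anything submitted after the deadline waits for the next batch.
    batch = [s for s in submissions if s.submitted_at <= deadline]
    # A seeded generator makes the shuffle reproducible, so the order
    # could in principle be audited after the fact.
    rng = random.Random(seed)
    rng.shuffle(batch)
    return batch
```

With the order fixed only after the deadline has passed, uploading at 3 am in Taipei buys nothing that uploading the previous afternoon wouldn’t.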

It isn’t hard to see how we can extend the same suggestion to the world’s Pockets and Nuzzles. Pick your millions of users’ thousand most-read articles, mix up their order – even weigh down popular American publishers if necessary – and finally advertise the first ten items from this list. But ultimately, until technological solutions actively negate the biases they overlie, Pocket will lie on the same spectrum as the tools that produce the biases. I admit fact-checking in this paradigm could be labour-intensive, as could relevance-checking vis-à-vis arXiv, but I also think these would be better problems to solve.
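For the curious, here is roughly what the pick-shuffle-advertise recipe for Pocket could look like in code. It is a minimal sketch under my own assumptions and has nothing to do with Pocket’s real system: pick_recommendations, read_counts, publisher_of and publisher_weight are hypothetical names, and the weighted shuffle (each article gets a random key raised to the power 1/weight, a standard trick for weighted sampling without replacement) is just one way to ‘weigh down’ a publisher.

```python
import heapq
import random
from collections import Counter


def pick_recommendations(read_counts, publisher_of, publisher_weight,
                         pool_size=1000, shown=10, rng=None):
    """Pick `shown` articles at random from the `pool_size` most-read ones.

    read_counts:      article id -> number of reads
    publisher_of:     article id -> publisher name
    publisher_weight: publisher name -> weight > 0; values below 1.0 make
                      that publisher's articles less likely to be shown.
                      Publishers not listed get a weight of 1.0.
    """
    rng = rng or random.Random()
    # Step 1: the pool is the most-read articles.
    pool = [article for article, _ in Counter(read_counts).most_common(pool_size)]
    # Step 2: mix up the order. Each article draws a random key u ** (1 / w);
    # a smaller weight w pushes the key towards zero.
    keyed = []
    for article in pool:
        weight = publisher_weight.get(publisher_of[article], 1.0)
        keyed.append((rng.random() ** (1.0 / weight), article))
    # Step 3: advertise the articles with the largest keys.
    return [article for _, article in heapq.nlargest(shown, keyed)]
```

Note that once an article is in the pool, its read count no longer matters; only the publisher weight and chance do, which is the whole point of mixing up the order.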


The post-reporter era

One of the foundation stones of journalism is the process of reporting. That there is a messenger working the gap between an event and a story allows news to exist, and to exist with myriad nuances attached to it. There are ethical and moral issues, technical considerations, writing styles, and presentation formats to perfect. The entire news-publishing industry is centered on the activities of reporters and on streamlining them.

What the reporter requires the most is… well, a few things. The first is a domain of events, from which he picks issues to talk about. The second is a domain of stories, into which he publishes his reports. The third is a platform that gives him an incentive to do this work, and the tools with which he may publish his stories efficiently and effectively. This last entity is more commonly understood in the form of a publishing house.

The reason I’ve broken the working of a reporter into these categories is to understand what makes a reporter at all. Today, a reporter is most commonly understood to be an individual who is employed by a publishing house and publishes stories for it. Ideally, however, everyone is a reporter: the simple act of creating knowledge from the experiences around them should be qualification enough. This calls into question the role of a publishing house: is it a platform that helps reporters function efficiently, or is it an employer of reporters?

If it’s an employer of reporters, then no publishing house would have to worry about where the course of journalism is going to take the organization itself. Reporters will have to change the way they work – how they spot issues, how they evolve their writing styles to suit their audiences, and so forth – but the publishing house will retain ownership of the reporters themselves. As long as it’s not a platform which individuals use to function as reporters, things are going to be fine.

Now, let’s move to the post-reporter era, where everyone is a reporter (of course, that’s an idealized image, but even so). In this world, a reporter is not someone who works for a publishing house – that aspect of the word’s meaning is left behind in the age of the publishing house. In this world, a reporter is someone who works simply as a messenger between the domains of events and stories, where the role of the publishing house as the owner of reportage is absent.

The nature of such a world throws light on the valuation of information. When multiple reporters cover different events and return to HQ to file their stories, the house decides which stories make the cut and which don’t on the basis of a set of parameters. In other words, the house creates and assigns a particular value to each story, and then compares the values of different stories to determine their destiny.

In the post-reporter era, which is likely to be occupied by channels of individual presentation – ranging from word-of-mouth to full-scale websites – houses that thrive today on the valuation of information, and on the importance their readers place on that valuation, will steadily fade out. What remains will be an all-encompassing form of what is known as citizen journalism (CJ) today. Houses take to CJ because of the mutually beneficial relationship available therein: the CJ gets the coverage and the advantage of the issue pursued no longer being under wraps; the reporter gets a story that has both civic/criminal and human-interest angles to it.

However, when the CJ voids the relationship by refusing the intervention of a publishing/broadcasting house, and chooses to take his story straight to the people through a channel he finds effective enough, the house-level valuation of stories is replaced by a democratic institution that may or may not be guided by a paternalistic attitude.

Therefore, if a particular house has to survive into the post-reporter era, it must discard issue-valuation as an engine and instead rely on some other entity, such as one represented by a parameter whose efficiency is a maximizable quantity. This can be conceived as a fourth domain which, upon maximization, becomes the superset of which the three domains are subsets.

A counter-productive entity in this situation is that of property, which is accrued in great quantities by a high-achieving house in the present but which delays the onset of change in the future. Even when the house starts to experience slightly rougher weather, its first move will be to pump in more money, thereby offsetting change by some time. Only when the amount of property invested in delaying change is considerable will the house start to consider other alternatives, by which time other competing organizations will have moved into the future.