The new and large fly the farthest

British Airways and Air France jointly retired the Concorde supersonic jet in 2003. Both companies cited rising maintenance costs, compounded by falling demand after the Paris crash in 2000 and a general downturn in civil aviation after 9/11. Now, American and French scientists have found that Concorde was in fact an allometric outlier that stood out design-wise at the cost of its feasibility and, presumably, its maintenance. Perhaps it grounded itself.

One thing Adrian Bejan (Duke University), J.D. Charles (Boeing) and Sylvie Lorente (Toulouse University) seem to be in awe of throughout their analysis is that the evolution of commercial airplane allometry seems deterministic (allometry is the study of the relationship between a body’s physical dimensions and its properties and functions). This is awesome because it implies that the laws of physics used to design airplanes are passively guiding the designers toward very specific solutions in spite of creative drift, and that successive models are converging toward a sort of ‘unified model’. This paradigm sounds familiar because it could be said of any engineering design enterprise, but what sets it apart is that the evolution of airplane designs appears to be mimicking the evolution of flying animals despite significant anatomical and physiological differences.

One way to look at their analysis is in terms of the parameters the scientists claim have been guiding airplane design over the years:

  1. Wingspan
  2. Fuselage length
  3. Fuel load
  4. Body size

Among them, fuel load and body size are correlated along the lines of Tsiolkovsky’s rocket equation, which says that, for rockets, once two of the following three parameters are set, the third becomes immovably fixed in a proportional way: the energy expended against gravity, the potential energy in the propellant, and the fraction of the rocket’s mass made up by the propellant. According to Bejan et al, there is a corresponding ‘airplane equation’ that shows a similar correlation between engine size, amount of fuel, and mass of the whole vehicle. The NASA explainer finds this association tyrannical because, as Paul Gilster writes,

A … rocket has to carry more and more propellant to carry the propellant it needs to carry more propellant, and so on, up the dizzying sequence of the equation
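Gilster’s dizzying sequence can be made concrete with a quick sketch of Tsiolkovsky’s equation. The numbers below are illustrative values of mine, not figures from the paper:

```python
import math

def propellant_fraction(delta_v, v_exhaust):
    """Fraction of lift-off mass that must be propellant to achieve delta_v.

    Tsiolkovsky: delta_v = v_exhaust * ln(m_initial / m_final), so the
    propellant fraction is 1 - exp(-delta_v / v_exhaust).
    """
    return 1 - math.exp(-delta_v / v_exhaust)

# Roughly 9.4 km/s gets you to low Earth orbit; ~4.4 km/s is the exhaust
# velocity of a good hydrogen/oxygen engine. Doubling the mission's delta-v
# pushes the propellant fraction toward 1 very quickly:
for dv in (4700, 9400, 18800):
    print(dv, round(propellant_fraction(dv, 4400), 3))
```

Carrying more propellant means a heavier rocket, which in turn needs more propellant: that is the exponential in the equation.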

Next, there is also a correlation between wingspan and fuselage length, corresponding to an economy of scale like those found in nature. Bejan et al find that, despite their dissimilarities, airplanes and birds have evolved similar allometric rules on the road to greater efficiency, and that, like bigger birds, bigger airplanes are “more efficient vehicles of mass”. Based on how different airplane components have evolved over the years, the scientists were able to distill a scaling relation.

S/L ~ M^(1/6) g^(1/2) ρ^(1/3) σ^(1/4) (ρ_a V² C_l)^(−3/4) 2^(1/4) C_f^(7/6)

Be not afraid. S/L is the ratio of the wingspan to the fuselage length. It is most strongly influenced by ρ_a, the air density; σ, the allowable stress level in the wing; g, the acceleration due to gravity; and C_f, the fixed skin-friction coefficient. More interestingly, the mass of the entire vehicle has a negligible effect on S/L, which plays out as a roughly constant S/L value across a range of airplane sizes.
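To see just how weak the mass dependence is, here is a toy calculation of mine (not from the paper), using only the M^(1/6) factor in the scaling relation:

```python
# With everything else held fixed, S/L varies as M^(1/6). A 64-fold
# increase in airplane mass therefore changes the wingspan-to-length
# ratio by only a factor of 64^(1/6) = 2.
def sl_mass_factor(mass_ratio):
    """Multiplier on S/L when airplane mass grows by mass_ratio."""
    return mass_ratio ** (1 / 6)

print(round(sl_mass_factor(64), 3))   # 2.0
print(round(sl_mass_factor(10), 2))
```

A tenfold-heavier airplane keeps nearly the same proportions, which is why airliners of very different sizes look so alike.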

Citation: J. Appl. Phys. 116, 044901 (2014);

Similarly, the size of a plane’s engine has also increased in proportion to the plane’s mass. This would be common sense if not for there being a fixed, empirically determined correlation here as well: M_e = 0.13 M^0.83, where M_e and M are the masses of the engine and airplane, respectively, in tons.
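The correlation is easy to apply as a sketch. The 300-ton airliner below is my illustrative input, not a value from the paper:

```python
def engine_mass(airplane_mass_tons):
    """Empirical fit reported by Bejan et al.: M_e = 0.13 * M^0.83, in tons."""
    return 0.13 * airplane_mass_tons ** 0.83

# A hypothetical 300-ton airliner would carry roughly 15 tons of engines:
print(round(engine_mass(300), 1))
```

Note the exponent of 0.83: slightly below 1, so engines grow almost, but not quite, in lockstep with the airplane.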

During the evolution of airplanes, the engine sizes have increased almost proportionally with the airplane sizes (the data refer only to jet engine airplanes). J. Appl. Phys. 116, 044901 (2014);

In terms of these findings, the Concorde’s revolutionary design appears to have been a blip on the broader stream of traditional yet successful ones. In the words of the authors,

In chasing an “off the charts” speed rating the Concorde deviated from the evolutionary path traced by successful airplanes that preceded it. It was small, had limited passenger capacity, long fuselage, short wingspan, massive engines, and poor fuel economy relative to the airplanes that preceded it.

That the Concorde failed and that the creative drift it embodied couldn’t achieve what the uninspired rules that preceded it did isn’t to relegate the design of commercial airplanes to algorithms. It only stresses that whatever engineers have toyed with, some parameters have remained constant because they’ve had a big influence on performance. In fact, it is essentially creativity that will disrupt Bejan et al’s meta-analysis by inventing less dense, stronger, smoother materials to build airplanes and their components with. By the analysts’ own admission, this is a materials era.

Bigger airplanes fly farther and are more efficient, and, to maximize fuel efficiency, are becoming the vehicles of choice for airborne travel. And that there is a framework of allometric rules passively steering their designs is a tribute to design’s unifying potential. In this regard, the similarity to birds persists (see chart below), as if to say there are only a fixed number of ways to fly better.

The characteristic speeds of all the bodies that fly, run, and swim (insects, birds, and mammals). J. Appl. Phys. 116, 044901 (2014);

From the paper:

Equally important is the observation that over time the cloud of fliers has been expanding to the right. In the beginning were the insects, later the birds and the insects, and even later the airplanes, the birds, and the insects. The animal mass that sweeps the globe today is a weave of few large and many small. The new are the few and large. The old are the many and small.


The evolution of airplanes, J. Appl. Phys. 116, 044901 (2014); DOI: 10.1063/1.4886855

How big is your language?

This blog post first appeared, as written by me, on The Copernican science blog on December 20, 2012.


It all starts with Zipf’s law. Ever heard of it? It’s a devious little thing, especially when you apply it to languages.

Zipf’s law states that the chances of finding a word of a language in all the texts written in that language are inversely proportional to the word’s rank in the frequency table. In other words, the most frequent word occurs about twice as often as the second most frequent word, three times as often as the third most frequent word, and so on.
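A minimal sketch of what the law predicts, with an illustrative count rather than figures from any real corpus:

```python
def zipf_frequency(rank, top_count):
    """Under Zipf's law, the word of a given rank occurs top_count / rank times."""
    return top_count / rank

# If the most frequent word appears 1,000 times, the law predicts:
print([round(zipf_frequency(r, 1000)) for r in (1, 2, 3, 10, 100)])
# [1000, 500, 333, 100, 10]
```

The counts fall off as 1/rank, which is what gives the straight line on a log-log plot.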

Unfortunately (only because I like how “Zipf” sounds), the law holds only until about the 1,000th most common word; after this point, a log-log plot of frequency against rank stops being linear and starts to curve.

The importance of this break is that if Zipf’s law fails to hold for a large corpus of words, then the language, at some point, must be making some sort of distinction between common and exotic words, and its need for new words must be either increasing or decreasing. If that need remained constant, the distinction would be impossible to define except empirically, and never conclusively – going against the behaviour of Zipf’s law.

Consequently, the chances of finding the 10,000th word won’t be 10,000 times lower than the chances of finding the most frequently used word, but much lower or much higher.

A language’s diktat

Analysing each possibility, i.e., if the chances of finding the 10,000th-most-used word are NOT 10,000 times less than the chances of finding the most-used word but…

  • Greater (i.e., The Asymptote): The language must have a long tail, also called an asymptote. Think about it. If the rarer words are all used almost as frequently as each other, then they can all be bunched up into one set, and when plotted, they’d form a straight line almost parallel to the x-axis (chance), a sort of tail attached to the rest of the plot.
  • Lesser (i.e., The Cliff): After expanding to include a sufficiently large vocabulary, the language could be thought to “drop off” the edge of a statistical cliff. That is, at some point, there will be words that exist and mean something, but will almost never be used because syntactically simpler synonyms exist. In other words, in comparison to the usage of the first 1,000 words of the language, the (hypothetical) 10,000th word would be used negligibly.

The former possibility is more likely – that the chances of finding the 10,000th-most-used word would not be as low as 10,000-times less than the chances of encountering the most-used word.

As a language expands to include more words, it is likely that it issues a diktat to those words: “either be meaningful or go away”. And as the length of the language’s tail grows – as more exotic and infrequently used words accumulate – the need for words farther from Zipf’s domain drops off faster over time.

Another way to quantify this phenomenon is through semantics (and this is a far shorter route of argument): As the underlying correlations between different words become more networked – for instance, attain greater betweenness – the need for new words is reduced.

Of course, the counterargument here is that there is no evidence to establish if people are likelier to use existing syntax to encapsulate new meaning than they are to use new syntax. This apparent barrier can be resolved by what is called the principle of least effort.

Proof and consequence

While all of this has been laid out theoretically, it has had to be tested many times over the years because the object under observation is a language – a veritable projection of the right to expression as well as a living, changing entity. And in the pursuit of some proof, on December 12, I spotted a paper on arXiv that claims to have used an “unprecedented” corpus (Nature scientific report here).

Titled “Languages cool as they expand: Allometric scaling and the decreasing need for new words”, it was hard to miss in the midst of papers, for example, being called “Trivial symmetries in a 3D topological torsion model of gravity”.

The abstract of the paper, by Alexander Petersen from the IMT Lucca Institute for Advanced Studies, et al, has this line: “Using corpora of unprecedented size, we test the allometric scaling of growing languages to demonstrate a decreasing marginal need for new words…” This is what caught my eye.

While it’s clear that Petersen’s results have been established only empirically, the fact that their corpus includes all the words in books written in English between 1800 and 2008 indicates that the set of observables is almost as large as it can get.

Second: When speaking of corpuses, or corpora, the study has also factored in Heaps’ law (apart from Zipf’s law), and found that there are some words that obey neither Zipf nor Heaps but are distinct enough to constitute a class of their own. This is also why I underlined the word common earlier in this post. (How Petersen, et al, came to identify this is interesting: They observed deviations in the lexicon of individuals diagnosed with schizophrenia!)

Heaps’ law, also called the Heaps-Herdan law, states that the chances of discovering a new word in one large instance-text, like one article or one book, decrease as the size of the instance-text grows. It’s like a combination of the sunk-cost fallacy and Zipf’s law.

It’s a really simple law, too, and makes a lot of sense even intuitively, but the ease with which it’s been captured statistically is what makes the Heaps-Herdan law so wondrous.
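The sub-linear growth is easy to reproduce in a toy simulation (mine, not the paper’s): draw words from a Zipf-like distribution and watch how slowly the vocabulary of distinct words grows.

```python
import random

def vocabulary_growth(n_tokens, vocabulary=1000, seed=42):
    """Count distinct words seen after each token drawn from a
    Zipf-distributed vocabulary; growth is sub-linear (Heaps' law)."""
    rng = random.Random(seed)
    weights = [1 / rank for rank in range(1, vocabulary + 1)]
    seen, growth = set(), []
    for word in rng.choices(range(vocabulary), weights=weights, k=n_tokens):
        seen.add(word)
        growth.append(len(seen))
    return growth

growth = vocabulary_growth(20000)
# Doubling the text length adds far fewer than twice the distinct words:
print(growth[9999], growth[19999])
```

The longer the text, the likelier it is that the next word is one you have already seen – exactly the diminishing discovery rate Heaps’ law describes.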

The sub-linear Heaps' law plot: Instance-text size on x-axis; Number of individual words on y-axis.

Falling empires

And Petersen and his team establish in the paper that, extending the consequences of Zipf’s and Heaps’ laws to massive corpora, the larger a language gets in terms of the number of individual words it contains, the slower it grows and the less cultural evolution it engenders. In the words of the authors: “… We find a scaling relation that indicates a decreasing ‘marginal need’ for new words which are the manifestations of cultural evolution and the seeds for language growth.”

However, for the class of “distinguished” words, there seems to exist a power law – one that results in a non-linear graph, unlike Zipf’s and Heaps’ laws. This means that as new exotic words are added to a language, the need for them is unpredictable and changes over time for as long as they remain outside Zipf’s law’s domain.

All in all, languages eventually seem an uncanny mirror of empires: The larger they get, the slower they grow, the more intricate the exchanges within them become, and the fewer reasons there are to change – until some fluctuations are injected from the world outside (in the form of new words).

In fact, the mirroring is not so uncanny considering both empires and languages are strongly associated with cultural evolution. Ironically enough, it is the possibility of cultural evolution that most meaningfully justifies the creation and use of languages – which means that at some point, a language becomes bloated enough to stop germinating new ideas and instead starts to suffocate them.

Does this mean the extent to which a culture centered on a language has developed and will develop depends on how much the language itself has developed and will develop? Not conclusively – as there are a host of other factors left to be integrated – but it seems a strong correlation exists between the two.

So… how big is your language?