Linux Evolution Reveals Origins of Curious Mathematical Phenomenon
December 1, 2008 By Lisa Zyga
When the Zipf curve is plotted on a log-log scale, it appears as a straight line with a slope of -1. This graph shows that four Debian Linux releases each follow Zipf’s law: Woody (orange), Sarge (green), Etch (blue) and Lenny (black). Credit: T. Maillart, et al.
(PhysOrg.com) -- Zipf’s law is a testament to the order in our world, showing that the same patterns emerge in a wide variety of situations. The linguist George Kingsley Zipf first proposed the law in 1949, when he noticed that the distribution of words in a newspaper, book, or other literary article always followed the same pattern.
Zipf counted how many times each word appeared, and found that the probability of the occurrence of words starts high and tapers off. Specifically, the most frequent word occurs about twice as often as the second most frequent word, which occurs about twice as often as the fourth most frequent word, and so on. Mathematically, this means that the frequency of any word is inversely proportional to its rank. When the Zipf curve is plotted on a log-log scale, it appears as a straight line with a slope of -1.
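To make the slope of -1 concrete, here is a minimal Python sketch that ranks word counts and fits a straight line to log(frequency) against log(rank). The function names and the idealized data are illustrative assumptions, not anything taken from Zipf's work or from the study described in this article.

```python
import math
from collections import Counter

def rank_frequencies(tokens):
    """Return word counts sorted from most to least common (rank 1 first)."""
    return [count for _, count in Counter(tokens).most_common()]

def loglog_slope(frequencies):
    """Least-squares slope of log(frequency) against log(rank).

    Expects at least two strictly positive frequencies, ordered by rank.
    """
    xs = [math.log(rank) for rank in range(1, len(frequencies) + 1)]
    ys = [math.log(f) for f in frequencies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var

# Idealized (made-up) Zipf data: frequency exactly proportional to 1 / rank.
ideal = [1200.0 / rank for rank in range(1, 201)]
slope = loglog_slope(ideal)  # very close to -1 for this idealized input
```

Real text gives only an approximate fit, so passing `rank_frequencies(words)` for an actual word list to `loglog_slope` should be expected to return a value near, not exactly at, -1.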
Since Zipf’s discovery, researchers have found that the power law describes many other natural and human phenomena, including the distribution of cities ranked by their population, the distribution of corporate wealth, and Internet traffic characteristics.
When analyzing systems that follow Zipf’s law, researchers usually assume certain mechanisms to be responsible for this patterned behavior. However, no one has ever empirically demonstrated that these assumed mechanisms are indeed the origin of Zipf’s law.
Now, a team of researchers from ETH Zürich (the Swiss Federal Institute of Technology Zürich) in Switzerland has confirmed that these assumed mechanisms – such as scale-free, proportional growth rates – are at the origin of Zipf’s law. The researchers used four orders of magnitude of data detailing the evolution of open source software applications created for a Linux operating system to confirm the assumption.
The team studied Debian Linux, a free operating system continuously being developed by more than 1,000 volunteers from around the world. Developers create software packages, such as text editors or music players, that are added to the system. Beginning with 474 packages in 1996, Debian Linux has expanded to include more than 18,000 packages today. The packages form an intricate network, with some packages having greater connectivity than others, as defined by how many other packages depend on a given package.
“Open source offers a unique opportunity provided by the high completeness of data concerning open source (thanks to the disclosure policy of the open source terms of license),” lead author Thomas Maillart of ETH Zürich told PhysOrg.com. “Debian Linux allowed us to retrieve exhaustive information from several years ago. Many other complex systems are not so well ‘documented.’”
As the researchers explain, the Linux network is constantly changing: new packages enter, some disappear, and others gain or lose connectivity. Yet throughout the 12 years, the distribution of packages, as ranked by their number of incoming links from other packages, has followed Zipf’s law, with a few very popular packages having much greater connectivity than most.
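The ranking described here, packages ordered by their number of incoming links, can be sketched in a few lines of Python. The package names and the dependency map below are a made-up toy example, not Debian data, and the function is a hypothetical helper rather than the researchers' own tooling.

```python
from collections import Counter

def rank_by_incoming_links(depends_on):
    """Rank packages by how many other packages depend on them.

    `depends_on` maps each package name to the packages it depends on.
    """
    # Start every known package at zero so unused packages still appear.
    incoming = Counter({package: 0 for package in depends_on})
    for package, dependencies in depends_on.items():
        for dependency in set(dependencies):  # count each dependent once
            if dependency != package:         # ignore self-references
                incoming[dependency] += 1
    # Highest connectivity first; ties broken by name for a stable order.
    return sorted(incoming.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical toy network: three packages depend on "libc".
toy_network = {
    "libc": [],
    "libssl": ["libc"],
    "editor": ["libc"],
    "player": ["libc", "libssl"],
}
ranking = rank_by_incoming_links(toy_network)
```

In this toy network `libc` comes first with three incoming links, which mirrors the article's point that a few packages carry far more connectivity than most.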
While many previous models of Zipf’s law start from the assumption that all of the entities (e.g. packages) appeared at the same time, the Swiss researchers tracked the time evolution of package connectivity in the Linux network since 1996. This perspective enabled them to test for the specific characteristics of the network’s growth that lead to the emergence of Zipf’s law.
Using the data, they showed that the growth in a package’s connectivity is proportional to the connectivity it already has. In addition, they showed empirically that the average growth in the total number of links to a given package over a time interval is proportional to the length of that interval. Further, the variability of that growth increases in proportion to the square root of time, providing a crucial test of the mechanism of stochastic proportional growth of connectivity between packages. Altogether, these characteristics are responsible for the universal distribution pattern of Zipf’s law.
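A small simulation can illustrate what stochastic proportional growth looks like. The Python sketch below is a toy model under stated assumptions: the update rule with Gaussian noise, the parameter values, and the function name are all chosen for illustration and are not taken from the paper. Each simulated package grows by an amount proportional to its current connectivity, and the relative growth is recorded after a short interval and after one four times longer.

```python
import math
import random
import statistics

def relative_growth(n_packages, checkpoints, drift, volatility, dt, seed=0):
    """Simulate proportional growth and record relative change at checkpoints.

    Assumed update per step: C <- C * (1 + drift*dt + volatility*sqrt(dt)*xi),
    where xi is a standard normal draw, so each increment scales with C.
    """
    rng = random.Random(seed)
    results = {steps: [] for steps in checkpoints}
    last_step = max(checkpoints)
    for _ in range(n_packages):
        c = 1.0  # start at connectivity 1, so (c - 1) is the relative change
        for step in range(1, last_step + 1):
            noise = volatility * math.sqrt(dt) * rng.gauss(0.0, 1.0)
            c *= 1.0 + drift * dt + noise
            if step in results:
                results[step].append(c - 1.0)
    return results

# Illustrative parameters: compare 10 steps against 40 steps (4x longer).
growth = relative_growth(10_000, checkpoints=(10, 40),
                         drift=0.02, volatility=0.05, dt=0.1)
mean_ratio = statistics.fmean(growth[40]) / statistics.fmean(growth[10])
spread_ratio = statistics.pstdev(growth[40]) / statistics.pstdev(growth[10])
```

With these settings, quadrupling the interval should roughly quadruple the average growth (`mean_ratio` near 4) while only about doubling its spread (`spread_ratio` near 2), which is the square-root-of-time signature the researchers tested for. The match is approximate because growth compounds and the estimates carry sampling noise.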
“We show that the distribution of connectivity of new entrants is also a power law with an exponent much bigger than 1, confirming that the proportional growth mechanism is solely responsible for the Zipf's law,” Maillart said.
He explained that, while Linux data allowed the researchers to confirm the origins of Zipf’s law, their results bring up more questions.
“Linux Debian gave us the opportunity to verify the ‘proportional mechanism,’ thanks to an important dataset and a huge investigation potential,” Maillart said. “All changes (evolution) in open source software are freely available and therefore can be tracked in detail. However, model verification has brought one answer and many resulting questions we intend to give an answer to. We think particularly of mechanisms of success/failure of projects in relation with their management.
“Remember that we still do not clearly understand the reasons of the success of the open source, since it's free and based on altruist contributions by programmers,” he said. “Additionally, one can bet that further research in this direction (open source and proportional growth) may raise useful questions for other systems (cities, economy, etc.) that would bring new insights to explain their evolution.”
More information: T. Maillart, D. Sornette, S. Spaeth, and G. von Krogh. “Empirical Tests of Zipf’s Law Mechanism in Open Source Linux Distribution.” Physical Review Letters 101, 218701 (2008).
Copyright 2008 PhysOrg.com.
All rights reserved. This material may not be published, broadcast, rewritten or redistributed in whole or part without the express written permission of PhysOrg.com.
Dec 01, 2008
Rank: 3 / 5 (3)
So the packages are not random.
Dec 02, 2008
Rank: 4 / 5 (4)
What a bunch of ballux. If you want to know how economic theories or cities, pick up a history book.
Dec 02, 2008
Rank: 4.3 / 5 (3)
the growth rates of connectivities between packages are proportional to the degree of connectivity between packages
AND
the variability of the total number of links to a given package increases proportionally to the square-root of time, providing a crucial test of the mechanism of stochastic proportional growth of connectivity between packages
AND
they showed empirically that the average growth rate of the total number of links to a given package over a time interval is proportional to that time interval.
NOW MY QUESTION IS:
How could this article NOT point out that it should be the root inverse proportion of the mean??
And those of you that were bashing this article... if you read this far, can you still not get it?
sheesh...
Yok
Dec 03, 2008
Rank: 3 / 5 (2)
Maybe it depends what you mean by random. In the evolution of linux code packages there might be some "true" randomness to the origin of a particular idea or strategy but once it is instantiated its evolution may be largely determined by 'dialectical' interaction with the rest of its world, just like biological species, etc, but will be mostly unpredictable due to the non-linear progression of these recursive interactions.
b/ I am not sure what Yok is saying about "root inverse proportion of the mean", but I am neither mathematician nor scientist. I agree with Yok though that the bashers must be sleeping through their lives, not to be entranced by yet another demonstration of the amazing depths of emergent order manifest by evolutionary processes.
Mark
Dec 06, 2008
Rank: 3 / 5 (1)
Although the article is about Linux, the underlying natural law, "Zipf’s law, a testament to the order in our world, showing that the same patterns emerge in a wide variety of situations," is potentially profound. Evidently, humans are driven by forces beyond our comprehension. Beyond psychology and into the physical world.
Dec 06, 2008
Rank: 5 / 5 (1)
http://www.maa.or...ram.html
Dec 07, 2008
Rank: 5 / 5 (1)
Sigh... anyway, Zipf’s law, great, yeah.
Dec 07, 2008
Rank: 5 / 5 (1)
ONLY 10%!