Hidden Fractals Suggest Answer to Ancient Math Problem

There are a number of different ways to “partition” the number 7 into other numbers. Writing them all out:

7 = 1 + 1 + 1 + 1 + 1 + 1 + 1
7 = 2 + 1 + 1 + 1 + 1 + 1
7 = 2 + 2 + 1 + 1 + 1
7 = 3 + 1 + 1 + 1 + 1
7 = 4 + 1 + 1 + 1
7 = 3 + 2 + 1 + 1
7 = 5 + 1 + 1
7 = 4 + 2 + 1
7 = 3 + 3 + 1
7 = 2 + 2 + 2 + 1
7 = 3 + 2 + 2
7 = 6 + 1
7 = 5 + 2
7 = 4 + 3
7 = 7

There are 15 ways listed above, and so the “partition number” of 7 is 15. Mathematicians typically write this as P(7) = 15, and they have been studying P(n) for other values of n for hundreds of years now. The fact that this partition function P(n) has been studied for hundreds of years should be a clue that it’s been difficult to get a handle on. For one thing, it grows absurdly fast: P(10) is already up to 42, and P(100) = 190,569,292. I am certainly not writing all those down! More importantly, the numbers I just quoted were not computed using any neat little formula. Until now, the only way of computing P(n) when n got large involved calculating—by hand or computer—all the other partition numbers that came before. For example, calculating P(100) required calculating each of P(1) through P(99).
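That build-up-from-below process is easy to sketch in code. Here is a minimal dynamic program that computes P(n) only by first computing every smaller case (this is purely illustrative of the old approach, not the new explicit formula):

```python
def partition_counts(n):
    """Compute P(0)..P(n) by the classic dynamic program:
    p[m] accumulates the number of ways to write m as a sum,
    allowing parts of size <= k, one part size at a time."""
    p = [1] + [0] * n              # P(0) = 1: the empty partition
    for k in range(1, n + 1):      # now allow parts of size k
        for m in range(k, n + 1):
            p[m] += p[m - k]       # use a part of size k at least once
    return p

p = partition_counts(100)
print(p[7], p[10], p[100])  # → 15 42 190569292
```

Note that there is no way to jump straight to `p[100]`: every intermediate value is required along the way, which is exactly the difficulty described above.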

All that has changed, however, thanks to a breakthrough made by Ken Ono and Zachary Kent during a hike in the woods. “The problem for a theoretical mathematician is you can observe some patterns, but how do you know these patterns go on forever? We were, frankly, completely stuck. We were stumped,” Ono says in a brief interview (below) about the days leading up to their realization.

During a break from their day-to-day duties at Emory University, on a nature hike to Tallulah Falls, “we realized that the process by which these numbers fold over on themselves was very much like what you see in the woods.”

“It was as simple as that. Maybe we could translate the problem of studying the partition numbers, this difficult problem of why do they fold over on themselves, maybe we could turn this into a problem where: what if we were just walking among the partition numbers, from one partition number to the next and then to the next? Could we turn the idea of a switchback into a concrete mathematical structure which we could prove had to occur over and over again? We realized that, if we could do that, then our fractal structure would have to be true and instead of having a walk that ends at the falls, our walk would just go on forever.”

The switchback phenomenon that Ono, Kent, and Amanda Folsom of Yale University detected and mathematized shows that the partition numbers have a fractal-like structure that was completely unexpected. The same team (together with Jan Bruinier of the Technical University of Darmstadt in Germany) also produced the first explicit formula for the P(n) function. Their news was picked up by Wired, Scientific American, Smithsonian magazine, and online at the Free Republic of all places. The “fractal-like” nature of the solution also let a number of those outlets seize the opportunity to link to media of psychedelic fractal patterns like the Mandelbrot set. I’m not above that either, so feel free to check out the awesome “zoom-in” on the Mandelbrot set that heads this entry!

Modeling Human Drug Trials — Without the Human

Hologram body

Dr. David M. Eddy is one of the rare people for whom the “Dr.” refers to both a medical and academic degree. Dr. Eddy received his Ph.D. in applied mathematics in 1978, ten years after becoming a medical doctor, and he’s been a pioneer in applications of mathematics to medicine ever since. His latest project sounds like a massive undertaking: a computer program called Archimedes, which is intended to represent the various aspects of human physiology well enough so that new drugs, tests, or procedures could be tried out on virtual human subjects before any actual humans went under treatment. A description of the project’s genesis and development appears in this story from Wired magazine.

The program was a kind of SimHealth: a vast compendium of medical knowledge drawn from epidemiological data, clinical trials, and physician interviews, which Eddy had laboriously translated into differential equations over the past decade. Those equations, Eddy hoped, would successfully reproduce the complex workings of human biology — down to the individual chambers of a simulated person’s virtual heart…. a soup-to-nuts model that would capture everything known by modern medicine, from the evolution of disease in different people — as shaped by factors like race, genetic risk, and number of hours spent doing yoga — to specific physiological details, such as the amount of heart muscle that dies in the hours after a heart attack and the degree to which medications like aspirin can limit that damage. Tests could be run in hours instead of years, and the model could be constantly updated with the latest research.

According to the article, the model could already do a very good job of predicting the results of previous clinical trials, and the next step—coming soon no doubt—was to have it ‘virtually’ conduct some drug trials, trials that might otherwise be too difficult, costly, or dangerous for drug companies to do themselves.

The 100 Top Science Stories of 2010

Every year Discover magazine lists its 100 Top Science Stories, and a number of these stories, particularly those involving physics and engineering, require a lot of math in their execution. Beyond that, however, four of the stories feature mathematics centrally. In numerical order:

  • In #51 A Computer Rosetta Stone we find a computer program that deciphers ancient hieroglyphics statistically. MIT computer scientist Regina Barzilay has developed the program, which compares unknown letters and words to letters and words of known languages in order to find parallels. When she tested it by seeing how much of ancient Ugaritic the program could decipher using the related language Hebrew as the ‘parallel’, the program correctly matched 29 of the 30 Ugaritic letters to their Hebrew equivalents, and 60% of the Ugaritic words that had Hebrew cognates. More importantly, it did the work in a matter of hours, whereas human translators needed decades (and the chance find of an ancient Ugaritic axe that had the word “axe” carved on it) to accomplish similar feats. While the program certainly cannot replace the intuition and feel for language that human scientists possess, “it is a powerful tool that can aid the human decipherment process,” and could already be of use in expanding the number of languages that machine translators can handle.
  • #60 Fighting Crime with Mathematics details the work of UCLA mathematicians Martin Short and Andrea Bertozzi who, along with UCLA anthropologist Jeff Brantingham, developed a mathematical model of the formation and behavior of crime ‘hotspots.’ After calibrating the model with real-world data, it appears that hotspots come in two varieties: “One type forms when an area experiences a large-scale crime increase, such as when a park is overrun by drug dealers. Another develops when a small number of criminals—say, a pair of burglars—go on a localized crime spree.” According to the work, the typical police reaction of targeting the hotspots appears to work much better on the first type of hotspot, but hotspots of the second type usually just relocate to a less-patrolled area. As the story notes, “By analyzing police reports as they come in, Short hopes to determine which type of hot spot is forming so police can handle it more effectively.”
  • There seems to be a steady stream of stories recently remarking on how some animals instinctively know the best way to do things. One example from this blog is Iain Couzin’s work on animal migration. And here’s another: #92 Sharks Use Math to Hunt. “Lévy flight” is the name given to a search pattern long suspected by mathematicians of being one of the most effective hunting strategies when food is scarce. David Sims of the Marine Biological Association of the United Kingdom logged the movements of 55 marine animals from 14 different species over 5,700 days, and confirmed that the fish movements closely matched Lévy flight. (The marine animals included tuna and marlin, by the way, but sharks always get the headlines.)
  • #95 Rubik’s Cube Decoded covers a story already mentioned on this blog about “God’s Number”, the maximum number of moves that an omniscient being would need in order to solve any starting position of Rubik’s cube. The answer, as you can read in this story or by reading my earlier blog post, is 20.

The whole Top 100 is worth going through as well. It’s remarkable to realize how much and how quickly science is learning in this day and age.

A Physicist Solves the City

Sometime around 2008 or so a tipping point was reached: for the first time, more people worldwide were living in cities than in rural areas. The ‘urbanification’ of humanity will likely only continue (you can see the United Nations projections here), and so cities—their structure, their qualities, their creation, maintenance, and growth—are becoming increasingly important objects of study.

I’ve already mentioned the NY Times Magazine’s annual Year in Ideas issue in a previous post. Included in the same issue is a full-fledged article on the work of Geoffrey West, a former physicist at Stanford and the Los Alamos National Laboratory. West has recently turned his attention away from particle physics and toward biological subjects, and done so to great effect; as the article notes, one of West’s first forays produced “one of the most contentious and influential papers in modern biology,” which has garnered over 1,500 citations since its publication.

The mathematical equations that West and his colleagues devised were inspired by the earlier findings of Max Kleiber. In the early 1930s, when Kleiber was a biologist working in the animal-husbandry department at the University of California, Davis, he noticed that the sprawlingly diverse animal kingdom could be characterized by a simple mathematical relationship, in which the metabolic rate of a creature is equal to its mass taken to the three-fourths power. This ubiquitous principle had some significant implications, because it showed that larger species need less energy per pound of flesh than smaller ones. For instance, while an elephant is 10,000 times the size of a guinea pig, it needs only 1,000 times as much energy. Other scientists soon found more than 70 such related laws, defined by what are known as “sublinear” equations. It doesn’t matter what the animal looks like or where it lives or how it evolved — the math almost always works.
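The elephant-versus-guinea-pig figures in the excerpt follow directly from the three-quarter-power law; a quick sanity check:

```python
# Kleiber's law: metabolic rate scales as mass ** (3/4).
mass_ratio = 10_000                  # an elephant is ~10,000x a guinea pig's mass
energy_ratio = mass_ratio ** 0.75    # implied ratio of metabolic rates
print(round(energy_ratio))  # → 1000
```

So the elephant needs only 1,000 times the energy despite being 10,000 times the size: the sublinear exponent (3/4 < 1) is exactly where the "less energy per pound" observation comes from.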

West’s next work went along similar lines, but now the biological subject under the microscope was the city. The first natural quantity to investigate was something that played the role of ‘energy’ in the city, and West and his collaborator Luis Bettencourt discovered that indeed a whole host of ‘energy’ measures scale at a sublinear rate.

In city after city, the indicators of urban “metabolism,” like the number of gas stations or the total surface area of roads, showed that when a city doubles in size, it requires an increase in resources of only 85 percent. This straightforward observation has some surprising implications. It suggests, for instance, that modern cities are the real centers of sustainability…. Small communities might look green, but they consume a disproportionate amount of everything.

Still more surprises arrived when West and Bettencourt looked at measuring not ‘energy’ in terms of infrastructure, but ‘energy’ in terms of people. When people decide to move to a city—and as the United Nations data shows, people are doing so in droves—they often do so not to decrease their expenditures, but to increase their social opportunities. Now it is hard to measure social interactions, but there are related interactions that can be measured, and interestingly enough these seem to scale the same way infrastructure does, but in the opposite direction. Social activity seems to scale in a superlinear way. All sorts of economic activities, from city-wide construction spending to individual bank account deposits, increase by 15 percent per capita when a city doubles in size. Or as West puts it, “[y]ou can take the same person, and if you just move them to a city that’s twice as big, then all of a sudden they’ll do 15 percent more of everything that we can measure.” The bad news is that the ‘everything’ is in fact everything: violent crime, traffic, and AIDS cases for example also see the same type of increase.
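Both doubling figures can be restated as power-law exponents, which makes the sublinear/superlinear contrast concrete (the 1.85 and 2.3 factors below come straight from the 85% and 15% figures quoted above):

```python
import math

# Infrastructure: doubling a city's population needs only 85% more resources,
# so total resources scale like population ** beta with beta < 1 (sublinear).
beta_infra = math.log(1.85, 2)

# Socioeconomic output: doubling population gives 15% more *per capita*,
# i.e. 2 * 1.15 = 2.3x in total, so beta > 1 (superlinear).
beta_social = math.log(2 * 1.15, 2)

print(round(beta_infra, 2), round(beta_social, 2))  # → 0.89 1.2
```

These are close to the exponents of roughly 0.85 and 1.15 usually quoted for West and Bettencourt's fits; the point of the sketch is simply that the same doubling data pins down an exponent on either side of 1.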

West and Bettencourt’s current calculations are controversial and not universally believed. (The author, Jonah Lehrer, seems fairly skeptical himself.) Nevertheless, as with the earlier biological findings, the work described here certainly looks like a very good launching point for some very valuable and much needed future analysis.

The 10th Annual Year in Ideas

NY Times 2010 Year In Ideas

Another year has passed, which means it’s time again for the NY Times Magazine’s annual The Year in Ideas issue, “a high-to-low, silly-to-serious selection of ingenuity and innovation from 2010.” As with the 2009 list, a number of these ideas are built on some bit of mathematics and/or statistical analysis. The entries I’ve listed below are the ones that feature mathematical ideas, or mathematicians, most centrally.

  • Perfect Parallel Parking by Jascha Hoffman mentions Simon Blackburn’s geometric analysis of parallel parking, which we covered on the blog previously. Updating that earlier story, Hoffman’s entry notes that Jerome White and some fellow teachers at Lusher Charter School in New Orleans subsequently improved the model. (White and company built in allowances for the driver to do a bit more maneuvering.)
  • Aftercrimes visits a topic seen already here in this blog: just as earthquakes typically beget aftershocks, some types of crime beget copycat crimes. Mathematician George Mohler has been able to show that “the timing and location of the crimes can be statistically predicted with a high degree of accuracy.” For more info, check out the entry and the earlier blog post.
  • The entry Social Media as Social Index describes some of the ways that researchers—academic, government, and corporate—are mining social networks like Twitter and Facebook for valuable information. For instance, algorithms analyzing millions of Twitter posts were able to predict how certain movies would perform at the box office and how the Dow Jones Industrial Average would perform in the near future. More social media data mining is undoubtedly in store, as the story ends with one Facebook officer quoted as saying that this is the future of opinion research.
  • Finally, two entries which illustrate the public appetite for data analysis. Do-It-Yourself Macroeconomics describes the growing legion of “ordinary citizens” who are making it their business to “pull apart the [economic] data and come to their own conclusions.” All this is possible, of course, due to the explosion in publicly available economic data, one example of which is described in The Real-Time Inflation Calculator. As the story concludes, thanks to this (freely available) software, “Data on prices, once monopolized by government gatekeepers, are now up for grabs.”

What’s Just Around the Bend? Soon, a Camera May Show You

Cameras have become not just ubiquitous (they’re standard on just about any cell phone or computer) but powerful. And in the future they may be even more so, thanks to the emerging field of computational photography. This article in the NY Times describes several of the projects currently in the works in computational photography labs across the country, many of which rely on sophisticated mathematics to achieve their effects.

The article title is taken from the work of Ramesh Raskar at MIT. Dr. Raskar is pairing cameras with lasers in order to get a “picture” of objects that aren’t actually in the camera’s line of sight. The laser acts something like bat or dolphin sonar. To sketch an example: if a door to a room is partially open, the laser will fire into the room beyond, bounce around, and of course scatter according to whatever objects are present in the room. As the article then says, “From the reflected light, as well as the room’s geometry and mathematical modeling, [Raskar] deduces the structure of the hidden objects. ‘If you modify your camera and add sophisticated processing,’ he said, ‘the camera can look around objects and see what’s beyond.’”

Much of the rest of the article describes the “Frankencamera” of Marc Levoy at Stanford. The camera has a number of different computationally-enabled features. To mention one:

Dr. Levoy and his group have also written applications showing the Frankencameras’ abilities. The Rephotography app, for instance, lets users take a photo in the exact spot where an earlier one was shot. “The camera guides you step by step, so that you mathematically find the exact same viewpoint,” said Professor Durand, who with colleagues created the original app.

These researchers’ abilities are currently somewhat limited by the proprietary nature of most cameras. You can’t write an app when you’re not allowed to mess with the camera’s software. But Nokia recently opened the N900 smartphone to Dr. Levoy and his group, and in turn Dr. Levoy has created the first graduate course in computational photography. More advances are certainly to come.

In 500 Billion Words, New Window on Culture


Well that was fast. My last post described a project that analyzed word frequency in book titles, and mentioned that Google (which was providing the scanning and compiling for the project) had begun work on scanning and compiling an even larger corpus: the actual texts of every book published from 1500 to 2008. Now from the NY Times comes an article describing some preliminary analysis of the book text data sets. Even the preliminary results, obtained after only 11% of the task has been completed, are amazing.

[T]he researchers measured the endurance of fame, finding that written references to celebrities faded twice as quickly in the mid-20th century as they did in the early 19th. “In the future everyone will be famous for 7.5 minutes,” they write.

Looking at inventions, they found technological advances took, on average, 66 years to be adopted by the larger culture in the early 1800s and only 27 years between 1880 and 1920.

They tracked the way eccentric English verbs that did not add “ed” at the end for past tense (i.e., “learnt”) evolved to conform to the common pattern (“learned”). They figured that the English lexicon has grown by 70 percent to more than a million words in the last 50 years and they demonstrated how dictionaries could be updated more rapidly by pinpointing newly popular words and obsolete ones.

Other surprising and interesting facts mentioned include the relative frequencies of the words “men” and “women”, the popularity of Jimmy Carter, the rise of grilling, and the many more instances of the words “Tiananmen Square” in English-language texts than in Chinese-language texts.

And there’s more! Google has created a web tool that lets anybody plot the popularity of words and phrases over time. In the picture heading this entry I charted the relative frequencies for the words “mathematics,” “biology”, “physics”, and “chemistry” for the years 1800–2000. I was a bit surprised (but not unhappy) to see mathematics leading the pack at the moment, but the thing that is most obvious is the general trend: people are just getting more and more interested in the sciences as time goes by. The article and the Google tool also mention that the data sets themselves are available for download for those who have more heavy-duty data analysis in mind.

The research is detailed in a recent article from Science, which has taken the unusual step of making the article freely available. (That’s what the Times article says. It looks to me like you do have to sign up for a free Science registration.) Fourteen entities collaborated on the project; I use the word ‘entities’ because one author is listed as “The Google Books Team.” The two main authors, Jean-Baptiste Michel and Erez Lieberman Aiden, both have backgrounds in applied mathematics, as do some of the other listed authors.

As with the previous work on title words, the reaction of humanities scholars to the appearance of statistics and data analysis in their domain has been mixed. But there seems to be little doubt that, as the article states, this data set itself “offers a tantalizing taste of the rich buffet of research opportunities now open to literature, history and other liberal arts professors who may have previously avoided quantitative analysis.”

Analyzing Literature by Words and Numbers

Christian data

The first paragraph of a newspaper article typically aims to summarize the main points of the full article, and the first paragraph of this NY Times article by Patricia Cohen does a whiz-bang job.

Victorians were enamored of the new science of statistics, so it seems fitting that these pioneering data hounds are now the subject of an unusual experiment in statistical analysis. The titles of every British book published in English in and around the 19th century — 1,681,161, to be exact — are being electronically scoured for key words and phrases that might offer fresh insight into the minds of the Victorians.

The data comes from a project of Dan Cohen and Fred Gibbs, Victorian scholars at George Mason University, with a big assist from Google, which is funding the project and carrying out the scanning and compiling. Although only the titles of the books have been compiled to date, even they reveal some interesting trends. The image above is one of a few graphs generated by the Times from the title data, and shows a big decline in the appearances of the word “Christian” in titles as the 19th century progressed. Other graphs show big decreases in the use of “universal”, and increases in the instances of the words “industrial” and “science”.

The entire corpus—the text of the books as well as the titles—should be compiled soon, at which point more sophisticated analyses can be performed. The quoted reactions of Victorian scholars toward the appearance of statistical tools in their milieu vary from “sheer exhilaration” to “excited and terrified”. But one common reaction to the analysis seems pervasive, and was best expressed by Matthew Bevis, a lecturer at the University of York in Britain: “This is not just a tool; this is actually shaping the kind of questions someone in literature might even ask.”
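The kind of title-scouring described above reduces, at its core, to counting keyword appearances by time period. A toy version (the titles and years here are invented, standing in for the 1,681,161 real ones):

```python
from collections import Counter

# Hypothetical (title, year) records standing in for the scanned corpus
records = [
    ("A Christian Miscellany", 1810),
    ("The Christian Year", 1827),
    ("Principles of Industrial Science", 1875),
    ("Science and Industry", 1890),
]

def keyword_by_decade(records, keyword):
    """Count the titles containing keyword, bucketed by decade,
    to expose a rise or decline over the century."""
    counts = Counter()
    for title, year in records:
        if keyword.lower() in title.lower():
            counts[year // 10 * 10] += 1
    return dict(counts)

print(keyword_by_decade(records, "christian"))  # → {1810: 1, 1820: 1}
print(keyword_by_decade(records, "science"))    # → {1870: 1, 1890: 1}
```

On the real data, a plot of such decade buckets is essentially what the Times graphs show: “Christian” concentrated early in the century, “science” and “industrial” rising later.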

Networked Networks Are Prone to Epic Failure

Many networks are designed to be pretty fail-safe: if failures do occur, there is enough resiliency in the network to route around them, and life can go on as usual. Networks do not live in a vacuum, however, and not uncommonly one network ends up somehow connected to another network, which is somehow connected to another, and so on. What happens then? Is the resulting “super-network” really all that super? As the title of this Wired magazine article hints, researchers are beginning to figure out that the answer is no.

A dramatic real-world example of a cascade of failures (‘concurrent malfunction’) is the electrical blackout that affected much of Italy on 28 September 2003: the shutdown of power stations directly led to the failure of nodes in the Internet communication network, which in turn caused further breakdown of power stations.

The quote above is actually from a Nature article which is one of a few (here’s another, where the picture above comes from) mentioned in the Wired article. All is not doom and gloom, however. With greater understanding comes a greater power to avoid catastrophes like the Italian shutdown.

According to Raissa D’Souza, a University of California, Davis mathematician who studies interdependent networks, the findings are “a starting point for thinking about the implications of interactions.” D’Souza hopes such research will pull together mathematicians and engineers. “We now have some analytic tools in place to study interacting networks, but need to refine the models with information on real systems,” she said.
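The Italy-style cascade has a simple skeleton: a failure in one network knocks out dependent nodes in the other, which knocks out more nodes in the first, and so on until nothing more can fail. A toy sketch with a made-up dependency map (not a model from the papers above):

```python
def cascade(deps, initially_failed):
    """Propagate failures through a dependency map: a node fails
    if any node it depends on has failed. Repeat until stable."""
    failed = set(initially_failed)
    changed = True
    while changed:
        changed = False
        for node, needs in deps.items():
            if node not in failed and any(n in failed for n in needs):
                failed.add(node)
                changed = True
    return failed

# Hypothetical coupling of a power grid and a communication network:
# each node lists the nodes it depends on to keep functioning.
deps = {"net1": ["power1"], "power2": ["net1"], "net2": ["power2"]}
print(sorted(cascade(deps, {"power1"})))  # → ['net1', 'net2', 'power1', 'power2']
```

One failed power station takes down the entire coupled system, even though each network on its own would have shrugged off a single failure; that back-and-forth amplification is the fragility the researchers identified.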

The Aftershocks of Crime

It’s fairly common knowledge that large earthquakes are frequently followed by aftershocks, smaller earthquakes that occur in the same locale or relatively nearby. It is less well known that some types of crime exhibit a similar aftershock phenomenon. For instance, burglaries are frequently followed by burglaries in the same neighborhood or a neighborhood nearby. This story in The Economist describes how mathematicians like George Mohler at Santa Clara University are using this phenomenon to devise methods of predicting where these “aftercrimes” are most likely to occur. The technique literally adapts the same equations used to describe earthquake aftershocks, and appears to hold some promise.

In one test the program accurately identified a high-risk portion of the city in which, had it been adequately patrolled, police could have prevented a quarter of the burglaries that took place in the whole area that day.

Together with researchers at UCLA, Mohler is extending the work to explore another type of crime in which there are often aftershocks: gang violence. Some of that work, and some additional projects involving ‘predictive policing’, are also detailed in the recent LA Times story “Stopping Crime Before It Starts”, by Joel Rubin.
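The earthquake equations being borrowed here are self-exciting point-process models: each event temporarily raises the rate of future events nearby, and that boost decays with time. A minimal sketch of such an intensity function (the parameter values and event times are invented for illustration):

```python
import math

def intensity(t, past_events, mu=0.5, alpha=0.8, beta=1.2):
    """Self-exciting (Hawkes-type) event rate at time t: a constant
    background rate mu, plus a bump alpha * exp(-beta * dt) for each
    past event that decays as time dt since the event grows."""
    return mu + sum(alpha * math.exp(-beta * (t - s))
                    for s in past_events if s < t)

# Hypothetical burglary times (in days) in one neighborhood
events = [1.0, 1.3, 4.0]
# Risk is elevated right after the cluster at days 1.0-1.3,
# and has largely decayed back toward background by day 3.9.
print(intensity(1.5, events) > intensity(3.9, events))  # → True
```

Fitting mu, alpha, and beta to police reports, rather than assuming them, is the statistical work that lets a model like this flag where the next “aftercrime” is most likely.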