Official: Social media, web tracking can predict disease outbreaks
WASHINGTON — Researchers tracking social media and Web searches have detected outbreaks of the flu and rare diseases in Latin America up to two weeks before they were reported by local news media or government health agencies, a U.S. intelligence official told USA TODAY.
Working at a series of universities and companies around the country, the researchers are part of a program led by the Intelligence Advanced Research Projects Agency (IARPA) that is aimed at anticipating critical societal events, such as disease outbreaks, violent uprisings or economic crises, before they appear in the news.
"The goal is to use publicly available information to predict events, such as political violence, disease outbreaks and economic crises," said Jason Matheny, program manager of IARPA's Open Source Indicators program. "We're using leading indicators like social media, Web search trends, Wikipedia in order to identify the events. We're looking at flu outbreaks or other signs of unrest in a population."
IARPA's goal, Matheny said, is to inform U.S. policymakers about major events early enough to make more of a difference. Too often, he said, public announcements of disease outbreaks come too late. Intelligence analysts with access to a system able to eliminate the clutter that's common in open source data may be able to get a jump on disease outbreaks or other problems.
For example, analysts monitoring Web searches for information in a given country about the symptoms of a disease such as cholera might determine that residents of that country are experiencing an outbreak. Taken alone, that may not mean much, but when combined with data such as canceled restaurant reservations on sites like Open Table, the data may signal a larger health emergency.
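The idea of combining two weak signals can be illustrated with a minimal Python sketch. It is not IARPA's method, which the article does not describe. The function names, the sample counts and the threshold of two standard deviations are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def z_score(history, current):
    """How many standard deviations `current` sits above the mean of `history`."""
    spread = stdev(history)
    if spread == 0:
        return 0.0
    return (current - mean(history)) / spread


def outbreak_signal(search_history, search_now,
                    cancel_history, cancel_now, threshold=2.0):
    """Flag a possible outbreak only when BOTH indicators are unusually high.

    Hypothetical rule: a spike in symptom searches alone is ignored, but a
    spike in searches together with a spike in canceled reservations is flagged.
    """
    search_z = z_score(search_history, search_now)
    cancel_z = z_score(cancel_history, cancel_now)
    return search_z >= threshold and cancel_z >= threshold


# Made-up daily counts for one country (illustrative only).
searches = [100, 110, 95, 105, 90]
cancellations = [20, 22, 18, 21, 19]

print(outbreak_signal(searches, 180, cancellations, 35))
print(outbreak_signal(searches, 180, cancellations, 20))
```

Requiring both indicators to spike is one simple way to cut down on false alarms from a single noisy source. A real system would likely weigh many more signals.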
"The early information is critical in detecting the problem and getting people treatment," Matheny said. "One of the goals of this is provide that kind of early warning system. It can protect U.S. citizens abroad or enable the government to issue travel advisories about certain places."
While Matheny would not disclose the cost of the program, officials at Virginia Tech, which is part of the research team, said it was part of a three-year, $13.36 million program. Matheny said the research has another year left. Other participants, IARPA documents show, are the University of Maryland, Cornell University, Harvard Medical School, San Diego State University and CACI and Basis Technology, two military contractors.
Last week, IARPA issued a request for information looking for potential researchers to help code the torrent of societal events they collect as part of the open source indicators program.
All of the information analyzed by the IARPA researchers is open source, which means it is publicly available and not classified. Although the intelligence community's image is of clandestine operatives working in dangerous locations, open source information is a growing part of what analysts use to paint a more accurate picture of what is happening around the world.
IARPA is the intelligence community's version of the Defense Advanced Research Projects Agency (DARPA), which performs much of the military's research into technology to make better weapons or improve medical treatments. The agency says it "invests in high-risk, high-payoff research programs that have the potential to provide the United States with an overwhelming intelligence advantage over future adversaries" and reports to the director of National Intelligence.
Despite its early success in anticipating events, Matheny said, IARPA is concerned about what it calls "model drift," in which a system designed to track data in one location might provide incorrect information about another country. In late January, IARPA issued a request for research proposals to stop model drift.
"The model drift is to protect us from having a program become outdated," Matheny said. "Say you had a model that allowed you to track disease outbreaks that was designed and used in 2014. But now it's 2016. How confident can you be that it works as it was designed to? That's a problem with model drift.