Monitoring tweets for symptoms of the flu could boost surveillance
The influenza virus is one of our most enduring opponents. Each year, seasonal epidemics sweep across the globe, leaving up to 5 million people stricken with aching muscles, headaches, feverish chills, and the runny nose, sore throat and coughing of an upper respiratory infection. A further half million people die from their illness.
In addition to being potentially deadly, the flu is also a particularly wily virus. For some viruses, vaccination can protect against infection for life. But the flu is a master shapeshifter, continually changing its appearance and rapidly rendering flu vaccines ineffective. In order to keep pace with the briskly evolving virus, new flu vaccines are developed each year.
Vaccination isn’t the only arrow in our quiver against the flu. Surveillance can help to get the word out that the flu is in town so that people – especially the most vulnerable – can get their annual flu shot before they get sick. Surveillance can also provide hospitals and public health agencies with forewarning of an imminent surge in demand for their services.
In recent years, epidemiologists tracking the flu have been turning to cyberspace to gather their surveillance data, rather than waiting for flu notifications to trickle out from hospitals in the weeks following the local flu season. Traditional epidemiology is now complemented with modern ‘infodemiology.’
In 2008, Google entered the flu monitoring game by launching Google Flu Trends. By tracking internet searches for flu-related health information, Google Flu Trends is able to estimate how many cases of flu are in a particular area in real time. Although this type of surveillance is not specific to the influenza virus, and detects a range of influenza-like illnesses, Google Flu Trends can successfully detect a regional outbreak of influenza 7–10 days earlier than traditional surveillance systems.
Now, a team of Italian researchers reports that Twitter could be used in a similar way. The study, published this week in PLoS ONE, used an algorithm to identify tweets indicative of someone having a flu-like illness. In order to capture casual references to the flu and non-technical language, the team’s algorithm mapped commonly used terms for flu symptoms from online health sites to specific medical terms. For example, ‘sore throat’ was mapped to the medical term ‘pharyngitis.’ The algorithm could then identify all tweets reporting a combination of symptoms, regardless of whether the tweeter had used everyday language or correct medical terminology.
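To make the idea concrete, here is a minimal Python sketch of that kind of term mapping. It is an illustration only, not the researchers' algorithm: apart from ‘sore throat’ and ‘pharyngitis,’ every dictionary entry is a hypothetical example, and the rule that two distinct symptoms make a ‘flu tweet’ is an assumed threshold rather than the study's case definition.

```python
import re

# Hypothetical lay-term -> medical-term map. Only the 'sore throat' ->
# 'pharyngitis' pair comes from the article; the rest are illustrative.
LAY_TO_MEDICAL = {
    "sore throat": "pharyngitis",
    "pharyngitis": "pharyngitis",
    "runny nose": "rhinorrhea",
    "achy muscles": "myalgia",
    "aching muscles": "myalgia",
    "myalgia": "myalgia",
    "fever": "fever",
    "feverish": "fever",
    "cough": "cough",
    "coughing": "cough",
    "headache": "headache",
}

# Assumed threshold: how many distinct symptoms count as a "combination".
MIN_SYMPTOMS = 2


def extract_symptoms(tweet: str) -> set:
    """Return the set of medical terms whose lay or medical name appears in the tweet."""
    text = tweet.lower()
    found = set()
    for term, medical in LAY_TO_MEDICAL.items():
        # Word boundaries stop 'cough' from matching inside unrelated words.
        if re.search(r"\b" + re.escape(term) + r"\b", text):
            found.add(medical)
    return found


def is_flu_tweet(tweet: str) -> bool:
    """Flag a tweet that mentions at least MIN_SYMPTOMS distinct symptoms."""
    return len(extract_symptoms(tweet)) >= MIN_SYMPTOMS
```

Because lay and medical terms collapse to the same label, a tweet saying ‘sore throat and a runny nose’ and one saying ‘pharyngitis and myalgia’ are both flagged, while a tweet that merely mentions ‘the flu’ is not.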
Flu tweets were tracked for four and a half months during the Northern Hemisphere winter, when flu cases are peaking. A one percent sample of tweets – nearly half a billion tweets in total – was monitored, and five and a half thousand were identified as flu tweets. An increasing frequency of flu tweets correlated well with increases in the rates of influenza-like illness reported to the US Centers for Disease Control and Prevention and with Google Flu Trends estimates.
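A comparison like this is commonly summarized with a correlation coefficient. The sketch below computes a Pearson correlation between two weekly series. It assumes Pearson's r as the measure, which the article does not specify, and the weekly numbers are invented placeholders for illustration, not data from the study.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Invented placeholder values, NOT from the study: weekly counts of flagged
# tweets and weekly influenza-like illness rates (percent of doctor visits).
weekly_flu_tweets = [120, 180, 260, 410, 530, 480, 350]
weekly_ili_rate = [1.6, 2.0, 2.8, 3.9, 4.6, 4.3, 3.4]

r = pearson(weekly_flu_tweets, weekly_ili_rate)
print(f"Pearson r = {r:.3f}")
```

A value of r close to 1 means the two series rise and fall together, which is the pattern the researchers describe between flu tweets and officially reported illness.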
Unlike Google Flu Trends, which relies on the change in demand for health information, the Twitter-based algorithm that the team developed garners its data from innocuous comments and discussions posted on the microblogging site. According to the authors, this ‘supply-based’ infodemiology has the potential to reduce some of the noise associated with gathering data from search queries.
Google Flu Trends frequently overestimates flu cases because it also counts searches by people who are not sick but are simply looking for information on the flu, or on a pandemic unfolding on the other side of the globe. By favoring colloquial references to symptoms of the flu, the Twitter algorithm avoids this overestimation, preferentially detecting actual cases of influenza-like illness. If you’re compelled to tweet about your sore throat, runny nose and achy muscles, chances are that you do in fact have an influenza-like illness. It might be time to switch off the computer and jump into bed, but public health officials might be glad you tweeted your woes first.
Reference: Gesualdo F, Stilo G, Agricola E, Gonfiantini MV, Pandolfi E, Velardi P & Tozzi AE (2013) Influenza-like illness surveillance on Twitter through automated learning of naïve language. PLoS ONE doi:10.1371/journal.pone.0082489