A team of researchers from the University of Pennsylvania and Stony Brook University has developed an algorithm that identified which Facebook users would later be diagnosed with depression.

For the study, the researchers analyzed social media data shared by consenting users for several months. Based on this data, they developed an algorithm that could predict a future depression diagnosis, in some cases as early as three months before it first appeared in a medical record.

Indicators of depression included mentions of hostility and loneliness, words like “tears” and “feelings,” and greater use of first-person pronouns like “I” and “me.”

“What people write in social media and online captures an aspect of life that’s very hard in medicine and research to access otherwise,” said Dr. H. Andrew Schwartz, senior paper author and a principal investigator of the World Well-Being Project (WWBP).

“It’s a dimension that’s relatively untapped compared to biophysical markers of disease. Considering conditions such as depression, anxiety, and PTSD, for example, you find more signals in the way people express themselves digitally.”

For six years, the WWBP, based at the University of Pennsylvania’s Positive Psychology Center and Stony Brook University’s Human Language Analysis Lab, has been studying how the words people use reflect inner feelings and contentedness.

In 2014, Johannes Eichstaedt, WWBP founding research scientist, began to ask whether social media could predict mental health outcomes, particularly depression.

“Social media data contain markers akin to the genome,” Eichstaedt said. “With surprisingly similar methods to those used in genomics, we can comb social media data to find these markers. Depression appears to be something quite detectable in this way; it really changes people’s use of social media in a way that something like skin disease or diabetes doesn’t.”

Eichstaedt and Schwartz teamed with colleagues Robert J. Smith, Raina Merchant, David Asch, and Lyle Ungar from the Penn Medicine Center for Digital Health for this study.

Rather than recruit participants who had self-reported depression, the researchers used data from people who consented to share both their Facebook statuses and their electronic medical records. They then analyzed the statuses with machine-learning techniques to distinguish the participants who had a formal depression diagnosis.

“This is early work from our Social Mediome Registry from the Penn Medicine Center for Digital Health,” Merchant said, “which joins social media with data from health records. For this project, all individuals are consented, no data is collected from their network, the data is anonymized, and the strictest levels of privacy and security are adhered to.”

Nearly 1,200 people consented to allow researchers to access both digital archives. Of these, 114 people had a diagnosis of depression in their medical records.

The researchers then matched every person with a diagnosis of depression with five who did not have such a diagnosis, to act as controls. That gave 684 people, of whom one was excluded for having too few words in status updates, for a total sample of 683. The goal was to create as realistic a scenario as possible to train and test the researchers’ algorithm.

“This is a really hard problem,” Eichstaedt said. “If 683 people present to the hospital and 15 percent of them are depressed, would our algorithm be able to predict which ones? If the algorithm says no one was depressed, it would be 85 percent accurate.”
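
Eichstaedt's point about the 85 percent figure can be illustrated with a short sketch. The function name `always_negative_baseline` is hypothetical and is not from the study, and the counts use the rounded 15 percent from the quote rather than the study's exact case numbers.

```python
def always_negative_baseline(n_total, n_depressed):
    """Score a classifier that labels every person as not depressed."""
    n_controls = n_total - n_depressed
    accuracy = n_controls / n_total   # every control is labeled correctly
    recall = 0 / n_depressed          # no depressed person is ever flagged
    return accuracy, recall


# Round numbers from the quote: 15 percent of 683 people (about 102).
n_total = 683
n_depressed = round(0.15 * n_total)

accuracy, recall = always_negative_baseline(n_total, n_depressed)
print(f"accuracy={accuracy:.2f}, recall={recall:.2f}")
# prints: accuracy=0.85, recall=0.00
```

High accuracy here says nothing about usefulness: the baseline never identifies a single depressed person, which is why a realistic ratio of cases to controls makes the problem hard.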

To develop the algorithm, the researchers looked back at 524,292 Facebook updates from the years leading to diagnosis for each participant with depression and from the same time span for the controls.

They identified the most frequently used words and phrases and then modeled 200 topics to tease out what they called “depression-associated language markers.” Finally, they compared how, and how often, depressed and control participants used such phrasing.
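
The kind of group comparison described above can be sketched in a few lines. This is only an illustration, not the researchers' method: it counts first-person pronouns instead of modeling topics, the pronoun list and the names `tokenize` and `marker_rate` are assumptions, and the example statuses are invented for the demonstration.

```python
import re
from collections import Counter

# Assumed marker list; the study's actual language markers are not given here.
FIRST_PERSON = {"i", "me", "my", "myself", "mine"}


def tokenize(text):
    """Lowercase a status update and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def marker_rate(statuses, markers):
    """Return the share of all tokens in `statuses` that are marker words."""
    tokens = [tok for status in statuses for tok in tokenize(status)]
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    return sum(counts[m] for m in markers) / len(tokens)


# Invented toy statuses, for illustration only.
group_a = ["I feel so alone and it hurts me", "Why do I always end up in tears"]
group_b = ["Great game last night with the team",
           "We are heading to the beach this weekend"]

rate_a = marker_rate(group_a, FIRST_PERSON)
rate_b = marker_rate(group_b, FIRST_PERSON)
print(f"group A: {rate_a:.3f}, group B: {rate_b:.3f}")
```

Normalizing by the total token count keeps the comparison fair when one group simply posts more than the other.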

They found that these indicators reflected emotional, cognitive, and interpersonal processes such as hostility, loneliness, sadness, and rumination. Those indicators could predict future depression as early as three months before the illness was first documented in a medical record.

“There’s a perception that using social media is not good for one’s mental health,” Schwartz said, “but it may turn out to be an important tool for diagnosing, monitoring, and eventually treating it.”

The findings are published in the journal Proceedings of the National Academy of Sciences.

Source: University of Pennsylvania