What is it about?
A study of 1.53 billion tweets suggests that removing “LOL” and other misleading words can improve well-being estimates and monitoring ability.
Why is it important?
Social media posts can help us understand how people are adapting to and coping with the new normal. But our words are useful not just to understand what we – as individuals – think and feel. They’re also useful clues about the community we live in. Why is this so?
Perspectives
Words like “lol” confound word-level methods because their contemporary use on social media is out of sync with their emotion scores in typical dictionaries, which interpret them as expressions of happiness. How is internet use changing the connotations of typically positive or negative words? And how do these connotations vary with culture and region? These questions need to be addressed before standard measurements can work as expected for estimating the well-being of populations, and not merely of individuals. Our findings show that the words posted by county residents on social media can offer a signal of their well-being, over and above their socioeconomic markers.
Dr. Kokil Jaidka
National University of Singapore
Read the Original
This page is a summary of: Estimating geographic subjective well-being from Twitter: A comparison of dictionary and data-driven language methods, Proceedings of the National Academy of Sciences, April 2020, DOI: 10.1073/pnas.1906364117.
Resources
Data and language model
We've released the resources for this paper. Key highlights: the county-level language features used for the analysis, and the best-performing Life Satisfaction language model (trained on the Facebook data of survey respondents).
What do your tweets say about your happiness?
Article in Inverse magazine about our study
Improving the prediction of regional well-being from tweets
Featured Research in NUS News
Estimating geographic subjective well-being from Twitter: A comparison of dictionary and data-driven language methods
Open-access version of the study
Main result
Removing as few as three words ("lol", "love" and "good") from word-level emotion measurements can improve regional estimates of well-being.
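To make the idea concrete, here is a minimal Python sketch of a word-level (dictionary) score computed with and without an exclusion list. It is an illustration only, not the study's pipeline: the lexicon, its scores, the example posts, and the function names are all invented for this sketch, and only the three excluded words come from the result above.

```python
import re
from collections import Counter

# Toy valence lexicon. The scores are made up for illustration and do not
# come from any dictionary used in the study.
TOY_LEXICON = {
    "lol": 0.8,
    "love": 0.9,
    "good": 0.7,
    "happy": 0.9,
    "sad": -0.8,
    "tired": -0.4,
}

# The three words named in the main result.
MISLEADING_WORDS = frozenset({"lol", "love", "good"})


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def region_score(posts, lexicon, excluded=frozenset()):
    """Frequency-weighted mean valence of lexicon words across all posts.

    Words in `excluded` are ignored. Returns None if no lexicon word matches.
    """
    counts = Counter(
        tok
        for post in posts
        for tok in tokenize(post)
        if tok in lexicon and tok not in excluded
    )
    total = sum(counts.values())
    if total == 0:
        return None
    return sum(lexicon[word] * n for word, n in counts.items()) / total


# Hypothetical posts from one region, where "lol" does not signal happiness.
posts = ["lol so tired", "sad day lol", "good grief, tired again"]

raw = region_score(posts, TOY_LEXICON)
filtered = region_score(posts, TOY_LEXICON, excluded=MISLEADING_WORDS)
print(f"raw score: {raw:.3f}, filtered score: {filtered:.3f}")
```

In this toy case the raw score is positive because "lol" and "good" are counted as happy words, while the filtered score is negative. Whether removal helps on real data depends on the dictionary and the population, which is what the study evaluates.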