*Public Service Announcement: The word clouds included in this post feature some, shall we say, colorful language. If that kind of thing bothers you, skip them. If not, check them out. They’re really quite cool.*
A new study from the University of Pennsylvania looks for connections between language and personality, gender, and age in the world of social media. The study, published this past September via the Public Library of Science, looked at the Facebook messages of 75,000 volunteers, examining around 700 million words, to discover how personality, gender, and age affect people’s use of language. This is a pretty impressive study, if for no other reason than that it is the single largest of its kind. The next largest study, which did a similar linguistic analysis with bloggers, included only 694 authors.
The group used a new type of linguistic analysis called “open-vocabulary” analysis. In the past, with “closed-vocabulary” studies, researchers would choose groups of words they thought would be associated with certain personality traits, genders, etc. and then test accordingly. In an open-vocabulary experiment, the data automatically forms its own associations, allowing for unexpected connections to crop up.
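To make the closed/open distinction a little more concrete, here’s a rough Python sketch. It is my own toy illustration, not the researchers’ actual pipeline: the messages are made up, the function names (`closed_vocab_counts`, `open_vocab_top_words`) are hypothetical, and I’ve swapped in a simple difference in relative word frequency where the paper’s real statistics are surely more involved.

```python
from collections import Counter

# Made-up toy messages for illustration only; NOT data from the study.
MESSAGES = {
    "group_a": ["my wife loves this game", "watching the game with my girlfriend"],
    "group_b": ["her husband is so sweet", "shopping with my amazing boyfriend"],
}

def tokenize(text):
    """Very naive tokenizer: lowercase and split on whitespace."""
    return text.lower().split()

def word_counts(messages):
    """Count every token across a list of messages."""
    return Counter(word for m in messages for word in tokenize(m))

def relative_freqs(messages):
    """Map each word to its share of all tokens in the messages."""
    counts = word_counts(messages)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def closed_vocab_counts(messages, lexicon):
    """Closed vocabulary: only count words the researcher picked in advance."""
    counts = word_counts(messages)
    return {w: counts[w] for w in lexicon}

def open_vocab_top_words(group, other, k=3):
    """Open vocabulary: score every observed word and keep the k words
    most over-represented in `group` relative to `other`.
    (Assumed scoring: difference in relative frequency.)"""
    f_group = relative_freqs(group)
    f_other = relative_freqs(other)
    scores = {w: f - f_other.get(w, 0.0) for w, f in f_group.items()}
    # Sort by descending score, breaking ties alphabetically.
    return sorted(scores, key=lambda w: (-scores[w], w))[:k]

if __name__ == "__main__":
    a, b = MESSAGES["group_a"], MESSAGES["group_b"]
    print(closed_vocab_counts(a, ["my", "party"]))
    print(open_vocab_top_words(a, b))
```

The closed-vocabulary function can only ever report on the words you hand it, while the open-vocabulary one ranks whatever actually shows up in the text, which is how an unexpected word can surface.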
For example, this particular study found that men tended to say “my wife” or “my girlfriend” while women generally used only “husband” or “boyfriend.” While this may seem to support an argument that men are more possessive with their language, the authors do note that this difference also arises because women tended to put “her” or some series of adjectives before husband or boyfriend. So it may also suggest that when it comes to social media, men simply tend to talk about their own relationships more than those of others. If a closed-vocabulary approach had been used, neither of these connections would necessarily have been noticed. While researchers may have predicted that the possessive “my” would be more popular with guys than girls, no doubt pointing to a more possessive or territorial nature or something along those lines, they wouldn’t have been able to see all of the possible roots of that difference.
As I’ve said, the group looked at personality type, gender, and age to see how language use differed in each. They used the Five Factor Model (FFM) to look at personality types. The FFM breaks personality down into five categories: extraversion, agreeableness, conscientiousness, neuroticism, and openness. The goal was then to see what kind of words, phrases, and topics were most common to each personality type. Age presented a much trickier problem for categorization. The vast majority of Facebook users are fairly young, so to avoid lumping the bulk of their sample into one age group, the researchers divided the younger crowd much more finely. They created four age groups: 13-18, 19-22, 23-29, and 30-65. Hopefully that 30-65 group was pretty concentrated at one end of the spectrum or else that’s a hell of a spread to lump in there with the five, three, and six year spans they gave young people.
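For what it’s worth, the age grouping described above is easy to sketch in a few lines of Python. This is only my illustration of the four bins as given in this post; the names `AGE_BIN_STARTS`, `AGE_BIN_LABELS`, and `age_group` are hypothetical, and I’m assuming anyone outside 13-65 simply gets no label.

```python
from bisect import bisect_right

# Lower bounds and labels for the four age groups described in this post.
AGE_BIN_STARTS = [13, 19, 23, 30]
AGE_BIN_LABELS = ["13-18", "19-22", "23-29", "30-65"]

def age_group(age):
    """Return the age-group label for `age`, or None if outside 13-65
    (an assumption; the post doesn't say how such users were handled)."""
    if age < 13 or age > 65:
        return None
    # bisect_right finds how many bin starts are <= age.
    return AGE_BIN_LABELS[bisect_right(AGE_BIN_STARTS, age) - 1]

if __name__ == "__main__":
    for age in (15, 22, 23, 47):
        print(age, age_group(age))
```

Seeing it written out makes the lopsidedness pretty clear: three narrow bins for everyone under 30, and one bin covering the 35 years after that.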
Talking about the group’s results doesn’t quite do them justice. It’s difficult to suitably sum up such a large study in a quick and understandable way. Luckily enough, the group put together a series of word clouds that do a remarkably good job of describing their results, and I’ve sprinkled them throughout this post. Unsurprisingly, more extraverted people’s messages included more socially oriented words, like party or boys/ladies, and introverts tended more towards isolated language, like talking about computers or reading. More interesting differences were found along gender lines, like the “my” tidbit noted earlier, or the significant increase in swearing when you look at men’s messages. Surprisingly, men tend to use fewer emoticons. I only say surprisingly because I love those little cat ones they have on Facebook now. Seriously. Those are obscenely cute.
Age presents another fairly predictable series of developments in language. Younger people tend to use more slang and emoticons, complain about school, and so on. College age kids talk about registering for classes and drinking. When you hit the 23-29 range you get mostly talk about beers when it comes to alcohol, and obviously a lot more words connected to work. That big old 30-65 group used more words associated with family and children, which the authors suggest indicates a greater level of social integration as age increases. They argue that as people get older, family and social connections become more important, so they crop up more, although in my opinion it may just be that older people don’t use Facebook for much more than keeping in touch with family, rather than as a means of self-expression.
By and large, this is a pretty interesting study. It shows how different people approach social media through language, and makes use of an incredibly large sample to substantiate its claims. It also shows how much we can learn from how people use these new forms of technology as they gain popularity. That being said, I really don’t want to read any new studies about the social implications of Instagram filters. Ever.