Sydney Bednar

Are We Even Listening? Taking a Closer Look at Pro-Anorexia Communities

A topic modeling approach to understanding the experiences of those struggling with an eating disorder

By Sydney Bednar

TRIGGER WARNING: This article will discuss personal experiences with eating disorders.

Introduction

More than 9% of the global population will have an eating disorder at some point over their lifetime. About 26% of this population will attempt suicide, and only 10% will receive treatment. So where do the other 90% go?

When we are affected by untreated diseases, it is common for us to look for support wherever we can. For many of those struggling with eating disorders, this supportive space is on the internet, in eating disorder communities. The concept of medical support communities is not new; in fact, a few are remarkably high-functioning, such as online communities for chronic pain and cancer. In some eating disorder communities, however, there exists a subculture of pro-ana (pro-anorexia) thinking. Users in these spaces comfortably trade harmful dieting tips and medical misinformation. Although most social media platforms have attempted to moderate these spaces, pro-ana websites continue to operate under the radar as message boards, blogs and Q&A threads where users can post about their experiences anonymously.

Surprisingly, existing research is divided on whether pro-ana communities are harmful or helpful and, in turn, whether people concerned for those affected by eating disorders (EDs) should intervene. Some studies of pro-ana groups have labeled them as empowering spaces for pro-ana “practitioners” who gain a positive sense of selfhood through their pro-ED advocacy. These researchers argue that pro-ana groups are transgressive, as young members resist cultural critiques of their bodies. Some studies also highlight the importance of pro-ana communities for providing social support. Researchers have used these findings to suggest that clinicians intervene in pro-ana sites by actively participating in them. They propose that by accurately answering common medical questions, clinicians could prevent patients from worsening their disorders. Others, however, have described pro-ana sites as spaces that propagate an incredibly toxic subculture. They cite pro-ana communities as an “urgent issue,” and they provide guidelines for clinicians, young people, parents, and educators on how to address it. This rigid labeling, however, does not fully capture the complexity of interactions that occur between members of these communities.

Eating disorders can affect anyone, and these communities include the perspectives of people with different genders, ages and cultural backgrounds. So how can we as researchers encompass a group with such diverse experiences? In short, we need to go beyond simplistic labels and try to understand what it is that members of these communities really need. Rather than condemning these groups as harmful or celebrating them as welcoming, we should be asking: how can our understanding of interactions within these groups inform the areas of focus for health care providers, peers, family members and those who are struggling with an eating disorder?

Existing ethnographic research and manual content analysis have shed some light on the experiences of pro-ana community members, but these methods are limited in scale, and a comprehensive study takes a long time to conduct. For my research, I used text mining approaches to carry out a large-scale comparative study on a dataset of 50,591 posts.

Corpus and Methods

I gathered data from three distinct pro-ana communities: MyProAna (MPA), Eating Disorder Central (EDC), and EDAnonymous on Reddit. These are widely seen by outsiders and users as pro-ana due to the substantial amount of pro-ana content that circulates on them. Because I wanted to compare communities of different forms, I selected these three by referring to previous studies, online threads, and outbound links from inactive communities.

NOTE: There is some disagreement around the definition of “pro-ana” sites. In fact, one study even cited MyProAna as a recovery community. However, for the purpose of this research and based on my own readings of the text, I will be referring to these three communities as pro-ana.

Using Selenium and the Reddit API, I created a web scraper to retrieve the text from each post, as well as associated metadata such as the username, number of likes, and post time. I collected documents spanning from July 2018 to November 2022 in order to represent recent topics of discussion.
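To make the shape of the collected data concrete, here is a minimal sketch of one scraped record and of the date-window filter. The field names, the `Post` type, and the helper functions are hypothetical and chosen only for illustration; the sketch leaves out the scraping itself and only encodes the metadata and the July 2018 to November 2022 window described above.

```python
from datetime import datetime
from typing import NamedTuple

class Post(NamedTuple):
    """One scraped post. Field names are illustrative assumptions."""
    community: str
    username: str
    likes: int
    posted_at: datetime
    text: str

# Collection window from the article: July 2018 through November 2022.
WINDOW_START = datetime(2018, 7, 1)
WINDOW_END = datetime(2022, 12, 1)  # exclusive upper bound

def in_window(post: Post) -> bool:
    """Return True if the post was made inside the collection window."""
    return WINDOW_START <= post.posted_at < WINDOW_END

def filter_posts(posts):
    """Keep posts that have some text and fall inside the window."""
    return [p for p in posts if p.text.strip() and in_window(p)]

# Invented records, used only to exercise the filter.
sample = [
    Post("reddit", "user_a", 3, datetime(2018, 6, 30), "too early"),
    Post("reddit", "user_b", 5, datetime(2020, 1, 15), "inside the window"),
    Post("reddit", "user_c", 0, datetime(2022, 12, 1), "too late"),
    Post("reddit", "user_d", 1, datetime(2021, 3, 2), "   "),
]
kept = filter_posts(sample)
print([p.username for p in kept])
```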

After cleaning the data, I implemented a topic modeling algorithm, which helps to identify the main topics within a corpus of texts. Topic modeling is an unsupervised machine learning method that, by performing iterations of probability assignments, determines how likely it is that

a) a given occurrence of a word belongs to each topic and

b) a given topic is associated with each document.
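The article does not name the specific algorithm, so the following is only a sketch of one common choice: latent Dirichlet allocation (LDA) fitted with collapsed Gibbs sampling. The function name `fit_lda`, the hyperparameters `alpha` and `beta`, and the toy corpus are all assumptions made for illustration. The sketch shows how repeated probability assignments produce both quantities listed above: a word-in-topic probability (a) and a topic-in-document probability (b).

```python
import random

def fit_lda(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA on tokenized docs (lists of words)."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)

    doc_topic = [[0] * n_topics for _ in docs]             # tokens of doc d in topic k
    topic_word = [[0] * n_words for _ in range(n_topics)]  # uses of word v in topic k
    topic_total = [0] * n_topics                           # tokens assigned to topic k
    assignments = []

    # Start by giving every word occurrence a random topic.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove this occurrence from the counts before resampling it.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Weight = (topic's share of this document) x (word's share of the topic).
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta) / (topic_total[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    # theta[d][k]: weight of topic k in document d (each row sums to 1).
    theta = [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in row]
        for row, doc in zip(doc_topic, docs)
    ]
    # phi[k][v]: probability of word v under topic k (each row sums to 1).
    phi = [
        [(c + beta) / (topic_total[k] + n_words * beta) for c in row]
        for k, row in enumerate(topic_word)
    ]
    return vocab, theta, phi

# Invented toy corpus, unrelated to the real dataset.
toy_docs = [
    "sleep night tired sleep".split(),
    "night tired sleep night".split(),
    "school exam class exam".split(),
    "class school exam school".split(),
]
vocab, theta, phi = fit_lda(toy_docs, n_topics=2)
for row in theta:
    print([round(x, 2) for x in row])
```

In practice a library implementation would be used on a corpus of this size; the pure-Python version is here only because it keeps every step of the sampling visible.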

In order to implement the model, I had to determine the optimal number of topics. To find this number, I tested the model on 5, 10, 15, 20, and 30 topics, and I determined that 10 topics produced the most distinct and representative groups. In the end, the output of the model is a list of the most probable words for each topic, sorted in descending order, as well as the proportional weight of each topic associated with each document. Because the model is unsupervised, however, it cannot find meaning in these lists of words that define a topic. For the most comprehensive results, we need to combine computational methods with our own manual analysis. So, for each topic, I read its 20 most representative posts and assigned a label based on my interpretation of the texts.
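As an illustration of the first output described above, the sketch below ranks the words of each topic by probability, in descending order. The vocabulary, the probabilities, and the helper `top_words` are invented for the example and do not come from the real model.

```python
def top_words(vocab, word_probs, n=5):
    """Return the n most probable words for one topic, most probable first."""
    ranked = sorted(zip(vocab, word_probs), key=lambda pair: pair[1], reverse=True)
    return [word for word, _ in ranked[:n]]

# Hypothetical topic-word probabilities: one row per topic, one column per word.
vocab = ["exam", "night", "school", "sleep"]
topic_word = [
    [0.05, 0.40, 0.05, 0.50],  # topic 0
    [0.50, 0.05, 0.40, 0.05],  # topic 1
]

for k, probs in enumerate(topic_word):
    print(k, top_words(vocab, probs, n=2))
```

A list like this is what the manual labeling step starts from: the words suggest a theme, and reading the most representative posts confirms or corrects it.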

Here’s a walk-through of the process.

How does topic modeling work?

Let's look at a real post from our dataset as an example...

Which topics are associated with this post?

Our model tells us that this post is associated with a few topics, including two topics that both have the word "weight" as one of their most probable words.

Which topics are the most important?

And we can get a document-topic matrix, which shows us the proportional weights of each topic associated with that post.
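A small sketch of how one row of a document-topic matrix can be read. The numbers, the 0.15 threshold, and the helper names are assumptions for illustration; note that the sketch counts topics from 0, whereas the article numbers them from 1.

```python
def dominant_topic(weights):
    """Index of the topic with the largest weight in one document."""
    return max(range(len(weights)), key=lambda k: weights[k])

def topics_above(weights, threshold=0.15):
    """(topic, weight) pairs at or above the threshold, heaviest first."""
    pairs = [(k, w) for k, w in enumerate(weights) if w >= threshold]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)

# Hypothetical document-topic matrix: one row per post, each row sums to 1.
doc_topic = [
    [0.05, 0.60, 0.35],
    [0.70, 0.20, 0.10],
]

print(dominant_topic(doc_topic[0]))
print(topics_above(doc_topic[0]))
```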

Which documents are the most representative of a given topic?

We can also reverse this matrix and see which documents are most highly associated with a given topic.
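Reading the matrix by column instead of by row can be sketched as follows. The matrix values and the helper `most_representative` are invented for the example; the default of 20 mirrors the 20 posts read per topic during labeling.

```python
def most_representative(doc_topic, topic, n=20):
    """Indices of the n documents with the highest weight for one topic."""
    ranked = sorted(
        range(len(doc_topic)),
        key=lambda d: doc_topic[d][topic],
        reverse=True,
    )
    return ranked[:n]

# Hypothetical document-topic matrix: three posts, two topics.
doc_topic = [
    [0.1, 0.9],
    [0.8, 0.2],
    [0.4, 0.6],
]

print(most_representative(doc_topic, topic=1, n=2))
```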

Why should we care?

This process helps us label common themes in these communities so that we can better understand their needs.

What are these communities talking about?

Many current studies analyze eating disorder communities under two umbrellas: social support and eating-disorder-specific content. So, based on previous work and my initial readings of threads in eating disorder communities, I assumed that I would find some harmful topics such as dieting tips along with some more hopeful topics such as support, recovery, and empathy. What is most interesting about the results of this model, however, is that half of the topics focus on the experience of having an eating disorder and the challenges that come with it.

Of the ten topics extracted by the model, three concern dieting tips, goal weights, and discussions about calorie counting and restricting. Two are dedicated to messages which show support or express empathy as well as messages which discuss recovery. The other five topics all concern what it is like to live with an eating disorder.

I want to reiterate that although what goes into each topic is algorithmically extracted, the label is a human judgment. As I mentioned, the labels I landed on are informed by manual examination of the top words and posts for each topic.

With eating disorders, it can be difficult to get people who are struggling to speak openly and honestly about their experiences. Specifically, individuals with eating disorders who are not oriented towards recovery may attempt to conceal their disorders and have difficulty recognizing the negative implications of their own condition. In pro-ana communities, however, it appears that there is a different set of expectations for self-presentation. Although we cannot exactly assess the authenticity of this honesty, our evidence suggests that the social norms of these communities differ from those of in-person interactions.

Many pro-ana communities even have their own principles or rules for members. Eating Disorder Central’s founding principles state that:

“We believe those suffering from eating disorders deserve a space where they can be unapologetically themselves and are able to freely express their ED behaviors free from censorship and suppression.”

This emphasis on openness and participation fosters an environment with unique topical trends. Understanding these trends is the first step to uncovering the gaps in current treatment.

TIP: Reading topic cards

Throughout the article, you will see cards corresponding to each topic. Here is what each card will tell you:

Which topics are the most common?

Before running the model, as I mentioned earlier, I expected that the most commonly occurring topics would surround dieting and weight loss or emotional support. From the results of this model, we find this to be true: two of the most commonly discussed topics across all of the communities are topic 3 (support/empathy) and topic 10 (goal weights), with topic 10 being equally salient across the three communities.

However, I was surprised to find that two other topics are discussed frequently as well: topic 5 (living with an ED) and topic 8 (feeling hungry). In posts that are dominated by these topics, users candidly describe their experiences with an eating disorder, often mentioning feeling misunderstood. One user highlights the issue of generalizing a complex disease:

“Making sweeping generalizations to cover an entire anorexia population is never going to be accurate, people have anorexia for many different reasons. For me, it's the opposite, it's wanting comfort and safety and not wanting to be in an adult world”

A previous study which analyzed content on pro-ana blogs described them as platforms for social support. Topics 5 and 8 from my model align with their finding that “reciprocal self-disclosure” is a specific form of social support that “may be one of the primary ways in which pro-ana blogs facilitate the sense of virtual community among their members.”

Overall, it was interesting to find that half of the topics concern the actual experience of having an eating disorder rather than just dieting tips and tricks, which is the common preconception of what content in pro-ana communities is like. It appears that users feel comfortable sharing the most challenging aspects of anorexia as well as its repercussions. On top of that, users seek to relate to each other but are not afraid to speak up when they disagree with something that was posted.

What's interesting here?

One topic that I want to highlight is topic 9 (triggers). This topic is not the most popular one across all three communities, and it is not more prevalent in one community than in another. While triggers are not an unexpected topic, I was surprised that triggering conversations often focused on celebrities and media. Many posts describe specific films, television shows, or famous people who trigger their ED. Interestingly, not all of these posts are referring to “triggers” in a negative way.

I want to point out this topic because I think it is a good example of the limitations of this model. Although the model grouped these posts together as having the same dominant topic, the sentiment of each post varies. Some posts condemn celebrities and influencers for being triggering, like this one:

“I find her videos so triggering. And I hate how she always points out body checks while including them in her videos. Like she is criticizing them but then also getting views off of them and giving them even more attention. If she really has a problem with body checks then why not cut them off or even avoid stitching videos with them altogether??”

Meanwhile, other posts describe triggering content as “therapeutic,” like this one:

“did NOT expect the music video for anti hero to be that ed heavy but i’m kinda glad tbh. it’s therapeutic.”

Yes, both of these posts are referring to media and triggers, but do they really mean the same thing? I would argue that they don’t. In the first one, the user is describing posting ED content as something harmful and painful. In the second, the user says they are comforted by ED content. Other posts with high salience of this topic also use the word trigger in the context of encouraging or motivating their ED. Because of this inconsistency, topic 9 was difficult to label.

Despite the challenge of capturing sentiment, topic modeling is still a valuable tool for textual analysis. With a large amount of data, topic modeling helps us efficiently organize, comprehend, and summarize patterns in a corpus. In the case of this corpus, it also provides a foundation for us to compare content amongst different communities.

How do topics vary across communities?

I ran permutation tests to identify statistically significant differences in topic saliency across communities. All of the results that I cite here, including average saliency and percent difference, were statistically significant at the p<0.05 level. I found that topic 5 (living with an ED) is on average more salient in the Reddit community than in the other two (14% distribution on Reddit vs 11% for MPA and 9% for EDC). Specifically, this topic is 37% more prevalent on Reddit than it is on EDC. We can infer that Reddit’s threaded format affords these types of discussions, which involve relating and responding to each other. EDC, however, consists of individual responses to a prompt without as much threading as Reddit.
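For readers unfamiliar with permutation tests, here is a minimal sketch of one. The article does not state the test statistic, the number of permutations, or whether any correction was applied, so the two-sided difference in means, the function name, the default of 10,000 shuffles, and the sample values below are all assumptions for illustration.

```python
import random
from statistics import mean

def permutation_test(a, b, n_perm=10_000, seed=0):
    """Two-sided permutation test for a difference in means.

    Returns an estimated p-value: the share of random relabelings whose
    absolute mean difference is at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # randomly reassign posts to the two groups
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    # Add one to both counts so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)

# Invented per-post weights of a single topic in two communities.
community_a = [0.16, 0.14, 0.15, 0.13, 0.17, 0.15, 0.14, 0.16]
community_b = [0.09, 0.10, 0.08, 0.11, 0.09, 0.10, 0.08, 0.09]

p_value = permutation_test(community_a, community_b, n_perm=5_000)
print(p_value < 0.05)
```

The appeal of this kind of test is that it makes no assumption about how topic weights are distributed; it only asks how often a difference this large would appear if community labels were assigned at random.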

It is interesting how interactions differ between EDC and Reddit because of their different forms. From a human-computer interaction perspective, this made me reflect on how design choices can greatly impact the social behaviors of users on a platform. The user experience of the three communities in my dataset should be considered as a factor when comparing them.

Most posts on EDC and MPA take the form of a journal entry or Q&A. Topic 7 (calorie restriction) fits the Q&A form and is more prevalent on EDC and MPA than on Reddit (14% distribution on EDC vs 13% on MPA and 7% on Reddit). Here’s a post where one user poses a question about restriction (I have replaced specific numbers with # signs to draw attention to the structure of this post rather than to the numbers themselves):

“for example: let's say your daily intake goal is ### calories or less and you eat ### but you burn ###. so your net calories for the day would be ###. for me, i never feel like that's good enough. i will still feel bad about the extra ### calories i ate even though i burned twice as much. does anyone relate?”

In the case of topic 7, we must also consider the impact of content moderation.

Because Reddit is a more popular forum, it is also more moderated, and we often see posts replaced by a message from a Reddit bot explaining that the post was removed for harmful content. So it’s not surprising that topic 7, which has the potential to spread harmful nutritional misinformation, is 96% more prevalent on EDC, an unmoderated site, compared to Reddit.
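The article does not give the formula behind its "more prevalent" percentages. One plausible reading, assumed here, is the relative difference between two average saliencies. Applied to the rounded averages for topic 7 (14% on EDC, 7% on Reddit), it gives a value close to, but not identical to, the reported 96%, presumably because the reported figure was computed from unrounded averages.

```python
def percent_more_prevalent(a, b):
    """Relative difference of a over b, expressed in percent."""
    return (a - b) / b * 100

# Rounded average saliency of topic 7, in percent, as cited in the text.
edc_topic7 = 14
reddit_topic7 = 7

print(percent_more_prevalent(edc_topic7, reddit_topic7))  # 100.0
```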

One reason why I chose MPA as a community of interest was this article, which describes MyProAna as a recovery community. But after reading numerous threads on the site, I was not so convinced. While the community guidelines emphasize support, many of the users on this site still discuss pro-anorexia topics.

If this labeling were accurate, I would expect MPA to discuss topic 4 (recovery) and topic 3 (support) significantly more often than the other two communities. We do find that when comparing MPA and Reddit, topic 4 (recovery) is 37% more salient on MPA. However, compared to EDC, it is only 10% more salient. Recovery is also the least salient topic across all three communities. Similarly, topic 3 (support) is 22% more salient on MPA compared to EDC, but it is only 8% more salient compared to Reddit.

From these results, we can see how a simplistic label does not encapsulate the entire community.

Where do we go from here?

Eating disorders are incredibly under-treated. They need to be addressed more seriously. In order to do so, we need to put conscious effort into understanding the complicated emotions and behaviors of those who are struggling. While eating disorder treatment has come a long way, it is still de-prioritized and neglected compared to disorders such as substance abuse. My research examined three pro-ana communities, but future work should continue to use computational methods to look at an even larger number of communities in order to prescribe the most effective solution.

Intervention should go beyond just shutting down these sites or directing people toward ones that are “better” based on subjective opinion. The reality is that for every site that is shut down or moderated, another site will replace it. Instead, we should be listening to what people are saying in these communities that they are not saying to a parent, a therapist, or a friend. And we should ask ourselves why that is.

The fact that people feel more comfortable sharing their experiences honestly and openly with anonymous strangers online than with the people who are closest to them should be enough to tell us that something needs to change.
