The Data Science Behind r/antiwork's Upvotes

"The dataset for our analysis was shaped by filtering out potentially biased comments, ensuring that the final set was representative and valid for our study."

"We categorized users as 'light' or 'heavy' based on their engagement levels, with light users comprising a majority of posters while heavy users contributed significantly to overall engagement."

This section outlines the methodology used to analyze posts from the r/antiwork subreddit, covering data collection, user categorization, and thematic periodization. Posts from January 2019 to July 2022 were retrieved through the Pushshift API. After filters excluded biased or irrelevant comments, the dataset contained over 11 million usable comments and nearly 285,000 posts. Users were divided into 'light' and 'heavy' categories by engagement level. Most were light users, and this split informs the analysis of who participates in discussions of anti-work sentiment.
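The light/heavy split can be pictured with a small sketch. The summary does not state the cutoff the study used, so `heavy_threshold`, the function name `categorize_users`, and the toy author names below are all hypothetical. This is only one plausible way to bucket users by post count, not the authors' method.

```python
from collections import Counter


def categorize_users(post_authors, heavy_threshold=10):
    """Label each author 'light' or 'heavy' by how many posts they made.

    heavy_threshold is an assumed, illustrative cutoff; the study's
    actual engagement criterion is not given in this summary.
    """
    counts = Counter(post_authors)  # posts per author
    return {
        author: "heavy" if n >= heavy_threshold else "light"
        for author, n in counts.items()
    }


# Toy input: one entry per post, holding the (made-up) author name.
posts = ["alice"] * 12 + ["bob"] * 2 + ["carol"]
labels = categorize_users(posts)
light_share = sum(v == "light" for v in labels.values()) / len(labels)
```

In this toy input most users come out as light while a single heavy user accounts for most of the posts, which mirrors the pattern the summary describes.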

#rantiwork #user-engagement #social-media-analysis #methodology #reddit-posts

Read at Hackernoon

