Abstract: Background: Social media can provide the healthcare industry with valuable feedback
from patients who reveal and express their medical decision-making process, as well as self-reported
quality of life indicators both during and after treatment. In prior work (Crannell et al.), we
studied an active cancer patient population on Twitter and compiled a set of tweets describing
their experience with this disease. We refer to these online public testimonies as “Invisible Patient
Reported Outcomes” (iPROs), because they carry relevant indicators, yet are difficult to capture
by conventional means of self-report.
Methods: Our present study aims to identify tweets related to the patient experience as an additional
informative tool for monitoring public health. Using Twitter’s public streaming API, we
compiled over 5.3 million “breast cancer” related tweets spanning September 2016 through mid-December
2017. We combined supervised machine learning methods with natural language processing to
identify tweets relevant to breast cancer patient experiences. We analyzed a sample of 845 breast cancer
patient and survivor accounts, responsible for over 48,000 posts. We investigated tweet content with
a hedonometric sentiment analysis to quantitatively extract emotionally charged topics.
Results: We found that positive experiences were shared regarding patient treatment, raising support,
and spreading awareness. Discussions related to healthcare were also prevalent and largely
negative, focusing on fear of political legislation that could result in loss of coverage.
Conclusions: Social media can provide a positive outlet for patients to discuss their needs and
concerns regarding their healthcare coverage and treatment. Capturing iPROs from online
communication can help inform healthcare professionals and lead to more connected and personalized
Abstract: Twitter, a popular social media outlet, has evolved into a vast source of linguistic data, rich with opinion, sentiment, and discussion. As Twitter's popularity has grown, its perceived potential for exerting social influence has led to the rise of a diverse community of automatons, commonly referred to as bots. These inorganic and semi-organic Twitter entities range from the benevolent (e.g., weather-update bots, help-wanted-alert bots) to the malevolent (e.g., bots that spam messages, advertisements, or radical opinions). Existing detection algorithms typically leverage metadata (time between tweets, number of followers, etc.) to identify robotic accounts. Here, we present a powerful classification scheme that exclusively uses the natural language text from organic users to provide a criterion for identifying accounts posting automated messages. Since the classifier operates on text alone, it is flexible and may be applied to any textual data beyond the Twittersphere.
Abstract: Twitter has become the “Wild West” of marketing and promotional strategies for advertising agencies. Electronic cigarettes have been heavily marketed across Twitter feeds, with offers of discounts, “kid-friendly” flavors, algorithmically generated false testimonials, and free samples.
All tweets containing electronic cigarette keywords from a 10% sample of Twitter spanning January 2012 through December 2014 (approximately 850,000 total tweets) were identified and categorized as Automated or Organic by combining a keyword classification with a machine-trained Human Detection algorithm. A sentiment analysis using Hedonometrics was performed on Organic tweets to quantify the change in consumer sentiment over time. Commercialized tweets were topically categorized with key phrasal pattern matching.
The overwhelming majority (80%) of tweets were classified as automated or promotional in nature. The majority of these tweets were coded as commercialized (83.65% in 2013), up to 33% of which offered discounts or free samples and appeared on over a billion Twitter feeds as impressions. The positivity of Organic (human) classified tweets decreased over time (5.84 in 2013 to 5.77 in 2014) due to a relative increase in negative words such as ‘ban’, ‘tobacco’, ‘doesn’t’, ‘drug’, ‘against’, ‘poison’, ‘tax’ and a relative decrease in positive words such as ‘haha’, ‘good’, ‘cool’. Automated tweets are more positive than Organic tweets (6.17 versus 5.84) due to a relative increase in marketing words such as ‘best’, ‘win’, ‘buy’, ‘sale’, ‘health’, ‘discount’ and a relative decrease in negative words such as ‘bad’, ‘hate’, ‘stupid’, ‘don’t’.
Due to the youth presence on Twitter and the clinical uncertainty of the long term health complications of electronic cigarette consumption, the protection of public health warrants scrutiny and potential regulation of social media marketing.
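The positivity values quoted in this abstract (for example 6.17 versus 5.84) are averages of per-word happiness scores weighted by word frequency. The following is a minimal sketch of that kind of calculation, not the authors' implementation. It assumes a 1 to 9 happiness scale and the exclusion of near-neutral words within a fixed distance of the scale midpoint; the word scores in TOY_SCORES, the sample tweets, and the function names are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical happiness scores on an assumed 1-9 scale. These values are
# illustrative only and are not taken from any published word list.
TOY_SCORES = {
    "win": 7.8, "best": 7.6, "good": 7.4, "haha": 7.6,
    "the": 5.0, "ban": 2.8, "poison": 1.9, "hate": 2.2,
}


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def average_happiness(texts, scores, neutral=5.0, delta=1.0):
    """Frequency-weighted mean happiness of all scored words in `texts`.

    Words missing from `scores`, or scored within `delta` of `neutral`,
    are skipped (assumption: near-neutral words carry little signal).
    Returns None when no scored word remains.
    """
    counts = Counter(tok for text in texts for tok in tokenize(text))
    total = 0.0
    weight = 0
    for word, n in counts.items():
        h = scores.get(word)
        if h is None or abs(h - neutral) < delta:
            continue
        total += h * n
        weight += n
    return total / weight if weight else None


# Made-up example corpora standing in for Organic and Automated tweets.
organic = ["haha good", "ban the poison"]
automated = ["best win", "win good"]

organic_score = average_happiness(organic, TOY_SCORES)
automated_score = average_happiness(automated, TOY_SCORES)
```

With these toy inputs the marketing-style corpus scores higher than the organic one, which mirrors the direction of the reported difference but says nothing about its size. A per-word breakdown of the same sum is what allows a shift in the average to be attributed to individual words such as ‘ban’ or ‘win’.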
Abstract: Twitter, a popular social media outlet, has become a useful tool for the study of social behavior through user interactions called tweets. The location, time, and message content of tweets provide invaluable social and demographic information for an applied comparison of social behaviors across the world. Our goal is to determine the density and sentiment surrounding tobacco and e-cigarette tweets and link prevalence of word choices to tobacco and e-cigarette use at various localities.