Original Paper
Abstract
Background: Despite the results of the Testosterone Trials, physicians remain uncomfortable treating men with hypogonadism. Discouraged, men increasingly turn to social media to discuss medical concerns.
Objective: The goal of the research was to apply natural language processing (NLP) techniques to social media posts for identification of themes of discussion regarding low testosterone and testosterone replacement therapy (TRT) in order to inform how physicians may better evaluate and counsel patients.
Methods: We retrospectively extracted posts from the Reddit community r/Testosterone from December 2015 through May 2019. We applied an NLP technique called the meaning extraction method with principal component analysis (MEM/PCA) to computationally derive discussion themes. We then performed a prospective analysis of Twitter data (tweets) that contained the terms low testosterone, low T, and testosterone replacement from June through September 2019.
Results: A total of 199,335 Reddit posts and 6659 tweets were analyzed. MEM/PCA revealed dominant themes of discussion: symptoms of hypogonadism, seeing a doctor, results of laboratory tests, derogatory comments and insults, TRT medications, and cardiovascular risk. More than 25% of Reddit posts contained the term doctor, and more than 5% urologist.
Conclusions: This study represents the first NLP evaluation of the social media landscape surrounding hypogonadism and TRT. Although physicians traditionally limit their practices to within their clinic walls, the ubiquity of social media demands that physicians understand what patients discuss online. Physicians may do well to bring up online discussions during clinic consultations for low testosterone to pull back the curtain and dispel myths.
doi:10.2196/21383
Keywords
Introduction
The Testosterone Trials were a coordinated series of placebo-controlled, double-blinded trials intended to elucidate risks and benefits of testosterone replacement therapy (TRT) in hypogonadal men [
- ]. Despite these recent trials, clinicians continue to be uncomfortable treating these men, in part due to unanswered questions related to cardiovascular outcomes and cancer risk, as well as how TRT is portrayed in popular culture. Perhaps discouraged by conflicting information from physicians and traditional media, patients sometimes turn to social media platforms to discuss medical concerns with peers [ , ].Interactive social media channels have emerged as potent resources for individuals to discuss health care concerns [
]. Reddit, an anonymous discussion platform with over 330 million monthly active users, serves as a popular internet destination for discussions of health-related topics [ ]. The Reddit forum or subreddit r/Testosterone [ ], which boasts over 30,000 active members, is devoted to answering questions, sharing personal accounts, and disseminating resources related to TRT and testosterone levels. Similar discussions occur on other social media sites, including Twitter, a microblogging platform with over 126 million daily active users [ ].We hypothesized that the content of online discussions about low testosterone can be classified into themes that may inform how physicians evaluate, counsel, and treat men with hypogonadism. Here, we apply quantitative natural language processing (NLP) techniques to identify dominant themes of discussions regarding low testosterone and TRT on social media.
Methods
Study Design and Sources of Data
An overview of our methodology is presented in
. The study comprised three phases: extraction of data from social media platforms ( A), automated organization of textual data ( B), and quantitative analysis of the textual data to identify dominant themes of the text ( C).First, we retrospectively processed posts and comments from the Reddit community r/Testosterone from December 2015 through May 2019. Reddit data were extracted using BigQuery (Google LLC), an enterprise data analytics platform, from a dataset uploaded for public use [
] ( A). We evaluated both parent posts (the main post in a Reddit discussion) and comment posts (submitted in response to a parent post). We applied a word count criterion of >20 words for parent posts to exclude potential spam, deleted text, and posts composed only of links to other websites. As we anticipated the average word count of comment posts to be less, we used a more relaxed word count criterion of >5 words for comment posts.Next, Twitter data (tweets) were collected prospectively from June through September 2019 using the rtweet application [
], which integrates tweets for processing in RStudio version 1.1.463 (RStudio PBC) ( A). We extracted tweets containing the terms low testosterone, low T, and testosterone replacement. We applied a word count criterion for tweets (>5 words per tweet), given the character count limitation imposed by the Twitter platform. Retweets (reposts of an identical, previously published tweet) were excluded from analysis.Natural Language Processing Using the Meaning Extraction Method
Reddit parent posts, Reddit comment posts, and tweets from Twitter were separately subjected to an NLP technique called the meaning extraction method (MEM) [
] with principal component analysis (PCA). MEM/PCA tracks words that cluster together to derive themes quantitatively [ ]. This approach has been previously validated to reveal information about individuals’ personalities, communication strategies, and behaviors [ , ].To automate the MEM, we used the topic modeling application meaning extraction helper version 2 [
] to deconstruct each post or tweet into its component words. Stop words (eg, articles, prepositions, and transitions) were filtered out. Remaining words were ranked by their frequencies of appearance in each post or tweet ( B). Words were then subjected to PCA with varimax rotation ( C) using SPSS Statistics version 25 (IBM Corporation). PCA identified clusters of words that frequently appeared together. Each word was conferred a factor loading, the correlation coefficient between the word and the cluster to which it belonged. Factor loading thresholds of >0.20 are appropriate when performing PCA of text data to capture a sufficient proportion of the variance in the data [ , ]. We assigned a descriptive theme to each cluster based on the words within it.Subset Analyses on Key Topics of Interest
Given widespread interest and controversy regarding the potential associations of TRT with cardiovascular disease and prostate cancer risk, we sought to quantitate the appearance of these topics on Reddit and Twitter. Subset analysis was performed to determine the frequencies of the words prostate, cancer, PSA (prostate-specific antigen), heart, attack, stroke, cardiovascular, and death. Furthermore, to identify the degree to which individuals allude to seeking consultation with a health care provider, an additional analysis was performed to determine the frequencies of the relevant terms doctor, urologist, endocrinologist, and appointment.
Statistical Validity of Principal Component Analysis
To assess applicability of PCA to each dataset, the Kaiser-Meyer-Olkin (KMO) statistic, a measure of sampling adequacy (values >0.60 are adequate), and the Bartlett test for sphericity, which tests if there are significant correlations among variables of interest, were calculated [
].Ethics
Consistent with previous investigations on social media data, this work was exempted by the institutional review board of the University of California, Los Angeles, as it involves publicly available data and does not involve human subjects.
Results
Total Number of Posts Extracted From Social Media
From the r/Testosterone community on Reddit, we retrospectively extracted 19,083 parent posts and 218,082 comment posts over the 42-month period of study. After exclusions, 12,665 parent posts and 186,670 comment posts remained. From Twitter, we prospectively extracted 7467 tweets over 4 months; 6659 tweets remained after exclusions.
Natural Language Processing of Reddit Data
Using MEM for Reddit parent post and comment post data, we identified 5 factors, or thematic word clusters, that included words with factor loadings greater than 0.30 and 0.20, respectively (
and ).The following themes emerged from NLP of Reddit data: seeing a doctor, results of laboratory tests, administration of TRT, and lifestyle interventions (both parent posts and comment posts); symptoms of hypogonadism (parent posts only); and TRT medications (comment posts only).
contains representative quotations that feature each Reddit theme. Some quotes have been abridged in the interest of space.Cluster and word | Factor loading coefficient | Frequency | |
Results of laboratory tests | |||
LH (luteinizing hormone) | 0.72 | 13.2 | |
FSH (follicle-stimulating hormone) | 0.70 | 11.4 | |
Free | 0.67 | 26.3 | |
Prolactin | 0.61 | 8.8 | |
TSH (thyroid-stimulating hormone) | 0.58 | 7.7 | |
SHBG (sex hormone binding globulin) | 0.57 | 12.3 | |
Total | 0.56 | 24.3 | |
Estradiol | 0.54 | 11.3 | |
Range | 0.46 | 22.1 | |
Result | 0.41 | 26.0 | |
Lifestyle interventions | |||
Weight | 0.55 | 10.6 | |
Fat | 0.55 | 9.3 | |
Eat | 0.53 | 8.4 | |
Diet | 0.50 | 8.7 | |
Gain | 0.49 | 7.3 | |
Lift | 0.48 | 7.0 | |
Muscle | 0.46 | 9.8 | |
Sleep | 0.43 | 11.3 | |
Gym | 0.40 | 6.7 | |
Body | 0.38 | 11.6 | |
Seeing a doctor | |||
Doctor | 0.43 | 25.5 | |
Low | 0.37 | 43.3 | |
Told | 0.37 | 8.4 | |
Level | 0.35 | 34.6 | |
Month | 0.32 | 26.4 | |
Appointment | 0.32 | 5.1 | |
Read | 0.31 | 13.1 | |
Endocrinologist | 0.31 | 5.2 | |
Treatment | 0.31 | 7.5 | |
Urologist | 0.30 | 5.7 | |
Testosterone replacement therapy administration | |||
Week | 0.55 | 40.2 | |
Dose | 0.49 | 13.3 | |
HCG (human chorionic gonadotropin) | 0.46 | 13.5 | |
Protocol | 0.42 | 6.3 | |
Injection | 0.42 | 16.4 | |
Day | 0.42 | 28.4 | |
CYP (cytochrome P450) | 0.42 | 5.9 | |
Start | 0.40 | 28.7 | |
E2 (estradiol) | 0.38 | 12.3 | |
Twice | 0.36 | 5.8 | |
Symptoms of hypogonadism | |||
Fog | 0.76 | 5.0 | |
Brain | 0.75 | 5.7 | |
Depress | 0.42 | 11.7 | |
Symptom | 0.40 | 21.0 | |
Anxiety | 0.38 | 7.9 | |
Libido | 0.36 | 16.0 | |
Erection | 0.35 | 8.4 | |
Sex | 0.34 | 14.7 | |
Energy | 0.32 | 10.7 | |
Drive | 0.31 | 9.1 |
Cluster and word | Factor loading coefficient | Frequency | |
Seeing a doctor | |||
Doctor | 0.39 | 9.0 | |
TRT (testosterone replacement therapy) | 0.38 | 15.3 | |
Treatment | 0.33 | 2.7 | |
Symptom | 0.32 | 5.9 | |
Life | 0.31 | 4.0 | |
People | 0.30 | 5.9 | |
Issue | 0.29 | 5.3 | |
Help | 0.27 | 6.0 | |
Cause | 0.27 | 5.6 | |
Hormone | 0.26 | 3.9 | |
Prescribe | 0.26 | 2.6 | |
Experience | 0.25 | 2.8 | |
Results of laboratory tests | |||
Free | 0.64 | 4.9 | |
Total | 0.58 | 4.9 | |
SHBG (sex hormone binding globulin) | 0.54 | 4.6 | |
Range | 0.44 | 5.9 | |
Test | 0.42 | 17.9 | |
Low | 0.38 | 15.4 | |
LH (luteinizing hormone) | 0.34 | 2.8 | |
Lab | 0.33 | 4.5 | |
E2 (estradiol) | 0.31 | 8.0 | |
Normal | 0.28 | 5.6 | |
Result | 0.26 | 4.3 | |
Testosterone replacement therapy administration | |||
Week | 0.60 | 13.0 | |
Day | 0.45 | 9.7 | |
Injection | 0.43 | 5.8 | |
Dose | 0.42 | 7.6 | |
Inject | 0.35 | 3.9 | |
Start | 0.33 | 9.1 | |
Protocol | 0.32 | 2.8 | |
Feel | 0.31 | 11.6 | |
Month | 0.31 | 7.2 | |
Time | 0.30 | 9.4 | |
Lifestyle interventions | |||
Fat | 0.57 | 3.0 | |
Eat | 0.55 | 3.1 | |
Diet | 0.53 | 3.4 | |
Weight | 0.50 | 3.2 | |
Muscle | 0.41 | 2.8 | |
Body | 0.34 | 4.7 | |
Sleep | 0.33 | 3.5 | |
Testosterone replacement therapy medications | |||
Increase | 0.40 | 4.2 | |
Estrogen | 0.37 | 3.3 | |
Effect | 0.36 | 4.2 | |
Lower | 0.32 | 4.7 | |
Testosterone | 0.30 | 11.9 | |
Clomid | 0.26 | 3.7 | |
HCG (human chorionic gonadotropin) | 0.26 | 5.9 |
Data source and theme | Representative quotation | |
Reddit parent posts | ||
Results of laboratory tests | ||
Here’s what came up: FSHa 2.1 (1.5-12.4) LHb 5.6 (1.7-8.6) Prolactin 15.25 (4.04-15.2)* T, totalc 311.1 (249-836)* SHBGd 33.3 (16.5-55.9) Free testosterone index 32.43 (35.0-92.6)* | ||
shbg and dhea still pending. I had to get these results because i have an appointment with neurosurgeon soon and he will need the labs and mrie. | ||
Lifestyle interventions | ||
Have been eating super clean. Working with a dietitian/personal trainer. Was dieting mostly high protein / low fat / low carb | ||
I work out all the time lifting heavy weights, 3 or 4 times a week on average. I eat a good diet, take my zinc, vitamin D, and get in my fats and essential fats. | ||
Seeing a doctor | ||
I know several people on trtf, but they all have the same doc...you walk in, tell him you want to get bigger, stronger, and faster, pay out of pocket for his blood test then buy your meds from his attached pharmacy. That’s not what I want. I want to find out what’s wrong without a preconceived bias. | ||
So I go to the appointment. And the specialist I saw (a urologist) said he wasn’t the guy to see about this issue, and ended up referring me to another specialist. | ||
Testosterone replacement therapy administration | ||
T cypg 200 mg/ml - 0.32 mL IM/SQ twice weekly (~130 mg/week) HCGh 500 IU SQ twice weekly to prevent testicular atrophy No AIi - low E2j, monitor DHEAk 25 mg every night | ||
So I don’t know what to do? Take my AI and hope that my E2 is high? Or keep not taking my AI and hope things will get better? I literally can’t hold out a week to get another blood test and also I can’t afford it right now. | ||
Symptoms of hypogonadism | ||
All the normal symptoms: brain fog, mood swings, low libido, erectile dysfunction, inability to add muscle at the gym despite working out 3x a week. | ||
Symptoms: brain fog, very low energy level, lifelessness-zombie feeling most days, very lethargic, mood swings, easy to get angry, grumpy and annoyed at earliest, no libido/sex drive, EDl—less frequency, less powerful, minimal to no erections during sex, softer (haven’t had sex in years) | ||
Reddit comment posts | ||
Seeing a doctor | ||
Many doctors—especially PCPsm—are not fluent in the endocrine system. They aren’t supposed to be. Going to your primary care physician for hormone questions is a mistake. If you knew you had heart issues, wouldn’t you go to a cardiologist? | ||
My PCP looked super confused and clueless as to what he was supposed to do for me. Doc made me do two more labs fasting to confirm then he referred me out to an endocrinologist. The endo made me do three more fasting labs and a testicular ultrasound to confirm. | ||
Results of laboratory tests | ||
Honestly I don’t think testosterone is your problem based on Sept 10th 2015 blood results. You have decent midrange total, and free testosterone. SHBG bounces around, so maybe it’s a testing error. | ||
198 is low as hell for your dad, and even 450 for him would be low. Yours is lowish, but you have definite symptoms. | ||
Testosterone replacement therapy administration | ||
75 mg E5Dn (105 mg per week). Doesn’t require an AI, doesn’t give me side effects. I am at ~700 on trough days and feel pretty damn good. | ||
I had just moved to a standard TRT dose of Test Cyp, 100 mg/week. At 5’11”, 172 lbs, and 17% body fat, taking 1 mg of Arimidex every day tanked my E2. Dropping down to 0.25 mg Arimidex once a week had the same effect. | ||
Lifestyle interventions | ||
Eat good food, lift heavy, and get sleep. Repeat for two years. | ||
TRT will not turn you into a bodybuilder. It may tone you a little bit (if everything is in check). But just saying “I eat good” literally means nothing. What are your macros? What’s your diet? Etc? | ||
Testosterone replacement therapy medications | ||
HCG is a water-based peptide hormone that can be injected to replace the lost LH hormone that TRT shuts down. Without hCG, the LH receptors in the testes are no longer getting activated. The results: the testes shrink. | ||
Clomiphene. What a double-edged sword. First, Clomid will certainly have an effect on your testosterone levels. Usually, it is doses substantially higher than 12.5 mgs daily. | ||
Tweets | ||
Symptoms of hypogonadism | ||
Keeping your hormone levels up is a crucial part of #health. Low T can lead to all types of adverse effects: - weight gain/belly fat - #LowEnergy - low sex drive | ||
This, in turn, causes a lower sex drive, depression, reduced muscle mass, and low levels of energy. Erectile dysfunction is another symptom. | ||
Cardiovascular risk | ||
#Testosterone Replacement Therapy Lowers Heart Attack Risk | ||
Aging men with low testosterone levels who take testosterone replacement therapy (TRT) are at a slightly greater risk of experiencing an ischemic stroke | ||
Symptom improvement | ||
Starting testosterone replacement therapy and thyroid medication at the same time is quite the 1-2 punch to the system. Endless energy, great sleep, and able to lift weights heavier and longer. | ||
“My energy is back”: how testosterone replacement therapy is changing men’s lives | ||
Derogatory comments and insults | ||
That little cuck should be the poster boy for low T supplements | ||
I was called effeminate and a low testosterone beta here for defending women’s rights. |
aFSH: follicle-stimulating hormone.
bLH: luteinizing hormone.
cT, total: total testosterone.
dSHBG: sex hormone binding globulin.
eMRI: magnetic resonance imaging.
fTRT: testosterone replacement therapy.
gCYP: cytochrome P450.
hHCG: human chorionic gonadotropin.
iAI: aromatase inhibitor.
jE2: estradiol.
kDHEA: dehydroepiandrosterone.
lED: erectile dysfunction.
mPCP: primary care physician.
nE5D: every 5 days.
The highest frequency word occurrences among parent posts as determined by PCA were low (5484/12,665 [43.30%] of posts), week (5092/12,665, 40.20%), level (4382/12,665, 34.60%), and start (3635/12,665, 28.70%). Among comment posts, the highest frequency word occurrences were test (33,414/186,670, 17.90%), low (28,747/186,670, 15.40%), TRT (28,561/186,670, 15.30%), and week (24,267/186,670, 13.00%).
Parent post and comment post PCA accounted for 15.45% (1957/12,665) and 13.84% (25,835/186,670) of the total variance, respectively. KMO statistic was 0.91 for Reddit parent post data and 0.80 for Reddit comment post data, with Bartlett test <0.01, indicating that the Reddit data were appropriate for factor analysis using PCA.
Natural Language Processing of Twitter Data
Similarly, MEM for Twitter data identified 4 factors, or thematic word clusters, with factor loadings greater than 0.25 (
). The following themes emerged from NLP of tweets: symptoms of hypogonadism, cardiovascular risk, symptom improvement, and derogatory comments and insults.The highest frequency word occurrences among tweets as determined by PCA were level (693/6659, 10.40%), male (426/6659, 6.40%), sex (213/6659, 3.20%), and increase (200/6659, 3.00%). Twitter PCA accounted for 9.01% (600/6659) of the total variance. KMO statistic was 0.61 for Twitter data, with Bartlett test <0.01, indicating that the Twitter data were appropriate for factor analysis using PCA. Of note, other studies using MEM/PCA have reported similar percentages of variance as those determined in our analysis of Reddit and Twitter data [
, ].Cluster and word | Factor loading coefficient | Frequency | |||
Symptoms of hypogonadism | |||||
Muscle | 0.54 | 1.9 | |||
Mass | 0.48 | 1.2 | |||
Sex | 0.41 | 3.2 | |||
Libido | 0.39 | 1.0 | |||
Level | 0.36 | 10.4 | |||
Drive | 0.35 | 1.8 | |||
Fat | 0.32 | 1.3 | |||
Hormone | 0.31 | 2.7 | |||
Body | 0.28 | 2.3 | |||
Weight | 0.27 | 1.1 | |||
Cardiovascular risk | |||||
Heart | 0.76 | 1.1 | |||
Attack | 0.76 | 1.1 | |||
Risk | 0.73 | 2.1 | |||
Increase | 0.62 | 3.0 | |||
Symptom improvement | |||||
Change | 0.69 | 2.2 | |||
Energy | 0.69 | 2.9 | |||
Live | 0.69 | 2.0 | |||
Life | 0.21 | 2.1 | |||
Derogatory comments and insults | |||||
Boy | 0.58 | 2.3 | |||
Soy | 0.56 | 2.4 | |||
Beta | 0.42 | 2.8 | |||
Cuck | 0.29 | 1.0 | |||
Male | 0.23 | 6.4 | |||
Girl | 0.22 | 1.0 | |||
Little | 0.22 | 1.3 |
Word Occurrences on Key Topics of Interest
Subset analysis was performed to determine word occurrence frequencies in three key topics of interest that relate to TRT: prostate cancer risk, cardiovascular disease risk, and seeking consultation with a health care professional. These data are presented in
.In brief, over 1% of Reddit parent posts contain the terms prostate (143/12,665, 1.13%), cancer (143/12,665, 1.13%), PSA (210/12,665, 1.66%), or heart (175/12,665, 1.38%). Over a quarter of Reddit parent posts contain the term doctor (3235/12,665, 25.54%), while over 5% of parent posts refer to either a urologist (732/12,665, 5.78%) or endocrinologist (657/12,665, 5.19%). Frequencies of these terms were higher among Reddit posts than among tweets from Twitter.
Concern associated with testosterone replacement therapy | Word frequency (%) | |||
Reddit parent post | Reddit comment post | Tweet | ||
Prostate cancer risk | ||||
Prostate | 1.13 | 0.36 | 0.63 | |
Cancer | 1.13 | 0.56 | 0.87 | |
PSAa | 1.66 | 0.25 | 0.06 | |
Cardiovascular disease risk | ||||
Heart | 1.38 | 0.61 | 1.14 | |
Attack | 0.89 | 0.32 | 1.14 | |
Stroke | 0.23 | 0.13 | 0.86 | |
Cardiovascular | 0.13 | 0.13 | 0.44 | |
Death | 0.24 | 0.19 | 0.21 | |
Seeking health care consultation | ||||
Doctor | 25.54 | 9.00 | 1.77 | |
Urologist | 5.78 | 1.36 | 0.20 | |
Endocrinologist | 5.19 | 0.97 | 0.17 | |
Appointment | 5.13 | 0.74 | 0.39 |
aPSA: prostate-specific antigen.
Discussion
Principal Findings
NLP techniques applied to unfiltered discussions on Reddit and Twitter offer a useful framework for understanding patient priorities outside the doctor’s office. We found that men largely turn to social media to learn about symptoms of low testosterone, interpretation of personal lab results, practicalities of TRT, and body changes with treatment. Notably, cardiovascular risk was a major discussion theme, echoing concerns among prescribers, who may be deterred by continued ambiguity despite the publication of the Testosterone Trials. Although NLP analysis did not reveal prostate cancer as a notable theme, a number of posts included text related to this topic, suggesting that this may represent an important discussion point for a subset of online discussions related to TRT.
Our results underscore that patients are searching for medical guidance related to hypogonadism on social media, an environment where anecdotes predominate and advertising often masquerades as medical advice [
]. TRT prescriptions have risen almost 4-fold over the last two decades, which can be attributed, in part, to off-label indications and direct-to-consumer advertising [ ]. Even beyond standard TRT, testosterone-boosting supplements with minimal data to support their efficacy are aggressively marketed and readily available online [ ]. But still, social media represents an enormous opportunity for the medical community to improve how we engage with our patients and to do so in a meaningful and impactful way. Potential interventions that may inoculate against coercive direct-to-consumer marketing practices include disseminating high-quality, open-access information related to hypogonadism. For example, Halpern et al [ ] recently published a JAMA Patient Page article on hypogonadism. This single-page handout written in easily accessible language includes an infographic highlighting symptoms of hypogonadism and potential adverse effects of TRT, in addition to information related to etiology of hypogonadism and a discussion of potential cardiovascular and prostate cancer risks associated with TRT—all topics that emerged as major themes of discussion from our data.Social media platforms, including Reddit and Twitter, create a space for patients not only to obtain answers to questions that they are either uncomfortable or unwilling to ask in a face-to-face clinical setting but also to connect with others going through similar experiences. However, not all health-related discussions online are productive. Twitter featured the theme of derogatory comments and insults, highlighting an undertone of stigma, which may compound existing barriers preventing men from accessing care [
]. In contrast, the seeing a doctor theme only emerged on Reddit, with more than 25% of parent posts mentioning the word doctor, compared with less than 2% on Twitter. This may reflect inherent differences among the two social media platforms, as Twitter is constrained by a strict character count limitation and is overall less anonymous, with discussants frequently using their true identities in their display usernames and account photos.Although clinician engagement with the online hypogonadism community will become increasingly important in the coming years, improving the in-office clinical experience of our patients cannot be overemphasized. Our data reveal that many of the online discussions featured personal questions related to interpretation of lab results. This is consistent with a previous study exploring Reddit discussions of male factor infertility, where nearly 20% of all posts featured a question related to personal semen analysis results [
]. Such discussions related to lab results cannot be addressed by disseminating a primer on hypogonadism and TRT, but instead demand the expertise of a clinician trained in managing male endocrinology and the related sexual, reproductive, and psychological comorbidities. Creating an in-office experience where men feel comfortable and safe to ask their questions and voice their concerns should be a priority for any outpatient clinical setting, but especially one that caters to men with suspected hypogonadism. Both outpatient primary care settings and urological outpatient clinics can learn from the success of the emerging multidisciplinary men’s health clinic [ ].Here we offer valuable insight into primarily patient concerns in a forum that allows for honest and unfiltered patient feedback as it relates to these discussants’ experiences with hypogonadism. Clinically, these data highlight that patients worry most about comorbidities, lifestyle factors impacted by low testosterone, and treatment options. While other aspects of hypogonadism can be discussed, these data highlight the most salient hypogonadism-related concerns for our patients. Additionally, this study can further improve on patients’ in-office experiences by informing how physicians can lead discussions to highlight aspects of low testosterone that patients may feel are not being adequately addressed.
Limitations
Our study is not without limitations. Although NLP techniques allowed us to analyze a large volume of discrete social media posts, generalizability of MEM is limited by the absence of contextual valence (positivity or negativity). However, this does not impair overall thematic identification. Additionally, discussants who turn to social media for health care information may be different with respect to demographics, health care priorities, and information preferences compared with those who do not; our results should therefore be interpreted within this context [
]. It should also be noted that some individuals use social media as a platform to vent about their experiences with health care professionals as they relate to hypogonadism care. This is an important distinction to make because it may not necessarily represent a lack of communication between patients and their physician but rather a discussant’s opportunity to share. Future studies may consider investigating to other Reddit communities, expanding Twitter search terms, or exploring other social media platforms.Conclusions
This study represents the first evaluation of the social media landscape surrounding hypogonadism and TRT using NLP techniques. Our analysis of more than 200,000 discrete social media posts revealed dominant themes of discussion, which may inform how physicians evaluate and counsel men with hypogonadism. Understanding the complex internet landscape of hypogonadism discussions represents the first step in creating well-informed and clinically meaningful change. Although physicians traditionally limit their practices to within their clinic walls, the ubiquity of social media demands that physicians engage patients where they are, including online. Practicing physicians may do well to bring up online discussions during clinic consultations, to pull back the curtain and dispel myths.
Acknowledgments
SVE is supported by a Research Scholar Award from the Urology Care Foundation and American Urological Association. These organizations played no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Authors' Contributions
VO was responsible for concept and design; acquisition, analysis, or interpretation of data; drafting the manuscript; critical revision of the manuscript for important intellectual content; statistical analysis; and approval of the manuscript. TJ was responsible for concept and design; acquisition, analysis, or interpretation of data; critical revision of the manuscript for important intellectual content; and approval of the manuscript. JNM was responsible for concept and design; acquisition, analysis, or interpretation of data; critical revision of the manuscript for important intellectual content; administrative, technical, or material support; and approval of the manuscript. SVE was responsible for concept and design; acquisition, analysis, or interpretation of data; drafting the manuscript; critical revision of the manuscript for important intellectual content; administrative, technical, or material support; supervision; and approval of the manuscript.
Conflicts of Interest
SVE serves as a consultant for Metuchen Pharmaceuticals. JNM serves as a consultant for Antares Pharma, Boston Scientific Corporation, and Endo Pharmaceuticals. The remaining authors report no conflicts of interest.
References
- Snyder PJ, Bhasin S, Cunningham GR, Matsumoto AM, Stephens-Shields AJ, Cauley JA, et al. Lessons from the Testosterone Trials. Endocr Rev 2018 Jun 01;39(3):369-386 [FREE Full text] [CrossRef] [Medline]
- Bhasin S, Ellenberg SS, Storer TW, Basaria S, Pahor M, Stephens-Shields AJ, et al. Effect of testosterone replacement on measures of mobility in older men with mobility limitation and low testosterone concentrations: secondary analyses of the Testosterone Trials. Lancet Diabetes Endocrinol 2018 Nov;6(11):879-890 [FREE Full text] [CrossRef] [Medline]
- Mohler ER, Ellenberg SS, Lewis CE, Wenger NK, Budoff MJ, Lewis MR, et al. The effect of testosterone on cardiovascular biomarkers in the Testosterone Trials. J Clin Endocrinol Metab 2018 Feb 01;103(2):681-688 [FREE Full text] [CrossRef] [Medline]
- Resnick SM, Matsumoto AM, Stephens-Shields AJ, Ellenberg SS, Gill TM, Shumaker SA, et al. Testosterone treatment and cognitive function in older men with low testosterone and age-associated memory impairment. JAMA 2017 Feb 21;317(7):717-727 [FREE Full text] [CrossRef] [Medline]
- Budoff MJ, Ellenberg SS, Lewis CE, Mohler ER, Wenger NK, Bhasin S, et al. Testosterone treatment and coronary artery plaque volume in older men with low testosterone. JAMA 2017 Feb 21;317(7):708-716 [FREE Full text] [CrossRef] [Medline]
- Cunningham GR, Stephens-Shields AJ, Rosen RC, Wang C, Bhasin S, Matsumoto AM, et al. Testosterone treatment and sexual function in older men with low testosterone levels. J Clin Endocrinol Metab 2016 Aug;101(8):3096-3104 [FREE Full text] [CrossRef] [Medline]
- Snyder PJ, Bhasin S, Cunningham GR, Matsumoto AM, Stephens-Shields AJ, Cauley JA, Testosterone Trials Investigators. Effects of testosterone treatment in older men. N Engl J Med 2016 Feb 18;374(7):611-624 [FREE Full text] [CrossRef] [Medline]
- Nobles AL, Leas EC, Althouse BM, Dredze M, Longhurst CA, Smith DM, et al. Requests for diagnoses of sexually transmitted diseases on a social media platform. JAMA 2019 Nov 05;322(17):1712-1713. [CrossRef] [Medline]
- fox S, Duggan M. Health Online 2013. Washington: Pew Internet and American Life Project; 2013. URL: https://www.pewresearch.org/internet/wp-content/uploads/sites/9/media/Files/Reports/PIP_HealthOnline.pdf [accessed 2020-09-22]
- Cassis C, Bassett J. Reddit's year in review. 2018. URL: https://redditblog.com/2018/12/04/reddit-year-in-review-2018/ [accessed 2019-10-06]
- subreddit r/Testosterone. URL: https://www.reddit.com/r/Testosterone/ [accessed 2020-09-22]
- Shaban H. Twitter reveals its daily active user numbers for the first time. Washinton Post. 2019 Feb 07. URL: https://www.washingtonpost.com/technology/2019/02/07/twitter-reveals-its-daily-active-user-numbers-first-time/ [accessed 2019-10-06]
- subreddit directory. URL: https://files.pushshift.io/reddit/ [accessed 2020-09-22]
- Kearney M. rtweet: collecting Twitter data (R package version 0.6.7). 2018. URL: https://cran.r-project.org/package=rtweet [accessed 2020-09-22]
- Chung C, Pennebaker J. Revealing dimensions of thinking in open-ended self-descriptions: an automated meaning extraction method for natural language. J Res Pers 2008 Feb;42(1):96-132 [FREE Full text] [CrossRef] [Medline]
- Boyd RL, Pennebaker JW. Did Shakespeare write double falsehood? Identifying individuals by creating psychological signatures with text analysis. Psychol Sci 2015 May;26(5):570-582. [CrossRef] [Medline]
- Barrett A, Murphy M, Blackburn K. “Playing hooky” health messages: apprehension, impression management, and deception. Health Commun 2018 Mar;33(3):326-337. [CrossRef] [Medline]
- Boyd R. MEH: Meaning Extraction Helper (Version 2.1.06). 2018. URL: https://www.ryanboyd.io/software/meh/ [accessed 2020-09-22]
- Blackburn KG, Yilmaz G, Boyd RL. Food for thought: exploring how people think and talk about food online. Appetite 2018 Apr 01;123:390-401. [CrossRef] [Medline]
- Jiang T, Osadchiy V, Mills JN, Eleswarapu SV. Is it all in my head? Self-reported psychogenic erectile dysfunction and depression are common among young men seeking advice on social media. Urology 2020 May 11:1. [CrossRef] [Medline]
- Pett M, Lackey N, Sullivan J. Making Sense of Factor Analysis. Thousand Oaks: Sage Publications; 2003.
- Wolf M, Chung CK, Kordy H. Inpatient treatment to online aftercare: e-mailing themes as a function of therapeutic outcomes. Psychother Res 2010 Jan;20(1):71-85. [CrossRef] [Medline]
- Stanton AM, Boyd RL, Pulverman CS, Meston CM. Determining women's sexual self-schemas through advanced computerized text analysis. Child Abuse Negl 2015 Aug;46:78-88 [FREE Full text] [CrossRef] [Medline]
- Kravitz RL. Direct-to-consumer advertising of androgen replacement therapy. JAMA 2017 Mar 21;317(11):1124-1125. [CrossRef] [Medline]
- Bandari J, Ayyash OM, Emery SL, Wessel CB, Davies BJ. Marketing and testosterone treatment in the USA: a systematic review. Eur Urol Focus 2017 Oct;3(4-5):395-402. [CrossRef] [Medline]
- Balasubramanian A, Thirumavalavan N, Srivatsav A, Yu J, Lipshultz LI, Pastuszak AW. Testosterone imposters: an analysis of popular online testosterone boosting supplements. J Sex Med 2019 Feb;16(2):203-212 [FREE Full text] [CrossRef] [Medline]
- Halpern JA, Brannigan RE. Testosterone deficiency. JAMA 2019 Sep 17;322(11):1116. [CrossRef] [Medline]
- Gott M, Hinchliff S. Barriers to seeking treatment for sexual problems in primary care: a qualitative study with older people. Fam Pract 2003 Dec;20(6):690-695. [CrossRef] [Medline]
- Osadchiy V, Mills JN, Eleswarapu SV. Understanding patient anxieties in the social media era: qualitative analysis and natural language processing of an online male infertility community. J Med Internet Res 2020 Mar 10;22(3):e16728 [FREE Full text] [CrossRef] [Medline]
- Houman JJ, Eleswarapu SV, Mills JN. Current and future trends in men's health clinics. Transl Androl Urol 2020 Mar;9(Suppl 2):S116-S122 [FREE Full text] [CrossRef] [Medline]
- Koch-Weser S, Bradshaw YS, Gualtieri L, Gallagher SS. The Internet as a health information source: findings from the 2007 Health Information National Trends Survey and implications for health communication. J Health Commun 2010;15 Suppl 3:279-293. [CrossRef] [Medline]
Abbreviations
KMO: Kaiser-Meyer-Olkin statistic |
MEM: meaning extraction method |
NLP: natural language processing |
PCA: principal component analysis |
PSA: prostate-specific antigen |
TRT: testosterone replacement therapy |
Edited by G Eysenbach; submitted 12.06.20; peer-reviewed by A Balasubramanian, X Ding; comments to author 04.07.20; revised version received 27.07.20; accepted 03.08.20; published 07.10.20
Copyright©Vadim Osadchiy, Tommy Jiang, Jesse Nelson Mills, Sriram Venkata Eleswarapu. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 07.10.2020.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on http://www.jmir.org/, as well as this copyright and license information must be included.