What cancer research makes the news? A quantitative analysis of online news stories that mention cancer studies

Journalists’ health and science reporting aids the public’s direct access to research through the inclusion of hyperlinks leading to original studies in peer-reviewed journals. While this effort supports the US-government mandate that research be made widely available, little is known about what research journalists share with the public. This cross-sectional exploratory study characterises US-government-funded research on cancer that appeared most frequently in news coverage and how that coverage varied by cancer type, disease incidence and mortality rates. The subject of analysis was 11,436 research articles (published in 2016) on cancer funded by the US government and 642 news stories mentioning at least one of these articles. Based on Altmetric data, researchers identified articles via PubMed and characterised each based on the news media attention it received online. Only 1.88% (n = 213) of research articles reporting US-government-funded cancer research received at least one mention in an online news publication. This is in contrast to previous research that found 16.8% (n = 1925) of articles received mentions in online mass media publications. Of the 13 most common cancers in the US, 12 were the subject of at least one news mention; only urinary and bladder cancer received no mention. Traditional news sources included significantly more mentions of research on common cancers than digital native news sources. However, a general discrepancy exists between cancers prominent in news sources and those with the highest mortality rate. For instance, lung cancer accounted for the most deaths annually, while melanoma led to 56% fewer annual deaths; however, journalists cited research regarding these cancers nearly equally. Additionally, breast cancer received the greatest coverage per estimated annual death, while pancreatic cancer received the least coverage per death.
Findings demonstrated a continued misalignment between prevalent cancers and cancers mentioned in online news media. Additionally, cancer control and prevention received less coverage from journalists than other cancer continuum stages, highlighting a continued underrepresentation of prevention-focused research. Results revealed a need for further scholarship regarding the role of journalists in research dissemination.

This work does not constitute dual publication and should be included in the current manuscript (9(2), e025783).
3.1. Please provide an amended Funding Statement declaring this commercial affiliation, as well as a statement regarding the Role of Funders in your study. If the funding organization did not play a role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript and only provided financial support in the form of authors' salaries and/or research materials, please review your statements relating to the author contributions, and ensure you have specifically and accurately indicated the role(s) that these authors had in your study. You can update author roles in the Author Contributions section of the online submission form.
Please also include the following statement within your amended Funding Statement.
"The funder provided support in the form of salaries for authors [insert relevant initials], but did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. The specific roles of these authors are articulated in the 'author contributions' section." If your commercial affiliation did play a role in your study, please state and explain this role within your updated Funding Statement.
3.2. Please also provide an updated Competing Interests Statement declaring this commercial affiliation along with any other relevant declarations relating to employment, consultancy, patents, products in development, or marketed products, etc.
Within your Competing Interests Statement, please confirm that this commercial affiliation does not alter your adherence to all PLOS ONE policies on sharing data and materials by including the following statement: "This does not alter our adherence to PLOS ONE policies on sharing data and materials." (as detailed online in our guide for authors http://journals.plos.org/plosone/s/competing-interests). If this adherence statement is not accurate and there are restrictions on sharing of data and/or materials, please state these. Please note that we cannot proceed with consideration of your article until this information has been declared.

We have no funding and/or competing interests to declare. Apologies for any confusion on this front.
Please include both an updated Funding Statement and Competing Interests Statement in your cover letter. We will change the online submission form on your behalf.

Reviewers' Responses to Questions
1. Is the manuscript technically sound, and do the data support the conclusions?
The manuscript must describe a technically sound piece of scientific research with data that support the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

3. Have the authors made all data underlying the findings in their manuscript fully available?
The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data (e.g., participant privacy or use of data from a third party), those must be specified.

Reviewer #1
This paper can be considered a subsection of a previously published paper that presented an overview of Altmetric mentions of US-funded papers on cancer research. In this case, the authors focus on news stories. The paper is pretty straightforward and has no technical complications. While the motivation of the paper is of interest, the authors do not draw much from the science communication literature on the interests and motivations of scientists communicating with journalists. There is a vast stream of literature on this which would greatly enrich both the introduction and the discussion. The analysis the authors make is quite superficial, without delving into motivations or external factors that may affect being mentioned in news media (e.g., journal venue, press releases, authors' institutional status, authors' influence), and I would say this is more of an exploratory paper than anything else.
Thank you for the suggestion. We have incorporated additional information from scientific communication literature regarding scientists communicating with journalists.
We agree that this paper is primarily exploratory, as there is a gap in existing literature about Altmetric mentions specific to US-funded papers on cancer research by journalists. However, with this revision, we have worked to provide greater description of the methods and analyses in order to provide a solid foundation for future research building off of these exploratory findings. (pp. 2-3, 14)

Beyond that, there are two specific sentences the authors make that I do not agree with and that should be modified if the paper is accepted for publication.
-They indicate that their novelty lies partly in journalists' interest in funded cancer research and in using Altmetric.com as a 'better source' than others because it includes online news media.
We have addressed the issues with these two specific sentences in our revision. We have revised and expanded the section on Altmetric.com, including how Altmetric uses both hyperlinks and text phrases to identify mentions of scientific papers. (pp. 4-5)

While this may be partly true, the authors ignore two important limitations of this source in the text: 1) the list of news media is quite arbitrary, and the link the authors provided no longer points to Altmetric.com's list of news media; this should be updated. 2) News media mentions are identified by hyperlinks to papers, which does not always happen when research is reported in news media. This may especially affect traditional media, which have a lower online presence and may not include hyperlinks to scientific papers, hence the differences in the results.
-On page 4, paragraph 3, the authors state the following: 'Increasingly, researchers utilize Altmetric's database of more than 2500 global media sources'. There is no evidence of this whatsoever and no references are given.
This issue has been addressed through the revision of this paragraph. Additional citations have been added, along with six lines regarding the limitations of using Altmetric's database.

Reviewer #2
This is a mostly descriptive paper about coverage of cancer topics in the media; the topic of this paper is important and timely. The introduction and discussion are interesting. However, the methods and results were underdeveloped. This might be addressed by clarifying some of the definitions and how variables were measured/coded. Perhaps providing a few examples of what was coded as a mention, or adding a list of mentions for a couple of the top mentioned papers, would help. I would recommend removing the chi-squared analyses and, instead, creating some visuals that demonstrate the relative differences for incidence/death/mentions by cancer type and media type. Attaching one example of how this might look. I think a set of graphs would be a lot more powerful than the lists of numbers currently included.
Thank you for the suggestion to help clarify our definitions and coding methods for the reader as well as illustrate our analyses more clearly. To address this comment, we have first expanded the description of our analytic variables in the methods section, including more description of Altmetric's definition of a media mention, as well as a definition and reference for common types of cancers.
We have also added Figure 1 to provide a visualization of relative differences in the ratio of estimated deaths per news mention across all common types of cancer. We provide a brief description to accompany this figure on pages 6-7 as well, which now reads: "We also examined how mentions differed across estimated deaths for each common cancer type (Figure 1). These ratios ranged from 0 to 8618. With the exception of urinary and bladder cancers, which did not receive any mentions, the lowest death-to-mention ratio was observed for breast cancer (ratio = 760.56, indicating greater coverage per estimated death), and the highest ratio was observed for pancreatic cancer (ratio = 8618, indicating the least coverage per estimated death)." (pp. 6-7, 4-12)

Some information about the size and characteristics of the audience of the different outlets could add to the understanding of the reach of the different types of cancer information. Is a mention in the NYT equivalent in audience reach to a mention in the St. Louis Post-Dispatch, for example? These numbers may be tricky to get, but there are likely estimates of audience or market share.
Thank you for this suggestion. We have added a column to Table 3 (p. 9) that incorporates the estimated monthly unique visitors for each publication, as an indicator of audience size and characteristics. Additionally, the Discussion (pp. 12-15) now includes reference to the differences between local and national news, as well as the changing digital media landscape.

In addition, given the current political climate, the description of many of these outlets as fake news, and growing public distrust of science, it seems that the inclusion of science in a broad spectrum of media outlets is extremely important. Some discussion of these topics could be useful.
We appreciate this suggestion and have now incorporated this into both the Introduction (p. 4) and the Discussion (pp. 12-15).
Finally, there was some discussion of how journalists find science to report on and, from the lists shown in the paper, it seems that journal impact factor/visibility is probably a big part of it. Academics and academic institutions have been more visible and active on social media in recent years, which could influence the reporting of science if academics/academic institutions share science this way and "tag" journalists or journalistic outlets.

Thank you for this point. The article at ( https://www.sciencedirect.com/science/article/pii/S2211419X2030029X ) includes the impact factor for each journal, and this is considered in the Discussion (p. 14).
Other possible edits: -First two sentences of last paragraph in the abstract are confusing, reword to clarify.
We have rewritten these two sentences to address the reviewer's concern. (p. 1)

-The files included are the data and data collection files, but the data management and analysis files are not available at the currently provided link. Including the data is great, but the paper is not reproducible without the statistical code as well. Use of Microsoft Excel can be problematic for reproducibility (see https://www.washingtonpost.com/news/wonk/wp/2016/08/26/an-alarming-number-of-scientific-papers-contain-excel-errors/ and Ziemann M, Eren Y, El-Osta A. Gene name errors are widespread in the scientific literature. Genome biology. 2016 Dec;17(1):1-3).
We have addressed this: https://zenodo.org/record/4075712

-Chi-squared can only find associations, not the direction of association, so this sentence needs to be re-worded or an analysis of the standardized residuals should be included to support the finding: "Traditional news sources included significantly more mentions of research on common cancer types (n = 240) compared to news mentions across digital native news sources (n = 204; X² = 5.690, df = 1, p = .017)." One suggestion for rewording would be, "There was a significant association between news source type and mention of common cancer type research (n = 204; X² = 5.690, df = 1, p = .017), with traditional news sources including more mentions of research on common cancer types (n = 240) compared to news mentions across digital native news sources." It is a subtle distinction, but important given how chi-squared is computed. Following up with standardized residuals to determine which of the frequencies in the chi-squared were much different from expected would strengthen the results section and perhaps provide the authors and readers with additional insights.

Thank you to the reviewer for noting this important distinction regarding chi-squared tests and for providing a suggested revision that offers a clearer interpretation of the results, given the limitations of this statistical test. We have adopted the revised wording suggested by the reviewer, and the lines on page 7 now read: "There was a significant association between news source type and mention of common cancer type research (n = 204; X² = 5.690, df = 1, p = .017), with traditional news sources including more mentions of research on common cancer types (n = 240) compared to news mentions across digital native news sources." (p. 7)

-Including the IQR in addition to the range would be helpful in understanding the data. Or, as suggested above, including the statistical code so that interested readers could examine the distribution of the mentions per online news source.
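To illustrate the reviewer's suggestion, standardized (Pearson) residuals can be computed alongside the chi-squared statistic. The sketch below uses a hypothetical 2x2 table: the common-cancer counts (240 and 204) come from the quoted sentence, but the "other" column is invented, since those counts do not appear in this letter, so the resulting statistics will not match the paper's X² = 5.690.

```python
import math

# Hypothetical 2x2 contingency table of news mentions.
# Rows: source type (traditional, digital-native).
# Columns: research topic (common cancer, other).
# The second column is invented for illustration only.
observed = [[240, 60],
            [204, 96]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected counts under independence: (row total * column total) / n.
expected = [[rt * ct / n for ct in col_totals] for rt in row_totals]

# Chi-squared statistic: sum of (O - E)^2 / E over all cells.
chi2 = sum((o - e) ** 2 / e
           for obs_row, exp_row in zip(observed, expected)
           for o, e in zip(obs_row, exp_row))

# Standardized (Pearson) residuals: (O - E) / sqrt(E).
# A residual well above ~2 in magnitude flags a cell that departs
# notably from the independence model; its sign gives the direction.
residuals = [[(o - e) / math.sqrt(e)
              for o, e in zip(obs_row, exp_row)]
             for obs_row, exp_row in zip(observed, expected)]
```

With these invented counts, the residual for the traditional/common cell is positive and the digital-native/common cell negative, which is exactly the directional reading the reviewer notes the chi-squared statistic alone cannot provide.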
Following the suggestion of the reviewers, we have included a copy of our statistical code and dataset for interested readers to examine.
-The standard deviation being higher than the mean, along with the range being so wide for number of mentions, suggests that this distribution is skewed and the median should be reported instead. It looks like the median is 1, so the mean of 3 is definitely exaggerating the central tendency.
Thank you for this suggestion. To more accurately represent the distribution of mentions, we have updated this section to report the median, with the mean and range included in parentheses. This sentence on page 6 now reads: "Of the 213 articles that received online news mentions, the median number of mentions per article was 1 (mean = 3; range: 1-23, SD = 3.8)." (p. 6)

-In Table 1 it might be useful to add some sort of mentions/death or death/mention metric; it takes some work, as the table is currently formatted, to understand that, for example, pancreatic cancer is woefully under-reported given the amount of death (more than breast cancer! I had no idea). Or, alternatively, a visual that compares the mortality rank and publicity rank or something similar, so that this disconnect between incidence/mortality and publicity is more clear. …as a journalist might say, it seems like the authors have buried the lead.

We appreciate the reviewer's suggestion to include a visual to illustrate the ratio of estimated annual deaths to news mentions. The suggestion of this metric is indeed helpful in demonstrating the variation in coverage across types of cancers, given their mortality estimates. To address this, we have added the following text on pages 6-7: "We also examined how mentions differed across estimated deaths for each common cancer type (Figure 1). These ratios ranged from 0 to 8618. With the exception of urinary and bladder cancers, which did not receive any mentions, the lowest death-to-mention ratio was observed for breast cancer (ratio = 760.56, indicating greater coverage per estimated death), and the highest ratio was observed for pancreatic cancer (ratio = 8618, indicating the least coverage per estimated death)." We have also added Figure 1 to provide a visualization of these ratios. (pp. 6-7)
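The reviewer's earlier observation that a standard deviation above the mean signals skew can be seen in a minimal sketch. The mention counts below are invented for illustration, shaped only to echo the reported median of 1, mean of about 3, and maximum of 23.

```python
import statistics

# Invented, right-skewed mention counts per article: most articles get a
# single mention, while a handful are covered heavily (up to 23 times).
mentions = [1] * 14 + [2, 3, 4, 5, 8, 23]

median = statistics.median(mentions)   # unaffected by the long right tail
mean = statistics.mean(mentions)       # pulled upward by a few outliers
spread = (min(mentions), max(mentions))
```

Here the median stays at 1 while the mean is roughly 3, which is why reporting the median (with mean and range in parentheses) represents such a distribution more honestly.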
-The column headings on …

Thank you for sharing this code with us. Although the present analysis was not conducted in R, this is quite helpful to view for our future work. In the spirit of the reviewer's comment, we have used Excel to generate Figure 1 to illustrate a comparison of estimated deaths per news mention across common cancer types.

Reviewer #3
Interesting article and approach!

Thank you for your comment.

It wasn't completely clear what "journalistic media" meant. A
We added a more concrete definition in the pp. 3, 6 more concrete definition within the Introduction would be helpful as you dig down into the Methods, perhaps in lieu of the one you provided (e.g., includes both print and online sources). For example, in the paragraph beginning with: "As Maggio et al. 's [41] full data set included a broad collection of news media organizations, we filtered out non-journalistic news media sources from the data set, leaving only online news media sources." What is a non-journalistic media source? What exactly was filtered out? A clearer definition (perhaps with examples) would be helpful. 2. Similarly, you describe coding articles for the presence of a mention. As someone who is unfamiliar with Altmetric, was coding an automated process, or was this done manually? If the latter, more information about how this was done would be helpful.
We have addressed this concern in our revision. We have revised and expanded the section on Altmetric.com, including how Altmetric codes its news sources. (pp. 4-5)

3. Extensive detail is provided regarding how the media-related data were obtained. However, a brief mention of where the incidence/mortality data were derived from would be helpful, too, since that's a major aim of the paper.
Thank you for this suggestion. On page 6, we have added wording to mention where data on cancer incidence and mortality were obtained and added a citation to the Common Types of Cancer list provided by the National Cancer Institute. This section now reads: "Additionally, we examined journalistic news coverage of scientific articles across common types of cancer (i.e., defined by the National Cancer Institute as cancers with an estimated incidence of 40,000 or more cases per year), which included breast, lung, melanoma, colorectal, prostate, leukemia, liver, pancreatic, endometrial, kidney, Non-Hodgkin's lymphoma, thyroid, and urinary/bladder cancers." (p. 6)

4. It's nice to have all coding for this project publicly available!

Thank you. We agree.

Results: 1. Minor issue, but Table 2 is referenced in the text before Table 1.

We have addressed this. (pp. 7-9)

2. Table 1 was particularly interesting, but little reporting or discussion of it was presented in the text. If part of the goal of this paper is to highlight discrepancies between morbidity/mortality and news coverage, I might highlight some "standouts" in the text. For example, lung cancer is responsible for ~150,000 deaths annually and received 35 online mentions, while melanoma (responsible for half as many deaths) received nearly identical coverage. A greater discussion of these points would serve to support your overall study aim.
Thank you for this suggestion. We have expanded our discussion of Table 1.

Thank you for pointing us to this paper. You will see that we incorporated the suggested structure into the discussion. In the Track Changes version, you can see the suggested structure via subheads, which were removed in the "clean" version. (pp. 12-16)

2. There are portions of the Discussion section that would be better placed within the Results (for example, the first paragraph under the "Growing divide: traditional vs. digital-native news" heading).

We moved that initial paragraph and several other portions of the Discussion section into the Results section. (p. 6)

3. You mention that certain journal article topics (e.g., drugs or financing) are prioritized; it would be interesting to see this situated within the context of previous research as well. I would assume that this is common practice, but is that something unique to this study, or is that aligned with previous research on the topic?

We have tried to address this in the literature section (i.e., Introduction). (pp. 1-6)
4. Something that's missing from this section is a discussion of how this work relates to intentional dissemination efforts from researchers. Within the Introduction, you write: "Findings could facilitate future dissemination and funding initiatives... This study lays the groundwork for future research that explores how online news media could be better incorporated into dissemination processes and knowledge translation strategies." You also cite the 2018 Brownson article but don't discuss it or any related articles within this section. More consideration within the Discussion is warranted, as it seems to be a logical "next step" for this type of work.
Thank you for the suggestion. We have incorporated additional information from scientific communication literature regarding scientists communicating with journalists. We have included Brownson (2018) in the text and by name in the Conclusion. We have also worked to better address dissemination processes and knowledge translation strategies in the Conclusion. (pp. 2-3, 14-16)

We greatly appreciate your further review of our manuscript and hope that the revisions described above and in the revised paper meet your expectations. Please do not hesitate to contact me with any questions or additional requests.
Best regards,
Dr. Laura Moorhead
Assistant Professor, Journalism