No support was received from any organization for the submitted work. NM is currently employed by Avalere Health LLC, a healthcare strategic advisory firm that serves both public and private sector clients. Aside from final comments and approval, her contributions to this manuscript were carried out prior to her employment at Avalere Health LLC. Her contributions to this manuscript are independent of Avalere Health LLC or its clients, do not reflect the views or opinions of Avalere Health or its clients, and were not supported by Avalere Health LLC or its clients in any way (e.g., no financial or content contribution). KD and SSV are authors of one of the articles examined (Vedula et al 2009); KD served as an unpaid expert witness in the 2008 litigation associated with the article and SSV served as a paid consultant. This does not alter the authors' adherence to PLOS ONE policies on sharing data and materials.
Conceived and designed the experiments: LR SSV KD. Analyzed the data: LSW. Wrote the paper: LSW KD. Developed and carried out search strategies: LR SSV CNK LMR CT NM. Screened articles for eligibility: LSW LR SSV CNK NM KD. Created and revised data abstraction form: LSW KD. Abstracted data from eligible articles: LSW LR SSV CNK LMR CT KD. Communicated with article authors and others to confirm information on location of internal documents and identify additional eligible articles: KD. Contributed to final edits on manuscript: LSW LR SSV CNK LMR CT NM KD.
Current address: Avalere Health LLC, Washington, DC, United States of America
To describe the sources of internal company documents used in public health and healthcare research.
We searched PubMed and Embase for articles using internal company documents to address a research question about a health-related topic. Our primary interest was where authors obtained internal company documents for their research. We also extracted information on type of company, type of research question, type of internal documents, and funding source.
Our searches identified 9,305 citations of which 357 were eligible. Scanning of reference lists and consultation with colleagues identified 4 additional articles, resulting in 361 included articles. Most articles examined internal tobacco company documents (325/361; 90%). Articles using documents from pharmaceutical companies (20/361; 6%) were the next most common. Tobacco articles used documents from repositories; pharmaceutical documents were from a range of sources. Most included articles relied upon internal company documents obtained through litigation (350/361; 97%). The research questions posed were primarily about company strategies to promote or position the company and its products (326/361; 90%). Most articles (346/361; 96%) used information from miscellaneous documents such as memos or letters, or from unspecified types of documents. When explicit information about study funding was provided (290/361 articles), the most common source was the US-based National Cancer Institute. We developed an alternative and more sensitive search targeted at identifying additional research articles using internal pharmaceutical company documents, but the search retrieved an impractical number of citations for review.
Internal company documents provide an excellent source of information on health topics (e.g., corporate behavior, study data) exemplified by articles based on tobacco industry documents. Pharmaceutical and other industry documents appear to have been less used for research, indicating a need for funding for this type of research and well-indexed and curated repositories to provide researchers with ready access to the documents.
Even though the scientific research enterprise and healthcare decisions rely on the biomedical literature being complete and accurate, it is neither
Research on selective reporting and other reporting biases is made possible when the published literature can be compared with other sources of information about the same research studies, for example from research ethics committees
Our objective was to describe the characteristics of public health and healthcare research using internal company documents across industries. The ultimate goal of our research was to document for others the potential sources of accessible internal company data for public health and healthcare research, particularly in the area of pharmaceutical research, and, in doing so, to take the first steps toward exploring the current use and future potential for repositories of internal company information.
Articles were eligible if they described a study that addressed a research question or objective related to public health or healthcare, and if internal company documents were explicitly referred to as the source of data (i.e., information) examined in the study. We considered internal company documents to include emails, memoranda, reports (including those with study data), presentations, meeting minutes, and other documents not originally intended to be publicly available. If documents were prepared for outside entities that were not employees or subcontractors, then we did not consider them to be internal company documents (e.g., we did not consider clinical trial protocols, which are shared with institutional review boards and investigators, to be internal company documents, nor did we include published research performed by company staff or contractors). We defined public health and healthcare in a broad sense to include studies of incidence, prevalence, etiology, prevention, diagnosis, harm, or prognosis, as well as any other studies concerning products or materials with health effects. Eligible studies could be qualitative (including descriptive and exploratory studies) or quantitative primary studies. While we recognize that systematic reviews may include additional information from internal company documents, we did not include them because doing so would have required first identifying all systematic reviews and then checking each one to see whether it used internal company documents.
Initially, we used the authors' combined file of articles meeting our eligibility criteria (n = 35 articles using pharmaceutical, tobacco and other industry documents) as a “reference set” against which we tested various search strategies. Four of the authors (NM, SSV, LR, and CNK) developed a search strategy using relevant Medical Subject Heading (MeSH) terms and title and abstract text words from the reference set. This initial search retrieved about 2 million citations in PubMed and, considering this too many citations to review, we started over.
Working with an informationist (LMR), the team revised the PubMed search strategy by identifying a more targeted combination of keywords and MeSH headings from the reference set and running separate searches. Examples of MeSH headings included industry[majr], disclosure[mesh], and access to information[mesh], and keywords included terms such as “industry documents” (see
We also report in the Results section an additional search we did after this one, in an attempt to find more articles that used internal pharmaceutical company documents. We based this additional search on articles found by the electronic search strategy finally used, described above.
To determine whether articles met inclusion criteria for our study, two authors independently screened the title and abstract of each citation and then independently examined the full text of each article considered unclear or possibly eligible as a result of screening. Differences in opinion were resolved through discussion.
Two authors independently extracted data from eligible English-language articles using a standardized online data extraction form. Articles not in English were assigned to a single data extractor with expertise in the language. We extracted information on the following items: language and date of publication, type of company (e.g., tobacco, pharmaceutical); source(s) of internal documents (e.g., litigation, U.S. Freedom of Information Act (FOIA) request); type of research question (e.g., about strategic behavior on the part of a company or industry, about effects of a therapeutic intervention); type of internal documents (e.g., research studies, internal memos); and funding source (e.g., government, non-profit, for-profit) for the study (see
When an included article focused on research methods, one author classified the article into one of the following categories: 1) criticism of industry research (e.g., suggestion of misconduct or problems with dissemination of research), 2) exploration of company research methods that was not focused on criticism of the company, or 3) exploration of methods for accessing or analyzing internal company documents for the purposes of non-company research. A second author verified the classification with disagreements resolved through discussion.
One author collected additional details via email correspondence with authors of pharmaceutical research articles when the articles contained insufficient or unclear information about the location of internal company documents. These details were abstracted into a table, and a second author read the emails to verify the abstraction.
One author compared the results of our searches to known reference standards of research articles using internal company documents. For tobacco articles, we used the online Tobacco Documents Bibliography at the Tobacco Control Archives held by the University of California, San Francisco (UCSF)
For research articles using internal company documents for other types of companies, we were not aware of a source we could use as a true reference standard. We were particularly interested in identifying articles using pharmaceutical company documents and therefore applied three methods to identify additional articles we might have missed through our electronic searches.
First, we used
Second, one author visited the website of the Drug Industry Document Archive (DIDA)
Finally, we retrieved a few potentially eligible articles through
We performed descriptive statistical analyses, including counts of the number of studies with different characteristics, and cross tabulations of the joint distribution of study characteristics.
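The counting and cross-tabulation described above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the four article records, the field names, and the helper functions `crosstab` and `row_totals` are all hypothetical. It assumes, as the table notes in the Results indicate, that one article can be classified under more than one company type or document source.

```python
from collections import Counter

# Hypothetical records: each article may carry several company types and sources.
articles = [
    {"id": 1, "company": {"tobacco"}, "source": {"litigation"}},
    {"id": 2, "company": {"tobacco", "alcohol"}, "source": {"litigation"}},
    {"id": 3, "company": {"pharmaceutical"}, "source": {"company"}},
    {"id": 4, "company": {"transportation"}, "source": {"litigation", "foia"}},
]


def crosstab(records, row_key, col_key):
    """Count articles per (row, column) cell.

    A multi-label article is counted in every cell it belongs to.
    """
    cells = Counter()
    for rec in records:
        for row in rec[row_key]:
            for col in rec[col_key]:
                cells[(row, col)] += 1
    return cells


def row_totals(records, row_key):
    """Count distinct articles per row category (each article once per row)."""
    totals = Counter()
    for rec in records:
        for row in rec[row_key]:
            totals[row] += 1
    return totals


table = crosstab(articles, "source", "company")
totals = row_totals(articles, "source")

# Cells in the litigation row sum to more than the distinct-article total,
# because article 2 sits in both the tobacco and the alcohol column.
litigation_cells = sum(n for (row, _), n in table.items() if row == "litigation")
print(litigation_cells, totals["litigation"])
```

Keeping the per-cell counts separate from the distinct-article totals is what allows a table to report totals that are smaller than the sum of its cells.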
Our search of PubMed and Embase retrieved 9,305 unique records. After screening, 357 articles were classified as eligible for the study. Our searches of other sources to identify additional articles using pharmaceutical documents identified two from
Source of internal documents | Tobacco, N (%) | Pharmaceuticals, N (%) | Manufacturing, N (%) | Mining, N (%) | Transportation, N (%) | Alcohol, N (%) | Other companies, N (%) | Total, N (%) |
Litigation | 324 (>99) | 18 (90) | 6 (67) | 1 (50) | 1 (50) | 1 (100) | 1 (20) | 351 (97) |
U.S. Freedom of Information Act (FOIA) | 2 (1) | 0 (0) | 0 (0) | 0 (0) | 1 (50) | 0 (0) | 0 (0) | 2 (1) |
Company | 0 (0) | 2 (10) | 2 (22) | 0 (0) | 0 (0) | 0 (0) | 4 (80) | 8 (2) |
Whistleblower | 1 (<1) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 1 (<1) |
Unknown | 0 (0) | 0 (0) | 1 (11) | 1 (13) | 0 (0) | 0 (0) | 0 (0) | 1 (<1) |
Other sources | 1 (<1) | 0 (0) | 0 (0) | 0 (0) | 1 (50) | 0 (0) | 0 (0) | 2 (<1) |
Total | 325 (100) | 20 (100) | 9 (100) | 2 (100) | 2 (100) | 1 (100) | 5 (100) | |
The articles using documents from other companies were one article each using documents from a hospital, a physician practice management organization, a soft drinks distributor, and a nuclear plant, and one article using internal documents from six different companies including an agribusiness and a utility company.
The totals in this column equal the number of articles relying upon a particular source of documents, minus three instances of duplicate classification by type of company within category of document source. These instances were: one article with litigation source was classified as both tobacco and alcohol; one article with FOIA source was classified as both tobacco and transportation; and one article with unknown source was classified as both manufacturing and mining. The overall column total is not shown, as it is greater than the total number of included articles (n = 361) because several articles relied upon multiple sources for documents.
The litigation-related source of documents for one pharmaceutical article was a leak from legal proceedings.
The other sources of documents were: private archives of a company consultant (1 tobacco article) and records from a bankruptcy (1 transportation article).
The totals in this row equal the number of articles for each type of company, minus instances of duplicate sources of documents. Two tobacco articles relying upon FOIA for documents and one tobacco article relying upon other sources of company documents (the private archives of a company consultant) also relied upon documents from litigation, and one transportation article relied upon both litigation and FOIA. The totals for the tobacco and transportation article columns are therefore not equal to the sum of the classifications within the columns. The overall row total is not shown, as it is greater than the total number of included articles (n = 361) because three articles were classified with two types of companies.
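The rule for the Total column (the sum across company types minus articles classified under two company types) can be checked with a few lines of arithmetic. The sketch below copies the counts from the table above; the dictionary names and the helper `distinct_articles` are illustrative only.

```python
# Per-company counts for each document source, copied from the table above
# (order: tobacco, pharmaceuticals, manufacturing, mining, transportation,
# alcohol, other companies).
counts = {
    "Litigation": [324, 18, 6, 1, 1, 1, 1],
    "FOIA": [2, 0, 0, 0, 1, 0, 0],
    "Company": [0, 2, 2, 0, 0, 0, 4],
    "Whistleblower": [1, 0, 0, 0, 0, 0, 0],
    "Unknown": [0, 0, 1, 1, 0, 0, 0],
    "Other sources": [1, 0, 0, 0, 1, 0, 0],
}

# Articles classified under two company types within one source, per the
# table note: tobacco+alcohol (litigation), tobacco+transportation (FOIA),
# manufacturing+mining (unknown).
double_classified = {"Litigation": 1, "FOIA": 1, "Unknown": 1}

# Totals as reported in the last column of the table.
reported_totals = {
    "Litigation": 351,
    "FOIA": 2,
    "Company": 8,
    "Whistleblower": 1,
    "Unknown": 1,
    "Other sources": 2,
}


def distinct_articles(source):
    """Sum across company types, then remove double-classified articles."""
    return sum(counts[source]) - double_classified.get(source, 0)


for source, reported in reported_totals.items():
    assert distinct_articles(source) == reported, source

# The source totals add up to more than the 361 included articles,
# consistent with some articles relying on several sources.
print(sum(reported_totals.values()))
```

Every reported total is reproduced by this rule, and the sum of the source totals exceeds 361, which matches the note that the overall total is not shown.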
Most included articles relied upon internal company documents obtained through litigation (350/361; 97%) (see
The research questions posed in included articles were primarily about company strategies to promote or position the company and its products (326/361; 90%) (see
Types of questions | Tobacco, N (%) | Pharmaceuticals, N (%) | Manufacturing, N (%) | Mining, N (%) | Transportation, N (%) | Alcohol, N (%) | Other, N (%) | Total questions, N (%) |
Company's strategic behavior (e.g., marketing) | 303 (93) | 15 (75) | 6 (67) | 2 (100) | 1 (50) | 1 (100) | 1 (20) | 326 (90) |
Company's other behavior (e.g., safety) | 7 (2) | 1 (5) | 3 (33) | 0 (0) | 0 (0) | 0 (0) | 1 (20) | 12 (3) |
Health effects of exposure or intervention | 19 (5.9) | 6 (30) | 5 (56) | 0 (0) | 1 (50) | 0 (0) | 1 (20) | 33 (9) |
Therapeutic intervention | 1 (<1) | 7 (35) | 0 (0) | 0 (0) | 1 (50) | 0 (0) | 0 (0) | 9 (2) |
Prevalence of intervention, exposure or outcome | 2 (1) | 0 (0) | 2 (22) | 0 (0) | 0 (0) | 0 (0) | 2 (40) | 6 (2) |
Research methods | 31 (10) | 9 (45) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 40 (11) |
Total | 325 (100) | 20 (100) | 9 (100) | 2 (100) | 2 (100) | 1 (100) | 5 (100) | |
The totals in this column equal the number of articles asking a particular type of question, minus instances of duplicate classification by type of company within category of type of question. These instances were: Strategic behavior questions were asked by articles classified as both tobacco and transportation, both mining and manufacturing, and both tobacco and alcohol; a therapeutic intervention question was asked by the article classified as both tobacco and transportation. The overall column total is not shown, as it is greater than the total number of included articles (n = 361) because several articles posed multiple types of questions.
The totals in this row equal the total number of articles for each type of company, minus instances where articles asked multiple types of questions, of which there are too many to list. The totals for the columns are therefore not equal to the sum of the classifications within the columns. The overall row total is not shown, as it is greater than the total number of included articles (N = 361) because three articles were classified with two types of companies.
It was often difficult to identify the exact type of internal company documents used in the articles. Our interest was not in the format of the document but rather in the type of document information that appeared to have been used in the article. We were interested in whether the document contained quantitative study data and, if so, whether the data were produced by the company itself or by another entity acting on behalf of the company. We also wished to capture whether the document was the result of routine company activities. We therefore classified the type of document information as belonging to one or more of four categories: 1) quantitative study data from internal company studies (e.g., analysis or re-analysis of quantitative data from studies conducted by the company), 2) quantitative data from non-company studies (e.g., quantitative data quoted from market research conducted on behalf of a company), 3) data from company records collected as part of routine company activities (e.g., employee records), and 4) ‘other’ types of data, generally information from miscellaneous documents such as memos or letters, or from unspecified types of documents. Most studies in the review (344/361; 95%) were classified as using ‘other’ types of company data (see
Types of internal company data | Tobacco, N (%) | Pharmaceuticals, N (%) | Manufacturing, N (%) | Mining, N (%) | Transportation, N (%) | Alcohol, N (%) | Other, N (%) | Total, N (%) |
Quantitative data from company study | 61 (19) | 9 (45) | 3 (33) | 1 (50) | 0 (0) | 0 (0) | 0 (0) | 73 (20) |
Quantitative data from non-company study | 38 (12) | 0 (0) | 1 (11) | 0 (0) | 0 (0) | 0 (0) | 0 (20) | 39 (11) |
Data from company day-to-day records | 3 (1) | 2 (10) | 3 (33) | 0 (0) | 1 (50) | 0 (0) | 5 (100) | 14 (4) |
Other types of company data/information or type is unclear | 321 (99) | 14 (70) | 6 (67) | 2 (100) | 1 (50) | 1 (100) | 4 (80) | 344 (95) |
Total | 325 (100) | 20 (100) | 9 (100) | 2 (100) | 2 (100) | 1 (100) | 5 (100) | |
The totals in this column equal the number of articles using a particular type of data, minus instances of duplicate classification by type of company within category of type of data. These instances were: Other types of data were used by articles classified as both tobacco and transportation, both mining and manufacturing, and both tobacco and alcohol, and quantitative data from internal company studies were used by the article classified as both mining and manufacturing. The overall column total is not shown, as it is greater than the total number of included articles (n = 361) because several articles used multiple types of internal documents.
The totals in this row equal the total number of articles for each type of company, minus instances where articles used multiple types of data, of which there are too many to list. The totals for the columns are therefore not equal to the sum of the classifications within the columns. The overall row total is not shown, as it is greater than the total number of included articles (N = 361) because three articles were classified with two types of companies.
Articles describing studies using tobacco company documents consistently referred to physical or online tobacco company document repositories as the location of the documents used in the study. Articles describing studies using pharmaceutical company documents did not have a consistent source of documents, and we investigated the current location of those documents. Of the 20 articles using internal pharmaceutical company documents, two used documents made available directly to the researchers by the company, and the remainder used documents released as a result of litigation (n = 18, including one case in which documents were leaked from litigation) (see
Company and product name | Source(s) of documents according to article/correspondence with author | Location(s) of documents as of May 10, 2013 | Article(s) |
Bayer: cerivastatin (Baycol). | Documents from litigation. Documents part of the public record through Hollis N. Halton v. Bayer, Nueces County Clerk, Tx. | Documents part of court records. No online link to documents. | Psaty et al. 2004 |
Eli Lilly: olanzapine (Zyprexa). | Documents from litigation, available at | Documents part of court records. Online link to documents not active as of 5/10/13. | Applbaum 2009 |
Eli Lilly: olanzapine (Zyprexa). | Documents leaked from litigation. | Author says the documents are available at | Spielmans 2009 |
Eli Lilly: olanzapine (Zyprexa). | Documents from litigation (two lawsuits). Unpublished analyses by sponsor of premarket safety data. | Documents currently available through | Woods et al. 2011 |
Glaxo Smith Kline (GSK): paroxetine (Paxil) | Documents from litigation. The expert report was based on 3-day examination of files at company headquarters by Dr. Breggin. | Actual documents used are not publicly accessible, only psychiatric expert report. Report available at the court (Moffett v. GSK, United States District Court for the Southern District of Mississippi) and at | Breggin 2006 |
Glaxo Smith Kline (GSK): paroxetine (Paxil) | Documents from litigation. Authors had access to confidential documents as a consequence of their roles in litigation. Some documents in the case have been released into the public domain. | Documents part of court records for | Jureidini et al. 2008 |
Merck: rofecoxib (Vioxx) | Documents from litigation: | Documents part of court records. Online link to documents not active as of 5/10/13. | Psaty and Kronmal 2008 |
Merck: rofecoxib (Vioxx) | Documents available through litigation. Authors had access to internal Merck documents created in 1998–2006, and obtained through discovery in legal proceedings, | All legal documents and the dataset used in the first three articles are available in the Drug Industry Document Archive (DIDA) at | Hill et al. 2008 |
Pfizer & Parke-Davis, Division of Warner-Lambert: gabapentin (Neurontin). | Documents from litigation. Authors obtained access to the data because they served as unpaid expert witnesses for the plaintiff in the whistleblower litigation | Documents available in DIDA at | Steinman et al. 2006 |
Pfizer & Parke-Davis, Division of Warner-Lambert: gabapentin (Neurontin). | Documents from litigation. Authors obtained access to the data because two of them served as consultants for the plaintiff in | All of these documents are available in DIDA at | Vedula et al. 2009 |
Wyeth: conjugated equine estrogens and medroxyprogesterone acetate (Prempro) | Documents from litigation. Expert report by Dr. Fugh-Berman based on documents from Wyeth interactions with DesignWrite (medical education and communications company). Documents used in expert MDL Docket no 4:03CV1507 WRW, and used again in | Prempro Products Liability Litigation now available at | Fugh-Berman 2010 |
Merck: MDMA (“ecstasy”). | Merck Archives. Internal company documents recording the history of drug development. | Internal documents not publicly available. | Bernschneider-Reif et al. 2006 |
Merck, Sharpe and Dohme: chlorothiazide (Diuril). | Merck Archives. Internal company documents recording the history of development and promotion of chlorothiazide. | Internal documents not publicly available. | Greene 2005 |
Documents are grouped in rows where the articles are linked by a common set of authors working with the same set of documents.
Vedula and colleagues (Vedula et al 2009) included in their analysis internal company documents from a 2004 litigation that were also used by other authors in two articles (Steinman et al 2007 and Landefeld and Steinman 2009), and in addition analyzed documents from a 2008 litigation that were not used in other articles.
Explicit information about study funding was found in 290/361 (80%) articles, of which a small number (10/290; 3%) specified that the study had not been funded and 280 listed funding. Among the 280 articles describing funding received specifically for the study, the most common source of funding was the U.S. government, followed by non-profit organizations (see
Type of funding | Tobacco, N (%) | Pharmaceuticals, N (%) | Manufacturing, N (%) | Mining, N (%) | Transportation, N (%) | Alcohol, N (%) | Other, N (%) | Total, N (%) |
No information about how research using documents was funded | 54 (17) | 9 (43) | 4 (44) | 2 (100) | 1 (50) | 0 (0) | 2 (40) | 71 (20) |
Statement that research using documents was not funded | 2 (1) | 8 (67) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 10 (3) |
Funding from government, US | 242 (74) | 3 (15) | 1 (11) | 0 (0) | 0 (0) | 0 (0) | 2 (40) | 248 (69) |
Funding from government, not US | 36 (11) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 36 (10) |
Funding from non-profit organization | 80 (25) | 3 (17) | 2 (22) | 0 (0) | 1 (50) | 1 (100) | 1 (20) | 86 (24) |
Funding from company being studied | 0 (0) | 0 (0) | 1 (11) | 0 (0) | 0 (0) | 0 (0) | 1 (20) | 2 (<1) |
Funding from other sources | 6 (2) | 0 (0) | 1 (11) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 7 (2) |
Total | 325 (100) | 20 (100) | 9 (100) | 2 (100) | 2 (100) | 1 (100) | 5 (100) | |
The totals in this column equal the number of articles reporting a particular type of funding, minus instances of duplicate classification by type of company within funding category. These instances were: There was no information on funding for the article classified as both manufacturing and mining, and non-profit, non-governmental funding was used by the articles classified as both tobacco and transportation and both tobacco and alcohol. The overall column total is greater than the total number of included articles (N = 361) because some articles reported multiple types of funding.
Other funding sources include Blue Cross Blue Shield (4 tobacco articles), the World Health Organization (2 tobacco articles), and funding from a law firm (1 manufacturing article).
The totals in this row equal the total number of articles reporting funding for each type of company, minus instances where articles reported multiple types of funding, of which there are too many to list. The totals for the columns are therefore not equal to the sum of the classifications within the columns. The overall row total is greater than the total number of included articles (N = 361) because three articles were classified with two types of companies.
The source of our reference standard for tobacco studies, the Tobacco Documents Bibliography, contained 579 journal articles published in March 2011 or earlier. Our searches identified 337/579 (58%) of these records, of which 307/337 (91%) were deemed eligible and included in our study. Of the 242 remaining journal articles contained in the reference standard, 173 were indexed in PubMed or Embase and were not captured by our searches. On the other hand, 18/325 (6%) of the tobacco company records in our study were not included in the Tobacco Documents Bibliography.
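The proportions in this comparison follow directly from the counts given. As a minimal worked check (the variable names and the helper `pct` are illustrative, and the figures are those stated in the paragraph above):

```python
def pct(numerator, denominator):
    """Percentage rounded to the nearest whole number."""
    return round(100 * numerator / denominator)


reference_total = 579      # journal articles in the Tobacco Documents Bibliography
retrieved = 337            # of those, records identified by the searches
eligible_retrieved = 307   # retrieved records deemed eligible and included
tobacco_included = 325     # tobacco articles included in the study
not_in_reference = 18      # included tobacco articles absent from the bibliography

# Share of the reference standard that the searches retrieved.
print(pct(retrieved, reference_total))            # 58
# Share of retrieved reference articles that met eligibility criteria.
print(pct(eligible_retrieved, retrieved))         # 91
# Reference articles the searches did not retrieve.
print(reference_total - retrieved)                # 242
# Share of included tobacco articles missing from the bibliography.
print(pct(not_in_reference, tobacco_included))    # 6
```

The first figure is the share of the reference standard retrieved by the searches; the last shows that the bibliography itself was not exhaustive.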
Given the low number of citations reporting research using internal pharmaceutical company documents that were captured by the search we finally used, an informationist designed an additional strategy tailored to be more sensitive and to identify research using internal pharmaceutical company documents, and a second informationist reviewed the strategy. Eighteen (two later determined to be not eligible) pharmaceutical company research articles (15 PubMed records and 3 Embase) retrieved by our original search formed the basis for this “drug industry” search strategy. One author reviewed the reference lists of the 18 articles and selected references on the topic of internal pharmaceutical company documents (n = 53), and a colleague provided a list of additional related articles (n = 8). Keywords and MeSH terms from the 18 originally included articles, the 53 selected references, and the 8 related articles were combined into a more targeted and potentially more sensitive search strategy, which was run in PubMed. This search strategy captured 17/18 of the PubMed citations to the pharmaceutical company articles finally included in our study. To achieve this level of sensitivity, however, the new, more sensitive PubMed search identified 26,399 “hits”, of which 25,605 had not been identified by the previous search, and we decided that this was an unmanageable number for continuing to search for eligible articles.
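The trade-off described above can be put in rough numbers. The sketch below uses only figures reported in this article; the variable names are illustrative. The comparison is approximate: the 9,305 records came from PubMed and Embase combined, whereas the "drug industry" search was run in PubMed only, and the yield of the unscreened citations is unknown.

```python
# Search finally used (PubMed and Embase combined).
original_records = 9305
original_eligible = 357

# More sensitive "drug industry" search (PubMed only).
new_hits = 26399
new_not_previously_seen = 25605
captured, known_pubmed_articles = 17, 18

# Sensitivity against the known pharmaceutical articles.
sensitivity = captured / known_pubmed_articles

# Records screened per eligible article in the search finally used.
screened_per_eligible = original_records / original_eligible

# Hits that the earlier search had already retrieved.
already_seen = new_hits - new_not_previously_seen

# Additional screening load relative to the search finally used.
relative_extra_load = new_not_previously_seen / original_records

print(f"sensitivity: {sensitivity:.0%}")
print(f"records screened per eligible article: {screened_per_eligible:.1f}")
print(f"hits already seen: {already_seen}")
print(f"extra screening load: {relative_extra_load:.2f}x the original search")
```

On these figures, the more sensitive search would have added well over twice the original screening workload.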
Internal company documents serve as a valuable source of information about industry activities for those who wish to know about the impact of those activities upon the health of the public. Internal documents from pharmaceutical companies include not only information on marketing and policy activities but also contain quantitative and other data related to clinical trials carried out on company products. Data from all trials are critical for a complete and accurate assessment of interventions within systematic reviews
What we learned, first, is that thousands of internal tobacco company documents, mainly released through litigation, are located in repositories around the world
The second thing we learned was that identifying non-tobacco studies using internal company documents was harder than we had anticipated. Only 36/361 articles that we identified used non-tobacco sources, and more than half of these (20/36) were concerned with pharmaceutical company documents. We made every effort to search PubMed and Embase thoroughly, but to keep screening practical we designed a search strategy that favored precision over sensitivity. It is possible that there are additional relevant articles from other non-tobacco industries (e.g., the chemical industry, the food and agricultural industry) that our search failed to retrieve. We need to identify better search terms for retrieving articles that use internal company documents, and to consider consistent indexing of such articles. New machine learning approaches to searching databases may also improve the retrieval of difficult-to-find articles. We also found that, in contrast to tobacco company documents, which are contained in well-indexed repositories developed to facilitate public access to information, pharmaceutical company documents are available at a range of sites, not all of which are well known or accessible to the public. In addition, bibliographies of articles using internal pharmaceutical company documents, similar to the Tobacco Documents Bibliography, would greatly ease the identification of research using internal pharmaceutical company data.
These findings point to the importance of having one or more indexed and searchable repositories in place to assure comprehensive identification of internal company documents. Litigation has been an important source of internal company documents for research, and some documents from pharmaceutical company litigation have now been placed in DIDA; indeed, DIDA was started with funds from litigation. Nevertheless, the majority of pharmaceutical company documents in the studies we found were made available through websites (some no longer accessible), were obtained through collaboration with the company, or are court documents that one must know exist to be able to find. Comprehensive well-indexed and searchable repositories of internal company documents from pharmaceutical and other industries, similar to the repositories that exist for documents from the tobacco industry, are critical for the development of a program of research using other types of internal documents, including restorative authorship
The third thing we learned is that funding for research using internal company documents is uneven. Where there has been funding available, notably for the tobacco-related research, many important research projects have been conducted. Three-quarters of the tobacco research was funded by the U.S. government, primarily the NCI. Indeed, the NCI established a program of research and actively solicited researchers to develop projects using internal tobacco documents (e.g.,
In contrast, research studies making use of internal pharmaceutical company documents have typically not been federally funded. Most articles we identified (13/20) reported no funding or had no explicit information about funding for the research, and only 3/20 reported US state or federal government funding. If, to date, only a handful of research studies have used internal pharmaceutical company documents, then it may be because of lack of available funding. Given the importance of research using internal tobacco documents to our current knowledge and views about tobacco and its health effects, a similar investment in other areas, including the pharmaceutical area, could also yield potentially important findings.
We do not know whether our initial search would have found more, fewer, or the same number of studies using pharmaceutical documents in our reference standard if we had followed through and screened the over 2 million citations retrieved. It is possible that the number of research articles using pharmaceutical company documents is actually small and that we found most of them. We know that among the articles we identified, there was considerable overlap in documents, authors, and drugs examined. If we have identified most of the relevant articles, it highlights all there is to be gained by making all publicly available source documents (especially clinical study reports and datasets) accessible in one or a few locations, assuming this will prompt new research.
The studies of internal pharmaceutical company documents we identified, and others, have provided important signals for evidence-based medicine, indicating that the published literature, generally, is not always reliable and that much of what is known remains unpublished.
While our particular interest in this project was pharmaceutical company documents, other company documents released through litigation or other means and potentially useful for health-related research and for setting governmental standards (e.g., regarding environmental hazards) should also be made centrally available to researchers. These collections of corporate documents should ideally be linked or merged, as companies often collaborate across industries (e.g., large corporations control both tobacco and alcohol companies) to promote their interests, often at the expense of public health.
Our study is limited by our focus on relevant research indexed in PubMed or Embase, and by a search date that is now more than three years in the past. We may therefore have overlooked relevant research using internal company documents, including documents used in systematic reviews. Nevertheless, our findings highlight the great need for well-indexed and curated repositories so that researchers can have ready access to internal company documents. The existing DIDA repository is a good start, but additional funds are required to make it maximally useful to researchers. Each document in the repository needs to have consistent indexing information (metadata) such as title, author, date, Bates number, and document type. This information would either need to be provided (e.g., by the plaintiffs' attorneys) or a vendor would have to be hired to create it. In addition, funding is needed for DIDA's ongoing curation to support, for example, information science and programming staff. Linking repositories and bibliographies (e.g., unpublished data in systematic reviews) should be explored, as well as linking these sources and registers of studies (e.g., ClinicalTrials.gov and the Cochrane Register of Studies). The research articles we identified relating to tobacco industry documents are a testament to how information in internal company documents can contribute to the community's understanding and to enhancing transparency in communicating research findings.
Many thanks to colleagues who read articles in non-English languages to assess eligibility: Isabelle Boutron, Peter Doshi, Stephan Ehrhardt, Tom Jefferson, Karmela Krleza-Jeric, Tianjing Li, Joerg Meerpohl, Isabel Rodriguez-Barraquer, Rob Scholten, Gerard Urrutia-Cuchi.
Thanks also to the authors of eligible studies who, where necessary, filled in details of where they obtained the internal company documents their articles referenced. Thanks to Tom Greene for advising on potential ways of accessing federal court papers.