Successful publishing of an article depends on several factors, including the structure of the main text, the so-called introduction, methods, results and discussion (IMRAD) structure. The first objective of our work is to provide recent results on the number of paragraphs (pars.) per section used in articles published in major medical journals. Our second objective is the investigation of other structural elements, i.e., the number of tables, figures and references and the availability of supplementary material. We analyzed data from randomly selected original articles published in the years 2005, 2010 and 2015 in the journals The BMJ, The Journal of the American Medical Association, The Lancet, The New England Journal of Medicine and PLOS Medicine. Per journal and year, 30 articles were investigated. Random effects meta-analyses were performed to provide pooled estimates. The effect of time was analyzed by linear mixed models. All articles followed the IMRAD structure. The number of pars. per section increased for all journals over time, with a total increase of 1.08 pars. every two years (95% confidence interval (CI): 0.70–1.46). The largest increase was observed for the methods section (0.29 pars. per year; 95% CI: 0.19–0.39). PLOS Medicine had the highest number of pars. The number of tables did not change, but the number of figures and references increased slightly. To increase the likelihood of publication, authors should follow not only the standard IMRAD structure but also the general layout of the target journal. Supplementary material has become standard. If no journal-specific information is available, authors should use 3/10/9/8 pars. for the introduction/methods/results/discussion sections.
Citation: Heßler N, Rottmann M, Ziegler A (2020) Empirical analysis of the text structure of original research articles in medical journals. PLoS ONE 15(10): e0240288. https://doi.org/10.1371/journal.pone.0240288
Editor: Omid Beiki, Karolinska Institutet, SWEDEN
Received: May 26, 2020; Accepted: September 24, 2020; Published: October 8, 2020
Copyright: © 2020 Heßler et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the manuscript and its Supporting Information files.
Funding: The author(s) received no specific funding for this work.
Competing interests: The authors have read the journal’s policy and the authors of this manuscript have the following competing interests: Author MR became a paid employee of Dr. Hüsing Aktuar GmbH after the completion of this study. The present position of MR does not alter our adherence to PLOS ONE policies on sharing data and materials. There are no patents, products in development or marketed products to declare. Additionally, AZ is a licensed Tim Albert trainer and has held several courses in the past based on Albert’s concept and intends to continue these courses.
Introduction
Publish and prosper is one of the sayings scientists often encounter. Working in the field of research means constant publishing. The competition among scientists is strong, and journal space is limited. However, the world of publication can be a black box, and writing is challenging for many. Concurrently, the art of scientific communication is rarely taught, and scientific writing differs fundamentally from literary writing. Only a few authors focus on the process of writing. One such procedural system was developed by Albert, who demystified article writing using a 10-step process. In one of these steps, Albert’s concept looks at a sales model for the article. One key element of this sales approach to publication is that the target audience is the journal editor because the editor is the gatekeeper who decides on acceptance or rejection of an article, including early rejection, sometimes called desk rejection. As always, the first impression of a manuscript is the best impression. Thrower has provided reasons why he accepted [5] or rejected [6] manuscripts for publication. One important aspect is the presentation of the material. It is clear that authors should follow the author instructions of the target journal. The referencing pattern also plays a role, and “a well formatted paper makes the editor happy as he need not to do anything further from his side as far as presentation is concerned”. Reasons for early rejections of articles have been given in an editorial by Froese and Bader [8], who stated that manuscripts not following the structure of a standard article are likely to get rejected. In this context, we need to look at the question of what the typical structure of an article is.
The typical structure for the main text of an article in a medical journal is the introduction, methods, results and discussion (IMRAD) structure. Although it was recommended as the standard structure at the beginning of the 20th century, it became the primary structure only after 1965, and it was the only article structure in the 1980s. The major change in the mid-1960s is most likely related to a conference of medical editors held at the 19th General Assembly of the World Medical Association, during which Hill [11] suggested that research articles should answer four questions: why did you start, what did you do, what answer did you get and what does it mean anyway. Little research has been done on details of these four sections of an article [2, 12]. Albert [2] performed an analysis of 50 consecutive articles published after June 1, 1997, from each of the 6 journals Archives of Disease in Childhood (Arch Dis Child), Journal of Pediatrics (J Pediatr), Pediatric Research (Pediatr Res), The BMJ (BMJ), The Lancet (Lancet) and The New England Journal of Medicine (NEJM). He reported the means and standard deviations (SD) of the number of paragraphs per section as given in Table 1. From this, Albert [2] derived the “typical journal structure”, which he calls 2/7/7/6 and which means that 2 paragraphs should be used for the introduction, 7 for methods, 7 for results and 6 for the discussion. Albert [3] was less rigorous and suggested a 2-3/4-6/4-6/5-8 structure. Other recommendations based on Albert’s empirical work have also been proposed, such as 2/5/5/4 for shorter articles. Soares de Araújo [12] considered original articles published in the January 2012 and 2013 issues of Arquivos Brasileiros de Cardiologia and the first two issues of the Journal of the American College of Cardiology from the same years and recommended 3/6-9/4-9/≤10 paragraphs.
If we meta-analyze the data provided by Albert per section using the DerSimonian and Laird [14] approach, the recommended structure is 3/8/7/7 (Table 1), i.e., 25 paragraphs in total. Albert’s work is, however, more than 20 years old and has not been updated.
The first aim of our article is to provide recent results on characteristics of the structure of original articles published in major medical journals. Since the reporting of studies has changed over the past 20 years due to the availability of reporting guidelines, such as the CONSORT [15] or STARD [16] statement, we hypothesize that the number of paragraphs in journal articles has increased over time, especially in the methods section. To this end, we analyze data from randomly selected original articles published in the years 2005, 2010 and 2015 in the journals BMJ, The Journal of the American Medical Association (JAMA), Lancet, NEJM and PLOS Medicine (PLOS). We expect that PLOS, an electronic-only journal, has more paragraphs than the print journals BMJ, JAMA, Lancet and NEJM because there are, in principle, no page restrictions. The second aim of our work is to expand the statistics to other structural elements, i.e., the number of tables, figures and references as well as the availability of supplementary material. We expect that recently published articles are more likely to have supplementary material.
Materials and methods
In original articles published in the English-language medical journals BMJ, JAMA, Lancet, NEJM and PLOS in the years 2005, 2010 and 2015, we investigated the number of paragraphs per section. Per journal and year of publication, we randomly selected 30 articles, totaling 450 original research articles. We checked for the presence of the IMRAD structure, counted the number of tables, figures and references and checked whether supplementary material was available for an article. Data were extracted by M.R.
For each year and each journal, means and SD were calculated for continuous outcomes, and absolute and relative frequencies were reported for the availability of supplementary material. The DerSimonian and Laird [14] approach was used to perform random effects (RE) meta-analyses, which allow for variability in the number of paragraphs between journals and over time. These analyses were performed within each journal over the years, within each year over the journals, and for all years and all journals together. Pooled RE estimates and standard errors were calculated. The effect of time was investigated by linear mixed models with journal as RE and year as fixed effect (FE) for the number of paragraphs and the proportion of articles with supplementary material. Effect estimates and corresponding 95% CIs were reported. The specific hypothesis that PLOS has more paragraphs than the print journals was investigated with a linear mixed model with year as RE and journal as FE. Methodological details are provided in the S1 Appendix.
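As an illustration of the pooling step, the DerSimonian and Laird estimator can be sketched as follows. This is a minimal Python sketch under stated assumptions and not the R code used for the analyses: the function name `dersimonian_laird` and all input values are hypothetical, and each journal is assumed to contribute a mean number of paragraphs together with its standard error.

```python
import math
from typing import List, Tuple


def dersimonian_laird(means: List[float], ses: List[float]) -> Tuple[float, float, float]:
    """Pool estimates with the DerSimonian-Laird random effects method.

    Returns (pooled mean, standard error of the pooled mean, tau^2).
    """
    if len(means) != len(ses) or len(means) < 2:
        raise ValueError("need at least two estimates with matching standard errors")
    if any(se <= 0 for se in ses):
        raise ValueError("standard errors must be positive")

    variances = [se ** 2 for se in ses]
    w = [1.0 / v for v in variances]                      # fixed effect weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, means)) / sum_w

    # Cochran's Q and the method-of-moments estimate of tau^2,
    # truncated at zero.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, means))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (len(means) - 1)) / c)

    # Random effects weights add tau^2 to each within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, means)) / sum(w_star)
    se_pooled = math.sqrt(1.0 / sum(w_star))
    return pooled, se_pooled, tau2


# Hypothetical means and standard errors for five journals
# (illustrative values only, not data from this study).
example_means = [9.0, 10.5, 11.0, 8.0, 12.0]
example_ses = [0.6, 0.7, 0.5, 0.6, 0.8]
pooled, se, tau2 = dersimonian_laird(example_means, example_ses)
lower, upper = pooled - 1.96 * se, pooled + 1.96 * se
print(f"pooled mean {pooled:.2f} (95% CI: {lower:.2f} to {upper:.2f}), tau^2 = {tau2:.2f}")
```

The interval in the usage example relies on a normal approximation with the factor 1.96, which is an assumption of this sketch.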
No adjustments were made for multiple testing, and the significance level was set to 0.05 for all analyses. All statistical analyses were done in R version 3.6.2. Data and analysis code in Markdown are available as supplementary material.
Results
A total of 450 original research articles from 5 medical journals were analyzed. The IMRAD structure was used in all of them. S1 Table displays the descriptive statistics for the number of paragraphs per section for each year and each journal. An increase in the number of paragraphs per section and in total can be seen for all journals over time. The increase is, however, small for JAMA and NEJM. An obvious increase in the methods section can be observed for BMJ and Lancet with an average of five and four additional paragraphs, respectively (Fig 1). The increase in the number of paragraphs was also visible in PLOS, with one, five, three and two additional paragraphs in the introduction, methods, results and discussion, respectively, in 2015 compared with 2005 (S1 Table and Fig 1).
Pooled means of the number of paragraphs (black squares) and 95% confidence intervals (CI, lines) are displayed for each journal for A) methods section in 2005, B) methods section in 2010, C) methods section in 2015 and D) the total number of paragraphs in 2015, respectively. The summary statistic (black diamond) was calculated using the random effects DerSimonian & Laird approach [14]. BMJ: The BMJ, JAMA: The Journal of the American Medical Association, NEJM: The New England Journal of Medicine, PLOS: PLOS Medicine.
While the number of figures and tables overall remained rather constant, the number of references increased from 2005 to both 2010 and 2015, with the largest increase in the number of references for BMJ and PLOS (S1 Table). The most striking change is seen for the number of original articles with supplementary material. For example, while no JAMA article had supplementary material in 2005, 97% (29 of 30) had supplementary material in 2015.
Table 2 shows article characteristics by year averaged over the 5 journals. The smallest and largest increases per year in the number of paragraphs were observed for the introduction (0.03 pars. per year, 95% CI: 0.00–0.06) and methods sections (0.29 pars. per year, 95% CI: 0.19–0.39), respectively. While the standard article structure was 3/9/9/8 in 2005, it increased to 3/11/10/8 in 2015. On average within 10 years, methods thus increased by two paragraphs and results by one paragraph. Overall, the total number of paragraphs increased by approximately one paragraph every two years (1.08 pars. per 2 years; 95% CI: 0.70–1.45). Considering all years and all journals, the pooled standard article structure is 3/10/9/8.
The number of tables did not change (-0.02 per year; 95% CI: -0.06–0.02), and the number of figures increased slightly (0.08 per year; 95% CI: 0.03–0.13). Averaged over the journals, 0.75 more references were observed per year (95% CI: 0.30–1.20). This increase is, however, primarily caused by BMJ and PLOS, while the number of references was similar over the years for JAMA, Lancet and NEJM.
During this period, the percentage of articles with supplementary material increased from 39% in 2005 to 93% in 2015. This confirms our hypothesis that the availability of supplementary material has increased over the years (p < 0.001).
Results per journal pooled over the years are displayed in Table 3 and S1 and S2 Figs. PLOS, the only online-only journal considered, had the largest number of paragraphs in the introduction, results and discussion sections. While PLOS did not have significantly more paragraphs in the methods section, it had approximately one additional paragraph each in the introduction and the discussion. On average, PLOS articles had 3.52 more paragraphs than articles in the other journals (95% CI: 1.60–5.43). The NEJM had the lowest number of paragraphs per article among the 5 journals considered. It is the only journal whose average number of paragraphs in the introduction is still below 3. Only BMJ and NEJM had fewer than 10 paragraphs on average for the methods, and NEJM had approximately three fewer paragraphs in the discussion compared with the other four medical journals considered. Articles published in NEJM also had the lowest number of tables and references. However, BMJ and Lancet had the lowest number of figures. Unexpectedly, the online-only journal PLOS did not have more tables than the print journals (0.10; 95% CI: -0.29–0.49), but it had 1.63 (95% CI: 1.14–2.13) more figures and 14.45 (95% CI: 9.85–19.05) more references than the print journals.
Discussion
All original articles from the 5 major medical journals considered followed the IMRAD structure. From 2005 to 2015, the total number of paragraphs increased by one every two years, and the largest increase was observed for the methods section. While the NEJM had a large number of paragraphs in the Albert analysis from 1996, it had the smallest number of paragraphs averaged over the years. As expected, PLOS had the highest number of paragraphs, and PLOS articles were approximately 3.5 paragraphs longer than articles in the print journals. PLOS also had the largest number of figures and references per article. Compared with 2005, it is now standard for all 5 investigated journals to have supplementary material.
Albert used the number of paragraphs instead of words as a measure for the text structure of a scientific article because paragraphs are a more manageable unit than words alone. Paragraphs keep the potential reader interested when they are written so that they “become in effect a series of inverted triangles”. This means that the first sentence of a paragraph is the key sentence or topic sentence. The following sentences should only support and elaborate on the topic sentence.
In addition, paragraphs are related to items from publication recommendations. The EQUATOR network (EQUATOR stands for Enhancing the QUAlity and Transparency Of health Research) provides reporting guidelines and checklists for a wide variety of research and study types to help authors make their research transparent. Several journals require authors to fully adhere to these guidelines. Some of the guidelines have been updated over time due to practical experience. For example, the first version of the CONSORT (Consolidated Standards of Reporting Trials) statement was published in 1996 and included 21 items. The last revision, termed CONSORT 2010, provides a checklist with 25 items, of which 12 are further divided into a and b, making 37 items in total. Similarly, the STARD (Standards for Reporting of Diagnostic Accuracy Studies) statement for the reporting of diagnostic studies has also increased in the number of items from its first version, which was published in 2004, to its current edition from 2015. While the first STARD statement has 25 items, the STARD 2015 statement has 30 items, 4 of which have an a/b division, making 34 in total. An increase in the number of paragraphs can therefore be expected if a publication recommendation is updated and the number of items increased.
Based on his empirical work from almost 25 years ago, Albert recommended the use of 2/7/7/6 paragraphs for the introduction/methods/results/discussion sections. Other authors were less strict in their recommendations [3, 12]. We provide an update on the standard structure of an article, and we now generally recommend 3/10/9/8 paragraphs for the four main sections. The total number of paragraphs is thus approximately 30. This directly leads to a recommendation for the number of words per paragraph. For example, JAMA has a word limit of 3000. With 30 paragraphs, this gives approximately 100 words per paragraph in the main text. Soares de Araújo [12] recommended 130 words for cardiovascular journals because these have an upper word count limit of approximately 4000 words after subtraction of references.
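The arithmetic behind the words-per-paragraph recommendation can be sketched as follows. This is a minimal Python illustration; the names `RECOMMENDED_STRUCTURE`, `words_per_paragraph` and `section_word_budget` are hypothetical, and the per-section budget assumes that all paragraphs have the same length, which is a simplification of this sketch rather than a claim of the study.

```python
from typing import Dict

# Recommended number of paragraphs per section (the 3/10/9/8 structure).
RECOMMENDED_STRUCTURE: Dict[str, int] = {
    "introduction": 3,
    "methods": 10,
    "results": 9,
    "discussion": 8,
}


def words_per_paragraph(word_limit: int, paragraphs: int) -> float:
    """Average number of words available per paragraph."""
    if paragraphs <= 0:
        raise ValueError("the number of paragraphs must be positive")
    return word_limit / paragraphs


def section_word_budget(word_limit: int,
                        structure: Dict[str, int] = RECOMMENDED_STRUCTURE) -> Dict[str, float]:
    """Split a word limit over sections, assuming equally long paragraphs."""
    per_paragraph = words_per_paragraph(word_limit, sum(structure.values()))
    return {section: per_paragraph * n for section, n in structure.items()}


# Example with the JAMA word limit of 3000 mentioned in the text.
total_paragraphs = sum(RECOMMENDED_STRUCTURE.values())
print(total_paragraphs, words_per_paragraph(3000, total_paragraphs))
print(section_word_budget(3000))
```

A journal-specific structure can be passed as the `structure` argument instead of the default when the target journal deviates from 3/10/9/8.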
Suggestions for the topics of the different paragraphs have been provided by Albert and Soares de Araújo [12]. Their suggestions for topics differ substantially, however. Specifically, the first paragraph of the discussion is a brief summary of the main findings according to Albert, while the study problem should be discussed again in the first paragraph of the discussion according to Soares de Araújo. In our opinion, the repetition of the study problem is superfluous, and we agree with Albert’s general concept of article writing. He suggested 6 topics for the discussion section: 1) summary of main findings, 2) weaknesses of the study, 3) strengths of the study, 4) how it fits in the literature, 5) implications for future research and 6) implications for policy/treatment. If 8 paragraphs are used in the discussion, the summary of the main findings and its fit to the literature are generally expanded by one additional paragraph each, but the topics are not changed per se. Instead, it seems very natural that the fit in the published literature is elaborated in more detail, e.g., by integrating related topics, systematic reviews and judging original studies, such as randomized controlled trials. Writers and scientists have also presented concepts of how the complex problem of writing a whole journal article can be divided into smaller problems. One such approach was suggested by O’Connor and Holmquist [17], and another concept was developed by Albert. The ideas presented differ substantially, and this is best illustrated by the “summary statements” or the “message”. Albert never tires of emphasizing the importance of the message of the paper. The message should have 12 words, should have a verb, should not be a question and, in our experience, should not include an “and”. The paper is then streamlined along this single message.
This is in contrast to the concept of O’Connor and Holmquist [17] because a manuscript cannot be written according to a single message if there are up to three “conclusions summarizing the major contributions of the manuscript to the scientific community”.
In our study, we did not investigate the effect of the article structure on citation frequency. Several studies observed relationships between the length of the title and citation frequency [18–24]. Shorter titles have a higher citation frequency, and the conclusion is: keep the title short. Linguistic complexity of title, abstract and main text has also been studied in relation to citation frequency [20, 25–28]. While no association was found between citation frequency and linguistic complexity of the main text, top-ranked journals use simple language in title and abstract. However, scientific articles are generally difficult to read [29, 30]. For example, the Gunning fog index, which looks at sentence length and word complexity, is approximately 17 for scientific articles [32–34]. Although text complexity is reduced after peer review, texts are still substantially more complex than daily newspaper articles, which have a fog index of 12. In contrast, insurance policies are even more complex, with a fog index of approximately 20.
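For readers unfamiliar with the Gunning fog index, its usual definition is 0.4 times the sum of the average sentence length (words per sentence) and the percentage of complex words, where complex words are commonly taken to be words with three or more syllables. A minimal Python sketch of this formula is given below; the function name `gunning_fog` and the example counts are hypothetical, and the counting of sentences and syllables, which differs between tools, is deliberately left out.

```python
def gunning_fog(words: int, sentences: int, complex_words: int) -> float:
    """Gunning fog index computed from counts taken from a text sample.

    words: total number of words in the sample
    sentences: total number of sentences in the sample
    complex_words: number of words with three or more syllables
    """
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    if not 0 <= complex_words <= words:
        raise ValueError("complex_words must be between 0 and words")
    average_sentence_length = words / sentences
    percent_complex = 100.0 * complex_words / words
    return 0.4 * (average_sentence_length + percent_complex)


# Hypothetical sample: 100 words in 5 sentences, 10 of them complex words.
print(round(gunning_fog(words=100, sentences=5, complex_words=10), 1))
```

Longer sentences and a larger share of polysyllabic words both raise the index, which is how the index reflects sentence length and word complexity.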
Other authors also investigated the association between citation frequency, page length, the number of references and author recognition [36–38]. Longer articles may be review articles, as may articles with a higher number of references. For these reasons, it can be expected that articles with more pages and more references have higher citation numbers.
A reviewer of our work has pointed to the importance of investigating the influence of media paying possibly more attention to journal articles in the past years and changes in journal strategies and instructions for authors. These aspects should be assessed in future research.
In conclusion, authors should not only use the standard IMRAD structure to increase the likelihood of publication of their work. They should also take into account the general layout of their target journal. If a journal-specific structure is not at hand, authors should use 3, 10, 9 and 8 paragraphs for the introduction, methods, results and discussion sections, respectively. Supplementary material has become standard and should be used when deemed appropriate. Authors should be aware that print journals might differ in their structure from online-only journals because of the absence of page limits for online articles. Finally, and most importantly, the instructions to authors of the target journal must be met.
Supporting information
S1 Fig. Forest plots from meta-analyses for introduction, results and discussion section.
Pooled means of the number of paragraphs (black squares) and 95% confidence intervals (CI, lines) are displayed for each journal for A) introduction section in 2005, B) introduction section in 2015, C) results section in 2005, D) results section in 2015, E) discussion section in 2005 and F) discussion section in 2015, respectively. The summary statistic (black diamond) was calculated using the random effects DerSimonian & Laird approach [14]. BMJ: The BMJ, JAMA: The Journal of the American Medical Association, NEJM: The New England Journal of Medicine, PLOS: PLOS Medicine.
S2 Fig. Forest plots from meta-analyses for tables, figures and references.
Pooled means of the number of tables, figures and references (black squares) and 95% confidence intervals (CI, lines) are displayed for each journal for A) tables in 2005, B) tables in 2015, C) figures in 2005, D) figures in 2015, E) references in 2005 and F) references in 2015, respectively. The summary statistic (black diamond) was calculated using the random effects DerSimonian & Laird approach [14]. BMJ: The BMJ, JAMA: The Journal of the American Medical Association, NEJM: The New England Journal of Medicine, PLOS: PLOS Medicine.
S1 Table. Descriptive statistics per journal and per year.
Means and standard deviations for the number of paragraphs per section, the total number of paragraphs (Total), the number of tables, figures and references by journal and year of publication. The last column provides absolute frequencies and relative frequencies (in parentheses) for the availability of supplementary material (Suppl). BMJ: The BMJ, JAMA: The Journal of the American Medical Association, NEJM: The New England Journal of Medicine, PLOS: PLOS Medicine.
References
- 1. Hochberg M. An Editor’s Guide to Writing and Publishing Science. Oxford: Oxford University Press; 2019.
- 2. Albert T. Winning the Publications Game: The Smart Way to Write Your Paper and Get It Published. 4th ed. Boca Raton: CRC Press; 2016. 140 p.
- 3. Albert T. Publish and prosper. BMJ. 1996;313:S2–7070.
- 4. Nahata A. Reply to: getting your paper published: what do editors like and not like? 2013 [https://www.researchgate.net/post/Getting_your_paper_published_what_do_editors_like_and_not_like].
- 5. Thrower P. 8 reasons I accepted your article: Elsevier Connect; 2012 [https://www.elsevier.com/connect/8-reasons-i-accepted-your-article].
- 6. Thrower P. Eight reasons I rejected your article: Elsevier Connect; 2012 [https://www.elsevier.com/connect/8-reasons-i-rejected-your-article].
- 7. Körner AM. Guide to Publishing a Scientific Paper. London: Routledge; 2004.
- 8. Froese FJ, Bader K. Surviving the desk-review. Asian Business & Management. 2019;18(1):1–5.
- 9. Sollaci LB, Pereira MG. The introduction, methods, results, and discussion (IMRAD) structure: a fifty-year survey. J Med Libr Assoc. 2004;92(3):364–7. pmid:15243643
- 10. Brain L. Structure of the scientific paper. Br Med J. 1965;2(5466):868–9. pmid:5827805
- 11. Hill AB. The reasons for writing. BMJ. 1965;2:870. pmid:20790709
- 12. Soares de Araújo CG. Detailing the writing of scientific manuscripts: 25–30 paragraphs. Arq Bras Cardiol. 2014;102(2):e21–3. pmid:24676380
- 13. Winnington A. Publish and prosper: how to write a scientific research article for an academic journal. NZ Stud J. 2005(2):27–9.
- 14. DerSimonian R, Laird N. Meta-analysis in clinical trials. Controlled Clinical Trials. 1986;7(3):177–88. pmid:3802833
- 15. Schulz KF, Altman DG, Moher D, for the CONSORT Group. CONSORT 2010 statement: updated guidelines for reporting parallel group randomised trials. PLoS Med. 2010;7(3):e1000251. pmid:20352064
- 16. Bossuyt PM, Reitsma JB, Bruns DE, Gatsonis CA, Glasziou PP, Irwig L, et al. STARD 2015: An Updated List of Essential Items for Reporting Diagnostic Accuracy Studies. Clin Chem. 2015;61(12):1446–52. pmid:26510957
- 17. O’Connor TR, Holmquist GP. Algorithm for writing a scientific manuscript. Biochem Mol Biol Educ. 2009;37(6):344–8. pmid:21567769
- 18. Hudson J. An analysis of the titles of papers submitted to the UK REF in 2014: authors, disciplines, and stylistic details. Scientometrics. 2016;109(2):871–89. pmid:27795594
- 19. Habibzadeh F, Yadollahie M. Are shorter article titles more attractive for citations? Cross-sectional study of 22 scientific journals. Croat Med J. 2010;51(2):165–70. pmid:20401960
- 20. Jacques TS, Sebire NJ. The impact of article titles on citation hits: an analysis of general and specialist medical journals. JRSM Short Rep. 2010;1(1):2. pmid:21103094
- 21. Schreuder MF, Oosterveld MJS. Who ever said size doesn’t matter? The association between journal title length and impact factor. NDT Plus. 2008;1(2):126–7. pmid:28657050
- 22. Paiva CE, Lima JP, Paiva BS. Articles with short titles describing the results are cited more often. Clinics (Sao Paulo). 2012;67(5):509–13.
- 23. Falagas ME, Zarkali A, Karageorgopoulos DE, Bardakas V, Mavros MN. The impact of article length on the number of future citations: a bibliometric analysis of general medicine journals. PLoS One. 2013;8(2):e49476. pmid:23405060
- 24. Letchford A, Moat HS, Preis T. The advantage of short paper titles. R Soc Open Sci. 2015;2(8):150266. pmid:26361556
- 25. Goodman NW. Survey of active verbs in the titles of clinical trial reports. BMJ. 2000;320(7239):914–5. pmid:10742001
- 26. Fox CW, Burns CS. The relationship between manuscript title structure and success: editorial decisions and citation performance for an ecological journal. Ecol Evol. 2015;5(10):1970–80. pmid:26045949
- 27. Whissell C. Linguistic complexity of abstracts and titles in highly cited journals. Percept Mot Skills. 1999;88(1):76–86.
- 28. Lu C, Bu Y, Dong X, Wang J, Ding Y, Larivière V, et al. Analyzing linguistic complexity and scientific impact. J Informetr. 2019;13(3):817–29.
- 29. Albert T. Why are medical journals so badly written? Med Educat. 2004;38(1):6–8.
- 30. Hall JC. The readability of original articles in surgical journals. ANZ J Surg. 2006;76(1–2):68–70. pmid:16483300
- 31. DuBay WH. The Principles of Readability. Costa Mesa: Impact Information; 2004.
- 32. Weeks WB, Wallace AE. Readability of British and American medical prose at the start of the 21st century. BMJ. 2002;325(7378):1451–2. pmid:12493663
- 33. Yeung AWK, Goto TK, Leung WK. Readability of the 100 Most-Cited Neuroimaging Papers Assessed by Common Readability Formulae. Front Hum Neurosci. 2018;12:308. pmid:30158861
- 34. Roberts JC, Fletcher RH, Fletcher SW. Effects of peer review and editing on the readability of articles published in Annals of Internal Medicine. JAMA. 1994;272(2):119–21. pmid:8015120
- 35. Meadows J. Fog—thick in place. New Scientist. 1979;April 26, 1979:292.
- 36. Grover V, Raman R, Stubblefield A. What affects citation counts in MIS research articles? An empirical investigation. Commun Assoc Inf Syst. 2014;34(1):1435–56.
- 37. Hwang A, Arbaugh JB, Bento RF, Asarta CJ, Fornaciari CJ. What causes a business and management education article to be cited: article, author, or journal? Int J Manag Educ. 2019;17(1):139–50.
- 38. Didegah F, Thelwall M. Which factors help authors produce the highest impact research? Collaboration, journal and document properties. J Informetr. 2013;7(4):861–73.