Volumetric Breast Density Estimation from Full-Field Digital Mammograms: A Validation Study

  • Albert Gubern-Mérida,

    Affiliations Department of Computer Architecture and Technology, University of Girona, Girona, Spain, Department of Radiology, Radboud University Medical Center, Nijmegen, The Netherlands

  • Michiel Kallenberg,

    Affiliation Department of Radiology, Radboud University Medical Center, Nijmegen, The Netherlands

  • Bram Platel,

    Affiliation Department of Radiology, Radboud University Medical Center, Nijmegen, The Netherlands

  • Ritse M. Mann,

    Affiliation Department of Radiology, Radboud University Medical Center, Nijmegen, The Netherlands

  • Robert Martí,

    Affiliation Department of Computer Architecture and Technology, University of Girona, Girona, Spain

  • Nico Karssemeijer

    Affiliation Department of Radiology, Radboud University Medical Center, Nijmegen, The Netherlands



To objectively evaluate automatic volumetric breast density assessment in Full-Field Digital Mammograms (FFDM) using measurements obtained from breast Magnetic Resonance Imaging (MRI).

Material and Methods

A commercially available method for volumetric breast density estimation on FFDM is evaluated by comparing volume estimates obtained from 186 FFDM exams including mediolateral oblique (MLO) and cranial-caudal (CC) views to objective reference standard measurements obtained from MRI.


Volumetric measurements obtained from FFDM show high correlation with MRI data. Pearson’s correlation coefficients of 0.93, 0.97 and 0.85 were obtained for volumetric breast density, breast volume and fibroglandular tissue volume, respectively.


Accurate volumetric breast density assessment is feasible in Full-Field Digital Mammograms and has potential to be used in objective breast cancer risk models and personalized screening.

Introduction


Breast density has been identified as an important risk factor for developing breast cancer. Studies have reported that the risk of breast cancer in women with high breast density is four to six times as large as in women with low breast density [1]–[3]. Additionally, the sensitivity of mammography screening is severely impaired in women with high density, since heterogeneous or extremely dense tissue patterns may obscure suspicious lesions. For this reason, the risk of missing cancers in screening programs increases with density [4]–[6]. Personalization of screening protocols, involving adjunct imaging modalities for women who are currently not adequately screened, has been suggested to circumvent this problem. Such protocols should include risk assessment based on models that incorporate family history and breast density biomarkers [7].

To develop such models, it is important to objectively measure breast density. Most studies to date have been performed using subjective visual measurements based on the 4-class Breast Imaging Reporting and Data System (BI-RADS) [8], which is used in current clinical practice, or on a visual thresholding technique using dedicated software, such as Cumulus [9]. Both are essentially 2D measurements that determine the area of dense tissue projected in mammograms. Fully automatic methods for area-based breast density measurements have been proposed to remove this subjectivity [10]–[13]. However, area-based measurements do not take the thickness of dense tissue into account. This is a limitation, since it is biologically more plausible that breast cancer risk is related to the volume of dense tissue in the breast rather than to its projection [3], [14], [15].

To overcome this limitation, methods for volumetric breast density estimation from mammograms have been proposed [16]–[21]. These methods rely on a physics-based model of the X-ray image acquisition process and assume that the breast consists of two types of tissue: fat and parenchyma. Given the X-ray attenuation of these tissues, the tissue composition at a given pixel can be computed. Initially, researchers struggled to apply this approach successfully to digitized film mammograms. However, with the introduction of Full-Field Digital Mammograms (FFDM), the development of robust methods and commercial products became possible. These can be applied to raw (unprocessed) FFDM data, which is made available by all modality manufacturers. Unfortunately, raw data is often not archived in clinical practice.
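The two-tissue idea can be illustrated with a minimal sketch. It assumes a monoenergetic beam, no scatter, and raw pixel values proportional to detected intensity; these are simplifications made here for illustration and are not the full models used by the cited methods. The function names and the attenuation coefficients passed to them are hypothetical.

```python
import math


def dense_thickness_mm(pixel_value, fat_reference_value, mu_fat, mu_dense):
    """Thickness (mm) of dense tissue at one pixel under a two-tissue model.

    Illustrative assumptions: monoenergetic beam, no scatter, pixel values
    proportional to detected intensity. With total breast thickness H and
    dense thickness h_d:
        P     = P0 * exp(-mu_fat * (H - h_d) - mu_dense * h_d)
        P_fat = P0 * exp(-mu_fat * H)        (pixel containing only fat)
    so that ln(P_fat / P) = (mu_dense - mu_fat) * h_d.
    """
    if pixel_value <= 0 or fat_reference_value <= 0:
        raise ValueError("pixel values must be positive")
    if mu_dense <= mu_fat:
        raise ValueError("dense tissue must attenuate more than fat")
    return math.log(fat_reference_value / pixel_value) / (mu_dense - mu_fat)


def dense_volume_cm3(dense_thickness_map_mm, pixel_area_mm2):
    """Sum per-pixel dense thickness times pixel area, converted to cm^3."""
    total_mm3 = sum(
        h * pixel_area_mm2 for row in dense_thickness_map_mm for h in row
    )
    return total_mm3 / 1000.0
```

In this simplified form the unattenuated intensity cancels out, which is why an internal reference pixel of pure fat is sufficient for calibration.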

The performance of volumetric breast density estimation methods has been evaluated in several studies. To determine robustness and consistency, breast density estimates have been compared between the left and right breasts, and between mediolateral oblique (MLO) and cranial-caudal (CC) exposures of the same breast [16], [17]. One would expect to find similar values in CC and MLO views, and in regular cases without abnormalities breast density in the left and right breast should be highly correlated. Other studies compared volumetric estimates to BI-RADS density scoring [22], [23]. However, consistency checks of this kind may not reveal systematic errors, and subjective BI-RADS scores are coarse and inaccurate by nature, so they are only useful for detecting large errors of the automated methods. Comparison of breast density estimates from FFDM to reference standard measurements obtained from three-dimensional imaging modalities, such as Magnetic Resonance Imaging (MRI) and Computed Tomography (CT), is arguably the most objective and complete validation method [17], [19], [24], [25]. The volume of dense breast tissue can be derived accurately from MR and CT images, as these are 3D acquisitions and no projection is involved. However, quantifying the volume of dense breast tissue by manual segmentation is time consuming because it requires segmentation of three-dimensional data. For this reason we use computer algorithms to obtain breast density measurements.

In this paper, we evaluate a method for obtaining volumetric breast tissue estimates from digital mammograms [17], [19]. We specifically studied the performance of the method in determining fibroglandular tissue volume, breast volume, and volumetric breast density by comparing its results to volume estimates obtained from breast MRI data.

Materials and Methods


Ethics Statement: According to the Dutch Medical Research Involving Human Subjects Act (WMO), retrospective studies using only patient records do not require a formal medical ethics review and informed consent is not needed. The need for signed informed consent was waived by the Independent Review Board (IRB). This was confirmed with the local medical ethical committee. The presented study complies with the Dutch Data Protection Authority requirements on the use of patient data.

In the Radboud University Nijmegen Medical Centre, breast MRI and mammography are used for screening of women with high familial or genetic risk. We included studies for which breast MRI data and FFDM were available with a time interval between these exams of less than two months. We obtained 250 MRI volumes and 928 MLO and CC images from FFDM exams from 250 studies (132 different women). The mean time between MRI and FFDM acquisitions was six days. CC views were not available in some cases. All exams were performed between December 2000 and December 2011. The age of the screened women ranged from 24 to 77 years, and was 46.5±11.10 years on average.

The digital mammograms used in the study were acquired on a GE Senographe 2000D or on a GE Senographe DS using standard clinical settings, including the use of an anti-scatter grid. Breast MRI examinations were performed on 1.5 or 3 Tesla scanners (Magnetom Vision, Magnetom Avanto and Magnetom Trio, Siemens) with a dedicated breast coil (CP Breast Array, Siemens). In this study we used pre-contrast T1-weighted MR volumes.

Breast Density Quantification

In this study, volumetric breast density, breast volume and fibroglandular volume estimates were obtained from FFDM and MRI data. Volumetric breast density refers to the percentage of breast density, computed by dividing the fibroglandular tissue volume by breast volume.
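Written out as a short sketch, the definition above is a ratio of two volumes expressed as a percentage. The function name and the choice of cubic centimetres are illustrative assumptions.

```python
def volumetric_breast_density(fibroglandular_volume_cm3, breast_volume_cm3):
    """Percentage of the breast volume occupied by fibroglandular tissue."""
    if breast_volume_cm3 <= 0:
        raise ValueError("breast volume must be positive")
    if not 0 <= fibroglandular_volume_cm3 <= breast_volume_cm3:
        raise ValueError("fibroglandular volume must lie within the breast volume")
    return 100.0 * fibroglandular_volume_cm3 / breast_volume_cm3
```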

Volumetric estimates from 250 FFDM studies were obtained using Volpara 1.4.3 (Mātakina, Wellington, New Zealand), which is FDA-approved, fully automated software to estimate volumetric breast density. The Volpara method is an extension of the algorithm presented in [17]. In particular, it incorporates a more detailed physics model including scatter components as described in [18], and it uses a more advanced method to determine a reference region of fatty tissue. This reference region is used for calibration, and allows computation of fibroglandular tissue thickness at every pixel in the image. Breast volume is determined using a geometric model in which the periphery of the compressed breast is modeled by semi-circular cross sections, using the breast thickness measurement provided by the acquisition system in the image header.

Volumetric measurements from MRI were obtained using a multi-probabilistic atlas-based segmentation method based on [26], [27]. In short, the breast MRI segmentation method first corrects the bias field and normalizes signal intensities among patients. Secondly, probabilistic atlases, which capture the anatomic variation of the pectoral muscle and chest wall, are used to segment the breast. A probabilistic atlas is a volume that contains the complete spatial distribution of probabilities of voxels to belong to one or more organs [27]. Finally, the fibroglandular tissue is segmented in each breast independently using automatic thresholding. In this work, this method was used to automatically segment breast and fibroglandular tissue in the 250 MRI studies. A radiologist with expertise in breast imaging carefully reviewed all slices of the segmentations and approved the segmentations of 186 (74.4%) MRI studies as suitable for use as a reference standard for validation of FFDM density measurements. The other 64 (25.6%) studies were excluded from the study. In 5 of the excluded cases, the field of view did not entirely cover the breast. In the remaining excluded cases, the main reason for MRI segmentation failure was the presence of artifacts or bias field remaining after correction. These signal intensity distortions negatively affected the segmentation process.


The validation process is represented in Fig. 1. The Volpara method was validated on 186 FFDM exams including 680 mammographic views. The Pearson’s correlation coefficients between volumetric measures obtained from FFDM and volumetric measures obtained from MRI were calculated per breast and per study. The volumetric estimations per breast from FFDM were averaged over available measures of CC and MLO views for each breast independently. Measures per study were computed by averaging right and left breast volumetric estimates. Because of the log-normal distribution of the data, correlation coefficients were computed after converting the measurements using the natural logarithmic transform [28].
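The aggregation and correlation steps described above can be sketched as follows. This is a minimal illustration and not the analysis code used in the study; the tuple layout `(study_id, side, view, value)` and all function names are assumptions introduced here.

```python
import math
from collections import defaultdict


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def per_breast_estimates(view_measurements):
    """Average the available CC and MLO values for each breast.

    view_measurements: iterable of (study_id, side, view, value) tuples
    (hypothetical layout). Returns {(study_id, side): mean value}.
    """
    grouped = defaultdict(list)
    for study_id, side, _view, value in view_measurements:
        grouped[(study_id, side)].append(value)
    return {key: mean(vals) for key, vals in grouped.items()}


def per_study_estimates(breast_estimates):
    """Average the available left and right breast estimates of each study."""
    grouped = defaultdict(list)
    for (study_id, _side), value in breast_estimates.items():
        grouped[study_id].append(value)
    return {study_id: mean(vals) for study_id, vals in grouped.items()}


def log_pearson(x, y):
    """Pearson correlation computed on natural-log-transformed values."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    mx, my = mean(lx), mean(ly)
    cov = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    var_x = sum((a - mx) ** 2 for a in lx)
    var_y = sum((b - my) ** 2 for b in ly)
    return cov / math.sqrt(var_x * var_y)
```

Because the logarithm is taken before correlating, all inputs must be strictly positive, which holds for volumes and percentage densities of real breasts.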

Scatter plots are used to visualize the comparison between breast volumetric estimations. Volpara Density Grade (VDG) thresholds are also shown for volumetric breast density estimates obtained from FFDM. The VDG is a grading system that maps the percent density output of Volpara into four categories similar to the BI-RADS density score. The ranges of the percentage of dense tissue for VDG 1, 2, 3 and 4 are 0 − 4.5%, 4.5 − 7.5%, 7.5 − 15.5% and 15.5% and up, respectively [29].
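The grade ranges quoted above translate into a simple lookup, sketched below. The function name is hypothetical, and assigning a value that falls exactly on a threshold to the higher grade is an assumption; the ranges as stated do not specify which side of a boundary is inclusive.

```python
def volpara_density_grade(percent_density):
    """Map a volumetric percent density to a VDG category (1 to 4).

    Thresholds (4.5%, 7.5%, 15.5%) are taken from the text; the handling
    of values exactly on a threshold is an assumption.
    """
    if percent_density < 0:
        raise ValueError("percent density cannot be negative")
    if percent_density < 4.5:
        return 1
    if percent_density < 7.5:
        return 2
    if percent_density < 15.5:
        return 3
    return 4
```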

BI-RADS density scoring (1 to 4) was also performed on the 250 FFDM studies. Each study was classified as (1) fatty, (2) scattered dense, (3) heterogeneously dense or (4) extremely dense by a breast radiologist. Volumetric breast density measurements obtained from FFDM and MRI, computed per study, were compared to the BI-RADS category provided by the radiologist, and Spearman rank correlations were computed for each modality. Finally, to quantify the concordance between VDG and BI-RADS density score, the weighted kappa coefficient with quadratic weights was computed.
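For readers unfamiliar with the agreement statistic, a minimal sketch of Cohen's kappa with quadratic weights for two sets of four-category scores is given below. It follows the standard textbook definition and is not the software used in the study; the function name and argument layout are assumptions.

```python
def quadratic_weighted_kappa(ratings_a, ratings_b, categories=(1, 2, 3, 4)):
    """Cohen's kappa with quadratic disagreement weights.

    ratings_a, ratings_b: equally long sequences of category labels,
    for example VDG and BI-RADS scores of the same studies.
    """
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("ratings must be non-empty and of equal length")
    k = len(categories)
    index = {category: i for i, category in enumerate(categories)}
    n = len(ratings_a)

    # Observed confusion matrix (counts).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[index[a]][index[b]] += 1

    row_totals = [sum(row) for row in observed]
    col_totals = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = ((i - j) ** 2) / ((k - 1) ** 2)
            expected = row_totals[i] * col_totals[j] / n  # chance agreement
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * expected

    if weighted_expected == 0:
        raise ValueError("kappa is undefined when all ratings share one category")
    return 1.0 - weighted_observed / weighted_expected
```

With quadratic weights, a disagreement of two categories is penalized four times as heavily as a disagreement of one category, so systematic one-grade shifts between two raters reduce kappa less than scattered large disagreements.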


Results

Table 1 summarizes the results obtained in this validation study. Figure 2 shows the relation between percentage of volumetric breast density from mammograms and MRI data per breast (a) and per study (b). Correlations per breast and per study are 0.91 and 0.93, respectively. Figure 3 shows the relation between breast volume estimates from mammograms and MRI data. Per breast (a) and per study (b) correlations are 0.97 and 0.97, respectively. Additionally, Fig. 4 shows the relation between fibroglandular tissue volume estimates from mammograms and MRI data. Correlation per breast (a) is 0.84 and correlation per study (b) is 0.85.

Figure 2. Comparison of percentage of breast density from MRI and FFDMs (a) per breast (n = 353) and (b) per study (n = 186).

Each point is labeled with the BI-RADS score. VDG 1, 2, 3 and 4 refer to Volpara Density Grade breast density percentage ranges.

Figure 3. Comparison of breast volume obtained from MRI and FFDMs per (a) breast (n = 353) and (b) per study (n = 186).

Figure 4. Comparison of fibroglandular tissue volume obtained from MRI and FFDMs (a) per breast (n = 353) and (b) per study (n = 186).

Table 1. Summary of the dataset and the results obtained in this study.

Overall, high correlation between FFDM and MRI measurements is observed. However, results indicate that Volpara tends to underestimate breast density in dense breasts compared to MRI. Correlation drops for volumetric breast density measurements classified within the VDG 4 range.

Furthermore, Fig. 5 shows the association between volumetric breast density estimates and BI-RADS category. The estimates are obtained from FFDM in Fig. 5(a) and from MRI in Fig. 5(b). Spearman rank correlation coefficients are 0.79 and 0.78 for FFDM and MRI, respectively. The reported correlations are not statistically significantly different (p-value = 0.71, two-tailed z-test). Following the trend observed before, volumetric breast density estimates are larger when obtained from MRI than when computed on FFDM. The median estimates obtained with Volpara range from 5.66% in the lowest BI-RADS category to 26.69% in the top category. Median estimates obtained from MRI data range from 3.80% to 52.00%. Figure 6 shows the number of studies scored with BI-RADS categories 1, 2, 3 and 4 for (a) the initial dataset and (b) the dataset after excluding studies with poor MR segmentations. Finally, Table 2 shows the confusion matrix for the VDG obtained with the Volpara method versus the BI-RADS density score given by the breast radiologist. The weighted kappa statistic with quadratic weights was 0.40.

Figure 5. Association between volumetric breast density estimates per study and BI-RADS category.

Figure 6. Frequency of studies scored with BI-RADS categories 1, 2, 3 and 4 for (a) the complete dataset (n = 250) and (b) for the cases of the dataset with reference standard estimates (n = 186).

Table 2. Volpara Density Grade (VDG) versus BI-RADS density score from a breast radiologist.

Discussion


In this study we have presented a validation of Volpara 1.4.3 (Mātakina, Wellington, New Zealand), which is a commercially available method for assessing volumetric breast density on FFDM. Volpara has been evaluated on 186 studies including 680 mammographic views of 353 breasts in total. Volumetric estimates obtained from FFDM have been compared to objective reference standard measures computed from MRI. Volumetric breast density and breast tissue volume values obtained with Volpara present high correlation when compared to MRI measurements. To date, this is the largest validation study that compares volumetric breast density estimates from FFDM to reference standard measurements obtained from MRI, a 3D imaging modality.

In previous work, Wang et al. [25] used a dataset of 123 patients and also compared volumetric measurements obtained from FFDM to estimates obtained from MRI. Correlations for breast volume, fibroglandular tissue volume and volumetric breast density were 0.94, 0.62 and 0.71, respectively. We found higher correlation values than the ones reported in their work (R = 0.97, R = 0.85 and R = 0.93 for breast volume, fibroglandular tissue volume and volumetric breast density, respectively). Van Engeland et al. [17] also compared density estimates from FFDM to estimates from MRI in a small study including 22 patients, but only reported correlation between fibroglandular tissue volume from mammograms and from MRI data. The correlation was 0.97. In our study we found a lower correlation between fibroglandular tissue volume from FFDM and from MRI (R = 0.84). In previous studies, Volpara was also compared to semi-automatic area-based density measurements. High correlation between the volumetric breast density obtained with Volpara and area-based percentage density using Cumulus was found (R = 0.85) [23]. Care should be taken when comparing the correlation coefficients obtained in this work to the values reported in similar studies; these similar studies were performed on different datasets. In our study, the dataset was mostly composed of pre-menopausal women participating in a high-risk screening program. In this dataset, a different distribution of breast density may be expected when compared to breast density distributions of other datasets, since there are many factors that influence breast density (such as age and use of hormone replacement therapy). On the other hand, we may assume that the appearance of fibroglandular tissue itself in our study group is similar to that in other studies, since there is no evidence that breast density patterns in women in a high risk population differ from those in the general population.

Compared to volumetric measurements obtained from MRI, results show that Volpara tends to underestimate breast density in very dense breasts. This effect has also been observed in other methods for volumetric breast density estimation [17], [30]. Like Volpara, these methods are based on a physics-based image model and, to predict fibroglandular tissue thickness, use a set of pixels of the breast that belong to fatty tissue as an internal reference. The selection of the internal reference is more complex in dense breasts than in fatty breasts, which affects the calibration of fatty tissue attenuation and leads to breast density underestimation. However, the breast density underestimation in dense cases does not seem to affect the final VDG categorization. We observed that the cases with the largest negative difference between estimates from FFDM and MRI obtained a volumetric breast density estimate from FFDM greater than 15% and were classified as VDG 4.

Compared to BI-RADS density scores given by a breast radiologist, a clear association is observed, but agreement between VDG scores and BI-RADS density scores was low (weighted kappa coefficient with quadratic weights = 0.40). In general, VDG scores tend to be higher than the BI-RADS density scores. For instance, 70 studies that were scored with BI-RADS 2 obtained a VDG score of 3. The same trend was observed in 55 studies that were scored with BI-RADS 3, which obtained a VDG of 4. One should note that the VDG thresholds were set based on a US radiologist’s assessment of BI-RADS density. The low agreement and the perceived overestimation might be caused by the fact that the BI-RADS scoring in this work was done by a European radiologist. BI-RADS density grades have been suggested to be underestimated according to EU standards when compared to US radiologists [31]. However, further research is still required to investigate this effect, as only a single radiologist participated in the presented study.

Regarding the validation process, it was a limitation of our study that we had to exclude cases without reliable breast MRI fibroglandular tissue segmentation. However, we do not think this influenced our results because the causes for rejecting MRI cases were mostly not related to breast composition. Rejected cases were distributed evenly for the BI-RADS categories 1, 2 and 3. A higher percentage of rejected cases was observed on BI-RADS category 4 (8 of 15). This fact is explained by the difficulty of automatically segmenting fibroglandular tissue in breasts with high density in MRI. One could think that the exclusion of these BI-RADS category 4 cases increases the correlation coefficients between FFDM and MRI measurements. However, these rejected cases had minor influence on the complete dataset (3% of the total number of studies).

In conclusion, our study shows that it is feasible to obtain accurate measurements of absolute and relative volumes of dense breast tissue from full field digital mammograms. Availability of such measurements is crucial for the development of objective breast cancer risk models and may be used in the development of personalized screening protocols.

Acknowledgments


Special thanks to Ralph Highnam at Volpara for providing access to Volpara software.

Author Contributions

Conceived and designed the experiments: AGM MK BP RMM RM NK. Performed the experiments: AGM MK. Analyzed the data: AGM MK BP RMM RM NK. Contributed reagents/materials/analysis tools: AGM MK BP RMM RM NK. Wrote the paper: AGM MK BP RMM RM NK.

References


  1. Boyd NF, Martin LJ, Yaffe MJ, Minkin S (2011) Mammographic density and breast cancer risk: current understanding and future prospects. Breast Cancer Res 13: 223.
  2. McCormack VA, dos Santos Silva I (2006) Breast density and parenchymal patterns as markers of breast cancer risk: a meta-analysis. Cancer Epidemiol Biomarkers Prev 15: 1159–1169.
  3. Vachon CM, van Gils CH, Sellers TA, Ghosh K, Pruthi S, et al. (2007) Mammographic density, breast cancer risk and risk prediction. Breast Cancer Res 9: 217.
  4. Boyd NF, Guo H, Martin LJ, Sun L, Stone J, et al. (2007) Mammographic density and the risk and detection of breast cancer. N Engl J Med 356: 227–236.
  5. van Gils C, Otten JD, Verbeek AL, Hendriks JH (1998) Mammographic breast density and risk of breast cancer: masking bias or causality? Eur J Epidemiol 14: 315–320.
  6. Mandelson MT, Oestreicher N, Porter PL, White D, Finder CA, et al. (2000) Breast density as a predictor of mammographic detection: comparison of interval- and screen-detected cancers. J Natl Cancer Inst 92: 1081–1087.
  7. Schousboe JT, Kerlikowske K, Loh A, Cummings SR (2011) Personalizing mammography by breast density and other risk factors for breast cancer: analysis of health benefits and cost-effectiveness. Ann Intern Med 155: 10–20.
  8. D’Orsi CJ, Bassett LW, Berg WA, Feig SA, Jackson VP, et al. (2003) Breast Imaging Reporting and Data System (BI-RADS) Atlas, 4 edition, Reston, VA.
  9. Byng JW, Boyd NF, Fishell E, Jong RA, Yaffe MJ (1994) The quantitative analysis of mammographic densities. Phys Med Biol 39: 1629–1638.
  10. Li H, Giger ML, Olopade OI, Lan L (2007) Fractal analysis of mammographic parenchymal patterns in breast cancer risk assessment. Acad Radiol 14: 513–521.
  11. Nielsen M, Karemore G, Loog M, Raundahl J, Karssemeijer N, et al. (2011) A novel and automatic mammographic texture resemblance marker is an independent risk factor for breast cancer. Cancer Epidemiol 35: 381–387.
  12. Oliver A, Lladó X, Pérez E, Pont J, Denton ERE, et al. (2010) A statistical approach for breast density segmentation. J Digit Imaging 23: 527–537.
  13. Torrent A, Bardera A, Oliver A, Freixenet J, Boada I, et al. (2008) Breast Density Segmentation: A Comparison of Clustering and Region Based Techniques. In IWDM’08: Proceedings of the 9th international workshop on Digital Mammography, Berlin, Heidelberg: Springer-Verlag 9–16.
  14. Ng KH, Yip CH, Taib NAM (2012) Standardisation of clinical breast-density measurement. Lancet Oncol 13: 334–336.
  15. Shepherd JA, Kerlikowske K, Ma L, Duewer F, Fan B, et al. (2011) Volume of mammographic density and risk of breast cancer. Cancer Epidemiol Biomarkers Prev 20: 1473–1482.
  16. Alonzo-Proulx O, Jong R, Yaffe M (2012) Volumetric breast density characteristics as determined from digital mammograms. Phys Med Biol 57: 7443.
  17. van Engeland S, Snoeren PR, Huisman H, Boetes C, Karssemeijer N (2006) Volumetric breast density estimation from full-field digital mammograms. IEEE Trans Med Imaging 25: 273–282.
  18. Highnam R, Brady M (1999) Mammographic Image Analysis. Kluwer Academic Publishers.
  19. Highnam R, Brady M, Yaffe MJ, Karssemeijer N, Harvey J (2010) Robust Breast Composition Measurement - Volpara. In IWDM’10: Proceedings of the 10th international workshop on Digital Mammography, Berlin, Heidelberg: Springer-Verlag 342–349.
  20. Kaufhold J, Thomas JA, Eberhard JW, Galbo CE, Trotter DEG (2002) A calibration approach to glandular tissue composition estimation in digital mammography. Med Phys 29: 1867–1880.
  21. Pawluczyk O, Augustine BJ, Yaffe MJ, Rico D, Yang J, et al. (2003) A volumetric method for estimation of breast density on digitized screen-film mammograms. Med Phys 30: 352–364.
  22. Ciatto S, Bernardi D, Calabrese M, Durando M, Gentilini MA, et al. (2012) A first evaluation of breast radiological density assessment by QUANTRA software as compared to visual classification. Breast 21(4): 503–506.
  23. Jeffreys M, Harvey J, Highnam R (2010) Comparing a New Volumetric Breast Density Method (Volpara TM) to Cumulus. In IWDM’10: Proceedings of the 10th international workshop on Digital Mammography. Edited by Martí J, Berlin, Heidelberg: Springer-Verlag 408–413.
  24. Kontos D, Bakic P, Acciavatti RJ, Conant EF, Maidment ADA (2010) A comparative study of volumetric and area-based breast density estimation in digital mammography: results from a screening population. In IWDM’10: Proceedings of the 10th international workshop on Digital Mammography, Berlin, Heidelberg: Springer-Verlag 378–385.
  25. Wang J, Aziz A, Newitt D, Joe BN, Hylton N, et al. (2012) Comparison of Hologic’s Quantra volumetric assessment to MRI breast density. In Proceedings of the 11th International Conference on Breast Imaging, IWDM’12, Berlin, Heidelberg: Springer-Verlag 619–626.
  26. Gubern-Mérida A, Kallenberg M, Martí R, Karssemeijer N (2011) Fully automatic fibroglandular tissue segmentation in breast MRI: atlas-based approach. In MICCAI Workshop: Breast Image Analysis 73–80.
  27. Gubern-Mérida A, Kallenberg M, Martí R, Karssemeijer N (2012) Segmentation of the pectoral muscle in breast MRI using atlas-based approaches. In Medical Image Computing and Computer-Assisted Intervention. Volume 7511 of Lect Notes Comput Sci 371–378.
  28. Jeffrey M, Warren R, Highnam R, Smith GD (2006) Initial experiences of using an automated volumetric measure of breast density: the standard mammogram form. Br J Radiol 79: 378–382.
  29. Highnam R, Sauber N, S Destounis JH, McDonald D (2012) Breast Density into Clinical Practice. In IWDM’12: Proceedings of the 11th International Workshop on Breast Imaging, Volume 7361 of Lect Notes Comput Sci 466–473.
  30. Kallenberg MGJ, van Gils CH, Lokate M, den Heeten GJ, Karssemeijer N (2012) Effect of compression paddle tilt correction on volumetric breast density estimation. Phys Med Biol 57: 5155–5168.
  31. Sauber N, Chan A, Highnam R (2013) BI-RADS breast density classification – an international standard? In European Congress of Radiology: Scientific Exhibit.