Catalog of Regulatory Science Tools to Help Assess New Medical Devices
This regulatory science tool (RST) is a software suite that supports objective, task-based assessment of the imaging performance of digital breast tomosynthesis (DBT) systems using the CDMAM 4.0 phantom and a deep learning (DL) model observer. The RST includes a set of Python scripts, a pre-trained baseline model, and user manuals.
Technical Description
Tool Components: Python scripts, pre-trained PyTorch model, user manuals.
Tool Inputs: DBT scans of the CDMAM 4.0 phantom assembly (see below).
Tool Outputs: average proportion of correct responses (PC) in a four-alternative forced-choice (4-AFC) experiment, together with its standard error, as an objective measure of imaging performance.
Version Control: via GitHub.
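The PC output can be illustrated with a minimal sketch of 4-AFC scoring. It assumes the observer produces one score per candidate signal position and chooses the position with the highest score; the function name, array layout, and example values are hypothetical and are not taken from the RST code.

```python
import numpy as np

CHANCE_LEVEL = 0.25  # PC expected from guessing among four alternatives


def proportion_correct(scores, true_positions):
    """Return PC for a batch of 4-AFC trials.

    scores: (n_trials, 4) array, one observer score per candidate position
            (hypothetical layout, not taken from the RST code).
    true_positions: (n_trials,) array holding the index (0-3) of the
            position that actually contains the signal.
    """
    scores = np.asarray(scores, dtype=float)
    true_positions = np.asarray(true_positions)
    if scores.ndim != 2 or scores.shape[1] != 4:
        raise ValueError("scores must have shape (n_trials, 4)")
    if true_positions.shape != (scores.shape[0],):
        raise ValueError("need exactly one true position per trial")
    choices = scores.argmax(axis=1)  # the observer picks the highest score
    return float(np.mean(choices == true_positions))


# Made-up scores for four trials; the third trial is answered incorrectly.
example_scores = [
    [0.90, 0.05, 0.03, 0.02],
    [0.10, 0.20, 0.60, 0.10],
    [0.40, 0.30, 0.20, 0.10],
    [0.10, 0.10, 0.10, 0.70],
]
example_truth = [0, 2, 1, 3]
print(proportion_correct(example_scores, example_truth))  # 0.75
```

A PC of 0.25 corresponds to guessing and a PC of 1.0 to perfect detection, so the reported value always lies between these limits.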
The goal of this project was to develop a convolutional neural network (CNN) that can automatically score the CDMAM 4.0 phantom placed adjacent to the SunNuclear BR3D “swirl” slab and blocks of polymethyl methacrylate (PMMA) for assessing DBT imaging performance (Figure 1(a)). The Artinis CDMAM 4.0 phantom is an established tool in Europe for evaluating the image quality of digital mammography systems, and its use is detailed in the European guidelines for quality assurance in breast cancer screening and diagnosis (EUREF) (Ref.1).
Because smaller, lower-contrast CDMAM details are difficult to detect when imaged with the BR3D swirl background, we evaluate imaging performance using only a subset of the available regions of interest (ROIs). We use the four highest-contrast columns and the fourteen rows with the largest detail diameters in the CDMAM 4.0 to calculate the proportion of correct responses (PC) in a four-alternative forced-choice (4-AFC) detection task. Pilot studies with human readers showed that this subset of CDMAM 4.0 details yields a reasonable performance (Ref.2) of 0.75 ≤ PC ≤ 0.85. Therefore, only these 56 CDMAM cells (4 columns × 14 rows) were selected for evaluation, as shown in Figure 1(b). An example ROI used for image analysis is shown in Figure 2.
To evaluate the image quality of a new system, 18 DBT scans of the phantom assembly are acquired at each of three equivalent breast thicknesses, modeled as 20, 40, and 50 mm of PMMA. The performance metric is then derived by fine-tuning the baseline model on the new data (18 scans × 56 cells = 1,008 ROIs per PMMA thickness). Fine-tuning is done separately for each thickness of added PMMA, using multiple cross-validation runs to estimate the mean PC score and its standard error.
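The ROI bookkeeping and the final summary statistic can be sketched as follows. The count of 18 scans per thickness is inferred here from 1,008 ROIs divided by 56 cells. The source does not state how the standard error is computed, so the standard error of the mean across runs used below is an assumption that treats the runs as independent, which cross-validation runs only approximately are. The per-run PC values are made up for illustration.

```python
import math
import statistics

N_CONTRAST_COLUMNS = 4        # highest-contrast columns used
N_DIAMETER_ROWS = 14          # rows with the largest detail diameters
N_SCANS_PER_THICKNESS = 18    # assumption: inferred from 1,008 / 56

cells_per_scan = N_CONTRAST_COLUMNS * N_DIAMETER_ROWS
rois_per_thickness = cells_per_scan * N_SCANS_PER_THICKNESS


def mean_and_standard_error(pc_per_run):
    """Summarize PC over cross-validation runs.

    Returns the mean PC and the standard error of that mean, computed as
    the sample standard deviation divided by sqrt(number of runs). This
    estimator is an illustrative assumption, not the RST's documented one.
    """
    if len(pc_per_run) < 2:
        raise ValueError("at least two runs are needed for a standard error")
    mean_pc = statistics.fmean(pc_per_run)
    standard_error = statistics.stdev(pc_per_run) / math.sqrt(len(pc_per_run))
    return mean_pc, standard_error


print(cells_per_scan, rois_per_thickness)  # 56 1008

hypothetical_runs = [0.78, 0.80, 0.82]     # made-up PC values, one per run
mean_pc, se_pc = mean_and_standard_error(hypothetical_runs)
print(f"PC = {mean_pc:.3f} +/- {se_pc:.3f}")
```

The same summary would be computed separately for each of the three PMMA thicknesses, since fine-tuning is done per thickness.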
Intended Purpose
Applicable Medical Devices: this tool is intended for assessing the imaging performance of digital breast tomosynthesis (DBT) systems.
Intended User Population: medical device manufacturers preparing regulatory submissions and FDA reviewers evaluating DBT system performance.
Value Proposition: the tool provides a task-based, objective, and reproducible method for assessing DBT image quality that is less burdensome and less variable than human reader studies.
Underlying need and development rationale: In DBT image quality testing, an inhomogeneous background is important because of inter-slice (z-)blurring in reconstructed images. Because the CDMAM phantom was originally intended for image quality evaluation of digital mammography systems, it does not provide the anatomical background required for testing DBT systems. In the past, researchers have modified the CDMAM setup for assessing DBT by adding a structured background (CIRS/SunNuclear BR3D swirl slabs). Figure 1(a) illustrates the sandwich-like phantom assembly. However, the existing model observer software (Artinis CDCOM) was found not to work with structured backgrounds, and the test is too burdensome with human readers because too many readers are required to compensate for large uncertainties in scoring. This prompted the FDA to explore an alternative approach in which the CDMAM image is analyzed and scored by a CNN-based model observer. The tool is proposed as a novel task-based approach for automatic assessment of the imaging performance of DBT technology. It uses the CDMAM 4.0 phantom coupled with a structured background slab and uniform blocks of PMMA mimicking compressed breast tissue thickness. A pre-trained baseline CNN model was developed, and a procedure is provided for using this model in a fine-tuning cross-validation process to evaluate the performance of a new system.
Testing
Testing was conducted using imaging data from seven DBT systems (some collected by the FDA, some by DBT manufacturers through the AdvaMed trade association). These systems differ in x-ray detector type, gantry angular range, acquisition mode (step-and-shoot vs. continuous), x-ray tube design, reconstruction and post-processing algorithms, etc. Square image crops containing CDMAM 4.0 “cells” were extracted from in-focus DBT planes using a semi-automated registration approach. In total, approximately 88,000 ROI images were collected to train the production baseline CNN model.
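The cropping step can be sketched as below, under the assumption that registration has already produced a pixel-coordinate center for each cell. The registration itself is not shown, and the function name, arguments, and crop size are hypothetical rather than taken from the RST scripts.

```python
import numpy as np


def crop_square_roi(plane, center_row, center_col, size):
    """Cut a size-by-size crop around a cell center from one DBT plane.

    plane: 2-D array holding the in-focus reconstructed slice.
    center_row, center_col: cell center in pixels, assumed to come from
        a prior registration step (not shown here).
    size: side length of the square crop in pixels (hypothetical value).
    """
    plane = np.asarray(plane)
    if plane.ndim != 2:
        raise ValueError("plane must be a 2-D array")
    half = size // 2
    row_start, col_start = center_row - half, center_col - half
    row_stop, col_stop = row_start + size, col_start + size
    # Reject crops that would extend beyond the image boundary.
    if (row_start < 0 or col_start < 0
            or row_stop > plane.shape[0] or col_stop > plane.shape[1]):
        raise ValueError("crop extends beyond the image boundary")
    return plane[row_start:row_stop, col_start:col_stop].copy()


# Toy 10x10 "plane" whose pixel values encode their own position.
toy_plane = np.arange(100).reshape(10, 10)
roi = crop_square_roi(toy_plane, center_row=5, center_col=5, size=4)
print(roi.shape)  # (4, 4)
```

Returning a copy keeps each ROI independent of the source slice, so later normalization or augmentation does not alter the original image.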
We trained the baseline model with the fixed optimal architecture and hyperparameters seven times using a leave-one-system-out approach. This yielded seven distinct baseline model versions, each trained using data from six of the seven systems. Each model was subsequently used for imaging performance evaluation with previously unseen data from the corresponding “left-out” system. Model fine-tuning was done by conducting multiple cross-validation runs with varying numbers of images, over which the mean PC score and its uncertainty were estimated. The results of these validation tests are provided in the accompanying journal publication (Ref.3). This arrangement fully mimics the proposed process for testing a new DBT system for regulatory clearance.
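The leave-one-system-out arrangement can be sketched with plain index bookkeeping. The system labels and the function below are hypothetical; only the splitting logic is illustrated, not the training or fine-tuning itself.

```python
from collections import defaultdict


def leave_one_system_out(roi_system_labels):
    """Yield (held_out_system, train_indices, test_indices) per system.

    roi_system_labels: one label per ROI naming the DBT system it came
        from (hypothetical labels such as "A" to "G").
    """
    indices_by_system = defaultdict(list)
    for index, label in enumerate(roi_system_labels):
        indices_by_system[label].append(index)

    for held_out in sorted(indices_by_system):
        # Train on every ROI that does not come from the held-out system.
        train = sorted(
            index
            for system, indices in indices_by_system.items()
            if system != held_out
            for index in indices
        )
        yield held_out, train, list(indices_by_system[held_out])


# Two made-up ROIs from each of seven hypothetical systems.
labels = [system for system in "ABCDEFG" for _ in range(2)]
splits = list(leave_one_system_out(labels))
print(len(splits))  # 7
```

Each split keeps all data from one system out of baseline training, so the held-out system plays the role of a new, previously unseen device.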
Limitations
The proposed methodology involves imaging SunNuclear BR3D “swirl” background slabs, which mimic a 50/50 glandular/adipose tissue composition. It is assumed that different samples of these slabs are manufactured with similar x-ray attenuation properties and background distribution, so that measurements are consistent and repeatable.
Supporting Documentation
Accessibility Information: https://github.com/DIDSR/CDMAM-4.0-DLMO
User manuals: the RST package includes the PDF documents ‘cdmam_dlmo_manual_part1.pdf’ and ‘cdmam_dlmo_manual_part2.pdf’, which describe data collection and imaging performance analysis. Software installation instructions are provided in the GitHub README file.
Ref.1 https://screening.iarc.fr/doc/ND7306954ENC_002.pdf
Ref.2 R. Wagner et al., “Assessment of medical imaging and computer-assist systems: lessons from recent experience”, Acad. Radiol. 2002 9(11), pp. 1264-1277.
Ref.3 A. Makeev and S. J. Glick, “Deep learning-based observer for DBT image quality evaluation with CDMAM 4.0 phantom and swirl background” to be submitted to Physics in Medicine and Biology journal, 2026, Under Review.
Contact
Tool Reference
- RST Reference Number: RST26MD05.01
- Date of Publication: 05/04/2026
- Recommended Citation: U.S. Food and Drug Administration. (2026). Assessing DBT Image Quality with CDMAM 4.0 Phantom and DL-Based Model Observer (RST26MD05.01). https://cdrh-rst.fda.gov/assessing-dbt-image-guality-cdmam-40-phantom-and-dl-based-model-observer