MRI-IQ-ETK: Digital MRI Test Phantoms and Automated Image Quality Evaluation Toolkit for Assessing MRI Image Reconstruction Methods | Center for Devices and Radiological Health

Catalog of Regulatory Science Tools to Help Assess New Medical Devices

MRI-IQ-ETK is a software toolkit that contains digital test phantom generation and automated image quality evaluation components intended to help assess the performance of MRI image reconstruction methods.

Technical Description

MRI-IQ-ETK is a software toolkit that generates three types of digital image quality (IQ) test phantoms in the continuous Fourier domain for assessing the performance of magnetic resonance imaging (MRI) image reconstruction methods. The software toolkit supports the evaluation of the following IQ metrics: geometric accuracy, intensity uniformity, percentage ghosting, sharpness, signal-to-noise ratio, high-contrast resolution, and low-contrast object detectability.

The toolkit has two major components:

Digital test phantom generation

This component contains functions to generate three types of digital IQ test phantoms: disk phantom, resolution phantom, and low-contrast phantom. These phantoms are composed of single or multiple geometrically defined disk objects of different sizes and contrasts, arranged in patterns similar to those of the ACR Large MRI phantom [1]. Based on the geometric object description, the phantom data are generated in the continuous Fourier domain and then discretized in k-space, which avoids any loss of resolution in the object model. Accordingly, each phantom generation function takes as inputs the phantom parameters (radius, center location, and intensity of each disk), the additive Gaussian noise level, and the desired image parameters (image field of view and matrix size). The outputs are the k-space data and an inverse Fourier transform (IFT) reconstructed MRI image of the phantom. See Fig. 1 for an illustration of the three types of phantoms in k-space and image space. Users can replace the IFT reconstruction by applying their own reconstruction methods to the k-space data and then evaluate the image quality using the automated IQ evaluation functions in MRI-IQ-ETK, as described next.
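The idea of generating a disk phantom analytically in the Fourier domain can be illustrated with a minimal sketch. It is not the toolkit's code: the function names (bessel_j1, disk_kspace, add_noise, ifft_reconstruct), the DC-first k-space layout, and the choice to add complex Gaussian noise in k-space are assumptions made for illustration. It relies on the standard result that a uniform disk of radius R has the Fourier transform R * J1(2*pi*R*k) / k, with a linear phase term for an off-center disk.

```python
import numpy as np

def bessel_j1(x, n_steps=256):
    """J1(x) from Bessel's integral (1/pi) * int_0^pi cos(tau - x*sin(tau)) dtau.

    Evaluated with the midpoint rule to keep the sketch NumPy-only;
    scipy.special.j1 is a drop-in replacement if SciPy is available.
    """
    tau = (np.arange(n_steps) + 0.5) * np.pi / n_steps
    x = np.asarray(x, dtype=float)[..., np.newaxis]
    return np.mean(np.cos(tau - x * np.sin(tau)), axis=-1)

def disk_kspace(radius, center, intensity, fov, n):
    """Analytic Fourier transform of a uniform disk, sampled on an n x n Cartesian grid.

    Hypothetical helper: lengths share one unit (e.g. mm), center is (x, y)
    relative to the middle of the field of view, and k-space is stored DC-first.
    """
    k = np.fft.fftfreq(n, d=fov / n)           # sample spacing in k is 1 / fov
    kx, ky = np.meshgrid(k, k, indexing="xy")
    kr = np.hypot(kx, ky)
    kr_safe = np.where(kr == 0, 1.0, kr)       # avoid dividing by zero at DC
    magnitude = np.where(
        kr == 0,
        np.pi * radius**2,                     # limit at k = 0 is the disk area
        radius * bessel_j1(2 * np.pi * radius * kr_safe) / kr_safe,
    )
    phase = np.exp(-2j * np.pi * (kx * center[0] + ky * center[1]))
    return intensity * magnitude * phase

def add_noise(kspace, noise_std, rng):
    """Add complex Gaussian noise to k-space (assumed noise model)."""
    noise = rng.standard_normal(kspace.shape) + 1j * rng.standard_normal(kspace.shape)
    return kspace + noise_std * noise

def ifft_reconstruct(kspace, fov):
    """Inverse FFT reconstruction, scaled so pixel values approximate disk intensity."""
    pixel = fov / kspace.shape[0]
    return np.fft.fftshift(np.fft.ifft2(kspace)) / pixel**2

fov, n = 256.0, 64                             # field of view (mm) and matrix size
rng = np.random.default_rng(0)
clean = disk_kspace(radius=60.0, center=(0.0, 0.0), intensity=1.0, fov=fov, n=n)
noisy = add_noise(clean, noise_std=50.0, rng=rng)
image = ifft_reconstruct(noisy, fov)
print(image.shape)
```

A phantom with several disks would simply sum the k-space contributions of the individual disks, since the Fourier transform is linear. Because the object is never rasterized, the only resolution limit comes from the k-space extent that is sampled.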

Automated IQ evaluation

This component takes the reconstructed digital test phantom MRI images as input, analyzes the images, and outputs the image quality values listed below and shown in Fig. 1:

From a disk phantom image, geometric accuracy, intensity uniformity, percentage ghosting, sharpness, and signal-to-noise ratio (SNR) are measured.
From a resolution phantom image, high-contrast resolution is measured.
From a low-contrast phantom image, low-contrast object detectability, defined as the number of fully visible spokes, is measured.
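As an illustration of what three of the disk phantom metrics compute, the sketch below applies ACR-style formulas for percent integral uniformity and percent ghosting, plus one common SNR convention (mean signal divided by background standard deviation), to a synthetic image. The function names and ROI placements are hypothetical, and the toolkit's own implementation described in [2] may differ in its ROI definitions and corrections. For example, no Rayleigh correction for magnitude-image background noise is applied here.

```python
import numpy as np

def disk_mask(shape, center, radius):
    """Boolean mask of pixels within `radius` (pixels) of `center` = (row, col)."""
    rows, cols = np.indices(shape)
    return (rows - center[0]) ** 2 + (cols - center[1]) ** 2 <= radius ** 2

def percent_integral_uniformity(image, roi):
    """PIU = 100 * (1 - (high - low) / (high + low)) over the ROI.

    Simplification: uses single-pixel extremes, whereas the ACR procedure
    averages small sub-ROIs at the brightest and darkest locations.
    """
    values = image[roi]
    high, low = values.max(), values.min()
    return 100.0 * (1.0 - (high - low) / (high + low))

def percent_ghosting(image, roi, top, bottom, left, right):
    """100 * |(top + bottom) - (left + right)| / (2 * signal), using ROI means."""
    signal = image[roi].mean()
    ghost = (image[top].mean() + image[bottom].mean()) - (
        image[left].mean() + image[right].mean()
    )
    return 100.0 * abs(ghost) / (2.0 * signal)

def snr(image, roi, background):
    """Mean signal in the ROI divided by the background standard deviation."""
    return image[roi].mean() / image[background].std(ddof=1)

# Synthetic disk image standing in for a reconstructed disk phantom.
rng = np.random.default_rng(0)
shape = (128, 128)
phantom = np.zeros(shape)
phantom[disk_mask(shape, (64, 64), 40)] = 100.0

roi = disk_mask(shape, (64, 64), 30)           # large ROI inside the disk
top = (slice(2, 12), slice(44, 84))            # background ROIs outside the disk
bottom = (slice(116, 126), slice(44, 84))
left = (slice(44, 84), slice(2, 12))
right = (slice(44, 84), slice(116, 126))
corner = (slice(2, 22), slice(2, 22))

ghosted = phantom.copy()
ghosted[top] += 5.0                            # artificial ghost signal along one axis
ghosted[bottom] += 5.0
noisy = phantom + rng.normal(0.0, 2.0, shape)  # real-valued Gaussian noise

print(percent_integral_uniformity(phantom, roi))
print(percent_ghosting(ghosted, roi, top, bottom, left, right))
print(snr(noisy, roi, corner))
```

In this toy setup the noise-free disk is perfectly uniform, the artificial ghost of 5 against a signal of 100 yields 5 percent ghosting, and the SNR estimate lands near the ratio of signal to noise standard deviation.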

The test object composition of the digital phantoms and the image quality metrics are designed based on the Phantom Test Guidance for Use of the Large MRI Phantom for the MRI Accreditation Program [1]. However, users can change the object parameters to create modified versions. A detailed description of how these digital phantoms are created and how these metrics are evaluated can be found in [2].

Figure 1. In the left panel are example presentations of the three types of digital test phantoms in k-space and image space. From top to bottom are the disk phantom, resolution phantom and low-contrast phantom, respectively. In the right panel are listed the image quality (IQ) metrics that are evaluated from the corresponding phantom.

Intended Purpose

The MRI-IQ-ETK software toolkit is intended to quantitatively evaluate magnetic resonance imaging (MRI) reconstruction using digital test phantoms via a set of image quality (IQ) metrics (geometric accuracy, intensity uniformity, percentage ghosting, sharpness, signal-to-noise ratio, high-contrast resolution, and low-contrast object detectability). During product development, the evaluation tool can be used to facilitate the optimization of advanced reconstruction and post-processing methods for accelerated MRI, such as compressed sensing and machine learning-based methods. It can also be used to assess the performance improvement of a new reconstruction or post-processing method with respect to a reference reconstruction method.

Intended users are MRI device developers, MRI image reconstruction developers, and MRI image post-processing (denoising, dealiasing, deblurring) software developers.

Testing

Code evaluation was performed to ensure that the MRI-IQ-ETK tool functions as designed for each of the components, as described below:

The test objects created by the digital phantom generation code were verified to have the desired properties specified by the input parameters (disk size, location, intensity, etc.).
The code for evaluating the image quality metrics (geometric accuracy, intensity uniformity, percentage ghosting, sharpness, signal-to-noise ratio, high-contrast resolution, low-contrast object detectability) was verified to implement the steps described in [2].

The digital phantom-based automated IQ evaluation tool was shown to effectively differentiate the performance of a machine learning-based MRI image reconstruction model trained with multiple clinical MRI datasets of different acceleration factors and coil sensitivities, in multiple test settings of varying sampling rate and noise level [2].

Limitations

Each phantom in the MRI-IQ-ETK software toolkit is created by calculating the continuous Fourier functions of the geometric objects contained in the phantom. To demonstrate the utility of the IQ evaluation framework, the toolkit simulates a simplified MRI acquisition setting with a single coil of uniform sensitivity, Cartesian k-space sampling, and Gaussian noise, and applies the inverse Fourier transform (IFT) to perform image reconstruction. Users can replace the MRI simulation with a more realistic acquisition setting, such as multiple coils or different k-space trajectories (spiral, radial, or others), and replace the IFT reconstruction with their own reconstruction method to be evaluated. Because a simulated MRI acquisition may not model all physical aspects of an MRI scanner, developers are advised to always validate the conclusions drawn from the simulated data with several physical MRI scans using the ACR Large MRI phantom or other IQ phantoms with similar testing patterns.
The low-contrast object detectability metric is defined as the number of fully visible spokes in a low-contrast phantom MRI image. In this toolkit, a model observer (MO) method based on non-prewhitening template matching is used to automatically determine whether a disk object is detectable. Internal validation was conducted to verify that the automatic evaluation results agreed with human reading results on a small set of low-contrast MRI phantom images reconstructed with the IFT and with AUTOMAP, as described in [2]. For a different reconstruction method, users should validate the low-contrast object detectability output on a few sample images to ensure that it agrees with human reading. If the agreement is not acceptable, other types of MOs [4] that better track human reading performance may be developed to replace the non-prewhitening template matching method. Alternatively, human observers can always be used to determine the number of fully visible spokes in a low-contrast phantom image.
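To make the non-prewhitening (NPW) template matching idea concrete, the following is a minimal sketch under stated assumptions rather than the toolkit's implementation. The NPW statistic is the inner product of a signal template with the image patch at the candidate location. The decision rule used here (signal statistic must exceed the mean of signal-absent background statistics by a chosen number of standard deviations), the zero-mean template, and all function names are hypothetical choices. The spoke count follows the ACR convention of counting from the first spoke and stopping at the first spoke with any undetected disk.

```python
import numpy as np

def disk_template(size, radius):
    """Zero-mean disk template (size must be odd), so a flat background scores zero."""
    rows, cols = np.indices((size, size))
    c = (size - 1) / 2.0
    disk = ((rows - c) ** 2 + (cols - c) ** 2 <= radius ** 2).astype(float)
    return disk - disk.mean()

def npw_statistic(image, template, center):
    """Inner product of the template with the image patch centered at (row, col)."""
    half = template.shape[0] // 2
    r, c = center
    patch = image[r - half : r + half + 1, c - half : c + half + 1]
    return float(np.sum(template * patch))

def is_detectable(image, template, signal_center, background_centers, threshold_sd=3.0):
    """Assumed rule: signal statistic exceeds background mean by threshold_sd deviations."""
    t_signal = npw_statistic(image, template, signal_center)
    t_background = np.array(
        [npw_statistic(image, template, c) for c in background_centers]
    )
    return bool(t_signal > t_background.mean() + threshold_sd * t_background.std(ddof=1))

def count_complete_spokes(detected):
    """Count spokes in order until one contains an undetected disk."""
    count = 0
    for spoke in detected:
        if not all(spoke):
            break
        count += 1
    return count

# Synthetic example: one low-contrast disk on a noisy uniform background.
rng = np.random.default_rng(0)
image = 100.0 + rng.normal(0.0, 2.0, (96, 96))
rows, cols = np.indices(image.shape)
image[(rows - 48) ** 2 + (cols - 48) ** 2 <= 4 ** 2] += 10.0

template = disk_template(size=13, radius=4)
background_centers = [
    (20, 20), (20, 48), (20, 76), (48, 20),
    (48, 76), (76, 20), (76, 48), (76, 76),
]
print(is_detectable(image, template, (48, 48), background_centers))
print(count_complete_spokes([[True] * 3, [True] * 3, [True, False, True], [True] * 3]))
```

When validating against human readers, the threshold (here threshold_sd) is the natural parameter to tune, since it controls how conservative the automated call is for a given reconstruction method.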

Supporting Documentation

The Python code of the MRI-IQ-Evaluation tool is released in a GitHub repository [3], along with the documentation and instructions to set up the environment and run the code. Demo code and a User Manual are also provided to help users understand how to use the phantom creation code and IQ evaluation code with appropriate inputs:

MRI_IQ_ETK_UserManual.pdf: This document provides a summary of the toolkit, instructions on how to install and run the code, and a description of all the functions contained in the software toolkit with their purposes, inputs, and outputs.
Demo.py: This code demonstrates the full process, from creating the k-space data of the three types of digital phantoms (Fig. 1), to reconstructing the k-space data into MRI images using a simple inverse Fourier transform (IFT) method, to evaluating the image quality metrics using the reconstructed phantom images.
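The k-space to reconstruction to metric flow, with the reconstruction step left swappable, can be sketched as follows. This is not the content of Demo.py: every name here (ifft_reconstruct, undersample_rows, ghost_percent, evaluate) is hypothetical, the k-space is derived from a small rasterized square rather than an analytic phantom, and the metric is a simplified ghost-to-object ratio chosen so the result is easy to check by hand.

```python
import numpy as np

def ifft_reconstruct(kspace):
    """Baseline reconstruction: inverse FFT of DC-first k-space, output centered."""
    return np.fft.fftshift(np.fft.ifft2(kspace))

def undersample_rows(kspace, acceleration):
    """Keep every `acceleration`-th phase-encode line (rows) and zero the rest."""
    mask = np.zeros(kspace.shape, dtype=bool)
    mask[::acceleration, :] = True
    return kspace * mask

def ghost_percent(image, object_region, ghost_region):
    """Mean magnitude in the ghost region as a percentage of the object region."""
    magnitude = np.abs(image)
    return 100.0 * magnitude[ghost_region].mean() / magnitude[object_region].mean()

def evaluate(kspace, reconstruct, metric):
    """Run any callable k-space -> image reconstruction, then score the result."""
    return metric(reconstruct(kspace))

# Toy object and its fully sampled k-space.
truth = np.zeros((64, 64))
truth[24:40, 24:40] = 1.0
kspace = np.fft.fft2(np.fft.ifftshift(truth))

object_region = (slice(24, 40), slice(24, 40))
ghost_region = (slice(0, 8), slice(24, 40))    # where a half-FOV alias of the object lands

def metric(image):
    return ghost_percent(image, object_region, ghost_region)

full = evaluate(kspace, ifft_reconstruct, metric)
zero_filled = evaluate(undersample_rows(kspace, 2), ifft_reconstruct, metric)
print(full, zero_filled)
```

A reconstruction method under test would be passed in place of ifft_reconstruct. With two-fold regular undersampling and no correction, the zero-filled result contains a half-intensity copy of the object shifted by half the field of view, which is what the ghost metric picks up.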

References

[1] American College of Radiology. Phantom Test Guidance for Use of the Large MRI Phantom for the ACR. https://www.acraccreditation.org/-/media/acraccreditation/documents/mri/largephantomguidance.pdf

[2] Tan F, Delfino JG, Zeng R. Evaluating Machine Learning-Based MRI Reconstruction Using Digital Image Quality Phantoms. Bioengineering. 2024; 11(6):614. https://doi.org/10.3390/bioengineering11060614

[3] MRI-IQ-Evaluation GitHub repository. https://github.com/DIDSR/mr-recon-eval-core/

[4] Barrett HH, Yao J, Rolland JP, Myers KJ. Model observers for assessment of image quality. Proc Natl Acad Sci U S A. 1993; 90(21):9758-9765. https://doi.org/10.1073/pnas.90.21.9758

Contact

RST_CDRH@fda.hhs.gov

Tool Reference

RST Reference Number: RST26MD03.01
Date of Publication: 05/04/2026
Recommended Citation: U.S. Food and Drug Administration. (2026). MRI-IQ-ETK: Digital MRI Test Phantoms and Automated Image Quality Evaluation Toolkit for Assessing MRI Image Reconstruction Methods (RST26MD03.01). https://cdrh-rst.fda.gov/mri-iq-etk-digital-mri-test-phantoms-and-automated-image-quality-evaluation-toolkit-assessing-mri