
SegVal-WSI: Whole Slide Image Segmentation Algorithm Performance Assessment Tool

Catalog of Regulatory Science Tools to Help Assess New Medical Devices 

This regulatory science tool (RST) is a software program written in Python for performance assessment of segmentation algorithms applied to digital pathology whole slide images (WSIs).

Technical Description

The Whole Slide Image Segmentation Algorithm Performance Assessment Tool is a software program written in Python for performance assessment of segmentation algorithms applied to digital pathology whole slide images (WSIs). The tool accepts segmentation results in the form of confusion matrices across a set of WSIs and generates Dice score performance results for the entire set, in which the WSIs may contain different numbers of annotated regions of interest (ROIs).
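As an illustration of how a Dice score can be derived from confusion matrices, the following minimal sketch sums pixel counts over a set of WSIs and computes a single pooled score. It is not the tool's own code: the 2x2 layout [[TN, FP], [FN, TP]], the pooling strategy, the function names, and the example counts are all assumptions made for this example. Consult the User-Manual for the input format the tool actually expects.

```python
import numpy as np

def dice_from_confusion(cm):
    """Dice score from a 2x2 confusion matrix.

    Assumed layout (hypothetical, not taken from the tool): [[TN, FP], [FN, TP]].
    """
    cm = np.asarray(cm, dtype=float)
    tp, fp, fn = cm[1, 1], cm[0, 1], cm[1, 0]
    denom = 2 * tp + fp + fn
    # Dice is undefined when neither the reference nor the prediction has foreground.
    return float("nan") if denom == 0 else float(2 * tp / denom)

def pooled_dice(wsi_confusions):
    """Sum the pixel counts over all WSIs, then compute one Dice score."""
    total = np.sum([np.asarray(cm, dtype=float) for cm in wsi_confusions], axis=0)
    return dice_from_confusion(total)

# Made-up per-WSI confusion matrices, each assumed to be already summed over that WSI's ROIs.
wsi_confusions = [
    [[900, 20], [30, 50]],
    [[700, 10], [10, 80]],
    [[500, 40], [20, 140]],
]

print(f"Pooled Dice over {len(wsi_confusions)} WSIs: {pooled_dice(wsi_confusions):.4f}")
```

Pooling the counts before computing the score weights each WSI by its annotated area. Averaging per-WSI Dice scores is a different design choice and generally gives a different number.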

The tool provides the following components:

  • Dice score calculations for the entire set of WSIs.
  • Bootstrapped confidence intervals of the Dice score values.
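The two components listed above can be sketched together as follows. This is a hedged example of a percentile bootstrap that resamples whole WSIs with replacement, not the tool's implementation: the confusion matrix layout [[TN, FP], [FN, TP]], the pooled Dice definition, the function names, the default number of resamples, and the example counts are assumptions.

```python
import numpy as np

def dice(cm):
    """Dice score from a 2x2 matrix, assumed layout [[TN, FP], [FN, TP]]."""
    tp, fp, fn = cm[1, 1], cm[0, 1], cm[1, 0]
    denom = 2 * tp + fp + fn
    return np.nan if denom == 0 else 2 * tp / denom

def bootstrap_dice_ci(wsi_confusions, n_boot=2000, alpha=0.05, seed=0):
    """Point estimate and percentile bootstrap CI, resampling WSIs with replacement."""
    cms = np.asarray(wsi_confusions, dtype=float)  # shape (n_wsi, 2, 2)
    n_wsi = cms.shape[0]
    rng = np.random.default_rng(seed)

    point = dice(cms.sum(axis=0))
    stats = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n_wsi, size=n_wsi)  # draw n_wsi slides with replacement
        stats[b] = dice(cms[idx].sum(axis=0))

    # nanpercentile skips resamples in which the Dice score was undefined.
    lower, upper = np.nanpercentile(stats, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return float(point), float(lower), float(upper)

# Made-up per-WSI confusion matrices, for illustration only.
wsi_confusions = [
    [[900, 20], [30, 50]],
    [[700, 10], [10, 80]],
    [[500, 40], [20, 140]],
]

point, lower, upper = bootstrap_dice_ci(wsi_confusions)
print(f"Dice = {point:.4f}, 95% CI = [{lower:.4f}, {upper:.4f}]")
```

Resampling at the WSI level treats each slide as an independent unit, which matches the one-WSI-per-patient assumption the tool makes.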

Intended Purpose 

The Whole Slide Image Segmentation Algorithm Performance Assessment Tool is designed to provide benchmark performance evaluation methods for segmentation algorithms applied to digital pathology WSIs. In a WSI, several regions of interest (ROIs) are usually annotated as the reference standard, and the algorithm's segmentation results are compared to these annotations across the ROIs. The tool reports segmentation performance as a Dice score across the entire set of WSIs, which may contain different numbers of annotated ROIs.

Testing

The tool has been tested to ensure that the input confusion matrices generate the intended output. The output has been validated manually using well-defined example cases to confirm that the algorithm provides correct results. The tool has also been used internally in a research project and to produce results for one publication, and it was peer-reviewed as part of our submission [1].

Limitations

  1. Only one performance metric, namely Dice score, is implemented.
  2. The tool assumes a dataset where each patient has one WSI (i.e., the WSIs are independent). The code would need to be extended to handle multiple WSIs per patient.
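One possible direction for the extension mentioned in the second limitation is to resample patients rather than WSIs, so that slides from the same patient stay together in every bootstrap replicate. The sketch below is not part of the tool and is an assumption throughout: the `patient_ids` input, the function names, the confusion matrix layout [[TN, FP], [FN, TP]], and the example counts are hypothetical.

```python
from collections import defaultdict

import numpy as np

def dice(cm):
    """Dice score from a 2x2 matrix, assumed layout [[TN, FP], [FN, TP]]."""
    tp, fp, fn = cm[1, 1], cm[0, 1], cm[1, 0]
    denom = 2 * tp + fp + fn
    return np.nan if denom == 0 else 2 * tp / denom

def patient_bootstrap_dice_ci(wsi_confusions, patient_ids, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for pooled Dice, with the patient as the resampling unit."""
    # Sum each patient's WSI matrices so that one array entry represents one patient.
    per_patient = defaultdict(lambda: np.zeros((2, 2)))
    for cm, pid in zip(wsi_confusions, patient_ids):
        per_patient[pid] += np.asarray(cm, dtype=float)
    cms = np.stack(list(per_patient.values()))  # shape (n_patients, 2, 2)
    n_patients = cms.shape[0]

    rng = np.random.default_rng(seed)
    point = dice(cms.sum(axis=0))
    stats = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n_patients, size=n_patients)  # resample patients
        stats[b] = dice(cms[idx].sum(axis=0))

    lower, upper = np.nanpercentile(stats, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return float(point), float(lower), float(upper)

# Made-up data: four WSIs from three patients (patient "P1" has two slides).
wsi_confusions = [
    [[900, 20], [30, 50]],
    [[700, 10], [10, 80]],
    [[500, 40], [20, 140]],
    [[800, 30], [20, 90]],
]
patient_ids = ["P1", "P1", "P2", "P3"]

point, lower, upper = patient_bootstrap_dice_ci(wsi_confusions, patient_ids)
print(f"Dice = {point:.4f}, 95% CI = [{lower:.4f}, {upper:.4f}]")
```

Because the pooled Dice score depends only on summed counts, collapsing each patient's slides into one matrix before resampling gives the same replicates as drawing patients and then including all of their WSIs.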

Supporting Documentation

Documentation is included with the tool as a user guide for using the Python scripts. Please refer to the User-Manual PDF file and the online documentation at the following links:

Contact

Tool Reference