Datasets Loader

class DatasetLoader

Loader for MAPLES-DR dataset.

__init__()
property cfg: DatasetConfig

Return the default configuration of the loaded dataset.

check_maples_dr_integrity(path, biomarkers, images_names)

Check if the MAPLES-DR dataset contains all segmentation maps.

Parameters:
clear_cache()

Clear the cache.

static clear_download_cache()

Clear the cache where the MAPLES-DR archive is downloaded and extracted.

configure(maples_dr_path='UNSET', messidor_path='UNSET', *, cache='UNSET', resize=None, image_format=None, preprocessing=None, exclude_missing_macula=None, exclude_missing_cup=None, disable_check=False)

Configure the default behavior of the MAPLES-DR dataset.

Any parameter left as None (or "UNSET" for the first two paths) leaves the current configuration unaffected.

Parameters:
  • maples_dr_path (Optional[str], optional) –

    Path to the MAPLES-DR additional data. Must point to the directory or to the zip file.

    If None (the default), the dataset is downloaded from figshare.

  • messidor_path (Optional[str], optional) –

    Path to the MESSIDOR dataset.

    Must point to a directory containing the "Base11", "Base12", … subdirectories or zip files.

  • cache (Optional[str], optional) –

    Path to the cache directory. The cache is used to store the downloaded dataset and the generated images.

    • If cache is a str or a Path, then the cache is stored in the given directory.

    • If False (by default), then the cache is disabled.

    • If True, then the cache is stored in the default cache directory.

  • resize (Optional[int], optional) –

    Set the size of the images (fundus and biomarkers) generated by maples_dr.

    • If resize is an int, crop the image to a square ROI and resize it to the shape (resize, resize);

    • If True, keep the original MAPLES-DR resolution of 1500x1500 px;

    • If False, use the original MESSIDOR resolution if the MESSIDOR path is configured, otherwise fall back to the original MAPLES-DR resolution.

  • image_format (Optional[ImageFormat], optional) –

    Python format of the generated images. Must be either "PIL", "rgb" or "bgr".

    If "rgb" or "bgr" is selected, images are returned as NumPy arrays of shape (height, width, channel).

    By default, "PIL" is used.

  • preprocessing (Optional[Preprocessing], optional) –

    Preprocessing algorithm applied on the fundus images.

    By default, no preprocessing is applied.

  • disable_check (bool, optional) – If True, disable the integrity check of the dataset.

  • exclude_missing_macula (bool, optional) –

    If True, exclude images with missing macula segmentation (one image of the train set).

    By default: False.

  • exclude_missing_cup (bool, optional) –

    If True, exclude images with missing optic cup segmentation (4 images of the train set, 2 of the test set).

    By default: False.
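
The "leave unaffected" rule described above can be illustrated with a small sentinel-based merge. This is only a sketch of the idea, not the library's implementation: SketchConfig, merge_config and the default values shown are hypothetical, and only three of the parameters are modelled. One plausible reason for the "UNSET" sentinel on the path arguments is that None is itself a meaningful value there, so it cannot also mean "keep the current setting".

```python
from dataclasses import dataclass, replace
from typing import Any

UNSET = "UNSET"  # sentinel value shown in the configure() signature


@dataclass(frozen=True)
class SketchConfig:
    """Hypothetical stand-in for DatasetConfig, reduced to three fields."""

    messidor_path: Any = None
    resize: Any = True          # assumed default, for illustration only
    image_format: str = "PIL"


def merge_config(current, *, messidor_path=UNSET, resize=None, image_format=None):
    """Return a new config where only explicitly given values are replaced."""
    updates = {}
    # For paths, None is a real value, so "UNSET" marks "not provided".
    if messidor_path != UNSET:
        updates["messidor_path"] = messidor_path
    # For the other options, None marks "not provided".
    if resize is not None:
        updates["resize"] = resize
    if image_format is not None:
        updates["image_format"] = image_format
    return replace(current, **updates)


cfg = merge_config(SketchConfig(messidor_path="/data/messidor"), resize=512)
print(cfg.messidor_path, cfg.resize, cfg.image_format)
```

In this sketch, calling merge_config twice with different keyword arguments accumulates the settings, which mirrors how successive configure() calls are described as updating only what is passed.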

static discover_messidor_images(images, path=None)

Discover the MESSIDOR images corresponding to the given MAPLES-DR images.

Parameters:
  • images (list[str]) – List of MAPLES-DR image names. The image names should not contain the extension.

  • path (str | Path | None) – Path to the MESSIDOR dataset.

Return type:

Dict[str, Path]
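
To make the returned mapping concrete, here is a minimal sketch of a name-to-path lookup over a directory tree. It is not the library's algorithm: find_images_by_stem is a hypothetical helper, it assumes that each image name equals the file stem of its source image, and it ignores zip archives entirely.

```python
from pathlib import Path
from typing import Dict, Iterable


def find_images_by_stem(names: Iterable[str], root: Path) -> Dict[str, Path]:
    """Map each requested name to the first file under root with that stem."""
    wanted = set(names)
    found: Dict[str, Path] = {}
    # sorted() makes the result deterministic when a stem appears twice.
    for file in sorted(root.rglob("*")):
        if file.is_file() and file.stem in wanted and file.stem not in found:
            found[file.stem] = file
    return found
```

Names with no matching file are simply absent from the result in this sketch; whether the real method reports them differently is not stated here.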

ensure_configured()

Ensure the dataset is initialized.

image_names(subset=DatasetSubset.ALL, extension=False)

Return the list of image names of the given subset.

Parameters:
  • subset (DatasetSubset | str) – Subset to return the image names from. If None, return all image names. Must be either None, "train", "test" or "duplicates".

  • extension (bool | str) – Control whether the image names include the extension.

    • If False (default), return the image names without the extension.

    • If True, return the image names with a png extension.

    • If a string, return the image names with the given extension.

Return type:

list[str]
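
The three cases of the extension argument can be sketched as follows. with_extension is a hypothetical helper written for illustration, not part of the library; accepting a leading dot in the string form is a choice made for this sketch only.

```python
from typing import Iterable, List, Union


def with_extension(names: Iterable[str], extension: Union[bool, str] = False) -> List[str]:
    """Apply the False / True / str rule described for the extension argument."""
    if extension is False:
        return list(names)                       # names without extension
    suffix = "png" if extension is True else str(extension)
    suffix = suffix.lstrip(".")                  # accept "jpg" and ".jpg" alike
    return [f"{name}.{suffix}" for name in names]
```

For example, with_extension(["img_a"], True) yields a png file name, while with_extension(["img_a"], "tif") uses the given extension.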

is_biomarker_segmented(biomarker, name)

Check if the given biomarker is segmented in the MAPLES-DR dataset.

Note

The macula segmentation is missing for one image centered on the optic disc.

The optic cup boundaries are too fuzzy to be segmented on six images.

Parameters:
Returns:

True if the biomarker is segmented, False otherwise.

Return type:

bool

is_configured()

Check if the dataset is initialized.

Return type:

bool

static load_biomarkers_annotation_infos(path)

Load the MAPLES-DR biomarkers annotation infos file.

Parameters:

path (str) – Path to the MAPLES-DR biomarkers annotation infos file.

Return type:

DataFrame

load_dataset(subset=DatasetSubset.ALL)

Return the MAPLES-DR dataset.

Parameters:

subset (DatasetSubset | str | list[str]) – Subset of the dataset to return. If None, return the whole dataset. Must be either None, "train", "test" or a list of valid image names.

Return type:

Dataset
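
The accepted forms of subset can be summarized with a small resolver. This is a sketch under stated assumptions rather than the library's code: resolve_subset, its train and test arguments, and the choice to raise ValueError on invalid input are all hypothetical.

```python
from typing import Iterable, List, Union


def resolve_subset(
    subset: Union[None, str, Iterable[str]],
    train: List[str],
    test: List[str],
) -> List[str]:
    """Turn None, "train", "test" or a list of names into a list of image names."""
    known = train + test
    if subset is None:
        return list(known)                       # whole dataset
    if isinstance(subset, str):
        if subset == "train":
            return list(train)
        if subset == "test":
            return list(test)
        raise ValueError(f"Unknown subset: {subset!r}")
    names = list(subset)
    unknown = [name for name in names if name not in known]
    if unknown:
        raise ValueError(f"Unknown image names: {unknown}")
    return names                                 # explicit list, order preserved
```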

static load_dataset_record_and_rois(path)

Load the MAPLES-DR dataset record and the ROIs in the MESSIDOR images.

Parameters:

path (str) – Path to the MAPLES-DR dataset folder.

Return type:

Tuple[Dict, Dict]

static load_maples_dr_diagnosis(path)

Load the MAPLES-DR diagnostic file.

Parameters:

path (str) – Path to the MAPLES-DR diagnostic file.

Return type:

DataFrame

property maples_dr_folder: Path

Return the path to the MAPLES-DR dataset folder.

class NotConfiguredError

Exception raised when the dataset loader is not configured.