Datasets
- class Dataset
A set of samples from the MAPLES-DR dataset.
A Dataset is a utility class for accessing and exporting samples from the MAPLES-DR dataset.
It is equivalent to a list of samples, each sample being stored as a dictionary. See Field for the list of available fields.
- __getitem__(idx)
Get a sample from the dataset.
The sample is returned as a DataSample.
- annotations_infos()
Get the annotations information of the dataset.
- Returns:
The annotations information of the dataset.
- Return type:
pd.DataFrame
- property data: DataFrame
The data of the dataset.
A dataframe containing the information of each sample. It has the following columns:
index: the name of the sample.
fundus: the path to the fundus image.
BiomarkerField (aggregated fields accepted): the path to the biomarker masks.
BiomarkersAnnotationTasks_BiomarkersAnnotationInfos: the additional annotation information.
dr: the consensus DR grade.
me: the consensus ME grade.
dr_{A|B|C}: the DR grade given by one retinologist.
me_{A|B|C}: the ME grade given by one retinologist.
dr_{A|B|C}_comment: comments from the retinologist when grading the DR diagnosis.
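The `{A|B|C}` notation stands for one column per retinologist. A minimal sketch of that expansion, using only the column patterns listed for the data property; the helper name `expand_grader_columns` and the `{X}` placeholder are hypothetical and not part of maples_dr:

```python
from itertools import product

# Column patterns taken from the documented list; "{X}" marks the retinologist.
PATTERNS = ["dr_{X}", "me_{X}", "dr_{X}_comment"]
RETINOLOGISTS = ["A", "B", "C"]


def expand_grader_columns(patterns=PATTERNS, graders=RETINOLOGISTS):
    """Return one concrete column name per (pattern, retinologist) pair."""
    return [p.replace("{X}", g) for p, g in product(patterns, graders)]


columns = expand_grader_columns()
print(columns)
```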
- diagnosis(pathology=None)
Get the diagnosis of the dataset.
- Parameters:
pathology (Pathology, optional) –
The pathology to get the diagnosis for.
If None, get the diagnosis for both pathologies.
- Returns:
The diagnosis of the dataset.
- Return type:
pd.Series
- export(path, fields=None, fundus_as_jpg=True, missing_as_blank=False, pre_annotation=False, n_workers=None)
Export the dataset to a folder.
- Parameters:
path (str | Path) – Path of the folder where to export the dataset.
fields (FundusField | BiomarkerField | List[FundusField | BiomarkerField] | Dict[FundusField | BiomarkerField, str] | None) –
The fields to be exported.
If None, export the whole dataset.
If fields is a string or list, export only the given fields.
If fields is a dictionary, export the fields given by the keys and rename their folder to their corresponding dictionary values.
fundus_as_jpg (bool) – If True, save the fundus images (raw and preprocessed) as JPEG images. This format will drastically reduce load times but may introduce compression artifacts.
pre_annotation (bool) – If set to True, write the pre-annotation biomarkers instead of the reviewed ones.
missing_as_blank (bool) – If set to True, missing biomarkers will be exported as blank images.
n_workers –
The number of workers to use for the export.
If None, use the number of CPUs available.
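The three accepted shapes of the fields argument can be read as one mapping from field name to output folder. The sketch below illustrates that reading only; `normalize_fields` and `ALL_FIELDS` are hypothetical names, and naming each default folder after its field is an assumption, not confirmed behaviour of the library.

```python
from collections.abc import Mapping

# Illustrative subset of field names; the real list is given by Field.
ALL_FIELDS = ["fundus", "vessels", "exudates"]


def normalize_fields(fields, all_fields=ALL_FIELDS):
    """Map each field to export onto the name of its output folder."""
    if fields is None:
        # None: export everything (assumed: one folder named after each field).
        return {f: f for f in all_fields}
    if isinstance(fields, str):
        # A single field name.
        return {fields: fields}
    if isinstance(fields, Mapping):
        # Keys select the fields, values rename their folders.
        return dict(fields)
    # Any other iterable is treated as a list of field names.
    return {f: f for f in fields}


print(normalize_fields({"vessels": "vessel_masks"}))
```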
- get_sample_infos(idx)
Get the information of a sample.
- Parameters:
idx (str | int) – Index of the sample. Can be an integer or the name of the sample (e.g. “20051116_44816_0400_PP”).
- Returns:
The information of the sample.
- Return type:
pd.Series
- Raises:
IndexError – If the index is out of range.
KeyError – If the image name is unknown.
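The dual indexing described for get_sample_infos (integer position or sample name, with a distinct exception for each failure) can be sketched with a small stand-in class. `ToySamples` and the name "sample_b" are hypothetical; this is not the library's implementation, and negative positions are deliberately left out.

```python
class ToySamples:
    """Hypothetical stand-in: samples addressable by position or by name."""

    def __init__(self, names):
        self._names = list(names)

    def resolve(self, idx):
        """Return the sample name for an int position or a str name."""
        if isinstance(idx, int):
            # Positions outside [0, len) raise IndexError in this sketch.
            if not 0 <= idx < len(self._names):
                raise IndexError(f"Index {idx} is out of range.")
            return self._names[idx]
        if idx not in self._names:
            raise KeyError(f"Unknown image name: {idx!r}.")
        return idx


samples = ToySamples(["20051116_44816_0400_PP", "sample_b"])
print(samples.resolve(0))
print(samples.resolve("sample_b"))
```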
- keys()
Get the names of the samples in the dataset.
- Returns:
The names of the samples.
- Return type:
List[str]
- subset(*arg, start=None, end=None, step=None)
Get a subset of the dataset.
- class DataSample
A sample from the MAPLES-DR dataset.
A sample is a dictionary containing the information of a single sample from the dataset.
- export(path, fields=None, fundus_as_jpg=True, *, pre_annotation=False, missing_as_blank=False)
Export the sample to a folder.
- Parameters:
path (str | Path) – Path of the folder where to export the sample.
fields (FundusField | BiomarkerField | List[FundusField | BiomarkerField] | Dict[FundusField | BiomarkerField, str] | None) –
The fields to be exported.
If None, export all available fields.
If fields is a string or list, export only the given fields.
If fields is a dictionary, export the fields given by the keys and rename their folder to their corresponding dictionary values.
fundus_as_jpg (bool) – If True, save the fundus images (raw and preprocessed) as JPEG images. This format will drastically reduce load times but may introduce compression artifacts.
pre_annotation (bool) – If set to True, write the pre-annotation biomarkers instead of the reviewed ones.
missing_as_blank – If set to True, missing biomarkers will be exported as blank images.
- Returns:
A mapping of the exported fields to their paths (or None if the field was not exported).
- Return type:
Mapping[Field, Optional[str]]
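Because the returned mapping uses None for fields that were not exported, a caller may want to separate the two cases. A minimal sketch on a made-up result; the helper `split_export_result` and the example paths are hypothetical and only illustrate the shape of the mapping:

```python
def split_export_result(result):
    """Separate the fields that were written from those that were skipped."""
    written = {field: path for field, path in result.items() if path is not None}
    skipped = [field for field, path in result.items() if path is None]
    return written, skipped


# Made-up result for illustration; real paths depend on the export folder.
example = {"fundus": "out/fundus/sample.jpg", "drusens": None}
written, skipped = split_export_result(example)
print(written, skipped)
```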
- keys()
Get a set-like object providing a view on the keys of the sample.
- read_biomarker(biomarkers, image_format=None, resize=None, pre_annotation=False, no_cache=False)
Read a biomarker from the sample.
- Parameters:
biomarkers (BiomarkerField | str | Iterable[BiomarkerField | str]) –
Name of the biomarker(s) to read. Possible values are: 'opticCup', 'opticDisc', 'macula', 'vessels', 'brightLesions', 'cottonWoolSpots', 'drusens', 'exudates', 'brightUncertains', 'redLesions', 'hemorrhages', 'microaneurysms', 'neovascularization', 'redUncertains' (see BiomarkerField for more details).
If multiple biomarkers are given, the corresponding masks will be merged.
image_format (ImageFormat | None) –
Format of the image to return. Possible values are: 'PIL', 'BGR' or 'RGB' (see ImageFormat for more details).
If None (by default), use the format defined in the configuration.
resize –
Resize the image to the given size.
If resize is an int, crop the image to a square ROI and resize it to the shape (resize, resize);
If True, keep the original MAPLES-DR resolution of 1500x1500 px;
If False, use the original MESSIDOR resolution if the MESSIDOR path is configured, otherwise fall back to the original MAPLES-DR resolution;
If None (by default), use the size defined in the configuration.
pre_annotation (bool) –
If set to True, read the pre-annotation biomarkers instead of the final ones.
Warning
Only hemorrhages, microaneurysms, exudates and vessels have pre-annotations.
no_cache (bool) – If set to True, the cache will not be used to read the biomarker, regardless of the configuration.
- Returns:
The biomarker mask in the specified format.
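When several biomarkers are passed to read_biomarker, their masks are merged. The sketch below assumes that merging means a pixelwise union, in line with how the aggregated fields are described as unions; it works on plain nested lists rather than on the image types the library returns, and `merge_masks` is a hypothetical helper.

```python
def merge_masks(masks):
    """Pixelwise union of equally sized binary masks (lists of rows)."""
    masks = list(masks)
    if not masks:
        raise ValueError("At least one mask is required.")
    merged = [[False] * len(row) for row in masks[0]]
    for mask in masks:
        for y, row in enumerate(mask):
            for x, value in enumerate(row):
                # A pixel is set if it is set in any of the input masks.
                merged[y][x] = merged[y][x] or bool(value)
    return merged


# Toy 2x3 masks standing in for two biomarkers.
exudates = [[1, 0, 0], [0, 0, 0]]
drusens = [[0, 0, 0], [0, 1, 0]]
merged = merge_masks([exudates, drusens])
print(merged)
```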
- read_field(field, image_format=None, resize=None, pre_annotation=False)
Read one field of the sample.
This function is similar to __getitem__ but provides more options to format the result (resize, image format…).
- Parameters:
field (Field | str) –
Any field from:
BiomarkerField: a biomarker name, possible values are: 'opticCup', 'opticDisc', 'macula', 'vessels', 'brightLesions', 'cottonWoolSpots', 'drusens', 'exudates', 'brightUncertains', 'redLesions', 'hemorrhages', 'microaneurysms', 'neovascularization', 'redUncertains'.
DiagnosisField: a diagnosis name, possible values are: 'dr', 'me'.
FundusField: a fundus field, possible values are: 'fundus', 'rawFundus', 'mask'.
image_format (ImageFormat | None) –
Format of the image to return.
If None (by default), use the format defined in the configuration.
resize –
Resize the image to the given size.
If resize is an int, crop the image to a square ROI and resize it to the shape (resize, resize);
If True, keep the original MAPLES-DR resolution of 1500x1500 px;
If False, use the original MESSIDOR resolution if the MESSIDOR path is configured, otherwise fall back to the original MAPLES-DR resolution;
If None (by default), use the size defined in the configuration.
pre_annotation (bool) –
If set to True, read the pre-annotation biomarkers instead of the final ones.
Warning
Only hemorrhages, microaneurysms, exudates and vessels have pre-annotations.
- Returns:
The field under the format specified.
- Return type:
Image.Image | np.ndarray | str
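The resize rules listed for read_field can be summarised as a small decision function. This is a sketch of the documented rules only: `resolve_size`, `config_resize` and `messidor_size` are hypothetical names, the configured value is assumed not to be None, and the square ROI crop applied for integer sizes is not modelled.

```python
MAPLES_DR_SIZE = (1500, 1500)  # resolution stated in the documentation


def resolve_size(resize, config_resize, messidor_size=None):
    """Return the target (height, width) for a given resize option.

    messidor_size is the original image size when the MESSIDOR path is
    configured, and None otherwise.
    """
    if resize is None:
        # None: fall back to the value defined in the configuration.
        resize = config_resize
    # Identity checks keep True/False distinct from the integers 1 and 0.
    if resize is True:
        return MAPLES_DR_SIZE
    if resize is False:
        return messidor_size if messidor_size is not None else MAPLES_DR_SIZE
    # An int requests a square output of shape (resize, resize).
    return (resize, resize)


print(resolve_size(512, config_resize=True))
print(resolve_size(False, config_resize=True, messidor_size=(1488, 2240)))
```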
- read_fundus(preprocess=None, image_format=None, resize=None, no_cache=False)
Read the fundus image of the sample.
- Parameters:
preprocess (Preprocessing | str | bool | None) –
Preprocessing to apply to the image.
If a Preprocessing (or an equivalent string), the image is preprocessed with the given preprocessing;
If False, the image is not preprocessed;
If None (by default) or True, use the preprocessing defined in the configuration.
image_format (ImageFormat | None) –
Format of the image to return. Possible values are: 'PIL', 'BGR' or 'RGB' (see ImageFormat for more details).
If None (by default), use the format defined in the configuration.
resize –
Resize the image to the given size.
If resize is an int, crop the image to a square ROI and resize it to the shape (resize, resize);
If True, use the original MAPLES-DR resolution of 1500x1500 px;
If False, keep the original MESSIDOR resolution;
If None (by default), use the size defined in the configuration.
no_cache (bool) – If set to True, the cache will not be used to read the fundus image, regardless of the configuration.
- Returns:
The fundus image in the specified format.
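The preprocess argument of read_fundus follows a similar fallback logic: an explicit value wins, False disables preprocessing, and None or True defer to the configuration. A minimal sketch of those rules; `resolve_preprocess` is a hypothetical helper and the preprocessing names used here are placeholders, not values taken from the library.

```python
def resolve_preprocess(preprocess, configured):
    """Return the preprocessing name to apply, or None for no preprocessing."""
    if preprocess is None or preprocess is True:
        # None (default) or True: use the configured preprocessing.
        return configured
    if preprocess is False:
        # False: leave the image as it is.
        return None
    # Anything else is taken as an explicit preprocessing name.
    return str(preprocess)


# Placeholder names, for illustration only.
print(resolve_preprocess(None, configured="configured_preprocessing"))
print(resolve_preprocess("explicit_preprocessing", configured="configured_preprocessing"))
```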
- read_multiple_biomarkers(biomarkers, image_format=None, pre_annotation=False, resize=None)
Read multiple biomarkers at once, assigning a class to a biomarker or a group of them.
- Parameters:
biomarkers (Mapping[int, BiomarkerField | str | List[BiomarkerField | str]]) – Name of the biomarker(s) to read. Possible values are: 'opticCup', 'opticDisc', 'macula', 'vessels', 'brightLesions', 'cottonWoolSpots', 'drusens', 'exudates', 'brightUncertains', 'redLesions', 'hemorrhages', 'microaneurysms', 'neovascularization', 'redUncertains' (see BiomarkerField for more details).
image_format (ImageFormat | None) –
Format of the image to return. Possible values are: 'PIL', 'BGR' or 'RGB' (see ImageFormat for more details).
If None (by default), use the format defined in the configuration.
pre_annotation (bool) –
If set to True, read the pre-annotation biomarkers instead of the final ones.
Warning
Only hemorrhages, microaneurysms, exudates and vessels have pre-annotations.
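read_multiple_biomarkers assigns an integer class to each biomarker or group of biomarkers. The sketch below shows one way such a label map could be assembled from binary masks. It assumes that 0 denotes the background and that, where masks overlap, the class listed later wins; neither assumption is confirmed by the documentation, and `build_label_map` is a hypothetical helper working on nested lists.

```python
def build_label_map(class_masks, height, width):
    """Combine binary masks into one label map; 0 marks the background."""
    labels = [[0] * width for _ in range(height)]
    # Dictionaries keep insertion order, so later classes overwrite earlier ones.
    for class_id, mask in class_masks.items():
        for y in range(height):
            for x in range(width):
                if mask[y][x]:
                    labels[y][x] = class_id
    return labels


# Toy 2x2 masks standing in for two biomarker groups.
vessels = [[1, 1], [0, 0]]
lesions = [[0, 1], [0, 1]]
label_map = build_label_map({1: vessels, 2: lesions}, height=2, width=2)
print(label_map)
```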
- read_roi_mask(image_format=None, resize=None)
Read the region of interest of the fundus image.
- Parameters:
image_format (ImageFormat | None) –
Format of the image to return. Possible values are: 'PIL', 'BGR' or 'RGB' (see ImageFormat for more details).
If None (by default), use the format defined in the configuration.
resize –
Resize the image to the given size.
If resize is an int, crop the image to a square ROI and resize it to the shape (resize, resize);
If True, use the original MAPLES-DR resolution of 1500x1500 px;
If False, keep the original MESSIDOR resolution;
If None (by default), use the size defined in the configuration.
- Returns:
The region of interest of the fundus image.
- Return type:
np.ndarray
Datasets Fields
- Field
The name of a valid field in MAPLES-DR.
This type alias is a union of DiagnosisField, BiomarkerField, and FundusField.
- class DiagnosisField
String Enum of MAPLES-DR diagnosis fields.
- DR = 'dr'
The Diabetic Retinopathy grade.
R0: No DR.
R1: Mild Non-Proliferative DR.
R2: Moderate Non-Proliferative DR.
R3: Severe Non-Proliferative DR.
R4A: Proliferative DR.
R4S: Treated and Stable Proliferative DR.
R6: Insufficient quality to grade.
- ME = 'me'
The Macular Edema grade.
M0: No ME.
M1: ME without center involvement.
M2: ME with center involvement.
M6: Insufficient quality to grade.
- __new__(value)
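A string Enum lets members be used interchangeably with their plain string values, which is why 'dr' and DiagnosisField.DR are both accepted as field names. The stand-in below illustrates that behaviour with the standard library; `DiagnosisFieldSketch` is a hypothetical class, not the library's definition, and the grade tables simply restate the DR and ME lists.

```python
from enum import Enum


class DiagnosisFieldSketch(str, Enum):
    """Stand-in for a string Enum; members compare equal to their values."""

    DR = "dr"
    ME = "me"


# Grade labels copied from the DR and ME grade lists.
DR_GRADES = {
    "R0": "No DR",
    "R1": "Mild Non-Proliferative DR",
    "R2": "Moderate Non-Proliferative DR",
    "R3": "Severe Non-Proliferative DR",
    "R4A": "Proliferative DR",
    "R4S": "Treated and Stable Proliferative DR",
    "R6": "Insufficient quality to grade",
}
ME_GRADES = {
    "M0": "No ME",
    "M1": "ME without center involvement",
    "M2": "ME with center involvement",
    "M6": "Insufficient quality to grade",
}

# A plain string can be converted to the member, and the member equals the string.
field = DiagnosisFieldSketch("dr")
print(field is DiagnosisFieldSketch.DR, field == "dr")
```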
- ImageField
The name of a valid image field in MAPLES-DR.
This type alias is a union of BiomarkerField and FundusField.
- class FundusField
String Enum of MAPLES-DR fields associated with fundus images.
Warning
Path to MESSIDOR fundus images must be configured to use these fields! See maples_dr.configure() for more information.
- FUNDUS = 'fundus'
The preprocessed fundus image (or the original fundus image if no preprocessing is applied).
- RAW_FUNDUS = 'rawFundus'
The raw fundus image. (If no preprocessing is applied, this is the same as FundusField.FUNDUS.)
- MASK = 'mask'
The mask of the fundus image.
- __new__(value)
- class BiomarkerField
String Enum of MAPLES-DR biomarkers fields.
- OPTIC_CUP = 'opticCup'
The optic cup mask.
- OPTIC_DISC = 'opticDisc'
The optic disc mask.
- MACULA = 'macula'
The macula mask.
- VESSELS = 'vessels'
The vessels mask.
- BRIGHT_LESIONS = 'brightLesions'
Union of all the bright lesions masks (CWS, exudates, drusens and uncertain).
- COTTON_WOOL_SPOTS = 'cottonWoolSpots'
The cotton wool spots mask.
- DRUSENS = 'drusens'
The drusens mask.
- EXUDATES = 'exudates'
The exudates mask.
- BRIGHT_UNCERTAINS = 'brightUncertains'
The mask of bright lesions with uncertain type (either CWS, exudates or drusens).
- RED_LESIONS = 'redLesions'
Union of all the red lesions masks (microaneurysms, hemorrhages and uncertain).
- HEMORRHAGES = 'hemorrhages'
The hemorrhages mask.
- MICROANEURYSMS = 'microaneurysms'
The microaneurysms mask.
- NEOVASCULARIZATION = 'neovascularization'
The neovascularization mask.
- RED_UNCERTAINS = 'redUncertains'
The mask of red lesions with uncertain type (either microaneurysms or hemorrhages).
- __new__(value)
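Two of the biomarker fields, brightLesions and redLesions, are aggregates defined as unions of other fields. The sketch below restates those definitions as a lookup table and expands a list of names into elementary biomarkers. `AGGREGATED_BIOMARKERS` and `expand_biomarkers` are hypothetical names, and the component lists are read from the field descriptions rather than from the library's source.

```python
# Components of each aggregated field, as given in the field descriptions.
AGGREGATED_BIOMARKERS = {
    "brightLesions": ["cottonWoolSpots", "exudates", "drusens", "brightUncertains"],
    "redLesions": ["microaneurysms", "hemorrhages", "redUncertains"],
}


def expand_biomarkers(names):
    """Replace aggregated fields by their components, without duplicates."""
    expanded = []
    for name in names:
        # Non-aggregated names expand to themselves.
        for component in AGGREGATED_BIOMARKERS.get(name, [name]):
            if component not in expanded:
                expanded.append(component)
    return expanded


print(expand_biomarkers(["vessels", "redLesions", "hemorrhages"]))
```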