Image Classification Model Validation
This file evaluates the trained CNN model and records the classification report and balanced accuracy metrics in csv files for further use.
To visualize and analyse the validation result metrics, please view notebook/image_classification/image_classification_visualization.
The data collected is placed within the notebook/image_classification/taxon_image_classification_cache/ directory.
This validation process is structured to run within a Docker container so that it can use a single GPU. Please review the documentation or README for how to run the training and validation processes. For easy access, here is the command to run the model validation:
docker run --gpus all -u $(id -u):$(id -g) -v /path/to/project/root:/app/ -w /app -t ghcr.io/trav-d13/spatiotemporal_wildlife_classification/validate_image:latest
Please note, when using this file to evaluate a flat-classification model, make use of the global_mean_image_prediction() method.
This averages the predictions for the sub-images into a single prediction per image.
The following lines should also be included:
file_paths = test_ds.file_paths
before the dataset prefetching.
accumulated_score, file_true = global_mean_image_prediction(file_paths, preds, true_labels)
after the model predictions.
This produces an averaged and uniform softmax prediction per image. Use accumulated_score as a replacement within the preds = np.argmax(accumulated_score, axis=1) code.
Additionally, change the report and accuracy paths to access global_image_classification_results.csv and global_image_classification_accuracy.csv.
Also change the dataset to the species_train and species_validate directories.
Attributes:
Name | Type | Description |
---|---|---|
img_size | int | The specified image size as input to the EfficientNet-B6 model (528) |
model_name | str | The name of the CNN model to evaluate against a validation set. |
test_path | str | The path to the validation set of images to use to validate the model |
model_path | str | The path to the location of the model. Always |
report_path | os.path | The path to the csv file collecting all model classification reports. Please review notebook |
accuracy_path | os.path | The path to the csv file collecting all model balanced accuracy values. Please review notebook |
add_model_accuracy(y_true, y_pred, taxon_level, taxon_name)
This method adds the model's balanced accuracy metric to the csv file containing all balanced accuracies for further visualization and analysis.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
y_true | list | The list of true labels. | required |
y_pred | list | The list of predicted labels. | required |
taxon_level | str | The taxonomic target level. | required |
taxon_name | str | The standardized name of the taxonomic parent node for which the classifier is built. | required |
Source code in src/models/image/evaluate_taxonomic_model.py
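As a rough illustration of the metric recorded here: balanced accuracy is the mean of the per-class recall values, so rare classes count as much as common ones. The sketch below computes it by hand and appends one row to a csv file. The helper names, the column names, and the example taxon name are assumptions made for illustration and are not taken from evaluate_taxonomic_model.py.

```python
import csv
import os
import tempfile
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, over the classes present in y_true."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


def append_accuracy(csv_path, taxon_level, taxon_name, score):
    """Append one row, writing a header first if the file is new or empty.

    The column names are hypothetical, not the project's actual schema.
    """
    is_new = not os.path.exists(csv_path) or os.path.getsize(csv_path) == 0
    with open(csv_path, "a", newline="") as handle:
        writer = csv.writer(handle)
        if is_new:
            writer.writerow(["taxon_level", "taxon_name", "balanced_accuracy"])
        writer.writerow([taxon_level, taxon_name, score])


# Lion recall is 3/4 and leopard recall is 2/2, so the mean is 0.875.
y_true = ["lion", "lion", "lion", "lion", "leopard", "leopard"]
y_pred = ["lion", "lion", "lion", "leopard", "leopard", "leopard"]
score = balanced_accuracy(y_true, y_pred)

# Write to a throwaway location rather than the real accuracy_path.
demo_path = os.path.join(tempfile.mkdtemp(), "accuracy_demo.csv")
append_accuracy(demo_path, "Species", "panthera", score)
```

Plain accuracy for the same predictions would be 5/6, which shows how the balanced variant weights the smaller class more heavily.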
add_model_report(y_true, y_pred, taxon_level, classes)
This method adds the classification report to the report csv file.
Note, only the rows describing each class are added to the report. The final 3 rows (accuracy, macro avg, weighted avg) are excluded from being written to the report file. If the validation set is missing some of the labels, the written report may still include these last 3 rows, which would then require manual removal.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
y_true | list | The list of true labels. | required |
y_pred | list | The list of predicted labels. | required |
taxon_level | str | The taxonomic target level. | required |
classes | list | The list of classes (alphabetical order) over which the model was trained. | required |
Source code in src/models/image/evaluate_taxonomic_model.py
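The note above describes dropping the summary rows by position. One possible alternative, sketched here as an assumption and not as the method used in this file, is to select rows by class name so that summary rows can never slip through. The dictionary below is a hand-written stand-in shaped like the output of scikit-learn's classification_report with output_dict=True, and its numbers are purely illustrative. The helper name class_rows is hypothetical.

```python
def class_rows(report, classes):
    """Keep only the entries keyed by a known class name, in the order of `classes`."""
    return {name: report[name] for name in classes if name in report}


# Stand-in report: per-class entries plus the three summary entries.
report = {
    "leopard": {"precision": 0.5, "recall": 1.0, "f1-score": 0.67, "support": 1},
    "lion": {"precision": 1.0, "recall": 0.75, "f1-score": 0.86, "support": 4},
    "accuracy": 0.8,
    "macro avg": {"precision": 0.75, "recall": 0.88, "f1-score": 0.76, "support": 5},
    "weighted avg": {"precision": 0.9, "recall": 0.8, "f1-score": 0.82, "support": 5},
}

# "cheetah" was trained over but does not appear in this validation report.
classes = ["cheetah", "leopard", "lion"]
rows = class_rows(report, classes)
```

Here rows contains only the leopard and lion entries, regardless of how many summary rows the report carries.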
generate_test_set()
This method generates the test/validation dataset from the test_path. Note that this is a completely separate dataset from the one used in the training process. The shuffle parameter is set to False in order to maintain the dataset order so that it aligns with the generated labels.
Returns:
Type | Description |
---|---|
tf.data.Dataset | The validation dataset used to produce validation metrics for the produced model. |
Source code in src/models/image/evaluate_taxonomic_model.py
generate_training_set()
This method generates the original training dataset in order to gather all class labels the model was trained over.
The training data is used to ensure that all labels are accounted for when determining prediction labels from the validation set.
Returns:
Type | Description |
---|---|
tf.data.Dataset | The training dataset over which the model was trained. |
Source code in src/models/image/evaluate_taxonomic_model.py
get_image_labels(ds, classes)
Method generates class names for the validation dataset, using the classes trained over.
Due to difficulties importing the file methods within the Docker container, this is a duplicate method from taxonomic_modelling.py.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
ds | tf.data.Dataset | The validation dataset | required |
classes | list | A list of the class labels (alphabetically ordered). This is sourced from the original training dataset | required |
Returns:
Type | Description |
---|---|
list | A list of labels in the provided dataset, in the same order as specified in the dataset. |
Source code in src/models/image/evaluate_taxonomic_model.py
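The mapping from dataset labels to class names can be sketched without TensorFlow. This minimal example assumes the dataset yields (images, labels) batches in which each label is an integer index into the alphabetically ordered class list. With one-hot labels an argmax step would be needed first. The function name labels_from_batches and the stand-in batches are hypothetical.

```python
def labels_from_batches(batches, classes):
    """Flatten batched integer labels and map each index to its class name."""
    names = []
    for _images, label_batch in batches:
        names.extend(classes[int(index)] for index in label_batch)
    return names


# Alphabetical order matters: the index of a label refers to this ordering.
classes = sorted(["lion", "cheetah", "leopard"])

# Stand-in for an unshuffled dataset: two batches, images omitted.
batches = [(None, [2, 0]), (None, [1])]
labels = labels_from_batches(batches, classes)
```

Because the validation dataset is not shuffled, the resulting list stays aligned with the order of the model's predictions.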
global_mean_image_prediction(image_paths, predicted_labels, true_labels)
This method is used within the flat-classification models to aggregate and normalize the sub-image predictions into a single prediction for each image.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
image_paths | list | A list of file paths for each image in the dataset. Read main file documentation for additional info. The code to generate the filenames is | required |
predicted_labels | list | The list of predicted labels for all sub-images | required |
true_labels | list | The list of true labels for all sub-images. They are in the same order. | required |
Returns:
Name | Type | Description |
---|---|---|
mean_predictions | list | The summed, averaged, and normalized predictions for a single image (not a sub-image). |
individual_file_label | list | The list of true labels for each image. Of the same size and order as the mean_predictions. |
Source code in src/models/image/evaluate_taxonomic_model.py
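The aggregation idea can be sketched as follows: sum the softmax vectors of all sub-images belonging to one image, then divide by the number of sub-images. This is a minimal sketch and not the implementation in this file. In particular, the rule for grouping sub-images (a shared file-name prefix before the last underscore) and the names image_key and mean_image_prediction are assumptions made for illustration.

```python
import os


def image_key(path):
    """Hypothetical grouping rule: 'lion/img001_3.jpg' -> 'img001'."""
    stem = os.path.splitext(os.path.basename(path))[0]
    return stem.rsplit("_", 1)[0]


def mean_image_prediction(image_paths, sub_image_scores, true_labels):
    """Average the sub-image softmax vectors that belong to the same image."""
    groups = {}  # key -> [running sum vector, sub-image count, true label]
    for path, scores, label in zip(image_paths, sub_image_scores, true_labels):
        key = image_key(path)
        if key not in groups:
            groups[key] = [[0.0] * len(scores), 0, label]
        entry = groups[key]
        entry[0] = [a + b for a, b in zip(entry[0], scores)]
        entry[1] += 1
    # Dividing by the count keeps each averaged vector summing to 1.
    mean_predictions = [
        [value / count for value in total] for total, count, _ in groups.values()
    ]
    image_labels = [label for _, _, label in groups.values()]
    return mean_predictions, image_labels


# Two sub-images of one lion image and one sub-image of a leopard image.
# Class order is alphabetical: index 0 is leopard, index 1 is lion.
paths = ["lion/img001_0.jpg", "lion/img001_1.jpg", "leopard/img002_0.jpg"]
scores = [[0.2, 0.8], [0.6, 0.4], [0.9, 0.1]]
truth = ["lion", "lion", "leopard"]

mean_predictions, image_labels = mean_image_prediction(paths, scores, truth)

# Pure-Python counterpart of np.argmax(accumulated_score, axis=1).
predicted_index = [row.index(max(row)) for row in mean_predictions]
```

The second lion sub-image on its own would have been classified as leopard, but the averaged vector for the whole image still favours lion.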
set_paths(current_model, path)
This method sets the essential paths to the model and data resources.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
current_model | str | The name of the model to validate. Must adhere to naming conventions of taxonomic modelling. Example: | required |
path | str | The path to the validation directory. This is the path within | required |
Source code in src/models/image/evaluate_taxonomic_model.py
single_model_evaluation(current_model, path, taxon_level, display=False)
This method provides a simple means of validating a trained CNN image model, through simple specification of the model name and dataset path.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
current_model | str | The name of the model to validate. Must adhere to naming conventions of taxonomic modelling. Example: | required |
path | str | The path to the validation directory. This is the path within | required |
taxon_level | str | The taxonomic level at which classification takes place (Family, Genus, Species, Subspecies). This is the level of the taxonomic children being classified. | required |
display | bool | A boolean value indicating whether the Confusion matrix of the model validation should be created and saved. | False |
Source code in src/models/image/evaluate_taxonomic_model.py