ExpertLifeCLEF 2018


Usage scenario

Automated identification of plants and animals has improved considerably in the last few years. In the scope of LifeCLEF 2017 in particular, we measured impressive identification performance achieved thanks to recent deep learning models (e.g. up to 90% classification accuracy over 10K species). This raises the question of how far automated systems are from human expertise and of whether there is an upper bound that cannot be exceeded. A picture contains only partial information about the observed plant, and it is often not sufficient to determine the right species with certainty. For instance, a decisive organ such as the flower or the fruit might not be visible at the time the plant was observed. Some of the discriminant patterns might also be very hard or unlikely to be observed in a picture, such as the presence of hairs or latex, or the morphology of the root. As a consequence, even the best experts can be confused and/or disagree with one another when attempting to identify a plant from a set of pictures. Similar issues arise for most living organisms, including fishes, birds, insects, etc. Quantifying this intrinsic data uncertainty and comparing it to the performance of the best automated systems is of high interest for both computer scientists and expert naturalists.

Data Collection

To conduct a valuable experts vs. machines experiment, we collected image-based identifications from the best experts in the plant and the fish domains. To this end, we created sets of observations that had been identified in the field by other experts (in order to have a near-perfect gold standard). These pictures will be embedded in a much larger test set that will have to be processed by the participating systems. As for training data, the datasets of the previous LifeCLEF campaigns will be made available to the participants and might be extended with new content. The training set will contain between 1M and 2M pictures.

Task overview

The goal of the task will be to return the most likely species for each observation of the test set. The small fraction of the test set identified by the pool of experts will then be used to conduct the experts vs. machines evaluation.


The two main evaluation metrics will be the top-1 and top-3 accuracy to allow a fair comparison with the human experts.
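
As an illustration of the two metrics, here is a minimal sketch of how top-1 and top-3 accuracy could be computed. It is not the official evaluation script: the function name `top_k_accuracy`, the dictionary-based input layout, the observation identifiers, and the sample species are all assumptions made for this example. Each observation is assumed to come with a list of predicted species ranked from most to least likely.

```python
from typing import Dict, Sequence


def top_k_accuracy(
    predictions: Dict[str, Sequence[str]],
    ground_truth: Dict[str, str],
    k: int,
) -> float:
    """Fraction of observations whose true species is among the k best-ranked predictions.

    predictions: observation id -> species ranked from most to least likely (assumed layout).
    ground_truth: observation id -> true species.
    Observations without any prediction count as misses.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    if not ground_truth:
        raise ValueError("ground_truth must not be empty")

    hits = 0
    for obs_id, true_species in ground_truth.items():
        ranked = predictions.get(obs_id, ())
        # Only the first k ranked species are considered.
        if true_species in ranked[:k]:
            hits += 1
    return hits / len(ground_truth)


# Hypothetical toy data, for illustration only.
ground_truth = {
    "obs1": "Quercus robur",
    "obs2": "Acer campestre",
    "obs3": "Fagus sylvatica",
    "obs4": "Betula pendula",
}
predictions = {
    "obs1": ["Quercus robur", "Quercus petraea", "Acer campestre"],
    "obs2": ["Acer platanoides", "Acer campestre", "Acer pseudoplatanus"],
    "obs3": ["Fagus sylvatica", "Carpinus betulus", "Quercus robur"],
    "obs4": ["Populus tremula", "Alnus glutinosa", "Corylus avellana"],
}

top1 = top_k_accuracy(predictions, ground_truth, k=1)
top3 = top_k_accuracy(predictions, ground_truth, k=3)
print(f"top-1 accuracy: {top1:.2f}")
print(f"top-3 accuracy: {top3:.2f}")
```

Applying the same function to the answers of a human expert (a ranked list of up to three candidate species per observation) and to the output of an automated system gives directly comparable scores on the expert-annotated subset.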

Registration and data access

Please refer to the general LifeCLEF registration instructions.
