Welcome to the FabSpace 2.0 Exploring Sentinel Copernicus Images pilot Task!
The aim of the pilot task is to explore Copernicus Earth Observation data (Sentinel 2 satellite images) in order to estimate the population of an area of interest.
Population estimation is fundamental to providing services to a particular region. For instance, before engaging in any rescue operation or humanitarian action, NGOs need to estimate the local population as accurately as possible. Traditional approaches such as a census are possible but time consuming and expensive. Analysing multispectral satellite data is a quicker and cheaper way to estimate population. Counting the number of buildings can provide a first estimate; however, it may not be enough, since people in different parts of the globe do not live the same way, the population of tourist destinations may vary between summer and winter, and population may vary with ease of access to public services or amenities.
• 25.10.2016: Website is up!
• 28.11.2016: First participants registered
• 14.11.2016: Registration opens (Register here).
• 14.11.2016: Development data release (after having registered, you will receive an email with instructions to get the data).
• 20.03.2017: Test data release.
• 01.05.2017: Deadline for submission of runs by the participants 11:59:59 PM GMT.
• 15.05.2017: Release of processed results by the task organizers.
• 26.05.2017: Deadline for submission of working notes papers by the participants 11:59:59 PM GMT.
• 17.06.2017: Notification of acceptance of the working notes papers.
• 01.07.2017: Camera ready working notes papers.
• 11.-14.09.2017: CLEF 2017, Dublin, Ireland.
In this pilot task, participants will have to estimate the population of different areas in two regions. To achieve this goal, we provide a set of satellite images (Copernicus Sentinel 2). A single satellite image may not cover the whole study region to which it corresponds. Therefore, we provide more than one satellite image for each region of interest. Moreover, the original satellite images may cover an area much larger than the areas of study. We preprocessed the satellite images and clipped them to the bounding box of the areas of interest. The boundaries of the areas of interest are provided as shapefiles. The clipped satellite images are provided together with the metadata of the original images (before clipping).
However, participants are allowed to use any other resource they think might help to reach the highest accuracy. In addition, the FabSpace online platform and personnel will provide support on demand. In the working notes and publications associated with the task, participants will have to describe the resources they used to solve the task and indicate how effectively the Sentinel 2 images helped.
The data set consists of the following geographic information:
- ESRI shapefile: one per region; each region is in turn divided into several areas for which the population has to be estimated. The projected shapefile of the region has the necessary attributes.
- City of Lusaka: The subareas are based on Operational Divisions, a unit defined by Médecins Sans Frontières in 2016. This organization divided the city of Lusaka into 83 units. For this region, the data set consists of: (1) an ESRI shapefile including location and attribute information, (2) Sentinel 2 satellite imagery covering the area in two images, each with 1 to 12 bands, (3) XML metadata associated with the image files.
- West Uganda: In Uganda there are 17 subdivisions. For this region, the data set consists of: (1) an ESRI shapefile including location and attribute information, (2) Sentinel 2 satellite imagery covering the area in five images, each with 1 to 9 bands, (3) XML metadata associated with the image files.
The remote sensing imagery provided comes from the Sentinel 2 platform. The imagery is multispectral and cloud-free, and was downloaded from the Sentinel Data Hub (https://scihub.copernicus.eu/dhus/#/home). As previously described, the images have been clipped to match the bounding box of the areas of interest. The bands of images from this platform have different spatial resolutions: 10 meters for bands B2 (490 nm), B3 (560 nm), B4 (665 nm) and B8 (842 nm); 20 meters for bands B5 (705 nm), B6 (740 nm), B7 (783 nm), B8a (865 nm), B11 (1610 nm) and B12 (2190 nm). For the analysis, participants will probably use the red, green and blue bands, or in some cases the near infrared band, all of which have a resolution of 10 meters.
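The band resolutions listed above can be kept in a small lookup table. The following is only a minimal sketch: the names `BAND_RESOLUTION_M`, `BAND_COLOUR` and `bands_at_resolution` are illustrative and are not part of the distributed data or of any Sentinel 2 tool.

```python
# Spatial resolution in metres for the Sentinel 2 bands listed above.
BAND_RESOLUTION_M = {
    "B2": 10, "B3": 10, "B4": 10, "B8": 10,
    "B5": 20, "B6": 20, "B7": 20, "B8a": 20, "B11": 20, "B12": 20,
}

# Conventional colour names of the 10 m bands.
BAND_COLOUR = {"B2": "blue", "B3": "green", "B4": "red", "B8": "near infrared"}


def bands_at_resolution(resolution_m):
    """Return the band names available at the given resolution, in table order."""
    return [band for band, res in BAND_RESOLUTION_M.items() if res == resolution_m]


if __name__ == "__main__":
    # List the bands most likely to be used for the analysis.
    for band in bands_at_resolution(10):
        print(band, BAND_COLOUR[band])
```

Bands with different resolutions have different pixel grids, so a 20 m band would need resampling before being combined pixel by pixel with a 10 m band.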
Information regarding the original image is provided in XML files. These files contain information like capture time/date, sensor mode, orbit number, the id of quality files, etc. Further information regarding the Sentinel 2 products, as well as file structure can be found in the Sentinel 2 User handbook (https://sentinel.esa.int/documents/247904/685211/Sentinel-2_User_Handbook).
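As a rough illustration of reading such a metadata file, the sketch below looks up an element by its tag name with the Python standard library, ignoring XML namespaces. The tag names and the sample document are assumptions made for the example; the actual element names should be checked against the provided XML files and the handbook.

```python
import xml.etree.ElementTree as ET


def find_text(xml_string, tag_name):
    """Return the text of the first element whose local tag name matches, or None."""
    root = ET.fromstring(xml_string)
    for element in root.iter():
        # Strip an optional "{namespace}" prefix from the tag.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == tag_name:
            return (element.text or "").strip()
    return None


# Hypothetical, heavily shortened metadata document, for illustration only.
SAMPLE_XML = """<n1:Product xmlns:n1="https://example.org/hypothetical-schema">
  <General_Info>
    <PRODUCT_START_TIME>2016-07-01T08:00:00.000Z</PRODUCT_START_TIME>
    <SENSING_ORBIT_NUMBER>35</SENSING_ORBIT_NUMBER>
  </General_Info>
</n1:Product>"""

if __name__ == "__main__":
    print(find_text(SAMPLE_XML, "PRODUCT_START_TIME"))
    print(find_text(SAMPLE_XML, "SENSING_ORBIT_NUMBER"))
```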
The original data source, the Sentinel Data Hub, provides information free of charge through an easy-to-use interface. However, in order to facilitate access to the data, we have identified a number of images with low cloud cover that cover the areas of interest. We have preprocessed these images so that they cover only the areas of interest. The use of the proposed images is not mandatory. The images we offer are stored in zipped files with the following folder structure:
[NAME OF THE STUDY REGION]: Lusaka or Uganda.
[shp]: This folder contains a shapefile with the boundaries of the study areas.
[ID_OF_SATELLITE_IMAGE]: Original id of the image as in the Sentinel Data Hub.
[bands]: This folder contains the bands of the image. Each band is a GeoTIFF file and corresponds to a certain range of the electromagnetic spectrum captured by the sensor.
[xml]: This folder contains the XML files that hold information about the images. The information applies to the original source (before image clipping). Using the information in these files, a user can obtain the original dataset.
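Once a zip file has been extracted, the band files can be enumerated with a few lines of Python. This is a sketch under an assumed nesting: the region folder is taken to contain `shp` and one folder per image id, and each image folder is taken to contain `bands` and `xml`. The function name `list_band_files` is illustrative.

```python
from pathlib import Path


def list_band_files(region_dir):
    """Map each image id under region_dir to its sorted GeoTIFF band file names."""
    region = Path(region_dir)
    bands_by_image = {}
    # Assumed layout: <region>/<image id>/bands/<band>.tif
    for bands_dir in sorted(region.glob("*/bands")):
        tif_names = sorted(
            p.name for p in bands_dir.iterdir()
            if p.suffix.lower() in {".tif", ".tiff"}
        )
        bands_by_image[bands_dir.parent.name] = tif_names
    return bands_by_image


if __name__ == "__main__":
    # "Lusaka" is the extracted region folder in the current directory.
    for image_id, names in list_band_files("Lusaka").items():
        print(image_id, names)
```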
In this challenge, we are interested in quantifying the population in two geographically separated regions of interest (Lusaka and West Uganda). To achieve this, participants will identify relevant datasets:
- Remote Sensing Imagery suggested by the organizers of the challenge.
- Other remote sensing imagery available to the participants.
- Datasets of a nature other than remote sensing, available to the participants.
Registering for the task and accessing the data
Please register by following the instructions found on the main ImageCLEF 2017 webpage.
Following the approval of registration for the task, the participants will be given access rights to download the data files.
The participant runs should be sent through the ImageCLEF system. Participants will be allowed to submit up to 10 runs.
A run consists of a CSV file. Each row of this file will contain:
- The id of the operational zone,
- The estimated population (number of individuals),
- The maximum estimated population,
- The minimum estimated population,
- The estimated number of dwellings (optional, keep empty if not estimated)
- The maximum estimated number of dwellings (optional, keep empty if not estimated),
- The minimum estimated number of dwellings (optional, keep empty if not estimated),
- The run name and detail calculation process
Each run must be submitted in a separate file that contains the estimations for both regions. The result for each area should be on a single line (thus each line ends with a carriage return). A sample of the run "Experiment_001" is provided.
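A run file in this layout could be produced with Python's `csv` module, as in the sketch below. The column order follows the list above. The column names, the choice to omit a header row by default, and the zone id in the example are assumptions; the provided sample run remains the reference for the exact format.

```python
import csv
import io

# Column order follows the list above; the names themselves are assumptions.
COLUMNS = [
    "zone_id", "population", "population_max", "population_min",
    "dwellings", "dwellings_max", "dwellings_min", "run_details",
]


def format_run(rows, include_header=False):
    """Serialise estimates as CSV text, one area per line.

    Missing or None values (e.g. the optional dwelling fields) become empty fields.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer)  # default line terminator is "\r\n"
    if include_header:
        writer.writerow(COLUMNS)
    for row in rows:
        writer.writerow(["" if row.get(col) is None else row[col] for col in COLUMNS])
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical zone id and values, for illustration only.
    example = [{
        "zone_id": "Z1", "population": 1200,
        "population_max": 1500, "population_min": 1000,
        "run_details": "Experiment_001",
    }]
    with open("Experiment_001.csv", "w", newline="") as handle:
        handle.write(format_run(example))
```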
The evaluation will be based on the comparison between the estimations and the ground truth. For the city of Lusaka, the ground truth comes with a categorical evaluation of the quality of the population estimation: Good (23 of the 83 areas), Acceptable (37), Doubts (9), High doubts (6) and Unknown (8). For West Uganda, the ground truth corresponds to estimations based on a combination of Volunteered Geographic Information (VGI) produced from Bing imagery (2012) and additional ground work. Both have been provided by NGOs.
We will evaluate prediction accuracy using several measures:
- The correlation (Pearson) between the participant’s run and ground truth,
- The deviations (sum of deltas) between the participant’s run and ground truth.
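The two measures can be sketched in plain Python as follows. The Pearson coefficient is the standard one. Reading "sum of deltas" as the sum of absolute differences is an assumption, since the exact definition used by the organizers is not given here.

```python
import math


def pearson(estimates, truth):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(estimates)
    if n != len(truth) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_e = sum(estimates) / n
    mean_t = sum(truth) / n
    cov = sum((e - mean_e) * (t - mean_t) for e, t in zip(estimates, truth))
    var_e = sum((e - mean_e) ** 2 for e in estimates)
    var_t = sum((t - mean_t) ** 2 for t in truth)
    if var_e == 0 or var_t == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_e * var_t)


def sum_of_deltas(estimates, truth):
    """Sum of absolute differences (the absolute value is an assumption)."""
    if len(estimates) != len(truth):
        raise ValueError("sequences must have equal length")
    return sum(abs(e - t) for e, t in zip(estimates, truth))


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    run = [100, 250, 400]
    ground_truth = [120, 200, 410]
    print(pearson(run, ground_truth), sum_of_deltas(run, ground_truth))
```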
The ground truth has two origins. Moreover, for one region (Lusaka), we have a qualitative evaluation of the ground-truth estimates. To provide a better assessment of system accuracy, we will also split the results into three subsets: 1) West Uganda; 2) Lusaka operational zones with an evaluation of Good (23) or Acceptable (37); and 3) Lusaka operational zones whose estimations have been qualified as Doubts (9), High doubts (6) or Unknown (8).
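The split can be written down as a small function, which also makes the subset sizes explicit: 23 + 37 = 60 Lusaka zones in the second subset and 9 + 6 + 8 = 23 in the third. This is a sketch only; the function names and the string labels used for regions and categories are illustrative.

```python
# Number of Lusaka areas per ground-truth quality category, as stated above.
LUSAKA_QUALITY_COUNTS = {
    "Good": 23, "Acceptable": 37, "Doubts": 9, "High doubts": 6, "Unknown": 8,
}


def subset_for(region, quality=None):
    """Return 1, 2 or 3 according to the three evaluation subsets."""
    if region == "West Uganda":
        return 1
    if region == "Lusaka":
        if quality in {"Good", "Acceptable"}:
            return 2
        if quality in {"Doubts", "High doubts", "Unknown"}:
            return 3
    raise ValueError(f"unknown region/quality: {region!r}, {quality!r}")


def lusaka_subset_sizes():
    """Total number of Lusaka areas falling into subsets 2 and 3."""
    sizes = {}
    for quality, count in LUSAKA_QUALITY_COUNTS.items():
        key = subset_for("Lusaka", quality)
        sizes[key] = sizes.get(key, 0) + count
    return sizes


if __name__ == "__main__":
    print(lusaka_subset_sizes())
```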
Submitting a working notes paper to CLEF
Upon completion of the task, participating teams are expected to present their systems in a working notes paper, regardless of their results. Keep in mind that the main goal of the lab is not to win the benchmark but to compare techniques on the same data, so that everyone can learn from the results. Authors are invited to submit using the LNCS proceedings format.
The CLEF 2017 working notes will be published in the CEUR-WS.org proceedings, facilitating the indexing by DBLP. According to the CEUR-WS policies, a light review of the working notes will be conducted by the task organizers to ensure quality.
Working notes will have to be submitted before 26th May 2017 11:59 pm - midnight - Central European Summer Time, through the EasyChair submission system. The working notes papers are technical reports written in English that describe the participating systems and the experiments conducted. To avoid redundancy, the papers should *not* include a detailed description of the actual task, data set and experimentation protocol. Instead, the papers are required to cite both the general ImageCLEF overview paper and the corresponding FabSpace 2.0 Exploring Sentinel Copernicus Images pilot task overview paper, and to present the official results returned by the organizers. BibTeX references will be available soon. The paper should provide at a minimum the following information:
4. Email addresses of all authors
5. The body of the text. This should contain information on:
- tasks performed
- main objectives of experiments
- approach(es) used and progress beyond state-of-the-art
- resources employed
- results obtained
- analysis of the results
- perspectives for future work
The paper should not exceed 12 pages, and further instructions on how to write and submit your working notes will be available soon on this page.
Helpful tools and resources
- W. Shuo-sheng, Xiaomin Q., Wang L. Population Estimation Methods in GIS and Remote Sensing: A Review. GIScience and Remote Sensing 1(42) (2005) 80-96.
- J. Patino, Duque, J.C. A review of regional science applications of satellite remote sensing in urban areas. Computers, Environment and Urban Systems 37 (2013) 1-17.
- M. Langford, Unwin, D.J. Generating and mapping population density surfaces within a geographical information system. The Cartographic Journal 31(1) (1994).
- M. Alahmadi, P. Atkinson, D. Martin. A comparison of Small-Area Population estimation techniques using built-area and height data, Riyadh, Saudi Arabia. IEEE Geoscience and Remote Sensing Society. 9(5) (2016).
Organizers
- Helbert Arenas, helbert.arenas(at)irit.fr, Institut de Recherche en Informatique de Toulouse, UMR5505 CNRS, Université de Toulouse, France
- Bayzidul Islam, bislam(at)psg.tu-darmstadt.de, Technische Universität Darmstadt, Germany
- Josiane Mothe, josiane.mothe(at)irit.fr, Institut de Recherche en Informatique de Toulouse, UMR5505 CNRS, ESPE, Université de Toulouse, France (contact)
- Dimitrios Soudris, dimitrios.soudris(at)microlab.ntua.gr, Microprocessors and Digital Systems Laboratory of Institute of Communication and Computer Systems, Athens, Greece