Introduction
The photo retrieval task of ImageCLEF 2008 will take a different approach to evaluation by
studying image clustering. A good image search engine ensures that duplicate or
near-duplicate documents retrieved in response to a query are hidden from the
user. Ideally, the top results of a ranked list will
contain diverse items representing the different sub-topics within the results.
Providing this functionality is particularly important when a user types
in a query that is either poorly specified or ambiguous, a common situation
in image search. Given such a query, a search engine that retrieves a diverse
yet relevant set of images at the top of a ranked list
is more likely to satisfy its users [1,2].
Another reason to promote diversity is that different people often type in the
same query but wish to see different results. If a search engine knows
nothing about the user entering the query, a good strategy is to
produce results that are both diverse (i.e.
representative of all sub-topics) and relevant; in effect, the engine
spreads its bets on what the user might want to retrieve.
Promoting this kind of diversity is the focus of this year's ImageCLEFphoto task.
The Task - Promote Diversity
Participants will run each provided topic on their image search system and produce a
ranking that, in the top 20 results, contains as many relevant images as possible
representing the different sub-topics within the results.
The definition of what constitutes diversity will vary across the topics,
but each topic will clearly indicate (using a new topic tag, "cluster")
the clustering criterion the evaluators will use.
For each topic in the ImageCLEFphoto set,
relevant images will be manually clustered into sub-topics and the relevance
judgements will be augmented to indicate which
cluster an image belongs to. Relevance assessors will be instructed to look for
simple clusters based on the form of a topic. For example, if a topic asks for
images of beaches in Brazil, clusters will be formed based on location; if a
topic asks for photos of animals, clusters will be formed based on animal type.
Participating groups will
return to us, for each topic, a ranked list of image IDs. We will determine which
images are relevant and count how many clusters are represented in the ranking.
We do not require you to identify or label clusters in the ranked list;
how you choose to do the clustering is an internal matter for you.
Evaluation will be based
on two measures: precision at 20 and instance recall at rank 20 (also called
S-recall) [3], which calculates the
percentage of the different clusters represented in the top 20 results. It will
be important to maximise both measures: simply retrieving many relevant images
from one cluster, or filling the ranking with diverse but non-relevant images,
will result in a poor overall effectiveness score.
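To make these measures concrete, here is a minimal Python sketch of how they could be computed for a single topic. This is not the official evaluation script; the qrels dictionary mapping image IDs to cluster labels, and the example data, are assumptions made purely for illustration.

# Illustrative sketch only, not the official ImageCLEFphoto evaluation code.
# Assumes the cluster-augmented qrels for one topic are available as a dict
# mapping each relevant image ID to its sub-topic (cluster) label.

def precision_at_k(ranking, qrels, k=20):
    """Fraction of the top-k ranked images that are relevant."""
    hits = sum(1 for image_id in ranking[:k] if image_id in qrels)
    return hits / float(k)

def instance_recall_at_k(ranking, qrels, k=20):
    """S-recall: fraction of the topic's clusters with at least one
    relevant image among the top-k ranked images."""
    found = {qrels[image_id] for image_id in ranking[:k] if image_id in qrels}
    return len(found) / float(len(set(qrels.values())))

# Hypothetical data: three relevant images belonging to three clusters.
qrels = {"3739": "elephant", "4968": "seal", "30823": "turtle"}
ranking = ["3739", "1234", "4968"]           # a system's top-ranked image IDs
print(precision_at_k(ranking, qrels))        # 2 relevant in the top 20 -> 0.1
print(instance_recall_at_k(ranking, qrels))  # 2 of 3 clusters covered -> ~0.67

As the sketch suggests, a run that fills the top 20 with relevant images from a single cluster scores well on precision but poorly on instance recall, and vice versa.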
Note that it is quite
possible to submit runs from a "standard" non-clustering image search system,
though we would expect clustering systems to outperform the standard
systems in producing a diverse ranked list in the top 20.
A version of the
collection will be made available that allows participants to explore cross-language
aspects of image clustering. In this version, members of the clusters
will be captioned in two languages.
Query Topics
Topics download (note: password protected).
We will use existing
topic statements from past ImageCLEFphoto years. The topic format is the same as in
previous years but with an additional tag, the cluster tag, which defines how
the clustering of images should take place.
<top>
<num> Number: 5 </num>
<title> animals swimming </title>
<cluster>animal</cluster>
<narr> </narr>
<image> 3739.jpg </image>
<image> 4968.jpg </image>
<image> 30823.jpg </image>
</top>
Note: this year, topics will be offered in English only.
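For illustration, a participant could read the new tag with a few lines of Python such as the sketch below. It assumes each <top> element has been isolated as a well-formed XML fragment; the actual topic file layout may differ.

# Illustrative sketch: extract the title and clustering criterion from a topic.
# Assumes a single <top>...</top> element held as a well-formed XML string.
import xml.etree.ElementTree as ET

topic_xml = """<top> <num> Number: 5 </num>
<title> animals swimming </title>
<cluster>animal</cluster> <narr> </narr>
<image> 3739.jpg </image> </top>"""

top = ET.fromstring(topic_xml)
title = top.findtext("title").strip()
cluster_criterion = top.findtext("cluster").strip()
example_images = [img.text.strip() for img in top.findall("image")]
print(title, "| cluster by:", cluster_criterion, "| examples:", example_images)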
Data Collection
The collection is composed of a set of images (full size and thumbnails) and a set of annotations (note: the site is password protected). Annotations from the Visual Concept Detection Task of ImageCLEF 2008 are being made available at this site.
The IAPR TC-12 photographic collection consists of 20,000 still natural images (plus
20,000 corresponding thumbnails) taken from locations around the world,
comprising an assorted cross-section of still natural images [4]. This includes
pictures of different sports
and actions, photographs of people, animals, cities, landscapes and many other
aspects of contemporary life. Each image is associated with an
alphanumeric caption stored in a semi-structured format. These captions include
the title of the image, its creation date, the location at which the photograph
was taken, the name of the photographer, a semantic description of the contents
of the image (as determined by the photographer) and additional notes.
<DOC>
<DOCNO>annotations/00/60.eng</DOCNO>
<TITLE>Palma</TITLE>
<DESCRIPTION>two lane street with large shops on the right and smaller shops on the left; people are walking on the sidewalk, some are crossing the street; cars are parked along the left side of the street as well;</DESCRIPTION>
<NOTES>The main shopping street in Paraguay;</NOTES>
<LOCATION>Asunción, Paraguay</LOCATION>
<DATE>March 2002</DATE>
<IMAGE>images/00/60.jpg</IMAGE>
<THUMBNAIL>thumbnails/00/60.jpg</THUMBNAIL>
</DOC>
The collection will be available in two forms:
- All English, where all titles, descriptions, locations, etc. are written in the English language
- A multilingual version, where each image is annotated in a different language. The languages used are English and German. (We hope to release this collection by the end of April.)
Further information
about the image collection and links to related publications can be found
here.
How to Cheat (but please don't)
There is one way to cheat at this year's ImageCLEFphoto; we are trusting you not to do it:
- Use last year's qrels: the topics this year are the same as last year, which means some of you may still have copies of last year's relevance judgements (the judgements we release this year will add clustering information on top of the old qrels). We are trusting you not to use last year's relevance judgements for this year's runs.
Submission format and guidelines
Submission formats
When you submit your runs, please name your runs using the following code
elements. All submissions should be sent by email to Thomas Arni (t.arni@sheffield.ac.uk).
Dimension | Available Codes
Topic language | EN
Annotation language | EN, RND (DE, EN)
Run type | AUTO, MAN
Modality | IMG, TXT, TXTIMG
Topic language: specifies the query language used in the run (this year, English only).
Annotation language: specifies the target language (i.e. the annotation set) used for the run: English (EN) or random (RND).
Modality: describes the use of visual (image) or text features in your submission. A text-only run will have modality text (TXT); a purely visual run will have modality image (IMG); and a combined submission (e.g. an initial text search followed by a possibly combined visual search) will have modality text+image (TXTIMG).
Run type: we distinguish between manual (MAN) and automatic (AUTO) submissions. Automatic runs involve no user interaction, whereas manual runs are those in which a human has been involved in query construction and the iterative retrieval process, e.g. manual relevance feedback. We encourage groups who want to investigate manual intervention further to participate in the interactive evaluation (iCLEF) task.
What to submit
Participants can submit a run in any of the permutations detailed in the table above, for example:
- EN-EN-AUTO-TXTIMG for an English-English monolingual run using fully automatic text and image clustering methods
- EN-EN-MAN-TXT for an English-English monolingual run using text clustering methods with some manual intervention
- EN-RND-AUTO-IMG for an English-Random language run using fully automated image clustering methods
It is extremely important that you provide a description of the techniques used for each
submitted run. This should be as detailed as possible to ease the comparison or
classification of techniques and results.
Submission format
Participants are required to submit ranked lists
of (up to) the top 1000 images, ranked in descending order of similarity (i.e.
the highest-scoring images nearer the top of the list). The format of submissions for this
ad-hoc task can be found
here,
and the filenames should distinguish different types of submission according to
the table above. Participants can submit (via email) as many system runs as
they require. Please note
that there should be at least one document entry in your results for each topic
(i.e. if your system returns no results for a query, then insert a dummy
entry, e.g. 25 1 16/16019 0 4238 xyzT10af5). The reason for this is to make
sure that all systems are compared with the same number of topics and relevant
documents. Submissions not following the required format will not be evaluated.
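As a minimal illustration (the linked format description remains authoritative), the Python sketch below writes one topic's results using the same field order as the dummy entry above: topic number, an iteration constant, image ID, rank, score and run tag. The file name, score values and image IDs are hypothetical.

# Illustrative sketch: write one topic's ranked results in the submission format
# (topic number, iteration constant, image ID, rank, score, run tag).
def write_topic_results(out, topic_id, ranked_images, run_tag):
    if not ranked_images:
        # dummy entry so that every topic appears in the run file
        ranked_images = [("16/16019", 0.0)]
    for rank, (image_id, score) in enumerate(ranked_images[:1000]):
        out.write("%d 1 %s %d %.4f %s\n" % (topic_id, image_id, rank, score, run_tag))

with open("EN-EN-AUTO-TXTIMG.txt", "w") as out:  # hypothetical run file name
    write_topic_results(out, 25, [("16/16019", 0.87)], "xyzT10af5")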
Important Dates
22 April 2008: Data and topic release
UPDATE, extended to 22 June 2008: Submission of retrieval runs due
15 July 2008: Release of retrieval results
15 August 2008: Workshop papers due
17-19 September 2008: CLEF workshop in Aarhus, Denmark
Organisers of ImageCLEFphoto
Primary contact: Thomas Arni, Department of Information Studies, University of Sheffield, UK (t.arni@sheffield.ac.uk)
Mark Sanderson, Department of Information Studies, University of Sheffield, UK (m.sanderson@shef.ac.uk)
Paul Clough, Department of Information Studies, University of Sheffield, UK (p.d.clough@sheffield.ac.uk)
Michael Grubinger, School of Computer Science and Mathematics, Victoria University, Australia
Mailing List
We have set up a mailing list for participants: imageclef@sheffield.ac.uk. Please contact Paul Clough to be added to the list.
References
[1] Chen, H. and Karger, D. R. 2006. Less is more: probabilistic models for retrieving fewer relevant documents. In Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (Seattle, Washington, USA, August 06-11, 2006). SIGIR '06. ACM, New York, NY, 429-436.
[2] Song, K., Tian, Y., Gao, W., and Huang, T. 2006. Diversifying the image retrieval results. In Proceedings of the 14th Annual ACM International Conference on Multimedia (Santa Barbara, CA, USA, October 23-27, 2006). MULTIMEDIA '06. ACM, New York, NY, 707-710.
[3] Zhai, C. X., Cohen, W. W., and Lafferty, J. 2003. Beyond independent relevance: methods and evaluation metrics for subtopic retrieval. In Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (Toronto, Canada, July 28 - August 01, 2003). SIGIR '03. ACM, New York, NY, 10-17.
[4] Grubinger, M., Clough, P., Müller, H. and Deselaers, T. 2006. The IAPR TC-12 Benchmark: A New Evaluation Resource for Visual Information Systems. In Proceedings of the International Workshop OntoImage 2006: Language Resources for Content-Based Image Retrieval, held in conjunction with LREC 2006, pages 13-23, Genoa, Italy, 22 May 2006.