LATEST NEWS
- Nov 11, 2009: Three more languages added to the task: Dutch, German and Italian. Trial data for them will be released shortly.
- Oct 27, 2009: Change of the English corpora. Training and trial data will be from OntoNotes and ARRAU. Both gold-standard and automatically annotated information will be provided. The new release will be announced shortly.
- Sep 15, 2009: Trial data is available from the SemEval site.
- Sep 10, 2009: Postponed - Trial data will be released in five days.
- Sep 8, 2009: Scorers available from Download area.
- Aug 28, 2009: Trial data will be released Sep 8, 2009.
- Aug 11, 2009: Join our mailing list to stay up to date!
- Aug 11, 2009: Use the forum to contact the organizers and/or participants and leave feedback.
- June 4, 2009: Poster presented at SEW-2009 (Workshop on Semantic Evaluations: Recent Achievements and Future Directions) - SemEval-2010 Task 1: Coreference Resolution in Multiple Languages
- Nov 30, 2008: This new website has been posted. Welcome to the task!
- Nov 5, 2008: Changes to the website.
GENERAL TASK DESCRIPTION
Using coreference information has been shown to be beneficial in a number of NLP applications including Information Extraction, Text Summarization, Question Answering and Machine Translation. This task is concerned with automatic coreference resolution for six different languages: Catalan, Dutch, English, German, Italian and Spanish. Two tasks are proposed for each of the languages:
- Full task. Detection of full coreference chains, composed of named entities, pronouns, and full noun phrases.
- Subtask. Pronominal resolution, i.e., finding the antecedents of the pronouns in the text.
In particular, we aim:
(i) To study the portability of coreference resolution systems across languages (Catalan, Dutch, English, German, Italian, Spanish)
- To what extent is it possible to implement a general system that is portable to all six languages?
- How much language-specific tuning is necessary?
- Are there significant differences between Germanic and Romance languages? And between languages of the same family?
(ii) To compare four different evaluation metrics (MUC, B-CUBED, CEAF and BLANC) for coreference resolution.
- Do all evaluation metrics provide the same ranking? Is there one that gives a more faithful picture of a system's performance?
- Is there a strong correlation between them?
- Can statistical systems be optimized under all four metrics at the same time?
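To make the comparison in (ii) concrete, the following is a minimal sketch of two of the four metrics, MUC (link-based) and B-CUBED (mention-based), computed on a toy example. It is not the official scorer from the Download area. The function names, the representation of chains as Python sets, and the toy mentions are assumptions made for illustration, and the sketch assumes that key and response cover the same set of mentions.

```python
from typing import List, Set

Chains = List[Set[str]]  # each chain is a set of mention identifiers (hypothetical format)


def muc_recall(key: Chains, response: Chains) -> float:
    """MUC recall: links of each key chain that survive when the chain
    is partitioned by the response chains."""
    numerator = 0
    denominator = 0
    for chain in key:
        parts = set()
        for mention in chain:
            # Index of the response chain holding this mention, if any.
            owner = next((i for i, r in enumerate(response) if mention in r), None)
            # A mention missing from the response forms its own singleton part.
            parts.add(owner if owner is not None else ("unaligned", mention))
        numerator += len(chain) - len(parts)
        denominator += len(chain) - 1
    return numerator / denominator if denominator else 0.0


def muc_precision(key: Chains, response: Chains) -> float:
    """MUC precision is MUC recall with the roles of key and response swapped."""
    return muc_recall(response, key)


def b_cubed_recall(key: Chains, response: Chains) -> float:
    """B-CUBED recall: per-mention overlap between the mention's key chain
    and its response chain, averaged over all key mentions."""
    total = 0.0
    count = 0
    for chain in key:
        for mention in chain:
            resp_chain = next((r for r in response if mention in r), {mention})
            total += len(chain & resp_chain) / len(chain)
            count += 1
    return total / count if count else 0.0


def b_cubed_precision(key: Chains, response: Chains) -> float:
    """B-CUBED precision is B-CUBED recall with key and response swapped."""
    return b_cubed_recall(response, key)


# Toy example: the response wrongly moves mention "c" into the second chain.
key = [{"a", "b", "c"}, {"d", "e"}]
response = [{"a", "b"}, {"c", "d", "e"}]

print("MUC     R/P:", muc_recall(key, response), muc_precision(key, response))
print("B-CUBED R/P:", b_cubed_recall(key, response), b_cubed_precision(key, response))
```

On this toy example MUC gives 2/3 for both recall and precision, while B-CUBED gives 11/15 for both, which illustrates that the metrics can assign different scores to the same output.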
Although we target general systems that address the full multilingual task, participants may take part in any full task or subtask for any language.
For further details see the sections:
Task Description
Datasets and Formats
Evaluation
ORGANIZERS
- Véronique Hoste (Hogeschool Gent)
- Lluís Màrquez (TALP, Universitat Politècnica de Catalunya)
- M. Antònia Martí (CLiC, University of Barcelona)
- Massimo Poesio (University of Essex / Università di Trento)
- Marta Recasens (CLiC, University of Barcelona)
- Emili Sapena (TALP, Universitat Politècnica de Catalunya)
- Mariona Taulé (CLiC, University of Barcelona)
- Yannick Versley (University of Tübingen)
- Other people behind the preparation of the corpora: Manuel Bertran (UB), Oriol Borrega (UB), Jesús Giménez (UPC), Richard Johansson (U.Trento), Xavier Lluís (UPC), Montse Nofre (UB), Lluís Padró (UPC), Mihai Surdeanu (U.Stanford), Lente Van Leuven (UB) and Rita Zaragoza (UB).
For queries, feedback or more information, feel free to post in the forum.