Over the last few years, the JOKER Track has created an active community of
researchers in NLP and IR working together on the non-literal use of language
in text—which is still challenging for both AI models and humans, as it
requires understanding implicit cultural references and double meanings. Its
benchmarks on humorous text analysis, retrieval, and translation have become
standard references. We made significant changes to the track's setup and
tasks in 2024 and 2025, and propose continuing these to complete the test
collections. The CLEF 2026 JOKER track will contain the following four tasks.
Task 1 (Humor-aware Information Retrieval): retrieve short humorous
texts for a query. Task 2 (Pun Translation): translate puns from English
to French and Spanish. Task 3 (Onomastic Wordplay Translation):
translate onomastic wordplay from English to French. Task 4 (Humor
Generation): guided creativity.
@inproceedings{ermakova2026clef,
author = {Liana Ermakova and Igor Kuzmin and Poojan
Vachharajani and Tristan Miller and Anne-Gwenn Bosser and Jaap
Kamps},
editor = {Ricardo Campos and Adam Jatowt and Yanyan Lan and
Mohammad Aliannejadi and Christine Bauer and Sean MacAvaney and Avishek Anand
and Zhaochun Ren and Suzan Verberne and Nan Bai and Masoud
Mansoury},
title = {{CLEF} 2026 {JOKER} Track: Humour Detection, Search,
and Translation},
booktitle = {Advances in Information Retrieval: 48th {European}
{Conference} on {Information} {Retrieval}, {ECIR}~2026, {Delft}, {The}
{Netherlands}, {March}~29--{April}~2, Proceedings, Part~{IV}},
Punning is a form of humorous wordplay based on semantic ambiguity between two
phonologically similar words – the pun and the target – in a
context where both meanings are more or less acceptable. While the pun is
expressed explicitly, the target is invoked implicitly in the text. Previous
work has attempted to quantify and compare phonological features of puns and
their targets, looking at correlations with the understandability of the
jokes in which they occur. Our study quantifies the phonological distance
between pun and target words and assesses possible correlations with
funniness ratings of the corresponding jokes. Our statistical analyses on a
large dataset of puns reveal a significant negative correlation between
phonological distance and perceived funniness for two of the four
phonological distance measures we applied. This finding supports the
hypothesis, often (implicitly) made in previous research but never verified
at this scale, that lower phonological distance between a pun and its target
is associated with higher funniness ratings. The parameters of our study
suggest that future work should examine the semantic features of pun and
target in order to create a more holistic understanding of what contributes
to the perceived funniness of punning jokes.
@article{palmann2025whats,
author = {Anna Palmann and Tristan Miller},
title = {What's in a Pun? Assessing the Relationship Between
Phonological Distance and Perceived Funniness of Punning Jokes},
journal = {Humor: International Journal of Humor
Research},
Large language models (LLMs) are rapidly being adopted to empower human researchers
in science production at all stages, from the initial conception of research
problems to reporting scientific discoveries. In 2025, American publisher
Wiley surveyed 5,000 researchers across 70 countries and found that a majority
support LLM adoption in scientific production. While LLMs could enable
faster, cost-effective research addressing global challenges, they raise
ethical and trust concerns. To explore these issues, we organized the
SciProdLLM workshop with the goal of providing a forum for presenting and
discussing research on integrating LLMs into the typical research workflow:
from ideation to experimentation to scientific writing, with a particular
focus on human-centered approaches that ensure ethical and responsible use of
LLMs. We also invite work that evaluates the quality of LLM-assisted research
workflows and the resulting outputs.
@book{zhao2025first,
editor = {Wei Zhao and Jennifer D'Souza and Steffen Eger and
Anne Lauscher and Yufang Hou and Nafise {Sadat Moosavi} and Tristan Miller
and Chenghua Lin},
title = {Proceedings of the First Workshop on Human--{LLM}
Collaboration for Ethical and Responsible Science Production
({SciProdLLM})},
month = dec,
year = {2025},
publisher = {Association for Computational
Linguistics},
With the advent of large multimodal language models, science is now at the
threshold of an AI-based technological transformation. Recently, a plethora
of new AI models and tools has been proposed, promising to empower
researchers and academics worldwide to conduct their research more
effectively and efficiently. This includes all aspects of the research cycle,
especially (1) searching for relevant literature; (2) generating research
ideas and conducting experimentation; generating (3) text-based and (4)
multimodal content (e.g., scientific figures and diagrams); and (5) AI-based
automatic peer review. In this survey, we provide an in-depth overview of
these exciting recent developments, which promise to fundamentally alter the
scientific research process for good. Our survey covers the five aspects
outlined above, indicating relevant datasets, methods and results (including
evaluation) as well as limitations and scope for future research. Ethical
concerns regarding shortcomings of these tools and potential for misuse (fake
science, plagiarism, harms to research integrity) take a particularly
prominent place in our discussion. We hope that our survey will not only
become a reference guide for newcomers to the field but also a catalyst for
new AI-based initiatives in the area of “AI4Science”.
@article{eger2025transforming,
author = {Steffen Eger and Yong Cao and Jennifer D'Souza and
Andreas Geiger and Christian Greisinger and Stephanie Gross and Yufang Hou
and Brigitte Krenn and Anne Lauscher and Yizhi Li and Chenghua Lin and Nafise
Sadat Moosavi and Wei Zhao and Tristan Miller},
title = {Transforming Science with Large Language Models: a
Survey on {AI}-assisted Scientific Discovery, Experimentation, Content
Generation, and Evaluation},