Selected publications

How to Select Datapoints for Efficient Human Evaluation of NLG Models? (TACL 2025)
Estimating Machine Translation Difficulty (EMNLP 2025)
Findings of the WMT25 general machine translation shared task: Time to stop evaluating on easy test sets (WMT 2025)
Generating Difficult-to-Translate Texts (In review 2025)
Searching for Difficult-to-Translate Test Examples at Scale (In review 2025)
AI-Assisted Human Evaluation of Machine Translation (NAACL 2025)
Early-Exit and Instant Confidence Translation Quality Estimation (In review 2025)
Pitfalls and Outlooks in Using COMET (WMT 2024)
Error Span Annotation: A Balanced Approach for Human Evaluation of Machine Translation (WMT 2024)
Fine-Tuned Machine Translation Metrics Struggle in Unseen Domains (ACL 2024)
Quality and Quantity of Machine Translation References for Automated Metrics (HumEval 2024)
WMT24 General Machine Translation Shared Task: The LLM Era is Here but MT is Not Solved Yet (WMT 2024)
Navigating the Metrics Maze: Reconciling Score Magnitudes and Accuracies (ACL 2024)
RELIC: Investigating Large Language Model Responses using Self-Consistency (CHI 2024)
Distributional Properties of Subword Regularization (EMNLP 2024)
Evaluating Optimal Reference Translations (JNLE 2024)
A Diachronic Perspective on User Trust in AI under Uncertainty (EMNLP 2023)
WMT 2023 Shared Task on Machine Translation with Terminologies (EMNLP 2023)
Tokenization and the Noiseless Channel (ACL 2023)
A Formal Perspective on Byte-Pair Encoding (ACL 2023)
Re-visiting Automated Topic Model Evaluation with Large Language Models (EMNLP 2023)
Poor Man's Quality Estimation: Predicting Ref.-Based MT Metrics Without Reference (EACL 2023)
Neural Machine Translation Quality and Post-Editing Performance (EMNLP 2021)
Providing Backtranslation Improves Users Confidence in MT, Not Quality (NAACL 2021)
WMT20 Document-Level Markable Error Exploration (WMT 2020)


Less-selected publications/projects

Findings of the WMT25 Multilingual Instruction Shared Task: Persistent Hurdles in Reasoning, Generation, and Evaluation (WMT 2025)
Findings of the WMT25 shared task on automated translation evaluation systems: Linguistic diversity is challenging and references still help (WMT 2025)
Findings of the WMT25 Terminology Translation Task: Terminology is Useful Especially for Good MTs (WMT 2025)
COMET-poly: Machine Translation Metric Grounded in Other Candidates (WMT 2025)
Deconstructing Self-Bias in LLM-generated Translation Benchmarks (In review 2025)
How Important is 'Perfect' English for Machine Translation Prompts? (In review 2025)
Co-DETECT: Collaborative Discovery of Edge Cases in Text Classification (EMNLP 2025 Demo)
Can Large Language Models Capture Human Annotator Disagreements? (In review 2025)
Unsupervised Word-level Quality Estimation for Machine Translation Through the Lens of Annotators (Dis)agreement (EMNLP 2025)
Biased Tales: Cultural and Topic Bias in Generating Children's Stories (EMNLP 2025)
Large Language Models as Span Annotators (In review 2025)
QE4PE: Word-level Quality Estimation for Human Post-Editing (TACL 2025)
A Bayesian Optimization Approach to Machine Translation Reranking (NAACL 2025)
Findings of the IWSLT 2025 Evaluation Campaign (IWSLT 2025)
Are Large Language Models for Education Reliable for All Languages? (BEA 2025)
Interactive Analysis of LLMs using Meaningful Counterfactuals (IEEEvis 2025)
PWESuite: Phonetic Word Embeddings and Tasks They Facilitate (LREC-COLING 2024)
Two Counterexamples to Tokenization and the Noiseless Channel (LREC-COLING 2024)
Knowledge Base Index Compression via Dimensionality and Precision Reduction (SpaNLP 2022)
Sampling and Filtering of Neural Machine Translation Distillation Data (NAACL SRW 2021)
Harmonizing Assistance: Moderating Visual and Textual Aids in AI-Enhanced Textbook Reading with IRead (IJAIED 2025)
How to Engage Your Readers? Generating Guiding Questions to Promote Active Reading (ACL 2024)
AutoTutor meets Large Language Models: A Language Model Tutor with Rich Pedagogy and Guardrails (Learning@Scale 2024)
Enhancing Textbooks with Visuals from the Web for Improved Learning (EMNLP 2023)
Shrinking Knowledge Base Size: Dimension Reduction, Splitting & Filtering (Master thesis 2022)
Ryanize bib (2023)
ÚFAL Bilingual scientific abstracts corpus (2022)
Artefact Retrieval: Overview of NLP Models with Knowledge Base Access (AKBC CSKB 2021)
Leveraging Neural Machine Translation for Word Alignment (PBML 116)
Sentence Ambiguity, Grammaticality and Complexity Probes (BlackboxNLP 2022)
Slow Align Displayer (2020)
Enabling Outbound Machine Translation (Bachelor thesis 2020)

Miscellaneous


I'm currently advised by Mrinmaya Sachan at LRE lab and Menna El-Assady at IVIA lab. Previously, during my bachelor's and master's, I was advised by Dietrich Klakow and Ondřej Bojar. I got to intern at Amazon Translate and Google Translate, and since 2025 my PhD has been funded by a Google PhD Fellowship in Natural Language Processing. I had the privilege to supervise Yijie Tong, Haokun He, Abhinav Kumar, and David Gu.

In my free time I'm interested in veganism, electric guitar, {video,board}games, and literature.


Talks


I enjoy socializing and am grateful to have been invited to give the following talks: