I am a second-year PhD student in NLP at the University of Edinburgh (2022-present). I am part of the Machine Translation (StatMT) research group there, supervised by Dr. Alexandra Birch.

Broadly, I am interested in adapting NLP models for the long tail of under-represented languages and cultures across the globe. Currently, I am working on multilingual NLP (including synthetic instruction tuning for multilingual LLMs, low-resource MT and generation) as well as multicultural NLP (transcreation for cross-culturally understandable generation).

For more details, check out my top publications below or my career summary. Don’t hesitate to reach out if you have questions or would like to collaborate!


Sep 20, 2024 [WMT 2024] 2 long papers on low-resource LLM-MT and cultural transcreation of menus accepted at WMT (EMNLP) 2024! See you all in Miami :us: :sunny:
Jul 11, 2024 Gave an invited talk at IBM Research (slides) on past internship projects, as part of the “Papers We Wrote” program which features presentations from their top-performing interns. :memo: :star:
Jun 21, 2024 Presented a poster at NAACL 2024 in Mexico City on our submission adapting LLMs for very low-resource MT, which ranked #3 in the AmericasNLP shared task. :sunrise: :mexico:
Jun 4, 2024 [Interspeech 2024] My internship paper with Naver Labs Europe, on a 90M-parameter speech foundation model for 147 languages (mHuBERT-147), has been accepted to Interspeech with strong reviews! :sound: :earth_africa:
Dec 6, 2023 [EMNLP 2023] Presented a poster on disambiguation-centric pretraining at EMNLP, followed by oral presentations at WMT’23 and CALCS’23 :sunglasses: :singapore:
Oct 7, 2023 [EMNLP 2023] 2 first-author acceptances on ambiguous MT - 1 paper each at Findings and WMT! :tada:
Jul 11, 2023 Received an Outstanding Reviewer Award at ACL 2023! :medal_sports:
Jun 26, 2023 Started an internship on fine-tuning Massively Multilingual Speech Models at Naver Labs Europe :rocket: :fr:

noteworthy publications

  1. EMNLP (Findings)
    Code-Switching with Word Senses for Pretraining in Neural Machine Translation
    Vivek Iyer, Edoardo Barba, Alexandra Birch, Jeff Pan, and Roberto Navigli
    In Findings of the Association for Computational Linguistics: EMNLP 2023. Dec 2023
  2. WMT (EMNLP)
    Towards Effective Disambiguation for Machine Translation with Large Language Models
    Vivek Iyer, Pinzhen Chen, and Alexandra Birch
    In Proceedings of the Eighth Conference on Machine Translation. Dec 2023
  3. EACL (Findings)
    Exploring Enhanced Code-Switched Noising for Pretraining in Neural Machine Translation
    Vivek Iyer, Arturo Oncevay, and Alexandra Birch
    In Findings of the Association for Computational Linguistics: EACL 2023. May 2023
  4. WMT (EMNLP)
    The University of Edinburgh’s Submission to the WMT22 Code-Mixing Shared Task (MixMT)
    Faheem Kirefu, Vivek Iyer, Pinzhen Chen, and Laurie Burchell
    In Proceedings of the Seventh Conference on Machine Translation. May 2022
    Ranked 2nd-best system overall in both directions.
  5. EMNLP (Main)
    VeeAlign: Multifaceted Context Representation Using Dual Attention for Ontology Alignment
    Vivek Iyer, Arvind Agarwal, and Harshit Kumar
    In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. Nov 2021
  6. ISWC (Workshop)
    VeeAlign: A Supervised Deep Learning Approach to Ontology Alignment
    Vivek Iyer, Arvind Agarwal, and Harshit Kumar
    In Proceedings of the Ontology Matching Workshop @ International Semantic Web Conference 2020. Dec 2020
    Ranked 1st in the Conference track.