Bashar Alhafni
Hi! I am an Assistant Professor of Natural Language Processing at the Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), where I lead the Arabic AI Modeling (Aram) Lab. My research focuses on Arabic NLP with an emphasis on three interconnected themes: advancing educational Arabic NLP, developing personalized and human-centered language technologies, and understanding how linguistic representation shapes model behavior. At its core, my work aims to build Arabic NLP systems that enhance learning experiences and contribute to social good.
Before joining MBZUAI, I completed my Ph.D. in Computer Science at New York University (NYU).
My dissertation focused on controlled Arabic natural language generation with applications in AI for education and social impact.
Before that, I earned my Master’s from the University of Southern California (USC), where I worked at ISI on low-resource machine translation and event relation extraction. I hold a Bachelor’s degree in Computer Science and Mathematics from the University of Bridgeport.
In addition to my academic work, I have held applied NLP research roles at Grammarly and Dataminr, where I contributed to projects in personalized text generation, summarization, and multilingual NLP.
I am always looking to work with motivated students and postdocs. Feel free to reach out if you are interested in applying to MBZUAI.
News
| 05/2026 | Organizing two shared tasks on Arabic Sentence Segmentation and Arabic Readability Assessment at the 4th Arabic NLP Conference |
|---|---|
| 04/2026 | One paper on LLMs in education accepted at BEA 2026. See you in San Diego! |
| 02/2026 | Two papers on a bilingual, bimodal benchmark for Arabic-English NLP and Arabic authorship attribution and style transfer accepted to LREC 2026. See you in Mallorca! |
| 01/2026 | We won the best system award at the AMIYA shared task organized at VarDial 2026 🏆 |
| 01/2026 | Two papers on Judeo-Arabic transliteration and studying the effect of Arabic diacritics on LLMs accepted to EACL 2026. See you in Morocco! |
| 12/2025 | Co-organizing the 21st Workshop on Innovative Use of NLP for Building Educational Applications (BEA) at ACL 2026. See you in San Diego! |
| 07/2025 | I am excited to join MBZUAI as an Assistant Professor of Natural Language Processing. I am currently looking for students and postdocs to join my lab! |
| 05/2025 | Our paper on Enhancing Text Editing for Grammatical Error Correction: Arabic as a Case Study has been accepted to ACL. |
| 04/2025 | I have successfully defended my Ph.D. dissertation titled Controlled Natural Language Generation for Morphologically Rich Languages: The Case of Arabic. Many thanks to my committee members: Ted Briscoe, Kyunghyun Cho, Mona Diab, Nizar Habash, He He, and Julia Stoyanovich. |
| 04/2025 | Our paper, ARWI: Arabic Write and Improve, won the diversity award at the In2Writing workshop at NAACL 2025 🏆 |
| 01/2025 | Co-organizing the BEA 2025 workshop at ACL 2025. |
| 09/2024 | Gave a talk at the National Research Council Canada (NRC-CNRC) on Controlled User-Centric Natural Language Generation for Morphologically Rich Languages: The Case of Arabic. |
| 09/2024 | Presenting a tutorial at COLING 2025 on LLMs in Education: Novel Perspectives, Challenges, and Opportunities with Sowmya Vajjala, Stefano Bannò, Kaushal Kumar Maurya, and Ekaterina Kochmar (tutorial website). |
| 04/2024 | Co-organizing the 2nd Arabic Natural Language Processing Conference (ArabicNLP 2024) at ACL 2024! |
| 03/2024 | Our paper, The Arabic Text Simplification Corpus, has been accepted to LREC-COLING 2024! |
| 02/2024 | Our paper, mEdIT: Multilingual Text Editing via Instruction Tuning, has been accepted to NAACL 2024! |
| 10/2023 | Our paper, Advancements in Arabic Grammatical Error Detection and Correction: An Empirical Investigation, has been accepted to EMNLP 2023! |
| 05/2023 | Started my research internship at Grammarly, where I will be working on personalizing LLMs under the supervision of Vipul Raheja, Dhruv Kumar, and Vivek Kulkarni! |
| 05/2023 | Our demo paper, The User-Aware Arabic Gender Rewriter, has been accepted to the GITT workshop at EAMT 2023. |
| 10/2022 | Our paper, CrisisLTLSum: A Benchmark for Local Crisis Event Timeline Extraction and Summarization, has been accepted to Findings of EMNLP 2022! This was a result of my summer internship at Dataminr. |
| 10/2022 | Our paper, Arabic Word-level Readability Visualization for Assisted Text Simplification, has been accepted to the EMNLP 2022 system demonstrations track! |
| 07/2022 | We are running a shared task on Gender Rewriting at WANLP in EMNLP 2022. Check it out and participate! |
| 05/2022 | Our paper, Zero-shot Cross-Linguistic Learning of Event Semantics, has been accepted to INLG 2022! |
| 04/2022 | Our paper, User-Centric Gender Rewriting, has been accepted to the NAACL 2022 main conference as a special theme paper! |
| 04/2022 | Our work, The Arabic Parallel Gender Corpus 2.0: Extensions and Analyses, will appear in LREC 2022! |
| 03/2022 | I am part of the EMNLP 2022 Local Organizing Committee. Feel free to email me any local organization questions you may have. |
| 01/2022 | I will be spending my summer at Dataminr as a research intern under the mentorship of Joel Tetreault. |
| 10/2021 | Our new preprint, The Arabic Parallel Gender Corpus 2.0: Extensions and Analyses, is now available on arXiv. |
| 06/2021 | I passed my Ph.D. qualifying exam 🥳! Thanks to my committee members: Profs. Kyunghyun Cho, Nizar Habash, He He, and Julia Stoyanovich. |
| 02/2021 | Our paper, The Interplay of Variant, Size, and Task Type in Arabic Pre-trained Language Models, has been accepted to the 6th Arabic NLP workshop (WANLP) at EACL 2021. |
| 01/2021 | I will be giving an invited talk at ETH Zürich’s NLP Reading Group (Host: Ryan Cotterell). |
| 10/2020 | Our paper, Gender-Aware Reinflection using Linguistically Enhanced Neural Models, has been accepted to the second workshop on gender bias in NLP (GeBNLP) at COLING 2020. |
| 04/2020 | I will be joining NYU as a computer science Ph.D. student in the Fall. Let’s meet up and talk about NLP research if you’re in NYC! |
| 02/2020 | Our paper, CAMeL Tools: An Open Source Python Toolkit for Arabic Natural Language Processing, has been accepted to LREC 2020. |
| 06/2019 | I will be live-tweeting sessions 3D and 7D on Machine Translation at NAACL 2019. |
| 03/2019 | I will be attending NAACL 2019 in Minneapolis to present our paper on Contextualized Word Embeddings Enhanced Event Temporal Relation Extraction for Story Understanding. |