Hello! I'm a pre-doctoral AI Center Fellow at Microsoft Research India, where I work with Kalika Bali, Monojit Choudhury and Tanuja Ganu. My research interests lie broadly at the intersection of NLP (Natural Language Processing), HCI (Human-Computer Interaction) and Society.
I've been fortunate to work with and learn from amazing researchers in the past. I did my bachelor's thesis at Carnegie Mellon University, where I was advised by Dave Touretzky. I also interned at University College London with Emine Yilmaz and Rishabh Mehrotra. Prior to my time in research, I spent two summers with Google Summer of Code working for Julia Computing and Mozilla.
I graduated with a B.Tech (Hons.) in Electronics and Instrumentation from BITS Pilani, Goa, India in 2019. This coming fall, I'll be joining the University of Washington as a PhD student in Computer Science and Engineering. For more information, please see my CV or contact me.
Publications
NLP + HCI

INMT: Interactive Neural Machine Translation
Sebastin Santy, Sandipan Dandapat, Monojit Choudhury, Kalika Bali
EMNLP 2019 (Demo)
CoSSAT: Code-Switched Speech Annotation Tool
Sanket Shah, Pratik Joshi, Sebastin Santy, Sunayana Sitaram
AnnoNLP@EMNLP 2019
NLP + Society
The State and Fate of Linguistic Diversity and Inclusion in the NLP World
Sebastin Santy*, Pratik Joshi*, Amar Budhiraja*, Kalika Bali, Monojit Choudhury
ACL 2020
Learnings from Technological Interventions in a Low Resource Language
Sebastin Santy*, Devansh Mehta*, Ramaravind Kommiya Mothilal, Brij Mohan Lal Srivastava, Alok Sharma, Anurag Shukla, Vishnu Prasad, Venkanna U, Amit Sharma and Kalika Bali
LREC 2020
Deploying Language Technologies for Underserved Communities
Kalika Bali, Monojit Choudhury, Sunayana Sitaram, Sebastin Santy
UNESCO LT4All 2020 [Invited]
Unsung Challenges of Building and Deploying Language Technologies for LRL Communities
Pratik Joshi, Christain Barnes, Sebastin Santy, Simran Khanuja, Sanket Shah, Anirudh Srinivasan, Satwik Bhattamishra, Sunayana Sitaram, Monojit Choudhury and Kalika Bali
ICON 2019
Probing Language Models
BERTologiCoMix: How does Code-Mixing interact with Multilingual BERT?
Sebastin Santy*, Anirudh Srinivasan*, Monojit Choudhury
AdaptNLP@EACL 2021
Task Understanding
Towards Task Understanding in Visual Settings
Sebastin Santy, Wazeer Zulfikar, Rishabh Mehrotra, Emine Yilmaz
AAAI 2019 (Student Abstract)
Miscellaneous
BITS Darshini: A Modular, Concurrent Protocol Analyzer Workbench
Prasad Talasila, Mihir Kakrambe, Sebastin Santy, Anurag Rai, Neena Goveas, Bharat M Deshpande
ICDCN 2018
Talks
Repeatable Data Setup for Repeatable Science
Talked about DataDepsGenerators.jl and Reproducible AI.
PyData 2018 - 11 Times Square, New York City, United States
Tête-à-tête with APJ Abdul Kalam
As part of the Iken Scientifica 2009 winning team, got an opportunity to discuss tech progress in India with His Excellency, Dr. APJ Abdul Kalam, Former President of India.
Office of the Ex-President - 10 Rajaji Marg, New Delhi, India
Service
Reviewer: ACL '21, EACL '21, NAACL '21 (Demo), EMNLP '20 (Demo), MLADS '20, ICON '20
Sub-Reviewer: EMNLP '20, ACL '20, LREC '20, CoNLL '19, Interspeech '19, ICON '19
U Washington
F2021 onwards
Microsoft Research
2019 - Present
Carnegie Mellon
F2018
Univ. College London
S2018
Julia Computing
S2018
Mozilla
S2017
BITS Pilani
2015 - 2018