I am a pre-doctoral AI Center Fellow at Microsoft Research India, where I work with Kalika Bali, Monojit Choudhury and Tanuja Ganu. My research interests lie broadly at the intersection of Natural Language Processing (NLP) and Human-Computer Interaction (HCI).

I have been fortunate to work with and learn from some amazing researchers in the past. I did my bachelor's thesis at Carnegie Mellon University, where I was advised by Dave Touretzky. I also interned at University College London with Emine Yilmaz and Rishabh Mehrotra. Prior to my time in research, I spent two summers with Google Summer of Code, working with the Julia Language and Mozilla.

I graduated with a B.Tech (Hons.) in Electronics and Instrumentation from BITS Pilani, Goa, India in 2019. For more details, check my CV or drop me an email!


Intelligent User Interfaces

Develop interactive interfaces and tools to take NLP research to lay people.

INMT: Interactive Neural Machine Translation
Sebastin Santy, Sandipan Dandapat, Monojit Choudhury, Kalika Bali
EMNLP'19 Demo | Empirical Methods in Natural Language Processing
website| code| pdf| abstract| cite| poster

CoSSAT: Code-Switched Speech Annotation Tool
Sanket Shah, Pratik Joshi, Sebastin Santy, Sunayana Sitaram
AnnoNLP@EMNLP'19 | Empirical Methods in Natural Language Processing
pdf| abstract| cite| slides

State of Low Resource Languages

Understand how NLP research can reach a broader community, especially communities that are marginalized because their languages are under-resourced in terms of data availability.

The State and Fate of Linguistic Diversity and Inclusion in the NLP World
Sebastin Santy*, Pratik Joshi*, Amar Budhiraja*, Kalika Bali, Monojit Choudhury
ACL'20 | Annual Conference of the Association for Computational Linguistics
pdf| abstract| cite| website| talk

Learnings from Technological Interventions in a Low Resource Language
Sebastin Santy*, Devansh Mehta*, Ramaravind Mothilal, Brij Mohan Lal Srivastava, Alok Sharma, Anurag Shukla, Vishnu Prasad, Venkanna U, Amit Sharma and Kalika Bali
LREC'20 | International Conference on Language Resources and Evaluation
pdf| abstract| cite

Deploying Language Technologies for Underserved Communities
Kalika Bali, Monojit Choudhury, Sunayana Sitaram, Sebastin Santy
LT4All | UNESCO International Conference on Language Technologies for All
abstract| poster

Unsung Challenges of Building and Deploying Language Technologies for Low Resource Language Communities
Pratik Joshi, Christain Barnes, Sebastin Santy, Simran Khanuja, Sanket Shah, Anirudh Srinivasan, Satwik Bhattamishra, Sunayana Sitaram, Monojit Choudhury and Kalika Bali
ICON'19 | International Conference on Natural Language Processing
pdf| abstract| cite

Task Understanding

Investigate user behavior by mapping it to the tasks users are trying to accomplish on a daily basis.

Towards Task Understanding in Visual Settings
Sebastin Santy, Wazeer Zulfikar, Rishabh Mehrotra, Emine Yilmaz
AAAI'19 Student Abstract | AAAI Conference on Artificial Intelligence
pdf| abstract| cite| poster


Previously, I worked on Networks and Software.

DataDepsGenerators.jl: making reusing data easy by automatically generating DataDeps.jl registration code
Lyndon White, Sebastin Santy
JOSS'18 | Journal of Open Source Software
pdf| cite| talk| blog post

BITS Darshini: A Modular, Concurrent Protocol Analyzer Workbench
Prasad Talasila, Mihir Kakrambe, Sebastin Santy, Anurag Rai, Neena Goveas, Bharat M Deshpande
ICDCN'18 | International Conference on Distributed Computing and Networking
pdf| abstract| cite


The State and Fate of Linguistic Diversity in the NLP world
NLP with Friends, Remote
video| abstract

Repeatable Data Setup for Repeatable Science
Talked about DataDepsGenerators.jl and Reproducible AI.
PyData 2018, 11 Times Square, New York City, United States
video| abstract

Tête-à-tête with APJ Abdul Kalam
As part of the Iken Scientifica 2009 winning team, got an opportunity to discuss tech advancements with His Excellency, Dr. APJ Abdul Kalam, Former President of India.
10 Rajaji Marg, New Delhi, India


Reviewer: EACL 2021, NAACL 2021, EMNLP 2020, MLADS 2020, ICON 2020
Sub-Reviewer: EMNLP 2020, ACL 2020, LREC 2020, CoNLL 2019, Interspeech 2019, ICON 2019
BITS Pilani
2015 - 2018
Julia Computing
University College London
Carnegie Mellon
Microsoft Research
2019 - Present