I am a pre-doctoral AI Center Fellow at Microsoft Research India, where I work with Kalika Bali, Monojit Choudhury and Tanuja Ganu. I am interested in working on problems that focus on democratizing language technologies.

I have been fortunate to work with an amazing set of researchers in the past. I did my bachelor's thesis at Carnegie Mellon University, where I was advised by Dave Touretzky. I also interned at University College London with Emine Yilmaz and Rishabh Mehrotra. Prior to my time in research, I spent two summers with Google Summer of Code, working for the Julia Language and Mozilla.

I graduated with a B.Tech (Hons.) in Electronics and Instrumentation from BITS Pilani, Goa, India in 2019. For more details, check my CV or hit me up on my email.

Research

Interactive Tools and User Interfaces

Develop interactive interfaces and tools to take NLP research to lay people.

INMT: Interactive Neural Machine Translation
Sebastin Santy, Sandipan Dandapat, Monojit Choudhury, Kalika Bali
EMNLP'19 Demo | Empirical Methods in Natural Language Processing
website| code| pdf| abstract| cite| poster

CoSSAT: Code-Switched Speech Annotation Tool
Sanket Shah, Pratik Joshi, Sebastin Santy, Sunayana Sitaram
AnnoNLP@EMNLP'19 | Empirical Methods in Natural Language Processing
pdf| abstract| cite| slides

Socio-Technical Systems and Societal Impact

Understand how NLP research can reach a broader community, especially communities that are marginalized because their languages are low-resourced in terms of data availability.

The State and Fate of Linguistic Diversity and Inclusion in the NLP World
Pratik Joshi*, Sebastin Santy*, Amar Budhiraja*, Kalika Bali, Monojit Choudhury (* = Equal Contribution)
ACL'20 | Annual Conference of the Association for Computational Linguistics
pdf| abstract| cite| website

Learnings from Technological Interventions in a Low Resource Language
Devansh Mehta*, Sebastin Santy*, Ramaravind Mothilal, Brij Mohan Lal Srivastava, Alok Sharma, Anurag Shukla, Vishnu Prasad, Venkanna U, Amit Sharma and Kalika Bali (* = Equal Contribution)
LREC'20 | International Conference on Language Resources and Evaluation
pdf| abstract| cite

Deploying Language Technologies for Underserved Communities
Kalika Bali, Monojit Choudhury, Sunayana Sitaram, Sebastin Santy
LT4All | UNESCO International Conference on Language Technologies for All
proceedings| poster

Unsung Challenges of Building and Deploying Language Technologies for Low Resource Language Communities
Pratik Joshi, Christain Barnes, Sebastin Santy, Simran Khanuja, Sanket Shah, Anirudh Srinivasan, Satwik Bhattamishra, Sunayana Sitaram, Monojit Choudhury and Kalika Bali
ICON'19 | International Conference on Natural Language Processing
pdf| abstract| cite

Task Understanding

Investigate user behavior by mapping it to the tasks users are trying to accomplish on a daily basis.

Towards Task Understanding in Visual Settings
Sebastin Santy, Wazeer Zulfikar, Rishabh Mehrotra, Emine Yilmaz
AAAI'19 Student Abstract | AAAI Conference on Artificial Intelligence
pdf| abstract| cite| poster

Miscellaneous

Before settling on a specific area of interest, I worked on networks and software.

DataDepsGenerators.jl: making reusing data easy by automatically generating DataDeps.jl registration code
Lyndon White, Sebastin Santy
JOSS'18 | Journal of Open Source Software
pdf| cite| talk| blog post

BITS Darshini: A Modular, Concurrent Protocol Analyzer Workbench
Prasad Talasila, Mihir Kakrambe, Sebastin Santy, Anurag Rai, Neena Goveas, Bharat M Deshpande
ICDCN'18 | International Conference on Distributed Computing and Networking
pdf| abstract| cite

Talks

Repeatable Data Setup for Repeatable Science
Talked about DataDepsGenerators.jl and Reproducible AI.
PyData 2018, 11 Times Square, New York City, United States
video| abstract

Tête-à-tête with APJ Abdul Kalam
As part of the Iken Scientifica 2009 winning team, got an opportunity to discuss tech advancements with His Excellency, Dr. APJ Abdul Kalam, Former President of India.
10 Rajaji Marg, New Delhi, India
video

Service

Reviewer: EACL 2021, EMNLP (Demo) 2020
Sub-Reviewer: EMNLP 2020, ACL 2020, LREC 2020, Interspeech 2019, ICON 2019

BITS Pilani: 2015 - 2018
Mozilla: Summer 2017
Julia Computing: Summer 2018
University College London: Summer 2018
Carnegie Mellon: Fall 2018
Microsoft Research: 2019 - Present