SEBASTIN SANTY
Incoming Ph.D. student
Paul G. Allen School of CSE
University of Washington

I am an incoming Ph.D. student at the University of Washington, where I'll be working on problems at the intersection of natural language processing (NLP) and human-computer interaction (HCI). I am currently interested in leveraging human computation to build large-scale, high-quality datasets and to efficiently evaluate machine-generated text, both of which are necessary to train and test robust large-scale NLP models. I am also interested in building interactive and human-in-the-loop NLP systems, as well as in using NLP effectively to understand large-scale qualitative data.

Previously, I was an AI Center Fellow at Microsoft Research, where I worked on NLP for low-resource languages and for underserved and marginalized communities. I've also spent time pursuing research at CMU and UCL, as well as two summers with Google Summer of Code working for Julia Computing and Mozilla. I graduated with a B.E. in Electronics and Instrumentation Engineering from BITS Pilani, Goa, India.

You can reach out if you have any questions.

PUBLICATIONS

NLP and Society

The State and Fate of Linguistic Diversity and Inclusion in the NLP World
Sebastin Santy*, Pratik Joshi*, Amar Budhiraja*, Kalika Bali, Monojit Choudhury
ACL 2020 (NLP News, NLP Beyond English, Quartz, Underrated ML, SIGTYP)
Learnings from Technological Interventions in a Low Resource Language
Sebastin Santy*, Devansh Mehta*, Ramaravind Kommiya Mothilal, Brij Mohan Lal Srivastava, Alok Sharma, Anurag Shukla, Vishnu Prasad, Venkanna U, Amit Sharma, Kalika Bali
Unsung Challenges of Building and Deploying Language Technologies for LRL Communities
Pratik Joshi, Christain Barnes, Sebastin Santy, Simran Khanuja, Sanket Shah, Anirudh Srinivasan, Satwik Bhattamishra, Sunayana Sitaram, Monojit Choudhury, Kalika Bali
ICON 2019

NLP + HCI

Language Translation as a Socio-Technical System: Case-Studies of Mixed-Initiative Interactions
Sebastin Santy, Kalika Bali, Monojit Choudhury, Sandipan Dandapat, Tanuja Ganu, Anurag Shukla, Jahanvi Shah, Vivek Seshadri
COMPASS 2021
INMT: Interactive Neural Machine Translation
Sebastin Santy, Sandipan Dandapat, Monojit Choudhury, Kalika Bali
EMNLP 2019 Demo
CoSSAT: Code-Switched Speech Annotation Tool
Sanket Shah, Pratik Joshi, Sebastin Santy, Sunayana Sitaram
AnnoNLP@EMNLP 2019
BERTologiCoMix: How does Code-Mixing interact with Multilingual BERT?
Sebastin Santy*, Anirudh Srinivasan*, Monojit Choudhury
AdaptNLP@EACL 2021
Towards Task Understanding in Visual Settings
Sebastin Santy, Wazeer Zulfikar, Rishabh Mehrotra, Emine Yilmaz
AAAI 2019 (Student Abstract)

TALKS

Repeatable Data Setup for Repeatable Science
Talked about DataDepsGenerators.jl and Reproducible AI.
PyData 2018 - New York City, USA

BITS Pilani (2015 - 2018)
Mozilla (S2017)
Julia Computing (S2018)
University College London (S2018)
Carnegie Mellon (F2018)
Microsoft Research (Spring 2019 - Present)
Washington (F2021 onwards)