SEBASTIN
SANTY

I am a PhD student in Computer Science at the University of Washington, advised by Ranjay Krishna. My research focuses on making NLP (and AI) systems work for people around the world, especially non-Western populations. This includes uncovering design biases in NLP models and datasets and building models that can accommodate perspectives from different demographics. I am also broadly interested in problems that arise wherever NLP systems interact with humans, such as when crowd-sourcing data or building language user interfaces.

Previously, I worked at Allen AI, Microsoft Research, Carnegie Mellon University, and University College London. I also spent two summers in Google Summer of Code, working with the Julia Language and Mozilla.

PUBLICATIONS

On NLP Research

Measuring Design Biases and Positionality of NLP Datasets and Models
Sebastin Santy*, Jenny Liang*, Ronan Le Bras, Katharina Reinecke, Maarten Sap
ACL 2023 FORTHCOMING
The State and Fate of Linguistic Diversity and Inclusion in the NLP World
Sebastin Santy*, Pratik Joshi*, Amar Budhiraja*, Kalika Bali, Monojit Choudhury
ACL 2020 US FTC NLP News NLP Beyond English Quartz Underrated ML WEB TALK PDF ABS

NLP and Society

Language Translation as a Socio-Technical System
Sebastin Santy, Kalika Bali, Monojit Choudhury, Sandipan Dandapat, Tanuja Ganu, Anurag Shukla, Jahanvi Shah, Vivek Seshadri
COMPASS 2021 Mint Lounge SLIDES PDF ABS
Learnings from Technological Interventions in a Low Resource Language
Sebastin Santy*, Devansh Mehta*, Ramaravind Kommiya Mothilal, Brij Mohan Lal Srivastava, Alok Sharma, Anurag Shukla, Vishnu Prasad, Venkanna U, Amit Sharma, Kalika Bali
Unsung Challenges of Building and Deploying Language Technologies for LRL Communities
Pratik Joshi, Christain Barnes, Sebastin Santy, Simran Khanuja, Sanket Shah, Anirudh Srinivasan, Satwik Bhattamishra, Sunayana Sitaram, Monojit Choudhury, Kalika Bali

NLP + HCI

INMT: Interactive Neural Machine Translation
Sebastin Santy, Sandipan Dandapat, Monojit Choudhury, Kalika Bali
EMNLP 2019 Demo Slate WEB CODE POSTER PDF ABS
CoSSAT: Code-Switched Speech Annotation Tool
Sanket Shah, Pratik Joshi, Sebastin Santy, Sunayana Sitaram
AnnoNLP@EMNLP 2019 SLIDES PDF ABS
BERTologiCoMix: How does Code-Mixing interact with Multilingual BERT?
Sebastin Santy*, Anirudh Srinivasan*, Monojit Choudhury
AdaptNLP@EACL 2021 POSTER PDF ABS
Towards Task Understanding in Visual Settings
Sebastin Santy, Wazeer Zulfikar, Rishabh Mehrotra, Emine Yilmaz
AAAI 2019 (Student Abstract) POSTER PDF ABS

TALKS

Designing, Evaluating, and Learning from Humans Interacting with NLP Models
A conference tutorial on research at the intersection of NLP and HCI
EMNLP 2023 - Singapore
ABS
The State and Fate of Linguistic Diversity in the NLP world
A casual talk about our ACL 2020 paper on the same topic
NLP with Friends - Remote
ABS TALK
Repeatable Data Setup for Repeatable Science
A talk about DataDepsGenerators.jl and reproducible AI
PyData 2018 - New York City, USA
ABS TALK

MISCELLANEOUS

Resources: My PhD Statement, CV Template