Global Reward to Local Rewards: Multimodal-Guided Decomposition for Improving Dialogue Agents
Dong Won Lee, Hae Won Park, Yoon Kim, Cynthia Breazeal, Louis-Philippe Morency
To appear at EMNLP (Oral), 2024
paper / code / huggingface
We introduce GELI, an approach that automatically decomposes a single Global Explicit post-interaction score into local rewards using Local Implicit feedback from multimodal signals, and uses them to adapt a language model to become more conversational.
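For the curious, here is a minimal PyTorch sketch of the general idea behind decomposing one global score into local, per-turn rewards. It is an illustration under stated assumptions, not GELI's actual implementation: the model, the loss weighting, and the way the multimodal signal enters the objective are all hypothetical.

import torch
import torch.nn as nn

class LocalRewardModel(nn.Module):
    # Predicts a scalar reward for each dialogue-turn embedding.
    def __init__(self, dim: int):
        super().__init__()
        self.head = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, 1))

    def forward(self, turn_embs: torch.Tensor) -> torch.Tensor:
        # turn_embs: (batch, num_turns, dim) -> (batch, num_turns)
        return self.head(turn_embs).squeeze(-1)

def decomposition_loss(local_rewards, global_score, multimodal_signal):
    # The local rewards should sum to the single global explicit score
    # (a standard return-decomposition objective), while an auxiliary term
    # nudges each local reward toward a per-turn multimodal feedback signal
    # (e.g., listener affect). Both terms and the 0.1 weight are illustrative.
    recon = ((local_rewards.sum(dim=1) - global_score) ** 2).mean()
    shaping = ((local_rewards - multimodal_signal) ** 2).mean()
    return recon + 0.1 * shaping

# Toy usage: 4 dialogues, 10 turns each, 256-dim turn embeddings.
model = LocalRewardModel(256)
local = model(torch.randn(4, 10, 256))
loss = decomposition_loss(local, torch.randn(4), torch.randn(4, 10))
loss.backward()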
Lecture Presentations Multimodal Dataset: Towards Understanding Multimodality in Educational Videos
Dong Won Lee, Chaitanya Ahuja, Paul Pu Liang, Sanika Natu, Louis-Philippe Morency
ICCV, 2023
paper / code
We introduce the Multimodal Lecture Presentations dataset and PolyViLT, a multimodal transformer trained with a multi-instance learning loss. We also propose a large-scale benchmark testing the capabilities of machine learning models in multimodal understanding of educational content.
HIINT: Historical, Intra- and Inter-personal Dynamics Modeling with Cross-person Memory Transformer
Yubin Kim, Dong Won Lee, Paul Pu Liang, Sharifa Alghowinem, Cynthia Breazeal, Hae Won Park
ICMI, 2023
paper
We model Historical, Intra- and Inter-personal (HIINT) dynamics in conversation by incorporating memory modules into the Cross-person Memory Transformer to maintain temporal coherence and better represent the context of conversational behaviors.
Multipar-T: Multiparty-Transformer for Capturing Contingent Behaviors in Group Conversations
Dong Won Lee, Yubin Kim, Rosalind Picard, Cynthia Breazeal, Hae Won Park
IJCAI, 2023 (Oral)
paper
We introduce a new transformer architecture to model contingent behaviors in multiparty group conversations.
Low-resource Adaptation for Personalized Co-Speech Gesture Generation
Chaitanya Ahuja, Dong Won Lee, Louis-Philippe Morency
CVPR, 2022
paper
We propose a new approach to crossmodal generative modeling in low-resource settings, with the goal of creating a personalized gesture generation model (e.g., as part of a personalized avatar) from limited data for a new speaker.
Crossmodal clustered contrastive learning: Grounding of spoken language to gestures
Dong Won Lee, Chaitanya Ahuja, Louis-Philippe Morency
ICMI, GENEA Workshop, 2021
paper / presentation video / code
We propose a new crossmodal contrastive learning loss to encourage a stronger grounding between gestures and spoken language.
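As a rough illustration of what such a grounding objective can look like, here is a minimal PyTorch sketch of a symmetric crossmodal contrastive (InfoNCE-style) loss between gesture and language embeddings. The shapes and temperature are assumptions, and the clustering component of the paper's loss is not reproduced here.

import torch
import torch.nn.functional as F

def crossmodal_contrastive_loss(gesture_emb, language_emb, temperature=0.07):
    # gesture_emb, language_emb: (batch, dim); row i of each is a matched pair.
    g = F.normalize(gesture_emb, dim=-1)
    l = F.normalize(language_emb, dim=-1)
    logits = g @ l.t() / temperature      # (batch, batch) cosine similarities
    targets = torch.arange(g.size(0))     # matched pairs lie on the diagonal
    # Contrast in both directions: gesture -> language and language -> gesture.
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.t(), targets)) / 2

# Toy usage: a batch of 8 clips with 128-dim embeddings per modality.
loss = crossmodal_contrastive_loss(torch.randn(8, 128), torch.randn(8, 128))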
No Gestures Left Behind: Learning Relationships between Spoken Language and Freeform Gestures
Chaitanya Ahuja, Dong Won Lee, Ryo Ishii, Louis-Philippe Morency
EMNLP, Findings, 2020
paper / code
We study relationships between spoken language and co-speech gestures to account for the long tail of the text-gesture distribution.
Style Transfer for Co-Speech Gesture Animation: A Multi-Speaker Conditional Mixture Approach
Chaitanya Ahuja, Dong Won Lee, Yukiko I. Nakano, Louis-Philippe Morency
ECCV, 2020
project page / paper / code
We propose a new style transfer model that learns the individual gesture styles of different speakers.
Planned Submissions
06/2024: EMNLP 2024
Mentors
I have been blessed to meet amazing mentors who have guided me to become a better researcher (and, more importantly, a good person).
I believe I can only repay what they've done for me by helping others on their own journeys in any way I can.
Please don't hesitate to reach out!
Mentors and Advisors (in alphabetical order):
- Ben Eysenbach - CMU
- Chaitanya Ahuja - CMU
- Cynthia Breazeal - MIT
- David Kosbie - CMU
- Hae Won Park - MIT
- Louis-Philippe Morency - CMU
- Mark Stehlik - CMU
- Paul Pu Liang - CMU
- Roz Picard - MIT
- Sid Sen - Microsoft Research
- Yoon Kim - MIT