AI Researcher
Hi! I'm an AI Researcher originally from Korea, with formative years in NYC, now based in SF. I specialize in training and deploying LLMs for enterprise clients across niche domains. This entails building novel data curation, pre/post-training, and evaluation pipelines that scale different model behaviors effectively.
Before AI research, I worked as a Data Scientist, partnering with C-Suite, Product, Growth, and Marketing teams in AI & Ed-Tech. I care about translating research into real-world impact rather than chasing a 0.007 bump in MMLU.
I'm a US citizen interested in defense tech 🇺🇸

Education

M.S. in Artificial Intelligence
Carnegie Mellon University
School of Computer Science, Language Technologies Institute
Pittsburgh, PA
2021 - 2023

B.A. in Economics-Statistics & Linguistics
Columbia University
Columbia College
New York, NY
2013 - 2017

Work Experience

Tech Lead, AI Research Scientist
Accenture
San Francisco, CA
November 2023 - Present
- Designed, trained, and deployed domain-specific LLMs and multi-agent systems, developing custom pre/post-training recipes and pipelines to enhance reasoning, alignment, function-calling, and RAG.
- Optimized large-scale model training and benchmarking workflows via data and model parallelism, cutting GPU allocation and manual effort by 400%.
- "Harnessing Business and Media Insights with Large Language Models" (arXiv), introducing FALM, the Fortune Magazine LLM built for business analytics.
- "SFT-GO: Supervised Fine-Tuning with Group Optimization for Large Language Models" (arXiv), introducing a token-importance-aware training approach that improves model accuracy and convergence efficiency across multiple datasets and base Llama models.

Data Scientist
Nearpod
Brooklyn, NY
January 2020 - April 2021
- First and only data science hire. Defined product metrics and built API pipelines, predictive models, and dashboards, working directly with the C-Suite. Played a key role in the $650M acquisition by Renaissance Learning.

Data Scientist
IBM
New York, NY
August 2017 - January 2020
- Led a 30-person team across marketing, product, and design. Launched 200+ personalization experiments, conducted 30+ user interviews, and leveraged funnel-based A/B testing to drive 230% engagement growth and $10M in revenue gains for 13 AI products. Earned the 2019 Q1 CMO Award.

Research

Multimodal Research Engineer
Carnegie Mellon University, School of Computer Science, Language Technologies Institute
Pittsburgh, PA
October 2021 - December 2022
- Temporal Action Localization: Achieved a 0.9-point AUC improvement over the BMN model on ActivityNet-1.3 (20K miscellaneous YouTube videos) by applying temporal shift, adding global information with Squeeze-and-Excitation, and ensembling to correctly identify action timestamps (Report)
- Multilingual Speech Recognition: Identified the optimal Byte Pair Encoding vocabulary size, incorporated language modeling, and employed self-supervised representation learning with HuBERT and wav2vec 2.0 to train a variety of ASR models on African Accented French using ESPnet (Report)
- Model Probing: Built a model-agnostic end-to-end framework to determine whether upstream tasks (NER, PoS tagging, DP, and SRL) learned in pretrained encoders are actually utilized in downstream tasks. Created a unified data processing infrastructure that covers 14 CoLA datasets (Report)
- Model Compression & Bias in LLMs: Applied pruning and quantization to BERT to assess their impact on bias levels with respect to question context, class balance, and prediction confidence (Report)
- Machine Translation for Low Resource Languages: Built a Transformer model with Fairseq, (1) augmenting the training data with copied target-side monolingual data and (2) selecting a second transfer language based on LangRank, which resulted in a 345% increase in BLEU over the given baseline for Bel-Eng and Aze-Eng (Report)
- Google Scholar

Projects

Driver/Restaurant Recommendation System
Developed an Uber-like Java program that matches real-time client requests to available cab drivers and to targeted restaurant advertisements based on regression models, processing multiple streams of GPS data, Apple Watch health and biometric data, and Google Maps and Yelp business data. Deployed Kafka and Samza on a YARN cluster provisioned on AWS.
Deep Learning Algorithms From Scratch
Implemented forward and backward methods for linear, 1D & 2D conv, RNN, dropout, batch norm, sigmoid, tanh, ReLU, softmax, hidden Markov model, matrix factorization, logistic regression, SVM, TF-IDF, decision tree, and neural network without relying on PyTorch or scikit-learn.
Twitter Analytics Web Service
Implemented ETL for QR code, blockchain validation, and user recommendation services on a Twitter dataset (~1TB) using AWS, Spark, and Kubernetes. Designed and optimized a MySQL DB schema for scale and throughput. Deployed Python web servers (Sanic).
Heterogeneous Storage for Social Networking
Configured and deployed heterogeneous SQL and NoSQL databases (MySQL, MongoDB, and Neo4j) in Java with a caching mechanism for a Facebook-like social networking web app.
Various AI Projects
Including personalized rental listing search, personalized reflection assistant, and an appointment booking tool. I'll share more when we meet.