Projects
-
LLM Instruction Following in Open-Ended Generation
2024
Python
- Created a model-in-the-loop data pipeline that generates prompts and preference response pairs with the self-instruct framework. Improved data quality through critique-and-revise, increasing the number of correct labels 4x.
- Built instruction-following capabilities, such as keyword and length constraints, through supervised fine-tuning and alignment tuning (RLHF, PPO and reward modeling, DPO), improving IFEval accuracy by a relative 56%.
-
Evaluating and Improving LLM Creative Writing Quality
2023
Python
- Designed single-turn and multi-turn human evaluation datasets that measure a model's creative writing quality along 6 dimensions. Led the human evaluation and established inter-annotator agreement with a Spearman coefficient of 0.46.
- Developed a specialized prompting strategy for multi-turn dialog evaluation with LLM-as-a-judge on specific dimensions, making it a reliable proxy for human evaluation.
- Explored supervised fine-tuning with a curriculum and achieved a 61.5% win rate against the baseline in response quality as judged by GPT-4.
-
Dialog-State-Aware Prompt Composer for LLM
2023
Python
- Developed a two-level prompting system that first determines the user intent and then selects relevant APIs and exemplars to compose the prompt based on the dialog history.
- Built a multi-turn dialog inference orchestrator that supports tool API calls and a remote model server interface, used by the team for experimentation and evaluation.
-
Argument Filling Model in Multimodal Dialogs
2022
Python
- For on-screen object selection in dialogs on multimodal screen devices, designed a model of graceful exit (instead of returning a wrong object) for cases where no object on the screen matches the user's request.
- Built and deployed byte pair encoding for the argument filling model, addressing the out-of-vocabulary issue and improving object selection accuracy by 10.2%.
-
Indexing New Product Features in Search
2022
Scala | Spark
- Published new product features to the search engine index by creating an end-to-end pipeline spanning feature production, job scheduling, data warehouse registration, and index publication.
-
Predicting Customer Actions for Low-Resource Items
2021
Python | PyTorch
- Alleviated the lack of action data for cold-start (low-resource) items by introducing an item metadata graph that transfers action (e.g. click) knowledge from popular (high-resource) items to cold-start items, improving ROC-AUC by 18% over the counting-based baseline.
-
Online Product Search Ranking Feature for Long-Tail Queries
2020 - 2021
Python | TensorFlow
- Identified, through deep-dive analysis, two causes of degraded model performance: failure to generalize to queries containing tokens unseen in training, and an imbalance in the number of training instances between head and tail queries.
- Replaced the hash-indexed embedding with a vocabulary-indexed embedding, improving PR-AUC by 2.4% for head products and tail queries.
-
Generating Queries for Cold-Start Products
2019 - 2020
Python | TensorFlow
- Proposed and implemented novel multi-encoder neural language generation models that predict search queries from product information, bringing a 6.07% gain in new product sales.
- Integrated the query generation model into AWS SageMaker by building a new server-container system that interacts with the SageMaker API via HTTP messaging, enabling deployment of models trained outside of SageMaker.
-
New Product Search Metrics for A/B Testing
2019 - 2020
Python | Spark | Shell
- Built the data pipeline that generates and analyzes daily worldwide customer search data on new products for any online A/B test, used by all teams across Amazon to make launch decisions for new services.
-
Choosing Transfer Languages for Cross-Lingual Learning
2018 - 2019. Advised by Graham Neubig. With Yuyan Zhang, Chian-Yu Chen, Jean Lee, Zirui Li, Mengzhou Xia, Shruti Rijhwani, Junxian He, Zhisong Zhang, Xuezhe Ma, Antonios Anastasopoulos, and Patrick Littell.
Python
[pdf] [code]
- Proposed a ranking framework to select the optimal transfer languages for low-resource NLP tasks based on typological information and corpus statistics.
- Improved NDCG by 84% in ranking the best transfer languages for machine translation and POS tagging, compared with using any single most informative language or dataset attribute.
-
Two-Phase Ranking for Product Search
2018
Python
- Added customer engagement weighting to the machine learning model training and evaluation infrastructure.
- Enhanced the ranking quality of product search for Amazon Business by designing and training a new two-phase ranking model that raised NDCG by 0.71% and reduced 99th-percentile latency by 2.49%.
-
Fault Tolerance and Consistency in Distributed Systems
2018
Go
- Built replicated state machines using the Raft consensus algorithm, implementing leader election, log replication, and commit, and tested them in an environment where servers suffer concurrent failures and reconnections.
- Constructed a thread-safe key-value storage system that uses client-side caching, with leases managed by the storage servers, to achieve scalability and consistency.
-
Towards a General-Purpose Linguistic Annotation Backend
2018. Advised by Graham Neubig. With Yuyan Zhang, Chian-Yu Chen, Jean Lee, and Zirui Li.
Python | TensorFlow | Flask
[pdf]
- Created a RESTful API between the transcription engine and the annotation interface for language documentation.
- The automatic phonemic transcription system is a bidirectional LSTM recurrent neural network trained with the Connectionist Temporal Classification objective function.
-
Reinforcement Learning in Sequence-to-Sequence Models
2018. Conducted in a course taught by Graham Neubig. With Hai Pham and Shuxin Lin.
Python | PyTorch
[pdf] [code]
- Proposed two adaptive beam search methods using reinforcement learning and heuristic rules, reducing beam search time by 53% on the CCGbank dataset while retaining the same accuracy.
- Implemented the reinforcement learning environment and adaptive beam search framework, with the agent trained by the actor-critic method.
-
Semantic Adversarial Autoencoder for Zero-Shot Learning
2017. Conducted in a course taught by Ruslan Salakhutdinov. With Kangyan Zhou and Shihui Li.
Python | TensorFlow
[pdf] [code]
- Developed a novel architecture that incorporates a generative adversarial network (GAN) into the semantic autoencoder for zero-shot learning.
- Outperformed the semantic autoencoder on generalized zero-shot image classification tasks on the CUB 200 and AwA benchmark datasets.
-
Learning from Few Labeled Data on Large Knowledge Graph
2017
Python | Spark
- Classified a large amount of unlabeled data with only 5% of the data initially labeled by leveraging the Gaussian random field model.
- Efficiently implemented the label propagation algorithm in Apache Spark and applied it to a large dataset, the Freebase knowledge graph, with over 300,000 entries and 2 million relations.
-
Multi-Threaded Web Proxy for HTTP service
2017
C | Linux | TCP/IP
- Implemented an HTTP proxy that serves content from remote servers to client web browsers.
- Enhanced performance by supporting multi-threaded concurrent connections and LRU caching of web content.
-
Newton Method Optimization for Deep Learning
2016 - 2017. Advised by Chih-Jen Lin. With Chien-Chih Wang, Kent Loong Tan, Chun-Ting Chen, S. Sathiya Keerthi, Dhruv Mahajan, S. Sundararajan.
MATLAB | Python
[pdf] [suppl, code]
- Designed a Newton method optimizer for deep neural networks, exploiting data structure manipulations to reduce overhead and increase efficiency, as verified with a profiling tool.
- Reduced the bottleneck memory consumption from O(n^2) to O(n) by analyzing the backpropagation procedure and decomposing the Jacobian matrix into compositional blocks.
- Implemented a stochastic gradient descent (SGD) optimizer with adaptive learning rate and momentum, supporting weight decay and early stopping.
-
Numerical Library for Astrophysics Data
2013 - 2016. Advised by Pisin Chen.
Mathematica | Fortran | CAMB
[pdf] [code]
- Developed a numerical library to solve non-linear systems of gravitational perturbations.
- Improved the compatibility of the open-source science code CAMB by adding an interface for non-parametrized data and an interpolating function.