I'm a software engineer passionate about building impactful applications. I specialize in Machine Learning and enjoy the broader spectrum of software development. My experience includes developing large-scale distributed systems and building batch-processing workflows and real-time streaming pipelines for content personalization across various Yahoo! products.
- Developed content-based user profiles from users' activities and interests for personalization with ML models
- Created real-time streaming pipelines that generate hundreds of millions of profiles from user clicks and dwell time, driving a 2.5% increase in homepage and content-page revenue and improving user retention
- Designed a config-driven feature generation system to transform content and user data into ML features, accelerating feature development and integration for model training and online ranking (see the feature-generation sketch after this list)
- Implemented a scalable and generic user reaction microservice with gRPC, Apache Pulsar, Apache Storm, and NoSQL to process and serve configurable interactions across Yahoo!'s product ecosystem
- Coordinated a cross-functional privacy initiative with legal, product, and engineering teams to build a system that identifies and excludes sensitive topics, prioritizing user privacy and ensuring legal compliance
- Designed a model serving automation pipeline to enable continuous delivery of machine learning models
- Reduced model deployment time for TensorFlow Serving on Kubernetes to improve bucket-testing efficiency
- Created a RESTful Jetty server to validate models and monitor server health and performance
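A minimal sketch of the config-driven feature generation idea referenced above, under the assumption that each feature is declared as a (source field, transform) pair; the `FEATURE_CONFIG` entries, field names, and transforms are illustrative, not the production system.

```python
# Hypothetical sketch of a config-driven feature generator: each feature is
# declared in a config entry (source field + transform), so adding a feature
# for model training or online ranking only requires a new config entry.
import math
from typing import Any, Callable, Dict

# Registry of reusable transforms (illustrative only).
TRANSFORMS: Dict[str, Callable[[Any], float]] = {
    "identity": float,
    "log1p": lambda x: math.log1p(float(x)),
    "bucketize_dwell": lambda secs: min(float(secs) // 10, 10.0),
}

# Declarative feature config: feature name -> (raw field, transform name).
FEATURE_CONFIG = {
    "click_count_log": ("clicks", "log1p"),
    "dwell_bucket": ("dwell_seconds", "bucketize_dwell"),
    "recency_days": ("days_since_visit", "identity"),
}

def generate_features(record: Dict[str, Any]) -> Dict[str, float]:
    """Turn a raw user/content record into named ML features per the config."""
    features = {}
    for name, (field, transform) in FEATURE_CONFIG.items():
        if field in record:
            features[name] = TRANSFORMS[transform](record[field])
    return features

print(generate_features({"clicks": 42, "dwell_seconds": 37, "days_since_visit": 2}))
```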
- Developed a Cost-Sensitive Learning optimization technique to address the Multiclass Imbalance Problem (sketched after this list)
- Improved average multi-class recall using Reinforcement Learning (PPO); work published at TAAI
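A minimal sketch of cost-sensitive learning for class imbalance, assuming inverse-frequency class weights inside a weighted cross-entropy loss; it illustrates the general technique, not the exact method published at TAAI.

```python
# Illustrative cost-sensitive loss: rare classes get larger weights so the
# optimizer pays more for misclassifying them, one common way to address
# multiclass imbalance.
import numpy as np

def class_weights(labels: np.ndarray, n_classes: int) -> np.ndarray:
    counts = np.bincount(labels, minlength=n_classes).astype(float)
    counts[counts == 0] = 1.0
    return counts.sum() / (n_classes * counts)     # inverse-frequency weights

def weighted_cross_entropy(probs: np.ndarray, labels: np.ndarray,
                           weights: np.ndarray) -> float:
    """Mean cross-entropy where each example is scaled by its class weight."""
    eps = 1e-12
    per_example = -np.log(probs[np.arange(len(labels)), labels] + eps)
    return float(np.mean(weights[labels] * per_example))

labels = np.array([0, 0, 0, 0, 1])                 # class 1 is rare
probs = np.full((5, 2), 0.5)                       # toy predicted probabilities
w = class_weights(labels, n_classes=2)
print(w, weighted_cross_entropy(probs, labels, w))
```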
- Received the College Student Research Award, given to only one project in the Department of Computer Science
- Held office hours to provide individualized help for more than 200 students in the Machine Learning course
- Developed a speaker recognition pipeline and API for verifying speakers in the text-independent scenario (see the sketch after this list)
- Created a dashboard to visualize model performance and monitor usage for troubleshooting and improvement
- Integrated traditional speech processing methods with deep learning to increase verification rate by 15%
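A minimal sketch of the text-independent verification flow (enroll a speaker, then score a new utterance with cosine similarity); the `embed` function below is a toy placeholder for the real speech front-end and deep speaker encoder.

```python
# Illustrative verification API: enroll a speaker from reference utterances,
# then verify a new utterance by cosine similarity of speaker embeddings.
import numpy as np

def embed(utterance):
    """Placeholder embedding: in practice this is a trained speaker encoder."""
    vec = np.fft.rfft(utterance, n=256).real[:64]  # toy fixed-length vector
    return vec / (np.linalg.norm(vec) + 1e-9)      # unit-normalized

def enroll(utterances):
    """Average embeddings of the enrollment utterances into one voiceprint."""
    return np.mean([embed(u) for u in utterances], axis=0)

def verify(voiceprint, utterance, threshold=0.7):
    # Cosine similarity (embed() is already unit-norm).
    score = float(np.dot(voiceprint, embed(utterance)) /
                  (np.linalg.norm(voiceprint) + 1e-9))
    return score >= threshold

rng = np.random.default_rng(0)
refs = [rng.standard_normal(16000) for _ in range(3)]   # ~1 s of 16 kHz audio
print(verify(enroll(refs), rng.standard_normal(16000)))
```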
GPA: 4.0/4.0
Courses: Natural Language Processing, Recommender Systems, Virtualization, Compiler Design, Computer Vision, Convex Optimization
GPA: 4.14/4.30; Rank: 10/123
Courses: Algorithm Design and Analysis, Digital Image Processing, Video Communication, Machine Learning, Computer Architecture, Computer Networks, Operating Systems
- Implemented a web application powered by Node.js and Express on the backend, featuring a responsive front-end interface designed with React, Ajax, jQuery, and Bootstrap
- Leveraged fast style transfer in Python and TensorFlow to transform uploaded user pictures into artistic images, reducing transformation time by 50% while serving multiple models
- Designed an MVC framework with a RESTful API using Express and MongoDB, and deployed the server with Nginx on AWS EC2, ELB, Lambda, and Route 53
- Built a real-time chat feature using Socket.IO and secured user authentication with Passport.js
- Developed a chatbot with a tailored personality profile, powered by a Transformer model for personalized conversations
- Applied Transfer Learning to fine-tune the model, pre-trained on a large text corpus, on a small dialogue dataset (sketched below)
- Created a web application using Django for users to interact with the chatbot and customize its personality
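A minimal sketch of the transfer-learning step, assuming a Hugging Face causal language model (GPT-2 as a stand-in for the pre-trained Transformer) fine-tuned on a tiny persona-tagged dialogue set; the data, model choice, and hyperparameters are illustrative only.

```python
# Illustrative transfer learning: fine-tune a model pre-trained on a large
# corpus (GPT-2 here as a stand-in) on a small persona-tagged dialogue set.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

dialogues = [  # made-up persona-conditioned examples
    "persona: cheerful, loves hiking. user: Hi! bot: Hey there, great day for a hike!",
    "persona: cheerful, loves hiking. user: Any plans? bot: Heading to the trails!",
]

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
model.train()
for epoch in range(3):                      # a few passes over the tiny dataset
    for text in dialogues:
        batch = tokenizer(text, return_tensors="pt")
        # For causal LM fine-tuning, the labels are the input ids themselves.
        loss = model(**batch, labels=batch["input_ids"]).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
print("final loss:", loss.item())
```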
- Ranked 8th out of 423 competitors with 1.099 Mean Squared Error on rating prediction, and 73.32% accuracy on read prediction
- Integrated implicit feedback into the objective function by maximizing the probability of relative preference prediction
- Used Latent Factor Models and Alternating Least Squares for faster rating prediction
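A minimal sketch of a latent factor model trained with Alternating Least Squares on a toy explicit rating matrix; the actual contest pipeline and the implicit-feedback (relative-preference) objective are not reproduced here.

```python
# Toy ALS for matrix factorization: alternate ridge-regression solves for
# user factors U and item factors V so that U @ V.T approximates the
# observed ratings.
import numpy as np

R = np.array([[5, 3, 0, 1],        # 0 marks a missing rating
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [0, 1, 5, 4]], dtype=float)
mask = R > 0
n_users, n_items, k, lam = R.shape[0], R.shape[1], 2, 0.1

rng = np.random.default_rng(0)
U = rng.standard_normal((n_users, k))
V = rng.standard_normal((n_items, k))

for _ in range(20):
    # Fix item factors, solve a regularized least squares per user; then swap.
    for u in range(n_users):
        Vu = V[mask[u]]
        U[u] = np.linalg.solve(Vu.T @ Vu + lam * np.eye(k), Vu.T @ R[u, mask[u]])
    for i in range(n_items):
        Ui = U[mask[:, i]]
        V[i] = np.linalg.solve(Ui.T @ Ui + lam * np.eye(k), Ui.T @ R[mask[:, i], i])

rmse = np.sqrt(np.mean((R[mask] - (U @ V.T)[mask]) ** 2))
print("train RMSE:", round(rmse, 3))
```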
- Built a streaming video codec that integrates a Convolutional Neural Network Autoencoder and Huffman Coding with the conventional H.264 codec
- Used a residual autoencoder to minimize the difference between the original and compressed video, improving compression efficiency
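A minimal sketch of a convolutional residual autoencoder of the kind described above, where the network learns to reconstruct the residual between an original frame and its conventional-codec output; the layer sizes and toy tensors are illustrative assumptions, not the project's configuration.

```python
# Toy residual autoencoder: compress the residual between an original frame
# and its H.264-style reconstruction. The encoder's latent would be entropy
# coded (e.g., Huffman) in a full codec.
import torch
import torch.nn as nn

class ResidualAutoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(            # downsample the residual 4x
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 8, 4, stride=2, padding=1),
        )
        self.decoder = nn.Sequential(            # reconstruct the residual
            nn.ConvTranspose2d(8, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1),
        )

    def forward(self, residual):
        return self.decoder(self.encoder(residual))

original = torch.rand(1, 3, 64, 64)              # toy frame
codec_recon = torch.rand(1, 3, 64, 64)           # stand-in for the H.264 output
residual = original - codec_recon
model = ResidualAutoencoder()
loss = nn.functional.mse_loss(model(residual), residual)  # training objective
print(loss.item())
```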