I began my career as a Project Control Engineer at CTCI, Taiwan's largest international EPC company, where I spent four years specializing in cost and schedule control for large-scale infrastructure projects. Throughout my engineering career, I worked across multiple project phases and disciplines, developing expertise in both cost analysis and schedule management while supporting various long-term industrial initiatives.
This experience taught me to work effectively in high-complexity environments, synthesize information from diverse sources, coordinate across cross-functional teams, and deliver reliable analysis under pressure. Whether tracking performance metrics or developing project schedules, the fundamental skill remained constant: transforming complex data into actionable insights that support decision-making.
Over time, I became deeply intrigued by the power of data and AI-driven systems, especially their potential to bring scalable solutions to the real world. Now, I'm combining the structured analytical mindset I developed in engineering execution with hands-on machine learning development, from training deep learning models to deploying full-stack AI applications.
What makes my background unique: My engineering experience equipped me with skills that directly translate to AI product development—breaking down ambiguous problems into executable approaches, balancing technical feasibility with practical constraints, coordinating with diverse stakeholders, and adapting quickly to new technical domains. Whether I'm architecting a multimodal AI system or designing an image segmentation pipeline, the core competency remains the same: understanding complex problems, structuring practical solutions, and delivering measurable results.
A hybrid model for dog breed classification and recommendation
PawMatchAI combines a sophisticated CNN backbone with advanced Transformer layers and a specialized Morphological Feature Extractor to create an innovative hybrid architecture for comprehensive dog breed identification. The project features several key innovations and achievements:
- Advanced Architecture: Hybrid model combining Convolutional Neural Networks with Transformer layers for enhanced feature extraction and classification accuracy
- Custom Feature Engineering: Proprietary Morphological Feature Extractor inspired by expert veterinary observation patterns
- High Performance: Achieved an 88.7% F1-score on breed classification tasks
- Multi-functional Platform: Enables users to identify breeds, compare breed characteristics, and get LLM-powered recommendations for optimal matches based on lifestyle preferences
- Business Intelligence Integration: Comprehensive Tableau dashboard transforms breed classification data into strategic business insights, analyzing market segmentation patterns and delivering analytical breed recommendations
- Recognition: Featured on Hugging Face's "Spaces of the Week" with over 34,000 visits and 14,000 GPU runs
- Links: 🌐 Try the Demo | 🗂️ Explore the Project | 📊 Business Dashboard
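PawMatchAI's exact architecture isn't reproduced here, but the CNN-plus-Transformer hybrid pattern the section describes can be sketched in PyTorch. This is a minimal toy sketch with made-up dimensions and class names, not the project's real model: convolutional features are flattened into a token sequence and refined by self-attention before classification.

```python
import torch
import torch.nn as nn

class HybridClassifier(nn.Module):
    """Toy CNN + Transformer hybrid (illustrative only): a small CNN
    backbone produces a spatial feature map, which is treated as a
    token sequence for a Transformer encoder."""
    def __init__(self, num_breeds=120, dim=64):
        super().__init__()
        # Small CNN backbone (stand-in for a pretrained backbone)
        self.backbone = nn.Sequential(
            nn.Conv2d(3, dim, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(dim, dim, kernel_size=3, stride=2, padding=1), nn.ReLU(),
        )
        # Transformer layers operate on the spatial grid as tokens
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(dim, num_breeds)

    def forward(self, x):
        f = self.backbone(x)                   # (B, dim, H', W')
        tokens = f.flatten(2).transpose(1, 2)  # (B, H'*W', dim)
        tokens = self.encoder(tokens)
        return self.head(tokens.mean(dim=1))   # pool tokens, classify

model = HybridClassifier()
logits = model(torch.randn(2, 3, 64, 64))  # logits shape: (2, 120)
```

In a real hybrid, the CNN would typically be a pretrained backbone and additional branches (such as the morphological feature extractor mentioned above) would feed extra features into the fusion stage.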
Advanced multi-modal system for deep scene understanding
Vision Scout represents a sophisticated approach to visual intelligence, orchestrating multiple state-of-the-art models to transform complex visual data into comprehensive narratives:
- Multi-model Architecture: Seamlessly integrates YOLO11, CLIP, Places365, and Llama 3.2 for comprehensive scene analysis
- Deep Understanding: Transforms raw visual data into human-readable stories and detailed scene descriptions
- Advanced Processing: Combines object detection, image classification, scene recognition, and natural language generation
- Production-Grade Architecture: Three-layer facade design managing 35,000+ lines of code across 70 specialized classes, showcasing sophisticated system design capabilities for enterprise-scale multimodal AI deployment
- Foundational Research: Tackles the fundamental challenge of true scene understanding beyond simple object detection, addressing spatial relationships, functional purposes, and contextual reasoning that form the foundation for future autonomous systems and AI assistants
- Future Vision: Represents ongoing exploration toward achieving genuine scene comprehension, a critical stepping stone toward more general AI systems capable of perceiving, reasoning about, and interacting with the physical world in human-like ways
- Community Recognition: Featured on Hugging Face's "Spaces of the Week" with significant user engagement
- Performance Metrics: Over 11,000 visits and 4,000+ GPU runs within three months, demonstrating strong user adoption
- Links: 🌐 Try the Demo | 🗂️ Explore the Project
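The layered facade design mentioned above can be illustrated with a small pure-Python sketch. All class names are hypothetical and every model is stubbed out; the point is only the shape of the pattern, in which each underlying model sits behind a narrow interface and a single entry point coordinates them:

```python
# Hypothetical sketch of a layered facade for multi-model scene analysis.

class DetectionLayer:
    """Stand-in for an object detector such as YOLO (stubbed)."""
    def detect(self, image):
        return [{"label": "person", "box": (10, 10, 50, 80)}]

class SceneLayer:
    """Stand-in for a scene classifier such as Places365 (stubbed)."""
    def classify(self, image):
        return "street"

class NarrativeLayer:
    """Stand-in for an LLM that verbalizes structured findings (stubbed)."""
    def describe(self, scene, objects):
        labels = ", ".join(o["label"] for o in objects)
        return f"A {scene} scene containing: {labels}."

class SceneUnderstandingFacade:
    """Top layer: one call fans out to every analyzer and merges results."""
    def __init__(self):
        self.detector = DetectionLayer()
        self.scene = SceneLayer()
        self.narrator = NarrativeLayer()

    def analyze(self, image):
        objects = self.detector.detect(image)
        scene = self.scene.classify(image)
        return {"scene": scene, "objects": objects,
                "story": self.narrator.describe(scene, objects)}

result = SceneUnderstandingFacade().analyze(image=None)
print(result["story"])  # "A street scene containing: person."
```

The benefit of this structure at scale is that each layer can be tested, swapped, or upgraded independently while callers only ever see the single `analyze` entry point.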
Professional AI-powered image enhancement and video generation platform
VividFlow represents a significant advancement in accessible AI content creation, combining professional-grade image-to-video generation with intelligent background replacement and artistic style transfer in a unified platform. The system addresses critical pain points in digital content production by enabling non-technical users to transform static imagery into dynamic, broadcast-quality videos while simultaneously offering sophisticated background synthesis and artistic rendering capabilities. The project features several key innovations and achievements:
- Triple Creative Workflow: Unified platform integrating image-to-video generation, intelligent background replacement, and artistic style transfer within a streamlined three-tab interface
- Intelligent Motion System: Eight curated template categories designed to prevent common generation artifacts, with support for custom natural language prompts and optional AI-powered prompt enhancement
- Optimized Performance: FP8 quantization and Lightning LoRA distillation enable efficient generation while maintaining professional output quality
- Advanced Background Synthesis: Multi-tier segmentation with twenty-four professionally curated scene templates spanning professional, natural, urban, artistic, and seasonal environments, complemented by AI-powered inpainting for precision artifact removal
- Artistic Style Transfer: Six foundational artistic styles and five balanced blend presets transform photographs into distinctive interpretations ranging from 3D cartoon to classical oil painting, with optional identity preservation technology
- Artifact-Free Compositing: Lab color space blending technology ensures seamless subject-background integration with precise edge handling and manual Touch Up refinement capabilities
- Recognition: Featured on Hugging Face's "Spaces of the Week" with over 26,000 visits
- Links: 🌐 Try the Demo | 🗂️ Explore the Project
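VividFlow's actual compositing pipeline isn't shown here, but the core idea of blending in Lab space rather than RGB can be sketched with NumPy, using the standard sRGB ↔ CIE Lab conversion formulas (D65 white point). Function names and the toy images are my own; the sketch simply alpha-composites in Lab, where lightness and chroma interpolate more perceptually than raw RGB channels do:

```python
import numpy as np

# sRGB <-> CIE Lab conversion (D65), then a simple alpha blend in Lab space.
M_RGB2XYZ = np.array([[0.4124564, 0.3575761, 0.1804375],
                      [0.2126729, 0.7151522, 0.0721750],
                      [0.0193339, 0.1191920, 0.9503041]])
M_XYZ2RGB = np.linalg.inv(M_RGB2XYZ)
WHITE = np.array([0.95047, 1.0, 1.08883])  # D65 reference white

def rgb_to_lab(rgb):
    """rgb: float array (..., 3) in [0, 1] -> Lab array (..., 3)."""
    lin = np.where(rgb <= 0.04045, rgb / 12.92, ((rgb + 0.055) / 1.055) ** 2.4)
    xyz = lin @ M_RGB2XYZ.T / WHITE
    d = 6 / 29
    f = np.where(xyz > d ** 3, np.cbrt(xyz), xyz / (3 * d * d) + 4 / 29)
    L = 116 * f[..., 1] - 16
    a = 500 * (f[..., 0] - f[..., 1])
    b = 200 * (f[..., 1] - f[..., 2])
    return np.stack([L, a, b], axis=-1)

def lab_to_rgb(lab):
    """Inverse of rgb_to_lab (results clipped back into [0, 1])."""
    fy = (lab[..., 0] + 16) / 116
    fx = fy + lab[..., 1] / 500
    fz = fy - lab[..., 2] / 200
    f = np.stack([fx, fy, fz], axis=-1)
    d = 6 / 29
    xyz = np.where(f > d, f ** 3, 3 * d * d * (f - 4 / 29)) * WHITE
    lin = xyz @ M_XYZ2RGB.T
    return np.clip(np.where(lin <= 0.0031308, lin * 12.92,
                            1.055 * np.clip(lin, 0, None) ** (1 / 2.4) - 0.055),
                   0, 1)

def blend_lab(subject, background, alpha):
    """Alpha-composite in Lab space to soften color fringing at edges."""
    lab = alpha[..., None] * rgb_to_lab(subject) + \
          (1 - alpha[..., None]) * rgb_to_lab(background)
    return lab_to_rgb(lab)

subject = np.full((4, 4, 3), [0.9, 0.2, 0.2])     # reddish foreground
background = np.full((4, 4, 3), [0.2, 0.3, 0.8])  # bluish backdrop
out = blend_lab(subject, background, np.full((4, 4), 0.5))
```

In practice the alpha mask along a subject's edge is soft, and interpolating those transition pixels in Lab rather than RGB is what reduces the halo-like color artifacts the section refers to.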
Comprehensive data science and machine learning portfolio repository
The Learning Record repository chronicles a complete journey through data science fundamentals while tackling real-world business challenges. This collection represents hands-on experience solving complex problems across multiple industries and technical disciplines:
- Financial Analytics: Credit card fraud detection achieving 99% AUC performance using XGBoost and Bayesian optimization
- Credit Risk Assessment: Comprehensive credit score classification models with 85% accuracy utilizing ensemble methods and neural networks
- Natural Language Processing: Advanced MBTI personality prediction through sophisticated text analysis and NLP techniques
- Customer Analytics: E-commerce segmentation analysis using K-means and DBSCAN with optimized silhouette scores
- Time Series Forecasting: Retail sales prediction implementing ARIMA and SARIMAX statistical methodologies
- Signal Processing: Human activity recognition from smartphone sensor data using advanced preprocessing and dimensionality reduction techniques
- Audio Classification: Music genre classification from audio features with sophisticated feature engineering and model optimization
- Links: 🗂️ Explore the Repository
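The segmentation bullet above mentions choosing clusterings by silhouette score. A minimal NumPy sketch of that idea (toy data and my own helper names, not the repository's actual pipeline) looks like this: fit k-means, then score how well-separated the clusters are.

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    """Plain Lloyd's algorithm: assign to nearest center, recompute means."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
        centers = np.array([X[labels == j].mean(axis=0) for j in range(k)])
    return labels

def silhouette(X, labels):
    """Mean silhouette: (b - a) / max(a, b) per point, averaged, where a is
    mean intra-cluster distance and b is mean distance to the nearest
    other cluster. Ranges from -1 (bad) to 1 (well separated)."""
    D = np.linalg.norm(X[:, None] - X[None], axis=-1)
    scores = []
    for i, li in enumerate(labels):
        same = labels == li
        a = D[i, same & (np.arange(len(X)) != i)].mean()
        b = min(D[i, labels == lj].mean() for lj in set(labels) if lj != li)
        scores.append((b - a) / max(a, b))
    return float(np.mean(scores))

# Two well-separated blobs: silhouette should be close to 1 for k = 2
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.3, (50, 2)), rng.normal(5, 0.3, (50, 2))])
labels = kmeans(X, k=2)
score = silhouette(X, labels)
```

Sweeping `k` and keeping the value with the highest silhouette is the usual way this metric guides segmentation; DBSCAN fits the same evaluation loop with its own hyperparameters.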
I write about deep learning architectures, hybrid modeling, and AI system design—translating complex concepts into actionable engineering insights.
| Title | Published | Pageviews | Engaged Views | Highlights | Recognition |
|---|---|---|---|---|---|
| 🧠 From Fuzzy to Precise: How Morphological Feature Extractors Enhance AI Recognition | 2025/03/25 | 505 | 294 | Morphological reasoning in CNNs | Deep Dives |
| 🧩 The Art of Hybrid Architectures: Blending Convolutional and Transformer Models for Explainability | 2025/03/28 | 1,502 | 783 | Layered hybrid design: CNN + Transformer | Deep Dives |
| 🔗 Beyond Model Stacking: The Architecture Principles That Make Multimodal AI Systems Work | 2025/06/19 | 6,056 | 2,197 | Multimodal system design & architecture thinking | Deep Dives |
| 🤖 Four AI Minds in Concert: A Deep Dive into Multimodal AI Fusion | 2025/07/02 | 2,173 | 897 | An in-depth exploration of multi-model collaboration in AI systems | Deep Dives |
| 🌍 Scene Understanding in Action: Real-World Validation of Multimodal AI Integration | 2025/07/10 | 590 | 360 | Real-world benchmarking of integrated AI collaborative systems | Deep Dives |
| 🎨 From RGB to Lab: Addressing Color Artifacts in AI Image Compositing | 2026/01/16 | 326 | 142 | Three-tier segmentation strategy & Lab color space correction | Editor's Pick |
🔹 Machine Learning & AI Enthusiast
Hands-on in deep learning, computer vision, NLP, and model deployment, with a focus on building useful, explainable, and well-integrated AI solutions.
🔹 Engineer Turned AI Builder
From managing construction schedules to orchestrating multi-model AI systems, I carry the same structured, iterative mindset—whether defining MVPs or architecting segmentation pipelines. I bring ideas from concept to working code: background replacement with Lab color space correction, multimodal fusion with YOLO + CLIP, morphological feature extractors that enhance CNN reasoning.
🔹 Data Quality Advocate
I believe in "Garbage in, garbage out"—a lesson that applies as much to training transformers as it does to a poorly defined CPM schedule. Whether it's feature engineering for XGBoost or preprocessing for diffusion models, a well-prepared dataset determines everything downstream.
🔹 Product-Minded Developer
I don't just build AI models; I develop solutions that address real user needs. PawMatchAI emerged from understanding breed selection challenges; Vision Scout transforms complex visual scenes into comprehensive narratives for users seeking deeper image understanding; VividFlow tackles professional content creation for non-technical users. I prioritize features based on impact, iterate based on feedback, and measure success through actual adoption. The 69,000+ combined visits across my projects reflect this user-first approach, as does my technical writing on Towards Data Science—translating complex architectures into actionable insights for practitioners.
I'm actively seeking opportunities in:
👉 Technical Program Manager
👉 Operations Program Manager / Analytics
I'm particularly interested in connecting with:
- Teams building AI-powered products in computer vision, generative AI, or multimodal systems
- Organizations that value both technical depth and product thinking
- Companies fostering a culture of open collaboration and continuous learning
Whether you're exploring potential collaboration, looking for someone to join your team, or want to discuss AI product development challenges, I'd welcome the conversation.
"Every challenge is a puzzle — it's just waiting for the right combination of algorithms and insight."
