xang1234/README.md

Hi, I'm David 👋

📍 Singapore | 📊 Data Scientist | Python Builder | ⚡ Rust Learner

Python R Jupyter scikit-learn PostgreSQL Kaggle Rust

Current Projects

  • 📰 news-tracker: Multi-platform financial data ingestion framework for tracking semiconductor & tech news. Ingests from Twitter, Reddit, Substack & 6+ news APIs → Redis Streams → FinBERT/MiniLM embeddings → pgvector semantic search. Includes NER, entity-level sentiment analysis, and a FastAPI serving layer.
  • ⚡ rapid-textrank: High-performance TextRank in Rust with Python bindings. Extract keywords 10–100x faster than pure Python. Supports TextRank, PositionRank & BiasedTextRank with 18 languages.
  • 📈 StockScreener: Full-stack stock scanner implementing William O'Neil's CANSLIM and Mark Minervini's trend template. 80+ filters, AI chatbot (Groq/DeepSeek/Gemini), theme discovery, and StockBee-style breakout scans.
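
For readers unfamiliar with TextRank, the idea behind the keyword extraction mentioned above can be sketched in a few lines of plain Python. This is only an illustration of the basic algorithm (PageRank over a word co-occurrence graph), not the rapid-textrank API or its Rust implementation. The function name `textrank_keywords` and its parameters are hypothetical, and the sketch assumes no stopword or part-of-speech filtering and does not cover PositionRank or BiasedTextRank.

```python
import re
from collections import defaultdict


def textrank_keywords(text, window=2, damping=0.85, iterations=30, top_k=5):
    """Hypothetical helper: rank words by PageRank over a co-occurrence graph."""
    words = re.findall(r"[a-z]+", text.lower())

    # Undirected graph: link each word to the distinct words that follow it
    # within `window` positions.
    neighbors = defaultdict(set)
    for i, word in enumerate(words):
        for other in words[i + 1 : i + 1 + window]:
            if other != word:
                neighbors[word].add(other)
                neighbors[other].add(word)
    if not neighbors:
        return []

    # Fixed number of PageRank-style updates (no convergence check, for brevity).
    scores = {word: 1.0 for word in neighbors}
    for _ in range(iterations):
        scores = {
            word: (1 - damping)
            + damping * sum(scores[n] / len(neighbors[n]) for n in neighbors[word])
            for word in neighbors
        }

    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_k]
```

A call such as `textrank_keywords(article_text, top_k=10)` would return the ten best-connected words in `article_text`; a practical version would at least filter stopwords before building the graph.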

NLP & Machine Learning

  • 🔤 pytextrank: Python implementation of TextRank for text document NLP parsing and summarization
  • 🏷️ Multi-label Classification: Deep dive into scikit-multilearn for multi-label problems
  • 🏷️ FastXML: Extreme multi-label classification
  • ⚖️ Imbalanced Datasets: Techniques for handling class imbalance in ML

Data Visualization & Analysis

  • 🗳️ Malaysian-Elections-GE14-2018: Visualizing the watershed 2018 Malaysian election with data from Wikipedia & Department of Statistics
  • 🗺️ Isochrone Maps: Travel time visualizations for Singapore using R and OpenTripPlanner
  • 🎵 Chord Diagrams: Interactive chord diagrams in R for visualizing global migration flows
  • 📊 Population Pyramids: Faceted population pyramids for demographic analysis
  • 📉 Time Series with Prophet: Forecasting time series data with Facebook Prophet

Tools & Scrapers

  • 💰 Finviz-Scraper: Simple, effective Python scraper for Finviz financial data ⭐ 35
  • 🔍 JobScraper: Job listing scraper
  • 🚢 at_sea: Python utility project

🏆 Kaggle: Competitions Expert

Rank: 4,987 / 204,611 (highest: 421) | 🥈 2 Silver, 🥉 4 Bronze | 20 competitions (all solo)

Competition | Domain | Rank
Lyft 3D Object Detection for Autonomous Vehicles | 3D CV / Autonomous Driving | 🥈 35 / 546 (top 6%)
Predicting Molecular Properties | Chemistry / Quantum Mechanics | 🥈 40 / 2,737 (top 1.5%)
Peking University/Baidu - Autonomous Driving | CV / Pose Estimation | 🥉 75 / 864 (top 9%)
Google QUEST Q&A Labeling | NLP / Question Answering | 🥉 104 / 1,571 (top 7%)
RSNA Intracranial Hemorrhage Detection | Medical Imaging | 🥉 104 / 1,345 (top 8%)
TensorFlow 2.0 Question Answering | NLP / Reading Comprehension | 🥉 119 / 1,233 (top 10%)

Connect

LinkedIn Kaggle Blog GitHub


Philosophy

"Good data deserves fast algorithms." I build tools that sit at the intersection of NLP, finance, and performance engineering. If Python's too slow, there's always Rust.

Pinned

  1. Finviz-Scraper (Public)

    Simple Python Scraper for Finviz

    Python 35 16

  2. pytextrank (Public)

    Forked from DerwenAI/pytextrank

    Python implementation of TextRank for text document NLP parsing and summarization

    Jupyter Notebook 13 18

  3. StockScreener (Public)

    Full-stack stock scanner implementing William O'Neil's CANSLIM and Mark Minervini's trend template. Features 80+ filters, AI chatbot (Groq/DeepSeek/Gemini), theme discovery, and StockBee-style brea…

    Python

  4. rapid-textrank (Public)

    ⚡ High-performance TextRank in Rust with Python bindings. Extract keywords 10-100x faster than pure Python. Supports TextRank, PositionRank & BiasedTextRank with 18 languages.

    Jupyter Notebook 2