omkarmusale0910/README.md

Hi 👋, I'm Omkar Musale

Software Engineer | Cloud & Data Engineering Specialist

Building scalable data pipelines and cloud infrastructure with Python, GCP, and Big Data technologies

LinkedIn Medium Email


🚀 About Me

  • 🔭 Currently working on: Building automated data pipelines and cloud-native applications
  • 🌱 Learning: FastAPI, AI Integration, Advanced Cloud Architecture, System Design
  • 💼 Experience: 2.5+ years in Cloud Engineering, Data Processing
  • 📝 Writing: Technical articles on Medium
  • 🎯 Interests: AI, Geospatial Data, Big Data, and Scalable Systems

💼 Work Experience

Software Engineer | Infocusp Innovations

July 2023 - January 2025

  • Built scalable data pipelines processing 100GB+ geospatial datasets (TIFF, CSV) using Google Earth Engine API
  • Optimized data-processing algorithms, achieving a 20% efficiency improvement
  • Deployed containerized applications on Kubernetes with infrastructure managed via Terraform
  • Implemented comprehensive monitoring and logging solutions using GCP Cloud Logging

Tech Stack: Python, GCP (GCS, Pub/Sub, Kubernetes, VM), Docker, Terraform, Google Earth Engine API

Software Intern | Infocusp Innovations

January 2023 - July 2023

  • Processed large-scale healthcare device data using PySpark on distributed systems (AWS Glue)
  • Developed automated data export pipelines with Cloud Scheduler, Batch Jobs, and Cloud Workflows
  • Created interactive data visualization dashboards using Streamlit for real-time insights
  • Analyzed accident data to identify high-risk patterns and trends using big data techniques

Tech Stack: PySpark, AWS Glue, Streamlit, Terraform, Cloud Workflows


🛠️ Technical Skills

Languages

Python, C++, Bash

Cloud & DevOps

GCP, Docker, Kubernetes, Terraform, Pub/Sub, BigQuery, Dataflow

Big Data & Databases

Apache Spark, Apache Beam, Kafka, PostgreSQL, MySQL, Redis

Tools & Technologies

Git, Linux, FastAPI


📊 Featured Projects

🕷️ IntelliScraper


Asynchronous web scraping library with anti-bot-detection evasion, built with Playwright

Description: A production-ready Python library for scraping protected websites (job platforms, social networks, e-commerce) that bypasses anti-bot systems. Features session management, proxy support, and advanced HTML parsing. Published on PyPI with 2,000+ downloads.

Tech Stack: Python, Playwright, Asyncio, Bright Data Proxy

Highlights:

  • 🔐 Session management with cookies and browser fingerprints for authenticated scraping
  • 🛡️ Advanced anti-detection techniques to bypass bot protection systems
  • ⚡ Fully asynchronous architecture for high-performance concurrent scraping
  • 📦 Published open-source library with 2.08K+ PyPI downloads
  • 🌐 Integrated proxy support (Bright Data) and CLI tool for session generation
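
The asynchronous, concurrent scraping pattern mentioned in the highlights can be sketched with the standard library alone. This is an illustration of the general technique, not IntelliScraper's actual API: `fetch_page` and `scrape_all` are hypothetical names, and the simulated delay stands in for a real browser request.

```python
import asyncio


async def fetch_page(url: str) -> str:
    """Hypothetical stand-in for a real browser fetch (assumption, not library code)."""
    await asyncio.sleep(0.01)  # simulate network latency
    return f"<html>{url}</html>"


async def scrape_all(urls, max_concurrency: int = 3):
    """Fetch all URLs concurrently, with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded_fetch(url: str) -> str:
        # The semaphore caps how many fetches run at the same time.
        async with semaphore:
            return await fetch_page(url)

    # gather() returns results in the same order as the input URLs.
    return await asyncio.gather(*(bounded_fetch(u) for u in urls))


if __name__ == "__main__":
    urls = [f"https://example.com/page/{i}" for i in range(5)]
    pages = asyncio.run(scrape_all(urls))
    print(f"Fetched {len(pages)} pages")
```

Capping concurrency with a semaphore keeps throughput high while limiting how many requests hit a target site at once, which matters when scraping through a proxy or a shared browser session.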

🏆 Achievements & Certifications

  • ⭐ 4-Star Competitive Programming - CodeChef
  • ⭐ 5-Star C++ - HackerRank
  • 📜 Competitive Programming Essentials - Master Algorithms Certification
  • 💻 Active problem solver on GeeksforGeeks and LeetCode

📝 Latest Blog Posts

Check out more of my articles on Medium


🤝 Connect With Me

LinkedIn Medium CodeChef HackerRank LeetCode



💡 Open to collaborating on int

Pinned

  1. IntelliScraper (Public)

    A powerful, anti-bot detection web scraping solution built with Playwright, designed for scraping protected sites like LinkedIn and other platforms that require authentication. Features session man…

    Python 3

  2. pdfnotes (Public)

    Extract PDF comments and highlights for humans and AI

    Makefile