Welcome! This repository is a curated collection of my projects as a Full-Stack Developer focused on Data Engineering and Backend Infrastructure.
I am a 20-year-old developer based in Brazil. I specialize in building scalable pipelines, modeling relational data in third normal form (3NF), and transforming raw data into business intelligence.
Tech Stack: PySpark, Databricks, Delta Lake, Python.
A robust implementation of a Medallion Architecture to process real estate data.
- Ingestion: Handled heterogeneous sources (CSV/JSON) using PySpark.
- Data Engineering: Implemented schema harmonization, handled corrupted records, and automated data cleaning.
- Business Intelligence: Created a Gold layer with aggregated metrics for price-per-square-meter and regional trends.
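The three layers above can be sketched in a few lines. This is a minimal illustration in plain Python (standing in for PySpark so it runs without a cluster), not the project's actual code. The sample records, the field names (`city`/`location`, `price`/`price_brl`, `area_m2`/`sqm`), and the function names are all hypothetical.

```python
import csv
import io
import json
from collections import defaultdict

# Hypothetical raw inputs: two sources describing listings with different field names.
RAW_CSV = (
    "city,price,area_m2\n"
    "Campinas,300000,100\n"
    "Campinas,not_a_number,80\n"   # corrupted record
    "Sorocaba,450000,150\n"
)
RAW_JSON = '[{"location": "Sorocaba", "price_brl": 200000, "sqm": 50}]'


def bronze():
    """Bronze: load both sources as-is, tagging each row with its origin."""
    rows = [dict(r, _source="csv") for r in csv.DictReader(io.StringIO(RAW_CSV))]
    rows += [dict(r, _source="json") for r in json.loads(RAW_JSON)]
    return rows


def silver(rows):
    """Silver: map both layouts onto one schema and drop unusable rows."""
    clean = []
    for r in rows:
        try:
            record = {
                "city": r.get("city") or r["location"],
                "price": float(r.get("price") or r["price_brl"]),
                "area_m2": float(r.get("area_m2") or r["sqm"]),
            }
        except (KeyError, TypeError, ValueError):
            continue  # corrupted record: skipped here, could be quarantined instead
        if record["area_m2"] > 0:  # avoid dividing by zero in the Gold layer
            clean.append(record)
    return clean


def gold(rows):
    """Gold: area-weighted price per square meter for each city."""
    totals = defaultdict(lambda: [0.0, 0.0])  # city -> [total price, total area]
    for r in rows:
        totals[r["city"]][0] += r["price"]
        totals[r["city"]][1] += r["area_m2"]
    return {city: round(price / area, 2) for city, (price, area) in totals.items()}


metrics = gold(silver(bronze()))
print(metrics)  # {'Campinas': 3000.0, 'Sorocaba': 3250.0}
```

In a Spark version of this sketch, each function would typically read from and write to its own Delta table, so every layer can be rebuilt or audited independently.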
Tech Stack: Python, Selenium, Pandas, TextBlob, MySQL.
An automated pipeline to extract and analyze consumer sentiment from e-commerce platforms.
- Scraping: Developed a resilient scraper with user-agent rotation to bypass anti-bot systems.
- NLP: Applied sentiment analysis (Polarity/Subjectivity) and text normalization to unstructured reviews.
- Relational Modeling: Designed a database schema to store processed comments, ready for BI dashboard consumption.
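The normalization and storage steps above might look like the following sketch. It uses Python's built-in `sqlite3` in place of MySQL so it runs anywhere. The table names, column names, and helper functions are assumptions for illustration, not the project's real schema. The sentiment scores are hard-coded here; in the pipeline they would come from TextBlob, which reports polarity in [-1, 1] and subjectivity in [0, 1].

```python
import re
import sqlite3

# Hypothetical schema: products are stored once and referenced by each review.
SCHEMA = """
CREATE TABLE products (
    product_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL UNIQUE
);
CREATE TABLE reviews (
    review_id    INTEGER PRIMARY KEY,
    product_id   INTEGER NOT NULL REFERENCES products(product_id),
    body         TEXT NOT NULL,
    polarity     REAL NOT NULL CHECK (polarity BETWEEN -1.0 AND 1.0),
    subjectivity REAL NOT NULL CHECK (subjectivity BETWEEN 0.0 AND 1.0)
);
"""


def normalize(text):
    """Lowercase a raw review, strip URLs, and collapse whitespace."""
    text = re.sub(r"https?://\S+", " ", text.lower())
    return re.sub(r"\s+", " ", text).strip()


def store_review(conn, product, body, polarity, subjectivity):
    """Insert the product if it is new, then attach the cleaned review to it."""
    conn.execute("INSERT OR IGNORE INTO products (name) VALUES (?)", (product,))
    (product_id,) = conn.execute(
        "SELECT product_id FROM products WHERE name = ?", (product,)
    ).fetchone()
    conn.execute(
        "INSERT INTO reviews (product_id, body, polarity, subjectivity) "
        "VALUES (?, ?, ?, ?)",
        (product_id, normalize(body), polarity, subjectivity),
    )


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign keys off by default
conn.executescript(SCHEMA)

# Scores below are made-up placeholders for what a sentiment step would return.
store_review(conn, "Headphones", "  GREAT sound!  See https://example.com/x ", 0.8, 0.75)
store_review(conn, "Headphones", "Stopped working\nafter a week", -0.4, 0.6)

# The kind of aggregate a BI dashboard might query.
summary = conn.execute(
    "SELECT p.name, COUNT(*), AVG(r.polarity) "
    "FROM reviews r JOIN products p USING (product_id) GROUP BY p.name"
).fetchall()
```

Keeping products in their own table means a product name is stored once, and the `CHECK` constraints reject scores outside the ranges the sentiment step is expected to produce.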
- Languages: Node.js, PHP (Laravel), Python.
- Databases: MySQL (expert in 3NF normalization), PostgreSQL, NoSQL.
- Design: RESTful APIs, System Refactoring, Scalable Infrastructure.
- Frameworks: Apache Spark (PySpark), Delta Lake.
- Platforms: Databricks, XAMPP.
- ETL/ELT: Pipeline orchestration and data quality enforcement.
- Analytics: SQL for deep data exploration.
- Visualization: Power BI, data-driven KPI definition.
- LinkedIn: Gustavo Ferreira Cordeiro
- Email: gucordeiro26@gmail.com
- Location: Tatuí, SP - Brazil 🇧🇷