
Cleber F. Carvalho

Machine Learning Engineer · AI Systems · LLM Applications

ML Engineer with a background spanning production software delivery, applied research, and AI systems development. Two years of graduate-level ML research at Kyoto University of Advanced Science in Japan (healthcare AI, time-series forecasting, LSTM architectures). Four years of production Android engineering at a French assistive technology company, deploying systems used by 3,000+ users across France, Belgium, and Switzerland.

Currently building production-oriented AI systems under Iskra Labs, an independent engineering initiative focused on LLM applications, RAG pipelines, and measurable system design.

Olá! 🇧🇷 · Cześć! 🇵🇱 · こんにちは! 🇯🇵 · ¡Hola! 🇪🇸


Selected Work

clinical-evidence-navigator: RAG system for querying clinical guidelines with citation-backed, hallucination-mitigated responses.

  • Modular architecture: decoupled ingestion → ChromaDB → LangGraph inference → FastAPI → Streamlit
  • End-to-end latency reduced from ~358s to ~7.1s (a ~98% reduction)
  • Stack: LangChain · LangGraph · ChromaDB · HuggingFace · Gemini · FastAPI · Docker

sygnaly-miner: NLP pipeline for discovering themes and emerging signals in large Polish-language text corpora.

  • Dataset: 248,123 documents (Polish news corpus); 2,890-document high-quality subset after cleaning (~96.3% retention)
  • Pipeline: text preprocessing → CountVectorizer → KMeans (k=8) → PCA visualization
  • Live demo: sygnaly-miner.streamlit.app
  • Stack: Python · NLP · scikit-learn · Sentence Embeddings · Streamlit · Docker
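
The pipeline steps listed above (CountVectorizer → KMeans → PCA) can be sketched with scikit-learn as follows. This is a minimal illustration, not the project's code: the toy sentences, the function name `cluster_documents`, and the choice of 3 clusters for the sample are assumptions made for the example (the project itself reports k=8 on the full corpus).

```python
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical toy corpus standing in for the Polish news documents.
docs = [
    "rząd ogłosił nowy budżet",
    "budżet państwa rośnie",
    "drużyna wygrała mecz ligowy",
    "mecz zakończył się remisem",
    "nowy szpital otwarty w mieście",
    "szpital przyjmuje pacjentów",
]


def cluster_documents(texts, n_clusters, seed=0):
    """Bag-of-words counts -> KMeans labels -> 2-D PCA coordinates."""
    # Turn each document into a sparse vector of token counts.
    counts = CountVectorizer().fit_transform(texts)
    # Assign every document to one of n_clusters groups.
    labels = KMeans(
        n_clusters=n_clusters, n_init=10, random_state=seed
    ).fit_predict(counts)
    # PCA needs a dense array: fine for a sample, costly for a full corpus.
    coords = PCA(n_components=2, random_state=seed).fit_transform(
        counts.toarray()
    )
    return labels, coords


labels, coords = cluster_documents(docs, n_clusters=3)
```

Here `labels` holds one cluster id per document and `coords` holds the 2-D points that a scatter plot would colour by cluster. On a corpus of realistic size, a sparse-friendly reducer such as `TruncatedSVD` would likely be a safer choice than densifying the count matrix for PCA.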

clinical-sql-analytics-diabetes-readmissions: SQL and Python pipeline for analyzing hospital readmission patterns in diabetic patients.

  • Modelled 101,766 encounters across 71,518 patients, grouped into risk buckets with readmission rates ranging from 35.52% (low) to 50.58% (high)
  • Stack: SQL · Python · Pandas · SQLite
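
A readmission rate per risk bucket, as summarised above, can be computed with a single grouped query. The sketch below uses Python's built-in `sqlite3` module; the `encounters` table, its column names, and the eight sample rows are hypothetical and only illustrate the shape of the calculation, since the source does not say how the buckets are defined.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE encounters (
        encounter_id INTEGER PRIMARY KEY,
        patient_id   INTEGER,
        risk_bucket  TEXT,     -- assumed to be assigned upstream
        readmitted   INTEGER   -- 1 if the encounter led to a readmission
    )
    """
)

# Hypothetical sample rows: (encounter_id, patient_id, risk_bucket, readmitted)
rows = [
    (1, 10, "low", 0),
    (2, 10, "low", 1),
    (3, 11, "low", 0),
    (4, 12, "low", 0),
    (5, 13, "high", 1),
    (6, 13, "high", 1),
    (7, 14, "high", 0),
    (8, 15, "high", 0),
]
conn.executemany("INSERT INTO encounters VALUES (?, ?, ?, ?)", rows)

# AVG over a 0/1 flag gives the readmission share within each bucket.
query = """
    SELECT risk_bucket,
           COUNT(*)                          AS encounters,
           COUNT(DISTINCT patient_id)        AS patients,
           ROUND(100.0 * AVG(readmitted), 2) AS readmission_rate_pct
    FROM encounters
    GROUP BY risk_bucket
    ORDER BY readmission_rate_pct
"""
rates = conn.execute(query).fetchall()
```

Counting encounters and distinct patients side by side matters here because one patient can contribute several encounters, which is why the project reports both totals.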

Core Technologies

AI & ML — LangChain · LangGraph · HuggingFace · OpenAI · Gemini · RAG · vector search · LLM evaluation
ML Frameworks — PyTorch · TensorFlow · scikit-learn · time-series forecasting
Data — Pandas · NumPy · analytical SQL · MLflow
Languages — Python · SQL · Kotlin · Java
Tools — Docker · FastAPI · Streamlit · Git · Jupyter
Cloud — GCP · Vertex AI · BigQuery


Pinned

  1. clinical-evidence-navigator (Public)

    Evidence-grounded RAG system for querying public medical guidelines with citation enforcement and hallucination mitigation. Built with LangGraph, Chroma, and Streamlit.

    Python 1

  2. clinical-sql-analytics-diabetes-readmissions (Public)

    Clinical SQL analytics project focused on diabetes hospital readmissions. Includes relational data modeling in SQLite, ETL pipeline, data quality checks, and SQL-based exploratory analysis to gener…

    Jupyter Notebook 1

  3. sygnaly-miner (Public)

    Exploring how machine learning can help uncover patterns in large datasets of citizen feedback. This project transforms raw public reports into meaningful signals through text analysis and clusteri…

    Jupyter Notebook 1

  4. malariagen-data-python (Public)

    Forked from malariagen/malariagen-data-python

    Analyse MalariaGEN data from Python

    Jupyter Notebook 1