document-chunking

Here are 8 public repositories matching this topic...

messkan / rag-chunk

A Python CLI to test, benchmark, and find the best RAG chunking strategy for your Markdown documents.

python nlp ia chunking rag vector-search embedding-vectors llm langchain retrieval-augmented-generation text-splitting rag-pipeline document-chunking

Updated Dec 24, 2025
Python

speedyk-005 / chunklet-py

One library to split them all: Sentence, Code, Docs. Chunk smarter, not harder — built for LLMs, RAG pipelines, and beyond.

visualization nlp natural-language-processing ai chunking code-structure code-chunking rag chunks-processing chunks-algorithm text-splitting document-chunking

Updated Dec 21, 2025
Python

SStephanJX / Snowflake-RAG-System

Production-ready Snowflake RAG system with type-specific chunking

snowflake embedding rag vector-search retrieval-augmented-generation snowflake-cortex document-chunking resume-processing

Updated Dec 11, 2025
PLpgSQL

davidmoserai / AzureDocumentIntelligenceChunker

A lightweight Python library for metadata-rich document chunking in Retrieval-Augmented Generation (RAG) workflows. It leverages Azure AI Document Intelligence to enhance chunking by retaining hierarchical structure, page numbers, and bounding boxes for seamless integration with PDF viewers.

react python agent azure chunking agents unstructured-data rag production-grade react-pdf-viewer layout-parser llm langchain retrieval-augmented-generation azure-ai-search azure-ai-document-intelligence layout-parsing document-chunking

Updated Jan 11, 2025
Python

choudaryhussainali / Langchain_Learnings

"My complete LangChain learning journey — from basics to advanced RAG, LCEL, LangGraph, LangServe, LangSmith with hands-on code examples."

embeddings chains agents rag retrieval-systems prompt-engineering generative-ai langchain langsmith ai-reasoning vector-databases langserve lcel langgraph ragpipeline document-chunking memory-in-ai llms-integration

Updated Aug 12, 2025
Jupyter Notebook

kooroshsajadi / retrieval-augmented-generation

This repository provides a fully modular implementation of a Retrieval-Augmented Generation (RAG) pipeline tailored for Italian legal-domain documents.

vectorization reranking rag hybrid-retrieval retrieval-augmented-generation document-chunking

Updated Nov 25, 2025
Python

alienveryilmaz / RAG-text-splitter-document-chunking-tool

Smart text chunking tool for RAG systems. Splits long texts into sentence-based chunks with ~10%-15% overlap for better context retention. Runs fully in-browser with a clean UI and copyable outputs.

ai splitter chunking rag llm ai-tool text-chunking document-chunking

Updated Dec 12, 2025
HTML
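
The sentence-based chunking with overlap described for this tool can be illustrated with a minimal Python sketch. This is not the tool's own code (the tool runs in the browser); the function names, the `max_chars` size limit, and the naive punctuation-based sentence splitter are all assumptions made for illustration. Only the idea of carrying roughly 10%-15% of a chunk into the next one comes from the description above.

```python
import re


def split_sentences(text):
    # Naive splitter (an assumption): break after ., !, or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def chunk_sentences(text, max_chars=1000, overlap_ratio=0.15):
    """Group sentences into chunks of at most ~max_chars characters.

    When a chunk is closed, trailing sentences whose combined length fits
    within max_chars * overlap_ratio are repeated at the start of the next
    chunk. Both parameters are hypothetical, chosen for this sketch.
    """
    chunks = []
    current = []
    budget = int(max_chars * overlap_ratio)
    for sentence in split_sentences(text):
        candidate = " ".join(current + [sentence])
        if current and len(candidate) > max_chars:
            chunks.append(" ".join(current))
            # Walk backwards over the closed chunk and keep whole sentences
            # until the overlap budget would be exceeded.
            carried = []
            size = 0
            for prev in reversed(current):
                if size + len(prev) > budget:
                    break
                carried.insert(0, prev)
                size += len(prev)
            current = carried
        current.append(sentence)
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Because the overlap is made of whole sentences, the actual overlap per chunk can fall below the requested ratio, down to zero when the last sentence of a chunk is longer than the budget. A chunk can also exceed `max_chars` when a single sentence is longer than the limit, since this sketch never splits inside a sentence.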

ItzikAquaMotek / rag-chunk

Star

📝 Parse, chunk, and evaluate Markdown for RAG pipelines with token-accurate support and flexible strategies for optimal context management.

tree-sitter library ai csharp dotnet chroma ia code-structure embedding-vectors streamlit hybrid-search aisearch semantickernel text-chunking rag-pipeline llama3 document-chunking propositional-models

Updated Dec 24, 2025
Python
