DSPyProjectAwareCoder

Project Aims:

This repository is purely for learning and exploration of:

How the DSPy programming model works.

The capabilities of the DeepSeek‑R1‑0528‑Qwen3‑8B (Q6_K, YaRN 131 k) language model.

Nothing here should be considered production‑ready.

DSPyProjectAwareCoder

Local, project‑aware coding assistant built with DSPy and DeepSeek‑R1‑0528‑Qwen3‑8B (Q6_K, YaRN 131 k) served by llama‑server.

1 Current capabilities (v0.1)

| Stage | Module | Highlights |
| --- | --- | --- |
| Index & Chunk | `indexer/` | 2 k‑token chunks with 50 % overlap · tokenizer loaded from the GGUF via `llama-cpp` · embeddings with bge‑small‑en‑v1.5 · differential upsert and delete into Chroma. |
| Retrieve | `dspy_agent/retrieval.py` | k‑NN search (k=10 by default); prompt capped at 40 k tokens. |
| Reason | `dspy_agent/pipeline.py` | DSPy Chain‑of‑Thought + JSON adapter, compiled via BootstrapFewShot. |
| Validate | `dspy_agent.assertions.RefAssertion` | Fails if cited references are absent from the answer. |
| CLI | `python -m dspy_agent.pipeline` | Streams structured JSON. |
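The chunking and differential-sync ideas in the Index & Chunk row can be sketched as follows. This is a minimal illustration, not the code in `indexer/`: the function names, the ID scheme (path plus sha256 of the chunk) and the use of plain integer token lists are assumptions made for the example, and the actual Chroma calls are left out.

```python
import hashlib
from typing import Dict, List, Sequence, Set, Tuple


def chunk_tokens(tokens: Sequence[int], size: int = 2000,
                 overlap: float = 0.5) -> List[List[int]]:
    """Split a token sequence into fixed-size windows that overlap by `overlap`."""
    if size <= 0 or not 0 <= overlap < 1:
        raise ValueError("size must be positive and overlap in [0, 1)")
    step = max(1, int(size * (1 - overlap)))
    chunks: List[List[int]] = []
    for start in range(0, len(tokens), step):
        chunks.append(list(tokens[start:start + size]))
        if start + size >= len(tokens):
            break  # this window already reaches the end of the input
    return chunks


def chunk_id(path: str, chunk: Sequence[int]) -> str:
    """Hypothetical ID scheme: file path plus the sha256 of the chunk's tokens."""
    digest = hashlib.sha256(",".join(map(str, chunk)).encode()).hexdigest()
    return f"{path}:{digest}"


def plan_sync(path: str, chunks: Sequence[Sequence[int]],
              stored_ids: Set[str]) -> Tuple[Dict[str, List[int]], Set[str]]:
    """Return (new chunks to upsert keyed by ID, stale IDs to delete)."""
    current = {chunk_id(path, c): list(c) for c in chunks}
    to_upsert = {cid: c for cid, c in current.items() if cid not in stored_ids}
    to_delete = stored_ids - set(current)
    return to_upsert, to_delete
```

Because the ID depends on the chunk content, an unchanged chunk keeps its ID and is skipped, while an edited region yields new IDs to upsert and leaves the old ones to delete.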

Example session

$ llama-server --model "$DEEPSEEK_GGUF" --port 8080 --ctx-size 65536 &
$ find indexer -name "*.py" -print0 | xargs -0 -n1 python -m indexer.upsert
$ python -m dspy_agent.pipeline "Explain indexer.upsert()"
{
  "solution": "The indexer.upsert function is used to index or update a file within the ChromaDB collection. It works by reading the file, splitting it into chunks, generating embeddings for the chunks, and then using the ChromaDB collection's upsert method to insert new chunks or update existing ones. Additionally, it deletes chunks that are no longer present in the file to keep the database consistent. The function is defined in the first snippet and relies on the chunking and embedding functions from the other snippets.",
  "references": [
    "The first snippet defines the index_file function which uses the COL.upsert method to update the ChromaDB collection with new embeddings and documents, while also deleting outdated ones.",
    "The second snippet provides the chunking function that splits the file content into manageable pieces and computes their sha256 hashes for tracking.",
    "The third snippet defines the embed function that converts text into vectors using a SentenceTransformer model, which is used in the upsert operation."
  ]
}
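A structural check on output of this shape might look like the sketch below. It only assumes the two keys visible in the example (`solution` and `references`); `check_answer` is a hypothetical helper written for illustration and is not the repository's `RefAssertion`.

```python
import json
from typing import List


def check_answer(raw: str) -> List[str]:
    """Return the problems found in a JSON answer; an empty list means it passed."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value is not an object"]

    problems: List[str] = []
    solution = data.get("solution")
    if not isinstance(solution, str) or not solution.strip():
        problems.append("missing or empty 'solution'")

    refs = data.get("references")
    if not isinstance(refs, list) or not refs:
        problems.append("missing or empty 'references'")
    elif not all(isinstance(r, str) and r.strip() for r in refs):
        problems.append("every reference must be a non-empty string")
    return problems
```

Returning a list of problems rather than raising makes it easy to feed the failure reasons back into a retry prompt.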

2 Roadmap

| Priority | Feature | Purpose |
| --- | --- | --- |
| 1 | Persistent chat sessions backed by MongoDB. | Maintain dialogue context without rerunning the full pipeline every turn. |
| 2 | Self‑refreshing index: the assistant triggers incremental upserts when files change during a session. | Keeps retrieval in sync live. |
| 3 | MongoDB memory store for summaries, tool outputs, and long‑term notes. | Enables iterative design loops. |
| 4 | Tool‑calling (`apply_patch`, `run_tests`). | Move from Q&A to active pair‑programming. |
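One possible way to approach the self‑refreshing index (priority 2) is to compare content hashes between two snapshots of the project tree and re-index only what differs. The sketch below is an assumption about how that could work, not a planned design; `snapshot`, `changed_files` and `removed_files` are hypothetical names.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def snapshot(root: str, pattern: str = "*.py") -> Dict[str, str]:
    """Map every matching file under `root` to the sha256 of its contents."""
    return {
        str(p): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(Path(root).rglob(pattern))
        if p.is_file()
    }


def changed_files(old: Dict[str, str], new: Dict[str, str]) -> List[str]:
    """Files that are new or whose content hash differs from the old snapshot."""
    return sorted(path for path, digest in new.items() if old.get(path) != digest)


def removed_files(old: Dict[str, str], new: Dict[str, str]) -> List[str]:
    """Files present in the old snapshot but gone from the new one."""
    return sorted(set(old) - set(new))
```

Each path returned by `changed_files` could then be passed to the same per-file upsert used in the quick start, and each path from `removed_files` would have its chunks deleted.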

3 Environment (.env template)

# Quantised DeepSeek GGUF (YaRN 131k)
DEEPSEEK_GGUF="/…/unsloth_DeepSeek-R1-0528-Qwen3-8B-Q6_K.gguf"

# Embedding model
_EMB_MODEL="BAAI/bge-small-en-v1.5"

# llama‑server endpoint (OpenAI format)
LLM_API_BASE="http://127.0.0.1:8080/v1"
LLM_MODEL="openai/unsloth_DeepSeek-R1-0528-Qwen3-8B-Q6_K.gguf"
MAX_TOKEN=8000
TEMPERATURE=0.6

# Retrieval
RETRIEVE_K=10
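On the Python side, these variables could be read into a typed settings object along the lines below. This is a sketch under assumptions: the `Settings` class and `load_settings` function are hypothetical and not taken from the repository, and the fallback values simply mirror the template above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    gguf_path: str
    emb_model: str
    api_base: str
    llm_model: str
    max_tokens: int
    temperature: float
    retrieve_k: int


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from a mapping (defaults to os.environ), casting numeric fields."""
    env = os.environ if env is None else env
    return Settings(
        gguf_path=env["DEEPSEEK_GGUF"],  # required: raises KeyError when unset
        emb_model=env.get("_EMB_MODEL", "BAAI/bge-small-en-v1.5"),
        api_base=env.get("LLM_API_BASE", "http://127.0.0.1:8080/v1"),
        llm_model=env.get("LLM_MODEL", ""),
        max_tokens=int(env.get("MAX_TOKEN", "8000")),
        temperature=float(env.get("TEMPERATURE", "0.6")),
        retrieve_k=int(env.get("RETRIEVE_K", "10")),
    )
```

Passing a plain dictionary instead of `os.environ` keeps the loader easy to exercise in tests.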

4 Quick start

python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
export $(grep -v '^#' .env | xargs)

# start inference server
llama-server --model "$DEEPSEEK_GGUF" --port 8080 --ctx-size 65536 &

# index your project
find my_project -name "*.py" -print0 | xargs -0 -n1 python -m indexer.upsert

# ask something
python -m dspy_agent.pipeline "How does foo.bar.validate() work?"
