CLAUDE.md

Always respond in Chinese.

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

A high-performance Python library implementing computationally intensive algorithms in Rust using PyO3 bindings. The library focuses on financial data analysis, time series processing, statistical calculations, and mathematical functions that are significantly faster than pure Python implementations.

Development Environment

Python Environment:

Python path: /home/chenzongwei/.conda/envs/chenzongwei311/bin/python
Pip path: /home/chenzongwei/.conda/envs/chenzongwei311/bin/pip
Maturin path: /home/chenzongwei/.conda/envs/chenzongwei311/bin/maturin

Common Development Commands

Build and Development

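After adding a new function, add its declaration to the corresponding *.pyi file
Build the project with timeout 600s ./alter.sh 2>&1 and check the output for success or error messages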

Testing

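When generating test files, do not place them directly in the rust_pyfunc folder; store them in the tests folder instead.
After writing a new Rust function, implement the same functionality in Python, check that the two produce consistent results, and measure the speed difference.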

Python Integration

python/rust_pyfunc/__init__.py - Python package entry point
python/rust_pyfunc/*.py - Additional Python utilities and pandas extensions

Key Dependencies

PyO3 - Rust-Python bindings
maturin - Build system for Rust-Python packages
ndarray/numpy - Array operations
nalgebra - Linear algebra
rayon - Parallel processing

Development Guidelines

Code Style

When adding new Rust functions to be called from Python, update the corresponding .pyi file for proper type hints and IDE support
Use Altair or Plotly for data visualization, avoid Matplotlib
Add appropriate comments for code readability
Only modify code relevant to the specific changes being made

Adding New Functions

Implement the function in the appropriate src/ module
Add the function export in src/lib.rs (line ~21-65)
Update python/rust_pyfunc/rust_pyfunc.pyi with proper type hints
Add documentation and examples following the existing pattern
Create test cases in tests/ directory

Performance Optimization

Use #[pyfunction] macro for Python-callable functions
Leverage rayon for parallel processing where appropriate
Use SIMD instructions when available (packed_simd_2)
Profile with criterion benchmarks (in [dev-dependencies])

Release Configuration

The project uses aggressive optimization settings:

[profile.release]
lto = true
codegen-units = 1
panic = "abort"
opt-level = 3

CI/CD

Multi-platform builds (Linux, macOS, Windows) via GitHub Actions
Automatic documentation deployment to GitHub Pages
PyPI publishing on tag creation

Stock Data Context

The project includes utilities for working with Chinese stock market data through the design_whatever library, supporting L2 tick data, market snapshots, and minute-level aggregations.

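Code Development Guide

Code Organization Principles

When writing a new function, do not append it to mod.rs; create a new file and add the import in lib.rs

Development Notes

After writing a new function, only test files are needed; do not write demo files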

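Avoiding Non-Deterministic Behavior (Important)

Rust's HashMap iteration order is non-deterministic (the hash seed is randomized at each process start), which causes the following problems:

Calling a function repeatedly with the same input yields inconsistent results
Computing the same batch of data in different processes or at different times yields inconsistent results
Downstream statistics that are sensitive to sequence order (trend, autocorr, lz_complexity) are affected most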
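
To illustrate why order-sensitive statistics are hit hardest, here is a minimal sketch. The helper lag1_autocorr is a hypothetical stand-in written for this example, not the library's own autocorr implementation. It shows that the same set of values gives the same sum but a different lag-1 autocorrelation once the order changes, which is exactly what an unsorted HashMap traversal can cause.

```rust
/// Lag-1 autocorrelation of a sequence.
/// Hypothetical helper for illustration only; assumes a non-constant input.
fn lag1_autocorr(x: &[f64]) -> f64 {
    let n = x.len() as f64;
    let mean = x.iter().sum::<f64>() / n;
    // Sum of squared deviations from the mean.
    let var: f64 = x.iter().map(|v| (v - mean).powi(2)).sum();
    // Sum of products of adjacent deviations; this term depends on order.
    let cov: f64 = x
        .windows(2)
        .map(|w| (w[0] - mean) * (w[1] - mean))
        .sum();
    cov / var
}

fn main() {
    // The same five values in two different orders.
    let ordered = [1.0, 2.0, 3.0, 4.0, 5.0];
    let reordered = [3.0, 1.0, 5.0, 2.0, 4.0];

    // An order-insensitive statistic is unaffected by the reordering.
    let sum_a: f64 = ordered.iter().sum();
    let sum_b: f64 = reordered.iter().sum();
    assert_eq!(sum_a, sum_b);

    // An order-sensitive statistic changes sign and magnitude.
    let r_a = lag1_autocorr(&ordered);
    let r_b = lag1_autocorr(&reordered);
    assert!(r_a > 0.0 && r_b < 0.0);
    println!("ordered: {:.1}, reordered: {:.1}", r_a, r_b);
}
```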

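Rules that must be followed:

When iterating over a HashMap's keys() or values() and using the result in later computation, sort first (e.g. with sort_by_key)

Recommended pattern: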

// Wrong: HashMap iteration order is random
let v: Vec<f64> = my_hashmap.values().map(|&v| v as f64).collect();

// Correct: sort by key first, then take the values
let mut sorted: Vec<_> = my_hashmap.iter().collect();
sorted.sort_by_key(|(&k, _)| k);
let v: Vec<f64> = sorted.iter().map(|(_, &v)| v as f64).collect();

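If you only need lookups (get/contains_key) and never iterate, the HashMap does not need sorting
Alternative: use BTreeMap (naturally ordered), though its insert performance is slightly lower than HashMap's
The same applies to HashSet: sort the elements before using the result of a traversal
When adding a function, whenever the result of iterating a HashMap/HashSet feeds downstream computation, determinism must be verified: repeated calls with the same input should produce exactly identical results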
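
The rules above can be sketched together in one small program. All function names and the sample data here are hypothetical, chosen only for illustration. The sketch shows a HashMap traversal sorted by key, the BTreeMap alternative, a sorted HashSet traversal, and a simple determinism check that rebuilds the map on every pass, so each pass may see a different internal iteration order.

```rust
use std::collections::{BTreeMap, HashMap, HashSet};

// Hypothetical sample data: (key, count) pairs in arbitrary order.
const PAIRS: [(i64, u32); 3] = [(30, 3), (10, 1), (20, 2)];

/// Values of a HashMap in ascending key order (sort first, then read).
fn values_sorted_by_key(counts: &HashMap<i64, u32>) -> Vec<f64> {
    let mut entries: Vec<(i64, u32)> = counts.iter().map(|(k, v)| (*k, *v)).collect();
    entries.sort_by_key(|entry| entry.0);
    entries.into_iter().map(|(_, v)| f64::from(v)).collect()
}

/// Same result from a BTreeMap, which always iterates in ascending key order.
fn values_from_btreemap(counts: &BTreeMap<i64, u32>) -> Vec<f64> {
    counts.values().map(|v| f64::from(*v)).collect()
}

/// HashSet traversal made deterministic by sorting the collected elements.
fn sorted_elements(set: &HashSet<i64>) -> Vec<i64> {
    let mut elems: Vec<i64> = set.iter().copied().collect();
    elems.sort_unstable();
    elems
}

fn main() {
    let btree: BTreeMap<i64, u32> = PAIRS.iter().copied().collect();
    let expected = values_from_btreemap(&btree);

    // Determinism check: build a fresh HashMap each time and require
    // exactly the same output for the same input.
    for _ in 0..100 {
        let hash: HashMap<i64, u32> = PAIRS.iter().copied().collect();
        assert_eq!(values_sorted_by_key(&hash), expected);
    }

    let set: HashSet<i64> = PAIRS.iter().map(|p| p.0).collect();
    println!("{:?} {:?}", expected, sorted_elements(&set));
}
```

A check inside a single process is only a first line of defense. Because the source describes the seed as changing per process, running the same test in separate processes is also worthwhile.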

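Code Naming Conventions

When writing a new function, give it a specific name that briefly summarizes the function's core computation and logic
When building the project with timeout 600s ./alter.sh 2>&1, set the timeout limit to 10 minutes.
When optimizing function performance, do not use parallelism unless the prompt explicitly asks for it.
After writing a new function or optimizing an existing one, state the function's name directly in the reply and give the simplest possible call example.
When writing a git commit, keep the version number consistent with the one in Cargo.toml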

GitHub

GitHub Token: already configured in the git remote URL; no manual entry needed
Before every git push, run git remote -v to check the current repository address, then run git push
For git push, set the proxy with HTTP_PROXY=http://127.0.0.1:10808 HTTPS_PROXY=http://127.0.0.1:10808 git push
All git or npm operations must set this proxy: HTTP_PROXY=http://127.0.0.1:10808 HTTPS_PROXY=http://127.0.0.1:10808
The username is chen-001

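Bash Commands

When uploading to git, upload all changes
When writing a git commit message, list the updated version number and the changes in detail
Do not upload to git on your own initiative unless I explicitly ask for it in the prompt
Do not generate markdown files unless I explicitly ask for it in the prompt. If explanatory documentation for code is needed, do not create a separate markdown file; write more comments in the code files or use an ipynb file to demonstrate.
Keep generated Python test or demo files lean: few files, minimal code, showing only the core functionality without overcomplicating.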
