Skip to content

fix: Replace VLAs with std::vector for MSVC compatibility and stack safety#190

Open
feihongxu0824 wants to merge 6 commits intomainfrom
fix/remove_vla
Open

fix: Replace VLAs with std::vector for MSVC compatibility and stack safety#190
feihongxu0824 wants to merge 6 commits intomainfrom
fix/remove_vla

Conversation

@feihongxu0824
Copy link
Collaborator

No description provided.

@greptile-apps
Copy link

greptile-apps bot commented Mar 2, 2026

Greptile Summary

This PR systematically replaces Variable Length Arrays (VLAs) with std::vector across the entire codebase to achieve MSVC compatibility and improve stack safety.

Key Changes:

  • Replaced all VLA declarations (e.g., float arr[size]) with std::vector<T> (e.g., std::vector<float> arr(size))
  • Added #include <vector> to files that needed it
  • Updated function calls to use .data() when passing vectors to functions expecting pointers
  • Corrected sizeof() usage to properly calculate byte sizes for vectors (e.g., sizeof(T) * vec.size())
  • In test files, replaced some memset calls with more idiomatic std::fill

Impact:

  • MSVC Compatibility: VLAs are a C99 feature not supported by MSVC and not part of the C++ standard
  • Stack Safety: Large VLAs can cause stack overflow; heap-allocated vectors are safer
  • Standards Compliance: Makes the codebase fully C++ standard compliant

Performance Considerations:
In performance-critical SIMD code paths (inner_product_distance_batch_impl*.h), vectors are allocated inside loops. While this introduces heap allocation overhead, the impact should be minimal because:

  • Template parameters like dp_batch are typically small compile-time constants
  • Modern allocators are efficient for small allocations
  • The compiler may optimize some allocations
  • The benefits (portability, safety, standards compliance) outweigh the small performance cost

All changes follow a consistent mechanical pattern and appear to be correctly implemented.

Confidence Score: 4/5

  • This PR is safe to merge with low risk - changes are mechanical and consistent across all files
  • Score of 4 reflects high confidence with minor reservation: while all changes are mechanically correct and improve portability/safety, there are vector allocations inside performance-critical loops that could have small runtime impact. The extensive scope (38 files) requires thorough testing to ensure no edge cases were missed.
  • Pay attention to src/ailego/math_batch/inner_product_distance_batch_impl*.h files - these contain SIMD hot paths with loop-scoped vector allocations that should be performance tested

Important Files Changed

Filename Overview
src/ailego/algorithm/lloyd_cluster.h Replaced VLAs with std::vector for scores, nearest_scores, and nearest_indexes arrays; added vector include; all .data() conversions correct
src/ailego/math/mips_euclidean_distance_matrix.h Replaced all fixed-size u2 and v2 arrays with std::vector across multiple template specializations; consistent pattern applied
src/ailego/math_batch/inner_product_distance_batch_impl.h Replaced VLAs for SIMD registers (accs, data_regs, sum128_regs) with std::vector; some allocations occur inside loops which may have minor performance impact
src/ailego/math_batch/inner_product_distance_batch_impl_fp16.h Replaced VLAs for FP16 SIMD registers with std::vector across multiple implementations (AVX512FP16, AVX512F, AVX2); loop-scoped allocations present
src/core/algorithm/hnsw/hnsw_algorithm.cc Replaced neighbor_ids, dists, and neighbor_vecs VLAs with std::vector; added vector include; proper .data() usage throughout
src/core/algorithm/hnsw/hnsw_entity.cc Replaced padding, mapping, and buffer arrays with std::vector; correctly updated sizeof() calculations to use vector.size() * sizeof(element)
src/core/algorithm/hnsw_sparse/hnsw_sparse_algorithm.cc Replaced neighbor_ids VLAs with std::vector in sparse HNSW implementation; added vector include; proper .data() usage
src/core/algorithm/ivf/ivf_entity.cc Replaced distances VLA with std::vector in IVF search functions; proper .data() usage for distance calculations

Last reviewed commit: f8bab86

@feihongxu0824
Copy link
Collaborator Author

@greptile

@iaojnh
Copy link
Collaborator

iaojnh commented Mar 2, 2026

@greptile

Copy link
Collaborator

@iaojnh iaojnh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants