
feat(interchain-indexer): Improving internal stats module#1650

Merged
EvgenKor merged 18 commits into main from evgenkor/interchain/stats-tables
Apr 6, 2026

Conversation

Contributor

@EvgenKor EvgenKor commented Mar 24, 2026

Summary

This branch introduces a complete statistics pipeline for interchain analytics and exposes the first read APIs on top of it.

The implementation adds:

  • persistent stats tables for messages, message paths, assets, asset edges, and per-chain aggregates;
  • incremental stats projection during normal indexing;
  • startup backfill for rows that were indexed before the stats pipeline existed;
  • periodic recomputation for chain-level user counters;
  • new API endpoints for chain metadata and statistics-driven views.

In practice, the branch moves statistics from ad hoc runtime computation to materialized database state that can be queried with stable pagination and predictable response time.

Common Implementation Plan

The branch can be understood as one rollout executed in five steps:

  1. Add schema support for analytics data.

    • Introduce stats_assets, stats_asset_tokens, stats_asset_edges, stats_messages, stats_messages_days, and stats_chains.
    • Add stats_processed counters to crosschain_messages and crosschain_transfers so projection is incremental and idempotency-friendly.
  2. Populate stats tables from indexed data.

    • Run stats projection inline when finalized messages are flushed from the message buffer.
    • Project directional message counts into stats_messages and stats_messages_days.
    • Project transfer aggregates into asset- and edge-based stats tables.
    • Keep token enrichment asynchronous, so projection does not block on metadata fetches.
  3. Make historical data usable immediately after deploy.

    • Add stats_backfill_on_start to project all rows with stats_processed = 0 during startup.
    • Reuse the same projection path for both backfill and live indexing, so historical and fresh data are built consistently.
  4. Add orchestration and refresh flows for read-side APIs.

    • Introduce StatsService as the orchestration layer for projection, backfill, enrichment kickoff, and stats queries.
    • Add periodic stats_chains recomputation via stats_chains_recalculation_period_secs.
    • Normalize chain metadata through ChainInfoService so list/stat endpoints return consistent explorer and route fields.
  5. Expose public APIs on top of materialized stats.

    • Add chain catalog endpoint for configured chains.
    • Add bridged-token stats with cursor pagination and sorting.
    • Add chain stats with optional chain filtering.
    • Add sent/received message path endpoints, including date-bounded queries backed by stats_messages_days.
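The incremental, re-runnable projection described in steps 1 to 3 can be illustrated with a minimal in-memory sketch. It is not the code from this branch: `MessageRow`, `StatsMessages`, and `project_pending` are hypothetical names, and a `HashMap` stands in for the stats_messages table. The point is only that rows with stats_processed = 0 are counted once, so running the same path for backfill and for live indexing does not double-count.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory stand-in for a `crosschain_messages` row.
#[derive(Debug, Clone)]
pub struct MessageRow {
    pub id: i64,
    pub src_chain_id: i64,
    pub dst_chain_id: i64,
    /// Mirrors the `stats_processed` counter: 0 means "not yet projected".
    pub stats_processed: i16,
}

/// Hypothetical stand-in for the cumulative `stats_messages` table,
/// keyed by the directional (source, destination) chain pair.
pub type StatsMessages = HashMap<(i64, i64), u64>;

/// Projects every pending row into `stats` and marks it as processed.
/// Returns the number of rows projected by this call.
pub fn project_pending(rows: &mut [MessageRow], stats: &mut StatsMessages) -> usize {
    let mut projected = 0;
    for row in rows.iter_mut().filter(|r| r.stats_processed == 0) {
        // Increment the directional counter for this message path.
        *stats.entry((row.src_chain_id, row.dst_chain_id)).or_insert(0) += 1;
        // Mark the row so a later backfill or flush skips it.
        row.stats_processed += 1;
        projected += 1;
    }
    projected
}
```

Calling `project_pending` twice on the same rows projects them on the first call and nothing on the second, which is the property the stats_processed markers are meant to provide.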

Introduced API Endpoints

GET /api/v1/interchain/chains

Returns the full list of known chains from the chains table.

Notes:

  • response items include id, name, logo, explorer_url, custom_tx_route, custom_address_route, and custom_token_route;
  • explorer URLs are normalized without trailing slash;
  • default explorer routes are omitted from custom route fields;
  • chains without a valid name are returned as Unknown.
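The normalization rules in the notes above can be sketched as three small helpers. This is an illustration under assumptions, not the ChainInfoService code: the function names are hypothetical, "valid name" is assumed to mean non-empty after trimming, and the default route value is passed in by the caller because the PR text does not state it.

```rust
/// Strips trailing slashes so "https://example.org/" and
/// "https://example.org" are reported identically.
pub fn normalize_explorer_url(url: &str) -> String {
    url.trim().trim_end_matches('/').to_string()
}

/// Returns `None` when the configured route is missing, empty, or equal to
/// the default route, so default routes are omitted from the custom field.
pub fn custom_route(route: Option<&str>, default_route: &str) -> Option<String> {
    route
        .map(str::trim)
        .filter(|r| !r.is_empty() && *r != default_route)
        .map(str::to_string)
}

/// Falls back to "Unknown" when the chain has no usable name.
/// Assumption: a name is "valid" if it is non-empty after trimming.
pub fn display_name(name: Option<&str>) -> String {
    match name.map(str::trim) {
        Some(n) if !n.is_empty() => n.to_string(),
        _ => "Unknown".to_string(),
    }
}
```

For example, `custom_route(Some("/tx/{hash}"), "/tx/{hash}")` yields `None`, while a route that differs from the default is returned unchanged.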

GET /api/v1/stats/chain/{chain_id}/bridged-tokens

Returns bridged-token aggregates for a chain.

Purpose:

  • group chain-local token contracts into logical assets;
  • expose input, output, and total transfer counts per aggregated asset;
  • include linked token representations across chains.

Request behavior:

  • supports page_size, last_page, and cursor pagination;
  • supports sorting by TOTAL_TRANSFERS_COUNT, OUTPUT_TRANSFERS_COUNT, INPUT_TRANSFERS_COUNT, or NAME;
  • supports DESC and ASC order;
  • defaults to sorting by total transfers count in descending order;
  • pagination can use either page_token or raw cursor fields depending on api.use_pagination_token.
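Cursor (keyset) pagination for the default sort can be sketched in memory as follows. This is a hypothetical model, not the query in bridged_tokens_query.rs: `AssetRow`, `Cursor`, and `next_page` are illustrative names, and the tie-breaker on the asset id is an assumption made so that rows with equal counts still have a stable order.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct AssetRow {
    pub asset_id: i64,
    pub total_transfers_count: u64,
}

/// Hypothetical cursor: sort key plus unique id of the last row already returned.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Cursor {
    pub total_transfers_count: u64,
    pub asset_id: i64,
}

/// Returns one page ordered by total transfers (descending), then asset id
/// (ascending), together with the cursor for the next page if one exists.
pub fn next_page(
    rows: &[AssetRow],
    after: Option<Cursor>,
    page_size: usize,
) -> (Vec<AssetRow>, Option<Cursor>) {
    let mut sorted: Vec<AssetRow> = rows.to_vec();
    sorted.sort_by(|a, b| {
        b.total_transfers_count
            .cmp(&a.total_transfers_count)
            .then(a.asset_id.cmp(&b.asset_id))
    });

    // Keep only rows strictly after the cursor in the sort order, and fetch
    // one extra row to learn whether another page exists.
    let mut page: Vec<AssetRow> = sorted
        .into_iter()
        .filter(|r| match after {
            None => true,
            Some(c) => {
                r.total_transfers_count < c.total_transfers_count
                    || (r.total_transfers_count == c.total_transfers_count
                        && r.asset_id > c.asset_id)
            }
        })
        .take(page_size + 1)
        .collect();

    let has_more = page.len() > page_size;
    page.truncate(page_size);
    let next = if has_more {
        page.last().map(|r| Cursor {
            total_transfers_count: r.total_transfers_count,
            asset_id: r.asset_id,
        })
    } else {
        None
    };
    (page, next)
}
```

Unlike offset pagination, the page boundary is defined by the last row seen, so rows inserted ahead of the cursor between requests do not shift later pages.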

Response shape:

  • each row includes aggregate counters plus tokens[] with chain_id, token_address, name, symbol, icon_url, and decimals.

GET /api/v1/stats/chains

Returns known chains together with unique_transfer_users_count.

Request behavior:

  • supports page_size, last_page, and cursor pagination;
  • supports optional chain_ids filter as a comma-separated list;
  • supports ASC and DESC ordering by unique_transfer_users_count.

Response shape:

  • each item includes chain_id, name, icon_url, explorer_url, and unique_transfer_users_count;
  • chains are returned even if they do not yet have a stats_chains row, in which case the counter is 0.
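The "counter is 0 when no stats_chains row exists" behavior is essentially a left join with a default. A minimal in-memory sketch, assuming hypothetical `Chain` and `ChainStatsItem` types and a `HashMap` in place of the stats_chains table (the secondary ordering by chain_id is also an assumption):

```rust
use std::collections::HashMap;

#[derive(Debug, Clone)]
pub struct Chain {
    pub chain_id: i64,
    pub name: String,
}

#[derive(Debug, Clone, PartialEq)]
pub struct ChainStatsItem {
    pub chain_id: i64,
    pub name: String,
    pub unique_transfer_users_count: u64,
}

/// Joins every known chain with its optional stats row, defaulting to 0,
/// then orders by the counter in descending order.
pub fn chain_stats(
    chains: &[Chain],
    stats_chains: &HashMap<i64, u64>,
    chain_ids: Option<&[i64]>,
) -> Vec<ChainStatsItem> {
    let mut items: Vec<ChainStatsItem> = chains
        .iter()
        // Optional `chain_ids` filter: keep everything when it is absent.
        .filter(|c| chain_ids.map_or(true, |ids| ids.contains(&c.chain_id)))
        .map(|c| ChainStatsItem {
            chain_id: c.chain_id,
            name: c.name.clone(),
            // A missing stats row is reported as zero rather than dropped.
            unique_transfer_users_count: stats_chains.get(&c.chain_id).copied().unwrap_or(0),
        })
        .collect();
    items.sort_by(|a, b| {
        b.unique_transfer_users_count
            .cmp(&a.unique_transfer_users_count)
            .then(a.chain_id.cmp(&b.chain_id))
    });
    items
}
```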

GET /api/v1/stats/chain/{chain_id}/messages-paths/sent

Returns outgoing message paths for the selected chain.

Request behavior:

  • chain_id is required;
  • from_date and to_date are optional and must use YYYY-MM-DD;
  • counterparty_chain_ids is optional and accepts a comma-separated list of int64 chain ids.

Response shape:

  • each item includes source_chain, destination_chain, and messages_count.

Data source:

  • without date bounds, the endpoint reads from aggregated directional message stats;
  • with date bounds, it uses daily message-path aggregates.
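A rough sketch of how the request parameters above might be validated and how the data source could be chosen. The helper names and the `PathSource` enum are hypothetical, the date check only validates the YYYY-MM-DD shape and basic ranges (not days per month), and ignoring empty list segments is an assumption; the real handlers in services/stats.rs may behave differently.

```rust
#[derive(Debug, PartialEq)]
pub enum PathSource {
    /// All-time totals (stats_messages).
    Cumulative,
    /// Daily aggregates (stats_messages_days).
    Daily,
}

/// Parses a comma-separated list of int64 chain ids, e.g. "1,43114".
pub fn parse_chain_ids(raw: &str) -> Result<Vec<i64>, String> {
    raw.split(',')
        .map(str::trim)
        .filter(|s| !s.is_empty())
        .map(|s| s.parse::<i64>().map_err(|e| format!("invalid chain id {s:?}: {e}")))
        .collect()
}

/// Accepts only the YYYY-MM-DD shape and returns (year, month, day).
pub fn parse_date(raw: &str) -> Result<(u16, u8, u8), String> {
    let err = || format!("expected YYYY-MM-DD, got {raw:?}");
    let parts: Vec<&str> = raw.split('-').collect();
    if parts.len() != 3 {
        return Err(err());
    }
    let lengths_ok = parts[0].len() == 4 && parts[1].len() == 2 && parts[2].len() == 2;
    let digits_ok = parts.iter().all(|p| p.bytes().all(|b| b.is_ascii_digit()));
    if !lengths_ok || !digits_ok {
        return Err(err());
    }
    let year = parts[0].parse::<u16>().map_err(|e| e.to_string())?;
    let month = parts[1].parse::<u8>().map_err(|e| e.to_string())?;
    let day = parts[2].parse::<u8>().map_err(|e| e.to_string())?;
    if !(1..=12).contains(&month) || !(1..=31).contains(&day) {
        return Err(err());
    }
    Ok((year, month, day))
}

/// Any date bound switches the query to the daily aggregates.
pub fn choose_source(from_date: Option<&str>, to_date: Option<&str>) -> PathSource {
    if from_date.is_some() || to_date.is_some() {
        PathSource::Daily
    } else {
        PathSource::Cumulative
    }
}
```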

GET /api/v1/stats/chain/{chain_id}/messages-paths/received

Returns incoming message paths for the selected chain.

Behavior matches the sent endpoint, but the selected chain is treated as the destination side.

Configuration Changes

New server settings:

  • stats_backfill_on_start (INTERCHAIN_INDEXER__STATS_BACKFILL_ON_START ENV)
    • default false
    • when true, the service projects pending stats rows during startup before indexers begin;
  • stats_chains_recalculation_period_secs (INTERCHAIN_INDEXER__STATS_CHAINS_RECALCULATION_PERIOD_SECS ENV)
    • default 3600 (1 hour)
    • controls periodic full recomputation of stats_chains;
    • 0 disables the background refresh worker.
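The defaults and the "0 disables the worker" rule can be sketched as follows. The environment variable names and default values come from the list above; everything else is an assumption for illustration: `StatsSettings`, `from_lookup`, and `refresh_interval` are hypothetical, and the real service presumably loads settings through its existing configuration layer rather than parsing variables by hand.

```rust
use std::time::Duration;

/// Default from the description: 3600 seconds (1 hour).
pub const DEFAULT_RECALCULATION_PERIOD_SECS: u64 = 3600;

#[derive(Debug, PartialEq)]
pub struct StatsSettings {
    pub stats_backfill_on_start: bool,
    pub stats_chains_recalculation_period_secs: u64,
}

impl StatsSettings {
    /// `lookup` abstracts over the environment (e.g. `|k| std::env::var(k).ok()`),
    /// which keeps the sketch testable without touching process state.
    pub fn from_lookup(lookup: impl Fn(&str) -> Option<String>) -> Result<Self, String> {
        let backfill = match lookup("INTERCHAIN_INDEXER__STATS_BACKFILL_ON_START") {
            Some(v) => v.trim().parse::<bool>().map_err(|e| e.to_string())?,
            None => false,
        };
        let period = match lookup("INTERCHAIN_INDEXER__STATS_CHAINS_RECALCULATION_PERIOD_SECS") {
            Some(v) => v.trim().parse::<u64>().map_err(|e| e.to_string())?,
            None => DEFAULT_RECALCULATION_PERIOD_SECS,
        };
        Ok(Self {
            stats_backfill_on_start: backfill,
            stats_chains_recalculation_period_secs: period,
        })
    }

    /// `None` means the background refresh worker should not be started.
    pub fn refresh_interval(&self) -> Option<Duration> {
        (self.stats_chains_recalculation_period_secs > 0)
            .then(|| Duration::from_secs(self.stats_chains_recalculation_period_secs))
    }
}
```

Modeling the disabled state as `None` rather than a zero-length interval keeps the caller from accidentally spawning a worker that loops without sleeping.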

Notes For Reviewers

  • The branch is not just API wiring. Most of the change is in the database and logic layers that make the APIs cheap to serve.
  • Bridged-token and chain stats endpoints rely on dedicated cursor logic rather than offset pagination.
  • Message path APIs now support both all-time and date-bounded queries because the branch adds stats_messages_days in addition to cumulative stats_messages.

Summary by CodeRabbit

  • New Features

    • Added statistics API endpoints for bridged tokens, chain-level stats, and message paths with cursor-based pagination and customizable sorting.
    • Added environment variables to control statistics backfill on startup and chain recalculation frequency.
  • Documentation

    • Documented new environment variables and known edge cases.
  • Bug Fixes

    • Updated Avalanche bridge configuration.

Contributor

coderabbitai Bot commented Mar 24, 2026

Important

Review skipped

Auto reviews are limited based on label configuration.

🏷️ Required labels (at least one) (2)
  • fix
  • feat

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 00802f09-0170-4a97-b9ce-12c71b7566ae

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

This pull request introduces a comprehensive analytics and statistics system for the interchain indexer. It adds a new database schema with six stats tables (stats_assets, stats_asset_tokens, stats_asset_edges, stats_messages, stats_messages_days, and stats_chains) to track cross-chain transfer aggregates, message path statistics, and per-chain user metrics. The changes include SeaORM entity definitions for all stats tables, a new StatsService orchestration layer that coordinates stats projection from finalized messages and transfers, pagination logic for bridged token and chain statistics endpoints, and gRPC service handlers exposing these stats via new API endpoints. Integration points include extending the message buffer and finalization pipeline to trigger stats projection, adding token enrichment hooks to propagate metadata into stats tables, and updating the Avalanche indexer to use StatsService. Configuration options enable stats backfill on startup and periodic background recomputation of per-chain statistics.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Poem

🐰 hops through data with delight
Cross-chain bridges now shine bright,
Stats tables bloom with every flow,
The rabbit's ledger steals the show! 📊✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name | Status | Explanation
Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled.
Title check | ✅ Passed | The title clearly summarizes the main change: improvements to the internal stats module, which aligns with the substantial additions of statistics infrastructure, projection logic, and APIs.
Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%.

Contributor

Copilot AI left a comment

Pull request overview

This PR adds a materialized, incremental statistics pipeline (DB tables + projection/backfill + periodic recomputation) and exposes initial read APIs for interchain analytics (chains catalog, bridged-token stats, chain stats, and message-path views).

Changes:

  • Introduces new stats schema (assets/edges/messages/daily rollups/per-chain aggregates) plus incremental projection/backfill and periodic stats_chains recomputation.
  • Adds new public API endpoints and pagination/sorting logic for stats-driven views (bridged tokens, chain stats, message paths) and a chains catalog endpoint.
  • Extends token metadata flow to asynchronously enrich stats tables and propagates fetched token info into stats state.

Reviewed changes

Copilot reviewed 52 out of 53 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
interchain-indexer/justfile Updates dev run defaults (logging + enable stats backfill on start).
interchain-indexer/interchain-indexer-server/tests/avalanche_e2e.rs Wires StatsService into Avalanche indexer E2E tests.
interchain-indexer/interchain-indexer-server/src/settings.rs Adds stats_backfill_on_start + stats_chains_recalculation_period_secs settings with tests.
interchain-indexer/interchain-indexer-server/src/services/stats.rs Implements new stats gRPC/HTTP handlers (bridged tokens, chain stats, message paths) + parsing helpers.
interchain-indexer/interchain-indexer-server/src/services/mod.rs Registers new chain_info_proto module.
interchain-indexer/interchain-indexer-server/src/services/interchain_service.rs Adds GetChains endpoint and centralizes chain-to-proto conversion.
interchain-indexer/interchain-indexer-server/src/services/chain_info_proto.rs New shared chains::ModelChainInfo mapping helper.
interchain-indexer/interchain-indexer-server/src/server.rs Wires StatsService, startup backfill, and periodic stats_chains worker; passes stats into indexers and services.
interchain-indexer/interchain-indexer-server/src/indexers.rs Changes indexer spawning to accept Arc<StatsService>.
interchain-indexer/interchain-indexer-server/src/config.rs Removes brittle tests that depended on repo file layout.
interchain-indexer/interchain-indexer-server/config/example.toml Documents new stats-related settings.
interchain-indexer/interchain-indexer-proto/swagger/v1/interchain-indexer.swagger.yaml Adds OpenAPI paths/definitions for new endpoints and pagination DTOs.
interchain-indexer/interchain-indexer-proto/proto/v1/stats.proto Adds stats service RPCs, pagination messages, and sort/order enums.
interchain-indexer/interchain-indexer-proto/proto/v1/interchain_indexer.proto Adds GetChains and message-path request/response messages.
interchain-indexer/interchain-indexer-proto/proto/v1/api_config_http.yaml Adds HTTP mappings for new endpoints.
interchain-indexer/interchain-indexer-proto/build.rs Adds serde defaults for query enum fields and a post-process to dedupe actix-prost duplicate structs.
interchain-indexer/interchain-indexer-migration/src/migrations_up/m20260312_175120_add_stats_tables_up.sql Adds stats tables/types, processing markers, and supporting indexes.
interchain-indexer/interchain-indexer-migration/src/migrations_down/m20260312_175120_add_stats_tables_down.sql Drops stats tables/types/indexes and removes new columns.
interchain-indexer/interchain-indexer-migration/src/m20260312_175120_add_stats_tables.rs Registers SQL-based migration wrapper.
interchain-indexer/interchain-indexer-migration/src/lib.rs Adds the new stats migration to the migrator.
interchain-indexer/interchain-indexer-logic/src/token_info/service.rs Propagates fetched token info into stats tables and adds async enrichment kickoff for stats keys.
interchain-indexer/interchain-indexer-logic/src/stats_chains_query.rs New keyset-paginated query for /stats/chains.
interchain-indexer/interchain-indexer-logic/src/stats/service.rs New orchestration layer for projection/backfill/recompute + read helpers.
interchain-indexer/interchain-indexer-logic/src/stats/projection.rs New transactional batch projection of messages/transfers into stats tables (assets/edges/messages/days).
interchain-indexer/interchain-indexer-logic/src/stats/mod.rs Exposes stats service and types.
interchain-indexer/interchain-indexer-logic/src/pagination.rs Adds stats pagination tokens and sort/order enums.
interchain-indexer/interchain-indexer-logic/src/message_buffer/persistence.rs Adds token-key extraction helper for enrichment.
interchain-indexer/interchain-indexer-logic/src/message_buffer/mod.rs Re-exports finalized token-key helper.
interchain-indexer/interchain-indexer-logic/src/message_buffer/maintenance.rs Runs stats projection in the flush transaction and kicks off token enrichment post-commit.
interchain-indexer/interchain-indexer-logic/src/message_buffer/buffer.rs Refactors buffer to depend on StatsService (and provides constructors for embedders/tests).
interchain-indexer/interchain-indexer-logic/src/lib.rs Registers new stats modules and re-exports pagination/service types.
interchain-indexer/interchain-indexer-logic/src/indexer/avalanche/mod.rs Passes StatsService into buffer and updates blockchain-id resolution to honor process_unknown_chains.
interchain-indexer/interchain-indexer-logic/src/indexer/avalanche/consolidation.rs Initializes stats_processed on new consolidated messages.
interchain-indexer/interchain-indexer-logic/src/indexer/avalanche/blockchain_id_resolver.rs Adds force_add_chain semantics to control whether discovered chains are persisted.
interchain-indexer/interchain-indexer-logic/src/chain_info/service.rs Adds get_all_chains_info() with normalization and DB-backed tests.
interchain-indexer/interchain-indexer-logic/src/bridged_tokens_query.rs New keyset-paginated bridged-token aggregation query + token-link enrichment fetch.
interchain-indexer/interchain-indexer-entity/src/codegen/stats_messages_days.rs Adds SeaORM entity for stats_messages_days.
interchain-indexer/interchain-indexer-entity/src/codegen/stats_messages.rs Adds SeaORM entity for stats_messages.
interchain-indexer/interchain-indexer-entity/src/codegen/stats_chains.rs Adds SeaORM entity for stats_chains.
interchain-indexer/interchain-indexer-entity/src/codegen/stats_assets.rs Adds SeaORM entity for stats_assets.
interchain-indexer/interchain-indexer-entity/src/codegen/stats_asset_tokens.rs Adds SeaORM entity for stats_asset_tokens.
interchain-indexer/interchain-indexer-entity/src/codegen/stats_asset_edges.rs Adds SeaORM entity for stats_asset_edges.
interchain-indexer/interchain-indexer-entity/src/codegen/sea_orm_active_enums.rs Adds EdgeAmountSide active enum.
interchain-indexer/interchain-indexer-entity/src/codegen/prelude.rs Exposes new stats entities in the prelude.
interchain-indexer/interchain-indexer-entity/src/codegen/mod.rs Registers stats entity modules.
interchain-indexer/interchain-indexer-entity/src/codegen/crosschain_transfers.rs Adds stats_processed + stats_asset_id fields and relation to stats_assets.
interchain-indexer/interchain-indexer-entity/src/codegen/crosschain_messages.rs Adds stats_processed field.
interchain-indexer/interchain-indexer-entity/src/codegen/chains.rs Adds relations to stats tables.
interchain-indexer/config/avalanche/bridges.json Sets process_unknown_chains to false for Avalanche config.
interchain-indexer/README.md Documents new stats settings env vars.
interchain-indexer/Cargo.toml Adds the bon dependency and reorders entry.
interchain-indexer/.memory-bank/gotchas.md Documents gotchas for stats edge amount_side semantics and filtering behavior.

Comment thread interchain-indexer/interchain-indexer-logic/src/pagination.rs Outdated
Comment thread interchain-indexer/interchain-indexer-logic/src/stats/projection.rs
@EvgenKor EvgenKor requested review from tom2drum and removed request for maxaleks March 24, 2026 11:46
Contributor Author

EvgenKor commented Apr 4, 2026

@CodeRabbit review

Contributor

coderabbitai Bot commented Apr 4, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 7

🧹 Nitpick comments (7)
interchain-indexer/interchain-indexer-server/src/settings.rs (1)

135-143: Avoid duplicating the production default in tests.

stats_chains_period_default_for_test() redefines 3600, so tests can pass even if production default changes. Reuse the production default function directly to prevent drift.

♻️ Suggested tweak
-    fn stats_chains_period_default_for_test() -> u64 {
-        3600
-    }
-
     #[derive(Deserialize)]
     struct StatsChainsPeriod {
-        #[serde(default = "stats_chains_period_default_for_test")]
+        #[serde(default = "super::default_stats_chains_recalculation_period_secs")]
         stats_chains_recalculation_period_secs: u64,
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@interchain-indexer/interchain-indexer-server/src/settings.rs` around lines
135 - 143, The test-specific function stats_chains_period_default_for_test()
redefines the literal 3600 and should be removed; update StatsChainsPeriod to
reuse the existing production default function used for
stats_chains_recalculation_period_secs (i.e., replace the serde default
reference to call the production default function instead of
stats_chains_period_default_for_test()), and delete the duplicate
stats_chains_period_default_for_test() helper so the test default will
automatically follow any production changes.
interchain-indexer/interchain-indexer-logic/src/chain_info/service.rs (1)

162-170: Enforce sort order in-service to match the documented contract.

get_all_chains_info() says it returns rows ordered by chains.id ascending, but it currently relies on DB behavior only (Line 163). Sorting before return makes this contract stable.

🔧 Suggested hardening
     pub async fn get_all_chains_info(&self) -> anyhow::Result<Vec<chains::Model>> {
         let chains = self.db.get_all_chains().await?;
         let mut out = Vec::with_capacity(chains.len());
         for chain in chains {
             let normalized = normalize_chain(chain);
             out.push(normalized);
         }
+        out.sort_by_key(|c| c.id);
         Ok(out)
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@interchain-indexer/interchain-indexer-logic/src/chain_info/service.rs` around
lines 162 - 170, The get_all_chains_info function currently returns DB rows
without enforcing order; update get_all_chains_info to sort the normalized
Vec<chains::Model> by chains.id ascending before returning (after calling
normalize_chain in the loop or by sorting the original `chains` collection),
ensuring the returned ordering matches the documented contract; reference the
function name get_all_chains_info, the normalize_chain call, and the
db.get_all_chains result when implementing the in-service sort.
interchain-indexer/interchain-indexer-logic/src/message_buffer/persistence.rs (1)

1-1: Consolidate std::collections imports.

HashMap is imported on line 1, and HashSet is separately imported on line 12. Per the coding guidelines, use crate-level import grouping.

Suggested fix
-use std::collections::HashMap;
+use std::collections::{HashMap, HashSet};

And remove line 12.

As per coding guidelines: "Use crate-level import grouping with imports_granularity=Crate setting".

Also applies to: 12-12

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@interchain-indexer/interchain-indexer-logic/src/message_buffer/persistence.rs`
at line 1, The file imports from std::collections are split; consolidate them
into a single crate-level import by combining HashMap and HashSet into one
statement (e.g., use std::collections::{HashMap, HashSet};) and remove the
separate HashSet import line; update the top of persistence.rs where HashMap and
HashSet are currently imported so both symbols (HashMap, HashSet) are brought in
by a single use declaration.
interchain-indexer/interchain-indexer-server/src/services/stats.rs (1)

243-254: Consider parallelizing chain info lookups for large result sets.

The sequential await calls for get_chain_info in the loop could become a bottleneck if the result set is large. For typical message-path queries with a limited number of chain pairs, this is likely acceptable.

♻️ Optional: parallelize with futures::future::join_all
use futures::future::join_all;

let chain_ids: Vec<_> = rows.iter()
    .flat_map(|r| [r.src_chain_id, r.dst_chain_id])
    .collect::<std::collections::HashSet<_>>()
    .into_iter()
    .collect();

let chain_infos: std::collections::HashMap<_, _> = join_all(
    chain_ids.iter().map(|&id| async move {
        (id, self.chain_info.get_chain_info(id).await)
    })
).await.into_iter().collect();

let items = rows.into_iter().map(|row| {
    let source = chain_model_to_proto(chain_infos.get(&row.src_chain_id).cloned().flatten());
    let destination = chain_model_to_proto(chain_infos.get(&row.dst_chain_id).cloned().flatten());
    MessagePathRow {
        source_chain: Some(source),
        destination_chain: Some(destination),
        messages_count: i64_to_u64_nonneg(row.messages_count),
    }
}).collect();
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@interchain-indexer/interchain-indexer-server/src/services/stats.rs` around
lines 243 - 254, The loop is performing sequential awaits on
self.chain_info.get_chain_info for each row which can be slow for many rows;
instead collect unique src/dst chain IDs from rows, concurrently fetch all chain
infos (e.g., with futures::future::join_all or FuturesUnordered) into a HashMap
keyed by chain id, then map rows into MessagePathRow by looking up chain infos
and calling chain_model_to_proto for source and destination; update the items
construction to use the pre-fetched map and keep using i64_to_u64_nonneg for
messages_count.
interchain-indexer/interchain-indexer-logic/src/stats/projection.rs (3)

138-161: Consider increasing the batch size for marking messages as processed.

The batch size of 2 in run_in_batches (line 139) seems unusually small. This will generate many individual UPDATE statements for large batches of messages. Consider using a larger batch size (e.g., 500 or configurable based on PG_BIND_PARAM_LIMIT / 2) to reduce database round-trips.

♻️ Suggested change
     let mark: Vec<(i64, i32)> = rows.iter().map(|m| (m.id, m.bridge_id)).collect();
-    run_in_batches(&mark, 2, |batch| async {
+    let batch_size = (crate::bulk::PG_BIND_PARAM_LIMIT / 2).max(1);
+    run_in_batches(&mark, batch_size, |batch| async {
         crosschain_messages::Entity::update_many()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@interchain-indexer/interchain-indexer-logic/src/stats/projection.rs` around
lines 138 - 161, The batch size passed to run_in_batches when updating
StatsProcessed is set to 2 which causes excessive small UPDATEs; change the
hardcoded 2 to a larger value or a configurable constant (e.g., 500 or computed
from PG_BIND_PARAM_LIMIT/2) so run_in_batches(&mark, <new_batch_size>, |batch|
...) groups many ids per crosschain_messages::Entity::update_many() call and
reduces DB round-trips; update any config/constant and tests accordingly.
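To make the round-trip cost concrete, here is a small arithmetic sketch. Both helpers are hypothetical. The figure 65,535 is PostgreSQL's per-statement bind parameter ceiling; whether the crate's PG_BIND_PARAM_LIMIT constant has exactly that value is an assumption.

```rust
/// Number of UPDATE statements needed to mark `rows` items when each
/// statement covers at most `batch_size` of them.
pub fn statements_needed(rows: usize, batch_size: usize) -> usize {
    let b = batch_size.max(1);
    (rows + b - 1) / b
}

/// Largest batch that stays within a bind-parameter limit when every row
/// contributes `params_per_row` parameters (2 for an (id, bridge_id) pair).
pub fn batch_size_for(bind_param_limit: usize, params_per_row: usize) -> usize {
    (bind_param_limit / params_per_row.max(1)).max(1)
}
```

With these helpers, marking 10,000 messages takes `statements_needed(10_000, 2)` = 5,000 statements at the current batch size, versus a single statement at `batch_size_for(65_535, 2)` = 32,767 rows per batch.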

787-806: Batch size of 1 is inefficient for marking transfers as processed.

Similar to the messages case, run_in_batches(&ids, 1, ...) (line 788) will generate one UPDATE per transfer. Consider increasing to a larger batch size.

♻️ Suggested change
     for (aid, ids) in by_asset {
-        run_in_batches(&ids, 1, |batch| async {
+        let batch_size = crate::bulk::PG_BIND_PARAM_LIMIT.max(1);
+        run_in_batches(&ids, batch_size, |batch| async {
             crosschain_transfers::Entity::update_many()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@interchain-indexer/interchain-indexer-logic/src/stats/projection.rs` around
lines 787 - 806, The loop currently calls run_in_batches(&ids, 1, ...) which
issues one UPDATE per transfer; change the batch size from 1 to a larger value
(e.g., 100) or replace the literal with a named constant (e.g., BATCH_SIZE) and
use that in run_in_batches(&ids, BATCH_SIZE, ...) inside the same block that
calls crosschain_transfers::Entity::update_many(), keeping the same filters
(StatsProcessed.eq(0i16)) and column updates so updates are applied in larger
groups rather than per-transfer.

765-777: Consider adding on_conflict for edge inserts to handle race conditions.

The stats_asset_edges insert (lines 765-776) does not use on_conflict(). If two concurrent transactions attempt to insert the same edge (same stats_asset_id, src_chain_id, dst_chain_id), one will fail with a unique constraint violation.

Since project_transfers_batch may run concurrently during backfill or parallel processing, consider using an upsert pattern consistent with the stats_messages handling.

♻️ Suggested change
                 stats_asset_edges::Entity::insert(stats_asset_edges::ActiveModel {
                     stats_asset_id: Set(stats_asset_id),
                     src_chain_id: Set(src_chain_id),
                     dst_chain_id: Set(dst_chain_id),
                     transfers_count: Set(count),
                     cumulative_amount: Set(cumulative),
                     decimals: Set(working_decimals),
                     amount_side: Set(amount_side),
                     ..Default::default()
                 })
+                .on_conflict(
+                    OnConflict::columns([
+                        stats_asset_edges::Column::StatsAssetId,
+                        stats_asset_edges::Column::SrcChainId,
+                        stats_asset_edges::Column::DstChainId,
+                    ])
+                    .value(
+                        stats_asset_edges::Column::TransfersCount,
+                        Expr::col((stats_asset_edges::Entity, stats_asset_edges::Column::TransfersCount)).add(count),
+                    )
+                    .value(
+                        stats_asset_edges::Column::CumulativeAmount,
+                        Expr::col((stats_asset_edges::Entity, stats_asset_edges::Column::CumulativeAmount)).add(cumulative.clone()),
+                    )
+                    .value(stats_asset_edges::Column::UpdatedAt, Expr::current_timestamp())
+                    .to_owned(),
+                )
                 .exec(tx)
                 .await?;

Based on learnings: "Always use on_conflict() for upserts and batch large inserts when interacting with the database."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@interchain-indexer/interchain-indexer-logic/src/stats/projection.rs` around
lines 765 - 777, The insert into stats_asset_edges (inside
project_transfers_batch) can race and violate the unique constraint because it
lacks on_conflict handling; change the stats_asset_edges::Entity::insert(...)
call to an upsert by adding on_conflict() that targets the unique key
(stats_asset_id, src_chain_id, dst_chain_id) and performs an update of the
updatable columns (transfers_count, cumulative_amount, decimals, amount_side) so
concurrent inserts merge instead of erroring; keep using exec(tx).await and
mirror the upsert pattern you used for stats_messages to ensure consistent
conflict resolution.
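The difference between a plain insert and an additive upsert can be modeled in memory. This sketch uses a `HashMap` in place of stats_asset_edges and hypothetical helper names; it illustrates the merge semantics the suggestion aims for and does not use SeaORM itself.

```rust
use std::collections::HashMap;

/// (stats_asset_id, src_chain_id, dst_chain_id), i.e. the unique key of an edge.
pub type EdgeKey = (i64, i64, i64);

#[derive(Debug, Default, Clone, PartialEq)]
pub struct EdgeTotals {
    pub transfers_count: i64,
    pub cumulative_amount: u128,
}

/// Plain insert: a second writer for the same key fails, which models
/// the unique constraint violation described above.
pub fn insert_strict(
    table: &mut HashMap<EdgeKey, EdgeTotals>,
    key: EdgeKey,
    delta: EdgeTotals,
) -> Result<(), String> {
    if table.contains_key(&key) {
        return Err(format!("unique constraint violation for edge {key:?}"));
    }
    table.insert(key, delta);
    Ok(())
}

/// Additive upsert: an existing row absorbs the new counts instead of failing.
pub fn upsert_additive(table: &mut HashMap<EdgeKey, EdgeTotals>, key: EdgeKey, delta: EdgeTotals) {
    let row = table.entry(key).or_default();
    row.transfers_count += delta.transfers_count;
    row.cumulative_amount += delta.cumulative_amount;
}
```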
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@interchain-indexer/interchain-indexer-logic/src/indexer/avalanche/blockchain_id_resolver.rs`:
- Around line 29-39: The in-memory cache used by resolve currently only stores
chain_id, so when force_add_chain requires DB writes those writes can be skipped
on future cache hits; change the cache value to include persistence state (e.g.
store {chain_id, persisted: bool}) or, if easier, on cache hit when
force_add_chain==true re-run the persistence path that updates chains and
avalanche_icm_blockchain_ids until it succeeds; update all code paths in resolve
that read/write the cache and the persistence logic (references: resolve,
force_add_chain, avalanche_icm_blockchain_ids, chains) so that cached entries
reflect whether DB persistence has completed, and ensure warning/partial-failure
paths mark persisted=false so subsequent calls retry.

In `@interchain-indexer/interchain-indexer-logic/src/indexer/avalanche/mod.rs`:
- Around line 835-838: The call sites to BlockchainIdResolver::resolve pass
ctx.process_unknown_chains but the implementation parameter is named
force_add_chain; rename the parameter in the resolve signature and all
implementations/overrides from force_add_chain: bool to process_unknown_chains:
bool (and update any docstrings/comments) so the semantics match the callers (no
caller changes required); ensure the parameter name is updated in the trait/impl
declarations for BlockchainIdResolver::resolve and any related tests or usages.

In `@interchain-indexer/interchain-indexer-logic/src/token_info/service.rs`:
- Around line 272-312: kickoff_token_fetch_for_stats_enrichment currently calls
tokio::spawn once per unique token with no upper bound (in_flight_fetches only
dedupes, it does not cap concurrency), which can fan out into thousands of
concurrent RPC/DB operations. Limit concurrency with a semaphore or a buffered
stream: e.g., create a tokio::sync::Semaphore and acquire a permit before
calling fetch_token_info_from_chain_and_persist, releasing it after the call, or
run the fetches through futures::stream::iter(uniq).map(|(chain_id,address)|
async move { ... }).buffer_unordered(MAX_CONCURRENCY). Keep using
svc.in_flight_fetches to dedupe, but make sure
fetch_token_info_from_chain_and_persist only starts after a permit is acquired
(or inside the buffered stream), so that no more than MAX_CONCURRENCY concurrent
RPC/DB tasks run.
- Around line 235-238: The cache insertion uses or_insert_with which preserves
stale entries, so after a successful enrichment you must replace any existing
cached partial model with the fresh one; in the token_info_cache update inside
kickoff_token_fetch_for_stats_enrichment (the block that currently does
cache.entry(key.clone()).or_insert_with(|| model.clone())), change it to either
call cache.insert(key.clone(), model.clone()) or use entry(...).and_modify(|v|
*v = model.clone()).or_insert_with(|| model.clone()) so the cache is overwritten
with the enriched model rather than keeping stale data.
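
The difference between the two `HashMap` calls in the cache item above can be sketched with the standard library alone. `TokenModel`, `TokenKey`, and the two helper functions are hypothetical stand-ins for the PR's cache types, which are not shown in this review.

```rust
use std::collections::HashMap;

/// Hypothetical, trimmed-down token model: `None` fields mean "not enriched yet".
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct TokenModel {
    pub symbol: Option<String>,
    pub decimals: Option<i16>,
}

/// Hypothetical cache key: (chain_id, token address bytes).
pub type TokenKey = (i64, Vec<u8>);

/// `or_insert_with` only writes when the key is absent, so an existing
/// partial entry survives. This is the behaviour the comment flags.
pub fn cache_keep_existing(
    cache: &mut HashMap<TokenKey, TokenModel>,
    key: TokenKey,
    model: TokenModel,
) {
    cache.entry(key).or_insert_with(|| model);
}

/// `insert` always replaces the value, so the enriched model wins.
pub fn cache_overwrite(
    cache: &mut HashMap<TokenKey, TokenModel>,
    key: TokenKey,
    model: TokenModel,
) {
    cache.insert(key, model);
}
```

Starting from a cache that holds a partial model, `cache_keep_existing` leaves the partial model in place while `cache_overwrite` replaces it with the enriched one.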

In `@interchain-indexer/interchain-indexer-proto/proto/v1/stats.proto`:
- Around line 15-16: Fix the typo in the comment above the Pagination message in
interchain-indexer-proto/proto/v1/stats.proto: replace "bridged-t/token stats"
with "bridged-token stats" so the comment reads "Pagination for bridged-token
stats only (`page_token` or explicit cursor fields)." and keep the rest of the
comment unchanged.

In
`@interchain-indexer/interchain-indexer-proto/swagger/v1/interchain-indexer.swagger.yaml`:
- Around line 825-827: The Swagger description contains a typo propagated from
the proto source, "bridged-t/token". Fix the comment above the Pagination
message in stats.proto (the one that generates the description "Pagination for
bridged-t/token stats only...") to read "bridged-token", then regenerate the
OpenAPI/Swagger output so the corrected text replaces "bridged-t/token" in
interchain-indexer.swagger.yaml.

In `@interchain-indexer/interchain-indexer-server/src/server.rs`:
- Around line 123-130: The stats backfill (call to
stats.backfill_stats_until_idle_with_token_enrichment()) currently runs before
the metadata upserts and can write FK-backed rows against missing reference
data; move the entire conditional block that checks
settings.stats_backfill_on_start and calls
backfill_stats_until_idle_with_token_enrichment() so it executes after the
initial reference seeding calls (upsert_chains(), upsert_bridges(),
upsert_bridge_contracts()) have completed; ensure the async call and its await
remain intact and keep the tracing::info message but relocate it below the
upsert_chains/upsert_bridges/upsert_bridge_contracts sequence.

---

Nitpick comments:
In `@interchain-indexer/interchain-indexer-logic/src/chain_info/service.rs`:
- Around line 162-170: The get_all_chains_info function currently returns DB
rows without enforcing order; update get_all_chains_info to sort the normalized
Vec<chains::Model> by chains.id ascending before returning (after calling
normalize_chain in the loop or by sorting the original `chains` collection),
ensuring the returned ordering matches the documented contract; reference the
function name get_all_chains_info, the normalize_chain call, and the
db.get_all_chains result when implementing the in-service sort.
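
An in-service sort of this kind could look like the sketch below. `ChainRow` is a hypothetical stand-in for `chains::Model`, and `sort_chains_by_id` is an illustrative helper name, not a function from the PR.

```rust
/// Hypothetical stand-in for `chains::Model` with only the fields needed here.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct ChainRow {
    pub id: i64,
    pub name: String,
}

/// Return the rows ordered by `id` ascending, regardless of the order
/// in which the database produced them.
pub fn sort_chains_by_id(mut chains: Vec<ChainRow>) -> Vec<ChainRow> {
    // `sort_by_key` is stable, so rows with equal ids keep their relative order.
    chains.sort_by_key(|c| c.id);
    chains
}
```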

In
`@interchain-indexer/interchain-indexer-logic/src/message_buffer/persistence.rs`:
- Line 1: The std::collections imports are split across two use declarations;
combine HashMap and HashSet into one statement (e.g., use
std::collections::{HashMap, HashSet};) and remove the separate HashSet import
line at the top of persistence.rs.

In `@interchain-indexer/interchain-indexer-logic/src/stats/projection.rs`:
- Around line 138-161: The batch size passed to run_in_batches when updating
StatsProcessed is set to 2 which causes excessive small UPDATEs; change the
hardcoded 2 to a larger value or a configurable constant (e.g., 500 or computed
from PG_BIND_PARAM_LIMIT/2) so run_in_batches(&mark, <new_batch_size>, |batch|
...) groups many ids per crosschain_messages::Entity::update_many() call and
reduces DB round-trips; update any config/constant and tests accordingly.
- Around line 787-806: The loop currently calls run_in_batches(&ids, 1, ...)
which issues one UPDATE per transfer; change the batch size from 1 to a larger
value (e.g., 100) or replace the literal with a named constant (e.g.,
BATCH_SIZE) and use that in run_in_batches(&ids, BATCH_SIZE, ...) inside the
same block that calls crosschain_transfers::Entity::update_many(), keeping the
same filters (StatsProcessed.eq(0i16)) and column updates so updates are applied
in larger groups rather than per-transfer.
- Around line 765-777: The insert into stats_asset_edges (inside
project_transfers_batch) can race and violate the unique constraint because it
lacks on_conflict handling; change the stats_asset_edges::Entity::insert(...)
call to an upsert by adding on_conflict() that targets the unique key
(stats_asset_id, src_chain_id, dst_chain_id) and performs an update of the
updatable columns (transfers_count, cumulative_amount, decimals, amount_side) so
concurrent inserts merge instead of erroring; keep using exec(tx).await and
mirror the upsert pattern you used for stats_messages to ensure consistent
conflict resolution.
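
To make the batch-size point in the first two items concrete, the sketch below counts how many times a per-batch callback runs for a given batch size. `for_each_batch` is a hypothetical synchronous stand-in written for illustration; it is not the PR's `run_in_batches`, whose signature is not shown in this review.

```rust
/// Hypothetical stand-in for a batching helper: calls `f` once per chunk of
/// at most `batch_size` items and returns the number of calls made.
/// In the real code each call would correspond to one `update_many()` round-trip.
pub fn for_each_batch<T, F>(items: &[T], batch_size: usize, mut f: F) -> usize
where
    F: FnMut(&[T]),
{
    assert!(batch_size > 0, "batch_size must be non-zero");
    let mut calls = 0;
    for batch in items.chunks(batch_size) {
        f(batch);
        calls += 1;
    }
    calls
}
```

For 1,000 ids, a batch size of 1 means 1,000 calls, a batch size of 2 means 500, and a batch size of 500 means 2, which is the round-trip reduction the comments are after.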

In `@interchain-indexer/interchain-indexer-server/src/services/stats.rs`:
- Around line 243-254: The loop is performing sequential awaits on
self.chain_info.get_chain_info for each row which can be slow for many rows;
instead collect unique src/dst chain IDs from rows, concurrently fetch all chain
infos (e.g., with futures::future::join_all or FuturesUnordered) into a HashMap
keyed by chain id, then map rows into MessagePathRow by looking up chain infos
and calling chain_model_to_proto for source and destination; update the items
construction to use the pre-fetched map and keep using i64_to_u64_nonneg for
messages_count.
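
The dedupe-then-lookup half of this suggestion can be sketched without an async runtime. All types and the `build_paths` function below are hypothetical; the concurrent fetch (join_all or FuturesUnordered) is replaced by a synchronous `fetch` closure so the example stays dependency-free, and clamping negative counts to zero is an assumption about what `i64_to_u64_nonneg` does.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical chain metadata returned by the chain-info service.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct ChainInfo {
    pub id: i64,
    pub name: String,
}

/// Hypothetical row as read from the stats table.
pub struct PathRow {
    pub src_chain_id: i64,
    pub dst_chain_id: i64,
    pub messages_count: i64,
}

/// Hypothetical response item.
#[derive(Debug, PartialEq, Eq)]
pub struct MessagePath {
    pub source: Option<ChainInfo>,
    pub destination: Option<ChainInfo>,
    pub messages_count: u64,
}

/// Fetch each distinct chain once, then build every row from the map.
pub fn build_paths<F>(rows: &[PathRow], mut fetch: F) -> Vec<MessagePath>
where
    F: FnMut(i64) -> Option<ChainInfo>,
{
    // Collect the distinct source and destination ids across all rows.
    let unique_ids: BTreeSet<i64> = rows
        .iter()
        .flat_map(|r| [r.src_chain_id, r.dst_chain_id])
        .collect();

    // One fetch per distinct id; in async code this is where the
    // lookups would be issued concurrently.
    let infos: HashMap<i64, ChainInfo> = unique_ids
        .into_iter()
        .filter_map(|id| fetch(id).map(|info| (id, info)))
        .collect();

    rows.iter()
        .map(|r| MessagePath {
            source: infos.get(&r.src_chain_id).cloned(),
            destination: infos.get(&r.dst_chain_id).cloned(),
            // Assumption: negative counts are clamped to zero.
            messages_count: r.messages_count.max(0) as u64,
        })
        .collect()
}
```

With three rows that mention three distinct chains, `fetch` runs three times instead of six, and the saving grows with the number of rows that share chains.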

In `@interchain-indexer/interchain-indexer-server/src/settings.rs`:
- Around line 135-143: The test-specific function
stats_chains_period_default_for_test() redefines the literal 3600 and should be
removed; update StatsChainsPeriod to reuse the existing production default
function used for stats_chains_recalculation_period_secs (i.e., replace the
serde default reference to call the production default function instead of
stats_chains_period_default_for_test()), and delete the duplicate
stats_chains_period_default_for_test() helper so the test default will
automatically follow any production changes.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 08fd22f0-b4c6-4aad-8ee5-1220b83a11f4

📥 Commits

Reviewing files that changed from the base of the PR and between bd9cef2 and 1e18e67.

📒 Files selected for processing (54)
  • interchain-indexer/.memory-bank/gotchas.md
  • interchain-indexer/Cargo.toml
  • interchain-indexer/README.md
  • interchain-indexer/config/avalanche/bridges.json
  • interchain-indexer/interchain-indexer-entity/src/codegen/chains.rs
  • interchain-indexer/interchain-indexer-entity/src/codegen/crosschain_messages.rs
  • interchain-indexer/interchain-indexer-entity/src/codegen/crosschain_transfers.rs
  • interchain-indexer/interchain-indexer-entity/src/codegen/mod.rs
  • interchain-indexer/interchain-indexer-entity/src/codegen/prelude.rs
  • interchain-indexer/interchain-indexer-entity/src/codegen/sea_orm_active_enums.rs
  • interchain-indexer/interchain-indexer-entity/src/codegen/stats_asset_edges.rs
  • interchain-indexer/interchain-indexer-entity/src/codegen/stats_asset_tokens.rs
  • interchain-indexer/interchain-indexer-entity/src/codegen/stats_assets.rs
  • interchain-indexer/interchain-indexer-entity/src/codegen/stats_chains.rs
  • interchain-indexer/interchain-indexer-entity/src/codegen/stats_messages.rs
  • interchain-indexer/interchain-indexer-entity/src/codegen/stats_messages_days.rs
  • interchain-indexer/interchain-indexer-logic/src/bridged_tokens_query.rs
  • interchain-indexer/interchain-indexer-logic/src/chain_info/service.rs
  • interchain-indexer/interchain-indexer-logic/src/database.rs
  • interchain-indexer/interchain-indexer-logic/src/indexer/avalanche/blockchain_id_resolver.rs
  • interchain-indexer/interchain-indexer-logic/src/indexer/avalanche/consolidation.rs
  • interchain-indexer/interchain-indexer-logic/src/indexer/avalanche/mod.rs
  • interchain-indexer/interchain-indexer-logic/src/lib.rs
  • interchain-indexer/interchain-indexer-logic/src/message_buffer/buffer.rs
  • interchain-indexer/interchain-indexer-logic/src/message_buffer/maintenance.rs
  • interchain-indexer/interchain-indexer-logic/src/message_buffer/mod.rs
  • interchain-indexer/interchain-indexer-logic/src/message_buffer/persistence.rs
  • interchain-indexer/interchain-indexer-logic/src/pagination.rs
  • interchain-indexer/interchain-indexer-logic/src/stats/mod.rs
  • interchain-indexer/interchain-indexer-logic/src/stats/projection.rs
  • interchain-indexer/interchain-indexer-logic/src/stats/service.rs
  • interchain-indexer/interchain-indexer-logic/src/stats_chains_query.rs
  • interchain-indexer/interchain-indexer-logic/src/token_info/service.rs
  • interchain-indexer/interchain-indexer-migration/src/lib.rs
  • interchain-indexer/interchain-indexer-migration/src/m20260312_175120_add_stats_tables.rs
  • interchain-indexer/interchain-indexer-migration/src/migrations_down/m20260312_175120_add_stats_tables_down.sql
  • interchain-indexer/interchain-indexer-migration/src/migrations_up/m20260312_175120_add_stats_tables_up.sql
  • interchain-indexer/interchain-indexer-proto/build.rs
  • interchain-indexer/interchain-indexer-proto/proto/v1/api_config_http.yaml
  • interchain-indexer/interchain-indexer-proto/proto/v1/interchain_indexer.proto
  • interchain-indexer/interchain-indexer-proto/proto/v1/stats.proto
  • interchain-indexer/interchain-indexer-proto/swagger/v1/interchain-indexer.swagger.yaml
  • interchain-indexer/interchain-indexer-server/config/example.toml
  • interchain-indexer/interchain-indexer-server/src/config.rs
  • interchain-indexer/interchain-indexer-server/src/indexers.rs
  • interchain-indexer/interchain-indexer-server/src/server.rs
  • interchain-indexer/interchain-indexer-server/src/services/chain_info_proto.rs
  • interchain-indexer/interchain-indexer-server/src/services/interchain_service.rs
  • interchain-indexer/interchain-indexer-server/src/services/mod.rs
  • interchain-indexer/interchain-indexer-server/src/services/stats.rs
  • interchain-indexer/interchain-indexer-server/src/settings.rs
  • interchain-indexer/interchain-indexer-server/tests/avalanche_e2e.rs
  • interchain-indexer/justfile
  • interchain-indexer/types/package.json
💤 Files with no reviewable changes (1)
  • interchain-indexer/interchain-indexer-server/src/config.rs

Comment thread interchain-indexer/interchain-indexer-proto/proto/v1/stats.proto Outdated
Comment thread interchain-indexer/interchain-indexer-server/src/server.rs
@EvgenKor EvgenKor merged commit a289ea3 into main Apr 6, 2026
7 checks passed
@EvgenKor EvgenKor deleted the evgenkor/interchain/stats-tables branch April 6, 2026 18:49