Skip to content

Conversation

@vedanthnyk25
Copy link

This PR addresses the severe latency issues on the stops-for-route endpoint (Issue #165), where large routes were frequently timing out (30s+). By implementing bulk-loading, localized caching, and database indexing, the latency has been reduced to ~1.5s .

Technical changes:

  • Added a CREATE INDEX on shapes(shape_id) to prevent full table scans during geometry lookups.

  • Replaced individual stop-by-stop database queries with a bulk-loading pattern in stopsForRouteHandler.

  • Implemented SetContextCache and SetShapeCache in AdvancedDirectionCalculator to allow for in-memory calculations.

  • Updated CalculateStopDirection to accept existing GTFS directions (sql.NullString), bypassing calculations when data is already present in the database.

Initial:

terminal_log1 postman_log1

Final:

Screenshot from 2025-12-28 21-50-15 Screenshot from 2025-12-28 21-49-18

Closes #165

@CLAassistant
Copy link

CLAassistant commented Dec 30, 2025

CLA assistant check
All committers have signed the CLA.

@vedanthnyk25
Copy link
Author

I have resolved the concurrency initialization panic by transitioning the AdvancedDirectionCalculator to request-scoped within the stopsForRouteHandler.

Full Suite Pass: Verified with go test -count=1 ./... to ensure no flaky behavior.
test_results

Copy link
Member

@aaronbrethorst aaronbrethorst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for putting this together, @vedanthnyk25! This is excellent performance work that addresses a real pain point. Reducing the stops-for-route latency from 30+ seconds down to ~1.5s is a substantial improvement that will benefit users with large transit agencies.

The bulk-loading approach is well-designed, and I appreciate the careful attention to thread safety by using request-scoped calculators instead of modifying the global singleton. The addition of the database index on shapes(shape_id) is a smart complementary fix.


Issues to Fix

1. Missing Initialization Check in SetContextCache (Critical)

In internal/gtfs/advanced_direction_calculator.go, SetContextCache is missing the concurrency safety check that SetShapeCache has:

// SetContextCache sets a pre-loaded cache of stop shape context data to avoid database queries.
// This can improve performance when calculating directions for many stops.
// IMPORTANT: This must be called before any concurrent operations begin.
// Panics if called after CalculateStopDirection has been invoked.
func (adc *AdvancedDirectionCalculator) SetContextCache(cache map[string][]gtfsdb.GetStopsWithShapeContextRow) {
	adc.contextCache = cache  // Missing initialized check!
}

Compare to SetShapeCache which has the check:

func (adc *AdvancedDirectionCalculator) SetShapeCache(cache map[string][]gtfsdb.GetShapePointsWithDistanceRow) {
	if adc.initialized.Load() {
		panic("SetShapeCache called after concurrent operations have started")
	}
	adc.shapeCache = cache
}

What needs to change:

  • Add the same initialized.Load() check to SetContextCache for consistency and safety
  • The docstring already promises this behavior, so the code should match

2. Inconsistent Indentation (Minor)

In internal/gtfs/advanced_direction_calculator.go, lines 69-73 use 2-space indentation instead of the project's standard tab indentation:

func (adc *AdvancedDirectionCalculator) CalculateStopDirection(ctx context.Context, stopID string, gtfsDirection ...sql.NullString) string {
  if len(gtfsDirection) > 0 && gtfsDirection[0].Valid && gtfsDirection[0].String != "" {
    if direction := adc.translateGtfsDirection(gtfsDirection[0].String); direction != "" {
      return direction
    }
  }

What needs to change:

  • Run go fmt ./... to fix the indentation

Suggestions

Unused Query Added

The GetTripsByRoute query was added to query.sql but doesn't appear to be used anywhere in this PR. If it's not needed for this change, consider removing it to keep the PR focused. If it's for planned future work, that's fine too—just wanted to flag it.


What's Done Well

  • Excellent performance improvement with clear before/after metrics
  • Thread-safe design using request-scoped calculators
  • Proper database indexing to prevent full table scans
  • Good error handling for timezone loading with UTC fallback
  • Graceful handling of empty schedules (200 with empty data instead of 500)
  • Test updates to use explicit dates, preventing future CI failures from data expiration

Once the initialization check is added to SetContextCache and the formatting is fixed, this is good to merge!

@vedanthnyk25
Copy link
Author

vedanthnyk25 commented Jan 7, 2026

@aaronbrethorst
Thanks for the detailed review! I’m glad the performance improvements and the thread-safety design look solid.

I have pushed updates to address the items you noted:

Concurrency Safety: Added the missing initialized.Load() check to SetContextCache. It now panics correctly if called after operations have started, ensuring consistency with SetShapeCache.

Formatting: Ran go fmt to fix the indentation in advanced_direction_calculator.go.

Cleanup: Removed the unused GetTripsByRoute query from query.sql to keep the PR focused.

Everything should be good to go now!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Performance: 'stops-for-route' times out (>30s) due to N+1 query pattern

3 participants