Use SQLite IN for large scalar list filters #214

Open
wtsnz wants to merge 2 commits into ash-project:main from wtsnz:codex/sqlite-in-list-membership


@wtsnz wtsnz commented May 20, 2026

Contributor checklist

Leave anything that you believe does not apply unchecked.

  • I accept the AI Policy
  • Bug fixes include regression tests
  • Chores
  • Documentation changes
  • Features include unit/acceptance tests
  • Refactoring
  • Update dependencies

This came out of a local SQLite app I’m building. One route asks for availability across a lot of media IDs, and the Ash filter is pretty ordinary:

filter expr(media_item_id in ^arg(:media_item_ids) and best == true)

AshSqlite currently turns that into a long OR chain:

media_item_id = ? OR media_item_id = ? OR media_item_id = ? ...

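That works for small lists, but not for long ones. SQLite caps expression tree depth (SQLITE_MAX_EXPR_DEPTH, 1000 by default), and a left-deep OR chain adds one level per term, so a long enough list makes the query fail while it is being prepared, before it runs.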

I used Codex to help me explore some options, and I think these queries should use SQLite’s normal scalar IN (?, ...) shape for scalar lists, and keep the existing fallback behaviour for complex values.

This PR changes that generated SQL to:

media_item_id IN (?, ?, ...)

That moves this case from SQLite’s expression-depth limit to SQLite’s bind-variable limit (SQLITE_MAX_VARIABLE_NUMBER), which defaults to 32766 since SQLite 3.32.0 (999 before that) and depends on how SQLite is compiled. It also keeps the left-hand column uncast, which should preserve the normal indexed lookup plan.
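
A rough sketch of the compact shape, again in plain Python with the stdlib sqlite3 module and not the adapter code itself. The table, index name, `in_clause` helper, and data are assumptions for illustration, and it assumes SQLite 3.32.0 or later so that 5000 bind variables fit under the default limit:

```python
import sqlite3


def in_clause(column, n):
    # Build "col IN (?, ?, ...)" with n placeholders (hypothetical helper).
    placeholders = ", ".join("?" for _ in range(n))
    return f"{column} IN ({placeholders})"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE availability (media_item_id INTEGER, best INTEGER)")
conn.execute(
    "CREATE INDEX availability_media_item_id ON availability (media_item_id)"
)
conn.executemany(
    "INSERT INTO availability VALUES (?, ?)",
    [(i, 1 if i % 3 == 0 else 0) for i in range(10000)],
)

# 5000 ids: far more terms than the OR-chain shape can express by default.
ids = list(range(0, 10000, 2))
sql = (
    "SELECT count(*) FROM availability WHERE "
    + in_clause("media_item_id", len(ids))
    + " AND best = 1"
)
matched = conn.execute(sql, ids).fetchone()[0]

# The fourth column of EXPLAIN QUERY PLAN output is the human-readable detail;
# print it to see whether the index on media_item_id is being used.
plan = [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql, ids)]

print(matched)
print(plan)
```

The left-hand side stays a bare column reference here, which is what gives the planner the chance to use the index; the plan it actually picks can vary with the SQLite version and table statistics.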

History

It looks like the existing OR expansion was added as a fix or workaround. See ash-project/ash_sqlite@83ce541, “fix: remove list literal usage for in in ash_sqlite”.

There isn’t an explanatory PR or test attached to that commit, so I’m not 100% certain why it was implemented. My read is that AshSqlite needed to stop using the shared list-literal RHS path for SQLite, and expanding to OR was a safe local workaround because each left == value comparison reused the existing equality/type-casting path.

This PR keeps that fallback for complex RHS values, but uses direct IN binds for scalar lists.

Performance

I ran local comparisons to make sure the new IN shape didn’t make things worse: here (LLM written)

The compact IN shape is faster in the tested cases, and the old OR shape fails once it hits SQLite’s expression-depth limit.

Future

Based on feedback on my other PR, there may be a better longer-term shape where ash_sql owns the shared in operator flow and adapters provide capability/strategy hooks for backend-specific RHS rendering. I haven’t explored that properly yet. Anyone have any instincts here?

Validation

  • MIX_ENV=test mix test test/filter_test.exs
  • MIX_ENV=test mix test
  • mix format --check-formatted lib/sql_implementation.ex test/filter_test.exs
  • git diff --check -- lib/sql_implementation.ex test/filter_test.exs

wtsnz added 2 commits May 19, 2026 21:53
Generate scalar SQLite IN filters as compact IN clauses instead of left-deep OR trees, while keeping the existing OR fallback for complex RHS values.

Dump RHS values through the selected SQLite adapter so booleans, datetimes, decimals, UUIDs, and custom storage types keep the behavior the old equality path provided.
Add regression coverage for large and empty pinned lists, complex RHS fallback behavior, typed scalar values, and the typed integer SQL shape used to avoid left-hand casts.