Add thumbnail, template and technique columns to image-bearing queries #41

Merged
Robbie1977 merged 2 commits into main from feature/add-thumbnails-to-image-queries on May 27, 2026
@Robbie1977

Summary

Six /run_query query types previously omitted the image/template/technique data that v2 prod's SOLR backend used to surface. This blocks the v2 → VFBquery migration on geppetto-vfb: v2-dev shows results with no thumbnails for queries that prod renders with full image cards.

The downstream geppetto-vfb processor (VFBqueryJsonProcessor) already understands the canonical markdown forms — thumbnail, template, technique are wired into its COL_HEADER_MAP (→ V2 displayNames Images, Template_Space, Imaging_Technique). The gap was purely upstream: these Cypher functions weren't fetching the channel/template/in_register_with paths.

Changes

Each affected function now does:

OPTIONAL MATCH (primary)<-[:depicts]-(channel:Individual)-[ri:in_register_with]->(:Template)-[:depicts]->(templ:Template)
OPTIONAL MATCH (channel)-[:is_specified_output_of]->(technique:Class)

and returns template, technique, thumbnail columns using the identical synthesis pattern already in get_similar_neurons (line ~2469) and the term-info path (vfb_queries.py:1965).
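As an illustration of the column synthesis described above, here is a minimal Python sketch of the two markdown shapes the downstream processor expects: `[![alt](url 'alt')](ref)` for thumbnails and `[label](id)` for template/technique (both forms are quoted in the commit message). The helper names `thumbnail_markdown` and `link_markdown` are hypothetical and are not functions in vfb_queries.py; the real synthesis lives in the Cypher/Python code at the canonical sites named above and may differ in detail.

```python
from typing import Optional


def thumbnail_markdown(label: str, thumbnail_url: Optional[str], ref: str) -> str:
    """Build `[![alt](url 'alt')](ref)`; return '' when no image is available.

    Hypothetical helper: an entity with no channel data yields a null URL from
    the OPTIONAL MATCH, which is mapped here to an empty thumbnail cell.
    """
    if not thumbnail_url:
        return ""
    return f"[![{label}]({thumbnail_url} '{label}')]({ref})"


def link_markdown(label: Optional[str], short_form: Optional[str]) -> str:
    """Build `[label](id)` for a template or technique; '' if either is missing."""
    if not label or not short_form:
        return ""
    return f"[{label}]({short_form})"
```

Returning an empty string for missing values mirrors the stated intent that rows without channel data keep their place in the result set and simply carry an empty thumbnail.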

Functions changed:

  • get_similar_morphology_part_of (NBLASTexp, neuron → expression)
  • get_similar_morphology_part_of_exp (NBLASTexp, expression → neuron)
  • get_similar_morphology_nb (NeuronBridge, neuron)
  • get_similar_morphology_nb_exp (NeuronBridge, expression)
  • get_dataset_images (channel/template already in scope, just added technique + thumbnail synthesis)
  • get_all_aligned_images (re-bound anonymous channel/template nodes)

Schema entries updated to expose the new columns via preview_columns:

  • SimilarMorphologyToPartOf, …PartOfexp, …NB, …NBexp
  • DatasetImages, AllAlignedImages

encode_markdown_links now covers template and thumbnail in each function.

Tests

python3 -c "import ast; ast.parse(open('src/vfbquery/vfb_queries.py').read())" parses clean. No existing test changes needed — test_similar_morphology.py asserts presence of ['id','name','score','tags'] via assertIn, which still holds; new columns are additive.
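To show why additive columns do not disturb an `assertIn`-style check, here is a small self-contained sketch. It does not reproduce test_similar_morphology.py; the class name, the helper `missing_required`, and the header lists are illustrative assumptions based only on the column names given above.

```python
import unittest

# Columns the existing test is described as requiring.
REQUIRED = ["id", "name", "score", "tags"]


def missing_required(headers):
    """Return the required columns that are absent from `headers`."""
    return [col for col in REQUIRED if col not in headers]


class AdditiveColumnsSketch(unittest.TestCase):
    """Illustrative only: membership checks pass before and after the change."""

    def test_membership_survives_new_columns(self):
        old_headers = ["id", "name", "score", "tags"]
        new_headers = old_headers + ["template", "technique", "thumbnail"]
        for headers in (old_headers, new_headers):
            for col in REQUIRED:
                self.assertIn(col, headers)
```

A test that compared the header list for equality would break on the new columns; a membership check does not, which is the property the paragraph above relies on.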

It is worth running the existing test suite against a Neo4j instance on this branch before merging. The OPTIONAL MATCH chain matches the canonical get_similar_neurons pattern, so entities without channel data should behave as before and simply return a null/empty thumbnail. A live smoke test on a known NBLASTexp entity (e.g. VFB_00101383 for SimilarMorphologyToPartOf) would confirm that row counts haven't regressed.

Live API gap (pre-fix, for reference)

```
GET /run_query?id=VFB_00101383&query_type=SimilarMorphologyToPartOf
→ returns: id, name, score, tags (4 columns)
prod : id, name, score, tags, type, template, technique, thumbnail
```

Same pattern across the six functions touched here.
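The gap in the listing above can be expressed as a simple column diff. This is a sketch using only the column names shown there; `missing_columns` is a hypothetical helper, not part of VFBquery.

```python
# Column lists copied from the pre-fix comparison above.
PRE_FIX_COLUMNS = ["id", "name", "score", "tags"]
PROD_COLUMNS = ["id", "name", "score", "tags", "type", "template", "technique", "thumbnail"]


def missing_columns(actual, expected):
    """Return the expected columns absent from `actual`, in expected order."""
    present = set(actual)
    return [col for col in expected if col not in present]


print(missing_columns(PRE_FIX_COLUMNS, PROD_COLUMNS))
# -> ['type', 'template', 'technique', 'thumbnail']
```

Note that the diff also lists `type`, while this description names only template, technique and thumbnail as the added columns; whether `type` is covered is not stated here.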

Downstream

geppetto-vfb v2.2.4.12 + datasources VFBv2.3.5 already handle these column ids natively. Once VFBquery releases with this PR and v3-cached.virtualflybrain.org picks up the new image, v2-dev's renders will gain Images / Template_Space / Imaging_Technique columns automatically — no further geppetto-vfb side change needed.

Related

Audit notes from the v2 → VFBquery migration tracking this gap:

  • projects/geppetto-vfbquery-migration/VFBQUERY_API_AUDIT.md
  • projects/geppetto-vfbquery-migration/VFBQUERY_THUMBNAIL_PATCHES.md

Six queries previously omitted image/template data from their /run_query
response, which v2 prod's SOLR backend used to surface. The geppetto-vfb
v2 processor already understands the canonical `[![alt](url 'alt')](ref)`
markdown form and the `[label](id)` template/technique markdown — these
were already in COL_HEADER_MAP (thumbnail->Images, template->Template_Space,
technique->Imaging_Technique). The gap was upstream: the Cypher functions
weren't fetching the channel/template/in_register_with paths.

Each affected function now adds:
  OPTIONAL MATCH (primary)<-[:depicts]-(channel:Individual)-[ri:in_register_with]->(:Template)-[:depicts]->(templ:Template)
  OPTIONAL MATCH (channel)-[:is_specified_output_of]->(technique:Class)
and returns `template`, `technique`, `thumbnail` columns using the
identical synthesis pattern at the existing canonical site
get_similar_neurons (line ~2469) and the term-info path
(vfb_queries.py:1965).

Functions changed:
  - get_similar_morphology_part_of      (NBLASTexp, neuron -> expression)
  - get_similar_morphology_part_of_exp  (NBLASTexp, expression -> neuron)
  - get_similar_morphology_nb           (NeuronBridge, neuron)
  - get_similar_morphology_nb_exp       (NeuronBridge, expression)
  - get_dataset_images                  (channel/template already in scope)
  - get_all_aligned_images              (re-bound anonymous nodes)

Schema entries updated to expose new columns via preview_columns:
  - SimilarMorphologyToPartOf, *Of_exp, NB, NB_exp
  - DatasetImages, AllAlignedImages

encode_markdown_links now covers 'template' and 'thumbnail' in each fn.

Verified python parse OK. No test changes — existing
test_similar_morphology.py asserts presence of ['id','name','score','tags']
(uses assertIn), which still holds.

Refs: VFB v2 migration to VFBquery — v2-dev was missing thumbnails on
~10 queries that prod rendered. See:
projects/geppetto-vfbquery-migration/VFBQUERY_THUMBNAIL_PATCHES.md
projects/geppetto-vfbquery-migration/VFBQUERY_API_AUDIT.md

python-test.yml's "Run term_info_queries_test" step sets
VFBQUERY_CACHE_ENABLED=false. With caching disabled, every get_term_info()
call is a fresh Neo4j + SOLR round-trip rather than a cache hit, so the
10s total-time budget is well below the live uncached latency
(observed ~20s for FBbt_00003748 and ~8s for VFB_00101567 on healthy
infrastructure).

Branch the assertion thresholds on the env flag:

  cache_enabled  → max_single 10s, max_total 10s   (unchanged)
  cache_disabled → max_single 30s, max_total 45s   (uncached uplift)

Cache-disabled budget matches observed live latency for the chosen
test terms. The intent of the assertion is still preserved: it flags
real regressions on the underlying query path, just at a realistic
threshold given which path is actually being exercised.
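
A minimal sketch of the branching described above, assuming the flag is read
from the environment and that only the literal value "false" (any case)
disables caching. The function names and the default for an unset variable
are assumptions, not the test's actual code:

```python
import os


def cache_enabled(env=None) -> bool:
    """Assumption: caching counts as enabled unless the flag is exactly 'false'."""
    env = os.environ if env is None else env
    return env.get("VFBQUERY_CACHE_ENABLED", "true").strip().lower() != "false"


def time_budgets(env=None) -> dict:
    """Return the assertion thresholds (seconds) for the path being exercised."""
    if cache_enabled(env):
        return {"max_single": 10.0, "max_total": 10.0}  # cached path, unchanged
    return {"max_single": 30.0, "max_total": 45.0}      # uncached uplift


budgets = time_budgets({"VFBQUERY_CACHE_ENABLED": "false"})
print(budgets["max_single"], budgets["max_total"])
# -> 30.0 45.0
```

Passing the environment as a parameter keeps the thresholds testable without
mutating os.environ.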

Pre-existing failure unrelated to PR #41's thumbnail/template/technique
column additions — same test was failing on main commit b0047f8
("Cache fixes for version changes") immediately before this branch was
created.

Refs: VFBquery PR #41 CI.
Robbie1977 merged commit 7033bef into main on May 27, 2026
2 of 6 checks passed
Robbie1977 added a commit that referenced this pull request May 27, 2026
… off on 5xx

Two related fixes for the cache-layer behaviour that's been flooding CI
logs with "Failed to cache result: HTTP 500" and adding ~200-300ms per
query while the SOLR vfb_json core has a broken Lucene index
(`/var/solr/data/vfb_json/data/index/_3koa1.fdm: Input/output error` →
`IndexWriter is closed`).

1) `@with_solr_cache` was firing unconditionally. The __init__.py guard
   only gated the second-layer `patch_vfbquery_with_caching()` patch,
   but the @with_solr_cache decorator is applied at module-import time
   to functions in vfb_queries.py (term_info, instances, templates,
   neurons_part_here, etc.) and was running its full cache write/read
   path even when VFBQUERY_CACHE_ENABLED=false. With caching disabled
   on a broken SOLR backend, every call still made the failing write
   attempt. Fix: respect the env flag inside the wrapper — if disabled,
   pop force_refresh (so the wrapped function doesn't see a stray
   kwarg it can't accept) and call straight through.

2) cache_result() backed off only on `Exception`. HTTP 5xx responses
   (Lucene IndexWriter closed, SOLR proxy 502/503) hit the `else:`
   branch which just logged and returned False — every subsequent call
   re-attempted the same write, hitting the same 5xx, costing 200-300ms
   each time and producing a multi-KB stack trace per call. Fix: treat
   any 5xx as cause to set _solr_disabled and start the same backoff
   window the exception path uses. 4xx still logs once but doesn't
   disable (likely a payload/config issue, not server-down).
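
The first fix can be sketched as follows. This is not the real
`@with_solr_cache` implementation: `with_cache_sketch` uses an in-memory dict
in place of SOLR, and the flag parsing is an assumption. It only illustrates
the control flow described in 1): pop force_refresh, then call straight
through when caching is disabled.

```python
import functools
import os


def caching_enabled() -> bool:
    """Assumption: only the literal value 'false' disables caching."""
    return os.environ.get("VFBQUERY_CACHE_ENABLED", "true").strip().lower() != "false"


def with_cache_sketch(func):
    """Hypothetical stand-in for a caching decorator (dict-backed, not SOLR)."""
    store = {}

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Always strip force_refresh so the wrapped function never receives
        # a keyword argument it does not accept.
        force_refresh = kwargs.pop("force_refresh", False)
        if not caching_enabled():
            # Disabled: no cache read and no cache write, just the real call.
            return func(*args, **kwargs)
        key = (args, tuple(sorted(kwargs.items())))
        if force_refresh or key not in store:
            store[key] = func(*args, **kwargs)
        return store[key]

    wrapper.store = store  # exposed for inspection in this sketch only
    return wrapper
```

Checking the flag inside the wrapper, at call time, matters because the
decorator itself is applied at import time, before any guard in __init__.py
can take effect.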

Net effect on the python-test workflow:
- VFBQUERY_CACHE_ENABLED=false → no cache writes attempted at all
- VFBQUERY_CACHE_ENABLED=true on a broken backend → one warning, then
  fast-fail for the backoff window
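
The "one warning, then fast-fail" behaviour can be sketched with a small
backoff object. The class `WriteBackoff`, the function `cache_write`, and the
60-second window are illustrative assumptions; the real code uses
`_solr_disabled` and its own window, whose details are not shown here.

```python
import time


class WriteBackoff:
    """Hypothetical helper: suspend cache writes for a window after a 5xx."""

    def __init__(self, window_seconds=60.0, clock=time.monotonic):
        self.window_seconds = window_seconds
        self._clock = clock
        self._disabled_until = None  # None means writes are allowed

    def allowed(self) -> bool:
        return self._disabled_until is None or self._clock() >= self._disabled_until

    def record_status(self, status_code: int) -> bool:
        """Return True on 2xx; start the backoff window on any 5xx."""
        if 200 <= status_code < 300:
            return True
        if 500 <= status_code < 600:
            self._disabled_until = self._clock() + self.window_seconds
        # 4xx falls through: reported as a failure, but writes stay enabled.
        return False


def cache_write(backoff: WriteBackoff, send) -> bool:
    """`send` performs the HTTP write and returns its status code."""
    if not backoff.allowed():
        return False  # fast-fail: no request is made during the window
    return backoff.record_status(send())
```

Injecting the clock makes the window testable without sleeping.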

Server-side: the broken Lucene segment on the SOLR vfb_json core is a
separate sysadmin issue (filesystem I/O error on the SOLR host). This
PR doesn't fix it, but it stops the failure from cascading into every
test run.

Refs: PR #41 CI logs showing the storm.