Zero-result flowers for valid high-citation authors despite OpenAlex presence #78

@semenoffalex

Summary

Influence Flower returns an empty/zero-result output for valid, high-citation authors, including:

  • Marcos López de Prado
  • Giuseppe Paleologo

Both authors are present in OpenAlex and have substantial citation records, so the current behavior is unexpected.

Expected behavior

When a user selects a valid author from search results, the app should either:

  1. render a non-empty flower (if author data exists in the scoring snapshot), or
  2. show an explicit message that the author exists in OpenAlex but is missing from the local scoring snapshot.

It should not fail silently with a zero-result flower.

Actual behavior

The app can return a zero-result/empty flower without a clear explanation for these authors.

Steps to reproduce

  1. Open Create New Flowers.
  2. Search for Marcos López de Prado (or Giuseppe Paleologo).
  3. Select the author and click Go.
  4. Observe that the flower is empty (zero results) and that no meaningful explanation is shown.

References

Google Scholar

OpenAlex (author exists)

Environment

  • macOS Sonoma 14.6 (Darwin 23.6.0)
  • Local run via Docker Compose

Investigation notes

During local debugging, two likely contributors were identified:

  1. Silent ID dropping in the runtime scoring path
    If the selected OpenAlex IDs are not present in the local binary snapshot, requests can degrade to empty results without clear user feedback.
  2. Preprocessing bug in reference extraction
    In konigsberg/preprocessor.py, generate_paper_references() sliced work IDs using the author prefix (PREFIX_AUTHOR) instead of the work prefix (PREFIX_WORK), which can corrupt the citation-reference edges in the generated data.
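
To illustrate the second point, here is a minimal sketch of prefix-checked ID parsing. The prefix values and the helper names `strip_prefix` and `parse_work_id` are assumptions for illustration only; the real constants and parsing logic live in konigsberg/preprocessor.py and may differ. The idea is that checking the prefix before slicing makes a PREFIX_AUTHOR/PREFIX_WORK mix-up fail loudly instead of silently producing wrong IDs.

```python
# Hypothetical prefix values, assumed for this sketch; the project's
# actual PREFIX_AUTHOR / PREFIX_WORK constants may be defined differently.
PREFIX_AUTHOR = "https://openalex.org/A"
PREFIX_WORK = "https://openalex.org/W"


def strip_prefix(raw_id: str, prefix: str) -> int:
    """Return the numeric part of an OpenAlex ID after validating its prefix."""
    if not raw_id.startswith(prefix):
        # Blind slicing by len(prefix) would not notice this mismatch.
        raise ValueError(f"{raw_id!r} does not start with {prefix!r}")
    return int(raw_id[len(prefix):])


def parse_work_id(raw_id: str) -> int:
    """Parse a work ID using the work prefix (hypothetical helper)."""
    return strip_prefix(raw_id, PREFIX_WORK)
```

With this shape, passing a work ID through the author prefix raises a `ValueError` rather than yielding a corrupted edge, which also gives a regression test something concrete to assert on.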

Suggested fix direction

  • Return an explicit API response when requested IDs are missing from the local snapshot.
  • Surface a user-facing warning in the UI when the snapshot and the OpenAlex API disagree.
  • Correct the OpenAlex work ID prefix parsing in preprocessing.
  • Add regression tests for both:
    • preprocessing ID parsing correctness
    • non-silent runtime handling for unresolved IDs
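
The runtime part of this direction could be sketched roughly as follows. `ScoreResponse`, `resolve_ids`, and the warning text are hypothetical names for illustration, not the project's actual API; the sketch assumes the snapshot can be queried as a set of known IDs.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Optional, Set


@dataclass
class ScoreResponse:
    """Hypothetical response shape that reports unresolved IDs explicitly."""
    resolved_ids: List[str] = field(default_factory=list)
    missing_ids: List[str] = field(default_factory=list)
    warning: Optional[str] = None


def resolve_ids(requested: Iterable[str], snapshot_ids: Set[str]) -> ScoreResponse:
    """Split requested IDs into those present in the snapshot and those missing."""
    response = ScoreResponse()
    for openalex_id in requested:
        if openalex_id in snapshot_ids:
            response.resolved_ids.append(openalex_id)
        else:
            response.missing_ids.append(openalex_id)
    if response.missing_ids:
        # Attach a message the UI can show instead of an empty flower.
        response.warning = (
            f"{len(response.missing_ids)} requested ID(s) exist in OpenAlex "
            "but are missing from the local scoring snapshot."
        )
    return response
```

A regression test for the non-silent path can then assert that `missing_ids` is non-empty and `warning` is set whenever an unknown ID is requested.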

Additional context

I can provide a tested patch branch with:

  • runtime missing-ID signaling,
  • UI mismatch messaging,
  • preprocessing fix,
  • focused tests.
