
Detect all local functions and imported functions recursively #155

Merged
samwaseda merged 37 commits into main from crawler on Feb 27, 2026

Conversation

@samwaseda
Member

Following today's pyiron meeting, I wrote a function that recursively detects all local functions and imported functions used in a function. In principle this ensures reproducibility, since all packages would be correctly tracked (including versions).

Example:

from flowrep import crawler
import math

def add(x, y):
    return x + y


def op(a, b):
    c = add(a, b)
    d = math.sqrt(c)
    return d


def more_op(a, b):
    c = op(a, b)
    return c

print(crawler.analyze_function_dependencies(more_op))

Output: ({<function add at 0x1034af690>, <function op at 0x103bdd220>}, {VersionInfo(module='math', qualname='sqrt', version='3.14.3')}).

I have no idea how reliably this works. Anyway, despite pyiron_snippets v.1.1.0 not being fully published yet, I wanted to open this PR so that @liamhuber can review it before I wake up tomorrow morning 😎.

@samwaseda samwaseda requested a review from liamhuber February 23, 2026 21:53
@github-actions

Binder 👈 Launch a binder notebook on branch pyiron/flowrep/crawler

@liamhuber
Member

I went ahead and merged main to drop the diff down to the two relevant files. Tests should work now too.

Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>
@codecov

codecov bot commented Feb 24, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 98.97%. Comparing base (0813585) to head (d81e55e).
⚠️ Report is 57 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #155      +/-   ##
==========================================
+ Coverage   98.94%   98.97%   +0.02%     
==========================================
  Files          27       28       +1     
  Lines        1901     1942      +41     
==========================================
+ Hits         1881     1922      +41     
  Misses         20       20              


Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>
And return a map between the version info and the usages.

Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>
Member

@liamhuber liamhuber left a comment


I'm not sure yet exactly how you plan to use it, but it's a nice direction. TIL about ast.NodeVisitor!

There are a couple of places where we can swap out the helper functions for the fully tested ones in flowrep.models.parsers.object_scope. Otherwise, my only big concern was the asymmetry of what you get back depending on whether the function is "local" or not.
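
For readers who have not met ast.NodeVisitor either, here is a minimal sketch of how a visitor can collect the names that are called inside a piece of source. CallNameCollector and called_names are hypothetical names for illustration; this is not the CallCollector from this PR, and it works on a source string rather than a function object.

```python
import ast


class CallNameCollector(ast.NodeVisitor):
    """Hypothetical visitor: records the callee of every call expression."""

    def __init__(self):
        self.names = []

    def visit_Call(self, node):
        # ast.unparse turns the callee expression back into source, e.g. "math.sqrt"
        self.names.append(ast.unparse(node.func))
        # Keep walking so that nested calls such as f(g(x)) are recorded too
        self.generic_visit(node)


def called_names(source):
    collector = CallNameCollector()
    collector.visit(ast.parse(source))
    return collector.names


SOURCE = """
def op(a, b):
    c = add(a, b)
    d = math.sqrt(c)
    return d
"""

print(called_names(SOURCE))  # ['add', 'math.sqrt']
```

The visitor only yields names as text; resolving them to actual objects (and deciding which are local and which are imported) is a separate step.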

liamhuber and others added 2 commits February 24, 2026 07:01
Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>
@samwaseda
Member Author

(I cherry-picked one commit because git diff was difficult to read)

@samwaseda samwaseda mentioned this pull request Feb 24, 2026
@samwaseda
Member Author

@liamhuber I didn't understand why you were returning a list of (same?) functions in the returned dictionary. Was it because my intention of this module was not clear?

Contributor

Copilot AI left a comment


Pull request overview

This PR introduces a new crawler module that provides functionality to recursively detect all callable dependencies of a Python function through AST introspection. The implementation tracks both local functions and imported functions, capturing version information where available to support reproducibility tracking.

Changes:

  • Added flowrep/crawler.py with core dependency tracking functionality
  • Added comprehensive unit tests in tests/unit/test_crawler.py
  • Implemented recursive dependency resolution with cycle detection
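
The recursive resolution with cycle detection listed above can be illustrated with a small sketch. To stay short it swaps the AST walk for the global names stored in each function's code object (func.__code__.co_names), so it is not the PR's implementation; local_function_closure is a hypothetical name.

```python
import types


def local_function_closure(func, seen=None):
    """Collect plain Python functions reachable from `func` via global names."""
    seen = set() if seen is None else seen
    for name in func.__code__.co_names:
        target = func.__globals__.get(name)
        # Only follow plain Python functions; builtins and modules are skipped here
        if isinstance(target, types.FunctionType) and target not in seen:
            seen.add(target)  # mark before recursing so that cycles terminate
            local_function_closure(target, seen)
    return seen


def ping(n):
    return n if n <= 0 else pong(n - 1)


def pong(n):
    return n if n <= 0 else ping(n - 1)


def entry(n):
    return ping(n)


# ping and pong call each other, yet the traversal stops after visiting each once
print(local_function_closure(entry) == {ping, pong})  # True
```

Marking a function as seen before descending into it is what keeps mutually recursive functions from causing infinite recursion.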

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 13 comments.

File: Description
  • flowrep/crawler.py: Core implementation with get_call_dependencies() for recursive dependency tracking, split_by_version_availability() for partitioning dependencies by version info, and CallCollector AST visitor for extracting function calls
  • tests/unit/test_crawler.py: Comprehensive test suite covering basic behavior, transitive dependencies, diamond patterns, cycle detection, and version availability partitioning
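
As a rough picture of what partitioning dependencies by version availability could look like, here is a sketch. The real signature of split_by_version_availability() is not shown in this conversation, so everything below is an assumption: VersionInfo is a stand-in with only the three fields visible in the PR output, an entry without version information is assumed to carry version=None, and split_by_version is a hypothetical name.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VersionInfo:
    """Stand-in for the PR's VersionInfo (fields taken from the PR output)."""
    module: str
    qualname: str
    version: Optional[str] = None


def split_by_version(deps):
    """Hypothetical partition: (entries with a version, entries without one)."""
    versioned = {info: fn for info, fn in deps.items() if info.version is not None}
    unversioned = {info: fn for info, fn in deps.items() if info.version is None}
    return versioned, unversioned


def add(x, y):
    return x + y


deps = {
    VersionInfo("math", "sqrt", "3.14.3"): math.sqrt,
    VersionInfo("__main__", "add"): add,  # assumed: local code has no version
}
versioned, unversioned = split_by_version(deps)
print(list(versioned.values()) == [math.sqrt])  # True
print(list(unversioned.values()) == [add])      # True
```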


@liamhuber
Member

> @liamhuber I didn't understand why you were returning a list of (same?) functions in the returned dictionary. Was it because my intention of this module was not clear?

Oh, really excellent catch. Because I did it right before bed and I was being dumb. The relevant changes are CallDependencies = dict[versions.VersionInfo, set[Callable]] and call_dependencies.setdefault(info, set()).add(caller).

I see you've switched it back to a single Callable in the dict -- I think this is a mistake and for generality it would be nice to hold everyone who depends on the same version info, but I'll look at what you did overnight after a cup of coffee to lubricate my brain 😂

@liamhuber
Member

> I think this is a mistake and for generality it would be nice to hold everyone who depends on the same version info, but I'll look at what you did overnight after a cup of coffee to lubricate my brain 😂

Yeah, you're 100% right, it's just CallDependencies = dict[versions.VersionInfo, Callable] because the VersionInfo directly contains the qualname and not just the module. 🙏 ☕

Then some of the AI complaints about "dependent" vs "dependents" are correct and need to be acted on.
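
To make the point about the key concrete: because the key includes the qualname, two different functions from the same module never collide, and the same function seen twice maps to the same entry. The sketch below uses a stand-in VersionInfo with the three fields from the PR output and a hypothetical info_for helper; the version string is just copied from the example output.

```python
import math
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class VersionInfo:
    """Stand-in for versions.VersionInfo (fields taken from the PR output)."""
    module: str
    qualname: str
    version: Optional[str] = None


def info_for(func, version):
    """Hypothetical helper: build the key from the function's own metadata."""
    return VersionInfo(func.__module__, func.__qualname__, version)


deps: dict[VersionInfo, Callable] = {}
for func in (math.sqrt, math.floor, math.sqrt):
    # A repeated function overwrites its own entry with the same object
    deps[info_for(func, "3.14.3")] = func

print(len(deps))  # 2
print(deps[VersionInfo("math", "sqrt", "3.14.3")] is math.sqrt)  # True
```

A set-valued mapping would only be needed if the key dropped the qualname and identified the module alone.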

liamhuber and others added 6 commits February 24, 2026 07:25
Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@liamhuber
Member

Copilot's comments on the tests are largely reasonable, and the coverage is still lacking, so a bit more there would be nice.

This is only reachable if something is identified by ast as a `Call`, but _isn't_ callable. This should only happen in contrived situations. Let's just fail hard and ask for clarity.
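
A minimal sketch of the "fail hard" behaviour described in this commit message, assuming called names are resolved against some scope mapping. resolve_call_target and the error wording are hypothetical and not taken from the PR.

```python
import math


def resolve_call_target(name, scope):
    """Hypothetical helper: look up a called name and insist that it is callable."""
    target = scope[name]
    if not callable(target):
        # Only reachable for contrived code such as `x = 3; x()`,
        # where the AST contains a Call but the object cannot be called
        raise TypeError(
            f"'{name}' is used in a call but resolves to non-callable {target!r}"
        )
    return target


scope = {"sqrt": math.sqrt, "x": 3}
print(resolve_call_target("sqrt", scope) is math.sqrt)  # True

try:
    resolve_call_target("x", scope)
except TypeError as error:
    print(error)
```

Raising instead of silently skipping the entry means unusual input surfaces immediately rather than producing an incomplete dependency list.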

Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>
@liamhuber
Member

I think I'm done here @samwaseda.

I believe it fairly does what it says it will do -- get_call_dependencies. Unfortunately, per my comment in the meeting thread, I'm actually afraid that this is insufficient for our real objective of reproducibility. I think what we really need is all the dependencies, not just the ones for calls, and I'm not sure I want to try to do that 😅
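
A small illustration of the gap between call dependencies and all dependencies, written as a sketch on a source string with plain ast.walk (not the PR's crawler): a function that only reads math.pi depends on math but contains no call at all.

```python
import ast

SOURCE = """
def area(r):
    return math.pi * r ** 2
"""

tree = ast.parse(SOURCE)

# Callees of every call expression: empty, since math.pi is an attribute access
called = [ast.unparse(node.func) for node in ast.walk(tree) if isinstance(node, ast.Call)]

# Every name that is read: this does surface the dependency on `math`
loaded = sorted(
    {
        node.id
        for node in ast.walk(tree)
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load)
    }
)

print(called)  # []
print(loaded)  # ['math', 'r']
```

Tracking every loaded name would catch cases like this, at the cost of having to separate real dependencies from arguments and local variables such as r.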

Member

@liamhuber liamhuber left a comment


@samwaseda, we talked about spinning some files off into a new repo at some point, but in the meantime I'm happy to merge these developments here. I would only try to avoid getting it too entangled with the other parsing. This is completely standalone and #164 is just making some QoL changes to the object_scope that are already in #161, so so far so good.

@samwaseda samwaseda merged commit 5491049 into main Feb 27, 2026
20 checks passed
@samwaseda samwaseda deleted the crawler branch February 27, 2026 07:20