-
Notifications
You must be signed in to change notification settings - Fork 13
binned points structure for sequencing-based technologies #337
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
add head/tail method imports for terra
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the WalkthroughMajor refactoring that modularizes S4 class definitions from a monolithic R/classes.R file into separate topic-specific modules. Introduces giottoBinPoints class for binned spatial point data with overlap calculation capabilities. Moves existing classes (giottoPolygon, giottoPoints, overlap classes, virtual bases, images) into dedicated files while consolidating internal constructors. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lintr found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 7
🤖 Fix all issues with AI agents
In @R/classes-binpoints.R:
- Around line 304-356: The function .gbp_create_overlap_point_dt mutates the
input S4 object's slot y@counts by adding feat_ID_uniq; to avoid this
side-effect, make a local deep copy of the counts data.table (e.g. counts_dt <-
data.table::copy(y@counts)), operate on counts_dt (add counts_dt$feat_ID_uniq <-
seq_len(nrow(counts_dt)) and use counts_dt in the join: overlap_data <-
counts_dt[overlap_data, on = .(j = count_j)]), and remove any direct writes to
y@counts so y is not mutated.
- Around line 166-197: createGiottoBinPoints currently mutates the incoming
spatial_locs in-place via "spatial_locs[] <- spatial_locs[][,
c('sdimx','sdimy')]", which can alter the caller's object if it has reference
semantics; make a local copy of spatial_locs at the start of the function (e.g.
via data.table::copy or assigning a new variable) and then subset that copy to
keep only "sdimx" and "sdimy" before converting to points (the code around the
spatial_locs[] <- ... and sl_points <- as.points(spatial_locs) lines); this
preserves the caller's object while preserving the existing logic for sl_points
and subsequent use in new("giottoBinPoints") and .gbp_compact.
In @R/classes-images.R:
- Around line 154-166: The logic in the block that sets attr(x, "max_window") is
inverted: it recalculates x@max_intensity only when x@max_intensity is NOT NULL.
Change the guard so you recalculate when x@max_intensity is missing (use if
(is.null(x@max_intensity) || is.na(x@max_intensity)) or equivalent) and then
assign x@max_intensity <- .spatraster_intensity_range(x@raster_object)[["max"]]
before computing attr(x, "max_window") <- .bitdepth(x@max_intensity, return_max
= TRUE); keep the rest of the flow unchanged.
In @R/classes-overlaps.R:
- Around line 146-148: The current check only tests the first 100 entries
(head(poly, 100)) and can miss later heterogeneous types; replace that
sample-based check with a full-column or more robust test (e.g., call
checkmate::test_integerish on the entire overlap_data$poly vector or use a
vectorized/class check like any(!vapply(overlap_data$poly, is.integer,
logical(1))) or similar) and then, if the full check fails, perform the
conversion overlap_data[, poly := match(poly, odt@spat_ids)]; ensure you
reference the same symbols (overlap_data, poly, odt@spat_ids,
checkmate::test_integerish) when making the change.
- Around line 131-135: The code filters NA in overlap_data$poly but still uses
overlap_data$feat_idx to subset y; ensure you remove rows with NA in feat_idx
before subsetting by changing the filter to exclude NA in feat_idx as well
(e.g., update overlap_data <- overlap_data[!is.na(poly) & !is.na(feat_idx), ] or
create a non-NA index vector and use that when subsetting y), then call ytab <-
terra::as.data.frame(y[overlap_data$feat_idx, keep]) so feat_idx contains no NAs
and the subsetting is safe.
In @R/classes-polygons.R:
- Around line 51-55: The NULL check is wrong because unique_ID_cache is
initialized to NA_character_; change the condition to test for NA and use direct
slot access: replace is.null(attr(gpoly, "unique_ID_cache")) with
is.na(gpoly@unique_ID_cache) (or equivalent slot(gpoly, "unique_ID_cache") if
preferred) and initialize via gpoly@unique_ID_cache <-
unique(gpoly@spatVector$poly_ID) instead of using attr(...) and as.list(...).
🧹 Nitpick comments (3)
R/classes-overlaps.R (1)
97-101: Add input validation.The constructor assumes
overlap_datais a valid data.table but doesn't verify structure or column presence before accessingncol().🛡️ Add validation
.create_overlap_intensity_dt <- function(overlap_data) { + checkmate::assert_data_table(overlap_data, min.cols = 2) odt <- new("overlapIntensityDT", data = overlap_data) odt@nfeats <- ncol(overlap_data) - 1L odt }man/as.polygons.Rd (1)
43-43: Consider whether the cross-reference should point to the generic or specific method.The cross-reference was changed from
as.data.table()toas.data.table.giottoBinPoints(), which makes it more specific but potentially less discoverable for users seeking generalas.data.tablefunctionality. Verify this aligns with your documentation strategy.R/classes-binpoints.R (1)
96-133: Consider removing commented-out code.The
calculateOverlapmethod is well-implemented with proper subset handling, centroid calculation, and flexible return options. However, lines 108-112 contain commented-out alternative implementation code that could be removed or moved to documentation if it's meant as a reference.Suggested cleanup
overlap_data <- terra::extract(x[], y@spatial) - # res <- terra::relate(x[], y@spatial, - # relation = "intersects", pairs = TRUE) |> - # data.table::as.data.table() - # names(res) <- c("poly_ID", "b") - # res <- res[, .gbp_spatial_select_counts(y, b), by = "poly_ID"] res <- .gbp_create_overlap_point_dt(x, y, overlap_data)
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (30)
DESCRIPTIONNAMESPACER/aggregate.RR/classes-binpoints.RR/classes-images.RR/classes-overlaps.RR/classes-points.RR/classes-polygons.RR/classes-utils.RR/classes-virtuals.RR/classes.RR/package_imports.Rman/as.data.table.Rdman/as.matrix.Rdman/as.points.Rdman/as.polygons.Rdman/featureNetwork-class.Rdman/giottoAffineImage-class.Rdman/giottoImage-class.Rdman/giottoLargeImage-class.Rdman/giottoPoints-class.Rdman/giottoPolygon-class.Rdman/miscData-class.Rdman/overlapPointDT-class.Rdman/processParam-class.Rdman/r_spatial_conversions.Rdman/svkey-class.Rdman/terraVectData-class.Rdman/updateGiottoPointsObject.Rdman/updateGiottoPolygonObject.Rd
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-06-16T17:23:54.804Z
Learnt from: jiajic
Repo: drieslab/GiottoClass PR: 300
File: R/methods-coerce.R:211-219
Timestamp: 2025-06-16T17:23:54.804Z
Learning: In R/methods-coerce.R, the as.matrix method for overlapIntensityDT intentionally creates a dense intermediate matrix before wrapping with Matrix::Matrix. This is appropriate because intensity overlaps are based on raster data which starts as dense, and the Matrix wrapper is used for consistency rather than sparsity optimization.
Applied to files:
man/overlapPointDT-class.RdNAMESPACEman/as.matrix.RdR/classes-overlaps.RR/classes-binpoints.RR/package_imports.Rman/as.data.table.RdR/aggregate.R
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: lintr
- GitHub Check: unit-tests
🔇 Additional comments (52)
man/processParam-class.Rd (1)
2-2: Verify source file reference is accurate.The roxygen header reference has been updated to point to R/classes-utils.R as part of the class modularization. Ensure that the
processParamclass is actually defined in this file.man/overlapPointDT-class.Rd (1)
2-2: Verify source file references are accurate.The roxygen header has been updated to reference R/classes-overlaps.R for the class definition and R/methods-extract.R for extraction methods. Ensure that
overlapPointDTis actually defined in R/classes-overlaps.R and that extraction methods remain in R/methods-extract.R.man/terraVectData-class.Rd (1)
2-2: Verify source file reference is accurate.The roxygen header reference has been updated to point to R/classes-virtuals.R as part of the class modularization. Ensure that the virtual class
terraVectDatais actually defined in this file.man/giottoImage-class.Rd (1)
2-2: LGTM! Documentation reference updated for modularization.The source file reference has been correctly updated to reflect the new modular organization where image classes are now defined in
R/classes-images.R.man/svkey-class.Rd (1)
2-2: LGTM! Documentation reference updated.The documentation now correctly points to
R/classes-utils.Rwhere thesvkeyclass is defined, consistent with the modular refactoring.man/giottoLargeImage-class.Rd (1)
2-2: LGTM! Documentation reference updated.The source reference has been correctly updated to
R/classes-images.Ras part of the class definition modularization.man/as.points.Rd (1)
38-38: Review the scope of the cross-reference change in the "Other As coercion functions" section.The reference has been narrowed from
as.data.table()to onlyas.data.table.giottoBinPoints(). However, multiple otheras.data.table.*methods exist in the codebase:
as.data.table.giottoPoints()as.data.table.giottoPolygon()as.data.table.SpatVector()The current reference to only
giottoBinPointsappears incomplete, especially since theas.points()function itself does not document or exemplify conversion ofgiottoBinPointsobjects. If the goal is to reference related coercion functions, consider whether otheras.data.table.*variants should also be cross-referenced, or clarify whygiottoBinPointsis specifically highlighted.man/giottoAffineImage-class.Rd (1)
2-2: LGTM! Documentation reference updated correctly.The documentation now correctly references
R/classes-images.Rfor thegiottoAffineImageclass definition, which is properly defined in that file as a class extendinggiottoLargeImage.man/giottoPolygon-class.Rd (1)
1-2: LGTM!The documentation reference correctly updated to point to the new modular file
R/classes-polygons.Ras part of the class definition reorganization.NAMESPACE (2)
14-14: LGTM!S3 method registration for
as.data.table.giottoBinPointscorrectly added to support the newgiottoBinPointsclass.
433-433: LGTM!Importing
headandtailmethods from terra properly enables S4 method dispatch for terra-based objects.Also applies to: 446-446
DESCRIPTION (2)
83-88: LGTM!The new modular class files are correctly added to the Collate field with appropriate ordering—virtual/base classes first, followed by concrete class implementations.
96-96: Verify placement ofclasses-binpoints.Rin Collate order.
classes-binpoints.Ris placed separately from the otherclasses-*.Rfiles (afterbuffer.Rrather than with lines 83-88). This may be intentional ifgiottoBinPointsdepends on buffer functionality, but please confirm this ordering is deliberate.R/aggregate.R (2)
1-4: LGTM!The
@include classes.Rdirective correctly establishes the collation dependency, ensuring class definitions are available before this file is processed.
806-808: LGTM!The comment helpfully documents that the constructor
.create_overlap_point_dtis now defined inclasses-overlaps.R, which aids maintainability after the modularization refactor.R/package_imports.R (1)
17-17: LGTM!Consolidating the terra method imports into a single
@importMethodsFromdirective improves readability while correctly addingheadandtailimports needed for the new functionality.R/classes-virtuals.R (5)
1-21: LGTM!Class unions and the
gIndextype are well-defined. The reference toallMatrixinzzz.R(line 13) follows the common R pattern of defining class unions that depend on package load order in the zzz file.
23-79: LGTM!The virtual base classes (
giottoSubobject,gdtData,nameData,exprData,coordDataDT) establish a clean inheritance hierarchy with appropriate slot definitions and prototypes.
90-193: LGTM!The data-holding virtual classes (
metaData,enrData,nnData,spatNetData,spatGridData) are well-structured with consistent use ofnullOrDatatablefor optional data.table slots and appropriate default prototypes.
199-265: LGTM!The provenance and spatial/feature tracking classes (
provData,spatData,featData,spatFeatData) correctly implement multiple inheritance patterns, enabling flexible composition for Giotto's data model.
267-297: LGTM!
miscDataandterraVectDatacomplete the virtual class hierarchy. ThemiscDataclass appropriately includes an example for documentation since it's a user-facing concept.man/miscData-class.Rd (1)
1-2: LGTM!The documentation reference correctly updated to
R/classes-virtuals.Rto match the new location of themiscDataclass definition.R/classes-overlaps.R (4)
5-91: LGTM: Well-structured class hierarchy.The virtual class hierarchy (overlapInfo → overlapPoint/overlapIntensity) and concrete implementations are well-designed. Documentation is comprehensive and includes helpful examples.
178-214: LGTM: Flexible dispatcher pattern.The dispatcher correctly handles multiple input types and uses recursion for nested list structures. The passthrough behavior (line 213) provides good extensibility for future types.
221-250: LGTM: Correct conversion logic.The function properly converts legacy SpatVector overlap data to the new overlapPointDT format, with appropriate NA filtering and index mapping.
252-264: LGTM: Appropriate intensity aggregation.The sum aggregation by
poly_IDis appropriate for intensity overlaps. Based on learnings, intensity overlaps derive from raster data which naturally starts dense.R/classes-polygons.R (2)
21-34: LGTM: Clean class definition.The giottoPolygon class is well-structured with appropriate inheritance and slot definitions. The prototype correctly initializes all slots.
70-84: LGTM: Packed class for serialization.The packedGiottoPolygon class correctly mirrors the main class structure with packed slots for use with terra's
wrap()generic.man/updateGiottoPolygonObject.Rd (1)
1-22: LGTM: Documentation source updated for modularization.The documentation correctly reflects the new modular file structure where polygon classes are now in R/classes-polygons.R.
man/r_spatial_conversions.Rd (1)
104-104: Theas.data.table.giottoBinPoints()method is implemented in R/classes-binpoints.R (line 154) and properly exported in NAMESPACE. No action required.Likely an incorrect or invalid review comment.
man/as.data.table.Rd (3)
35-36: LGTM on argument documentation cleanup.The consolidation of the
\dotsargument documentation is correct and improves clarity.
13-14: No changes needed. Theas.data.table.giottoBinPointsmethod implementation exists inR/classes-binpoints.Rand is properly registered in NAMESPACE. The documentation signature matches the implementation.
2-4: This documentation structure is intentional and standard.The
\name{as.data.table.giottoBinPoints}withas.data.tableas an alias is roxygen2's expected behavior when using@rdname as.data.tableto consolidate multiple S3 methods. All methods documented this way—giottoBinPoints,SpatVector,giottoPolygon, andgiottoPoints—are merged onto a single page, with roxygen2 selecting the alphabetically first method name as the primary identifier. The genericas.data.tableremains fully accessible via?as.data.tablethrough aliasing, so discoverability is not affected. This pattern is standard across R packages and properly documents method families.R/classes-points.R (3)
7-31: LGTM!The
giottoPointsclass definition is well-structured with appropriate slots, prototype defaults, and documentation. The class properly extendsfeatData,terraVectData, andgiottoSubobject.
34-60: LGTM!The
featureNetworkclass is properly defined with clear slot documentation. UsingANYtype for flexible slot contents is appropriate for this use case.
98-115: LGTM!The
packedGiottoPointsclass mirrors the structure ofgiottoPointsappropriately for use with thewrap()generic. The slots and prototypes are consistent with the main class.R/classes-utils.R (3)
5-25: LGTM!The
affine2dclass is well-defined with appropriate slot types and sensible prototype defaults for 2D affine transformations. The identity matrix default and zero-initialized transformations are correct.
27-40: LGTM!The
processParamvirtual class is properly defined with theVIRTUALcontains and exported via@exportClass. This provides a clean extension point for child classes.
44-67: LGTM!The
svkeyclass is appropriately marked as@keywords internaland provides a clean interface for referencing data retrieval from a giotto object. The use ofnullOrCharandnullOrLogicalunion types allows for flexible optional parameters.R/classes-images.R (3)
30-52: LGTM!The
giottoImageclass is properly defined for storing magick-based image objects with appropriate slots for spatial metadata, scaling, and file path information.
83-116: LGTM!The
giottoLargeImageclass is well-documented and properly extendsgiottoSubobject. The newcolorsandmax_windowslots are appropriately added with sensible defaults. The inline# REMOVE?comments onextentandoverall_extentslots indicate pending cleanup decisions.
131-138: LGTM!The
giottoAffineImageclass properly extendsgiottoLargeImageand adds theaffine2dslot for lazy affine transformations. The design supports deferred operations as documented.R/classes.R (2)
1-8: LGTM!The modularization of class definitions into separate files is well-organized. The
@includedirectives properly establish the dependency order: virtuals first, then specific class modules, ensuring classes are defined before they're referenced.
1043-1048: LGTM!The
giottoSpatialandspatialClassesclass unions properly reference the modularized class definitions from the included files. This maintains backward compatibility while supporting the new modular structure.R/classes-binpoints.R (8)
2-16: LGTM!The
giottoBinPointsclass is well-structured with clear slot purposes. Thecompactslot for tracking compaction state is a good design choice for optimizing memory usage.
19-29: LGTM!The
showmethod provides useful diagnostic output including dimensions, compact state, and a preview viahead(). The use of helper functions.show_feat()andprint_list()keeps the code clean.
31-37: Verify:dimreturns single column count.The
dimmethod returnsc(nrow(x@counts), 1L), always indicating 1 column. Verify this is intentional—typically spatial data might have multiple dimensions or the counts data.table might have multiple columns.
39-63: LGTM!The subsetting methods are well-designed with:
- Proper delegation from logical/character to numeric subsetting
- Auto-compaction logic to optimize memory after subsetting
- Standard R-style logical vector recycling
The
compact = "auto"parameter provides good flexibility.
72-94: LGTM!The
plotmethod handles both regular and density visualization modes cleanly. Input validation withcheckmate::assert_functionis good practice. The density transform option provides flexibility for data exploration.
139-149: LGTM!The
headandtailmethods properly handle edge cases withmin()andmax()guards to prevent out-of-bounds access.
151-161: LGTM!The
as.data.tableS3 method correctly transforms the internal sparse representation to a user-friendly format with proper column renaming.
253-260: LGTM!The
.gbp_get_spatial_sum_countsfunction correctly aggregates counts per spatial point using data.table syntax. The matching logic properly aligns the aggregated counts with spatial points.
also additional docs
- reworked using internal S4 generic - separation of concerns for subobject and backing files - save/load for giottoBinPoints subobjects resolved
featFeatureInfo for binpoints does work, but param `return_giottoPoints` must be `TRUE`
saving for
giottoBinPointsstill needs to be addressed.Summary by CodeRabbit
Release Notes
New Features
giottoBinPointsclass for managing binned spatial point data with subsetting, plotting, and overlap computation capabilities.Improvements
head,tail, andas.data.tablecoercion for binned points.✏️ Tip: You can customize this high-level summary in your review settings.