Added parentage functions by josuechinchilla · Pull Request #59 · Breeding-Insight/BIGr

josuechinchilla · 2026-04-22T17:44:46Z

Added 2 parentage functions along with test files for them. updated imporFrom statemetns for ped_check

Development

Copilot

Pull request overview

This PR updates package exports/imports to support new pedigree/parentage-related functionality, including exposing validate_pedigree() and tightening check_ped() dependency declarations.

Changes:

Export validate_pedigree() and add its data.table roxygen imports.
Replace broad @import directives in check_ped() with more specific @importFrom entries.
Regenerate/update NAMESPACE accordingly (including adding tools::file_path_sans_ext and janitor::clean_names imports).

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File	Description
R/validate_pedigree.R	Adds `@importFrom data.table ...` for `validate_pedigree()` and ensures it’s exported.
R/check_ped.R	Switches roxygen imports to `@importFrom` for `dplyr`/`janitor` and adds `tools::file_path_sans_ext`.
NAMESPACE	Updates exports/imports to match roxygen (adds `export(validate_pedigree)`, switches janitor import style, adds additional `dplyr` imports).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Copilot

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-22T18:02:21Z

 #'     \item{Best_Female_Parent_Error_Pct}{Homozygous mismatch percentage for the best female parent candidate.}
 #'   }
 #' @export
+#' @importFrom data.table fread fwrite copy  data.table set rbindlist


The roxygen @importFrom data.table ... line is redundant with the current implementation, since all referenced data.table APIs are called via data.table::.... Consider either removing this @importFrom (and relying on data.table::), or dropping the data.table:: qualifiers and keeping the import—having both makes the dependency surface harder to reason about.

Suggested change

#' @importFrom data.table fread fwrite copy data.table set rbindlist

Copilot · 2026-04-22T18:02:22Z

+#' @importFrom janitor clean_names
 #' @importFrom stats setNames
 #' @importFrom utils read.table
+#' @importFrom tools file_path_sans_ext


These @importFrom tags for janitor::clean_names, stats::setNames, and tools::file_path_sans_ext are currently unused because the code calls them with explicit namespaces (janitor::, stats::, tools::). Consider removing the unused imports, or alternatively removing the namespace qualifiers if you want to rely on imports—mixing both makes it harder to audit dependencies.

Suggested change

#' @importFrom janitor clean_names

#' @importFrom stats setNames

#' @importFrom utils read.table

#' @importFrom tools file_path_sans_ext

#' @importFrom utils read.table

codecov · 2026-04-22T18:13:49Z

Codecov Report

❌ Patch coverage is 54.51713% with 146 lines in your changes missing coverage. Please review.
✅ Project coverage is 78.96%. Comparing base (ccc0952) to head (f798362).

Files with missing lines	Patch %	Lines
R/find_parentage.R	16.45%	66 Missing ⚠️
R/validate_pedigree.R	32.63%	64 Missing ⚠️
R/check_ped.R	89.11%	16 Missing ⚠️

Additional details and impacted files

@@               Coverage Diff               @@
##           development      #59      +/-   ##
===============================================
- Coverage        81.53%   78.96%   -2.58%     
===============================================
  Files               23       23              
  Lines             2616     2814     +198     
===============================================
+ Hits              2133     2222      +89     
- Misses             483      592     +109

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

…ement for documentation on check_ped

…and to plot results.

Copilot

Pull request overview

Copilot reviewed 12 out of 13 changed files in this pull request and generated 6 comments.

Comments suppressed due to low confidence (1)

man/validate_pedigree.Rd:90

The function always writes corrected_pedigree.txt to the working directory (even when write_results = FALSE), but this side-effect isn’t documented here. Please document this output file (including when it’s written and how it relates to write_results) so users aren’t surprised by unexpected files in their working directory.

\description{
Validates parent-offspring trios by calculating Mendelian error rates from
SNP genotype data. Identifies incorrect parentage assignments and suggests
best-matching replacements. If a list of founders is supplied, trios that
are declared founders (both parents coded as 0) are preserved unchanged
with no recommendations. Trios removed due to missing genotype data are
retained in the output with a NO_GENOTYPE_DATA status.
}

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-27T18:38:56Z

+      # Replace any empty strings with NA
+      for (col in base::names(final_df)) {
+        data.table::set(final_df, which(final_df[[col]] == ""), col, NA_character_)


In best_pair, the post-merge loop that replaces empty strings uses final_df[[col]] == "" for all columns and then assigns NA_character_ via data.table::set(). This can coerce numeric/integer columns (e.g., Markers_Tested) to character or error depending on data.table coercion rules. Restrict the empty-string cleanup to character columns (and assign NA_character_ only there), or use type-appropriate NA values per column.

Suggested change

# Replace any empty strings with NA

for (col in base::names(final_df)) {

data.table::set(final_df, which(final_df[[col]] == ""), col, NA_character_)

# Replace any empty strings with NA in character columns only

char_cols <- base::names(final_df)[

base::vapply(final_df, base::is.character, logical(1))

]

for (col in char_cols) {

empty_idx <- which(final_df[[col]] == "")

if (base::length(empty_idx) > 0) {

data.table::set(final_df, empty_idx, col, NA_character_)

}

Copilot · 2026-04-27T18:38:57Z

+          linewidth  = 1
+        ) +
+        ggplot2::scale_x_continuous(breaks = seq(0, 100, by = 5)) +
+        ggplot2::scale_y_continuous(breaks = seq(0, 10000, by = 5)) +


scale_y_continuous(breaks = seq(0, 10000, by = 5)) creates ~2000 y-axis breaks, which can significantly slow plot rendering and often produces unreadable axes. Consider removing the explicit y breaks (let ggplot pick), or use a much coarser/pretty break strategy based on the actual range (e.g., scales::pretty_breaks()).

Suggested change

ggplot2::scale_y_continuous(breaks = seq(0, 10000, by = 5)) +

ggplot2::scale_y_continuous(breaks = scales::pretty_breaks()) +

Copilot · 2026-04-27T18:38:57Z

  if (write_txt) {
-    output_filename <- "parentage_results_dt.txt"
+    output_filename <- "parentage_testing_results.txt"
    tryCatch({
-      data.table::fwrite(final_df, file = output_filename, sep = "\t", quote = FALSE)
+      data.table::fwrite(final_df, file = output_filename, sep = "\t", quote = FALSE,
+                         na = na_string)


The output filename for write_txt = TRUE is hard-coded to parentage_testing_results.txt. Since this used to be a different filename, this is a behavioral breaking change for scripts that relied on the old name, and there’s no way to override it. Consider adding an output_filename argument (with the current value as default) and/or supporting the previous filename via a deprecated option to avoid breaking existing workflows.

Copilot · 2026-04-27T18:38:57Z

+  #### check 4: missing parents ####
+  all_ids     <- unique(data$id)
+  ref_ids     <- unique(c(data$male_parent, data$female_parent))
+  ref_ids     <- ref_ids[ref_ids != "0"]


ref_ids <- ref_ids[ref_ids != "0"] does not drop NA values (because NA != "0" yields NA), so missing_ids <- setdiff(ref_ids, all_ids) can include NA. That can lead to a founder row with id = NA being added when correct = TRUE. Filter out NA explicitly (and consider trimming empty strings) before computing missing_ids.

Suggested change

ref_ids <- ref_ids[ref_ids != "0"]

ref_ids <- trimws(ref_ids)

ref_ids <- ref_ids[!is.na(ref_ids) & ref_ids != "" & ref_ids != "0"]

Copilot · 2026-04-27T18:38:58Z

+  #### zip export ####
+  if (save_zip) {
+    tmp_dir   <- tempfile()
+    dir.create(tmp_dir)


When save_zip = TRUE, a temporary directory is created (tmp_dir <- tempfile(); dir.create(tmp_dir)) but it’s never cleaned up. Add an on.exit(unlink(tmp_dir, recursive = TRUE), add = TRUE) (or equivalent) so tests/usage don’t leave temp folders behind.

Suggested change

dir.create(tmp_dir)

dir.create(tmp_dir)

on.exit(unlink(tmp_dir, recursive = TRUE), add = TRUE)

Copilot · 2026-04-27T18:38:58Z

+        ggplot2::theme(legend.position = "top")
+      print(p)
+      if (write_results) {
+        plot_filename <- base::sub("\\.[^.]+$", ".jpg", output_filename)


plot_filename <- sub("\\.[^.]+$", ".jpg", output_filename) only changes the name if output_filename already has an extension. If a user passes output_filename = "report", the plot would be saved as report (no .jpg). Consider building the plot filename with tools::file_path_sans_ext(output_filename) + .jpg to handle both cases reliably.

Suggested change

plot_filename <- base::sub("\\.[^.]+$", ".jpg", output_filename)

plot_filename <- paste0(tools::file_path_sans_ext(output_filename), ".jpg")

Copilot

Pull request overview

Copilot reviewed 12 out of 13 changed files in this pull request and generated 3 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-27T18:44:51Z

 export(updog2vcf)
+export(validate_pedigree)
 import(dplyr)
-import(janitor)
 import(parallel)
 import(quadprog)
 import(stringr)


janitor was removed from the NAMESPACE imports, and there are no remaining references in the R code. The package still lists janitor under Imports in DESCRIPTION, which will keep it as a hard dependency unnecessarily. Consider removing it from DESCRIPTION (or re-adding the NAMESPACE import if it's still intended to be required).

Copilot · 2026-04-27T18:44:52Z

      comparisons <- base::sum(!base::is.na(cand_hom) & !base::is.na(prog_hom))
      if (comparisons == 0) return(NA_real_)
      (base::sum(cand_hom != prog_hom, na.rm = TRUE) / comparisons) * 100
    })


In find_best_parent(), if all candidate mismatch rates are NA (e.g., no overlapping homozygous markers), which.min(errors) returns integer(0), so candidates[best_idx] becomes character(0) instead of NA. This can propagate empty strings/zero-length values into Best_*_Candidate and make downstream logic/tests brittle. Consider explicitly handling the all-NA case (e.g., if all(is.na(errors)) then return list(id = NA_character_, error_pct = NA_real_)).

Suggested change

})

})

if (base::all(base::is.na(errors)))

return(base::list(id = NA_character_, error_pct = NA_real_))

Copilot · 2026-04-27T18:44:52Z

+\item{write_results}{Logical. If TRUE, writes the validation results to
+output_filename and saves the plot as a .jpg file (default: TRUE).}

-\item{output_filename}{Character. Path/name of the output file
-(default: "pedigree_validation_results.txt").}
+\item{output_filename}{Character. Path/name of the output results file
+(default: "__validation_report.txt"). The plot will be saved using the
+same base name with a .jpg extension.}
+
+\item{plot_results}{Logical. If TRUE, prints a histogram of Trio Mendelian
+Error percentages with a threshold line (default: TRUE). If write_results
+is also TRUE, the plot is additionally saved as a .jpg file.}
+
+\item{na_string}{Character. String for missing values in the output file.


validate_pedigree() always writes corrected_pedigree.txt to the working directory, but the Rd docs don't mention this side effect or how (if at all) it can be disabled/configured. Since this can surprise callers (and is why tests now need temp working dirs), please document the corrected pedigree output (and ideally add a parameter to control its generation or output path).

Copilot

Pull request overview

Copilot reviewed 12 out of 13 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (1)

NAMESPACE:33

NAMESPACE no longer imports janitor, and a repo-wide search shows no remaining janitor:: usage. However, DESCRIPTION still lists janitor under Imports, which will typically trigger an R CMD check NOTE about unused imports. Either remove janitor from DESCRIPTION or reintroduce an actual use/import as appropriate.

import(dplyr)
import(parallel)
import(quadprog)
import(stringr)
import(tibble)
import(tidyr)
import(vcfR)

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-27T19:14:54Z

  detect_all_cycles <- function(data) {
    adj_list <- lapply(data$id, function(x) {
      row <- data[data$id == x, ]
-      c(row$sire, row$dam)
+      c(row$male_parent, row$female_parent)
    })
    names(adj_list) <- data$id



detect_all_cycles() builds adj_list using data$id and then sets names(adj_list) <- data$id. If id is duplicated (which is explicitly one of the issues this function detects), the adjacency list and the visited/rec_stack named vectors end up with duplicate names. Indexing by name (visited[node], adj_list[[node]]) then becomes ambiguous and can silently mis-detect or miss cycles. Consider running cycle detection on a de-duplicated/normalized pedigree (e.g., one row per id after duplicate removal / conflict resolution) or explicitly constructing the graph from unique(data$id) (or a key) so node names are unique.

Copilot · 2026-04-27T19:14:54Z

+    invisible(capture.output(
+      utils::zip(zipfile = zip_path, files = zip_files, flags = "-j")
+    ))
+    if (verbose) cat(paste0("\nZip archive saved to: ", zip_path, "\n"))


The save_zip path calls utils::zip(...) without any error handling. On systems without an available zip utility (or where getOption("zip") points to a missing binary), this will error and abort check_ped(), even though zipping is an optional feature. Wrap the zip creation in tryCatch() (and/or preflight with nzchar(Sys.which(getOption("zip")))) so failures produce a clear warning/error message without leaving partial output behind.

Suggested change

invisible(capture.output(

utils::zip(zipfile = zip_path, files = zip_files, flags = "-j")

))

if (verbose) cat(paste0("\nZip archive saved to: ", zip_path, "\n"))

zip_cmd <- getOption("zip")

zip_bin <- if (nzchar(zip_cmd)) Sys.which(zip_cmd) else ""

if (!nzchar(zip_bin)) {

warning(

paste0(

"Zip archive was not created because no usable zip utility was found. ",

"Configure options(zip = ...) to point to a valid zip binary."

),

call. = FALSE

)

} else {

zip_ok <- tryCatch({

invisible(capture.output(

utils::zip(zipfile = zip_path, files = zip_files, flags = "-j")

))

TRUE

}, error = function(e) {

if (file.exists(zip_path)) unlink(zip_path)

warning(

paste0("Zip archive could not be created: ", conditionMessage(e)),

call. = FALSE

)

FALSE

})

if (zip_ok && verbose) cat(paste0("\nZip archive saved to: ", zip_path, "\n"))

}

Copilot

Pull request overview

Copilot reviewed 12 out of 13 changed files in this pull request and generated 2 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-27T19:50:49Z

  detect_all_cycles <- function(data) {
    adj_list <- lapply(data$id, function(x) {
      row <- data[data$id == x, ]
-      c(row$sire, row$dam)
+      c(row$male_parent, row$female_parent)
    })
    names(adj_list) <- data$id

    dfs <- function(node, visited, rec_stack, path) {
-      visited[node] <- TRUE
+      visited[node]   <- TRUE
      rec_stack[node] <- TRUE
-      path <- append(path, node)
+      path   <- append(path, node)
      cycles <- list()
-
      for (neighbor in adj_list[[node]]) {
        if (neighbor %in% names(adj_list)) {
          if (!visited[neighbor]) {
            cycles <- append(cycles, dfs(neighbor, visited, rec_stack, path))
          } else if (rec_stack[neighbor]) {
            cycle_start <- match(neighbor, path)
            cycles <- append(cycles, list(path[cycle_start:length(path)]))
          }
        }
      }
-
      rec_stack[node] <- FALSE
      return(cycles)
    }

-    visited <- stats::setNames(rep(FALSE, length(adj_list)), names(adj_list))
+    visited   <- stats::setNames(rep(FALSE, length(adj_list)), names(adj_list))
    rec_stack <- stats::setNames(rep(FALSE, length(adj_list)), names(adj_list))
    all_cycles <- list()
-
    for (node in names(adj_list)) {
      if (!visited[node]) {
        node_cycles <- dfs(node, visited, rec_stack, character())


Cycle detection can error when the input pedigree contains duplicate IDs (which is exactly what check_ped is meant to detect). adj_list and visited/rec_stack are keyed by data$id (which may include duplicates), so expressions like if (!visited[node]) / if (!visited[neighbor]) can receive a vector and throw the condition has length > 1. Build the adjacency list over unique(data$id) (and aggregate parents per ID), ensuring visited/rec_stack have unique names before running DFS.

Copilot · 2026-04-27T19:50:49Z

+      plot_df <- final_df[!is.na(final_df$Mendelian_Error_Pct)]
+      plot_df$Mendelian_Error_Pct <- base::as.numeric(plot_df$Mendelian_Error_Pct)
+


The plot code has to coerce Mendelian_Error_Pct with as.numeric(), which suggests the results column is not consistently numeric. That makes downstream comparisons and summaries more error-prone (and can introduce NAs if formatting changes). Prefer keeping Mendelian_Error_Pct numeric in the returned data.table for all methods, and only format it as a string when writing/printing.

Copilot

Pull request overview

Copilot reviewed 12 out of 19 changed files in this pull request and generated 7 comments.

Files not reviewed (6)

man/allele_freq_poly.Rd: Language not supported
man/check_ped.Rd: Language not supported
man/find_parentage.Rd: Language not supported
man/imputation_concordance.Rd: Language not supported
man/solve_composition_poly.Rd: Language not supported
man/validate_pedigree.Rd: Language not supported

Comments suppressed due to low confidence (1)

DESCRIPTION:66

janitor remains listed under Imports, but there are no longer any janitor:: (or clean_names) usages in the R sources. This makes janitor an unnecessary hard dependency; consider removing it from DESCRIPTION Imports if it's truly unused.

Imports: 
    parallel,
    dplyr,
    Rdpack (>= 0.7),
    readr (>= 2.1.5),
    reshape2 (>= 1.4.4),
    rlang,
    tidyr (>= 1.3.1),
    vcfR (>= 1.15.0),
    Rsamtools,
    Biostrings,
    pwalign,
    janitor,
    quadprog,
    tibble,
    stringr,
    data.table
Suggests:

  data <- utils::read.table(ped.file, header = TRUE) %>%
-    janitor::clean_names() %>%
-    mutate(
-      id = as.character(id),
-      sire = as.character(sire),
-      dam = as.character(dam)
+    dplyr::mutate(
+      id            = as.character(id),
+      male_parent   = as.character(male_parent),
+      female_parent = as.character(female_parent)
+    )
+
+  # Validate required column names before any processing
+  required_cols <- c("id", "male_parent", "female_parent")
+  missing_cols  <- setdiff(required_cols, colnames(data))
+  if (length(missing_cols) > 0) {
+    stop(
+      "Input file is missing required column(s): ",
+      paste(missing_cols, collapse = ", "),
+      ".\nExpected columns: id, male_parent, female_parent."
    )


+check_ped <- function(ped.file,
+                      seed                           = NULL,
+                      verbose                        = TRUE,
+                      correct_conflicting_trios      = TRUE,
+                      correct_inconsistent_sex_roles = TRUE) {


  input_ped_report <- list(
-    exact_duplicates = exact_duplicates,
-    repeated_ids_diff = repeated_ids_report,
-    messy_parents = messy_parents,
-    missing_parents = missing_parents,
-    dependencies = data.frame(Dependency = unique(unlist(errors)))
+    exact_duplicates       = exact_duplicates,
+    conflicting_trios      = repeated_ids_report,
+    inconsistent_sex_roles = inconsistent_sex_roles,
+    missing_parents        = missing_parents,
+    dependencies           = data.frame(Dependency = unique(unlist(errors)),
+                                        stringsAsFactors = FALSE),
+    corrected_pedigree     = data %>% dplyr::select(-row_number)
  )


+test_that("check_ped returns a named list of length 5", {
+  ped <- data.frame(
+    id            = c("A", "B", "C"),
+    male_parent   = c("0", "A", "A"),
+    female_parent = c("0", "0", "0")
+  )
+  out <- check_ped(write_ped(ped), seed = 1, verbose = FALSE, correct = FALSE)

-  #Calculations
-  output.list <- check_ped(ped_file,
-                           seed = 101919,
-                           verbose = FALSE)
+  expect_type(out, "list")
+  expect_length(out, 5)
+  expect_named(out, c(
+    "exact_duplicates",
+    "repeated_ids_diff",
+    "inconsistent_sex_roles",
+    "missing_parents",
+    "dependencies"
+  ))


+check_ped(
+  ped.file,
+  seed = NULL,
+  verbose = TRUE,
+  correct = TRUE,
+  save_zip = FALSE,
+  save_corrected_zip = FALSE
+)
 }
 \arguments{
 \item{ped.file}{Path to the pedigree text file.}

-\item{seed}{Optional seed for reproducibility.}
+\item{seed}{Optional integer seed for reproducibility.}

-\item{verbose}{Logical. If TRUE (default), prints errors and prompts for interactive saving.}
+\item{verbose}{Logical. Print findings to the console? Default \code{TRUE}.}
+
+\item{correct}{Logical. Build and save a corrected pedigree? Default \code{TRUE}.}
+
+\item{save_zip}{Logical. Export report components as a \code{.zip} archive? Default \code{FALSE}.}
+
+\item{save_corrected_zip}{Logical. Include corrected pedigree in the \code{.zip}?
+Only applies when \code{save_zip = TRUE} and \code{correct = TRUE}. Default \code{FALSE}.}
 }


+importFrom(ggplot2,aes)
+importFrom(ggplot2,element_text)
+importFrom(ggplot2,geom_bar)
+importFrom(ggplot2,ggplot)
+importFrom(ggplot2,labs)
+importFrom(ggplot2,theme)
+importFrom(ggplot2,theme_minimal)


 #'
+#' @importFrom dplyr %>% filter arrange
 #' @importFrom stats reorder
+#' @importFrom ggplot2 ggplot aes geom_bar labs theme_minimal theme element_text


josuechinchilla and others added 3 commits April 22, 2026 13:27

Merge pull request #58 from Breeding-Insight/development

f590dad

Development

corrected importFrom statement in parentage functions and ped_check

20d2764

updated NAMESPACE to reflect updated importFrom statements

f2d3f40

josuechinchilla requested review from alex-sandercock and Copilot April 22, 2026 17:44

Copilot started reviewing on behalf of josuechinchilla April 22, 2026 17:45 View session

alex-sandercock added the in_progress not ready to merge label Apr 22, 2026

Copilot AI reviewed Apr 22, 2026

View reviewed changes

Comment thread R/check_ped.R Outdated

Update R/check_ped.R

c1fe543

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

josuechinchilla requested a review from Copilot April 22, 2026 17:58

Copilot started reviewing on behalf of josuechinchilla April 22, 2026 17:59 View session

Copilot AI reviewed Apr 22, 2026

View reviewed changes

josuechinchilla removed the request for review from alex-sandercock April 23, 2026 20:13

josue.chinchilla added 3 commits April 24, 2026 13:52

updated error handling and reporting for check_ped, Also major improv…

0a13483

…ement for documentation on check_ped

updated error handling and reporting for check_ped

9bbc507

updated parentage functions to work allow user to display NA or not, …

f798362

…and to plot results.

josuechinchilla requested a review from Copilot April 27, 2026 18:33

Copilot started reviewing on behalf of josuechinchilla April 27, 2026 18:33 View session

Copilot AI reviewed Apr 27, 2026

View reviewed changes

josuechinchilla requested a review from Copilot April 27, 2026 18:39

Copilot started reviewing on behalf of josuechinchilla April 27, 2026 18:40 View session

Copilot AI reviewed Apr 27, 2026

View reviewed changes

Made edits suggested by autopilot

03725d8

josuechinchilla requested a review from Copilot April 27, 2026 19:10

Copilot started reviewing on behalf of josuechinchilla April 27, 2026 19:10 View session

Copilot AI reviewed Apr 27, 2026

View reviewed changes

added edits suggested by copilot

506a47f

josuechinchilla requested a review from Copilot April 27, 2026 19:45

Copilot started reviewing on behalf of josuechinchilla April 27, 2026 19:45 View session

Copilot AI reviewed Apr 27, 2026

View reviewed changes

josuechinchilla requested review from alex-sandercock April 27, 2026 19:52

josue.chinchilla added 4 commits April 28, 2026 11:15

modified default params

977e070

Added authors fror functions and made headers concise

5db5fa2

staged comments for next push

6b84e18

removed zip option and added specific correction arguments

4eb79bb

alex-sandercock requested a review from Copilot May 20, 2026 20:26

Copilot started reviewing on behalf of alex-sandercock May 20, 2026 20:26 View session

Copilot AI reviewed May 20, 2026

View reviewed changes

-      # Replace any empty strings with NA
-      for (col in base::names(final_df)) {
-        data.table::set(final_df, which(final_df[[col]] == ""), col, NA_character_)
+      # Replace any empty strings with NA in character columns only
+      char_cols <- base::names(final_df)[
+        base::vapply(final_df, base::is.character, logical(1))
+      ]
+      for (col in char_cols) {
+        empty_idx <- which(final_df[[col]] == "")
+        if (base::length(empty_idx) > 0) {
+          data.table::set(final_df, empty_idx, col, NA_character_)
+        }

	ggplot2::scale_y_continuous(breaks = seq(0, 10000, by = 5)) +
	ggplot2::scale_y_continuous(breaks = scales::pretty_breaks()) +

	ref_ids <- ref_ids[ref_ids != "0"]
	ref_ids <- trimws(ref_ids)
	ref_ids <- ref_ids[!is.na(ref_ids) & ref_ids != "" & ref_ids != "0"]

	dir.create(tmp_dir)
	dir.create(tmp_dir)
	on.exit(unlink(tmp_dir, recursive = TRUE), add = TRUE)

	plot_filename <- base::sub("\\.[^.]+$", ".jpg", output_filename)
	plot_filename <- paste0(tools::file_path_sans_ext(output_filename), ".jpg")

-    })
+    })
+    if (base::all(base::is.na(errors)))
+      return(base::list(id = NA_character_, error_pct = NA_real_))

-    invisible(capture.output(
-      utils::zip(zipfile = zip_path, files = zip_files, flags = "-j")
-    ))
-    if (verbose) cat(paste0("\nZip archive saved to: ", zip_path, "\n"))
+    zip_cmd  <- getOption("zip")
+    zip_bin  <- if (nzchar(zip_cmd)) Sys.which(zip_cmd) else ""
+    if (!nzchar(zip_bin)) {
+      warning(
+        paste0(
+          "Zip archive was not created because no usable zip utility was found. ",
+          "Configure options(zip = ...) to point to a valid zip binary."
+        ),
+        call. = FALSE
+      )
+    } else {
+      zip_ok <- tryCatch({
+        invisible(capture.output(
+          utils::zip(zipfile = zip_path, files = zip_files, flags = "-j")
+        ))
+        TRUE
+      }, error = function(e) {
+        if (file.exists(zip_path)) unlink(zip_path)
+        warning(
+          paste0("Zip archive could not be created: ", conditionMessage(e)),
+          call. = FALSE
+        )
+        FALSE
+      })
+      if (zip_ok && verbose) cat(paste0("\nZip archive saved to: ", zip_path, "\n"))
+    }

		plot_df <- final_df[!is.na(final_df$Mendelian_Error_Pct)]
		plot_df$Mendelian_Error_Pct <- base::as.numeric(plot_df$Mendelian_Error_Pct)

Conversation

josuechinchilla commented Apr 22, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Copilot AI Apr 22, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 22, 2026

Choose a reason for hiding this comment

Uh oh!

codecov Bot commented Apr 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

codecov Bot commented Apr 22, 2026 •

edited

Loading