45 add method to evaluate traffic dynamics #60
Merged: gmatosferreira merged 16 commits into main from 45-add-method-to-evaluate-traffic-dynamics on Jan 30, 2026.
Changes from all commits (16):

- e53390a: #45 rt_collect methods to scrap GTFS-RT data
- bf3cb48: #45 rt_collect methods improved to gather data in a single csv file
- 2a1b8a2: #45 manuals updated
- e9ca1be: #45 rt_extend_prioritization method kick-off
- 93dd15d: #45 rt_collect methods scrap sleep time feedback improved
- 4f867be: #45 rt_extend_prioritization method implemented
- e44c708: #45 rt_extend_prioritization method adjustments
- 3494e25: #45 inst/extdata/gtfs_sources_pt extended with GTFS-RT urls (when exist)
- c7b59ae: #45 rt methods docs
- d2031c3: #45 vignette for rt
- dc87364: #45 vignette for prioritization extended with information on rt
- fcc1259: #45 rt vignette missing detail on rt_collect_protobuf
- ae2f2d8: #45 scrap typo corrected to scrape
- 1735ed2: Apply suggestions from code review
- 9055389: Merge branch '45-add-method-to-evaluate-traffic-dynamics' of github.c…
- 1336a26: #45 copilot reviewed corrections
New file (114 lines added):

```r
#' Collect GTFS-RT data
#'
#' @param gtfs_rt_url String. URL of the GTFS-RT feed in JSON format.
#' @param destination_file String. File to save the downloaded GTFS-RT data. Content is appended in each iteration.
#' @param header_key String (Default "header"). Key in the JSON corresponding to the feed header. Set to NA if not present.
#' @param entity_key String (Default "entity"). Key in the JSON corresponding to the feed entities. Set to NA if the response is a flat list.
#' @param fields_collect Character vector. Fields to extract from each entity in the feed.
#' @param scrape_interval Integer (Default 60). Interval in seconds between downloads. Negative to run only once.
#' @param log_file String (Optional). Path to a log file to save download logs.
#'
#' @details
#' Downloads GTFS-RT data from the specified URL at regular intervals and appends it to the destination file.
#'
#' Unless \code{scrape_interval} is negative, this function runs indefinitely until manually stopped.
#'
#' @examples
#' \dontrun{
#' GTFShift::rt_collect("https://api.example.com/gtfs-rt", "gtfs_rt_data.csv")
#' }
#'
#' @import jsonlite
#' @import progress
#'
#' @export
rt_collect <- function(
    gtfs_rt_url, destination_file,
    header_key = "header",  # Optional
    entity_key = "entity",
    fields_collect = c("id", "vehicle.trip.trip_id", "vehicle.position.latitude", "vehicle.position.longitude", "vehicle.position.speed", "vehicle.timestamp", "vehicle.current_status", "vehicle.current_stop_sequence", "vehicle.stop_id"),
    scrape_interval = 60, log_file = NA
) {
  # Log script start
  m <- sprintf("[%s] Starting GTFS-RT data collection from %s", format(Sys.time(), "%Y-%m-%d %H:%M:%S"), gtfs_rt_url)
  message(m)
  if (!is.na(log_file)) cat(paste(m, "\n"), file = log_file, append = TRUE)

  # Every scrape_interval seconds, download the GTFS-RT feed and append it to the destination file
  count <- 0
  repeat {
    count <- count + 1
    timestamp <- format(Sys.time(), "%Y%m%d_%H%M%S")
    feed <- jsonlite::fromJSON(gtfs_rt_url)

    if (!is.na(entity_key)) {
      entities <- as.data.frame(feed[[entity_key]])
    } else {
      entities <- feed
    }

    # For each field in fields_collect, extract the data and add it to the data frame
    feed_df <- data.frame()
    for (field in fields_collect) {
      field_parts <- unlist(strsplit(field, "\\."))
      field_data <- entities
      for (part in field_parts) {
        if (part %in% names(field_data)) {
          field_data <- field_data[[part]]
        } else {
          field_data <- NA
          break
        }
      }
      if (nrow(feed_df) == 0) {
        feed_df <- data.frame(field_data)
        names(feed_df) <- field
      } else {
        feed_df[[field]] <- field_data
      }
    }

    # Carry over feed-level metadata, if a header is present
    if (!is.na(header_key)) {
      header <- feed[[header_key]]
      if ("timestamp" %in% names(header)) {
        feed_df$feed_timestamp <- header$timestamp
      }
      if ("incrementality" %in% names(header)) {
        feed_df$feed_incrementality <- header$incrementality
      }
    }

    utils::write.table(
      feed_df,
      file = destination_file,
      sep = ",",
      row.names = FALSE,
      col.names = !file.exists(destination_file),  # only write header if file is new
      append = TRUE
    )

    m <- sprintf("[%s] Iteration %d completed", timestamp, count)
    message(m)
    if (!is.na(log_file)) cat(paste(m, "\n"), file = log_file, append = TRUE)

    # Wait for scrape_interval seconds before the next download
    if (scrape_interval < 0) {
      break
    }
    interval_start <- Sys.time()
    pb <- progress::progress_bar$new(  # Track progress
      format = "Sleeping [:bar] :percent :spin elapsed=:elapsed",
      clear = FALSE, show_after = 0
    )
    pb$update(0)
    repeat {
      elapsed_time <- as.numeric(difftime(Sys.time(), interval_start, units = "secs"))
      if (elapsed_time >= scrape_interval) break
      pb$update(min(elapsed_time / scrape_interval, 1))
      Sys.sleep(0.1)
    }
    pb$update(1)
  }
}
```
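Some operators expose vehicle positions as a flat JSON array rather than a wrapped FeedMessage; for those, both `header_key` and `entity_key` can be set to NA. A minimal sketch of such a call follows; the URL and field names are hypothetical placeholders, not part of this PR:

```r
# Collect once (scrape_interval = -1) from a hypothetical flat-list feed.
# "lat", "lon" and "speed" are assumed field names for illustration only.
GTFShift::rt_collect(
  gtfs_rt_url = "https://api.example.com/vehicles.json",
  destination_file = "vehicles.csv",
  header_key = NA,      # response has no "header" wrapper
  entity_key = NA,      # response is already a flat list of entities
  fields_collect = c("id", "lat", "lon", "speed"),
  scrape_interval = -1  # download a single snapshot and return
)
```

Because the function appends to `destination_file` and only writes the header when the file is new, repeated runs accumulate snapshots in one CSV.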
New file (112 lines added):

```r
#' Collect GTFS-RT data (with Protocol Buffers support)
#'
#' @param gtfs_rt_url String. URL of the Protocol Buffers GTFS-RT feed.
#' @param destination_file String. File to save the downloaded GTFS-RT data. Content is appended in each iteration.
#' @param fields_collect Character vector. Fields to extract from each entity in the feed.
#' @param scrape_interval Integer (Default 60). Interval in seconds between downloads. Negative to run only once.
#' @param log_file String (Optional). Path to a log file to save download logs.
#'
#' @details
#' Downloads GTFS-RT data from the specified URL at regular intervals, converts it from Protocol Buffers to JSON, and appends it to the destination file.
#'
#' Unless \code{scrape_interval} is negative, this function runs indefinitely until manually stopped.
#'
#' @examples
#' \dontrun{
#' GTFShift::rt_collect_protobuf("https://api.example.com/gtfs-rt-protobuf", "gtfs_rt_data.csv")
#' }
#'
#' @import RProtoBuf
#' @import jsonlite
#' @import progress
#'
#' @export
rt_collect_protobuf <- function(
    gtfs_rt_url, destination_file,
    fields_collect = c("id", "vehicle.trip.trip_id", "vehicle.position.latitude", "vehicle.position.longitude", "vehicle.position.speed", "vehicle.timestamp", "vehicle.current_status", "vehicle.current_stop_sequence", "vehicle.stop_id"),
    scrape_interval = 60, log_file = NA
) {
  # Log script start
  m <- sprintf("[%s] Starting GTFS-RT data collection from %s", format(Sys.time(), "%Y-%m-%d %H:%M:%S"), gtfs_rt_url)
  message(m)
  if (!is.na(log_file)) cat(paste(m, "\n"), file = log_file, append = TRUE)

  # Every scrape_interval seconds, download the GTFS-RT feed and append it to the destination file
  count <- 0
  repeat {
    count <- count + 1
    timestamp <- format(Sys.time(), "%Y%m%d_%H%M%S")

    # Load the protobuf schema and read the feed
    RProtoBuf::readProtoFiles(system.file("extdata", "gtfs-realtime.proto", package = "GTFShift"))
    f <- file(gtfs_rt_url, "rb")
    feed <- RProtoBuf::read(`transit_realtime.FeedMessage`, f)
    close(f)

    # Recursively convert the protobuf message to an R list
    protobuf_to_list <- function(msg) {
      if (!inherits(msg, "Message")) return(msg)

      # Get all fields of the message
      fields <- names(msg)

      lapply(fields, function(f) {
        value <- msg[[f]]

        # Recursively convert nested Message objects
        if (inherits(value, "Message")) {
          protobuf_to_list(value)
        } else if (is.list(value)) {
          lapply(value, protobuf_to_list)
        } else {
          value
        }
      }) |> setNames(fields)
    }

    # Write the feed as JSON and delegate parsing to rt_collect()
    feed_list <- protobuf_to_list(feed)
    temp_json <- tempfile(fileext = ".json")
    jsonlite::write_json(
      feed_list,
      temp_json,
      pretty = TRUE,
      auto_unbox = TRUE
    )

    suppressMessages({
      rt_collect(
        gtfs_rt_url = temp_json,
        destination_file = destination_file,
        fields_collect = fields_collect,
        scrape_interval = -1,  # single pass over the temporary JSON file
        log_file = NA
      )
    })
    unlink(temp_json)  # remove the temporary file after processing

    m <- sprintf("[%s] Iteration %d completed", timestamp, count)
    message(m)
    if (!is.na(log_file)) cat(paste(m, "\n"), file = log_file, append = TRUE)

    # Wait for scrape_interval seconds before the next download
    if (scrape_interval < 0) {
      break
    }
    interval_start <- Sys.time()
    pb <- progress::progress_bar$new(  # Track progress
      format = "Sleeping [:bar] :percent :spin elapsed=:elapsed",
      clear = FALSE, show_after = 0
    )
    pb$update(0)
    repeat {
      elapsed_time <- as.numeric(difftime(Sys.time(), interval_start, units = "secs"))
      if (elapsed_time >= scrape_interval) break
      pb$update(min(elapsed_time / scrape_interval, 1))
      Sys.sleep(0.1)
    }
    pb$update(1)
  }
}
```
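The protobuf variant converts each FeedMessage snapshot to a temporary JSON file and delegates field extraction to `rt_collect()`. A minimal collection run might look as follows; the feed URL and log path are placeholders, and feeds that require API keys or custom headers are not covered by this sketch:

```r
# Collect a Protocol Buffers GTFS-RT feed every 30 seconds, logging progress.
# URL and file paths below are illustrative placeholders.
GTFShift::rt_collect_protobuf(
  gtfs_rt_url = "https://api.example.com/gtfs-rt/vehicle-positions.pb",
  destination_file = "gtfs_rt_data.csv",
  scrape_interval = 30,
  log_file = "collection.log"
)
```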
New file (138 lines added):

```r
#' Extend prioritization with GTFS-RT metrics
#'
#' This function extends lane segment indicators for prioritization with metrics produced from GTFS-RT data.
#'
#' @param lane_prioritization sf data.frame. Result of \code{GTFShift::prioritize_lanes()}
#' @param rt_collection sf data.frame. GTFS-RT data collection. Must include a \code{speed} column.
#' @param lane_buffer numeric (Default 15). Buffer distance (in meters) to create around lane segments to capture nearby GTFS-RT points.
#'
#' @details
#' Extends the \code{lane_prioritization} data with speed metrics calculated from the GTFS-RT data points that fall within a buffer around each lane segment.
#'
#' Refer to \code{GTFShift::rt_collect()} for details on GTFS-RT data collection.
#'
#' @returns The \code{lane_prioritization} \code{sf} \code{data.frame}, extended with the following columns:
#' \itemize{
#'  \item \code{speed_avg}, the average speed of the vehicles on the way.
#'  \item \code{speed_median}, the median speed of the vehicles on the way.
#'  \item \code{speed_p25}, the 25th percentile speed of the vehicles on the way.
#'  \item \code{speed_p75}, the 75th percentile speed of the vehicles on the way.
#'  \item \code{speed_count}, the number of speed observations on the way.
#' }
#'
#' @examples
#' \dontrun{
#' rt_collect_file <- "gtfs_rt_data.csv"
#' GTFShift::rt_collect("https://api.example.com/gtfs-rt", rt_collect_file)
#' lane_prioritization <- GTFShift::prioritize_lanes(gtfs, osm_query)
#'
#' rt_collection <- read.csv(rt_collect_file) |>
#'   sf::st_as_sf(coords = c("longitude", "latitude"), crs = 4326)
#' lane_prioritization_extended <- GTFShift::rt_extend_prioritization(
#'  lane_prioritization = lane_prioritization,
#'  rt_collection = rt_collection
#' )
#' }
#'
#' @import progress
#' @import dplyr
#' @import callr
#'
#' @export
rt_extend_prioritization <- function(
    lane_prioritization,
    rt_collection,
    lane_buffer = 15  # in meters
) {
  # 1. Validate inputs
  required_cols <- c("way_osm_id", "geometry")
  missing_cols <- setdiff(required_cols, colnames(lane_prioritization))
  if (length(missing_cols) > 0) {
    stop(paste("lane_prioritization is missing required columns:", paste(missing_cols, collapse = ", ")))
  }
  rt_attr_speed <- "speed"
  required_rt_cols <- c(rt_attr_speed)
  missing_rt_cols <- setdiff(required_rt_cols, colnames(rt_collection))
  if (length(missing_rt_cols) > 0) {
    stop(paste("rt_collection is missing required columns:", paste(missing_rt_cols, collapse = ", ")))
  }

  # Display feedback
  pb <- progress::progress_bar$new(  # Track progress
    format = "Extending prioritization with GTFS-RT metrics [:bar] :percent :spin elapsed=:elapsed",
    clear = FALSE, show_after = 0
  )
  pb$update(0)

  # 2. Keep only IN_TRANSIT_TO updates
  if ("current_status" %in% colnames(rt_collection)) {
    rt_collection <- rt_collection %>%
      dplyr::filter(current_status == "IN_TRANSIT_TO")
  }
  pb$update(0.166)

  # 3. Get unique lane segments (to optimize the spatial join)
  job <- callr::r_bg(function(lane_prioritization) {  # update spinner while blocking method call
    library(sf)
    return(lane_prioritization |>
      dplyr::distinct(way_osm_id, .keep_all = TRUE) |>
      dplyr::select(way_osm_id))
  }, args = list(lane_prioritization))
  while (job$is_alive()) { pb$tick(0); Sys.sleep(0.1) }
  lanes_unique <- job$get_result()
  pb$update(0.333)

  # 4. Create buffers around lane segments to overlap with updates
  job <- callr::r_bg(function(lanes_unique, lane_buffer) {  # update spinner while blocking method call
    return(sf::st_buffer(
      sf::st_transform(lanes_unique, crs = 3857),
      dist = lane_buffer
    ) |> sf::st_transform(crs = sf::st_crs(lanes_unique)))
  }, args = list(lanes_unique, lane_buffer))
  while (job$is_alive()) { pb$tick(0); Sys.sleep(0.1) }
  lane_buffers <- job$get_result()
  pb$update(0.5)

  # 5. Spatial join between lane buffers and rt_collection points
  job <- callr::r_bg(function(rt_collection, lane_buffers) {  # update spinner while blocking method call
    return(sf::st_join(
      rt_collection,
      lane_buffers |> dplyr::select(way_osm_id),
      left = FALSE,
      join = sf::st_within
    ) |> sf::st_drop_geometry())
  }, args = list(rt_collection, lane_buffers))
  while (job$is_alive()) { pb$tick(0); Sys.sleep(0.1) }
  overlap <- job$get_result()
  pb$update(0.666)

  # 6. Aggregate speed metrics by way_osm_id
  job <- callr::r_bg(function(overlap, rt_attr_speed) {  # update spinner while blocking method call
    return(overlap |>
      dplyr::group_by(way_osm_id) |>
      dplyr::summarise(
        speed_avg = mean(.data[[rt_attr_speed]], na.rm = TRUE),
        speed_median = stats::median(.data[[rt_attr_speed]], na.rm = TRUE),
        speed_p25 = stats::quantile(.data[[rt_attr_speed]], probs = 0.25, na.rm = TRUE),
        speed_p75 = stats::quantile(.data[[rt_attr_speed]], probs = 0.75, na.rm = TRUE),
        speed_count = dplyr::n()
      ) |>
      dplyr::ungroup())
  }, args = list(overlap, rt_attr_speed))
  while (job$is_alive()) { pb$tick(0); Sys.sleep(0.1) }
  speed_metrics <- job$get_result()
  pb$update(0.833)

  # 7. Join speed metrics back to lane_prioritization
  job <- callr::r_bg(function(lane_prioritization, speed_metrics) {  # update spinner while blocking method call
    library(sf)
    return(lane_prioritization |>
      dplyr::left_join(speed_metrics, by = "way_osm_id"))
  }, args = list(lane_prioritization, speed_metrics))
  while (job$is_alive()) { pb$tick(0); Sys.sleep(0.1) }
  lane_prioritization_extended <- job$get_result()
  pb$update(1)
  pb$terminate()

  return(lane_prioritization_extended)
}
```
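`rt_extend_prioritization()` expects an sf data frame with a `speed` column (and optionally `current_status`), whereas `rt_collect()` with the default `fields_collect` writes dotted column names such as `vehicle.position.speed`. A preparation sketch, assuming those default column names and a CSV produced by `rt_collect()`:

```r
# Convert a collection CSV to the sf input expected by rt_extend_prioritization().
# Column names follow the default fields_collect of rt_collect(); adjust if
# a custom field list was used.
rt_points <- read.csv("gtfs_rt_data.csv") |>
  sf::st_as_sf(
    coords = c("vehicle.position.longitude", "vehicle.position.latitude"),
    crs = 4326  # GTFS-RT positions are WGS 84
  ) |>
  dplyr::rename(
    speed = vehicle.position.speed,
    current_status = vehicle.current_status
  )

lane_prioritization_extended <- GTFShift::rt_extend_prioritization(
  lane_prioritization = lane_prioritization,  # from GTFShift::prioritize_lanes()
  rt_collection = rt_points
)
```

Note that the buffering step reprojects to EPSG:3857 so that `lane_buffer` is interpreted in meters before transforming back to the input CRS.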