Skip to content

changed some functions to speed up parsing#7

Merged
joeroe merged 3 commits intomasterfrom
speed_up_parsing
Feb 5, 2025
Merged

changed some functions to speed up parsing#7
joeroe merged 3 commits intomasterfrom
speed_up_parsing

Conversation

@MartinHinz
Copy link

Replaced tidyverse functionality (map) with more base R functions. Code is not as nice as before, but significantly faster.

library(microbenchmark)

mbm <- microbenchmark(
  new = xronos_parse(response),
  old = xronos:::xronos_parse(response)
)

image

Copy link
Contributor

@joeroe joeroe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At first glance, this does raise a couple of issues:

  • #8
  • Up until now we've avoided hard-coding (apart from measurement, period, typochronology_unit, and ecochronological_unit, which was a workaround for the nested API response). Since this lists them all explicitly as expected columns, we would need to update the R package whenever they change in the API response, and any workflows relying on previous versions of the package would break.

@MartinHinz
Copy link
Author

MartinHinz commented Jun 10, 2024

Thanks for pointing this out! I am happy to take back these two adjustments, as the following reasons were originally behind them:

  • Progress Bar: Since loading and processing the data took so long, I wanted to give the user feedback that things are still running. Maybe you can find a more elegant solution for this? Until then, I will remove it again in the next commit
  • The hardcoded list was introduced to align the column order with that which was previously output from the package. If this order has no meaning, then there is no reason why the columns have to be rearranged, so it can be removed.

I will produce another update to the PR in a moment.

However, this solution still does not create a completely flat structure, as the periods and the associated columns are still nested, even when the references are not nested!

@joeroe joeroe merged commit 5d7c219 into master Feb 5, 2025
6 of 7 checks passed
@joeroe joeroe deleted the speed_up_parsing branch February 5, 2025 13:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants