Skip to content

Discuss DSL Design for R Client Translating dplyr Verbs to ODSQL Syntax #2

@ramiromagno

Description

@ramiromagno

DSL interface based on dplyr verbs

Before jumping into implementation, I'd like to start a design discussion around the idea of offering a DSL interface based on dplyr verbs, which would get translated internally into valid ODSQL query strings.

ODSQL is a SQL-inspired, URL-encoded query language, but using it directly via httr2 or manually building strings can be error-prone and unidiomatic for R users.

By designing a DSL that mirrors the tidyverse's data manipulation grammar, we can let users write code like this:

odsql("dataset_id") |>
  select(city, population) |>
  filter(population > 1000000) |>
  arrange(desc(population)) |>
  slice_head(n = 5) |>
  collect()

Which would internally generate:

GET /api/explore/v2.1/catalog/datasets/dataset_id/records?
  select=city,population&
  where=population>1000000&
  order_by=population desc&
  limit=5

Design Questions

  1. Core verbs: Which dplyr verbs do we want to support out-of-the-box? Obvious candidates are:
  • select()
  • filter()
  • group_by() + summarise()
  • arrange()
  • slice_head() for limit= possibly mutate() for things like year = year(date)
  1. Custom expression translation: How do we handle things like startsWith(), in(), and date/time functions? Should we reimplement a small expression parser using {rlang}?

  2. S3 query object: Should we introduce a class like odsql_query that holds the pipeline before it's collected via collect()?

  3. Caching, error handling, metadata introspection: What should be built into the request pipeline vs delegated to user code?

Precedents

This kind of DSL translation has good grounding in the R ecosystem:

  • {dbplyr} translates dplyr verbs to SQL for remote databases.
  • {httr2} itself uses a piped DSL to build HTTP requests.

Worth looking into

  • {sos4R}
  • {ckanr}
  • {eurostat}

Metadata

Metadata

Labels

No labels
No labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions