---
output: github_document
---
# nycOpenData <img src="man/figures/nycOpenData_hex.png" alt="nycOpenData logo" width="72" align="right" />
[CRAN status](https://CRAN.R-project.org/package=nycOpenData)
[r-pkg.org page](https://r-pkg.org/pkg/nycOpenData)
[Lifecycle stage](https://lifecycle.r-lib.org/articles/stages.html)
[Project status: active](https://www.repostatus.org/#active)
[Code coverage](https://app.codecov.io/gh/martinezc1/nycOpenData)
[Featured on R-bloggers](https://www.r-bloggers.com/2026/01/nycopendata-a-unified-r-interface-to-nyc-open-data-apis/)
[Featured in R Weekly](https://rweekly.org/#RintheRealWorld)
[R-CMD-check](https://github.com/martinezc1/nycOpenData/actions/workflows/R-CMD-check.yaml)
`nycOpenData` provides simple, reproducible access to datasets from the
[NYC Open Data](https://opendata.cityofnewyork.us/) platform — directly from R,
with **no API keys** or manual downloads required. The package is available on
**CRAN**.
Version **0.2.1** introduces a streamlined, catalog-driven interface for NYC Open Data.
Instead of maintaining dozens of individual dataset wrappers, the package now provides three core functions:
- `nyc_list_datasets()` — Browse available datasets from the live NYC Open Data catalog
- `nyc_pull_dataset()` — Pull any cataloged dataset by key, with filtering, ordering, and optional date controls
- `nyc_any_dataset()` — Pull any NYC Open Data dataset directly via its Socrata JSON endpoint
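A minimal sketch of how the three functions might fit together. The key `"nyc_311"` is the one used in the Example section, and the endpoint URL reuses the `erm2-nwe9` ID mentioned under Comparison to Other Software. Passing the URL as the first positional argument of `nyc_any_dataset()` is an assumption, so check `?nyc_any_dataset` for the actual signature. The guard simply skips the network calls when the package is not installed.

```r
# Socrata JSON endpoint for a dataset; the ID is this README's own example ID
endpoint <- "https://data.cityofnewyork.us/resource/erm2-nwe9.json"

# Skip the network calls when nycOpenData is not installed
if (requireNamespace("nycOpenData", quietly = TRUE)) {
  # 1. Browse the live catalog to find a dataset key
  catalog <- nycOpenData::nyc_list_datasets()
  print(head(catalog))

  # 2. Pull a cataloged dataset by its key
  requests <- nycOpenData::nyc_pull_dataset(key = "nyc_311", limit = 100)

  # 3. Pull a dataset straight from its endpoint
  #    (assumption: the URL is accepted as the first argument)
  raw <- nycOpenData::nyc_any_dataset(endpoint)
}
```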
The catalog currently includes 30+ curated NYC Open Data datasets, covering topics such as:
- 311 Service Requests
- For-Hire Vehicles (FHV)
- Juvenile Justice (rearrest rates + caseloads)
- School Discharge Reporting
- Violent & Disruptive School Incidents
- Detention Admissions
- Borough/Community District Reports
- Street Tree Census
- Urban Park Ranger Animal Condition Responses
- Permitted Events (Historical)
- and more
`nyc_pull_dataset()` automatically applies sensible defaults from the catalog (such as default ordering and date fields), while still allowing user control over:
- `limit`
- `filters`
- `date` / `from` / `to`
- `where`
- `order`
- `clean_names`
- `coerce_types`
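As a rough illustration of overriding the catalog defaults, the sketch below passes several of the arguments listed above. The argument names come from the list; the values are assumptions (a SoQL-style ordering string for `order`, logical flags for `clean_names` and `coerce_types`), so consult `?nyc_pull_dataset` for the accepted formats. The guard skips the call when the package is not installed.

```r
# Illustrative values; their exact accepted formats are assumptions
agency_filter <- list(agency = "NYPD")   # equality filter on one field
order_clause  <- "created_date DESC"     # assumed SoQL-style ordering string

if (requireNamespace("nycOpenData", quietly = TRUE)) {
  recent_nypd <- nycOpenData::nyc_pull_dataset(
    key = "nyc_311",
    limit = 500,
    filters = agency_filter,
    order = order_clause,
    clean_names = TRUE,    # assumed to be a logical flag
    coerce_types = TRUE    # assumed to be a logical flag
  )
  print(head(recent_nypd))
}
```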
This redesign reduces maintenance burden, improves extensibility, and provides a more scalable interface for working with NYC Open Data.
All functions return clean **tibble** outputs and support filtering via
`filters = list(field = "value")`.
---
## Installation
### From **CRAN**
```r
install.packages("nycOpenData")
```
### Development version (GitHub)
```r
devtools::install_github("martinezc1/nycOpenData")
```
---
## Example
```{r}
library(nycOpenData)

# Get 5,000 most recent 311 requests
data <- nyc_pull_dataset(key = "nyc_311", limit = 5000)

# Filter by agency and city
filtered <- nyc_pull_dataset(
  key = "nyc_311",
  limit = 2000,
  filters = list(agency = "NYPD", city = "BROOKLYN")
)

head(filtered)
```
---
## Learn by example
- `vignette("nyc-311", package = "nycOpenData")` – Working with NYC 311 data end-to-end
## About
`nycOpenData` makes New York City's civic datasets accessible to students,
educators, analysts, and researchers through a unified and user-friendly R interface.
It was developed to support reproducible research, open-data literacy, and real-world analysis.
---
## Comparison to Other Software
While the [`RSocrata`](https://CRAN.R-project.org/package=RSocrata) package provides a general interface for any Socrata-backed portal, `nycOpenData` is specifically tailored for the New York City ecosystem.
- **Ease of Use**: No need to hunt for 4x4 dataset IDs (e.g., `erm2-nwe9`); use `nyc_pull_dataset()` with a human-readable catalog key.
- **Pre-configured Logic**: Catalog entries include default sorting (e.g., `created_date DESC`) and optimized limit handling specific to NYC’s massive data volumes.
- **Open-Data Literacy**: Designed specifically for students and researchers to lower the barrier to entry for civic data analysis.
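For contrast, here is a sketch of what a hand-written Socrata request involves when no catalog key is available. It is not `nycOpenData`'s internal code: `build_soda_url()` is a hypothetical helper written for this illustration, using only base R and the standard SoQL query parameters `$limit`, `$where`, and `$order`.

```r
# Hypothetical helper (not part of nycOpenData): build a raw Socrata
# request URL from a 4x4 dataset ID and optional SoQL clauses.
build_soda_url <- function(dataset_id, limit, where = NULL, order = NULL) {
  base <- sprintf("https://data.cityofnewyork.us/resource/%s.json", dataset_id)

  # c() drops NULL entries, so unset clauses are simply omitted
  params <- c(
    `$limit` = format(limit, scientific = FALSE),
    `$where` = where,
    `$order` = order
  )

  # Percent-encode each value, then join as name=value pairs
  encoded <- vapply(params, utils::URLencode, character(1), reserved = TRUE)
  paste0(base, "?", paste0(names(params), "=", encoded, collapse = "&"))
}

# The user has to know the dataset ID, the field names, and SoQL syntax
url <- build_soda_url(
  dataset_id = "erm2-nwe9",
  limit = 2000,
  where = "agency = 'NYPD'",
  order = "created_date DESC"
)
print(url)
```

With a catalog key, the ID lookup and the default ordering are handled by the package, which is the convenience the bullets above describe.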
---
## Contributing
We welcome contributions! If you find a bug or would like to request a wrapper for a specific NYC dataset, please open an issue or submit a pull request on [GitHub](https://github.com/martinezc1/nycOpenData).
---
## Authors & Contributors
### Maintainer
**Christian A. Martinez** 📧 [c.martinez0@outlook.com](mailto:c.martinez0@outlook.com)
GitHub: [@martinezc1](https://github.com/martinezc1)
### ✨ Contributors
Special thanks to the students of **PSYC 7750G – Reproducible Psychological Research** at Brooklyn College (CUNY) who have contributed functions and documentation:
* **Crystal Adote** ([@crystalna20](https://github.com/crystalna20))
* **Jonah Dratfield** ([@jdratfield38](https://github.com/jdratfield38))
* **Joyce Escatel-Flores** ([@JoyceEscatel](https://github.com/JoyceEscatel))
* **Rob Hutto** ([@robhutto](https://github.com/robhutto))
* **Isley Jean-Pierre** ([@ijpier](https://github.com/ijpier))
* **Shannon Joyce** ([@shannonjoyce](https://github.com/shannonjoyce))
* **Laura Rose-Werner** ([@laurarosewerner](https://github.com/laurarosewerner))
* **Emma Tupone** ([@emmatup0205](https://github.com/emmatup0205))
* **Xinru Wang** ([@xrwangxr](https://github.com/xrwangxr))
---
## Academic Context
This package is developed as a primary pedagogical tool for teaching data acquisition and open science practices at **Brooklyn College, City University of New York (CUNY)**.