-
Notifications
You must be signed in to change notification settings - Fork 1
Expand file tree
/
Copy pathdata-format.Rmd
More file actions
35 lines (26 loc) · 2.35 KB
/
data-format.Rmd
File metadata and controls
35 lines (26 loc) · 2.35 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
# (PART) Covid19R Standards {-}
# Data Format Standard
The minimal tidy data format that all packages need to incorporate looks like the following. This is a bare-bones minimum. Additional data columns are possible and welcomed, but, for inter-operability, the following are required.
```{r data_table, echo = FALSE}
tab <-
"|date |location |location_type |location_code |location_code_type |data_type | value|
|:----------|:----------|:---------|:-----------------|:----------------------|:-----------|-----:|
|2020-01-21 |Washington |state |53 |fips_code |cases_total | 1|
|2020-01-21 |Washington |state |53 |fips_code |deaths_total | 0|
|2020-01-22 |Washington |state |53 |fips_code |cases_total | 1|
|2020-01-22 |Washington |state |53 |fips_code |deaths_total | 0|
|2020-01-23 |Washington |state |53 |fips_code |cases_total | 1|
|2020-01-23 |Washington |state |53 |fips_code |deaths_total | 0|
"
knitr::kable(tab, format = "html") %>%
kableExtra::kable_styling(font_size=11) %>%
kableExtra::scroll_box(width = "100%", box_css = "border: 0px;")
```
The data columns are as follows:
* date - The date in YYYY-MM-DD form
* location - The name of the location as provided by the data source. The counties dataset provides county and state. They are combined and separated by a `,`, and can be split by `tidyr::separate()`, if you wish.
* location_type - The type of location using the covid19R controlled vocabulary. Nested locations are indicated by multiple location types being combined with a `_
* location_code - A standardized location code using a national or international standard. In this case, FIPS state or county codes. See https://en.wikipedia.org/wiki/Federal_Information_Processing_Standard_state_code and https://en.wikipedia.org/wiki/FIPS_county_code for more
* location_code_type The type of standardized location code being used according to the covid19R controlled vocabulary. Here we use `fips_code`
* data_type - the type of data in that given row. Includes `total_cases` and `total_deaths`, cumulative measures of both.
* value - number of cases of each data type