forked from r-dbi/bigrquery
-
Notifications
You must be signed in to change notification settings - Fork 0
Expand file tree
/
Copy pathREADME.Rmd
More file actions
116 lines (85 loc) · 3.77 KB
/
README.Rmd
File metadata and controls
116 lines (85 loc) · 3.77 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r setup, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```
# bigrquery
[](https://travis-ci.org/r-dbi/bigrquery)
[](https://cran.r-project.org/package=bigrquery)
[](https://codecov.io/github/r-dbi/bigrquery?branch=master)
The bigrquery packages provides an interface to
[Google BigQuery](https://developers.google.com/bigquery/) from R. It makes it easy
to retrieve metadata about your projects, datasets, tables and jobs, and
provides a convenient wrapper for working with bigquery from R.
## Installation
The current bigrquery release can be installed from CRAN:
```R
install.packages("bigrquery")
```
The newest development release can be installed from github:
```R
# install.packages('devtools')
devtools::install_github("r-dbi/bigrquery")
```
## Authentication
The first time you use bigrquery in a session, it will ask you to
[authorize bigrquery](https://developers.google.com/bigquery/authorization) in
the browser. This gives bigrquery the credentials to access data on your
behalf. By default, bigrquery picks up [httr's](http://github.com/hadley/httr)
policy of caching per-working-directory credentials in `.httr-oauth`.
Note that `bigrquery` requests permission to modify your data; in general, the
only data created or modified by `bigrquery` are the temporary tables created
as query results, unless you explicitly modify your own data (say by calling
`bq_table_delete()` or `bq_table_upload()`).
## Sample data and a billing project
If you just want to play around with the bigquery API, it's easiest to start
with the Google's free
[sample data](https://developers.google.com/bigquery/docs/sample-tables). To
do that, you'll also need to create your own project for billing purposes. If
you're just playing around, it's unlikely that you'll go over the 10,000
request/day free limit, but google still needs a project that it can bill (you
don't even need to provide a credit card).
To create a project:
1. Open https://console.cloud.google.com/ and create a project.
Make a note of the "Project ID" in the "Project info" box.
1. Click on "APIs & Services", then "Dashboard" in the left the left menu.
1. Click on "Enable Apis and Services" at the top of the page,
then search for "BigQuery API" and "Cloud storage".
## Layers
### Low-level API
To run your first query:
```{r}
library(bigrquery)
billing <- bq_test_project() # replace this with your project ID
sql <- "SELECT year, month, day, weight_pounds FROM `publicdata.samples.natality` LIMIT 5"
bq_project_query(billing, sql)
```
## DBI
For a more traditional database connection using `dbConnect()`, the driver to use is `bigrquery::dbi_driver()`. You will need to supply two arguments: `project` and `dataset`. If this is a public data, you'll also need to specific your own `billing` project for google to charge.
```{r}
con <- DBI::dbConnect(
bigrquery::bigquery(),
project = "bigquery-public-data",
dataset = "noaa_gsod",
billing = bq_test_project()
)
con
head(dbListTables(con))
```
### dplyr
You can also use this connection with dplyr. The `stations` table can then be referred to using `tbl()`
```{r}
dplyr::tbl(con, "stations")
```
## Useful links
* [SQL reference](https://developers.google.com/bigquery/query-reference)
* [API reference](https://developers.google.com/bigquery/docs/reference/v2/)
* [Query/job console](https://bigquery.cloud.google.com/)
* [Billing console](https://console.cloud.google.com/)