-
Notifications
You must be signed in to change notification settings - Fork 0
Expand file tree
/
Copy pathREADME.Rmd
More file actions
502 lines (397 loc) · 18 KB
/
README.Rmd
File metadata and controls
502 lines (397 loc) · 18 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```
# MMASN: Multi-scale Morphometric Analysis of Stream Networks
<!-- badges: start -->
<!-- badges: end -->
The goal of MMASN is to facilitate morphometric analysis of stream networks and provide landscape metrics that quantify spatial pattern of habitat patches in dendritic networks. These functions rely on data provide by the [National Hydrography Dataset Plus Version 2; NHDPlusv2](https://www.epa.gov/waterdata/get-nhdplus-national-hydrography-dataset-plus-data).
# Installation
This Package is still under development but can be installed from [GitHub](https://github.com/):
``` r
# install.packages("devtools")
devtools::install_github("dkopp3/MMASN")
```
# Example
Here, I download data from the NHDPlusV2 and optional data sources. I then use MMASN to measure watershed-, NHDPlus COMID- and reach- scale morphometry and demonstrate how these values can be used to identify habitat patches in stream networks with hierarchical clustering. Finally, I quantify the spatial pattern of these habitat patches using landscape metrics developed for dendritic stream networks.
```{r load packages}
library(MMASN)
```
```{r results='hide', message=FALSE, warning=FALSE}
#Other packages for the example
library(archive) #7zip software workaround
library(foreign) #reads .dbf files
library(sf) #workhorse
library(lwgeom) #line stuff
library(raster) #external data
library(cluster) #patch identification
library(tidyverse) #landscape metrics #tidyverse masks tidygraph, load first
library(igraph) #landscape metrics
library(tidygraph) #landscape metrics
library(ggplot2)#lazy plots :)
library(gridExtra)
```
# Data Preparation
## Download data
NHDPlusv2 (Required, digital stream network and catchments)
```{r eval = F}
#use NHDPlusV2 data from VPU 17
#https://www.epa.gov/waterdata/nhdplus-pacific-northwest-data-vector-processing-unit-17
#Flowlines - hydropgraphy
download.file("https://s3.amazonaws.com/edap-nhdplus/NHDPlusV21/Data/NHDPlusPN/NHDPlusV21_PN_17_NHDSnapshot_08.7z",
"Data/NHDPLUSV2/NHDPlusV21_PN_17_NHDSnapshot_08.7z", mode = "wb")
archive_extract(archive("Data/NHDPLUSV2/NHDPlusV21_PN_17_NHDSnapshot_08.7z"), dir = "Data/NHDPLUSV2")
#Value added attribute table
#PlusFlow table - navigation
download.file("https://s3.amazonaws.com/edap-nhdplus/NHDPlusV21/Data/NHDPlusPN/NHDPlusV21_PN_17_NHDPlusAttributes_10.7z",
"Data/NHDPLUSV2/NHDPlusV21_PN_17_NHDPlusAttributes_10.7z", mode = "wb")
archive_extract(archive("Data/NHDPLUSV2/NHDPlusV21_PN_17_NHDPlusAttributes_10.7z"), dir = "Data/NHDPLUSV2")
#catchment files were unavailable at NHDPlusV2 page
#download.file(mode = "wb")
#archive_extract(archive(), dir = "Data/NHDPLUSV2")
```
StreamCat (optional, COMID Scale values)
```{r eval = F}
#download StreamCat (filename = NULL, downloads all files)
#https://www.epa.gov/national-aquatic-resource-surveys/streamcat-metrics-and-definitions
StreamCat_download(path = "Data/StreamCat", vpu = "17", filename = "NRSA_PredictedBioCondition")
StreamCat_download(path = "Data/StreamCat", vpu = "17", filename = "Lithology")
StreamCat_download(path = "Data/StreamCat", vpu = "17", filename = "NLCD2006RipBuf100_Region")
```
Average accumulated growing degree days (Optional, Reach scale dataset values)
```{r eval = F}
#download Ancillary data - Accumulated Degree days >32degF
#https://www.usanpn.org/data/agdd_maps
url <- paste0("http://geoserver.usanpn.org/geoserver/gdd/wcs?service=WCS&",
"version=2.0.1&",
"request=GetCoverage&CoverageId=gdd:30yr_avg_agdd&",
"subset=http://www.opengis.net/def/axis/OGC/0/elevation(", 365,")&",
"format=image/geotiff")
download.file(url, destfile = paste0("Data/AGDD_30yrAVG_", 365) , method = "libcurl", mode = "wb")
```
Stream network outlets (Required)
```{r eval = F}
#download Sampling locations from the National Rivers and Streams Assessment
download.file("https://www.epa.gov/sites/production/files/2021-04/nrsa_1819_site_information_-_data.csv",
"Data/nrsa_1819_site_information_-_data.csv", mode = "wb")
```
## Read NHDPlusV2 Files
```{r eval = F}
flow = foreign::read.dbf("Data/NHDPLUSV2/NHDPlusPN/NHDPlus17/NHDPlusAttributes/PlusFlow.dbf")
vaa = foreign::read.dbf("Data/NHDPLUSV2/NHDPlusPN/NHDPlus17/NHDPlusAttributes/PlusFlowlineVAA.dbf")
elevslope = foreign::read.dbf("Data/NHDPLUSV2/NHDPlusPN/NHDPlus17/NHDPlusAttributes/elevslope.dbf")
NHDFlowline = sf::read_sf("Data/NHDPLUSV2/NHDPlusPN/NHDPlus17/NHDSnapshot/Hydrography/NHDFlowline.shp")
NHDFlowline <- sf::st_zm(NHDFlowline, what = "ZM", drop = T)
NHDCatchments <- sf::read_sf("Data/NHDPLUSV2/NHDPlusPN/NHDPlus17/NHDPlusCatchment/Catchment.shp")
```
## Select stream network outlets
```{r eval = F}
#read in file
outlets <- read.csv("Data/nrsa_1819_site_information_-_data.csv")
#select unique values
outlets <- unique(subset(outlets,
select = c("UNIQUE_ID", "COMID",
"LON_DD83","LAT_DD83")))
#create sf object
outlets <- sf::st_as_sf(outlets,
coords = c("LON_DD83","LAT_DD83"),
crs = 4269)
```
```{r eval = F}
#read in NHDPlusV2 vector processing unit (vpu)
vpu_poly <- sf::read_sf("Data/VPU_NAD83.shp")
#associate outlets with vpu polugons
vpu_ls <- sf::st_contains(st_transform(vpu_poly, crs = 5070),
st_transform(outlets, crs = 5070))
names(vpu_ls) <- vpu_poly$VPUID
#select outlets in vpu 17
outlets <- outlets[vpu_ls[["17"]],]
```
```{r eval = F}
#quick plot
ggplot()+
geom_sf(data = vpu_poly, color = "blue") +
geom_sf(data = outlets, color = "black")
```
## Find Nearest NHDPlus COMID
In many instances the sampling point may not completely align with the NHDPlusV2 flowline. The find_comid() function returns a list of length outlets, with each element containing the COMID's within a given buffer distance.
```{r eval = F}
#returns all nhdplus COMID within the maxdist buffer
comids <- find_comid(pts = outlets, NHDFlowline, maxdist = 200)
#select closest COMID for this ecample
comids <- do.call(rbind, lapply(comids, function(x) x[which.min(x$dist_m), ]))
#assign COMID value
outlets$COMID[comids$index] <- comids$COMID
head(comids)
```
## Delineate Upstream Network
net_delin extracts all the COMID's upstream of a given COMID. In this example we delineate the stream network upstream of two comids
```{r eval = F}
upstr_comid_1 <- net_delin(comid = 24423157, flow = flow, vaa = vaa)
upstr_comid_2 <- net_delin(comid = 947100116, flow = flow, vaa = vaa)
```
```{r eval = F}
#quick plot of Stream Networks
#select outlets
sites_1 <- outlets[outlets$COMID == 24423157, ]
sites_2 <- outlets[outlets$COMID == 947100116, ]
#extract flowlines from NHDPlus via simple matching
network_1 <- NHDFlowline[NHDFlowline$COMID%in%upstr_comid_1, ]
network_2 <- NHDFlowline[NHDFlowline$COMID%in%upstr_comid_2, ]
#plot
p1 <- ggplot()+
geom_sf(data = network_1, color = "blue") +
geom_sf(data = sites_1, color = "black")
p2<-ggplot()+
geom_sf(data = network_2, color = "blue")+
geom_sf(data = sites_2, color = "black")
#install.packages("gridExtra")
gridExtra::grid.arrange(p1, p2, nrow = 1)
```
# Watershed Scale Morphometry
Watershed scale morphometry refers to indices that quantify the shape of the watershed (i.e. land area surrounding the entire stream network) or the stream network itself.
## Basin Shape
Basin Shape includes:
Compactness coefficient (CmpC), defined as the ratio of the watershed perimeter to the circumference of equivalent circular area.
Elongation ratio (ElnR), defined as the ratio of diameter of a circle of the same area as the watershed to the maximum watershed length. The numerical value varies from 0 (in highly elongated shape) to 1 (in circular shape).
Circulatory ratio (CrcR), defined as the ratio of watershed area to the area of the circle having the same perimeter as the watershed perimeter. The numeric value may vary in between 0 (in line) and 1 (in a circle). [see for more information](http://www.jnkvv.org/PDF/04042020192203Geomorphology%20of%20Watershed.pdf)
```{r eval =F}
#calculate basin shape metrics
cats_1 <- cat_shp(network = network_1, NHDCatchments)
cats_2 <- cat_shp(network = network_2, NHDCatchments)
cats_1
```
## Stream Network Shape
Stream network shape includes:
Bifurcation Ratio (Rb) is defined as the ratio of number of streams of a particular order (Nu) to number of streams of next higher order (Nu+i).
Length Ratio (Rl) is defined as the ratio of mean stream length (Lu) of a particular stream order to mean stream length of the next lower order (Lu-1).
Area Ratio is defined as the ratio of mean catchment area (Lu) of a particular stream order to mean catchment area of the next lower order (Lu-1).
These values are estimated from the geometric relationship between the ratio and a given order.
```{r eval = F}
#calcualte stream network characteristics for two stream networks
netshp_1 <- net_shp(network = network_1, vaa = vaa)
netshp_2 <- net_shp(network = network_2, vaa = vaa)
netshp_1
```
```{r eval = F}
# Quick plot of watersheds
p1 <- ggplot()+
geom_sf(data = cats_1, color = "green")+
geom_sf(data = network_1, color = "blue") +
geom_sf(data = sites_1, color = "black")
p2 <- ggplot()+
geom_sf(data = cats_2, color = "green")+
geom_sf(data = network_2, color = "blue")+
geom_sf(data = sites_2, color = "black")
#install.packages("gridExtra")
gridExtra::grid.arrange(p1, p2, nrow = 1)
```
```{r eval = F}
# Combine watershed scale morphometry into dataframe for later use
WS_SCALE <- data.frame(rbind(
cbind(st_set_geometry(cats_1, NULL), netshp_1),
cbind(st_set_geometry(cats_2, NULL), netshp_2)),
row.names = NULL)
head(WS_SCALE)
```
# COMID Scale Metrics
COMID scale metrics refer to attributes of individual stream channels
## Channel Shape
channel shape includes the mean elevation and slope of the COMID provided by NHDPlusV2 and the sinuosity. Sinuosity is measured as the length of the stream channel divided by the straight line distance between the COMID inlet/outlet.
```{r eval = F}
planform <- chn_geom(network = network_1, elevslope = elevslope)
#drop the NHDPlusV2 smoothed values
planform <- subset(planform, select = -c(MAXELEVSMO, MINELEVSMO))
head(planform)
```
## Confluence Attributes
Confluence attributes include:
Tributary angle defined as the angle that the tributaty and mainstem meet
Area ratio is the catchment area of the tributary divided by the catchment area of the mainstem
Confluence area is the catchment area upstream of the confluence
Confluence order is the stream order directly downstream of the confluence
Confluence class is the concatenation of the tributary and mainstem stream orders
```{r eval = F}
confl_attrs <- net_confl(network = network_1, vaa = vaa, flow = flow)
head(confl_attrs)
```
```{r eval = F}
#quick plot of confluence
ggplot()+
geom_sf(data = cats_1, color = "green")+
geom_sf(data = network_1, color = "blue") +
geom_sf(data = network_1[network_1$COMID %in%confl_attrs$COMID,], color = "black")+
geom_sf(data = network_1[network_1$COMID %in% sites_1$COMID,], color = "purple", lwd = 1.25)+
geom_sf(data = sites_1, color = "black")
confl_attrs[confl_attrs$COMID==sites_1$COMID, ]
```
## Adding StreamCat Data
The get StreamCat function extracts values from a given StreamCat data table for all COMIDs within a network
```{r eval = F}
#get lithology values for each COMID in the network
lithology <- get_streamcat(comid = network_1$COMID,
strcat_path = "Data/StreamCat",
csvfile = "Lithology", vpu = "17")
#clean output: select columns with values
#many lithology values are 0 for all COMIDs
lithology <- subset(lithology,
select = c("COMID", grep("Cat", names(lithology), value = T)))
lithology <- lithology[,apply(lithology, 2, sum, na.rm = T)>0]
```
```{r eval = F}
#get predicted biological condition
biocond <- get_streamcat(comid = network_1$COMID,
strcat_path = "Data/StreamCat",
csvfile = "NRSA_PredictedBioCondition",
vpu = "17")
biocond <- biocond[,c("COMID", "prG_BMMI")]
```
```{r eval = F}
#clean COMID scale data
COMID_SCALE <- Reduce(function(x,y) merge(x, y, by = "COMID", all.x = T),
list(planform, confl_attrs, lithology, biocond))
#some reaches are not directly downstream of confluence
x<-COMID_SCALE[,names(confl_attrs)]
#suppress invalid factor level on tributary class
suppressWarnings(x[is.na(x)]<-0)
#update levels
levels(x$Confl_class)<-c(levels(x$Confl_class), "0.0")
x$Confl_class[is.na(x$Confl_class)] <- "0.0"
COMID_SCALE[,names(confl_attrs)]<-x
rm(x)
head(COMID_SCALE)
```
# Reach scale metrics
In some instances users may want to use data at a finer resolution than COMID. We define reaches as divisions of NHPLUS Comids and provide a function that splits the COMIDs within a stream network into equal parts, identifies the point and creates lateral transects. These sf objects (reaches, midpoints, and transects) can be used to extract new data at finer resolutions.
## Creating Stream Network Reaches
```{r eval = F}
#split NHDPlusV2 COMIDs into 2 reaches (i.e. n=2)
net_rch <- net_sample(network = network_1, vaa = vaa, n = 2, what = "reaches", keep.divergent = F)
#create point at midpoint of reaches
net_pts <- net_sample(network_1, vaa, n = 2, what = "midpoints", keep.divergent = F)
#create lateral transects at each midpoint
net_trn <- net_sample(network_1, vaa, n = 2, what = "transects",
NHDCatchments = NHDCatchments, keep.divergent = F)
```
```{r eval = F}
#quick plot of watersheds
p1 <- ggplot()+
geom_sf(data = cats_1, color = "green")+
geom_sf(data = net_rch, color = "blue") +
geom_sf(data = net_pts, color = "black")
p2 <- ggplot()+
geom_sf(data = cats_1, color = "green")+
geom_sf(data = net_rch, color = "blue") +
geom_sf(data = net_trn, color = "black")
#install.packages("gridExtra")
gridExtra::grid.arrange(p1, p2, nrow = 1)
```
## extract data at midpoints and along transects
```{r eval = F}
#Use annual growing degree days from the National Phenology Network
r <- raster("Data/AGDD_30yrAVG_365")
agdd = raster::extract(r,
st_coordinates(
st_transform(net_pts, crs = crs(r))))
agdd <- data.frame(COMID = net_pts$COMID,
PointID = net_pts$PointID,
agdd)
#Valley width is length of lateral transect
valley_width<-data.frame(COMID = net_trn$COMID,
PointID = net_trn$PointID,
vln = units::set_units(st_length(net_trn),NULL))
```
```{r eval = F}
#merge data
REACH_SCALE <- merge(agdd, valley_width, by = c("COMID", "PointID"), all.x = T)
head(REACH_SCALE)
```
# Idenfitying "Patches" within a Stream Network
## combine Watershed, COMID, and Reach data
```{r eval = F}
#combine multiscale data
StrNet_data <- cbind(WS_SCALE[1,],
merge(COMID_SCALE,
REACH_SCALE,
by = "COMID",
all.x = T))
#select variables for clustering
StrNet_vars <- names(StrNet_data)[!names(StrNet_data) %in%
c("COMID", "PointID",
"prG_BMMI", "Class",
"CatPctFull","Rl_rsq",
"Ra_rsq","Rb_rsq")]
#Use complete cases
StrNet_data <- StrNet_data[complete.cases(StrNet_data[,StrNet_vars]),]
head(StrNet_data[,c("COMID", "PointID", StrNet_vars)])
```
## Hierarchical Clustering Analysis
```{r eval = F}
#calculate dissimiilarity
d <- daisy(StrNet_data[,StrNet_vars], metric = "gower")
#Clustering using wards method
cl <- agnes(d, method = "ward")
```
```{r eval = F}
#view dendrogram
plot(cl, which.plots = 2)
#define 5 groups of reaches
StrNet_data$grps <- cutree(cl, 5)
```
```{r eval = F}
#plot Cluster Results
#merge results with the sampled reaches
#retain all.x = T because some reaches were missing
#data remained unclassified
net <- merge(net_rch,
StrNet_data[,c("PointID", "COMID", "prG_BMMI", "grps")],
by = c("COMID", "PointID"), all.x = T)
#update sites that could not be included in classification because
#missing data to 0
net$grps[is.na(net$grps)] <- 0
p1 <- ggplot()+
geom_sf(data = net, color = net$grps + 1)
gridExtra::grid.arrange(p1, nrow = 1)
```
# Quantifying Spatial Pattern
Spatial pattern can refer to composition of configuration of habitat patches within the steam network. Below, each landscape metric is calculated for each patch type (i.e. class scale).
## Composition
```{r eval = F}
#merge adjacent habitat patches of the same type into continuous segment
net_HGP <- merge_lines(lines = st_transform(net, 5070),
groupName = "grps")
#create example composition metrics network
#claculate the number and total and mean length,
#of a given patch type
HGP_cmp <- do.call(rbind,
lapply(split(net_HGP, net_HGP$grps),
function(x)
data.frame(patch = unique(x$grps),
num = nrow(x),
tlen = sum(st_length(x)),
meanlen = mean(st_length(x)))))
HGP_cmp
```
## Configuration
```{r eval = F}
#function sf_to_tidygraph adapted from #https://www.r-spatial.org/r/2019/09/26/spatial-networks.html
#Enables measurement of watercourse distance between
#habitat patches within the stream network
graph <- sf_to_tidygraph(st_transform(net, 5070), directed = F)
#here we calculate pairwise watercourse distances
#separating patches of type 2
HGP_cnf <- patch_dist(graph, net_HGP, groupName = "grps", patch = "2")
#claculate a modified dendritic connectivity index baised on watercourse
#distance between the patches
DCI_dist(plen_i = HGP_cnf$patch_i_len,
plen_j = HGP_cnf$patch_j_len,
dist_ij = HGP_cnf$d, mu = 12000)
```