-There are many different bits of information, including detection information (how was the host-virus association quantified?), taxonomic information on hosts and viruses, and the host-virus association data. Let's load the entire dataset and get a better idea of the scope and nature of the data
-is where the data lives. It is compressed to avoid file size limitation on GitHub, but can be read into `R` using base functionality (or using the `vroom` package).
+There are many different bits of information, including detection information (how was the host-virus association quantified?), taxonomic information on hosts and viruses, and the host-virus association data. Let's load the entire dataset and get a better idea of the scope and nature of the data. Go to the following link and download the data.
+It is compressed to avoid file size limitation on GitHub, but can be read into `R` using base functionality.
 
 ```{r eval=FALSE}
-# if you downloaded the file
+
 virion <- read.delim(gzfile('Virion.csv.gz'))
 
 ```
 
 
-The code below may not work. If it does not, please download the file and read it in using the code above. You will have to change the directory or the path to the file based on what you learned a few lectures ago.
+Be sure that you know what your working directory is and that the file path in your call points to the right directory. This should help reinforce file path skills that are pretty essential.
 
-```{r}
-# reading directly from web resource
-con <- gzcon(url("https://github.com/viralemergence/virion/raw/main/Virion/Virion.csv.gz"))
-txt <- readLines(con)
-virion <- read.delim(textConnection(txt))
-
-```
 
 
 
-
-```{r}
+```{r eval=FALSE}
 
 # make an interaction matrix
 virionInt <- table(virion$Host, virion$Virus)
 
-
 dim(virionInt)
 
 # how host-specific are viruses?
@@ -312,6 +300,9 @@ str(seqData)
 Notice the slight difference in the structure of the data. How would you actually get at the sequence itself? How is it coded?
 
 
+
+
+
 Map the numbers in the sequence to the actual characters (a,c,g,t) that they correspond to.
 
 ```{r}
@@ -320,6 +311,9 @@ Map the numbers in the sequence to the actual characters (a,c,g,t) that they cor
 ```
 
 
+
+
+
 Combine into a single string and tell me how many times the combinations `TAA`, `TAG`, or `TGA` occur in the sequence. These are stop codons.