---
title: "Part 2: Working Locally with Git & RStudio"
author: "Workshop Instructor"
format: html
editor: visual
---
## Introduction
Welcome to Part 2! In this section, we will get hands-on with Git right inside RStudio. We'll learn the fundamental workflow of saving changes to our project's history. This entire process happens locally, on our own computers. We will explore doing this in two ways: using RStudio's user-friendly graphical interface and using the powerful command line terminal.
## 1. Recap: The Initialized Repo (2 mins)
When your RStudio Project was created, the option to "Create a git repository" was selected. This means Git has already been initialized and is tracking your project folder.
Your mission control for Git in RStudio is the **Git tab**, typically located in the top-right pane next to "Environment" and "History".
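**Action:** Click on the **Git tab**. If the pane is empty, Git doesn't see any unsaved changes. If it lists files that were created along with the project (such as `.gitignore` or the `.Rproj` file), those are new files that have not been committed yet.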
## 2. The Git Workflow in RStudio (10 mins)
The core Git workflow involves three steps:
1. You make changes to your files.
2. You stage the specific changes you want to save.
3. You commit those staged changes, creating a permanent snapshot in your project's history.
Let's walk through this.
### Staging a Change
1. **Make a small change.** Open any of your R scripts. If you don't have one, create a new one (File -\> New File -\> R Script) and save it as `analysis.R`. Add a comment to the script, like this:
```{r}
#| eval: false
# This is our first tracked change.
```
2. **View the change in the Git tab.** Click back to the **Git tab** in the top-right pane. You will now see your `analysis.R` file listed! The icon tells you what kind of change Git sees: a brand-new file that Git is not tracking yet shows yellow "?" icons, while a file that was committed before and has since been edited shows a blue "M" (modified).
3. **Stage the file.** Staging is how you tell Git, "I want to include this change in my next snapshot." To do this, click the checkbox under the "Staged" column for the `analysis.R` file. For a new file, the icon will turn into a green "A" (added); for a previously committed file, it stays a blue "M".
### Committing a Change
Now that you've staged your change, you're ready to commit it to the project's history.
1. Click the **"Commit"** button at the top of the Git pane.
2. **Write a clear commit message.** A new window will open. In the top-right "Commit message" box, you need to write a short description of the change you made. Good commit messages are short, descriptive, and use the imperative mood (e.g., "Add," "Fix," "Change" instead of "Added," "Fixed," "Changed").
Type this message: `Add initial analysis comment`
3. **Commit the file.** Click the **"Commit"** button. A dialog box will pop up confirming the changes. You can close this.
4. **View the history.** Back in the main RStudio window, click the **"History"** button (it looks like a clock) at the top of the Git pane. You can now see your first commit! It includes a unique ID (the "SHA"), the author, the date, and the descriptive message you wrote.
**Congratulations, you've completed your first version control cycle!**
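The fields shown in the History view can also be printed from the terminal. The following is a minimal sketch, assuming Git is installed: it works in a throwaway directory created with `mktemp -d` so your real project is untouched, and the name `Demo User` and the email address are placeholders, not values from this workshop.

``` bash
# Work in a throwaway repository so the real project is not affected.
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init -q
git config user.name "Demo User"            # placeholder identity
git config user.email "demo@example.com"    # placeholder identity

# Recreate the first commit from this section.
echo "# This is our first tracked change." > analysis.R
git add analysis.R
git commit -q -m "Add initial analysis comment"

# Print the fields the History view shows: short SHA, author, date, message.
last_commit="$(git log -1 --date=short --pretty=format:'%h | %an | %ad | %s')"
echo "$last_commit"
```

The SHA and date will differ on every machine, which is why no exact output is shown here.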
## 3. Git on the Command Line (CLI) - A Quick Look (10 mins)
The RStudio Git tab is a fantastic visual tool, but it's just running Git commands for you in the background. Let's pull back the curtain and run those same commands ourselves.
1. **Open the Terminal.** In RStudio, there is a tab next to the "Console" tab labeled "Terminal". Click on it.
2. **Check the project status.** The single most useful Git command is `git status`. It tells you the current state of your repository.
**Action:** In the terminal, type `git status` and press Enter.
**Result:** It should say `nothing to commit, working tree clean`. This is because we just saved all our changes with the last commit.
3. **Make another change.** Go back to your `analysis.R` script and add a new line of code.
```{r}
#| eval: false
# This is our first tracked change.
print("Hello from R!")
```
4. **Check the status again.** Go back to the terminal and run `git status` again. Notice that Git now sees that `analysis.R` has been modified.
5. **Stage the change (CLI).** To stage the file using the command line, we use `git add`.
**Action:** In the terminal, type `git add .` and press Enter. The `.` means "the current directory," so this stages every new and changed file in that folder and its subfolders.
**Check:** Run `git status` one more time. You'll see the file is now listed under "Changes to be committed."
6. **Commit the change (CLI).** To commit, we use `git commit` with the `-m` flag to provide a message.
**Action:** In the terminal, type `git commit -m "Add print statement via CLI"` and press Enter.
**You've just done the exact same workflow as before, but using the command line!**
**Key Takeaway:** The RStudio Git tab is a user-friendly interface for running these fundamental Git commands. You can use whichever you prefer, but it's good to know what's happening behind the scenes.
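To see the modify, stage, and commit states side by side, `git status --short` prints a compact two-column code for each file. The sketch below is illustrative only and assumes Git is installed; it runs in a throwaway directory with a placeholder identity rather than in your workshop project.

``` bash
# Throwaway repository with one existing commit.
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init -q
git config user.name "Demo User"            # placeholder identity
git config user.email "demo@example.com"    # placeholder identity
echo "# This is our first tracked change." > analysis.R
git add analysis.R
git commit -q -m "Add initial analysis comment"

# 1. Modify: append a line to the tracked file.
echo 'print("Hello from R!")' >> analysis.R
after_edit="$(git status --short)"      # " M analysis.R": modified, not staged

# 2. Stage the change.
git add analysis.R
after_add="$(git status --short)"       # "M  analysis.R": staged

# 3. Commit the staged change.
git commit -q -m "Add print statement via CLI"
after_commit="$(git status --short)"    # empty: working tree clean

printf 'after edit:   [%s]\nafter add:    [%s]\nafter commit: [%s]\n' \
  "$after_edit" "$after_add" "$after_commit"
```

The left column describes the staging area and the right column describes the working files, so the `M` moves from right to left when you stage.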
## 4. Practice: Two More Commits (10 mins)
Let's reinforce what we've learned with two more examples. This time, you'll do one commit using the GUI and one using the command line.
### Example A: Adding Data Analysis Code (GUI Method)
1. **Add more code to your script.** Open your `analysis.R` file and add these lines:
```{r}
#| eval: false
# Load and explore data
data <- mtcars
summary(data)
```
2. **Check the Git tab.** You'll see `analysis.R` appears again with a blue "M" (modified).
3. **Review your changes (Optional but helpful!).** Before staging, click on the filename `analysis.R` in the Git tab. A "diff" window will open showing exactly what changed: lines in red were removed, lines in green were added. This is incredibly useful for reviewing your work before committing.
4. **Stage and commit.** Check the box to stage the file, click "Commit", and write the message: `Add data loading and summary`
::: callout-tip
## GUI Tip: Staging Specific Lines
If you made multiple unrelated changes to a file, you can actually stage only *some* of those changes! In the diff window, select the specific lines you want to commit and click the "Stage selection" button shown above that part of the diff. This helps keep each commit focused on one logical change.
:::
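The same review is available in the terminal. As a rough sketch (assuming Git is installed, and using a throwaway directory and placeholder identity), `git diff` shows changes that are not staged yet, and `git diff --staged` shows what the next commit will contain:

``` bash
# Throwaway repository with one existing commit.
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init -q
git config user.name "Demo User"            # placeholder identity
git config user.email "demo@example.com"    # placeholder identity
echo "# This is our first tracked change." > analysis.R
git add analysis.R
git commit -q -m "Add initial analysis comment"

# Append the lines from Example A.
printf '%s\n' '# Load and explore data' 'data <- mtcars' 'summary(data)' >> analysis.R

# Before staging: the new lines appear in `git diff`, prefixed with "+".
unstaged="$(git diff)"
echo "$unstaged"

git add analysis.R

# After staging: `git diff` is empty and the change moves to `git diff --staged`.
unstaged_after_add="$(git diff)"
staged="$(git diff --staged)"
echo "$staged"
```

For staging only part of a file from the terminal, `git add -p` walks through the changes interactively and asks which ones to stage.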
### Example B: Adding a Simple Plot (CLI Method)
1. **Add plotting code.** Add this to your `analysis.R` file:
```{r}
#| eval: false
# Create a simple plot
plot(data$mpg, data$hp,
     xlab = "Miles per Gallon",
     ylab = "Horsepower")
```
2. **Use the Terminal workflow:**
- Check status: `git status`
- Stage the file: `git add analysis.R` (this time we're being specific!)
- Check again: `git status`
- Commit: `git commit -m "Add scatter plot of mpg vs horsepower"`
3. **Verify your work.** Click the "History" button in the Git tab (or run `git log --oneline` in the terminal). You should now see all your commits listed!
::: callout-note
## When to Use GUI vs. CLI?
- **GUI:** Great for visual learners, reviewing changes before committing, and when you're just starting out.
- **CLI:** Faster once you learn it, more powerful for advanced operations, and works everywhere (even on remote servers).
Many experienced users mix both approaches: GUI for reviewing diffs, CLI for quick commits!
:::
## 5. Local-First vs. Remote-First Repositories
This is a perfect time to discuss the two main ways a Git project begins.
### Local-First (What we just did):
- **The process:** You start a project on your own computer first. You initialize a Git repository, work on your files, and create a history of commits, all locally.
- **The goal:** Later, you can decide to "push" this local project to a remote server like GitHub to back it up, share it, or collaborate.
- **Analogy:** You write a new document in Microsoft Word on your desktop. When you're ready, you upload it to OneDrive to share with your colleagues.
### Remote-First (Cloning):
- **The process:** The project already exists on a remote server like GitHub. Your first step is to "clone" it, which downloads a complete copy of the project and its entire version history onto your computer.
- **The goal:** This is the standard way to join a project that a colleague has already started or to contribute to an open-source project.
- **Analogy:** A colleague shares a document with you via OneDrive. You use the "Open in Desktop App" feature to download a local copy to your computer so you can start editing.
**In this workshop, we started "local-first." Later, in the collaboration section, you will experience the "remote-first" workflow when you clone your partner's repository.**
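The difference between the two starting points can be tried out without a network connection. In this sketch, an ordinary local folder (`partner_project`, a hypothetical name) stands in for a project that already exists on GitHub; the identity values are placeholders and Git is assumed to be installed.

``` bash
work_dir="$(mktemp -d)"

# Local-first: a repository that starts on this machine, with two commits.
git init -q "$work_dir/partner_project"
cd "$work_dir/partner_project"
git config user.name "Demo Partner"           # placeholder identity
git config user.email "partner@example.com"   # placeholder identity
echo "# Partner's analysis" > analysis.R
git add analysis.R
git commit -q -m "Add analysis script"
echo "summary(mtcars)" >> analysis.R
git commit -q -am "Add data summary"
local_remotes="$(git remote)"    # empty: a local-first repo has no remote yet

# Remote-first: cloning copies the files *and* the full commit history.
cd "$work_dir"
git clone -q partner_project my_copy
cd my_copy
cloned_commits="$(git rev-list --count HEAD)"
remote_name="$(git remote)"      # a clone remembers its source as "origin"

echo "commits in clone: $cloned_commits, remote: $remote_name"
```

With a real GitHub project, the only change would be the clone source: a GitHub URL instead of a local folder.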
## 6. Publishing Your Repository to GitHub (20 mins)
Now that we have a local repository with a commit history, let's back it up and share it by publishing to GitHub.com. This requires three steps: setting up authentication with SSH keys, creating a remote repository, and connecting your local repository to it.
### Step 1: Set Up SSH Authentication
SSH (Secure Shell) is a secure way to connect your computer to GitHub. Think of it as a special key that proves your identity without typing your password every time.
::: callout-note
## Alternative: Personal Access Tokens (PATs)
Personal Access Tokens are an alternative authentication method using HTTPS instead of SSH. However, they are normally created with an expiration date and must be regenerated when they expire, which can disrupt your workflow. **SSH is strongly preferred for long-term research projects** because it's a one-time setup.
If SSH doesn't work due to network/firewall restrictions, GitHub Desktop (which uses HTTPS) is a good backup option, or consult your instructor about PAT setup.
:::
#### Check for Existing SSH Keys
First, let's see if you already have SSH keys on your computer.
**In the RStudio Terminal**, run:
``` bash
ls ~/.ssh
```
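If you see files named `id_ed25519` or `id_rsa` (along with matching `.pub` files), you already have keys! Copy your public key to the clipboard (the last step under "Generate a New SSH Key" for your operating system), then skip to "Add Your SSH Key to GitHub" below. If not, continue to generate new keys.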
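If you would rather not read the directory listing by eye, the check can be scripted. This is a small sketch, not part of the workshop materials: `list_public_keys` is a hypothetical helper that only looks for the two default file names mentioned above.

``` bash
# Print any public keys with the default names in a directory (default: ~/.ssh).
list_public_keys() {
  local ssh_dir="${1:-$HOME/.ssh}"
  local name
  for name in id_ed25519 id_rsa; do
    if [ -f "$ssh_dir/$name.pub" ]; then
      echo "$ssh_dir/$name.pub"
    fi
  done
}

found="$(list_public_keys)"
if [ -n "$found" ]; then
  echo "Existing public key(s):"
  echo "$found"
else
  echo "No default-named keys found; generate a new pair."
fi
```

Keys saved under a custom file name will not be detected by this helper, so a manual look at `ls ~/.ssh` is still worthwhile.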
#### Generate a New SSH Key
::::: panel-tabset
## macOS/Linux
1. **Generate the key.** In the Terminal, run this command (replace with your actual GitHub email):
``` bash
ssh-keygen -t ed25519 -C "your_email@example.com"
```
2. **Save the key.** When prompted "Enter a file in which to save the key," press **Enter** to accept the default location (`~/.ssh/id_ed25519`).
3. **Set a passphrase (optional).** You'll be asked to enter a passphrase. This adds extra security but is optional. You can press **Enter** twice to skip it, or type a passphrase you'll remember.
4. **Start the SSH agent:**
``` bash
eval "$(ssh-agent -s)"
```
You should see something like `Agent pid 12345`.
5. **Add your key to the SSH agent:**
``` bash
ssh-add ~/.ssh/id_ed25519
```
6. **Copy your public key to the clipboard:**
``` bash
pbcopy < ~/.ssh/id_ed25519.pub
```
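Your key is now copied and ready to paste into GitHub! Note that `pbcopy` is macOS-only; on Linux, run `cat ~/.ssh/id_ed25519.pub` and manually select and copy the printed line.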
## Windows
::: callout-note
## Windows Users: Use Git Bash or RStudio Terminal
These commands work in **Git Bash** (installed with Git for Windows) or the **RStudio Terminal**. If you're using Command Prompt or PowerShell, some commands may differ. We recommend using the RStudio Terminal for consistency.
:::
1. **Generate the key.** In the Terminal, run this command (replace with your actual GitHub email):
``` bash
ssh-keygen -t rsa -b 4096 -C "your_email@example.com"
```
2. **Save the key.** When prompted "Enter a file in which to save the key," press **Enter** to accept the default location.
You'll see: `Your identification has been saved in /c/Users/YourName/.ssh/id_rsa`
3. **Set a passphrase (optional).** You'll be asked to enter a passphrase. This adds extra security but is optional.
- To skip: Press **Enter** twice
- To add security: Type a passphrase you'll remember, press **Enter**, then type it again
4. **Start the SSH agent:**
``` bash
eval "$(ssh-agent -s)"
```
You should see something like `Agent pid 12345`.
5. **Add your key to the SSH agent:**
``` bash
ssh-add ~/.ssh/id_rsa
```
You should see: `Identity added: /c/Users/YourName/.ssh/id_rsa`
6. **Copy your public key to the clipboard:**
**Option 1 (Git Bash/RStudio Terminal):**
``` bash
cat ~/.ssh/id_rsa.pub | clip
```
**Option 2 (If clip doesn't work):**
``` bash
cat ~/.ssh/id_rsa.pub
```
Then manually select and copy the entire output (it starts with `ssh-rsa` and ends with your email).
**Option 3 (PowerShell users):**
``` powershell
Get-Content ~/.ssh/id_rsa.pub | Set-Clipboard
```
Your key is now copied and ready to paste into GitHub!
::: callout-warning
## Windows Troubleshooting
If `ssh-agent` won't start, try opening RStudio or Git Bash **as Administrator**. Right-click the application icon and select "Run as administrator."
If you see "Could not open a connection to your authentication agent," run this first:
``` bash
ssh-agent bash
```
Then retry the `ssh-add` command.
:::
:::::
::: callout-important
## Keep Your Keys Safe!
- The `.pub` file is your **public key** (safe to share with GitHub)
- The file without `.pub` is your **private key** (never share this!)
- If someone asks for your SSH key, only ever give them the public one
:::
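To make the public/private distinction concrete, the sketch below generates a disposable key pair in a temporary folder and looks at both files. It assumes `ssh-keygen` is available; the file name `id_demo` and the email are placeholders, and the pair is deleted at the end so it never touches your real `~/.ssh` directory.

``` bash
key_dir="$(mktemp -d)"

# Generate a disposable pair (empty passphrase) purely for illustration.
ssh-keygen -q -t ed25519 -N "" -C "demo@example.com" -f "$key_dir/id_demo"

# Public key: a single line of the form "<type> <base64 data> <comment>".
key_type="$(cut -d ' ' -f 1 "$key_dir/id_demo.pub")"
key_comment="$(cut -d ' ' -f 3 "$key_dir/id_demo.pub")"
echo "public key type: $key_type, comment: $key_comment"

# Private key: a multi-line block that must stay on your machine.
private_header="$(head -n 1 "$key_dir/id_demo")"
echo "private key starts with: $private_header"

# Clean up the disposable pair.
rm -r "$key_dir"
```

If something you are about to paste starts with `-----BEGIN`, it is the private key; stop and copy the `.pub` file instead.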
#### Add Your SSH Key to GitHub
Now we'll add your public key (which is copied to your clipboard) to your GitHub account:
1. Go to [github.com](https://github.com) and log in to your account
2. Click on your **profile photo** in the upper-right corner
3. Click **Settings**
4. In the left sidebar, click **SSH and GPG keys**
5. Click the green **New SSH key** button
6. In the **Title** field, add a descriptive name like "My RStudio Laptop"
7. In the **Key** field, paste your key (Cmd+V on Mac, Ctrl+V on Windows)
8. Click **Add SSH key**
9. If prompted, confirm your GitHub password
#### Test Your Connection
Back in the RStudio Terminal, run:
``` bash
ssh -T git@github.com
```
You may see a warning about authenticity - type `yes` and press Enter.
You should see a message like: `Hi username! You've successfully authenticated...`
**Success!** Your computer can now securely talk to GitHub.
### Step 2: Create a Repository on GitHub.com
Now let's create a home for your local repository on GitHub:
1. Go to [github.com](https://github.com) and log in
2. Click on your **profile picture** in the top-right corner
3. Select **Your repositories**
4. Click the green **New** button
5. **Repository name:** Enter a name (e.g., `ARCS-workshop`)
6. **Description (optional):** Add a brief description like "Reproducibility workshop project"
7. **Visibility:** Choose **Private** (or Public if you want others to see it)
8. **Important:** Do NOT check "Add a README file" - we already have files locally, and the remote must start out empty so that your first push is not rejected!
9. Click the green **Create repository** button
### Step 3: Connect Your Local Repository to GitHub
GitHub will show you a page with setup instructions. We'll use the commands for "push an existing repository from the command line."
In your RStudio Terminal, run these commands one at a time:
1. **Add the remote connection** (replace `username` and `yourreponame` with yours):
``` bash
git remote add origin git@github.com:username/yourreponame.git
```
This tells Git where the remote repository lives.
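2. **Name your current branch `main`** (the `-M` flag renames the branch you are on, so this works whether it is currently called `master` or `main`):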
``` bash
git branch -M main
```
3. **Push your commits to GitHub:**
``` bash
git push -u origin main
```
The `-u` flag sets up tracking so future pushes are easier.
You should see output showing your commits being uploaded. When it finishes, refresh your GitHub repository page in your browser - you should see all your files and commit history!
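These three commands can be rehearsed offline before touching GitHub. In the sketch below, a local bare repository (`remote.git`, a hypothetical stand-in) plays the role of the empty repository you created on GitHub.com; Git is assumed to be installed and the identity values are placeholders.

``` bash
work_dir="$(mktemp -d)"

# Stand-in for the empty repository created on GitHub.
git init -q --bare "$work_dir/remote.git"

# A local repository with one commit, like the workshop project.
git init -q "$work_dir/project"
cd "$work_dir/project"
git config user.name "Demo User"            # placeholder identity
git config user.email "demo@example.com"    # placeholder identity
echo "# This is our first tracked change." > analysis.R
git add analysis.R
git commit -q -m "Add initial analysis comment"

# The same three commands as above, pointed at the stand-in remote.
git remote add origin "$work_dir/remote.git"
git branch -M main
git push -q -u origin main

# Confirm where "origin" points and which remote branch "main" now tracks.
git remote -v
upstream="$(git rev-parse --abbrev-ref '@{upstream}')"
echo "main tracks: $upstream"
```

Running `git remote -v` in your real project is a quick way to confirm that `origin` uses the `git@github.com:` SSH form rather than an `https://` URL.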
::: callout-tip
## Troubleshooting: Using GitHub Desktop as a Backup Method
If you encounter authorization issues with the command line, you can use GitHub Desktop as an alternative:
1. Download and install [GitHub Desktop](https://desktop.github.com/)
2. Sign in with your GitHub account
3. Click "Add" → "Add Existing Repository"
4. Navigate to your project folder
5. Click "Publish repository" to upload to GitHub.com
This method uses HTTPS authentication instead of SSH.
:::
**Congratulations!** Your local repository is now backed up on GitHub. Any changes you commit locally can be pushed to GitHub with `git push`, and you can access your code from anywhere.
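The ongoing commit-then-push cycle can be sketched the same way. As before, this is only an illustration under stated assumptions: a local bare repository (`remote.git`, hypothetical) stands in for GitHub, the identity is a placeholder, and Git is assumed to be installed.

``` bash
work_dir="$(mktemp -d)"

# Stand-in remote plus a local project that has already been published to it.
git init -q --bare "$work_dir/remote.git"
git init -q "$work_dir/project"
cd "$work_dir/project"
git config user.name "Demo User"            # placeholder identity
git config user.email "demo@example.com"    # placeholder identity
echo "# This is our first tracked change." > analysis.R
git add analysis.R
git commit -q -m "Add initial analysis comment"
git remote add origin "$work_dir/remote.git"
git branch -M main
git push -q -u origin main

# Make and commit a further change locally.
echo 'print("Hello from R!")' >> analysis.R
git commit -q -am "Add print statement"

# Count commits that exist locally but are not on the remote yet.
ahead_before="$(git rev-list --count origin/main..main)"

# Because of the earlier -u, a plain `git push` knows where to send them.
git push -q
ahead_after="$(git rev-list --count origin/main..main)"

echo "ahead before push: $ahead_before, after push: $ahead_after"
```

A commit is only backed up once it has been pushed; `git status` reports how many commits your branch is ahead of `origin/main`.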