8 changes: 6 additions & 2 deletions azure/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,7 @@
> [<img src="https://aka.ms/deploytoazurebutton"/>](https://portal.azure.com/#create/Microsoft.Template/uri/https%3A%2F%2Fconservationmetrics.github.io%2Fgc-deploy%2Fazure%2Fnew-vm.arm.json)
2. Fill in required parameters:
- **Subscription:** Select the subscription you want to use.
- **Resource Group:** Recommend creating new, so the only thing in the resource group is this Guardian Deployment deployment. See also ["Prerequisites"](#prerequisites) below for discussion about permission requirements.
- CMI's convention is to use `guardian-<alias>` for the resource group name, where `<alias>` is the alias chosen by the community.
- **Region:** Where will this stack be hosted? e.g. for data about Brazil, choose "`Brazil South`" to adhere to [Brazilian Data Protection Laws](https://www.gov.br/esporte/pt-br/acesso-a-informacao/lgpd). The Instance (VM) "Region" will be the same as the Resource group's region.
- **Create Storage Account / Storage Account Name:** See ["Configuring Azure Files"](#configuring-azure-files-optional) below.
Expand Down Expand Up @@ -140,7 +140,7 @@ Azure Backup can automatically back up the VM's disks (OS disk and any data disk
- Azure Files shares (these have their own backup/redundancy options)
- External database servers

**Setting up backups:**
#### Setting up backups

It's recommended to use the ARM template to configure backup by providing a Recovery Services Vault during initial VM deployment. A vault must already exist in the same region as the VM.

Expand All @@ -151,6 +151,10 @@ When expanding to a new region, one-time instructions to create a Recovery Servi
2. Choose the same region as your VMs
3. We recommend creating an Enhanced Policy with a distinguishable name such as "GuardianConnectorPolicy"

#### Recovery

To restore from a backup, see our ["Recover from Backup"](backup-recovery.md) documentation.


## 🛠️ Building the Template

Expand Down
137 changes: 137 additions & 0 deletions azure/backup-recovery.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,137 @@
# Recover from Backup

This documentation assumes you've set up Azure Backup on a VM (which needs to have been done manually).


![Available restore options on Azure](restore-options.png)

Use [Recover VM](#recover-vm) when you need to quickly restore an entire disk or the VM itself.

Use [File Recovery](#file-recovery) for targeted extraction of specific files (rate-limited to 1GB/hour).


## Recover VM

Use Recover VM when you need to quickly restore an entire disk or the VM itself.

There are [at least](https://learn.microsoft.com/en-us/azure/backup/backup-azure-arm-restore-vms#restore-options) two ways to recover an entire hard disk from backup. Both are described in detail below; we recommend "Replace existing disks" as the more common case.

### Prerequisite

Whichever method you choose, the restore operation will require a Staging Storage Account:
- This storage account must use flat blob storage, i.e. "hierarchical namespace" must be disabled, so it cannot be an ADLS Gen2 ("ADLSv2") account. Note that our community data warehouse accounts are all ADLSv2, so they cannot be used for this.
- **It needs to be in the same Azure region and the same Subscription as the VM.**
- CMI Staff: If your VM is in East US 2, use storage account `vmrestorestaginguseast2` or `vmrestorestaguseast2`, which we keep on standby for you (they are in different Subscriptions). To restore VMs in other regions, however, you need to create a new storage account.
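
The three requirements above can be checked mechanically before starting a restore. The following is a minimal sketch, not part of the repository: the function name and the plain dictionaries are hypothetical, and the keys (`isHnsEnabled`, `location`, `subscription`) are assumptions about how you would record the account and VM properties.

```python
def staging_account_problems(account: dict, vm: dict) -> list:
    """Return the reasons `account` cannot serve as a staging account for `vm`.

    Both arguments are plain dicts with hypothetical keys:
    account: isHnsEnabled, location, subscription
    vm:      location, subscription
    """
    def norm(region: str) -> str:
        # Treat "East US 2" and "eastus2" as the same region.
        return region.replace(" ", "").lower()

    problems = []
    if account.get("isHnsEnabled"):
        problems.append("hierarchical namespace is enabled (ADLSv2)")
    if norm(account["location"]) != norm(vm["location"]):
        problems.append("storage account is in a different region than the VM")
    if account["subscription"] != vm["subscription"]:
        problems.append("storage account is in a different subscription than the VM")
    return problems


# Example with made-up values: a data warehouse account is rejected.
vm = {"location": "eastus2", "subscription": "sub-a"}
warehouse = {"isHnsEnabled": True, "location": "East US 2", "subscription": "sub-a"}
print(staging_account_problems(warehouse, vm))
```

An empty list means the account satisfies all three requirements listed above.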

### Restore target: Replace existing disks

This is likely the option you want.

It restores a backed up hard disk to a new disk, which will replace the disk on the existing VM.

Use when:
- The VM still exists and is accessible. If it's been deleted, this option can't be used.
- You want to preserve the VM's identity, for example its IP address (and therefore DNS settings) and its resource name in Azure.
- Minimal reconfiguration needed post-restore

#### How to do it

1. The VM must be in the deallocated state for the replace-disks operation. In the Azure Portal, find the VM, click "Stop", and wait for the operation to finish.
1. In Azure Portal, under the VM, click "Backup".
1. Click **"Recover VM"**
1. Select a restore point, then set **Restore target** to **"Replace Existing"**.
1. Select the Staging Storage Account from above. It's suggested to uncheck "Skip pre-restore backup" but use your gut.
1. Click the **Restore** button. You will be routed back to the VM overview page while the operation continues in the background.
1. After a minute or two, you should see a message "Restore triggered successfully. Please monitor progress in backup jobs page" if things went according to plan.

Member

I think that after this, we should document the next set of questions e.g.

  • Restore Type: Select "Create new virtual machine".
  • Virtual machine name: Give a sensible name.
  • Resource Group: Select an existing Resource group (you cannot create a new one). So you can either select your existing VM Resource Group or create a new one.
    • CMI recommends creating a new one using Create a resource group in Azure.
    • (Right? Just to eliminate any risk of something going wrong.)
  • Virtual network: Select an existing Resource group (you cannot create a new one). So you can either select your existing VM Resource Group or create a new one.
    • CMI recommends creating a new one by navigating to Virtual networks and then Create (assigned to your chosen Resource Group).
    • (Right? Just to eliminate any risk of something going wrong.)
  • Subnet: leave as "default"
  • Staging Location: select an appropriate temporary staging location.
    • CMI recommends creating and using an on-standby location instead of a production storage location used by a Guardian Connector instance.

Member

Apologies -- this should be for "Create new VM"

Contributor Author

Added, and made more opinionated by me.

Follow-up:

- Azure took a pre-replacement snapshot (retained in Backups). If you ever need to undo the restore, you may be able to use it.
- The original hard disk is retained in the resource group. Once you have confirmed the recovery was successful, delete this disk manually.


### Restore target: Create new VM

Quickly spins up a new VM from a restore point.
You will specify a new name for the new VM and select the resource group and virtual network (VNet) in which it will be placed.

Use when:
- You need the old VM running alongside the restored version (e.g., for comparison, staged migration)
- You want to restore to a different VNet or resource group (must remain in same region)
- You are just testing recovery.

#### How to do it

In Azure Portal, under the VM, click "Backup", then **"Recover VM"**, and select a restore point. Then fill in the form as follows:

* **Restore Type**: Select "Create new virtual machine".
* **Virtual machine name**: Give a sensible name.
* **Resource Group**: Select an existing Resource Group (you cannot create a new one from this form). CMI recommends creating a new one in a different browser window, but you may select the old VM's existing Resource Group.
* **Virtual network**: Select an existing Virtual Network (you cannot create a new one from this form). CMI recommends creating a new one in a different browser window, but you may select the old VM's existing Virtual Network.
* **Subnet**: leave as "default"
* **Staging Location**: select the storage account described above.

Please note that only the VM is recovered. After the restore finishes, you will need to manually re-configure Azure network interface settings such as **firewall rules**.


# File Recovery

Use File Recovery for targeted extraction of specific files, such as the PostgreSQL data dir or an environment variable from `config-captain.json`. It's rate-limited to 1GB/hour, so it is not a reasonable solution for full VM recovery.
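
To judge whether File Recovery is practical for a given amount of data, a rough estimate helps. This is a minimal sketch that simply divides the size by the 1GB/hour limit quoted above; the function name is hypothetical, and treating 1 GB as 10^9 bytes is an assumption.

```python
BYTES_PER_GB = 10**9  # assumption: decimal gigabytes


def file_recovery_hours(size_bytes: int, rate_gb_per_hour: float = 1.0) -> float:
    """Estimate how many hours copying `size_bytes` takes at the given rate."""
    return (size_bytes / BYTES_PER_GB) / rate_gb_per_hour


# A small config file is effectively instant; a 40 GB data directory is not.
print(file_recovery_hours(200_000))        # a few hundred kilobytes
print(file_recovery_hours(40 * 10**9))     # roughly 40 hours
```

If the estimate runs into many hours, [Recover VM](#recover-vm) is probably the better fit.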

[Official documentation](https://learn.microsoft.com/en-us/azure/backup/backup-azure-restore-files-from-vm)

## What it does

It mounts every volume from the remote backup (the recovery point) as a local volume on the VM. Example:

    ************ Volumes of the recovery point and their mount paths on this machine ************
    Sr.No. | Disk | Volume | MountPath
    1) | /dev/sdc | /dev/sdc1 | /home/cmiadmin/demo-20260120174605/Volume1
    2) | /dev/sdc | /dev/sdc15 | /home/cmiadmin/demo-20260120174605/Volume3
    3) | /dev/sdc | /dev/sdc16 | /home/cmiadmin/demo-20260120174605/Volume4
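
If you want to work with these mount paths in a script, the table can be parsed into records. This is a minimal sketch that assumes the output keeps the pipe-separated layout shown in the example above; `RecoveredVolume` and `parse_volume_table` are hypothetical names, not part of the Azure script.

```python
from typing import NamedTuple


class RecoveredVolume(NamedTuple):
    number: int
    disk: str
    volume: str
    mount_path: str


def parse_volume_table(output: str) -> list:
    """Extract rows like '1) | /dev/sdc | /dev/sdc1 | /path' from the script output."""
    volumes = []
    for line in output.splitlines():
        parts = [part.strip() for part in line.split("|")]
        # Data rows have four columns and start with a number followed by ")".
        if len(parts) != 4 or not parts[0].endswith(")"):
            continue
        number = parts[0].rstrip(")")
        if not number.isdigit():
            continue
        volumes.append(RecoveredVolume(int(number), parts[1], parts[2], parts[3]))
    return volumes


sample = """\
Sr.No. | Disk | Volume | MountPath
1) | /dev/sdc | /dev/sdc1 | /home/cmiadmin/demo-20260120174605/Volume1
2) | /dev/sdc | /dev/sdc15 | /home/cmiadmin/demo-20260120174605/Volume3
"""
for vol in parse_volume_table(sample):
    print(vol.number, vol.mount_path)
```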

Note that transfer of files from these volumes is rate-limited (1GB/hour) so use only if you need to recover a targeted few files.

These all get mounted as the `root` user, so to get into them you need to log in as root:

    $ sudo su -
    # cd /home/cmiadmin

In practice only ONE of these volume backups is the OS Disk, i.e. only one is useful to you (it was `Volume1` when I tried it).

    # ls demo-20260120174605/Volume1/
    bin bin.usr-is-merged boot captain dev etc home lib lib.usr-is-merged
    lib64 lost+found media mnt opt proc root run sbin sbin.usr-is-merged
    snap srv sys tmp usr var
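
Rather than listing each volume by hand, you could look for the one that contains a typical Linux root layout. This is a minimal sketch; the marker directories (`etc`, `usr`, `var`) are an assumption based on the listing above, and the function name is hypothetical. It needs to run as root, since the volumes are mounted as `root`.

```python
from pathlib import Path

# Assumption: a volume holding the OS disk contains these top-level directories.
OS_ROOT_MARKERS = ("etc", "usr", "var")


def find_os_volumes(recovery_dir) -> list:
    """Return the mounted volumes under `recovery_dir` that look like an OS root."""
    root = Path(recovery_dir)
    return sorted(
        vol
        for vol in root.iterdir()
        if vol.is_dir() and all((vol / marker).is_dir() for marker in OS_ROOT_MARKERS)
    )


if __name__ == "__main__":
    import sys

    # Usage: python3 find_os_volume.py /home/cmiadmin/demo-20260120174605
    if len(sys.argv) > 1:
        for volume in find_os_volumes(sys.argv[1]):
            print(volume)
```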

### Examples

If your CapRover configuration got corrupted, you could copy `Volume1/captain/data/config-captain.json` onto your primary OS Disk.

If your PostgreSQL database got corrupted, you could copy `Volume1/var/lib/docker/volumes/captain--postgres-redis-data` onto your primary OS Disk.
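
When copying a single recovered file back, it is worth keeping the current (possibly corrupted) file in case the restore makes things worse. This is a minimal sketch: `restore_file` and the `.pre-restore` suffix are hypothetical, and the live path `/captain/data/config-captain.json` in the usage comment is inferred from the volume listing, not confirmed. You would likely also want to stop the affected service before copying.

```python
import shutil
from pathlib import Path
from typing import Optional


def restore_file(recovered, live) -> Optional[Path]:
    """Copy `recovered` over `live`, first saving the current `live` file.

    Returns the path of the saved copy, or None if `live` did not exist.
    """
    recovered, live = Path(recovered), Path(live)
    saved = None
    if live.exists():
        saved = live.with_name(live.name + ".pre-restore")
        shutil.copy2(live, saved)  # copy2 also preserves timestamps and mode
    live.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(recovered, live)
    return saved


# Hypothetical usage, run as root on the VM:
# restore_file(
#     "/home/cmiadmin/demo-20260120174605/Volume1/captain/data/config-captain.json",
#     "/captain/data/config-captain.json",
# )
```

This covers single files only; a whole directory such as the PostgreSQL volume would need a recursive copy instead.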


## How to do it

1. In Azure Portal, find the VM. Click "Backup".
1. Click **"File Recovery"**
1. Select a restore point and click **Download Script**. The portal also shows a password that the script will ask for later.
1. Get the script onto the VM:

       scp ./largedisk_0_guardian-XXX.py cmiadmin@captain.XXX.guardianconnector.net:/home/cmiadmin/

1. SSH into the VM:

       ssh cmiadmin@captain.XXX.guardianconnector.net

1. Run the script. It will prompt you for the VM admin password, and also for the password shown in the portal.

       python3 ./largedisk_0_guardian-XXX.py

1. As shown above, `sudo su -` and `cd` into the backed-up disk volume.


## How to clean up afterwards

After recovery, remove the disks and close the connection to the recovery point by clicking the "Unmount Disks" button in the portal, or by using the relevant unmount command if you are using PowerShell or the CLI.

After unmounting disks, run the script with the parameter 'clean' to remove the mount paths of the recovery point from this machine.

    python3 ./largedisk_0_guardian-XXX.py clean

Binary file added azure/restore-options.png