Commit 43f7fe3

Merge pull request #100 from daos-stack/develop
DAOSGCP-218 Merge develop to main for v0.5.0

Signed-off-by: Mark Olson <115657904+mark-olson@users.noreply.github.com>
2 parents 80d8b18 + 75c61db commit 43f7fe3

50 files changed

Lines changed: 447 additions & 271 deletions

.gitignore

Lines changed: 3 additions & 0 deletions
@@ -1,3 +1,6 @@
+# Direnv
+.envrc
+
 # Local .terraform directories
 **/.terraform/
 **/.terraform/*

.tflint.hcl

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 plugin "google" {
 enabled = true
-version = "0.16.1"
+version = "0.26.0"
 source = "github.com/terraform-linters/tflint-ruleset-google"
 }
 rule "terraform_deprecated_index" {

docs/deploy_daos_cluster_example.md

Lines changed: 22 additions & 39 deletions
@@ -7,11 +7,19 @@ These instructions describe how to deploy a DAOS Cluster using the example in [t
 Deployment tasks described in these instructions:

 - Deploy a DAOS cluster using Terraform
+- Log into the first DAOS client instance
 - Perform DAOS administrative tasks to prepare the storage
-- Mount a DAOS container with [DFuse (DAOS FUSE)](https://docs.daos.io/v2.0/user/filesystem/?h=dfuse#dfuse-daos-fuse)
+- Mount a DAOS container with [DFuse (DAOS FUSE)](https://docs.daos.io/v2.4/user/filesystem/?h=dfuse#dfuse-daos-fuse)
 - Store files in a DAOS container
 - Unmount the container
-- Remove the deployment (terraform destroy)
+- Undeploy DAOS cluster (terraform destroy)
+
+## Prerequisites
+
+The steps in the [Pre-Deployment Guide](pre-deployment_guide.md) must be completed prior to deploying the DAOS cluster in this example.
+
+The [Pre-Deployment Guide](pre-deployment_guide.md) describes how to build the DAOS images that are used to deploy server and client instances.
+

 ## Clone the repository

@@ -25,7 +33,7 @@ cd ~/google-cloud-daos/terraform/examples/daos_cluster

 ## Create a `terraform.tfvars` file

-Before you run `terraform` you need to create a `terraform.tfvars` file in the `terraform/examples/daos_cluster` directory.
+Before you run `terraform apply` to deploy the DAOS cluster you need to create a `terraform.tfvars` file in the `terraform/examples/daos_cluster` directory.

 The `terraform.tfvars` file contains the variable values for the configuration.

@@ -111,23 +119,19 @@ gcloud compute instances list \
 --format="value(name,INTERNAL_IP)"
 ```

-## Perform DAOS administration tasks
-
-After your DAOS cluster has been deployed you can log into the first DAOS server instance to perform administrative tasks.
-
-### Log into the first DAOS server instance
+## Log into the first DAOS client instance

 Log into the first server instance

 ```bash
-gcloud compute ssh daos-server-0001
+gcloud compute ssh daos-client-0001
 ```

-### Verify that all daos-server instances have joined
+## Perform DAOS administration tasks

-The DAOS Management Tool `dmg` is meant to be used by administrators to manage the DAOS storage system and pools.
+The `dmg` command is used to perform adminstrative tasks such as formatting storage and managing pools and therefore must be run with `sudo`.

-You will need to run `dmg` with `sudo`.
+### Verify that all daos-server instances have joined

 Use `dmg` to verify that the DAOS storage system is ready.

@@ -172,9 +176,7 @@ This shows how much NVMe-Free space is available for each server.
 Create a pool named `pool1` that uses the total NVMe-Free for all servers.

 ```bash
-TOTAL_NVME_FREE="$(sudo dmg storage query usage | awk '{split($0,a," "); sum += a[10]} END {print sum}')TB"
-echo "Total NVMe-Free: ${TOTAL_NVME_FREE}"
-sudo dmg pool create --size="${TOTAL_NVME_FREE}" --tier-ratio=3 --label=pool1
+sudo dmg pool create --size="100%" pool1
 ```

 View the ACLs on *pool1*
@@ -193,44 +195,23 @@ A:G:GROUP@:rw

 Here we see that root owns the pool.

-Add an [ACE](https://docs.daos.io/v2.0/admin/pool_operations/#adding-and-updating-aces) that will allow any user to create a container in the pool
+Add an [ACE](https://docs.daos.io/v2.4/admin/pool_operations/#adding-and-updating-aces) that will allow any user to create a container in the pool

 ```bash
 sudo dmg pool update-acl -e A::EVERYONE@:rcta pool1
 ```

-This completes the administration tasks for the pool.
-
 For more information about pools see

 - [Overview - Storage Model - DAOS Pool](https://docs.daos.io/latest/overview/storage/#daos-pool)
 - [Administration Guide - Pool Operations](https://docs.daos.io/latest/admin/pool_operations/)

-### Log out of the first server instance
-
-Now that the administrative tasks have been completed, you may log out of the first server instance.
-
-```bash
-logout
-```
-
 ## Create a Container

-User tasks such as creating and mounting a container will be done on the first client
-
-### Log into the first DAOS client instance
-
-Log into the first client instance
-
-```bash
-gcloud compute ssh daos-client-0001
-```
-
-
 Create a [container](https://docs.daos.io/latest/overview/storage/#daos-container) in the pool

 ```bash
-daos container create --type=POSIX --properties=rf:0 --label=cont1 pool1
+daos container create --type=POSIX --properties=rf:0 pool1 cont1
 ```

 For more information about containers see
@@ -261,8 +242,10 @@ Create a 20GiB file which will be stored in the DAOS filesystem.

 ```bash
 cd ${HOME}/daos/cont1
+
+# Create a 20GB file
 time LD_PRELOAD=/usr/lib64/libioil.so \
-dd if=/dev/zero of=./test21G.img bs=1G count=20
+dd if=/dev/zero of=./test20.img bs=1G count=20
 ```

 ## Unmount the container and logout of the first client
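
The last hunk adds a `# Create a 20GB file` comment while the hunk header text says 20GiB. As a quick check of what `bs=1G count=20` writes, the sketch below does the arithmetic in plain shell. It assumes GNU `dd` semantics, where the `G` suffix means 1024^3 bytes; the variable names are illustrative and not part of the commit.

```bash
#!/usr/bin/env bash
# Assumption: GNU dd, where bs=1G means 1024*1024*1024 bytes per block.
block_bytes=$((1024 * 1024 * 1024))
count=20
total_bytes=$((block_bytes * count))

echo "total bytes: ${total_bytes}"
echo "size in GiB (2^30 bytes): $((total_bytes / 1024 / 1024 / 1024))"
echo "size in GB (10^9 bytes), truncated: $((total_bytes / 1000 / 1000 / 1000))"
```

Under that assumption the file is exactly 20 GiB, which is a little over 21 GB in decimal units.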

docs/pre-deployment_guide.md

Lines changed: 1 addition & 1 deletion
@@ -20,7 +20,6 @@ Since *project name* and *project ID* are used in many configurations it is reco

 To create a project, refer to the following documentation

-- [Get Started with Google Cloud](https://cloud.google.com/docs/get-started)
 - [Creating and managing projects](https://cloud.google.com/resource-manager/docs/creating-managing-projects)

 Make note of the *Project Name* and *Project ID* for the project that you plan to use for your DAOS deployment as you will be using it later in various configurations.
@@ -152,6 +151,7 @@ If you are currently in Cloud Shell, you don't need to run this command.

 ```bash
 gcloud auth login
+gcloud auth application-default login
 ```

 To learn more about using the Google Cloud CLI see the various [How-to Guides](https://cloud.google.com/sdk/docs/how-to).

images/README.md

Lines changed: 59 additions & 34 deletions
@@ -1,20 +1,27 @@
 # Images

-This directory contains files necessary for building DAOS images using [Cloud Build](https://cloud.google.com/build) and [Packer](https://developer.hashicorp.com/packer/downloads).
+This directory contains files necessary for building DAOS images using
+[Cloud Build](https://cloud.google.com/build) and
+[Packer](https://developer.hashicorp.com/packer/downloads).

 ## Pre-Deployment steps required

-If you have not done so yet, please complete the steps in [Pre-Deployment Guide](../docs/pre-deployment_guide.md).
+If you have not done so yet, please complete the steps in the
+[Pre-Deployment Guide](../docs/pre-deployment_guide.md).

-The pre-deployment steps will have you run the `images/build.sh` script once in order to build a DAOS server image and a DAOS client image with the configured default settings.
+The pre-deployment steps will have you run the `images/build.sh` script once in
+order to build a DAOS server image and a DAOS client image with the configured
+default settings.

-That should be all you need to run the Terraform examples in the `terraform/examples` directory or to run the [DAOS examples in the Google HPC Toolkit](https://github.com/GoogleCloudPlatform/hpc-toolkit/tree/main/community/examples/intel).
+That should be all you need to run the Terraform examples in
+the `terraform/examples` directory or to run the [DAOS examples in the Google HPC Toolkit](https://github.com/GoogleCloudPlatform/hpc-toolkit/tree/main/community/examples/intel).

-The information in this document is provided in case you need to build custom images with non-default settings.
+The information in this document is provided in case you need to build custom
+images with non-default settings.

 ## Building DAOS images

-To rebuild the images with the default settings run:
+To build the images with the default settings run:

 ```bash
 cd images
@@ -23,13 +30,32 @@ cd images

 ## The Packer HCL template file

-A single Packer HCL template file `daos.pkr.hcl` is used to build either a DAOS server or DAOS client image.
+A single Packer HCL template file `daos.pkr.hcl` is used to build either a DAOS
+server or DAOS client image.

-The `daos.pkr.hcl` file does not build both server and client images in a single `packer build` run. This is by design since there are use cases in which only one type of image is needed. If both types of images are needed, then `packer build` must be run twice with different variable values.
+The `daos.pkr.hcl` file does not build both server and client images in a single `packer build` run.
+This is by design since there are use cases in which only one type of image is needed. If both types
+of images are needed, then `packer build` must be run twice with different variable values.
+
+The `build.sh` script does this for you by running packer twice with different variable values for
+server and client images.

 ### Source Block

-Within the `daos.pkr.hcl` template there is a single `source` block. Most of the settings for the block are set by variable values.
+Within the `daos.pkr.hcl` template there is a single `source` block. The settings
+settings for the block are provided by variable values. This allows the settings
+to be passed to packer via a variables file which is specified by the `-var-file` parameter
+of the `packer build` command.
+
+The `build.sh` script generates a packer variables file from the `GCP_*` and `DAOS_*` environment
+variables defined in the script.
+
+Run `./build.sh --help` to see a list of environment variables that are used
+by the `./build.sh` script to create a packer variables file that will be
+passed to packer to create the images.
+
+You can export these variables before running the `build.sh` script to customize
+the images or to modify Cloud Build settings.

 ### Build Block

@@ -41,7 +67,8 @@ The `build` block consists of provisioners that do the following:

 These provisioners are the same for building both DAOS server and DAOS client images.

-The `daos_install_type` variable in the `daos.pkr.hcl` template is passed in the `--extra-vars` parameter when running the `daos.yml` ansible playbook.
+The `daos_install_type` variable in the `daos.pkr.hcl` template is passed in the `--extra-vars`
+parameter of the `ansible-playbook` command when running the `daos.yml` ansible playbook.

 If `daos_install_type=server`, then the `daos.yml` playbook will install the DAOS server packages.

@@ -74,13 +101,15 @@ The `images/build.sh` script uses the following environment variables.

 To view the default values for these variables see the defaults set in the `build.sh` script.

-Running `build.sh --help` will display the values of these variables so that you can inspect them before running `build.sh`
+Running `build.sh --help` will display the values of these variables so that you can inspect them
+before running `build.sh`

 ### Controlling the version of DAOS to be installed

 Official DAOS packages are hosted at https://packages.daos.io/

-Unfortunately, the paths to the `.repo` files for each repository do not follow a standard convention that can be dynamically created based on something like the `/etc/os-release` file.
+Unfortunately, the paths to the `.repo` files for each repository do not follow a standard
+convention that can be dynamically created based on something like the `/etc/os-release` file.

 To specify the path to a repo file the following 3 environment variables are used:

@@ -98,28 +127,16 @@ The values of these variables should not start or end with a `/`

 **Examples:**

-To install DAOS v2.2.0 on CentOS 7
-
-```bash
-DAOS_REPO_BASE_URL=https://packages.daos.io
-DAOS_VERSION="2.2.0"
-DAOS_PACKAGES_REPO_FILE="CentOS7/packages/x86_64/daos_packages.repo"
-```
-
-To install DAOS v2.2.0 on Rocky 8
+To install DAOS v2.4.0 on Rocky 8

-```bash
-DAOS_REPO_BASE_URL=https://packages.daos.io
-DAOS_VERSION="2.2.0"
-DAOS_PACKAGES_REPO_FILE="EL8/packages/x86_64/daos_packages.repo"
-```
+```bash
+DAOS_REPO_BASE_URL=https://packages.daos.io
+DAOS_VERSION="2.4.0"
+DAOS_PACKAGES_REPO_FILE="EL8/packages/x86_64/daos_packages.repo"
+```

 ## Building only the DAOS Server or the DAOS Client image

-If you do not want to build one of the images, you must set the appropriate environment variable.
-
-For example,
-
 To build only the DAOS Server image

 ```bash
@@ -138,7 +155,8 @@ export DAOS_BUILD_SERVER_IMAGE="false" # Do not run the job to build the DAOS se

 ## Custom image builds

-To create images that do not use the default settings, export one or more of the environment variables listed above before running `build.sh`
+To create images that do not use the default settings, export one or more of the environment
+variables listed above before running `build.sh`

 ### Change the name of the image family

@@ -151,7 +169,8 @@ export DAOS_CLIENT_IMAGE_FAMILY="my-daos-client"

 ### Use a different source image

-For the source image, use the `rocky-linux-8-optimized-gcp` community image instead of the `hpc-rocky-linux-8` image.
+For the source image, use the `rocky-linux-8-optimized-gcp` community image instead of the
+`hpc-rocky-linux-8` image.

 ```bash
 cd images
@@ -204,6 +223,12 @@ export GCP_USE_CLOUDBUILD="false" # Do not run packer in Cloud Build
 ./build.sh
 ```

-When running `build.sh` this way, all project configuration steps are skipped.
+When running `build.sh` this way, all GCP project configuration steps (setting permissions) are skipped.
+
+When `GCP_USE_CLOUDBUILD="true"` the `build.sh` will check your GCP project to ensure the default
+service account has the proper permissions needed for the Cloud Build job to run packer and create
+the images in your project.

-When `GCP_USE_CLOUDBUILD="true"` the `build.sh` will check your GCP project to ensure the default service account has the proper permissions needed for the Cloud Build job to run packer and create the images in your project. Setting `GCP_USE_CLOUDBUILD="true"` will skip the project configuration steps. In this case, it's up to you to make sure the proper permissions are configured for you to run packer locally to build the images.
+Setting `GCP_USE_CLOUDBUILD="false"` will skip the project configuration steps. In this case, it's
+up to you to make sure the proper permissions are configured for you to run packer locally to build
+the images.
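
The README text in this diff names three environment variables that locate a `.repo` file and notes that their values should not start or end with a `/`. The diff does not show how the parts are joined, so the following is only a sketch of one plausible composition. The `v<version>` path segment and the `strip_slashes` helper are assumptions for illustration, not something taken from `build.sh` or `daos.yml`.

```bash
#!/usr/bin/env bash
# Values taken from the Rocky 8 example in the README diff.
DAOS_REPO_BASE_URL="https://packages.daos.io"
DAOS_VERSION="2.4.0"
DAOS_PACKAGES_REPO_FILE="EL8/packages/x86_64/daos_packages.repo"

# Hypothetical helper: drop one leading and one trailing "/" if present,
# so that joining the parts never produces "//" in the path.
strip_slashes() {
  value="$1"
  value="${value#/}"
  value="${value%/}"
  printf '%s' "${value}"
}

# Assumption: the version appears in the URL as a "v<version>" segment.
repo_url="$(strip_slashes "${DAOS_REPO_BASE_URL}")/v${DAOS_VERSION}/$(strip_slashes "${DAOS_PACKAGES_REPO_FILE}")"
echo "${repo_url}"
```

Joining with explicit `/` separators is the reason a leading or trailing slash in any of the values would be a problem: it would yield a doubled slash in the resulting path.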

images/ansible_playbooks/daos.yml

Lines changed: 2 additions & 1 deletion
@@ -22,7 +22,7 @@

 vars:
 daos_install_type: "all"
-daos_version: "2.2.0"
+daos_version: "2.4.0"
 daos_repo_base_url: "https://packages.daos.io"
 daos_packages_repo_file: "EL8/packages/x86_64/daos_packages.repo"
 daos_packages:
@@ -33,6 +33,7 @@
 packages:
 - clustershell
 - curl
+- fuse
 - git
 - jq
 - patch

images/build.sh

Lines changed: 1 addition & 1 deletion
@@ -16,7 +16,7 @@
 set -eo pipefail
 trap 'echo "Unexpected and unchecked error. Exiting."' ERR

-: "${DAOS_VERSION:="2.2.0"}"
+: "${DAOS_VERSION:="2.4.0"}"
 : "${DAOS_REPO_BASE_URL:="https://packages.daos.io"}"
 : "${DAOS_PACKAGES_REPO_FILE:="EL8/packages/x86_64/daos_packages.repo"}"
 : "${GCP_PROJECT:=}"
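
The `: "${VAR:=default}"` lines in this hunk rely on the shell's assign-default parameter expansion: `:` is a no-op builtin, and `${VAR:=value}` assigns `value` only when `VAR` is unset or empty. That is what lets a value exported before running `build.sh` take precedence over the default changed here. A minimal sketch of the idiom follows; the `EXAMPLE_*` names are hypothetical and do not appear in `build.sh`.

```bash
#!/usr/bin/env bash
# EXAMPLE_* names are hypothetical and only illustrate the idiom.
unset EXAMPLE_VERSION
EXAMPLE_FAMILY="my-daos-server"   # simulates a value exported by the caller

# ":" does nothing itself; the expansion assigns the default as a side effect.
: "${EXAMPLE_VERSION:=2.4.0}"        # unset, so the default is assigned
: "${EXAMPLE_FAMILY:=daos-server}"   # already set, so the caller's value is kept

echo "EXAMPLE_VERSION=${EXAMPLE_VERSION}"
echo "EXAMPLE_FAMILY=${EXAMPLE_FAMILY}"
```

Because an empty value also triggers the default, a line such as `: "${GCP_PROJECT:=}"` only guarantees that the variable is defined; it does not supply a usable project.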

images/daos.pkr.hcl

Lines changed: 2 additions & 2 deletions
@@ -134,9 +134,9 @@ build {
 provisioner "shell" {
 execute_command = "echo 'packer' | sudo -S env {{ .Vars }} {{ .Path }}"
 inline = [
+"dnf clean packages",
 "dnf -y install epel-release",
-"dnf -y install python3.11 python3.11-pip ansible-core",
-"alternatives --set python3 /usr/bin/python3.11"
+"dnf -y install ansible-core"
 ]
 }

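
The README changes in this commit say that `build.sh` generates a Packer variables file from `GCP_*` and `DAOS_*` environment variables and passes it to `packer build` with `-var-file`. The sketch below illustrates that idea only. The lowercase variable names, the file name, and the choice of two variables are assumptions; the actual script may generate the file differently. It does not invoke `packer`.

```bash
#!/usr/bin/env bash
set -e

# Assumed inputs; build.sh defines its own defaults for these.
: "${DAOS_VERSION:=2.4.0}"
: "${DAOS_REPO_BASE_URL:=https://packages.daos.io}"

# Hypothetical location and name for the generated variables file.
vars_file="$(mktemp -d)/daos.pkrvars.hcl"

# Write one HCL assignment per setting (variable names are assumptions).
cat > "${vars_file}" <<EOF
daos_version = "${DAOS_VERSION}"
daos_repo_base_url = "${DAOS_REPO_BASE_URL}"
EOF

cat "${vars_file}"

# The file could then be passed to packer along the lines of:
#   packer build -var-file="${vars_file}" daos.pkr.hcl
```

Keeping the settings in a generated file rather than on the command line makes it easy to inspect exactly what a given build will use before it runs.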