Refactor code, add lambda support & terraform module #11

Open
tommaroh wants to merge 8 commits into master from modernize-cleanup

@tommaroh tommaroh commented Feb 5, 2025

Changes

  • Significant rewrite of this command-line tool for monitoring certificate expiration.
  • Running locally via Docker is supported and the Makefile is updated (i.e. make local).
  • Added support for running as a Lambda function and notifying an SNS topic.
  • Can still be run as a standalone binary (without Lambda/SNS); it now accepts a JSON file for configuration.
    Example:
{
    "urls": [
        "www.google.com",
        "adhocteam.us",
        "asdadsd.zaz"
    ],
    "days": 30,
    "verbose": false,
    "topic": ""
}
  • Added a Terraform module to run it on a cron schedule via a CloudWatch Events trigger. In this mode the configuration is passed in at runtime, the SNS topic is added directly via Terraform, and the other variables are user-modifiable.
  • Added a Terraform example showing how the module might be called. make lambda will build and package the zip.
  • Removed the existing code for direct SMTP notifications.
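
For readers following along, here is a minimal sketch of how a JSON file shaped like the example above could be decoded in Go. The Config struct, the parseConfig helper, and the validation rules (at least one URL, a positive day count) are assumptions made for illustration; the PR's actual types and checks may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Config mirrors the keys in the example JSON file.
// The struct and field names are hypothetical, not taken from the PR.
type Config struct {
	URLs    []string `json:"urls"`
	Days    int      `json:"days"`
	Verbose bool     `json:"verbose"`
	Topic   string   `json:"topic"`
}

// parseConfig decodes a JSON document into a Config. Unknown keys are
// rejected so that typos in the file surface early (an assumed policy).
func parseConfig(data string) (Config, error) {
	var cfg Config
	dec := json.NewDecoder(strings.NewReader(data))
	dec.DisallowUnknownFields()
	if err := dec.Decode(&cfg); err != nil {
		return Config{}, fmt.Errorf("parse config: %w", err)
	}
	if len(cfg.URLs) == 0 {
		return Config{}, fmt.Errorf("config lists no urls")
	}
	if cfg.Days <= 0 {
		return Config{}, fmt.Errorf("days must be positive, got %d", cfg.Days)
	}
	return cfg, nil
}

func main() {
	example := `{"urls": ["www.google.com", "adhocteam.us"], "days": 30, "verbose": false, "topic": ""}`
	cfg, err := parseConfig(example)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("checking %d hosts with a %d day warning window\n", len(cfg.URLs), cfg.Days)
}
```

An empty topic, as in the example, would presumably mean "do not publish to SNS" in standalone mode, but that interpretation is also an assumption.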

Testing

I have only run this locally so far, not via Lambda.
Existing unit tests pass and the program appears to be working as expected.
The Terraform module passes terraform fmt, etc.

> make local
rm -f certwatcher
go get && GOOS=linux go build -o certwatcher
terraform fmt -recursive -write=true terraform
go test
INFO[0000] check: 127.0.0.1 - certificate is ok
2025/02/05 09:13:21 http: TLS handshake error from 127.0.0.1:53519: read tcp 127.0.0.1:53518->127.0.0.1:53519: use of closed network connection
PASS
ok  	github.cms.gov/CMS-WDS/iam-certwatcher-lambda	0.644s
 -- Tests Complete --

go run main.go -f cfg_example.json
ERRO[0000] failed host check asdadsd.zaz - dial tcp: lookup asdadsd.zaz: no such host
INFO[0000] check: www.google.com - certificate is ok
INFO[0000] check: adhocteam.us - certificate is ok

@dvogel dvogel left a comment

Clear improvements here so no reason to hold this back. I left some comments with considerations for follow-up work though.

if err := check(url, "443", cfg.Days); err != nil {
	msg := fmt.Sprintf("failed host check %s - %s", url, err)
	log.Errorf(msg)
	failures = append(failures, msg)

Not really a priority for this PR necessarily, but a general comment on the approach going forward:

Returning these failures up the call stack seems appropriate to me but also incongruent with the log.Errorf(msg) call. It feels to me like that is functionality tied to the main entrypoint and should be moved up there. Alternatively perhaps both main and handle could pass in a list of interfaces that each failure should be passed to. In the main case maybe that would just be a logger while in the handle case it would be a logger and SNS.

Author

Agreed, I had to rethink how to handle the errors and wasn't sure at first if it should be a script exit code with notifications elsewhere or handled in-app. I like the idea of going through the same flow in either local or lambda case, and just modifying the notification implementation based on interfaces, will try to get that implemented.

	mkdir -p $(BUILD_DIR)
	GOOS=linux GOARCH=amd64 go build -o $(BUILD_DIR)/$(APPNAME)
build:
	go get && GOOS=linux go build -o $(APPNAME)

Does go get not need GOOS=linux? If it is run on macOS, does it download macOS modules?

Author

It didn't seem to make a difference, but I will move GOOS=linux to before the go get, as that seems more correct.

Contributor

From go help get:

In earlier versions of Go, 'go get' was used to build and install packages.
Now, 'go get' is dedicated to adjusting dependencies in go.mod.

I'm not sure, but I don't think bare go get does anything at all? In any event, Go will automatically download dependencies if they're missing. In CI, it can be nice to use go mod download on its own to have a separate step for caching the dependencies, but there's no point in breaking downloading dependencies and building into two commands in one step here.

  key    = var.s3_object
  source = var.lambda_local_path
  etag   = filemd5(var.lambda_local_path)
}

This could probably be done just as well with a local file upload. Uploading the binary to S3 doesn't seem to accomplish the same type of transparency or reproducibility as uploading JavaScript or Python Lambda code to S3 does.

resource "aws_iam_policy" "lambda-policy" {
  name        = "certwatcher-exec"
  path        = "/"
  description = "Policy to allow certwatcher to "

Incomplete description.

  name              = "certwatcher"
  display_name      = "certwatcher"
  kms_master_key_id = "alias/aws/sns"
}

Two strategic thoughts, somewhat overlapping:

  • With the focus on the sustainability of the internal infrastructure, it might be worth mapping out sooner rather than later how notifications should work. E.g., if there are going to be multiple notification sources, then maybe the SNS topic should be defined separately, with a resource policy allowing publication by certwatcher plus whatever else needs to send notifications.
  • Since CloudWatch metric alarms are likely to be part of the picture, it would make sense to start out with a customer-managed KMS key, because CloudWatch doesn't have access to the standard AWS-managed SNS KMS key.