cribl-storage-tool

A command-line tool for managing data across diverse storage solutions. As data lakes and data tiering become the norm, accessing data from multiple clouds and storage platforms can be challenging. The cribl-storage-tool helps you streamline these operations with ease.

Introduction

The cribl-storage-tool is designed to simplify the management of data in heterogeneous storage environments. Whether you're working with S3, Azure Blob Storage, or other storage services, this tool provides a consistent interface to manage, list, and organize your data.

Why & How

Managing data across multiple storage systems is increasingly complex. The cribl-storage-tool was built to:

Simplify Access: Provide a unified command line interface to work with various storage solutions.
Improve Efficiency: Automate common tasks such as listing, transferring, and managing storage objects.
Ensure Compatibility: Support multiple cloud providers and on-premises storage solutions through extensible design.

Features

Multi-Cloud Support: Seamlessly integrate with AWS S3 and other storage platforms.
Command Line Interface: Fast and efficient tool designed for automation and scripting.
Extensible Architecture: Easily add support for new storage and data solutions as your needs evolve.

Installation

Clone the repository, then run the install script from the repository root in your terminal. It installs the binary:

./scripts/install.sh

Usage

The tool is divided into multiple commands for interacting with different storage components. Below are examples for the two primary commands: listing S3 buckets and setting up IAM roles.

S3 List Command

./cribl-storage-tool s3 list -h

 Usage:
 cribl-storage-tool s3 list [flags]

 Flags:
 -b, --bucket-file string   Path to JSON file containing S3 bucket names (optional)
 -f, --filter string        Filter bucket names containing the specified substring (optional)
 -h, --help                 help for list
 -o, --output string        Output format: text, json, or names (default "text")
 -p, --profile string       AWS profile to use for authentication (optional)
 -x, --regex string         Filter bucket names matching the specified regular expression (optional)
 -r, --region string        AWS region to target (optional)

IAM Setup Command

./cribl-storage-tool iam setup -h

Usage:
cribl-storage-tool iam setup [flags]

Flags:
-a, --account string            AWS Account ID to trust (required if --cribl-worker-arn not provided)
-s, --action string             Action type for the IAM role (default: search) (default "search")
-b, --bucket strings            Name of the S3 bucket to grant access (can specify multiple)
-f, --bucket-file string        Path to JSON file containing S3 bucket names (optional)
    --cribl-worker-arn string   Cribl worker ARN (e.g., arn:aws:iam::ACCOUNT:role/WORKSPACE-WORKERGROUP)
-e, --external-id string        External ID for the trust relationship (optional)
-h, --help                      help for setup
-p, --profile string            AWS profile to use for authentication (optional)
-z, --region string             AWS region to target (optional)
-r, --role string               Name of the IAM role to create or update (default "CrossAccountAccessRole")
-g, --workergroup string        Worker group name (default: default) (default "default")
-w, --workspace string          Workspace name (default: main) (default "main")

Examples:

Let's go ahead and use my power account, goatshipansible, to list all the S3 buckets:

./cribl-storage-tool s3 list --profile goatshipansible

Listing S3 Buckets:
 - aws-cloudtrail-logs-55555-55555
 - aws-security-data-lake-us-east-1-55555
 - aws-security-data-lake-us-east-2-55555
 - aws-security-data-lake-us-west-1-55555
 - aws-security-data-lake-us-west-2-55555
 - badcoffee
 - ckoamplifybucket
 - ckoamplifybucketparquet
 - config-bucket-55555
 - criblcoffeeroute53
 - 55555-55555-55555-9d8c-442f-af10-55555
 - o11ys3bucket
 - seclake-customsource

Now that I have all my buckets listed, I want to be able to search the buckets badcoffee and ckoamplifybucket from Cribl Search. For --account we will use the account from the Cribl trust relationship, if you have that handy; in this case, for my tenant, it is 47111295931415. The workspace I'm going to link the bucket to is contractors (-w contractors).

Since I like pi, I'm going to set my external ID with the flag -e 31415 and the role name with -r elbcoffee. Here is the completed command for the badcoffee bucket:

./cribl-storage-tool iam setup --account 4711129531415 --profile goatshipansible --bucket badcoffee -e 31415 -r elbcoffee --workspace contractors

You will see a stream of logs and, if the setup succeeds, a final line like this:

{"level":"info","command":"iam_setup","time":"2025-02-27T13:48:26-05:00","message":"IAM trust relationship setup completed successfully"}

Speaking of streams, let's go ahead and edit the command so that Cribl Stream can send data to the bucket, by adding --workergroup default --action send:

./cribl-storage-tool iam setup --account 4711129531415 --profile goatshipansible --bucket badcoffee -e 31415 -r elbcoffee --workspace contractors --workergroup default --action send

Using --filter or --regex

./cribl-storage-tool s3 list --profile goatshipansible --filter lake

Listing S3 Buckets:
 - aws-security-data-lake-us-east-1-55555
 - aws-security-data-lake-us-east-2-55555
 - aws-security-data-lake-us-west-1-55555
 - aws-security-data-lake-us-west-2-55555
 - seclake-customsource

./cribl-storage-tool s3 list --profile goatshipansible --regex "lake.*"

Listing S3 Buckets:
 - aws-security-data-lake-us-east-1-55555
 - aws-security-data-lake-us-east-2-55555
 - aws-security-data-lake-us-west-1-55555
 - aws-security-data-lake-us-west-2-55555
 - seclake-customsource
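
Both invocations return the same five buckets because every one of those names contains the substring lake, and the pattern lake.* is not anchored. The sketch below reproduces that behaviour with grep on a hard-coded sample of the names above. It assumes the tool matches the pattern anywhere in the bucket name, as the output suggests, and grep's extended regex syntax stands in for whatever regex engine the tool actually uses. The anchored pattern in the last step is a hypothetical illustration of where a regex would differ from a plain substring filter.

```shell
#!/usr/bin/env bash
# Sample of the bucket names from the listing above (hard-coded for illustration).
buckets='aws-cloudtrail-logs-55555-55555
aws-security-data-lake-us-east-1-55555
badcoffee
seclake-customsource'

# Substring match, analogous to: --filter lake
by_filter=$(printf '%s\n' "$buckets" | grep -F 'lake')

# Unanchored regex match, analogous to: --regex "lake.*"
by_regex=$(printf '%s\n' "$buckets" | grep -E 'lake.*')

# Anchored regex (hypothetical pattern): only names that start with "seclake"
by_anchor=$(printf '%s\n' "$buckets" | grep -E '^seclake')

printf 'filter:\n%s\n\nregex:\n%s\n\nanchored:\n%s\n' "$by_filter" "$by_regex" "$by_anchor"
```

If the assumption holds, a pattern such as "^seclake" passed to --regex would narrow the list to seclake-customsource, which --filter cannot express.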

Create a bucket file to loop over in bash

cribl-storage-tool s3 list -p goatshipansible -o names > goats.txt 
badcoffee
criblcoffeeroute53
criblcompetitorsbucket
kerno-samples-cdd9cc33-9d8c-442f-af10-6e07196e0d71
seclake-customsource
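
With one bucket name per line in goats.txt, a bash loop can prepare an iam setup command for each bucket. The sketch below is a dry run: it only prints the commands so they can be reviewed before anything changes in AWS. The account ID, profile, role name, and workspace are copied from the example earlier in this README; the helper name build_setup_commands is hypothetical, and the temporary sample file only stands in for goats.txt so the sketch is self-contained.

```shell
#!/usr/bin/env bash
# Dry-run sketch: print one "iam setup" command per bucket name in a file.
# build_setup_commands is a hypothetical helper, not part of the tool.
build_setup_commands() {
  local bucket_file=$1
  local bucket
  # Read one bucket name per line; the "|| [ -n ... ]" part keeps a final
  # line that has no trailing newline.
  while IFS= read -r bucket || [ -n "$bucket" ]; do
    # Skip blank lines.
    [ -z "$bucket" ] && continue
    printf './cribl-storage-tool iam setup --account %s --profile %s --bucket %s -r %s --workspace %s\n' \
      "4711129531415" "goatshipansible" "$bucket" "elbcoffee" "contractors"
  done < "$bucket_file"
}

# Self-contained demo: a temporary sample file standing in for goats.txt.
sample=$(mktemp)
printf 'badcoffee\nseclake-customsource\n' > "$sample"
commands=$(build_setup_commands "$sample")
rm -f "$sample"
printf '%s\n' "$commands"
```

In practice, call build_setup_commands goats.txt and, once the printed commands look right, pipe the output to bash. This README does not say whether repeated runs against the same role add to or replace its bucket permissions, so check the role after the first run. The help text notes that --bucket can be specified multiple times, so a single invocation listing every bucket is an alternative to the loop.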
