Commit 177bfba

Merge pull request #367 from keboola/vb/DMD-1014/kbc_llm_docs
docs: add documentation for kbc llm commands (BETA)
2 parents 06784ef + cd0b2e3 commit 177bfba

4 files changed

Lines changed: 234 additions & 0 deletions

cli/commands/index.md

Lines changed: 4 additions & 0 deletions
@@ -90,6 +90,10 @@ kbc help local create row
 | [kbc dbt generate profile](/cli/commands/dbt/generate/profile/) | Generate profiles for use with dbt. |
 | [kbc dbt generate sources](/cli/commands/dbt/generate/sources/) | Generate sources for use with dbt. |
 | [kbc dbt generate env](/cli/commands/dbt/generate/env/) | Generate environment variables for use with dbt. |
+| | |
+| **[kbc llm](/cli/commands/llm/) (BETA)** | **Export project data to AI-optimized format.** |
+| [kbc llm init](/cli/commands/llm/init/) | Initialize a new local directory for LLM export. |
+| [kbc llm export](/cli/commands/llm/export/) | Export project data to AI-optimized twin format. |

 ## Aliases

cli/commands/llm/export/index.md

Lines changed: 122 additions & 0 deletions
@@ -0,0 +1,122 @@
---
title: LLM Export Command
permalink: /cli/commands/llm/export/
---

* TOC
{:toc}

<div class="alert alert-info" role="alert">
<strong>BETA:</strong> The LLM commands are currently in beta. Features and output format may change.
</div>

**Export project data to AI-optimized twin format directory structure.**

```
kbc llm export [flags]
```

The command must be run in a directory initialized with [kbc llm init](/cli/commands/llm/init/).

## Description

The twin format is designed for AI assistants to understand and work with Keboola projects directly from Git repositories. The export includes:

- **Bucket and table metadata** with schema information
- **Transformation configurations** with platform detection
- **Component configurations** organized by type
- **Job execution history** and statistics
- **Lineage graph** showing data flow dependencies
- **Optional data samples** (controlled by flags)

The export creates output files containing JSON with inline documentation (`_comment`, `_purpose`, `_update_frequency` fields) to help AI assistants understand the data structure.
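Tools that consume the exported JSON programmatically may want to ignore the inline documentation fields. The following is a minimal Python sketch of that idea, not part of the CLI. The function name `strip_inline_docs` and the sample document are hypothetical, and the sketch assumes the three fields may appear at any nesting level, which the export format may or may not do.

```python
import json

# Inline documentation keys named in the export description.
DOC_KEYS = {"_comment", "_purpose", "_update_frequency"}


def strip_inline_docs(node):
    """Return a copy of parsed JSON with the inline documentation keys removed."""
    if isinstance(node, dict):
        return {
            key: strip_inline_docs(value)
            for key, value in node.items()
            if key not in DOC_KEYS
        }
    if isinstance(node, list):
        return [strip_inline_docs(item) for item in node]
    return node


# Hypothetical file content; the real schema of the export may differ.
sample = json.loads("""
{
  "_comment": "Index of buckets",
  "_purpose": "Lookup",
  "_update_frequency": "On every export",
  "buckets": [{"id": "in.c-main", "_comment": "nested note"}]
}
""")

print(strip_inline_docs(sample))
```

An AI assistant would normally read the files with these fields intact; stripping them is only useful when the same files feed scripts that expect plain data.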

### Security Features

- **Public repository detection** - Automatically detects if the directory is a public Git repository
- **Sample export disabled by default** - Data samples must be explicitly enabled with `--with-samples`
- **Encrypted secrets** - Fields starting with `#` are encrypted in the output
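Before committing an export, it can be useful to see which fields fall under the `#` convention mentioned above. The sketch below lists the paths of all keys starting with `#` in a parsed configuration. The helper `secret_field_paths` and the sample configuration are hypothetical, and the sketch does not verify that a value is actually encrypted.

```python
def secret_field_paths(node, prefix=""):
    """Collect dotted paths of all keys that start with '#'."""
    paths = []
    if isinstance(node, dict):
        for key, value in node.items():
            path = f"{prefix}.{key}" if prefix else key
            if key.startswith("#"):
                paths.append(path)
            # Keep descending: secrets may be nested inside other objects.
            paths.extend(secret_field_paths(value, path))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            paths.extend(secret_field_paths(item, f"{prefix}[{index}]"))
    return paths


# Hypothetical configuration; real component configurations will differ.
config = {
    "parameters": {
        "db": {"host": "example.com", "#password": "<encrypted value>"},
        "items": [{"#token": "<encrypted value>"}],
    }
}

for path in secret_field_paths(config):
    print(path)
```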

## Options

`-H, --storage-api-host <string>`
: Keboola instance URL, e.g., "connection.keboola.com"

`-t, --storage-api-token <string>`
: Storage API token from your project

`-f, --force`
: Skip confirmation when directory contains existing files

`--with-samples`
: Include table data samples in the export

`--sample-limit <int>`
: Maximum number of rows per table sample (default: 100, max: 1000)

`--max-samples <int>`
: Maximum number of tables to sample (default: 50, max: 100)

[Global Options](/cli/commands/#global-options)
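When the command is driven from a script, the documented sample limits can be checked before the CLI is invoked. The following Python sketch builds an argument list from the flags above. The function `export_args` is hypothetical, and the lower bound of 1 is an assumption; only the defaults and maximums come from the option descriptions.

```python
def export_args(with_samples=False, sample_limit=100, max_samples=50, force=False):
    """Build the argument list for `kbc llm export`, checking documented limits."""
    # Upper bounds are documented; the lower bound of 1 is an assumption.
    if not 1 <= sample_limit <= 1000:
        raise ValueError("sample_limit must be between 1 and 1000")
    if not 1 <= max_samples <= 100:
        raise ValueError("max_samples must be between 1 and 100")

    args = ["kbc", "llm", "export"]
    if force:
        args.append("--force")
    if with_samples:
        args += [
            "--with-samples",
            "--sample-limit", str(sample_limit),
            "--max-samples", str(max_samples),
        ]
    return args


print(" ".join(export_args(with_samples=True, sample_limit=50, max_samples=20)))
```

The sample flags are only emitted together with `--with-samples`, since samples are disabled by default.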

## Output Structure

The export creates the following directory structure:

```
.
├── buckets/              # Bucket and table metadata
│   └── index.json
├── transformations/      # Transformation configurations
├── components/           # Component configurations by type
├── jobs/                 # Job execution history
│   ├── recent/
│   └── by-component/
├── indices/              # Query indices and lookups
│   └── queries/
├── ai/                   # AI assistant guides
├── samples/              # Table data samples (if --with-samples)
├── lineage.json          # Data flow dependencies
└── metadata.json         # Project metadata
```
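A script that consumes an export can check for the layout shown above before reading anything. This is a rough Python sketch: the helper `missing_entries` is hypothetical, and it assumes every listed directory is always created, which may not hold for projects without, for example, any transformations. The `samples/` directory is left out because it is optional.

```python
import tempfile
from pathlib import Path

# Entries taken from the directory tree above; samples/ is optional.
EXPECTED_DIRS = ["buckets", "transformations", "components", "jobs", "indices", "ai"]
EXPECTED_FILES = ["buckets/index.json", "lineage.json", "metadata.json"]


def missing_entries(root):
    """Return expected directories and files that are absent under root."""
    root = Path(root)
    missing = [f"{name}/" for name in EXPECTED_DIRS if not (root / name).is_dir()]
    missing += [name for name in EXPECTED_FILES if not (root / name).is_file()]
    return missing


# Demo on a throwaway directory that contains only part of the layout.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "buckets").mkdir()
    (Path(tmp) / "buckets" / "index.json").write_text("{}")
    (Path(tmp) / "lineage.json").write_text("{}")
    demo_missing = missing_entries(tmp)

print(demo_missing)
```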

## Examples

### Basic Export

```
➜ kbc llm export

[1/5] Getting default branch...
Using branch: Main (ID: 1234)
[2/5] Fetching project data from APIs...
Fetched: 5 buckets, 23 tables, 150 jobs
[3/5] Processing data (lineage, platforms, sources)...
Processed: 5 buckets, 23 tables, 8 transformations, 45 lineage edges
[4/5] Generating twin format output...
[5/5] Skipping samples (not requested)
Twin format exported to: /path/to/project
Export completed successfully.
```

### Export with Data Samples

```
➜ kbc llm export --with-samples --sample-limit 50 --max-samples 20

[1/5] Getting default branch...
Using branch: Main (ID: 1234)
[2/5] Fetching project data from APIs...
Fetched: 5 buckets, 23 tables, 150 jobs
[3/5] Processing data (lineage, platforms, sources)...
Processed: 5 buckets, 23 tables, 8 transformations, 45 lineage edges
[4/5] Generating twin format output...
[5/5] Fetching and generating table samples...
Twin format exported to: /path/to/project
Export completed successfully.
```

## Next Steps

- [LLM Init](/cli/commands/llm/init/)
- [All Commands](/cli/commands/)

cli/commands/llm/index.md

Lines changed: 39 additions & 0 deletions
@@ -0,0 +1,39 @@
---
title: LLM Commands (BETA)
permalink: /cli/commands/llm/
---

* TOC
{:toc}

<div class="alert alert-info" role="alert">
<strong>BETA:</strong> The LLM commands are currently in beta. Features and output format may change.
</div>

**Export project data to AI-optimized format for use with AI assistants and LLMs.**

The `kbc llm` commands create a "twin format" representation of your Keboola project,
designed for AI assistants to understand and work with your data pipelines.

```
kbc llm [command]
```

## Workflow

1. **Initialize** - Run `kbc llm init` to set up the local directory
2. **Export** - Run `kbc llm export` to generate AI-optimized project data
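The two steps above can be chained from a script. The following Python sketch is one way to do that, under stated assumptions: the function `run_llm_workflow` is hypothetical, and it assumes that passing the host, token, and branches options to `kbc llm init` avoids the interactive dialog. Passing a token on the command line can expose it in process listings, so treat this as an illustration only. The demo uses a dry-run runner, so it works without `kbc` installed.

```python
import subprocess


def run_llm_workflow(workdir, host, token, branches="*", runner=subprocess.run):
    """Run `kbc llm init` followed by `kbc llm export` in workdir."""
    init_cmd = [
        "kbc", "llm", "init",
        "--storage-api-host", host,
        "--storage-api-token", token,
        "--branches", branches,
    ]
    export_cmd = ["kbc", "llm", "export"]
    for cmd in (init_cmd, export_cmd):
        # check=True stops the workflow if a step exits with an error.
        runner(cmd, cwd=workdir, check=True)
    return [init_cmd, export_cmd]


def dry_run(cmd, cwd, check):
    # Print instead of executing, so the sketch runs without kbc installed.
    print(f"(in {cwd}) {' '.join(cmd)}")


run_llm_workflow("my-project", "connection.keboola.com", "<token>", runner=dry_run)
```

Omitting the `runner` argument makes the function call `subprocess.run` and execute the real commands.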

## Available Commands

|---
| Command | Description
|-|-|-
| [kbc llm init](/cli/commands/llm/init/) | Initialize a new local directory for LLM export. |
| [kbc llm export](/cli/commands/llm/export/) | Export project data to AI-optimized twin format. |

## Next Steps

- [LLM Init](/cli/commands/llm/init/)
- [LLM Export](/cli/commands/llm/export/)
- [All Commands](/cli/commands/)

cli/commands/llm/init/index.md

Lines changed: 69 additions & 0 deletions
@@ -0,0 +1,69 @@
---
title: LLM Init Command
permalink: /cli/commands/llm/init/
---

* TOC
{:toc}

<div class="alert alert-info" role="alert">
<strong>BETA:</strong> The LLM commands are currently in beta. Features and output format may change.
</div>

**Initialize a new local directory for LLM export.**

```
kbc llm init [flags]
```

The command must be run in an empty directory.

This command creates the local manifest and metadata directory (`.keboola/`) without pulling any data from Keboola Connection.
Use [kbc llm export](/cli/commands/llm/export/) after initialization to generate the AI-optimized project data.

If the command is run without options, it will start an interactive dialog asking for:
- URL of the [stack](https://help.keboola.com/overview/#stacks), for example, `connection.keboola.com`.
- [Storage API token](https://help.keboola.com/management/project/tokens/) to your project.
- Allowed [branches](https://help.keboola.com/tutorial/branches/) to work with.

## Options

`-H, --storage-api-host <string>`
: Keboola instance URL, e.g., "connection.keboola.com"

`-t, --storage-api-token <string>`
: Storage API token from your project

`-b, --branches <string>`
: Comma-separated list of branch IDs or name globs (use "*" for all)

`--allow-target-env`
: Allow usage of `KBC_PROJECT_ID` and `KBC_BRANCH_ID` environment variables for future operations

[Global Options](/cli/commands/#global-options)

## Examples

```
➜ kbc llm init

Please enter the Keboola Storage API host, e.g., "connection.keboola.com".
? API host: connection.north-europe.azure.keboola.com

Please enter the Keboola Storage API token. Its value will be hidden.
? API token: ***************************************************

Please select which project's branches you want to use with this CLI.
? Allowed project's branches: only main branch

Created metadata directory ".keboola".
Created manifest file ".keboola/manifest.json".
Created file ".env.local" - it contains the API token, keep it local and secret.
Created file ".env.dist" - an ".env.local" template.
Created file ".gitignore" - to keep ".env.local" local.
```
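Because `.env.local` holds the API token, it is worth confirming that the generated `.gitignore` still lists it before pushing the directory to a Git repository. The Python sketch below does a simple line match. The helper `env_local_is_ignored` is hypothetical, the exact contents of the generated `.gitignore` are an assumption, and the check does not implement full gitignore pattern semantics.

```python
import tempfile
from pathlib import Path


def env_local_is_ignored(project_dir):
    """Return True if .gitignore in project_dir has a line for .env.local."""
    gitignore = Path(project_dir) / ".gitignore"
    if not gitignore.is_file():
        return False
    entries = {line.strip() for line in gitignore.read_text().splitlines()}
    # Plain line match only; wildcard patterns are not evaluated.
    return bool(entries & {".env.local", "/.env.local"})


# Demo on a throwaway directory standing in for an initialized project.
with tempfile.TemporaryDirectory() as tmp:
    before = env_local_is_ignored(tmp)
    (Path(tmp) / ".gitignore").write_text("/.env.local\n")
    after = env_local_is_ignored(tmp)

print(before, after)
```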

## Next Steps

- [LLM Export](/cli/commands/llm/export/)
- [All Commands](/cli/commands/)
