185 changes: 80 additions & 105 deletions README.md
A Python package for code analysis and sandbox.

This library can be used to create pipelines that filter code generated by GenAI code models, and to guard the execution of generated code.

[Design](docs/arch.rst)

[Static Code Analysis](docs/analysis.rst)

[Sandbox via seccomp](docs/seccomp.rst)

## Quick start

### Create a Python virtual env

This is recommended.

```bash
python -m venv .venv
source .venv/bin/activate
```


**Podman:** GuardX uses the docker python package to communicate with containers.
```bash
podman machine inspect --format '{{.ConnectionInfo.PodmanSocket.Path}}'
export DOCKER_HOST=unix://<your_podman_socket_location>
```

### Method 1: via CLI

```bash
pip install guardx
```


## Install from a branch

**git+https** (using a [github personal access token](https://help.github.com/articles/creating-an-access-token-for-command-line-use/)):
```bash
pip install git+https://github.com/ibm/guardx.git@{branch/tag}
```

**git+ssh**:
```bash
pip install git+ssh://git@github.com/ibm/guardx.git@{branch/tag}
```

The library container images must be built before importing and using the library.

```bash
guardx init
```

**Note:** Depending on your system, you may need to run as `sudo .venv/bin/guardx init`.

#### Test using provided example

```bash
python example.py --file example_gen_code.py
```

### Method 2: for development

#### Install pre-requisites

```bash
git clone git@github.com:ibm/guardx.git
make init
```
**Note:** This installs Poetry. Make sure to configure your PATH to access poetry.
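If `poetry` is not found after `make init`, it is commonly installed under `~/.local/bin` (the exact location depends on the installer used). A typical fix, shown here as an assumption rather than a guarantee:

```shell
# Add Poetry's usual install location to PATH for the current shell session.
# Persist this line in your shell profile (~/.bashrc, ~/.zshrc) if needed.
export PATH="$HOME/.local/bin:$PATH"
```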

#### Install dependencies

To install the dev dependencies (editable mode):

```bash
make install/dev
```

**Note:** To add additional dependencies, use `poetry add "package"`. For help, `poetry add -h`.

#### Build the library container images

```bash
make containers/docker
```

Or, with Podman:

```bash
make containers/podman
```
**Note:** A fresh build takes 5-10 minutes. Make sure to update the GuardX config file
in `resources/config.yaml` to match the built image name and tag.

#### Testing

Test modules are created under the `tests` directory.

```bash
make test
```

**Note:** To enable logging, set `log_cli = true` in `tests/pytest.ini`.

#### Code Linting

Before checking in any code for the project, please lint the code. This can be done using:

```bash
# Lint target name assumed here; check the project Makefile for the exact target.
make lint
```

#### Build documentation

```bash
cd docs
make html
```

## Library Usage

Here is an example of how to use this library in your code.

```python
from guardx import Guardx
from guardx.analysis import AnalysisType

python_code = """<your code here>"""

g = Guardx(config_path="./resources/config.yaml")

# To analyze code
result = g.analyze(python_code, {AnalysisType.DETECT_SECRET, AnalysisType.UNSAFE_CODE})
print(result)

# To execute code in sandbox with a default security policy
result = g.execute(python_code).get_docker_result()
print(result)

# To execute code with global variables passed into the sandbox
globals_dict = {"x": 10, "y": 20}
result = g.execute(python_code, globals=globals_dict).get_docker_result()
print(result)
```

### Passing Global Variables to Sandbox Execution

You can pass global variables into the sandbox execution environment using the `globals` parameter. This is useful for:
- Providing prior execution state
- Passing configuration or context data
- Simulating stateful execution across multiple code snippets

```python
# Example: Using globals for stateful execution
code1 = "counter = 1"
result1 = g.execute(code1)

# Continue with prior state
code2 = "counter += 1; print(counter)"
result2 = g.execute(code2, globals={"counter": 1})

# Example: Passing complex data structures
code = "result = sum(numbers) * multiplier"
globals_dict = {
"numbers": [1, 2, 3, 4, 5],
"multiplier": 2
}
result = g.execute(code, globals=globals_dict)

# Example: Passing configuration
code = "result = data['value'] * config['multiplier']"
globals_dict = {
"data": {"value": 100},
"config": {"multiplier": 2.5, "threshold": 10}
}
result = g.execute(code, globals=globals_dict)
```

**Important Notes:**
- Global variables must be **JSON-serializable** (strings, numbers, lists, dicts, booleans, None)
- **Do NOT include `__builtins__`** - it is automatically provided by the executor
- Non-serializable objects (functions, classes, modules, file handles) will be automatically filtered out with a warning
- Only the serializable values will be passed to the sandbox
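The serializability rule above can also be checked before calling `g.execute`. Below is a hedged sketch of such a pre-filter; it is illustrative only and is not GuardX's internal filtering logic:

```python
import json

def filter_json_serializable(globals_dict):
    """Keep only values that survive a round-trip through json.dumps.

    Approximates the filtering described above; the real executor's
    behavior (e.g. how warnings are emitted) may differ.
    """
    safe = {}
    for name, value in globals_dict.items():
        if name == "__builtins__":
            continue  # provided automatically by the executor; never pass it in
        try:
            json.dumps(value)
        except (TypeError, ValueError):
            print(f"warning: dropping non-serializable global {name!r}")
            continue
        safe[name] = value
    return safe

mixed = {"x": 10, "data": {"value": 100}, "fn": len, "__builtins__": {}}
print(filter_json_serializable(mixed))  # → {'x': 10, 'data': {'value': 100}}
```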
21 changes: 15 additions & 6 deletions docs/analysis.rst
Security Analysis
=================

GuardX provides a common interface for statically scanning Python code for safety and security issues.

Analysis can be invoked like so::

result = guardx.Guardx().analyze(python_code, {AnalysisType.UNSAFE_CODE, AnalysisType.DETECT_SECRET})

**AnalysisType.UNSAFE_CODE:**
Runs `Bandit <https://bandit.readthedocs.io/en/latest/>`_ for static security analysis of Python code, to identify security issues before execution.

**AnalysisType.DETECT_SECRET:**
Runs `detect-secrets <https://github.com/Yelp/detect-secrets>`_ to check for *secrets* in code.

By default, all tests in Bandit and detect-secrets are run. Details of the tests are listed below
and can be found on the respective projects' websites.
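As a rough illustration of one kind of signal secret scanners rely on (this is a sketch, not detect-secrets' actual implementation), a high-entropy string check can be written as:

```python
import math
from collections import Counter

def shannon_entropy(s: str) -> float:
    """Bits of entropy per character of the string."""
    counts = Counter(s)
    n = len(s)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def looks_like_secret(token: str, threshold: float = 4.0) -> bool:
    """Flag long, high-entropy tokens -- a crude stand-in for
    entropy-based secret detectors."""
    return len(token) >= 20 and shannon_entropy(token) > threshold

print(looks_like_secret("hello_world_variable"))            # → False
print(looks_like_secret("A8f3kQ9zL2mN7pX4vB6wC1yD5eH0jR"))  # → True
```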

Bandit Tests `ref <https://bandit.readthedocs.io/en/latest/plugins/index.html>`_
================================================================================

Bandit includes tests for various security vulnerabilities. The full list is `here <https://bandit.readthedocs.io/en/latest/plugins/index.html#complete-test-plugin-listing>`_.

Bandit supports YAML configuration files:
Detect-Secrets
==============


**Secret Types**
* API keys and tokens
* Private keys (RSA, SSH, PGP)
4 changes: 2 additions & 2 deletions docs/arch.rst
GuardX analyzes AI generated code snippets using:
* Intent profiles
Based on syscall analysis of "similar code", GuardX will try to match the required permissions of the generated code snippet to a set of predefined intents.

2. Execution
------------

Execution of AI generated code poses a high security risk for application users and data owners. To enable this, GuardX uses sandboxed containers or VMs (future). AI generated code is run inside the sandboxed container, where the process executing it is further sandboxed using seccomp policies. These policies are derived from intent profiles of the code or selected from a list of seccomp profiles such as `no network`, `strict`, and `memory only`. GuardX takes care of instantiating the execution environment, applying appropriate security profiles, and verifying the output.

15 changes: 15 additions & 0 deletions docs/seccomp.rst
Seccomp policy category
=======================

The process (Python) running the AI code inside the container is further constrained with seccomp.

Set the seccomp policy category in ``resources/config.yaml``.
The categories are described below:

- memory: only allow ``rt_sigaction``, ``exit_group``, ``munmap``, read stdin, write stdout, write stderr
- nonet: disallow network-related syscalls
- crit_syscalls: disallow syscalls associated with known CVEs or used as a launchpad to carry out attacks
- strict: allow only ``read()``, ``write()``, ``_exit()``, and ``sigreturn()``
- log: log all syscalls to ``auditd.log``
- unconfined: no seccomp
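For reference, a ``memory``-style allowlist could be rendered in the JSON seccomp-profile format that Docker and Podman accept. This is an illustrative sketch under that assumption; the actual profile files shipped in GuardX's resources may name different syscalls or actions:

```python
import json

# Illustrative only: a deny-by-default profile allowing roughly the
# syscalls listed for the `memory` category above. Real GuardX profiles
# may differ.
memory_profile = {
    "defaultAction": "SCMP_ACT_ERRNO",
    "syscalls": [
        {
            "names": ["read", "write", "rt_sigaction", "rt_sigreturn",
                      "munmap", "exit_group"],
            "action": "SCMP_ACT_ALLOW",
        }
    ],
}

with open("memory_seccomp.json", "w") as f:
    json.dump(memory_profile, f, indent=2)
```

A file like this is what container runtimes consume via their seccomp security option.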
