Commit e564e92

SK-2813: Python SDK v2 — code quality, security hardening, and message fixes (#242)
* SK-2833: Add backward-compatible deprecation shims for update_log_level and FileUploadRequest (#244)
* SK-2813: Fix and clean up SDK sample files (#243)
1 parent bbeeeaf commit e564e92
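
The deprecation shims named in the commit message (for `update_log_level` and `FileUploadRequest`) are not visible in this excerpt. As a rough sketch of the general pattern only, assuming a hypothetical `DemoClient` class that is not part of the SDK, an old method name can forward to its replacement while emitting a `DeprecationWarning`:

```python
import warnings


class DemoClient:
    """Hypothetical client, used only to illustrate a deprecation shim."""

    def __init__(self):
        self.log_level = 'ERROR'

    def set_log_level(self, level):
        # New-style method: store the level and return self for chaining.
        self.log_level = level
        return self

    def update_log_level(self, level):
        # Old-style name kept so existing callers continue to work.
        warnings.warn(
            'update_log_level() is deprecated; use set_log_level() instead.',
            DeprecationWarning,
            stacklevel=2,  # point the warning at the caller, not the shim
        )
        return self.set_log_level(level)
```

Forwarding to the new method keeps a single implementation, so the shim can later be removed without touching behavior.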

38 files changed

Lines changed: 2980 additions & 1036 deletions

CHANGELOG.md

Lines changed: 53 additions & 0 deletions
@@ -2,6 +2,59 @@

 All notable changes to this project will be documented in this file.

+## [2.0.2] - 2026-05-06
+### Added
+- Dict context support for Conditional Data Access.
+
+## [2.0.1] - 2026-04-29
+### Fixed
+- Fern client re-initialisation on token refresh.
+
+## [2.0.0] - 2025-11-11
+### Added
+- Multi-vault and multi-connection support via fluent builder (`Skyflow.builder()`).
+- New typed request and response classes for all vault operations (`InsertRequest`, `GetRequest`, `UpdateRequest`, `DeleteRequest`, `QueryRequest`, `DetokenizeRequest`, `TokenizeRequest`, `FileUploadRequest`).
+- Detect API: `deidentify_text`, `reidentify_text`, `deidentify_file`, and `get_detect_run`.
+- File upload support via `vault().upload_file()`.
+- Flexible credential types: API key, static bearer token, service account credentials string, credentials file path, and `SKYFLOW_CREDENTIALS` environment variable.
+- `SkyflowError` now includes `http_code`, `grpc_code`, `http_status`, `request_id`, and `details` fields.
+- `set_log_level()` on the client for runtime log level changes.
+
+### Changed
+- Complete rewrite of the SDK public API. See [docs/migrate_to_v2.md](docs/migrate_to_v2.md) for migration instructions.
+
+## [1.16.0] - 2025-09-23
+### Fixed
+- Remote disconnect error in vault operations.
+
+## [1.15.8] - 2025-09-30
+### Fixed
+- Retry logic when `continue_on_error` is set to `true` in insert.
+
+## [1.15.7] - 2025-09-23
+### Fixed
+- Retry handling for errors in insert method.
+
+## [1.15.6] - 2025-09-22
+### Fixed
+- Added retry logic for transient errors.
+
+## [1.15.5] - 2025-09-18
+### Fixed
+- Remote disconnected errors in vault operations.
+
+## [1.15.4] - 2025-09-12
+### Fixed
+- Retry on exception during vault requests.
+
+## [1.15.3] - 2025-09-12
+### Fixed
+- Retry on exception during vault requests.
+
+## [1.15.2] - 2025-09-12
+### Fixed
+- Retry on connection error in insert method.
+
 ## [1.15.1] - 2023-12-07
 ## Fixed
 - Not receiving tokens when calling Get with options tokens as true.
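
Several 1.15.x entries above mention retrying on connection errors and remote disconnects. The SDK's actual retry code is not part of this excerpt; the following is only a minimal sketch of that kind of logic, assuming a hypothetical helper `call_with_retry` and exponential backoff (the changelog does not state the backoff policy):

```python
import time


def call_with_retry(operation, max_attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call operation(), retrying on ConnectionError with exponential backoff.

    Hypothetical helper for illustration; not the SDK's implementation.
    http.client.RemoteDisconnected is a ConnectionError subclass, so it is
    covered by the except clause below.
    """
    if max_attempts < 1:
        raise ValueError('max_attempts must be at least 1')
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except ConnectionError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
```

Passing `sleep` as a parameter is a design choice that lets tests substitute a recorder instead of actually waiting.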

README.md

Lines changed: 55 additions & 4 deletions
@@ -1,5 +1,9 @@
 # Skyflow Python SDK

+> **This is the current, recommended version of the Skyflow SDK.** V2.1.0 brings flexible auth, multi-vault support, native data types, and rich error diagnostics.
+>
+> Migrating from v1? See the **[Migration Guide](https://github.com/skyflowapi/skyflow-python/blob/main/docs/migrate_to_v2.md)** for step-by-step instructions. V1 is in maintenance mode and will reach End of Life on October 31, 2026.
+
 The Skyflow Python SDK is designed to help with integrating Skyflow into a Python backend.

 ## Table of Contents
@@ -703,18 +707,65 @@ options = {

 Embed context values into a bearer token during generation so you can reference those values in your policies. This enables more flexible access controls, such as tracking end-user identity when making API calls using service accounts, and facilitates using signed data tokens during detokenization.

-Generate bearer tokens containing context information using a service account with the context_id identifier. Context information is represented as a JWT claim in a Skyflow-generated bearer token. Tokens generated from such service accounts include a context_identifier claim, are valid for 60 minutes, and can be used to make API calls to the Data and Management APIs, depending on the service account's permissions.
+Generate bearer tokens containing context information using a service account with the `context_id` identifier. Context information is represented as a JWT claim in a Skyflow-generated bearer token. Tokens generated from such service accounts include a `context_identifier` claim, are valid for 60 minutes, and can be used to make API calls to the Data and Management APIs, depending on the service account's permissions.
+
+The `ctx` parameter accepts either a **string** or a **dict**:
+
+**String context** — use when your policy references a single context value:
+
+```python
+options = {'ctx': 'user_12345'}
+token, _ = generate_bearer_token(filepath, options)
+```
+
+**Dict context** — use when your policy needs multiple context values for conditional data access. Each key in the dict maps to a Skyflow CEL policy variable under `request.context.*`:
+
+```python
+options = {
+    'ctx': {
+        'role': 'admin',
+        'department': 'finance',
+        'user_id': 'user_12345',
+    }
+}
+token, _ = generate_bearer_token(filepath, options)
+```
+
+With the dict above, your Skyflow policies can reference `request.context.role`, `request.context.department`, and `request.context.user_id` to make conditional access decisions.
+
+Dict keys must contain only alphanumeric characters and underscores (`[a-zA-Z0-9_]`). Invalid keys will raise a `SkyflowError`.

 > [!TIP]
-> See the full example in the samples directory: [token_generation_with_context_example.py](samples/service_account/token_generation_with_context_example.py)
-> See [docs.skyflow.com](https://docs.skyflow.com) for more details on authentication, access control, and governance for Skyflow.
+> See the full example in the samples directory: [token_generation_with_context_example.py](samples/service_account/token_generation_with_context_example.py)
+> See Skyflow's [context-aware authorization](https://docs.skyflow.com) and [conditional data access](https://docs.skyflow.com) docs for policy variable syntax like `request.context.*`.

 #### Generate signed data tokens: `generate_signed_data_tokens(filepath, options)`

 Digitally sign data tokens with a service account's private key to add an extra layer of protection. Skyflow generates data tokens when sensitive data is inserted into the vault. Detokenize signed tokens only by providing the signed data token along with a bearer token generated from the service account's credentials. The service account must have the necessary permissions and context to successfully detokenize the signed data tokens.

+The `ctx` parameter on signed data tokens also accepts either a **string** or a **dict**, using the same format as bearer tokens:
+
+```python
+# String context
+options = {
+    'ctx': 'user_12345',
+    'data_tokens': ['dataToken1', 'dataToken2'],
+    'time_to_live': 90,
+}
+
+# Dict context
+options = {
+    'ctx': {
+        'role': 'analyst',
+        'department': 'research',
+    },
+    'data_tokens': ['dataToken1', 'dataToken2'],
+    'time_to_live': 90,
+}
+```
+
 > [!TIP]
-> See the full example in the samples directory: [signed_token_generation_example.py](samples/service_account/signed_token_generation_example.py)
+> See the full example in the samples directory: [signed_token_generation_example.py](samples/service_account/signed_token_generation_example.py)
 > See [docs.skyflow.com](https://docs.skyflow.com) for more details on authentication, access control, and governance for Skyflow.

 ## Logging
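
The README text added in this hunk states that dict keys for `ctx` must match `[a-zA-Z0-9_]` and that invalid keys raise a `SkyflowError`. The SDK's own validation is not shown in this diff. As a sketch only, a caller could pre-check keys before building `options`; the helper name `find_invalid_ctx_keys` is hypothetical, and treating empty or non-string keys as invalid is an assumption:

```python
import re

# Character class taken from the README text; requiring at least one
# character is an assumption made for this sketch.
_CTX_KEY_PATTERN = re.compile(r'[a-zA-Z0-9_]+')


def find_invalid_ctx_keys(ctx):
    """Return the keys of a dict context that break the documented key rule.

    A string context has no keys, so it always yields an empty list.
    """
    if isinstance(ctx, str):
        return []
    return [
        key for key in ctx
        if not isinstance(key, str) or not _CTX_KEY_PATTERN.fullmatch(key)
    ]
```

For example, `find_invalid_ctx_keys({'user-id': 'x', 'role': 'admin'})` reports `'user-id'`, because the hyphen falls outside the allowed character set.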
Lines changed: 64 additions & 60 deletions
@@ -1,7 +1,14 @@
 from skyflow.error import SkyflowError
 from skyflow import Env, Skyflow, LogLevel
 from skyflow.utils.enums import DetectEntities, MaskingMethod, DetectOutputTranscriptions
-from skyflow.vault.detect import DeidentifyFileRequest, TokenFormat, Transformations, DateTransformation, Bleep, FileInput
+from skyflow.vault.detect import (
+    DeidentifyFileRequest,
+    TokenFormat,
+    Transformations,
+    DateTransformation,
+    Bleep,
+    FileInput,
+)

 """
 * Skyflow Deidentify File Example
@@ -11,6 +18,7 @@
 * spreadsheets, presentations, structured text.
 """

+
 def perform_file_deidentification():
     try:
         # Step 1: Configure Credentials
@@ -23,7 +31,7 @@ def perform_file_deidentification():
             'vault_id': '<YOUR_VAULT_ID>', # Replace with your vault ID
             'cluster_id': '<YOUR_CLUSTER_ID>', # Replace with your cluster ID
             'env': Env.PROD, # Deployment environment
-            'credentials': credentials
+            'credentials': credentials,
         }

         # Step 3: Configure & Initialize Skyflow Client
@@ -36,70 +44,66 @@ def perform_file_deidentification():

         # Step 4: Create File Object
         file_path = '<FILE_PATH>' # Replace with your file path
-        file = open(file_path, 'rb')
-        # Step 5: Configure Deidentify File Request with all options
-        deidentify_request = DeidentifyFileRequest(
-            file=FileInput(file), # File to de-identify (can also provide a file path)
-            entities=[DetectEntities.SSN, DetectEntities.CREDIT_CARD], # Entities to detect
-            allow_regex_list=['<YOUR_REGEX_PATTERN>'], # Optional: Patterns to allow
-            restrict_regex_list=['<YOUR_REGEX_PATTERN>'], # Optional: Patterns to restrict
-
-            # Token format configuration
-            token_format=TokenFormat(
-                vault_token=[DetectEntities.SSN], # Use vault tokens for these entities
-            ),
-
-            # Optional: Custom transformations
-            # transformations=Transformations(
-            #     shift_dates=DateTransformation(
-            #         max_days=30,
-            #         min_days=10,
-            #         entities=[DetectEntities.DOB]
-            #     )
-            # ),
-
-            # Output configuration
-            output_directory='<OUTPUT_DIRECTORY_PATH>', # Where to save processed file
-            wait_time=15, # Max wait time in seconds (max 64)
-
-            # Image-specific options
-            output_processed_image=True, # Include processed image in output
-            output_ocr_text=True, # Include OCR text in response
-            masking_method=MaskingMethod.BLACKBOX, # Masking method for images
-
-            # PDF-specific options
-            pixel_density=15, # Pixel density for PDF processing
-            max_resolution=2000, # Max resolution for PDF

-            # Audio-specific options
-            output_processed_audio=True, # Include processed audio
-            output_transcription=DetectOutputTranscriptions.PLAINTEXT_TRANSCRIPTION, # Transcription type
-
-            # Audio bleep configuration
-
-            # bleep=Bleep(
-            #     gain=5, # Loudness in dB
-            #     frequency=1000, # Pitch in Hz
-            #     start_padding=0.1, # Padding at start (seconds)
-            #     stop_padding=0.2 # Padding at end (seconds)
-            # )
-        )
-
-        # Step 6: Call deidentifyFile API
-        response = skyflow_client.detect().deidentify_file(deidentify_request)
+        # Step 5: Configure Deidentify File Request and call API
+        with open(file_path, 'rb') as file:
+            deidentify_request = DeidentifyFileRequest(
+                file=FileInput(file), # File to de-identify (can also provide a file path)
+                entities=[DetectEntities.SSN, DetectEntities.CREDIT_CARD], # Entities to detect
+                allow_regex_list=['<YOUR_REGEX_PATTERN>'], # Optional: Patterns to allow
+                restrict_regex_list=['<YOUR_REGEX_PATTERN>'], # Optional: Patterns to restrict
+                # Token format configuration
+                token_format=TokenFormat(
+                    vault_token=[DetectEntities.SSN], # Use vault tokens for these entities
+                ),
+                # Optional: Custom transformations
+                # transformations=Transformations(
+                #     shift_dates=DateTransformation(
+                #         max_days=30,
+                #         min_days=10,
+                #         entities=[DetectEntities.DOB]
+                #     )
+                # ),
+                # Output configuration
+                output_directory='<OUTPUT_DIRECTORY_PATH>', # Where to save processed file
+                wait_time=15, # Max wait time in seconds (max 64)
+                # Image-specific options
+                output_processed_image=True, # Include processed image in output
+                output_ocr_text=True, # Include OCR text in response
+                masking_method=MaskingMethod.BLACKBOX, # Masking method for images
+                # PDF-specific options
+                pixel_density=15, # Pixel density for PDF processing
+                max_resolution=2000, # Max resolution for PDF
+                # Audio-specific options
+                output_processed_audio=True, # Include processed audio
+                output_transcription=DetectOutputTranscriptions.PLAINTEXT_TRANSCRIPTION, # Transcription type
+                # Audio bleep configuration
+                # bleep=Bleep(
+                #     gain=5, # Loudness in dB
+                #     frequency=1000, # Pitch in Hz
+                #     start_padding=0.1, # Padding at start (seconds)
+                #     stop_padding=0.2 # Padding at end (seconds)
+                # )
+            )
+
+            # Step 6: Call deidentifyFile API
+            response = skyflow_client.detect().deidentify_file(deidentify_request)

         # Handle Successful Response
-        print("\nDeidentify File Response:", response)
+        print('\nDeidentify File Response:', response)

     except SkyflowError as error:
         # Handle Skyflow-specific errors
-        print('\nSkyflow Error:', {
-            'http_code': error.http_code,
-            'grpc_code': error.grpc_code,
-            'http_status': error.http_status,
-            'message': error.message,
-            'details': error.details
-        })
+        print(
+            '\nSkyflow Error:',
+            {
+                'http_code': error.http_code,
+                'grpc_code': error.grpc_code,
+                'http_status': error.http_status,
+                'message': error.message,
+                'details': error.details,
+            },
+        )
     except Exception as error:
         # Handle unexpected errors
         print('Unexpected Error:', error)
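
The main behavioral change in this sample is replacing a bare `open()` with a `with` block, so the file handle is released even when the request fails. The effect can be checked in isolation with a small standard-library sketch; the temporary file and the simulated `RuntimeError` are stand-ins and have nothing to do with the Skyflow API:

```python
import os
import tempfile

# Create a throwaway file so the sketch is self-contained.
fd, path = tempfile.mkstemp()
os.close(fd)

handle = None
try:
    with open(path, 'rb') as handle:
        # Stand-in for an API call that fails while the file is open.
        raise RuntimeError('simulated failure while the file is open')
except RuntimeError:
    pass

# The with block closed the file on the way out, despite the exception.
closed_after_error = handle.closed
os.remove(path)
```

With the earlier `file = open(...)` form, the same failure would have left the handle open until garbage collection.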
