Commit 0490e6e

SK-2813: Merge branch 'main' into saileshwar/SK-2813-python-v2-code-clean-up-and-fixes
2 parents: bbeeeaf + 7ce51fb
commit 0490e6e

15 files changed

Lines changed: 723 additions & 250 deletions

README.md

Lines changed: 51 additions & 4 deletions
Original file line number | Diff line number | Diff line change
@@ -703,18 +703,65 @@ options = {
703703

704704
Embed context values into a bearer token during generation so you can reference those values in your policies. This enables more flexible access controls, such as tracking end-user identity when making API calls using service accounts, and facilitates using signed data tokens during detokenization.
705705

706-
Generate bearer tokens containing context information using a service account with the context_id identifier. Context information is represented as a JWT claim in a Skyflow-generated bearer token. Tokens generated from such service accounts include a context_identifier claim, are valid for 60 minutes, and can be used to make API calls to the Data and Management APIs, depending on the service account's permissions.
706+
Generate bearer tokens containing context information using a service account with the `context_id` identifier. Context information is represented as a JWT claim in a Skyflow-generated bearer token. Tokens generated from such service accounts include a `context_identifier` claim, are valid for 60 minutes, and can be used to make API calls to the Data and Management APIs, depending on the service account's permissions.
707+
708+
The `ctx` parameter accepts either a **string** or a **dict**:
709+
710+
**String context** — use when your policy references a single context value:
711+
712+
```python
713+
options = {'ctx': 'user_12345'}
714+
token, _ = generate_bearer_token(filepath, options)
715+
```
716+
717+
**Dict context** — use when your policy needs multiple context values for conditional data access. Each key in the dict maps to a Skyflow CEL policy variable under `request.context.*`:
718+
719+
```python
720+
options = {
721+
'ctx': {
722+
'role': 'admin',
723+
'department': 'finance',
724+
'user_id': 'user_12345',
725+
}
726+
}
727+
token, _ = generate_bearer_token(filepath, options)
728+
```
729+
730+
With the dict above, your Skyflow policies can reference `request.context.role`, `request.context.department`, and `request.context.user_id` to make conditional access decisions.
731+
732+
Dict keys must contain only alphanumeric characters and underscores (`[a-zA-Z0-9_]`). Invalid keys will raise a `SkyflowError`.
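The key rule above can be checked before calling the SDK. The following is a minimal sketch, not the SDK's own validation: `find_invalid_ctx_keys` is a hypothetical helper, and treating an empty key as invalid is an assumption, since the README only states the allowed character set.

```python
import re

# Hypothetical helper, not part of the Skyflow SDK. It mirrors the documented
# rule that ctx dict keys may contain only characters from [a-zA-Z0-9_].
# Assumption: an empty key is treated as invalid.
_CTX_KEY_PATTERN = re.compile(r'[a-zA-Z0-9_]+')


def find_invalid_ctx_keys(ctx):
    """Return the keys of a ctx dict that break the documented key rule."""
    if isinstance(ctx, str):
        return []  # A string context has no keys to check.
    return [
        key for key in ctx
        if not isinstance(key, str) or not _CTX_KEY_PATTERN.fullmatch(key)
    ]


# 'user-id' contains a hyphen, so it is reported as invalid.
print(find_invalid_ctx_keys({'role': 'admin', 'user-id': 'user_12345'}))
```

Running a check like this first lets you report every bad key at once instead of stopping at the first `SkyflowError`.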
707733

708734
> [!TIP]
709-
> See the full example in the samples directory: [token_generation_with_context_example.py](samples/service_account/token_generation_with_context_example.py)
710-
> See [docs.skyflow.com](https://docs.skyflow.com) for more details on authentication, access control, and governance for Skyflow.
735+
> See the full example in the samples directory: [token_generation_with_context_example.py](samples/service_account/token_generation_with_context_example.py)
736+
> See Skyflow's [context-aware authorization](https://docs.skyflow.com) and [conditional data access](https://docs.skyflow.com) docs for policy variable syntax like `request.context.*`.
711737
712738
#### Generate signed data tokens: `generate_signed_data_tokens(filepath, options)`
713739

714740
Digitally sign data tokens with a service account's private key to add an extra layer of protection. Skyflow generates data tokens when sensitive data is inserted into the vault. Detokenize signed tokens only by providing the signed data token along with a bearer token generated from the service account's credentials. The service account must have the necessary permissions and context to successfully detokenize the signed data tokens.
715741

742+
The `ctx` parameter on signed data tokens also accepts either a **string** or a **dict**, using the same format as bearer tokens:
743+
744+
```python
745+
# String context
746+
options = {
747+
'ctx': 'user_12345',
748+
'data_tokens': ['dataToken1', 'dataToken2'],
749+
'time_to_live': 90,
750+
}
751+
752+
# Dict context
753+
options = {
754+
'ctx': {
755+
'role': 'analyst',
756+
'department': 'research',
757+
},
758+
'data_tokens': ['dataToken1', 'dataToken2'],
759+
'time_to_live': 90,
760+
}
761+
```
762+
716763
> [!TIP]
717-
> See the full example in the samples directory: [signed_token_generation_example.py](samples/service_account/signed_token_generation_example.py)
764+
> See the full example in the samples directory: [signed_token_generation_example.py](samples/service_account/signed_token_generation_example.py)
718765
> See [docs.skyflow.com](https://docs.skyflow.com) for more details on authentication, access control, and governance for Skyflow.
719766
720767
## Logging
Lines changed: 124 additions & 0 deletions
@@ -0,0 +1,124 @@
1+
from skyflow.error import SkyflowError
2+
from skyflow import Env, Skyflow, LogLevel
3+
from skyflow.utils.enums import DetectEntities, MaskingMethod, DetectOutputTranscriptions
4+
from skyflow.vault.detect import DeidentifyFileRequest, TokenFormat, Transformations, DateTransformation, Bleep, FileInput
5+
from concurrent.futures import ThreadPoolExecutor
6+
7+
"""
8+
* Skyflow Deidentify File Example
9+
*
10+
* This sample demonstrates how to use all available options for deidentifying files
11+
* using an asynchronous approach.
12+
* Supported file types: images (jpg, png, etc.), pdf, audio (mp3, wav), documents,
13+
* spreadsheets, presentations, structured text.
14+
"""
15+
16+
def perform_file_deidentification_async():
17+
try:
18+
# Step 1: Configure Credentials
19+
credentials = {
20+
'path': '/path/to/credentials.json' # Path to credentials file
21+
}
22+
23+
# Step 2: Configure Vault
24+
vault_config = {
25+
'vault_id': '<YOUR_VAULT_ID>', # Replace with your vault ID
26+
'cluster_id': '<YOUR_CLUSTER_ID>', # Replace with your cluster ID
27+
'env': Env.PROD, # Deployment environment
28+
'credentials': credentials
29+
}
30+
31+
# Step 3: Configure & Initialize Skyflow Client
32+
skyflow_client = (
33+
Skyflow.builder()
34+
.add_vault_config(vault_config)
35+
.set_log_level(LogLevel.INFO) # Use LogLevel.ERROR in production
36+
.build()
37+
)
38+
39+
# Step 4: Create File Object
40+
file_path = '<FILE_PATH>' # Replace with your file path
41+
42+
deidentify_request = DeidentifyFileRequest(
43+
file=FileInput(file_path=file_path), # File to de-identify
44+
# entities=[DetectEntities.SSN, DetectEntities.CREDIT_CARD], # Entities to detect
45+
allow_regex_list=['<YOUR_REGEX_PATTERN>'], # Optional: Patterns to allow
46+
restrict_regex_list=['<YOUR_REGEX_PATTERN>'], # Optional: Patterns to restrict
47+
48+
# Token format configuration
49+
token_format=TokenFormat(
50+
vault_token=[DetectEntities.SSN], # Use vault tokens for these entities
51+
),
52+
53+
# Optional: Custom transformations
54+
# transformations=Transformations(
55+
# shift_dates=DateTransformation(
56+
# max_days=30,
57+
# min_days=10,
58+
# entities=[DetectEntities.DOB]
59+
# )
60+
# ),
61+
62+
# Output configuration
63+
output_directory='<OUTPUT_DIRECTORY_PATH>', # Where to save processed file
64+
wait_time=15, # Max wait time in seconds (max 64)
65+
66+
# Image-specific options
67+
output_processed_image=True, # Include processed image in output
68+
output_ocr_text=True, # Include OCR text in response
69+
masking_method=MaskingMethod.BLACKBOX, # Masking method for images
70+
71+
# PDF-specific options
72+
pixel_density=15, # Pixel density for PDF processing
73+
max_resolution=2000, # Max resolution for PDF
74+
75+
# Audio-specific options
76+
output_processed_audio=True, # Include processed audio
77+
output_transcription=DetectOutputTranscriptions.PLAINTEXT_TRANSCRIPTION, # Transcription type
78+
79+
# Audio bleep configuration
80+
81+
# bleep=Bleep(
82+
# gain=5, # Loudness in dB
83+
# frequency=1000, # Pitch in Hz
84+
# start_padding=0.1, # Padding at start (seconds)
85+
# stop_padding=0.2 # Padding at end (seconds)
86+
# )
87+
)
88+
89+
# Create a thread pool executor
90+
executor = ThreadPoolExecutor(max_workers=1)
91+
92+
future = executor.submit(
93+
lambda: skyflow_client.detect().deidentify_file(deidentify_request)
94+
)
95+
96+
def handle_response(future):
97+
exception = future.exception()
98+
if exception is not None:
99+
if isinstance(exception, SkyflowError):
100+
# Handle Skyflow-specific errors
101+
print('\nSkyflow Error:', {
102+
'http_code': exception.http_code,
103+
'grpc_code': exception.grpc_code,
104+
'http_status': exception.http_status,
105+
'message': exception.message,
106+
'details': exception.details
107+
})
108+
else:
109+
# Handle unexpected errors
110+
print('Unexpected Error:', exception)
111+
return
112+
113+
# Handle Successful Response
114+
result = future.result()
115+
print("\nDeidentify File Response:", result)
116+
117+
future.add_done_callback(handle_response)
118+
119+
executor.shutdown(wait=True)
120+
121+
except Exception as error:
122+
# Handle unexpected errors
123+
print('Unexpected Error:', error)
124+
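The submit-then-callback pattern used in this sample can be tried without a vault. The sketch below is an illustration only: `slow_task` is a hypothetical stand-in for `skyflow_client.detect().deidentify_file(...)`, and `run_in_background` is an assumed helper name. It uses the executor as a context manager, which calls `shutdown(wait=True)` on exit.

```python
from concurrent.futures import ThreadPoolExecutor


def slow_task(path):
    # Hypothetical stand-in for skyflow_client.detect().deidentify_file(request).
    if not path:
        raise ValueError('file path is required')
    return f'processed {path}'


def run_in_background(task, *args):
    """Run task on a worker thread; return ('ok', result) or ('error', exc)."""
    outcome = []

    def handle_response(future):
        # future.exception() returns None when the task finished normally.
        exception = future.exception()
        if exception is not None:
            outcome.append(('error', exception))
        else:
            outcome.append(('ok', future.result()))

    # Leaving the with-block calls executor.shutdown(wait=True), so the
    # callback has already run by the time outcome is read.
    with ThreadPoolExecutor(max_workers=1) as executor:
        future = executor.submit(task, *args)
        future.add_done_callback(handle_response)

    return outcome[0]


print(run_in_background(slow_task, 'report.pdf'))
```

Errors raised inside the worker thread surface through `future.exception()` in the callback, not through a `try`/`except` placed around `executor.submit(...)`.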

samples/service_account/signed_token_generation_example.py

Lines changed: 41 additions & 29 deletions
@@ -18,42 +18,54 @@
1818
credentials_string = json.dumps(skyflow_credentials)
1919

2020

21-
options = {
22-
'ctx': 'CONTEXT_ID',
23-
'data_tokens': ['DATA_TOKEN1', 'DATA_TOKEN2'],
24-
'time_to_live': 90, # in seconds
25-
}
21+
# Approach 1: Signed data tokens with string context
22+
def get_signed_tokens_with_string_context():
23+
options = {
24+
'ctx': 'user_12345',
25+
'data_tokens': ['DATA_TOKEN1', 'DATA_TOKEN2'],
26+
'time_to_live': 90, # in seconds
27+
}
28+
try:
29+
data_token, signed_data_token = generate_signed_data_tokens(file_path, options)
30+
return data_token, signed_data_token
31+
except Exception as e:
32+
print(f'Error: {str(e)}')
2633

27-
def get_signed_bearer_token_from_file_path():
28-
# Generate signed bearer token from credentials file path.
29-
global bearer_token
3034

35+
# Approach 2: Signed data tokens with JSON object context (dict)
36+
# Each key maps to a Skyflow CEL policy variable under request.context.*
37+
# For example: request.context.role == "analyst" and request.context.department == "research"
38+
def get_signed_tokens_with_object_context():
39+
options = {
40+
'ctx': {
41+
'role': 'analyst',
42+
'department': 'research',
43+
'user_id': 'user_67890',
44+
},
45+
'data_tokens': ['DATA_TOKEN1', 'DATA_TOKEN2'],
46+
'time_to_live': 90,
47+
}
3148
try:
32-
if not is_expired(bearer_token):
33-
return bearer_token
34-
else:
35-
data_token, signed_data_token = generate_signed_data_tokens(file_path, options)
36-
return data_token, signed_data_token
37-
49+
data_token, signed_data_token = generate_signed_data_tokens(file_path, options)
50+
return data_token, signed_data_token
3851
except Exception as e:
39-
print(f'Error generating token from file path: {str(e)}')
52+
print(f'Error: {str(e)}')
4053

4154

42-
def get_signed_bearer_token_from_credentials_string():
43-
# Generate signed bearer token from credentials string.
44-
global bearer_token
45-
55+
# Approach 3: Signed data tokens from credentials string
56+
def get_signed_tokens_from_credentials_string():
57+
options = {
58+
'ctx': 'user_12345',
59+
'data_tokens': ['DATA_TOKEN1', 'DATA_TOKEN2'],
60+
'time_to_live': 90,
61+
}
4662
try:
47-
if not is_expired(bearer_token):
48-
return bearer_token
49-
else:
50-
data_token, signed_data_token = generate_signed_data_tokens_from_creds(credentials_string, options)
51-
return data_token, signed_data_token
52-
63+
data_token, signed_data_token = generate_signed_data_tokens_from_creds(credentials_string, options)
64+
return data_token, signed_data_token
5365
except Exception as e:
54-
print(f'Error generating token from credentials string: {str(e)}')
55-
66+
print(f'Error: {str(e)}')
5667

57-
print(get_signed_bearer_token_from_file_path())
5868

59-
print(get_signed_bearer_token_from_credentials_string())
69+
print("String context:", get_signed_tokens_with_string_context())
70+
print("Object context:", get_signed_tokens_with_object_context())
71+
print("Creds string:", get_signed_tokens_from_credentials_string())

samples/service_account/token_generation_with_context_example.py

Lines changed: 37 additions & 9 deletions
@@ -18,11 +18,13 @@
1818
}
1919
credentials_string = json.dumps(skyflow_credentials)
2020

21-
options = {'ctx': '<CONTEXT_ID>'}
2221

23-
def get_bearer_token_with_context_from_file_path():
24-
# Generate bearer token with context from credentials file path.
22+
# Approach 1: Bearer token with string context
23+
# Use a simple string identifier when your policy references a single context value.
24+
# In your Skyflow policy, reference this as: request.context
25+
def get_bearer_token_with_string_context():
2526
global bearer_token
27+
options = {'ctx': 'user_12345'}
2628

2729
try:
2830
if not is_expired(bearer_token):
@@ -31,14 +33,40 @@ def get_bearer_token_with_context_from_file_path():
3133
token, _ = generate_bearer_token(file_path, options)
3234
bearer_token = token
3335
return bearer_token
36+
except Exception as e:
37+
print(f'Error generating token: {str(e)}')
38+
39+
40+
# Approach 2: Bearer token with JSON object context (dict)
41+
# Use a dict when your policy needs multiple context values for conditional data access.
42+
# Each key maps to a Skyflow CEL policy variable under request.context.*
43+
# For example: request.context.role == "admin" and request.context.department == "finance"
44+
def get_bearer_token_with_object_context():
45+
global bearer_token
46+
options = {
47+
'ctx': {
48+
'role': 'admin',
49+
'department': 'finance',
50+
'user_id': 'user_12345',
51+
}
52+
}
3453

54+
try:
55+
if not is_expired(bearer_token):
56+
return bearer_token
57+
else:
58+
token, _ = generate_bearer_token(file_path, options)
59+
bearer_token = token
60+
return bearer_token
3561
except Exception as e:
36-
print(f'Error generating token from file path: {str(e)}')
62+
print(f'Error generating token: {str(e)}')
3763

3864

65+
# Approach 3: Bearer token with string context from credentials string
3966
def get_bearer_token_with_context_from_credentials_string():
40-
# Generate bearer token with context from credentials string.
4167
global bearer_token
68+
options = {'ctx': 'user_12345'}
69+
4270
try:
4371
if not is_expired(bearer_token):
4472
return bearer_token
@@ -47,9 +75,9 @@ def get_bearer_token_with_context_from_credentials_string():
4775
bearer_token = token
4876
return bearer_token
4977
except Exception as e:
50-
print(f"Error generating token from credentials string: {str(e)}")
51-
78+
print(f"Error generating token: {str(e)}")
5279

53-
print(get_bearer_token_with_context_from_file_path())
5480

55-
print(get_bearer_token_with_context_from_credentials_string())
81+
print("String context:", get_bearer_token_with_string_context())
82+
print("Object context:", get_bearer_token_with_object_context())
83+
print("Creds string:", get_bearer_token_with_context_from_credentials_string())
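The three functions in this sample appear to share one global `bearer_token`, so once the first call caches a valid token, later calls may return it even though they asked for a different context. One way to avoid that is to cache per context. This is a minimal sketch under stated assumptions: `_token_cache`, `context_cache_key`, and `get_cached_token` are hypothetical names, and `generate` and `is_expired` are injected callables standing in for the SDK's `generate_bearer_token` and `is_expired`.

```python
import json

# Hypothetical cache, not part of the Skyflow SDK: one entry per ctx value.
_token_cache = {}


def context_cache_key(ctx):
    # Serialize with sorted keys so equal dict contexts share one entry.
    return json.dumps(ctx, sort_keys=True)


def get_cached_token(ctx, generate, is_expired):
    """Return a token for ctx, regenerating it when missing or expired.

    generate(ctx) and is_expired(token) are injected stand-ins for the
    SDK calls, so this sketch runs without credentials.
    """
    key = context_cache_key(ctx)
    token = _token_cache.get(key)
    if token is None or is_expired(token):
        token = generate(ctx)
        _token_cache[key] = token
    return token


# Usage with stub callables: each context gets its own cached token.
def fake_generate(ctx):
    return f'token-for-{context_cache_key(ctx)}'


def never_expired(token):
    return False


string_token = get_cached_token('user_12345', fake_generate, never_expired)
object_token = get_cached_token({'role': 'admin'}, fake_generate, never_expired)
print(string_token != object_token)
```

Keying the cache on the serialized context keeps the reuse-until-expired behavior of the sample while making sure a dict context is never served a token generated for a string context.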

setup.py

Lines changed: 1 addition & 1 deletion
@@ -7,7 +7,7 @@
77

88
if sys.version_info < (3, 8):
99
raise RuntimeError("skyflow requires Python 3.8+")
10-
current_version = '2.0.0.dev0+f7d26df'
10+
current_version = '2.0.2'
1111

1212
setup(
1313
name='skyflow',
