Commit 2736fdf

improved agent instructions
1 parent b67fe3c commit 2736fdf

3 files changed

Lines changed: 55 additions & 220 deletions

Lines changed: 16 additions & 6 deletions
@@ -1,6 +1,16 @@
----
-description: 'A custom chat mode to interact with Tercen.'
-tools: ['extensions', 'codebase', 'usages', 'vscodeAPI', 'problems', 'changes', 'testFailure', 'terminalSelection', 'terminalLastCommand', 'openSimpleBrowser', 'fetch', 'findTestFiles', 'searchResults', 'githubRepo', 'runCommands', 'runTasks', 'editFiles', 'runNotebooks', 'search', 'new', 'github']
----
-
-You are a specialized AI assistant for the Tercen platform, a data analysis workflow environment. Your primary role is to help users (developers) create or upgrade Tercen operators. Follow the custom copilot instructions to assist users effectively.
+---
+description: 'A custom chat mode to interact with Tercen.'
+tools: ['extensions', 'codebase', 'usages', 'vscodeAPI', 'problems', 'changes', 'testFailure', 'terminalSelection', 'terminalLastCommand', 'openSimpleBrowser', 'fetch', 'findTestFiles', 'searchResults', 'githubRepo', 'runCommands', 'runTasks', 'editFiles', 'runNotebooks', 'search', 'new', 'github']
+---
+
+You are a specialized AI assistant for the Tercen platform, a data analysis workflow environment. Your primary role is to help users (developers) create or upgrade Tercen operators. Follow the custom copilot instructions to assist users effectively.
+
+## Project Creation Steps
+
+When a new project has been initialised, follow the steps below:
+
+1. Get the requirements from the README.md file or from the user
+2. Update the operator.json file based on the requirements. The image tag should be an incrementation of the last git tag (the initial one being 0.0.1).
+3. Populate the R or Python code in main.R or main.py respectively based on the requirements.
+4. Proceed with the remaining user instructions as needed.
+
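
Step 2 of the added instructions asks for an image tag that increments the last git tag. The rule can be sketched in Python as follows. This is only an illustration: the commit does not say which part of the version is incremented, so the sketch assumes plain `MAJOR.MINOR.PATCH` tags and a patch-level bump, and `next_image_tag` is a hypothetical helper name rather than anything in the repository.

```python
import re
from typing import Optional

# Assumption: tags look like MAJOR.MINOR.PATCH (for example 0.0.1).
_TAG_PATTERN = re.compile(r"^(\d+)\.(\d+)\.(\d+)$")


def next_image_tag(last_tag: Optional[str]) -> str:
    """Return the image tag to write into operator.json (hypothetical helper)."""
    # No previous tag: the instructions name 0.0.1 as the initial tag.
    if not last_tag:
        return "0.0.1"
    match = _TAG_PATTERN.match(last_tag.strip())
    if match is None:
        raise ValueError(f"unrecognised tag format: {last_tag!r}")
    major, minor, patch = (int(part) for part in match.groups())
    # Assumption: "incrementation" means bumping the patch component.
    return f"{major}.{minor}.{patch + 1}"


print(next_image_tag(None))     # 0.0.1
print(next_image_tag("0.0.1"))  # 0.0.2
```

If the project bumps minor or major versions instead, only the final `return` line would need to change.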
Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
+---
+description: 'A custom chat mode to interact with Tercen.'
+tools: ['extensions', 'codebase', 'usages', 'vscodeAPI', 'problems', 'changes', 'testFailure', 'terminalSelection', 'terminalLastCommand', 'openSimpleBrowser', 'fetch', 'findTestFiles', 'searchResults', 'githubRepo', 'runCommands', 'runTasks', 'editFiles', 'runNotebooks', 'search', 'new', 'github']
+---
+
+You are a specialized AI assistant for the Tercen platform, a data analysis workflow environment. Your primary role is to help users reviewing an operator project after it has been developed. Follow the custom copilot instructions to assist users effectively.
+
+### Repository Structure Final Check
+
+Ensure your complete operator repository has this structure:
+
+```
+your_operator_repository/
+├── .github/
+│   └── workflows/           # CI/CD automation
+├── main.R                   # Main operator implementation (R)
+├── main.py                  # Main operator implementation (Python)
+├── operator.json            # Operator metadata and parameters
+├── README.md                # Comprehensive documentation
+├── requirements.txt         # Python dependencies
+├── renv.lock                # R dependencies snapshot
+└── test/                    # Test files and data (recommended)
+    ├── input.csv
+    ├── output.csv
+    ├── test.json
+    └── README.md
+```
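
The structure check described in this new file could be automated along the following lines. This is a minimal sketch, not part of the commit: `missing_entries` is a hypothetical helper, and splitting the entries into R-specific and Python-specific groups is an assumption based on the comments in the tree (the commit itself lists all files together). The `test/` folder is left out because the tree marks it as recommended rather than required.

```python
from pathlib import Path
from typing import List

# Entries copied from the tree above; the grouping by language is an assumption.
COMMON = [".github/workflows", "operator.json", "README.md"]
PER_LANGUAGE = {
    "R": ["main.R", "renv.lock"],
    "Python": ["main.py", "requirements.txt"],
}


def missing_entries(repo_root: str, language: str) -> List[str]:
    """List expected files or folders that are absent from repo_root."""
    root = Path(repo_root)
    expected = COMMON + PER_LANGUAGE[language]
    # Keep the declared order so the report is stable between runs.
    return [entry for entry in expected if not (root / entry).exists()]
```

Calling `missing_entries(".", "R")` from the repository root returns an empty list when every expected entry is present.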

.github/copilot-instructions.md

Lines changed: 12 additions & 214 deletions
@@ -6,223 +6,21 @@ You are a specialized AI assistant for the Tercen platform, a data analysis work

 ## R Operator Development Guidelines

-* This repository has been initialised from a template including the basic structure for a Tercen R operator.
-
 * The developer should specify requirements in terms of input as part or the README.md file or directly in the prompt. If they are unclear, ask for details. In particular, input projection should be clearly described (input factors and their mapping to the crosstab view: rows, column, y axis, etc.).

 * Ignore the tests folder for initial development

 * Update the operator.json file based on the requirements. The image tag should be an incrementation of the last git tag (the initial one being 0.0.1).

-* When developing an operator, look at how existing operators are implemented, paying attention to their structure, naming conventions, and functionality. Look at how data is loaded and saved using the Tercen API.
-
-* Operator examples can be reviewed from the following repositories:
-- https://github.com/tercen/mean_operator
-- https://github.com/tercen/median_operator
-- https://github.com/tercen/plot_operator
-- https://github.com/tercen/pca_operator
-- https://github.com/tercen/umap_operator
-- https://github.com/tercen/read_csv_operator
-
-## Example Input and Output Patterns
-
-### Output Patterns {-}
-
-#### Data Frame Output {-}
-
-Data frames in Tercen are outputted per row, per column, or per cell. Each type of output requires to include a specific column: `.ri` for rows and `.ci` for columns.
-
-##### 2.1 Row-Wise Output {-}
-
-The following code outputs the mean value for each row. It groups the data by `.ri`, calculates the mean, and saves it back to Tercen.
-
-```python
-df_per_row = (
-    ctx
-    .select([".ri", ".y"])
-    .groupby([".ri"])
-    .mean()
-    .rename({'.y': 'mean_rows'})
-)
-
-df_per_row = ctx.add_namespace(df_per_row)
-ctx.save(df_per_row)
-```
-
-##### Column-Wise Output {-}
-
-To calculate the mean value for each column, group the data by `.ci`.
-
-```python
-df_per_col = (
-    ctx
-    .select([".ci", ".y"])
-    .groupby([".ci"])
-    .mean()
-    .rename({'.y': 'mean_cols'})
-)
-
-df_per_col = ctx.add_namespace(df_per_col)
-ctx.save(df_per_col)
-```
-
-##### Cell-Wise Output {-}
-
-For calculating means per cell, group by both `.ri` and `.ci`.
-
-```python
-df_per_cell = (
-    ctx
-    .select([".ri", ".ci", ".y"])
-    .groupby([".ri", ".ci"])
-    .mean()
-    .rename({'.y': 'mean_cells'})
-)
-
-df_per_cell = ctx.add_namespace(df_per_cell)
-ctx.save(df_per_cell)
-```
-
-##### Saving Multiple Data Frames {-}
-
-Multiple tables can be saved as a list in Tercen.
-
-```python
-ctx.save([df_per_row, df_per_col, df_per_cell])
-```
-
-#### Tercen Relation Output {-}
-
-Relations in Tercen support complex data linking, such as joining tables and managing non-standard row/column associations. Relations allow for left joins and merging tables.
-
-Here are the relevant API calls:
-
-1. **Create Relations** using `as_relation()`.
-2. **Join Relations** with `left_join_relation()`.
-3. **Save Relations** with `save_relation()`.
-
-__Example: Generating a PCA relation with components.__
-
-```r
-data.matrix = t(ctx %>% as.matrix())
-
-aPca = data.matrix %>% prcomp(scale = scale, center = center, tol = tol)
-
-maxComp = ifelse(maxComp > 0, min(maxComp, nrow(aPca$rotation)), nrow(aPca$rotation))
-
-npc = length(aPca$sdev)
-
-# pad left pc names with 0 to ensure alphabetic order
-pcRelation = tibble(PC = sprintf(paste0("PC%0", nchar(as.character(npc)), "d"), 1:npc)) %>%
-  ctx$addNamespace() %>%
-  as_relation()
-
-eigenRelation = tibble(pc.eigen.values = aPca$sdev^2) %>%
-  mutate(var_explained = .$pc.eigen.values / sum(.$pc.eigen.values))%>%
-  ctx$addNamespace() %>%
-  as_relation()
-
-loadingRelation = aPca$rotation[,1:maxComp] %>%
-  as_tibble() %>%
-  setNames(0:(ncol(.)-1)) %>%
-  mutate(.var.rids = 0:(nrow(.) - 1)) %>%
-  pivot_longer(-.var.rids,
-               names_to = ".pc.rids",
-               values_to = "pc.loading",
-               names_transform=list(.pc.rids=as.integer)) %>%
-  ctx$addNamespace() %>%
-  as_relation() %>%
-  left_join_relation(ctx$rrelation, ".var.rids", ctx$rrelation$rids)
-
-scoresRelation = aPca$x[,1:maxComp] %>%
-  as_tibble() %>%
-  setNames(0:(ncol(.)-1)) %>%
-  mutate(.i=0:(nrow(.)-1)) %>%
-  pivot_longer(-.i,
-               names_to = ".pc.rids",
-               values_to = "pc.value",
-               names_transform=list(.pc.rids=as.integer)) %>%
-  ctx$addNamespace() %>%
-  as_relation() %>%
-  left_join_relation(ctx$crelation, ".i", ctx$crelation$rids)
-
-# link all 4 relation into one and save
-rels <- pcRelation %>%
-  left_join_relation(scoresRelation, pcRelation$rids, ".pc.rids") %>%
-  left_join_relation(eigenRelation, pcRelation$rids, eigenRelation$rids) %>%
-  left_join_relation(loadingRelation, pcRelation$rids, ".pc.rids") %>%
-  as_join_operator(ctx$cnames, ctx$cnames)
-```
-
-#### File Output {-}
-
-Tercen supports outputting files, such as images or documents, by first saving them temporarily and then converting them to a Tercen-compatible format.
-
-```python
-from tempfile import NamedTemporaryFile
-import matplotlib.pyplot as plt
-
-# Save plot as a temporary file
-tmp = NamedTemporaryFile(delete=True, suffix='.png')
-data_np = df_per_cell["mean_cells"].to_numpy()
-
-plt.hist(data_np, bins=5, edgecolor="black")
-plt.xlabel("Value")
-plt.ylabel("Frequency")
-plt.title("Histogram")
-plt.savefig(tmp)
-
-# Convert file to Tercen table
-from tercen.util.helper_functions import as_relation
-
-df_plot = as_relation(tmp.name)
-ctx.save_relation(df_plot)
-```
-
-#### Advanced Input: Reading from Project {-}
-
-Tercen enables reading files and documents directly from a project.
-
-1. **Identify Project and Folder IDs** using `workflowId` and `folderId`.
-2. **Locate Files** by matching the file name in the project.
-3. **Download Documents** using `download()`.
-
-Example: Retrieving a document from the project.
-
-```python
-# Get the workflow and Folder ID that contains it
-wf = ctx.context.client.workflowService.get(ctx.context.workflowId)
-wf.folderId
-
-# Get project ID
-projectId = ctx.schema.projectId
-project = ctx.context.client.projectService.get(projectId)
-projectUser = project.acl.owner
-
-projectFiles = ctx.client.projectDocumentService.findProjectObjectsByFolderAndName(\
-    [projectId, "ufff0", "ufff0"],\
-    [projectId, "", ""], useFactory=False, limit=25000 )
-
-fnames = [f.name for f in projectFiles]
-
-## Download a document
-my_file_id = [index for index, fname in enumerate(fnames) if 'crabs_file.csv' in fname][0] # First match
-
-# Get Document properties
-pf = projectFiles[my_file_id]
-
-pf.name # file name
-pf.folderId # folder ID
-pf.id # document ID
-
-# Download data and read response
-resp = ctx.context.client.fileService.download(pf.id)
-resp.read()
-# If binary response, must be decoded
-```
-
-> **Recommendation:** Avoid manual file retrieval by setting document IDs directly in the workflow input projection.
-
----
-
-This chapter provides essential steps for managing inputs and outputs within Tercen. Utilize these methods to streamline workflows and enhance data integration capabilities in your Tercen projects.
+* When developing an operator, look at how existing operators are implemented, paying attention to their structure, naming conventions, and functionality. Look at how data is loaded and saved using the Tercen API. Operator development documentation is available at:
+- https://github.com/tercen/developers_guide/blob/master/book/02-operator-development/4-basic-implementation.qmd
+- https://github.com/tercen/developers_guide/blob/master/book/02-operator-development/10-input-output-patterns.qmd
+- https://github.com/tercen/developers_guide/blob/master/book/02-operator-development/5-advanced-features.qmd
+
+For further examples, fetch and analyze the following public URLs:
+- https://github.com/tercen/mean_operator/blob/master/main.R
+- https://github.com/tercen/median_operator/blob/master/main.R
+- https://github.com/tercen/plot_operator/blob/master/main.R
+- https://github.com/tercen/pca_operator/blob/master/main.R
+- https://github.com/tercen/umap_operator/blob/master/main.R
+- https://github.com/tercen/read_csv_operator/blob/master/main.R
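
The added instructions point the agent at GitHub `blob` pages, which are HTML views of each file. A client that wants only the file text would typically request the matching `raw.githubusercontent.com` address instead. The sketch below shows that rewrite in Python; the commit does not describe how the agent fetches these URLs, so this is an assumption about one convenient approach, and `to_raw_url` is a hypothetical helper. It also assumes the branch name contains no slash, which holds for the `master` links listed here.

```python
from urllib.parse import urlparse


def to_raw_url(blob_url: str) -> str:
    """Rewrite a github.com blob URL to its raw-content equivalent (hypothetical helper)."""
    parsed = urlparse(blob_url)
    parts = parsed.path.strip("/").split("/")
    # Expected path shape: <owner>/<repo>/blob/<branch>/<file path...>
    if parsed.netloc != "github.com" or len(parts) < 5 or parts[2] != "blob":
        raise ValueError(f"not a GitHub blob URL: {blob_url!r}")
    owner, repo, _, branch, *file_path = parts
    return "https://raw.githubusercontent.com/" + "/".join([owner, repo, branch, *file_path])


print(to_raw_url("https://github.com/tercen/mean_operator/blob/master/main.R"))
```

The function performs no network access; it only rewrites the address, so fetching remains the caller's responsibility.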
