Commit a1e6db9

Python Data Provider documentation
2 parents 94db6b9 + 56f24e7 commit a1e6db9

10 files changed

+326
-182
lines changed

documentation/.vuepress/components/LinkableChoices.vue

Lines changed: 5 additions & 0 deletions
@@ -93,6 +93,11 @@ export default {
 }
 }

+h3 {
+  font-size: 1rem;
+  text-align: center;
+}
+
 img {
   width: 35px;
   height: 35px;
Lines changed: 22 additions & 0 deletions (diff not loaded)
Lines changed: 26 additions & 0 deletions (diff not loaded)
Lines changed: 4 additions & 0 deletions (diff not loaded)
Lines changed: 68 additions & 24 deletions
@@ -1,51 +1,95 @@
-# Inserting data into DebiAI
+# Inserting Data into DebiAI

-Being a data visualization application, providing the project data to DebiAI is a required step.
+As a data visualization application, providing project data to DebiAI is a required step.

 ## Requirements

-### A DebiAI instance
+### A Running DebiAI Instance

-You will need to have a running DebiAI instance to insert you project data to. (see [Installation](../introduction/gettingStarted/installation/README.md))
+You need a running DebiAI instance to insert your project data. (See [Installation](../introduction/gettingStarted/installation/README.md))

-### Data
+### Data Format Requirements

-The data you want to analyze with DebiAI will need to respect a specific format.
+The data you want to analyze in DebiAI must follow a specific format.

-- **CSV like format**
+- **CSV-like Format**

-If your data can be represented in an array like format, adding them to DebiAI will be easy. The data can also support different levels of nesting (see [unfolding columns](../dashboard/unfolding/)).
+If your data is structured in an array-like format, adding it to DebiAI is straightforward. DebiAI also supports different levels of nesting (see [Unfolding Columns](../dashboard/unfolding/)).

-- **Data types**
+- **Supported Data Types**

 DebiAI supports the following data types:

 - `num`: numerical values
 - `str`: string values
 - `bool`: boolean values
-- `array`: array of values (see [unfolding columns](../dashboard/unfolding/))
-- `dict`: dictionary of values (see [unfolding columns](../dashboard/unfolding/))
+- `array`: arrays of values (see [Unfolding Columns](../dashboard/unfolding/))
+- `dict`: dictionary objects (see [Unfolding Columns](../dashboard/unfolding/))
+- `None`: missing values

-Dates are supported by DebiAI, you can provide them as strings.
+Dates are supported and should be provided as strings.

-- **Missing values**
+- **Handling Missing Values**

-DebiAI supports data with missing values (`None`, `NaN` or `null` values) since 0.29.0. The missing values will be displayed as `null` by widgets that support them. Statistics about missing values will be displayed in the dashboard.
+Since version 0.29.0, DebiAI supports missing values (`None`, `NaN`, or `null`). Widgets that support missing values will display them as `null`, and statistics about missing data will be available in the dashboard.

-- **Samples size**
+- **Sample Size Limitations**

-It is not recommended to provide more than 2.000.000 samples, as it will take a long time to process. We are working on improving this limit.
+Providing more than **2,000,000 samples** is not recommended, as it may significantly increase processing time. We are actively working on improving this limitation.

-## There is currently two ways to insert data into DebiAI:
+## Methods for Inserting Data into DebiAI

-- ### [Python module](pythonModule/README.md#python-module)
+There are currently two ways to insert data into DebiAI:

-The main way to add provide the project data to the application is through the DebiAI Python module.
-The module was designed to be used directly in your Python workflow, to add model results directly after its evaluation for example.
+<img src="/debiai_architecture.png" alt="DebiAI architecture" width="400"/>

-- ### [Data providers](dataProviders/README.md#data-providers)
+<LinkableChoices :choices="[
+{
+title: '1. Data Providers',
+description: 'Make DebiAI directly access your project data',
+imageLink: '/getStarted/data.svg',
+elementIdDestination: '_1-data-providers-recommended'
+},
+{
+title: '2. Python Module',
+description: 'Directly insert data from your Python workflow',
+imageLink: '/install/python.svg',
+elementIdDestination: '_2-python-module'
+}
+]"
+/>

-A DebiAI data provider is a REST service that will expose your project to DebiAI.
-DebiAI will directly ask for the data from your project making the data loading process very quick and customizable. Unlike the DebiAI Python module, the provided data won't have to be duplicated in the DebiAI application.
+### **1. [Data Providers](dataProviders/README.md#data-providers) (Recommended)**

-Making a data provider is the most efficient way to make your project data accessible to DebiAI, no matter the data base that your project is using.
+A **DebiAI Data Provider** is a service that exposes your project data to DebiAI. This method allows DebiAI to directly retrieve metadata from your project, making data loading **fast** and **customizable**.
+
+**Key benefits**:
+
+- No need to upload or duplicate data in DebiAI.
+- Always up to date with the latest project data.
+- Works with any files or databases used by your project.
+
+⚠️ **Limitations**:
+
+- Requires a custom implementation to expose your data.
+
+To simplify implementation, you can use the [DebiAI Data Provider Python module](https://github.com/debiai/easy-data-provider).
+
+### **2. [Python Module](pythonModule/README.md#python-module)**
+
+You can also insert data directly from your Python workflow using the [DebiAI Python module](https://github.com/debiai/py-debiai). This is useful for integrating new data or model results immediately after generation.
+
+**Key benefits**:
+
+- Easier to implement.
+
+⚠️ **Limitations**:
+
+- Requires data duplication in DebiAI, increasing load time.
+- Data updates must be done manually.
+
+While easier to implement, this method is less efficient than using a Data Provider.
+
+---
+
+By following the recommended **Data Provider** approach, you ensure an optimized project data integration with DebiAI.
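
Note on the data format described in this file: the sketch below illustrates how CSV-like rows might map onto the type labels listed in the documentation (`num`, `str`, `bool`, `array`, `dict`, `None`). It is only an illustration in plain Python. The helper names `debiai_type`, `column_types`, and `missing_counts`, the sample rows, and the mapping rules are assumptions made here, not part of DebiAI or its Python module.

```python
import math

# Hypothetical helper: maps a Python value to one of the type labels
# named in the documentation. The mapping rules are an assumption.
def debiai_type(value):
    # None and NaN are both treated as missing values.
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return "None"
    # bool must be tested before int, because bool is a subclass of int.
    if isinstance(value, bool):
        return "bool"
    if isinstance(value, (int, float)):
        return "num"
    if isinstance(value, str):
        return "str"  # dates are provided as strings
    if isinstance(value, (list, tuple)):
        return "array"
    if isinstance(value, dict):
        return "dict"
    raise TypeError(f"unsupported value type: {type(value).__name__}")

# Illustrative CSV-like rows (one dict per sample); all values are made up.
samples = [
    {"id": "s1", "speed": 12.5, "weather": "rain", "night": False,
     "boxes": [1, 2], "meta": {"camera": "front"}, "date": "2024-01-31"},
    {"id": "s2", "speed": None, "weather": "sun", "night": True,
     "boxes": [], "meta": {"camera": "rear"}, "date": "2024-02-01"},
    {"id": "s3", "speed": 8, "weather": float("nan"), "night": False,
     "boxes": [3], "meta": {"camera": "front"}, "date": "2024-02-02"},
]

def column_types(rows):
    """Collect the set of type labels observed in each column."""
    types = {}
    for row in rows:
        for column, value in row.items():
            types.setdefault(column, set()).add(debiai_type(value))
    return types

def missing_counts(rows):
    """Count missing values per column."""
    counts = {}
    for row in rows:
        for column, value in row.items():
            counts[column] = counts.get(column, 0) + (debiai_type(value) == "None")
    return counts
```

A check like this could be run before insertion to spot columns that mix types or contain many missing values.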
Lines changed: 33 additions & 33 deletions
@@ -1,50 +1,50 @@
-# Data providers
+# Data Providers

-Making a data provider is the most efficient way to make your project data accessible to DebiAI.
+Creating a **Data Provider** is the most efficient way to make your project data accessible to DebiAI.

-A data provider is a service that you create that can respond to the data requests of DebiAI. This service can be made in **any language**, can use **any kind of databases** and can be hosted on **any platform** as long at the DebiAI data-provider's API is respected. So unlike the [Debiai Python module](../pythonModule/README.md), your project data won't be duplicated in DebiAI and **DebiAI will always analyze the latest data**.
+A **Data Provider** is a service that responds to DebiAI's data requests. It can be implemented in **any language**, use **any database**, and be hosted on **any platform**, as long as it follows the **DebiAI Data Provider API**.

-### How does it work?
+Unlike the [DebiAI Python module](../pythonModule/README.md), this method **does not duplicate data** in DebiAI, ensuring that DebiAI always analyzes the latest version of your data.

-DebiAI will ask your data provider to return the data that it needs to display the dashboard:
+## How It Works

-- Available Project lists
-- Available data IDs
-- Project data
-- Available models and results
-- Data selections (optional)
+DebiAI queries your Data Provider to retrieve information for the dashboard:

-DebiAI will also be able to send the data selections made by the user.
+- **Project lists:** available projects
+- **Data IDs:** available samples
+- **Project data:** actual data used for analysis
+- **Model results:** available models and outputs (optional)
+- **Data selections:** user-defined data selections (optional)

-### Pros and cons
+Additionally, DebiAI can send data selections made by the user back to the provider:

-- **Pros**:
-- DebiAI will always analyze the latest data
-- Your data will not be duplicated in DebiAI
-- You can use any languages and databases
-- You can host your data provider on any platform
-- Better for long term projects
-- **Cons**:
-- You need to create a data provider (you can start with our [data provider templates](./quickStart.md#creation-of-a-data-provider))
-- You need to respect the DebiAI data-provider's API (we made it as simple as possible)
+- **Project deletion**
+- **Model deletion**
+- **Selection creation and deletion**

-### The API
+## Pros & Cons

-The Data-providers API as been described with OpenAPI 3.0.
+**Pros**:

-- [Data-providers API Swagger documentation](https://petstore.swagger.io/?url=https://raw.githubusercontent.com/debiai/data-provider-nodejs-template/main/data-provider-API.yaml)
-- [Data-providers API yaml file](https://github.com/debiai/data-provider-nodejs-template/blob/main/data-provider-API.yaml).
+- **Always up to date** – DebiAI always analyzes the latest data.
+- **No data duplication** – Saves storage space.
+- **Flexibility** – Works with any programming language and database.
+- **Platform-independent** – Can be hosted anywhere.
+- **Ideal for middle to long-term projects**.

-### Speed
+⚠️ **Cons**:

-The speed at which your data loads into DebiAI depends on how quickly your data provider can provide them. So it depends on the size of the data and the speed of your database.
+- Requires an initial **custom implementation**, but it's a one-time setup. To simplify implementation, you can use the [DebiAI Data Provider Python module](https://github.com/debiai/easy-data-provider).

-The quicker your data provider is, the quicker your data will be available in DebiAI.
+## Performance Considerations

-### Getting started
+The **speed of data loading** into DebiAI depends on how quickly your Data Provider responds. This is influenced by:

-To create your first data provider, read our [Quick start](quickStart/README.md).
+- **Data size** – Larger datasets take longer to load.
+- **Database performance** – A fast database speeds up response times.

-::: warning Limitations
-- The interface between data-providers and DebiAI is not yet stable, so the API is likely to change in the future.
-:::
+Optimizing your Data Provider ensures **faster** data retrieval in DebiAI.
+
+## Getting Started
+
+To create your first Data Provider, check out our [Quick Start Guide](quickStart/README.md).
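
Note on the "How It Works" list in this file: the kinds of requests a Data Provider answers (project lists, data IDs, project data, selection creation and deletion, project deletion) can be pictured with a small in-memory sketch. This is not the DebiAI Data Provider API and not the `easy-data-provider` module. The class `InMemoryProvider`, its method names, and its parameters are hypothetical and chosen only for illustration; a real provider would expose equivalent operations as a web service that follows the published API.

```python
from dataclasses import dataclass, field

@dataclass
class InMemoryProvider:
    """Hypothetical stand-in for a data provider backed by plain dicts."""
    # project_id -> {sample_id: row}
    projects: dict = field(default_factory=dict)
    # project_id -> {selection_name: [sample_id, ...]}
    selections: dict = field(default_factory=dict)

    def list_projects(self):
        """Project lists: the available projects."""
        return sorted(self.projects)

    def list_data_ids(self, project_id, start=0, count=None):
        """Data IDs: the available samples, optionally in pages."""
        ids = sorted(self.projects[project_id])
        end = None if count is None else start + count
        return ids[start:end]

    def get_data(self, project_id, sample_ids):
        """Project data: rows for the requested sample IDs only."""
        rows = self.projects[project_id]
        return {sid: rows[sid] for sid in sample_ids if sid in rows}

    def create_selection(self, project_id, name, sample_ids):
        """Store a user-defined selection sent back by the dashboard."""
        self.selections.setdefault(project_id, {})[name] = list(sample_ids)

    def delete_selection(self, project_id, name):
        self.selections.get(project_id, {}).pop(name, None)

    def delete_project(self, project_id):
        self.projects.pop(project_id, None)
        self.selections.pop(project_id, None)

# Illustrative usage with made-up data.
provider = InMemoryProvider(projects={
    "demo": {
        "s1": {"speed": 12.5, "weather": "rain"},
        "s2": {"speed": 9.0, "weather": "sun"},
        "s3": {"speed": 7.2, "weather": "fog"},
    }
})
page = provider.list_data_ids("demo", start=1, count=2)
rows = provider.get_data("demo", page)
provider.create_selection("demo", "slow", ["s3"])
```

Serving IDs in pages and returning only the requested rows, as `list_data_ids` and `get_data` do here, is one way to keep each response small, which matters because loading speed depends on how quickly the provider responds.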
