Prisma join performances

**Is your feature request related to a problem? Please describe.**
I'm working on a large-ish IoT project with a non-trivial amount of data. The project uses Prisma as a DAL on AWS Fargate, with an Aurora (Postgres) database. For some of the project's data retrieval needs, I had to fall back to raw SQL for performance reasons. Some of the queries I try to run with the Prisma client end up crashing the Prisma server without returning any data.

**Describe the solution you'd like**
I would like an efficient way to retrieve data from several related tables (call it relationships or joins) using Prisma or Prisma2 (which I haven't tried for this project yet): one that doesn't crash the server and doesn't take more than 30s to run.

**Describe alternatives you've considered**
Raw SQL / low-level tools (knex, pg), which defeats the point a little.
In a GraphQL server context, overriding the resolvers provided by nexus-prisma.

**Additional context**
I'll provide as much information as I'm allowed to.
This is a simplified version of the data structure. The missing fields are mostly strings and irrelevant to the issue, and each table has `createdAt` and `updatedAt` fields defined in the datamodel.

```graphQL
type Device {
    id: ID! @id
    deviceUpdates: [DeviceUpdate!]!
}
type DeviceUpdate {
    id: ID! @id
    device: Device!
    sensorUpdates: [SensorUpdate!]! @relation(onDelete: CASCADE)
}

type SensorUpdate {
    id: ID! @id
    sensor: Sensor!
    deviceUpdate: DeviceUpdate!
}
type Sensor {
    id: ID! @id
    sensorUpdate: [SensorUpdate!]
}
```
Two of those tables, `deviceUpdate` and `sensorUpdate`, keep growing: they receive a considerable number of new entries regularly.
The `device` table is expected to hold thousands of entries on average (it will scale up to 50,000 entries).
On average each device makes 10 updates a day, so the `deviceUpdate` table grows by roughly 10 times the number of devices every day.
The `sensorUpdate` table is between 1 and 5 times the size of the `deviceUpdate` table.
The `sensor` table has roughly a hundred entries.
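
To put those figures in perspective, here is a small back-of-the-envelope sketch in TypeScript. The function and field names are made up for illustration; the only inputs are the numbers given above (10 updates per device per day, 1 to 5 sensor updates per device update).

```typescript
// Rough daily row growth, derived only from the figures quoted in this issue.
// All names here are illustrative, not part of the project or of Prisma.
interface GrowthEstimate {
  deviceUpdatesPerDay: number;
  sensorUpdatesPerDayMin: number;
  sensorUpdatesPerDayMax: number;
}

function estimateDailyGrowth(
  deviceCount: number,
  updatesPerDevicePerDay = 10, // "each device makes 10 updates a day"
  minSensorRatio = 1, // sensorUpdate is 1x to 5x deviceUpdate
  maxSensorRatio = 5
): GrowthEstimate {
  const deviceUpdatesPerDay = deviceCount * updatesPerDevicePerDay;
  return {
    deviceUpdatesPerDay,
    sensorUpdatesPerDayMin: deviceUpdatesPerDay * minSensorRatio,
    sensorUpdatesPerDayMax: deviceUpdatesPerDay * maxSensorRatio,
  };
}

// At the expected ceiling of 50,000 devices:
const atFullScale = estimateDailyGrowth(50_000);
console.log(atFullScale);
```

At 50,000 devices this works out to about 500,000 new `deviceUpdate` rows and between 500,000 and 2,500,000 new `sensorUpdate` rows per day.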

The type of query that I'm trying to run looks like this:
```graphQL
query{
  devices(first: 10){
    id
    deviceUpdates(first: 100){
      id
      sensorUpdates{
        id
        sensor{
          id
        }
      }
    }
  }
}
```
There may be more query parameters, such as filtering and ordering.
This type of query takes *ages* to complete and, in most cases, ends up crashing the Prisma server.
The data retrieval can be expressed with the following SQL queries:

Very slow query (minutes):

```SQL
SELECT *
FROM "Device" d
LEFT JOIN "DeviceUpdate" du ON d.id = du.device
LEFT JOIN "SensorUpdate" su ON du.id = su."deviceUpdate"
LEFT JOIN "Sensor" s ON s.id = su.sensor
WHERE d.id IN(...)
  AND s.id IN(...);
```
But the same result can be achieved in a much more performant way.
Fast query (seconds):

```sql
SELECT *
FROM "Sensor" s
INNER JOIN "SensorUpdate" su ON su.sensor = s.id
INNER JOIN "DeviceUpdate" du ON du.id = su."deviceUpdate"
INNER JOIN "Device" d ON d.id = du.device AND d.id IN(...)
WHERE s.id IN(...);
```
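
For reference, this is roughly what the raw SQL fallback could look like from TypeScript. It is only a sketch: `buildFastQuery` and `placeholders` are hypothetical helpers, the table and column names are the ones from the fast query above, and the resulting `text` and `values` are assumed to be handed to a driver that accepts `$1`-style positional parameters (for example `client.query(text, values)` in pg).

```typescript
// Builds the "fast" query with positional parameters instead of inlined ids.
// Helper names are illustrative; nothing here is provided by Prisma.
interface ParameterizedQuery {
  text: string;
  values: string[];
}

// Returns "$<offset+1>, $<offset+2>, ..." for `count` parameters.
function placeholders(count: number, offset: number): string {
  return Array.from({ length: count }, (_, i) => `$${offset + i + 1}`).join(", ");
}

function buildFastQuery(sensorIds: string[], deviceIds: string[]): ParameterizedQuery {
  if (sensorIds.length === 0 || deviceIds.length === 0) {
    // An empty IN () list is not valid SQL, so fail early.
    throw new Error("sensorIds and deviceIds must both be non-empty");
  }
  // Device ids come first in the query text, so they get $1..$n.
  const deviceParams = placeholders(deviceIds.length, 0);
  const sensorParams = placeholders(sensorIds.length, deviceIds.length);
  const text = [
    "SELECT *",
    'FROM "Sensor" s',
    'INNER JOIN "SensorUpdate" su ON su.sensor = s.id',
    'INNER JOIN "DeviceUpdate" du ON du.id = su."deviceUpdate"',
    `INNER JOIN "Device" d ON d.id = du.device AND d.id IN (${deviceParams})`,
    `WHERE s.id IN (${sensorParams})`,
  ].join("\n");
  return { text, values: [...deviceIds, ...sensorIds] };
}

const example = buildFastQuery(["sensor-1"], ["device-1", "device-2"]);
console.log(example.text);
console.log(example.values);
```

This keeps the ids out of the SQL string, but it is exactly the kind of hand-written plumbing I was hoping to avoid by using Prisma.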
I'm sure it's possible to write a more performant SQL query, or to play around with indexes to achieve the desired performance. But I don't see a way to do that with Prisma.
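
As an illustration of the index idea, the sketch below generates `CREATE INDEX` statements for the three join columns used in the queries above. Everything in it is an assumption: the helper names are hypothetical, the index names are invented, and I have not checked which of these indexes already exist or whether they would actually help (that would need `EXPLAIN ANALYZE` on the real data).

```typescript
// Generates candidate index DDL for the join columns used in the queries above.
// Index names and helpers are illustrative; whether the indexes help is untested.
interface IndexSpec {
  table: string;
  column: string;
}

const candidateIndexes: IndexSpec[] = [
  { table: "DeviceUpdate", column: "device" },
  { table: "SensorUpdate", column: "deviceUpdate" },
  { table: "SensorUpdate", column: "sensor" },
];

// Double-quotes an identifier so mixed-case names like "deviceUpdate" survive.
function quoteIdent(name: string): string {
  return `"${name.replace(/"/g, '""')}"`;
}

function createIndexStatement({ table, column }: IndexSpec): string {
  const indexName = quoteIdent(`${table}_${column}_idx`);
  return `CREATE INDEX IF NOT EXISTS ${indexName} ON ${quoteIdent(table)} (${quoteIdent(column)});`;
}

const statements = candidateIndexes.map(createIndexStatement);
console.log(statements.join("\n"));
```

These statements would still have to be run by hand against the database, outside of Prisma, which is the gap this request is about.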

Prisma join performances #4744
