The original CodeGeeX failed with an out-of-memory error when run with this script on a 3900X (24 cores) + 32 GB RAM + RTX 3090:
# With quantization (with more than 15GB RAM)
bash ./scripts/test_inference_quantized.sh <GPU_ID> ./tests/test_prompt.txt
so I switched to codegeex-fastertransformer, but it still seems to OOM:
Traceback (most recent call last):
File "api.py", line 105, in <module>
if not codegeex.load(ckpt_path=args.ckpt_path):
File "/workspace/codegeex-fastertransformer/examples/pytorch/codegeex/utils/codegeex.py", line 413, in load
self.cuda()
File "/workspace/codegeex-fastertransformer/examples/pytorch/codegeex/utils/codegeex.py", line 430, in cuda
self.weights._map(lambda w: w.contiguous().cuda(self.device))
File "/workspace/codegeex-fastertransformer/examples/pytorch/codegeex/utils/codegeex.py", line 177, in _map
w[i] = func(w[i])
File "/workspace/codegeex-fastertransformer/examples/pytorch/codegeex/utils/codegeex.py", line 430, in <lambda>
self.weights._map(lambda w: w.contiguous().cuda(self.device))
RuntimeError: CUDA out of memory. Tried to allocate 200.00 MiB (GPU 0; 24.00 GiB total capacity; 23.11 GiB already allocated; 0 bytes free; 23.11 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
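For what it's worth, the allocator hint about max_split_size_mb probably won't help here: the card is simply too small for the unquantized weights. A rough back-of-envelope sketch (assuming the ~13B parameter count reported for CodeGeeX; the exact figure may differ) shows the fp16 weights alone slightly exceed a 3090's 24 GiB, while int8 quantization would fit:

```python
# Rough estimate of CodeGeeX weight memory vs. RTX 3090 capacity.
# Assumption: ~13B parameters (reported model size; not taken from this log).
params = 13e9
gib = 2**30

fp16_gib = params * 2 / gib  # 2 bytes per parameter in half precision
int8_gib = params * 1 / gib  # 1 byte per parameter when quantized to int8

print(f"fp16 weights: ~{fp16_gib:.1f} GiB")  # just over the 24.00 GiB capacity
print(f"int8 weights: ~{int8_gib:.1f} GiB")  # fits with room for activations
```

This matches the log: 23.11 GiB already allocated on a 24.00 GiB device before loading finished, so the quantized path (or splitting the model across GPUs) looks necessary on this hardware.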