Hi! I have a query that I want to run over individual documents, each with their own index. There are too many documents to put into memory at the same time, and so I would like to clear CUDA memory between each document search. The RAG model for each document is either initiated with RAGPretrainedModel.from_pretrained (if the doc hasn't been indexed before), or with RAGPretrainedModel.from_index (if the doc has been previously indexed).
Below is pseudocode of the document search loop:
```python
import os

model_path = ...   # same model for all documents
index_root = ...   # root directory under which indices are written
query = ...        # the query to run over every document
search_k = ...     # number of results to return per document
doc_ids = [...]    # list of docs
index_paths = {doc_id: make_index_path(doc_id) for doc_id in doc_ids}  # output locations of RAG.index for each doc_id

results = dict()
for doc_id, index_path in index_paths.items():
    document = get_document(doc_id)
    if not os.path.exists(index_path):
        RAG = RAGPretrainedModel.from_pretrained(model_path, index_root=index_root)
        print(f"Indexing document {doc_id}...")
        print(f"len(document {doc_id}): {len(document)}")
        RAG.index(
            collection=[d['content'] for d in document],
            document_ids=[d['document_id'] for d in document],
            index_name=f"doc_{doc_id}",
            max_document_length=512,
            split_documents=False,
            use_faiss=True,
        )
    else:
        print(f"Index for document {doc_id} already exists")
        RAG = RAGPretrainedModel.from_index(index_path)
    results[doc_id] = RAG.search(
        query=query,
        index_name=f"doc_{doc_id}",
        k=search_k,
    )
    # WANT TO DELETE THIS RAG MODEL AND CLEAR GPU MEMORY HERE
    ...
```
I can't figure out a way to clear the RAGPretrainedModels' memory between documents. Deleting the RAG model and calling torch.cuda.empty_cache() doesn't release anything. Is there a way to delete the model and release everything it's keeping on the GPU without having to kill the process?
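For reference, this is the generic PyTorch cleanup pattern I've been attempting between documents (shown here with a plain `nn.Linear` as a hypothetical stand-in for the RAG model, since the issue isn't specific to any one module):

```python
import gc
import torch

# Hypothetical stand-in for the RAG model: any torch module that may hold GPU memory.
model = torch.nn.Linear(8, 8)
if torch.cuda.is_available():
    model = model.cuda()

# The cleanup attempted after each document's search:
del model  # drop the Python reference to the model
gc.collect()  # break any reference cycles still pinning tensors
if torch.cuda.is_available():
    torch.cuda.empty_cache()  # return cached blocks to the driver
```

In my case this sequence runs without error, but `torch.cuda` still reports the model's memory as allocated afterwards, which suggests something inside the RAG object keeps a live reference to the underlying tensors.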