TimeoutError when updating Dataset with large files #31

@kbrueckmann

Dataverse version: 6.3 build v6.3+10797_dls
Python: 3.12

When I update a dataset using the add_file() and update() methods, the uploader falls back to the native API. This works perfectly for smaller files (ca. 50 MB), but for large files (ca. 20 GB) I get a TimeoutError.

Uploading the same files through the API directly with curl works fine, though, so it doesn't look like a timeout on our Payara/server side. Has anyone had the same problem, or does anyone have ideas for a solution?
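For a rough sense of scale: aiohttp's documented default client timeout is 300 seconds in total. Assuming dvuploader leaves that default in place (not confirmed in this report), the sketch below estimates the sustained throughput each upload would need to finish inside that window. The helper name and the decimal-unit file sizes are only for illustration.

```python
# Assumption: the upload runs under aiohttp's default total timeout of 300 s.
DEFAULT_TOTAL_TIMEOUT_S = 300


def required_throughput_mb_s(size_bytes: int, timeout_s: float) -> float:
    """Minimum sustained rate (decimal MB/s) to send size_bytes within timeout_s."""
    return size_bytes / timeout_s / 1_000_000


small = required_throughput_mb_s(50 * 10**6, DEFAULT_TOTAL_TIMEOUT_S)   # ca. 50 MB
large = required_throughput_mb_s(20 * 10**9, DEFAULT_TOTAL_TIMEOUT_S)   # ca. 20 GB
print(f"50 MB needs {small:.2f} MB/s, 20 GB needs {large:.2f} MB/s")
```

Under that assumption the small file is trivial, while the 20 GB file would need a sustained rate well above what many connections deliver, and that is before any server-side processing time is counted.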

Here is the complete error message:

Traceback (most recent call last):
  File "...updateDataset.py", line 108, in <module>
    update_dataset(dataset, ".", args.desc if args.desc else "", categories if args.categories else ["Data"])
  File "...updateDataset.py", line 68, in update_dataset
    dataset.update()
  File "...venv/lib/python3.12/site-packages/easyDataverse/dataset.py", line 237, in update
    update_dataset(
  File "...venv/lib/python3.12/site-packages/easyDataverse/uploader.py", line 128, in update_dataset
    _uploadFiles(
  File "...venv/lib/python3.12/site-packages/easyDataverse/uploader.py", line 91, in _uploadFiles
    dvuploader.upload(
  File "...venv/lib/python3.12/site-packages/dvuploader/dvuploader.py", line 120, in upload
    asyncio.run(
  File "...venv/lib/python3.12/site-packages/nest_asyncio.py", line 30, in run
    return loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "...venv/lib/python3.12/site-packages/nest_asyncio.py", line 98, in run_until_complete
    return f.result()
           ^^^^^^^^^^
  File "/usr/lib/python3.12/asyncio/futures.py", line 203, in result
    raise self._exception.with_traceback(self._exception_tb)
  File "/usr/lib/python3.12/asyncio/tasks.py", line 314, in __step_run_and_handle_result
    result = coro.send(None)
             ^^^^^^^^^^^^^^^
  File "...venv/lib/python3.12/site-packages/dvuploader/nativeupload.py", line 78, in native_upload
    responses = await asyncio.gather(*tasks)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/asyncio/tasks.py", line 385, in __wakeup
    future.result()
  File "/usr/lib/python3.12/asyncio/tasks.py", line 316, in __step_run_and_handle_result
    result = coro.throw(exc)
             ^^^^^^^^^^^^^^^
  File "...venv/lib/python3.12/site-packages/dvuploader/nativeupload.py", line 211, in _single_native_upload
    async with session.post(endpoint, data=formdata) as response:
  File "...venv/lib/python3.12/site-packages/aiohttp/client.py", line 1355, in __aenter__
    self._resp: _RetType = await self._coro
                           ^^^^^^^^^^^^^^^^
  File "...venv/lib/python3.12/site-packages/aiohttp/client.py", line 686, in _request
    await resp.start(conn)
  File "...venv/lib/python3.12/site-packages/aiohttp/client_reqrep.py", line 1014, in start
    with self._timer:
  File "...venv/lib/python3.12/site-packages/aiohttp/helpers.py", line 719, in __exit__
    raise asyncio.TimeoutError from None
TimeoutError
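The last frames are inside aiohttp's timer, which suggests a client-side deadline rather than a server error. As a minimal sketch of that mechanism, using only the standard library and a simulated upload (fake_upload and upload_with_deadline are hypothetical names, not dvuploader or aiohttp code), a total deadline cancels a request that is still making steady progress:

```python
import asyncio


async def fake_upload(n_chunks: int, seconds_per_chunk: float) -> int:
    """Stand-in for a streaming POST: 'sends' chunks one by one."""
    sent = 0
    for _ in range(n_chunks):
        await asyncio.sleep(seconds_per_chunk)  # simulated network time per chunk
        sent += 1
    return sent


async def upload_with_deadline(n_chunks: int, total_timeout: float) -> str:
    try:
        # A total deadline covers the whole request, however steadily it progresses.
        await asyncio.wait_for(fake_upload(n_chunks, 0.01), timeout=total_timeout)
        return "ok"
    except asyncio.TimeoutError:
        return "timeout"


small_result = asyncio.run(upload_with_deadline(n_chunks=2, total_timeout=0.5))
large_result = asyncio.run(upload_with_deadline(n_chunks=200, total_timeout=0.5))
print(small_result, large_result)
```

In aiohttp the corresponding setting is the timeout passed to ClientSession, for example aiohttp.ClientTimeout(total=None) to disable the total deadline. Whether dvuploader or easyDataverse exposes a way to set it is not known from this report.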
