
Upload fails with large file, works with smaller ones (nondescriptive ReadError) #47

@JulianRein

Description


Hi there,

I am using easyDataverse and thus pydvuploader.
However, when uploading a "larger" 2.1 GB file (fastq.gz), the upload stalls and fails with a "ReadError" and nothing else.
It works for smaller files in the same directory.
It also fails when I rename the file to remove the .fastq.gz extension.
Likely unrelated: weird filenames like "index.html?C=N;O=D" also fail. Here, renaming helps.
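
(For those odd filenames, a minimal sketch of the renaming workaround; `sanitize_name` is just a helper I made up for illustration, not part of dvuploader or easyDataverse:)

```python
import re
import shutil
from pathlib import Path

def sanitize_name(path: str) -> str:
    """Copy a file whose name contains characters like '?', ';' or '='
    to a safely named sibling and return the new path."""
    p = Path(path)
    safe = re.sub(r"[^A-Za-z0-9._-]", "_", p.name)
    if safe == p.name:
        return str(p)
    target = p.with_name(safe)
    shutil.copy2(p, target)
    return str(target)

# e.g. "testProjectX/index.html?C=D;O=D" -> "testProjectX/index.html_C_D_O_D"
```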

Any idea how to solve this?
I attached the (very long) error log, which unfortunately does not help me at all. (To avoid confusion: streamlit is just the name of my conda env; the code does not run as an actual web app at the moment, but in an .ipynb within Positron. However, running it in a standard console throws the same error, so both example runs are attached.)

My code (besides the console run):

```python
import dvuploader as dv

path = "testProjectX/SDFSF_3232423_24R388-1_S143_L000_R2_001.fastq.gz"  # large file -> fails
# path = "uploadCode.tmp"                       # WORKED (small file)
# path = "testProjectX/exampleHtml2"            # WORKED (small file)
# path = "testProjectX/index.html?C=D;O=D"      # did NOT work (small file)
# path = "testProjectX/R2fastqCopied.fastq.gz"  # NOT working either... takes too long? (large file)
# path = "testProjectX/R2fastqCopied"           # nope... (large file)

# Add file individually
files = [
    dv.File(directory_label="fastq", filepath=path, tab_ingest=False),
]

dvuploader = dv.DVUploader(files=files)
dvuploader.upload(
    api_token=dataset.API_TOKEN,
    dataverse_url=dataset.DATAVERSE_URL,
    persistent_id=dataset.p_id,
    n_parallel_uploads=1,  # whatever your instance can handle
)
```
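
For completeness, a minimal sketch of a cross-check with plain `requests` against the Dataverse native API (`POST /api/datasets/:persistentId/add`), to see whether the stall is specific to dvuploader; the placeholder values stand in for the `dataset` attributes used above:

```python
import requests

# Placeholders; in my run these come from the easyDataverse dataset object.
API_TOKEN = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
DATAVERSE_URL = "https://my.dataverse.example"
PERSISTENT_ID = "doi:10.1234/EXAMPLE"

url = f"{DATAVERSE_URL}/api/datasets/:persistentId/add"
with open("testProjectX/SDFSF_3232423_24R388-1_S143_L000_R2_001.fastq.gz", "rb") as fh:
    response = requests.post(
        url,
        params={"persistentId": PERSISTENT_ID},
        headers={"X-Dataverse-key": API_TOKEN},
        files={"file": fh},  # note: requests builds the multipart body in memory
        timeout=3600,        # generous read timeout for a 2.1 GB upload
    )
print(response.status_code, response.text[:500])
```

If that also fails on the big file, the limit is probably on the server or proxy side (max upload size / timeouts) rather than in dvuploader itself.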

Attachment: longErrorPyDVuploader.txt

Metadata

Labels: bug (Something isn't working)
Status: Done