Skip to content

chunk boundaries for zarr.create(17, chunks=4) #977

@sshleifer

Description

@sshleifer

Suppose I created

import zarr
arr = zarr.create(17, chunks=4)

Would it be safe to have 1 thread write to arr[12:16] while another wrote to arr[16:]?

I.e is the last element in its own chunk of size 1?

Why I care

I have each worker holding a strangely shaped 1D array, and am trying to verify the following code for determining the maximum/best chunk size (I assume (possibly incorrectly) that uneven chunk sizes is not supported and that I always want to take the max allowable chunk size even if it's wierd).

from math import gcd
from typing import list
from functools import reduce

def find_gcd(list):
    x = reduce(gcd, list)
    return x

def chunk_size(array_shapes: List[int]):
    if len(array_shapes) <= 2:
        return array_shapes[0]
    return find_gcd(array_shapes[:-1])

def test_chunk_size():
    assert chunk_size([7, 7, 7, 6]) == 7
    assert chunk_size([7, 7, 7, 7]) == 7
    assert chunk_size([6, 6, 6, 4]) == 6
    assert chunk_size([6, 6, 1, 1]) == 1

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions