Releases: ROCm/k8s-device-plugin
v1.31.0.9
What's Changed
- group partitions based on domain and location_id fields by @biluriuday in #162
- exit dp with non-zero exit code if the driver is not loaded by @biluriuday in #170
- [Fix] fix TensorFlow GPU example to sync with latest image by @yansun1996 in #171
- use libdrm as fallback for fetching product name by @biluriuday in #172
Full Changelog: v1.31.0.8...v1.31.0.9
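The partition-grouping change in #162 can be pictured with a minimal Go sketch. It is not the plugin's code: the `partition` and `parentKey` types and the `groupPartitions` helper are hypothetical names, and it assumes that partitions belonging to one physical GPU report the same (domain, location_id) pair, which is only one plausible reading of the changelog entry.

```go
package main

import (
	"fmt"
	"sort"
)

// partition is a hypothetical record for one GPU partition. The field names
// mirror the domain and location_id fields named in the changelog entry;
// they are not the plugin's actual types.
type partition struct {
	Name       string
	Domain     uint32
	LocationID uint32
}

// parentKey identifies a group of partitions (assumed here to be one
// physical device).
type parentKey struct {
	Domain     uint32
	LocationID uint32
}

// groupPartitions buckets partition names by their (domain, location_id)
// pair and sorts the names inside each bucket.
func groupPartitions(parts []partition) map[parentKey][]string {
	groups := make(map[parentKey][]string)
	for _, p := range parts {
		k := parentKey{p.Domain, p.LocationID}
		groups[k] = append(groups[k], p.Name)
	}
	for _, names := range groups {
		sort.Strings(names)
	}
	return groups
}

func main() {
	// Made-up sample values, for illustration only.
	parts := []partition{
		{"card1", 0, 0x100},
		{"card2", 0, 0x100},
		{"card3", 0, 0x200},
	}
	for k, names := range groupPartitions(parts) {
		fmt.Printf("domain=%d location_id=%#x -> %v\n", k.Domain, k.LocationID, names)
	}
}
```

Using a small comparable struct as the map key keeps the grouping to a single pass over the partitions.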
amd-gpu-helm-0.21.0
A Helm chart for deploying Kubernetes AMD GPU device plugin
v1.31.0.8
What's Changed
- Platform support: RedHat OpenShift 4.19
- Update Helm Chart to 0.20.0 to point to release 1.31.0.7 by @sriram-30 in #138
- modify gpu allocator logic to avoid fragmentation of unused GPUs by @biluriuday in #141
- Bump rocm-docs-core from 1.18.2 to 1.21.1 in /docs/sphinx by @dependabot[bot] in #140
Full Changelog: v1.31.0.7...v1.31.0.8
amd-gpu-helm-0.20.0
A Helm chart for deploying Kubernetes AMD GPU device plugin
v1.31.0.7
What's Changed
- Node labeller vram label support for partitions by @sriram-30 in #108
- Bump rocm-docs-core from 1.17.0 to 1.17.1 in /docs/sphinx by @dependabot in #109
- Bump rocm-docs-core from 1.17.1 to 1.18.1 in /docs/sphinx by @dependabot in #113
- Sphinx config and updated docs for device plugin by @AMD-melliott in #114
- Change SimpleHealthCheck to use /sys/class/kfd by @fluidnumerics-joe in #116
- Bump rocm-docs-core from 1.18.1 to 1.18.2 in /docs/sphinx by @dependabot in #118
- Update rhubi-based image labels for OpenShift certification by @yansun1996 in #120
- Support updateStrategy for helm chart daemonsets by @jaeyung1001 in #121
- Fix k8s-node-labeller cleanup on node labels by @yansun1996 in #123
- Device Plugin and Node Labeller support for gpu partitions by @sriram-30 in #117
- update example device-plugin yaml file by @biluriuday in #124
- add fallback mechanism in case allocator init fails by @biluriuday in #125
- Add documentation for new gpu-partition related node labeller arguments by @sriram-30 in #126
New Contributors
- @sriram-30 made their first contribution in #108
- @AMD-melliott made their first contribution in #114
- @fluidnumerics-joe made their first contribution in #116
- @jaeyung1001 made their first contribution in #121
- @biluriuday made their first contribution in #124
Full Changelog: v1.31.0.6...v1.31.0.7
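The SimpleHealthCheck change in #116 points the check at /sys/class/kfd. As a rough illustration only, a check of that kind could be as small as the Go sketch below. The `dirExists` helper and the `kfdPath` constant are hypothetical names, and the assumption that the check merely tests for the directory's presence is not confirmed by the release notes.

```go
package main

import (
	"fmt"
	"os"
)

// kfdPath is the sysfs location named in the changelog entry.
const kfdPath = "/sys/class/kfd"

// dirExists reports whether path exists and is a directory.
// Hypothetical helper, not the plugin's SimpleHealthCheck.
func dirExists(path string) bool {
	info, err := os.Stat(path)
	return err == nil && info.IsDir()
}

func main() {
	if dirExists(kfdPath) {
		fmt.Println("healthy:", kfdPath, "is present")
	} else {
		fmt.Println("unhealthy:", kfdPath, "not found; the GPU driver may not be loaded")
	}
}
```

Passing the path as a parameter makes the helper easy to exercise on machines without an AMD GPU.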
amd-gpu-helm-0.19.0
A Helm chart for deploying Kubernetes AMD GPU device plugin
v1.31.0.6
v1.31.0.5
What's Changed
- Bump rocm-docs-core from 1.15.0 to 1.17.0 in /docs/sphinx by @dependabot in #106
- exporter endpoint svc to check for gpu health by @spraveenio in #100
- remove pulse from base images, add timeout grpc req by @spraveenio in #107
Full Changelog: v1.31.0.4...v1.31.0.5
v1.31.0.4
What's Changed
- node labeller "-product-name" failing for some platforms on k8s by @spraveenio in #104
- Update mounts of labeller in helm chart by @amdlin in #105
Full Changelog: v1.31.0.3...v1.31.0.4
amd-gpu-helm-0.18.0
A Helm chart for deploying Kubernetes AMD GPU device plugin