3 changes: 3 additions & 0 deletions .gitignore
@@ -25,3 +25,6 @@ test/samples/
# Internal design notes and AI planning artifacts - not for distribution
funnelcake.md
docs/superpowers/
.claude/settings.local.json
*.profdata
*.profraw
71 changes: 71 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,71 @@
# Changelog

All notable changes to funnelcake are recorded here. Format follows
[Keep a Changelog](https://keepachangelog.com/en/1.1.0/); the project does
not yet follow a tagged release cadence, so changes accumulate under
**Unreleased** until a tag is cut.

## [Unreleased]

### Added
- `FUSED_ERR_OUT_OF_MEMORY` (-5) hard-error code returned by
`fused_scaler_init` and `fused_hdr_init` when allocation of the internal
state struct fails. Previously these paths returned the misleading
`FUSED_ERR_NO_STEPS`.
- `FUSED_LOG_INFO` (2) log level for low-frequency status / diagnostic
messages. Info messages are routed through the existing `log_warnings`
config; callers that install a `FUSED_LOG_CALLBACK` can drop them by
checking the `level` argument while still receiving warnings.

### Changed
- The "tone map LUTs generated" diagnostic in `fused_tonemap_generate_luts`
now logs at `FUSED_LOG_INFO` instead of `FUSED_LOG_WARN`. Callers that
previously suppressed it via `log_warnings = FUSED_LOG_SUPPRESS` keep
the same behavior; callback-based loggers can now keep warnings while
dropping info.
- The "no SIMD support detected" notice from both init functions now
routes through `fused_log(&ctx->log_warnings, FUSED_LOG_WARN, …)`
instead of writing to `stderr` directly, so users with a configured
log target see it where they expect.
- `fused_scaler_free` and `fused_hdr_free` now also reset
`effective_width` and `effective_height` on the context, matching the
reset of the other result fields.

### Fixed
- **HDR init memory leak**: `fused_hdr_init` could leak `state->sdr_temp[i]`
buffers if SDR-only steps had been allocated successfully and a later
step (the 1:1 tonemap output, or the "no valid steps" check) failed.
The error paths called `fused_hdr_free(ctx)` while `ctx->_internal` was
still NULL, so the cleanup of the `state->sdr_temp[i]` pointers was
skipped, and `free(state)` then released the struct without freeing those
buffers.
Init now attaches `state` to `ctx->_internal` immediately after
allocation so any subsequent error path goes through `fused_hdr_free`
and releases everything.
- The misaligned-source warning emitted by `fused_scaler_run` and
`fused_hdr_run` used a process-wide `static int warned` flag, so the
first context to encounter a misaligned source silenced the warning
for every other context in the process (and the flag was not
thread-safe). Each context now owns its own `src_misaligned_warned`
flag inside its internal state.
- The misaligned-source warning was missing a trailing newline.
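
The HDR init fix above relies on an ordering that can be sketched in isolation: attach the freshly allocated state to its owning context before any later allocation can fail, so one free routine covers every error path. The names below (`demo_ctx`, `demo_state`, `demo_init`, `demo_free`, and the `fail_at` knob that simulates a failed allocation) are hypothetical and exist only for illustration; this is not funnelcake's actual implementation.

```c
#include <stdlib.h>

#define DEMO_MAX_STEPS 4
#define DEMO_ERR_ALLOC (-5)  /* illustrative value only */

/* Hypothetical stand-ins for the library's internal state and context. */
typedef struct {
    void *sdr_temp[DEMO_MAX_STEPS];
} demo_state;

typedef struct {
    void *_internal;  /* owned by init/free, like ctx->_internal */
} demo_ctx;

/* Releases everything reachable from ctx; safe on a partially built ctx. */
static void demo_free(demo_ctx *ctx)
{
    demo_state *state = ctx->_internal;
    if (state == NULL)
        return;
    for (int i = 0; i < DEMO_MAX_STEPS; i++)
        free(state->sdr_temp[i]);  /* free(NULL) is a no-op */
    free(state);
    ctx->_internal = NULL;
}

/* fail_at simulates an allocation failure at that step (-1 = never fail). */
static int demo_init(demo_ctx *ctx, int fail_at)
{
    demo_state *state = calloc(1, sizeof(*state));
    if (state == NULL)
        return DEMO_ERR_ALLOC;

    /* Attach first: from here on, demo_free(ctx) can see every buffer. */
    ctx->_internal = state;

    for (int i = 0; i < DEMO_MAX_STEPS; i++) {
        state->sdr_temp[i] = (i == fail_at) ? NULL : malloc(64);
        if (state->sdr_temp[i] == NULL) {
            demo_free(ctx);  /* frees steps 0..i-1 and the struct itself */
            return DEMO_ERR_ALLOC;
        }
    }
    return 0;
}
```

Because `calloc` zeroes the pointer array, `demo_free` can walk all slots without tracking how many steps succeeded.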

### Documentation
- `docs/API.md`: documented `FUSED_LOG_INFO`, `FUSED_ERR_OUT_OF_MEMORY`,
and the relationship between log levels and the routing config.

---

## Conventions

- **Added** — new public API surface (functions, constants, struct fields).
- **Changed** — non-breaking behavioral changes to existing API.
- **Deprecated** — APIs scheduled for removal.
- **Removed** — APIs that have been deleted.
- **Fixed** — bug fixes that don't change documented behavior.
- **Security** — vulnerability fixes.
- **Performance** — measurable speed/memory wins, with a one-line summary
of the workload and the delta.
- **Documentation** — doc-only changes worth noting.

Group breaking changes under their own **Breaking** subsection and call
out the impact on callers.
56 changes: 56 additions & 0 deletions INSTALL.md
@@ -125,3 +125,59 @@ Artifacts are written to `dist/` as per-architecture tarballs containing
release archives are built with `CC=clang LTO=0` so the static library contains
standard object files suitable for downstream linkers that do not understand
Clang LTO bitcode.

## macOS Release Artifacts

To build a native macOS release archive:

    ./scripts/build-macos.sh

The script must be run on macOS with Xcode command line tools installed. It
writes `dist/funnelcake-macos-<arch>.tar.gz` containing `libfunnelcake.a`,
`include/funnelcake.h`, the README, install notes, and `BUILD_INFO`.

## Windows Release Artifacts

To build Windows release archives:

    ./scripts/build-windows.sh   # bash / MSYS2 / Git Bash
    scripts\build-windows.ps1    # native PowerShell 5.1+

By default, either script builds every Windows target whose toolchain is available.
MinGW-w64 artifacts contain `libfunnelcake.a`; MSVC artifacts contain
`funnelcake.lib`. Both package layouts include `include/funnelcake.h`, the
README, install notes, and `BUILD_INFO`.

For MinGW-w64 only:

    ./scripts/build-windows.sh --mingw
    scripts\build-windows.ps1 -Mingw

For `x86_64` MinGW, install tools that provide `x86_64-w64-mingw32-gcc` and
`x86_64-w64-mingw32-ar`. For Windows on ARM64 MinGW, install tools that provide
`aarch64-w64-mingw32-gcc` and `aarch64-w64-mingw32-ar`.

For MSVC only, run from a Visual Studio developer shell where `cl.exe` and
`lib.exe` are in `PATH`:

    ./scripts/build-windows.sh --msvc
    scripts\build-windows.ps1 -Msvc

Both MinGW and MSVC builds use the normal source selection (AVX2 on `x86_64`,
NEON on `aarch64`/ARM64). The NEON kernels guard on `__aarch64__ || _M_ARM64`,
so MSVC ARM64 picks up the same SIMD coverage as the MinGW cross-compile.
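
As a rough illustration of the guard described above, the sketch below selects a path name at compile time. Only the `__aarch64__ || _M_ARM64` test comes from the text; the `__AVX2__` branch, the `DEMO_SIMD_PATH` macro, and `demo_simd_path()` are assumptions made for this example, and the library itself selects sources in the Makefile rather than through a helper like this.

```c
/* Same architecture test the text gives for the NEON kernels:
 * __aarch64__ is set by GCC/Clang, _M_ARM64 by MSVC. */
#if defined(__aarch64__) || defined(_M_ARM64)
#  define DEMO_SIMD_PATH "neon"
#elif defined(__AVX2__)  /* assumption: compiler was asked for AVX2 */
#  define DEMO_SIMD_PATH "avx2"
#else
#  define DEMO_SIMD_PATH "scalar"
#endif

/* Hypothetical helper: names the path this translation unit was built for. */
static const char *demo_simd_path(void)
{
    return DEMO_SIMD_PATH;
}
```

Checking both macros in one condition is what lets a single kernel source serve the MinGW cross-compile and the native MSVC ARM64 build.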

### Windows ARM64 only

For a Windows-on-ARM64 build without touching the x86_64 paths, use the
dedicated PowerShell driver:

    scripts\build-windows-arm64.ps1          # MinGW + MSVC, whichever is available
    scripts\build-windows-arm64.ps1 -Mingw   # cross-compile via aarch64-w64-mingw32-gcc
    scripts\build-windows-arm64.ps1 -Msvc    # native ARM64 MSVC

The `-Msvc` path requires an "ARM64 Native Tools Command Prompt for VS" or
an equivalent Developer PowerShell with `VSCMD_ARG_TGT_ARCH=arm64`; the
script refuses to run if the shell is not configured for ARM64. Both
artifact layouts mirror `build-windows.ps1`: a per-toolchain `dist/`
package plus a `.zip.sha256`.
4 changes: 4 additions & 0 deletions Makefile
@@ -15,6 +15,10 @@ TEST_OPT = -O2
UNAME_M := $(shell uname -m)
UNAME_S := $(shell uname -s)

ifeq ($(UNAME_S),Windows_NT)
CFLAGS_BASE += -D__USE_MINGW_ANSI_STDIO=1
endif

# Normalize FreeBSD's "amd64" to "x86_64" so the SIMD-selection blocks
# below match. FreeBSD/arm64 already reports "aarch64".
ifeq ($(UNAME_S),FreeBSD)
3 changes: 2 additions & 1 deletion README.md
@@ -620,7 +620,8 @@ benchmark comparison; without it the library and headers install but

The scalar fallback is correct on all platforms but significantly slower.
On hardware without AVX2, NEON, or RVV, the library logs a one-time notice
-to stderr at first init.
+through the configured `log_warnings` channel at first init (default:
+stderr).


## HDR10 support
23 changes: 18 additions & 5 deletions docs/API.md
@@ -221,9 +221,9 @@ struct means write to stderr.

| Field | Type | Description |
|-------|------|-------------|
-| `target` | `int` | One of the `FUSED_LOG_*` constants. |
+| `target` | `int` | One of the `FUSED_LOG_*` target constants. |
| `file` | `FILE *` | Used when `target == FUSED_LOG_FILE`. Must be a valid open file. |
-| `callback` | `void (*)(int level, const char *msg, void *ctx)` | Used when `target == FUSED_LOG_CALLBACK`. `level` is `FUSED_LOG_ERROR` or `FUSED_LOG_WARN`. |
+| `callback` | `void (*)(int level, const char *msg, void *ctx)` | Used when `target == FUSED_LOG_CALLBACK`. `level` is one of `FUSED_LOG_ERROR`, `FUSED_LOG_WARN`, `FUSED_LOG_INFO`. |
| `callback_ctx` | `void *` | Passed through opaquely as the `ctx` argument to `callback`. |

Log target constants:
@@ -236,6 +236,15 @@ Log target constants:
| `FUSED_LOG_SUPPRESS` | 3 | Discard all messages |
| `FUSED_LOG_CALLBACK` | 4 | Call `config.callback` |

Log level constants (passed to callbacks; stderr/stdout/file targets emit
every message regardless of level):

| Constant | Value | Meaning |
|----------|-------|---------|
| `FUSED_LOG_ERROR` | 0 | Hard error — init failed, no resources allocated. Routed via `log_errors`. |
| `FUSED_LOG_WARN` | 1 | Partial success or fallback — request still produced output. Routed via `log_warnings`. |
| `FUSED_LOG_INFO` | 2 | Low-frequency status / diagnostic. Routed via `log_warnings`; filter on `level` in a callback to drop. |


## Scale Step Flags

@@ -515,6 +524,7 @@ are valid, and `fused_scaler_run` must not be called.
| `FUSED_ERR_NO_STEPS` | -2 | No valid step flags remain after filtering (all were rejected or none were set). |
| `FUSED_ERR_BAD_DIMENSIONS` | -3 | `src_width` or `src_height` is <= 0, or too small for the requested steps. |
| `FUSED_ERR_BAD_ALIGNMENT` | -4 | `src_y_stride` or `src_uv_stride` is not 32-byte aligned. |
| `FUSED_ERR_OUT_OF_MEMORY` | -5 | Allocation of internal state failed; output buffers (if any were allocated earlier in init) have already been released. |
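
One way a caller might surface these codes is a small lookup helper. The sketch below is hypothetical: `demo_fused_strerror` is not part of the library, and the constants are redefined locally (with the values from the table) only so the block compiles on its own. Real code would include `include/funnelcake.h` instead.

```c
/* Values copied from the table above so the sketch is self-contained;
 * a real caller would take them from include/funnelcake.h. */
#ifndef FUSED_ERR_NO_STEPS
#define FUSED_ERR_NO_STEPS       (-2)
#define FUSED_ERR_BAD_DIMENSIONS (-3)
#define FUSED_ERR_BAD_ALIGNMENT  (-4)
#define FUSED_ERR_OUT_OF_MEMORY  (-5)
#endif

/* Hypothetical helper: maps an init return code to a short description. */
static const char *demo_fused_strerror(int rc)
{
    switch (rc) {
    case FUSED_ERR_NO_STEPS:       return "no valid step flags remain after filtering";
    case FUSED_ERR_BAD_DIMENSIONS: return "source dimensions invalid or too small";
    case FUSED_ERR_BAD_ALIGNMENT:  return "stride not 32-byte aligned";
    case FUSED_ERR_OUT_OF_MEMORY:  return "allocation of internal state failed";
    default:                       return "unrecognized code";
    }
}
```

Keeping a `default` branch means a code added in a later release still produces a readable message instead of undefined behavior.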


## Alignment Requirements
@@ -678,9 +688,12 @@ scaler.log_warnings.callback = my_log;
scaler.log_warnings.callback_ctx = my_logger_instance;
```

-The `level` argument to the callback is `FUSED_LOG_ERROR` (0) or
-`FUSED_LOG_WARN` (1). The `msg` string is a complete formatted message;
-do not call `fused_scaler_*` functions from within the callback.
+The `level` argument to the callback is `FUSED_LOG_ERROR` (0),
+`FUSED_LOG_WARN` (1), or `FUSED_LOG_INFO` (2). Info-level messages
+(e.g. "tone map LUTs generated") share the `log_warnings` config. To
+keep warnings but drop info, install a callback and filter on `level`.
+The `msg` string is a complete formatted message; do not call
+`fused_scaler_*` functions from within the callback.
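
A minimal sketch of such a filtering callback follows. The callback signature and the three level values come from this document; the `demo_logger` struct, its fields, and `demo_log_filter` are hypothetical names for this example, and the level constants are redefined locally only so the block compiles without the header.

```c
#include <stdio.h>

/* Values as documented; real code would include funnelcake.h instead. */
#ifndef FUSED_LOG_ERROR
#define FUSED_LOG_ERROR 0
#define FUSED_LOG_WARN  1
#define FUSED_LOG_INFO  2
#endif

/* Hypothetical per-logger state, handed to the callback via callback_ctx. */
typedef struct {
    FILE *out;        /* where forwarded messages go */
    int   max_level;  /* highest level that is still forwarded */
    int   forwarded;
    int   dropped;
} demo_logger;

/* Matches the documented callback type:
 *   void (*)(int level, const char *msg, void *ctx) */
static void demo_log_filter(int level, const char *msg, void *ctx)
{
    demo_logger *lg = ctx;
    if (level > lg->max_level) {
        lg->dropped++;  /* e.g. FUSED_LOG_INFO when max_level is WARN */
        return;
    }
    fputs(msg, lg->out);  /* msg is already a complete formatted message */
    lg->forwarded++;
}
```

To use it, set `target` to `FUSED_LOG_CALLBACK`, `callback` to `demo_log_filter`, and `callback_ctx` to a `demo_logger` whose `max_level` is `FUSED_LOG_WARN`. The callback only writes to a stream and never calls back into the library, in line with the restriction above.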


## HDR10 API Reference
11 changes: 11 additions & 0 deletions funnelcake.pc
@@ -0,0 +1,11 @@
prefix=/usr/local
exec_prefix=${prefix}
libdir=${exec_prefix}/lib
includedir=${prefix}/include

Name: funnelcake
Description: SIMD YUV scaler with HDR/SDR tonemapping
Version: 0.1.0
Cflags: -I${includedir}
Libs: -L${libdir} -lfunnelcake
Libs.private: -lm
8 changes: 8 additions & 0 deletions include/funnelcake.h
@@ -140,14 +140,22 @@ extern "C" {
#define FUSED_ERR_NO_STEPS (-2) /* no valid step flags set after filtering */
#define FUSED_ERR_BAD_DIMENSIONS (-3) /* src_width/height <= 0 or too small */
#define FUSED_ERR_BAD_ALIGNMENT (-4) /* strides not 32-byte aligned */
#define FUSED_ERR_OUT_OF_MEMORY (-5) /* allocation of internal state failed */


/* --------------------------------------------------------------------------
* Log levels
*
* Levels are passed to FUSED_LOG_CALLBACK callbacks (which can filter on
* them); stderr/stdout/file targets emit every message regardless of level.
* Routing of info-level diagnostics shares the warnings logger config, so
* callers that want to drop info but keep warnings should install a callback
* and filter by level.
* -------------------------------------------------------------------------- */

#define FUSED_LOG_ERROR 0
#define FUSED_LOG_WARN 1
#define FUSED_LOG_INFO 2 /* low-frequency diagnostic / status messages */


/* --------------------------------------------------------------------------
Binary file added libfunnelcake.1.dylib
Binary file not shown.
89 changes: 89 additions & 0 deletions scripts/build-macos.sh
@@ -0,0 +1,89 @@
#!/usr/bin/env bash
set -euo pipefail

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
REPO_ROOT="$(cd "${SCRIPT_DIR}/.." && pwd)"
DIST_DIR="${REPO_ROOT}/dist"
BUILD_DATE="$(date -u +%Y%m%dT%H%M%SZ)"
HOST_OS="$(uname -s)"
HOST_ARCH="$(uname -m)"

usage() {
    cat <<'EOF'
Usage: scripts/build-macos.sh

Build a native macOS release archive. This script must run on macOS with
Xcode command line tools installed.
EOF
}

if [ "${1:-}" = "-h" ] || [ "${1:-}" = "--help" ]; then
    usage
    exit 0
fi

if [ "$#" -gt 0 ]; then
    echo "error: unknown option: $1" >&2
    usage >&2
    exit 1
fi

require_tool() {
    if ! command -v "$1" >/dev/null 2>&1; then
        echo "error: required tool not found: $1" >&2
        exit 1
    fi
}

if [ "${HOST_OS}" != "Darwin" ]; then
    echo "error: macOS artifacts require a Darwin host" >&2
    exit 1
fi

require_tool clang
require_tool make
require_tool shasum
require_tool tar

mkdir -p "${DIST_DIR}"

package_dir="funnelcake-macos-${HOST_ARCH}"

echo "==> Building macOS ${HOST_ARCH} artifact"
(
    cd "${REPO_ROOT}"
    make clean
    make lib CC=clang LTO=0 UNAME_S=Darwin UNAME_M="${HOST_ARCH}"
)

rm -rf "${DIST_DIR}/${package_dir}"
mkdir -p "${DIST_DIR}/${package_dir}/include"
cp "${REPO_ROOT}/libfunnelcake.a" "${DIST_DIR}/${package_dir}/"
cp "${REPO_ROOT}/include/funnelcake.h" "${DIST_DIR}/${package_dir}/include/"
cp "${REPO_ROOT}/README.md" "${REPO_ROOT}/INSTALL.md" "${DIST_DIR}/${package_dir}/"

{
    printf '%s\n' "name=funnelcake"
    printf '%s\n' "target_os=macos"
    printf '%s\n' "target_arch=${HOST_ARCH}"
    printf '%s\n' "compiler=clang"
    printf '%s\n' "lto=0"
    printf '%s\n' "build_date=${BUILD_DATE}"
} > "${DIST_DIR}/${package_dir}/BUILD_INFO"

rm -f "${DIST_DIR}/${package_dir}.tar.gz" "${DIST_DIR}/${package_dir}.tar.gz.sha256"
(
    cd "${DIST_DIR}"
    tar -czf "${package_dir}.tar.gz" "${package_dir}"
    shasum -a 256 "${package_dir}.tar.gz" > "${package_dir}.tar.gz.sha256"
)

(
    cd "${REPO_ROOT}"
    make clean
)

echo ""
echo "Artifacts written to ${DIST_DIR}:"
echo " ${package_dir}.tar.gz"
echo " ${package_dir}.tar.gz.sha256"