desugar-64/imla

Imla - (Experimental) GPU-Accelerated Blurring for Android Jetpack Compose UI

Caution

This is not a library, and it is not production-ready. Imla is a for-fun experiment exploring an alternative Jetpack Compose rendering approach — capturing the Compose root and effect layers through OpenGL and HardwareBuffers. Don't ship it.

Description

Imla (Ukrainian for "Haze", pronounced [ˈimlɑ] (eem-lah)) is an experimental GPU-accelerated backdrop blur for Jetpack Compose. It captures the Compose root into a HardwareBuffer, imports it zero-copy as an OpenGL ES texture, and runs blur, tint, noise, clip, and progressive-mask passes that sample that shared backdrop. Targets Android 6 (API 23) and up.

Features

  • Backdrop blur — separable Gaussian, adjustable radius, gamma-correct (blurred in linear light to avoid dark seams).
  • Progressive blur — a brush mask drives blur strength per pixel (crisp → blurred gradients).
  • Shape masking — clip a layer to any Shape outline.
  • Tinting — color composited over the blurred backdrop.
  • Noise blend — frosted-glass grain, lightness-driven (strongest in mid-tones, eased off in shadows and highlights).
  • Composite rendering — stack independent effect layers; each samples the shared backdrop and composites by z-order.
  • Rotation-correct — 3-axis graphicsLayer rotation stays aligned with the sampled backdrop.
  • Hardware-accelerated — HardwareBuffer capture and zero-copy GL import, on-GPU end-to-end.
  • Supports Android 6 (API 23) onwards.
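
As a rough CPU-side illustration of the "separable Gaussian, gamma-correct" bullet, the sketch below builds a normalized 1D kernel and blurs one row of sRGB values in linear light. It is not Imla's shader code: the function names are hypothetical, edge taps are clamped as an assumption, and the transfer functions are the standard sRGB ones.

```kotlin
import kotlin.math.exp
import kotlin.math.pow

/** Normalized 1D Gaussian weights for taps at offsets -radius..radius. */
fun gaussianKernel(radius: Int, sigma: Double): DoubleArray {
    require(radius >= 0 && sigma > 0.0)
    val weights = DoubleArray(2 * radius + 1) { i ->
        val x = (i - radius).toDouble()
        exp(-(x * x) / (2.0 * sigma * sigma))
    }
    val sum = weights.sum()
    return DoubleArray(weights.size) { weights[it] / sum }
}

/** sRGB-encoded channel in [0, 1] to linear light (standard sRGB transfer function). */
fun srgbToLinear(c: Double): Double =
    if (c <= 0.04045) c / 12.92 else ((c + 0.055) / 1.055).pow(2.4)

/** Linear-light channel in [0, 1] back to sRGB encoding. */
fun linearToSrgb(c: Double): Double =
    if (c <= 0.0031308) c * 12.92 else 1.055 * c.pow(1.0 / 2.4) - 0.055

/** Blurs one row of sRGB values in linear light, clamping taps at the row edges (an assumption). */
fun blurRowGammaCorrect(row: DoubleArray, kernel: DoubleArray): DoubleArray {
    val radius = kernel.size / 2
    val linear = DoubleArray(row.size) { srgbToLinear(row[it]) }
    return DoubleArray(row.size) { i ->
        var acc = 0.0
        for (k in kernel.indices) {
            val j = (i + k - radius).coerceIn(0, row.size - 1)
            acc += kernel[k] * linear[j]
        }
        linearToSrgb(acc)
    }
}
```

Averaging black and white in linear light encodes back to roughly 0.735 in sRGB rather than 0.5, which is why blurring the encoded values directly tends to darken edges. A full 2D blur would run the same row pass horizontally and then vertically.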

Showcase

Each tile below is a single Modifier.effectLayer { … } sampling one shared backdrop — rendered on a calibration grid so the effect is easy to read.

Imla effects showcase on a calibration grid

  • Blur: backdropBlur(radius)
  • Tint: tint(color)
  • Frosted noise: noise(alpha)
  • Progressive blur (crisp top → blurred bottom): backdropBlur(radius, progressiveMask)
  • Shape mask (arbitrary clip outline): clip(shape)
  • Rotation (3-axis tilt stays aligned): graphicsLayer { rotationX/Y/Z }
  • Composite: stack independent blur layers; each effect samples the shared backdrop and composites by z-order
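
To make the progressive blur tile concrete, here is a minimal sketch of how a mask value could drive blur strength per pixel. The linear mapping from mask to radius and both function names are assumptions for illustration; the README does not say how the shader actually maps the brush mask.

```kotlin
/** Vertical linear-gradient mask: 0.0 at the top row, 1.0 at the bottom row. */
fun verticalMask(y: Int, height: Int): Float {
    require(height > 0 && y in 0 until height)
    return if (height == 1) 1f else y.toFloat() / (height - 1)
}

/** Scales the maximum blur radius by the mask value, clamped to [0, 1] (assumed linear mapping). */
fun progressiveRadius(maxRadiusPx: Float, mask: Float): Float =
    maxRadiusPx * mask.coerceIn(0f, 1f)
```

With a 24 px maximum radius and a five-row mask, the top row stays crisp (radius 0), the middle row gets 12 px, and the bottom row gets the full 24 px.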

Usage

Wrap a screen — or any region — in ImlaHost, then declare blur regions with the effectGroup / effectLayer modifiers. No renderer or OpenGL objects are passed to children; they register through the host.

ImlaHost {
    Box(
        modifier = Modifier
            .fillMaxSize()
            .effectGroup() // the backdrop that effect layers sample from
    ) {
        FeedContent()

        TopBar(
            modifier = Modifier.effectLayer {
                backdropBlur(radius = 12.dp)
                tint(Color.White.copy(alpha = 0.1f))
                noise(alpha = 0.15f)
                clip(RoundedCornerShape(16.dp))
            }
        )
    }
}

Public API: ImlaHost, Modifier.effectGroup(), Modifier.effectLayer { ... }, EffectLayerScope, and EffectLayerBoundsProvider.
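
For readers unfamiliar with receiver-scoped builders such as effectLayer { ... }, the following toy sketch shows one way a scope can collect settings from a lambda. Every type and function here (ToyEffectSpec, ToyEffectScope, toyEffectLayer) is a hypothetical stand-in, not Imla's EffectLayerScope, and plain Float and Long values replace Compose's Dp and Color so the snippet runs without Android.

```kotlin
/** Hypothetical stand-in for the settings an effect layer might collect. */
data class ToyEffectSpec(
    val blurRadiusDp: Float = 0f,
    val tintArgb: Long? = null,
    val noiseAlpha: Float = 0f,
)

/** Hypothetical receiver scope; each call records one setting. */
class ToyEffectScope {
    var spec = ToyEffectSpec()
        private set

    fun backdropBlur(radiusDp: Float) { spec = spec.copy(blurRadiusDp = radiusDp) }
    fun tint(argb: Long) { spec = spec.copy(tintArgb = argb) }
    fun noise(alpha: Float) { spec = spec.copy(noiseAlpha = alpha.coerceIn(0f, 1f)) }
}

/** Runs the builder lambda against a fresh scope and returns the collected spec. */
fun toyEffectLayer(block: ToyEffectScope.() -> Unit): ToyEffectSpec =
    ToyEffectScope().apply(block).spec
```

Calling toyEffectLayer { backdropBlur(12f); noise(0.15f) } returns an immutable spec that a renderer could read later, so the caller never touches renderer objects directly.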

Architecture

ImlaHost wraps everything inside it into a single GPU-backed surface and renders all content and effects through it. Children never touch the renderer — they register through the host's SceneRegistry.

flowchart TB
    Host["ImlaHost { }"]
    Content["content() — your screen UI"]
    Group["Modifier.effectGroup()<br/>backdrop the layers sample"]
    Layer["Modifier.effectLayer { }<br/>blur · tint · noise · clip"]
    Registry[("SceneRegistry")]
    Renderer["SceneRenderer"]
    Surface["single output SurfaceView<br/>API 29+ · SurfaceControl (zero-copy)<br/>API 23–28 · blit into Surface"]

    Host --> Content
    Host --> Surface
    Content --> Group
    Content --> Layer
    Group -. registers .-> Registry
    Layer -. registers .-> Registry
    Registry --> Renderer
    Renderer -- presents --> Surface

The output is a single SurfaceView on all supported APIs (Compose's AndroidExternalSurface is itself backed by a SurfaceView). What differs is how the renderer presents into it:

  • API 29+ — the SurfaceView is driven via SurfaceControl; the renderer hands HardwareBuffers straight to SurfaceFlinger with no present pass or eglSwapBuffers (zero-copy).
  • API 23–28 — there is no SurfaceControl, so the renderer blits into the SurfaceView's Surface (obtained through AndroidExternalSurface).
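
The API-level split above can be sketched as a small selection function. The enum and function names are hypothetical; on a device the argument would be Build.VERSION.SDK_INT, passed in here as a plain Int so the snippet runs off-device.

```kotlin
/** The two present strategies described in the text. */
enum class PresentPath { SURFACE_CONTROL_ZERO_COPY, BLIT_AND_SWAP }

/** Picks a present path from the device API level. */
fun choosePresentPath(sdkInt: Int): PresentPath {
    require(sdkInt >= 23) { "Imla targets API 23 and up" }
    return if (sdkInt >= 29) PresentPath.SURFACE_CONTROL_ZERO_COPY else PresentPath.BLIT_AND_SWAP
}
```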

The root content is captured once per frame and shared by every effect layer that samples it.
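
A minimal sketch of the "register through the host, capture once, composite by z-order" idea follows. ToyLayer, ToySceneRegistry, and renderFrame are hypothetical names for illustration and do not mirror the real SceneRegistry or SceneRenderer.

```kotlin
/** Hypothetical registry entry; `zOrder` decides composite order. */
data class ToyLayer(val id: Int, val zOrder: Int)

/** Hypothetical stand-in for a scene registry that children register with. */
class ToySceneRegistry {
    private val layers = mutableMapOf<Int, ToyLayer>()
    private var nextId = 0

    /** Registers a layer and returns a handle that removes it again. */
    fun register(zOrder: Int): () -> Unit {
        val id = nextId++
        layers[id] = ToyLayer(id, zOrder)
        return { layers.remove(id) }
    }

    /** Layers in back-to-front order; ties keep registration order. */
    fun inDrawOrder(): List<ToyLayer> = layers.values.sortedBy { it.zOrder }
}

/** Captures the backdrop once, then lets every layer sample the same capture. */
fun <B> renderFrame(registry: ToySceneRegistry, capture: () -> B, draw: (ToyLayer, B) -> Unit) {
    val backdrop = capture() // once per frame, shared by all layers
    for (layer in registry.inDrawOrder()) draw(layer, backdrop)
}
```

The returned removal handle plays the role of a disposal callback: a child that leaves the composition would invoke it so the renderer stops drawing that layer.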

Per-frame flow

sequenceDiagram
    participant UI as Compose main thread
    participant CAP as Capture thread
    participant GL as GL thread
    participant OUT as Output (SurfaceControl / EGL)
    UI->>UI: record content into a GraphicsLayer (draw pass)
    UI->>UI: vsync-align capture (postOnAnimation)
    UI->>CAP: HardwareRenderer.syncAndDraw → HardwareBuffer
    CAP-->>UI: HardwareBuffer ready (async)
    UI->>GL: hand off latest-only snapshot, requestRender
    GL->>GL: import HardwareBuffer zero-copy (EGLImage)
    Note over GL: noise (once) → root draw →<br/>per slot: [stencil clip] prepare → separable blur →<br/>composite (tint · noise · mask fused) → content
    GL->>OUT: present — API 29+ SurfaceControl (zero-copy) · else blit + swap
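
The "latest-only snapshot" handoff in the diagram can be approximated with a single-slot mailbox, sketched below under the assumption that a newer frame should simply replace one the GL thread has not consumed yet. LatestOnlyMailbox is a hypothetical name; the project's actual handoff type is not shown in this README.

```kotlin
import java.util.concurrent.atomic.AtomicReference

/** Single-slot mailbox: a new post replaces any snapshot not yet taken. */
class LatestOnlyMailbox<T : Any> {
    private val slot = AtomicReference<T?>(null)

    /** Publishes [value]; returns the stale snapshot it displaced, if any, so the caller can release it. */
    fun post(value: T): T? = slot.getAndSet(value)

    /** Takes the newest snapshot, or null if nothing was posted since the last take. */
    fun take(): T? = slot.getAndSet(null)
}
```

Returning the displaced snapshot from post matters when snapshots wrap scarce resources such as buffers: the producer can hand the skipped frame back to its pool instead of leaking it.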

HardwareBuffers end-to-end

The Compose root is drawn into a single shared HardwareBuffer (a RenderNode rendered single-buffered off the main thread), and every effect layer samples that one backdrop. Capture lives in SingleBufferRenderer; the buffer is leased to the GL thread via BufferLease.

The GL thread imports it zero-copy with eglCreateImageFromHardwareBuffer + glEGLImageTargetTexture2DOES — no glReadPixels round-trip — in OpenGLHardwareBufferTexture2D, driven by CapturedFrameImporter.

On API 29+ the finished HardwareBuffer goes straight to SurfaceFlinger (see ScenePresenter and the buffer ring SceneHwBufferRing), so pixels stay in GPU/shared memory across capture → effects → present. This path is the main thing being experimented with here, and is still being tuned.
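
To illustrate the lease and buffer-ring idea in general terms, here is a small sketch of a fixed pool whose buffers are handed out as leases and returned on release. ToyLease and ToyBufferRing are hypothetical and are not the project's BufferLease or SceneHwBufferRing; a real ring would also have to wait on GPU and SurfaceFlinger release fences, which this sketch leaves out.

```kotlin
import java.util.concurrent.atomic.AtomicBoolean

/** Hypothetical lease: holds one buffer until [release] is called; extra releases are ignored. */
class ToyLease<T : Any>(val buffer: T, private val onRelease: () -> Unit) {
    private val released = AtomicBoolean(false)

    fun release() {
        if (released.compareAndSet(false, true)) onRelease()
    }
}

/** Hypothetical fixed-size ring; `acquire` returns null when every buffer is in flight. */
class ToyBufferRing<T : Any>(buffers: List<T>) {
    private val free = ArrayDeque(buffers)

    @Synchronized
    fun acquire(): ToyLease<T>? {
        val buffer = free.removeFirstOrNull() ?: return null
        return ToyLease(buffer) { recycle(buffer) }
    }

    @Synchronized
    private fun recycle(buffer: T) {
        free.addLast(buffer)
    }
}
```

Returning null from acquire, instead of blocking, lets the caller decide whether to skip the frame or retry on the next vsync.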

Rendering Abstraction

The OpenGL abstractions are reused from a sister experiment, desugar-64/android-opengl-renderer — a playground for learning graphics and OpenGL, with helpers for setting up GL data and calls.

The renderer is fully dynamic: it pushes vertex data every frame. Flexible, but it adds overhead — a future optimization target.
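
As an illustration of what "pushing vertex data every frame" involves, the sketch below rebuilds an interleaved vertex array for a list of quads. The layout (x, y, u, v per vertex, triangle-strip order) and the names ToyQuad, quadVertices, and packFrame are assumptions for illustration, not the renderer's actual format.

```kotlin
/** Hypothetical axis-aligned quad in pixel coordinates. */
data class ToyQuad(val left: Float, val top: Float, val width: Float, val height: Float)

const val FLOATS_PER_QUAD = 16 // 4 vertices * (x, y, u, v)

/** Interleaved (x, y, u, v) vertices in strip order: bottom-left, bottom-right, top-left, top-right. */
fun quadVertices(q: ToyQuad): FloatArray {
    val right = q.left + q.width
    val bottom = q.top + q.height
    return floatArrayOf(
        q.left, bottom, 0f, 0f,
        right, bottom, 1f, 0f,
        q.left, q.top, 0f, 1f,
        right, q.top, 1f, 1f,
    )
}

/** Rebuilds the whole vertex array for this frame's quads; a dynamic renderer repeats this every frame. */
fun packFrame(quads: List<ToyQuad>): FloatArray {
    val out = FloatArray(quads.size * FLOATS_PER_QUAD)
    quads.forEachIndexed { i, q -> quadVertices(q).copyInto(out, i * FLOATS_PER_QUAD) }
    return out
}
```

A static or instanced path would upload the unit quad once and send only per-quad transforms, which is the kind of saving the paragraph above hints at.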

Performance notes

  • The separable Gaussian blur pass is the dominant GPU cost; it scales with the blur radius and the on-screen area being blurred.
  • Per-frame cost is driven mostly by render-pass / FBO switches rather than the blur math itself, so timing varies by scene.
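
The radius and area scaling in the first note can be put into rough numbers. The sketch below counts texture samples only, assuming one sample per tap and 2r + 1 taps per pass; it ignores downsampling, bilinear-tap tricks, and the render-pass / FBO switch cost that the second note identifies as the larger factor.

```kotlin
/** Texture samples per output pixel for a two-pass separable blur with 2r + 1 taps per pass. */
fun separableTapsPerPixel(radiusPx: Int): Long = 2L * (2 * radiusPx + 1)

/** Texture samples per output pixel for an equivalent single-pass 2D kernel. */
fun fullKernelTapsPerPixel(radiusPx: Int): Long {
    val side = 2L * radiusPx + 1
    return side * side
}

/** Rough total sample count for blurring a width x height region with the separable blur. */
fun separableSamples(widthPx: Int, heightPx: Int, radiusPx: Int): Long =
    widthPx.toLong() * heightPx * separableTapsPerPixel(radiusPx)
```

At a 12 px radius the separable form needs 50 samples per pixel against 625 for a full 2D kernel, and the total grows linearly with both the radius and the blurred area.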

Known issues and limitations

  • Expect bugs and untested edge cases. This is a for-fun experiment, not a hardened library. In particular, it has not been tested with hosted content that itself contains other SurfaceViews (video players, maps, camera previews, nested Imla hosts); surface-compositing scenarios like these may misbehave.
  • HardwareBuffer capture is not always the fastest path. Across the 4 devices I tested on, the hardware-rendering capture path (RenderNode → HardwareBuffer → zero-copy GL import) is fast on 3. On the 4th, a low-end budget Android 13 tablet, it runs slower than capturing the UI into an external texture. The cause is not yet pinned down.
  • No atlas grouping yet. Effect layers render as separate draws/passes rather than batched into a shared atlas. It's the main remaining GL-side optimization, deferred because it has to integrate with the bucketed capture buffers used for resizable content.
  • Dynamic per-frame vertex push. The renderer uploads vertex data every frame (see Rendering Abstraction) — flexible, but more overhead than a static/instanced path.
  • Convoluted internal architecture. The code grew through iteration and is more tangled than it needs to be — capture/import/present spread across many small types, with seams left over from past experiments. It works, but expect rough edges.

Roadmap

  • Render the Compose root through OpenGL behind a single host surface;
  • Capture and draw child content as GL textures;
  • Validate rotated slot geometry and root-space backdrop sampling;
  • Real separable Gaussian blur passes;
  • Progressive masks, stencil clips, tint, noise, and cumulative backdrop sampling;
  • Atlas grouping for effect layers (needs integration with the bucketed capture buffers).

To explore

  • Present via hardware Bitmap, no SurfaceView. Wrap the result HardwareBuffer as a hardware Bitmap (Bitmap.wrapHardwareBuffer, API 29+) and draw it in the Compose layout. It then composites through the normal Android (HWUI) path, so the host need not redirect the whole layout through our renderer. Open question: is it faster than the SurfaceView present, especially on low-end devices?
  • Compute-shader effect path. Run the separable blur as a GLES 3.1 compute shader instead of fragment passes into FBOs, skipping rasterization and some FBO ping-pong. Open questions: faster on mobile GPUs? compute-support / min-API floor?

Contributing

This project is open to suggestions and contributions. Feel free to open issues or submit pull requests on GitHub.

License

This project is licensed under the MIT License. See the LICENSE file for details.

Development Updates

For project development updates and history, refer to this Twitter thread.
