Add optional 'fast save' option when saving PNG files #117943
Akosmo wants to merge 1 commit into godotengine:master
I came across #116826 and I see that the header file is also added as an include to |
Tested locally, it works as expected. Code looks good to me.
Compared to #91263, this is not as fast and files produced are larger, but it has almost no binary size overhead (< 1 KB), since we're exposing an existing option in libpng instead of introducing a new library (however small it is). I would still like to get QOI merged as it's still a good deal faster than this PR, making real-time speeds much more likely to be a reality for many projects.
Even if we do merge QOI support later on, I think this PR still has merit on its own as it allows games to opt into faster screenshot taking with a popular image format. This makes sense when you still want a format that can easily be shared on social media. For example, Steam, Discord, etc. don't support QOI file uploads with previewing, but accept PNGs that may be recompressed to smaller lossy formats.
A set of frames in each format for validation, just so you can see the output looks identical: frame_1250.zip
Some feedback:
- The `editor/movie_writer/png/optimize_speed` project setting should be `true` by default, as most users will want to favor speed over compression efficiency (since PNG frames are often thrown away after encoding a video from them). It's a small behavior change, but one that I consider acceptable as we don't guarantee MovieWriter outputs to be bit-for-bit identical across releases.
Benchmark
PC specifications
- CPU: AMD Ryzen 9 9950X3D
- GPU: NVIDIA GeForce RTX 5090
- RAM: 64 GB (2×32 GB DDR5-6000 CL30)
- SSD: Solidigm P44 Pro 2 TB
- OS: Linux (Fedora 43)
Using a release export template build with production=yes lto=full. All tests are done with branches rebased on top of master 4a919ad.
Time taken to record 30 seconds of https://github.com/Calinou/godot-movie-maker-demo at the default window size (1152×648), V-Sync disabled:
| Format | Encoding time | File size of all frames |
|---|---|---|
| PNG (master) | 5m37s (8% of real-time) | 1294 MB |
| PNG with Optimize Speed (this PR) | 56s (53% of real-time) | 1950 MB |
| QOI | 19s (157% of real-time) | 1533 MB |
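For anyone wanting to check the "% of real-time" column, here is a minimal sketch of the arithmetic. It assumes the percentage is movie length divided by wall-clock recording time, truncated to a whole number (truncation is an assumption, but it matches the figures in the table). The helper names `parse_duration` and `real_time_percent` are hypothetical and only used for illustration.

```python
def parse_duration(text: str) -> int:
    """Convert a string such as '5m37s' or '56s' into whole seconds."""
    minutes, _, rest = text.rpartition("m")
    return int(minutes or 0) * 60 + int(rest.rstrip("s"))


def real_time_percent(movie_seconds: int, wall_seconds: int) -> int:
    """Movie length as a percentage of wall-clock time, truncated (assumed)."""
    return 100 * movie_seconds // wall_seconds


MOVIE_SECONDS = 30  # length of the recorded demo, from the benchmark description

# Recording times copied from the table above.
TIMINGS = {
    "PNG (master)": "5m37s",
    "PNG with Optimize Speed (this PR)": "56s",
    "QOI": "19s",
}

if __name__ == "__main__":
    for name, duration in TIMINGS.items():
        wall = parse_duration(duration)
        percent = real_time_percent(MOVIE_SECONDS, wall)
        print(f"{name}: {wall} s wall-clock, {percent}% of real-time")
```

Anything at or above 100% means frames are written at least as fast as the movie plays back.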
I'd like to test fpng for completeness, but I can't compile it due to #101268 (comment) which still applies on my Fedora 43 system.
This should be good. Need more thoughts on this. Will squash on approval.
Looks good to me. I tested it and .png saving works as expected. The measured improvements in saving speed look great. Given that this is a fairly minor change, I think it should be prioritised for now over the other performance-improving MovieWriter PRs.
Could you squash the commits here, as per the contribution guidelines?
Closes: #51868 (unless the results are not satisfactory enough)
Supersedes: #74324
Related to: godotengine/godot-proposals#14534
This PR adds a parameter to the functions that save PNG images, which sets libpng's `flags` to `PNG_IMAGE_FLAG_FAST`. This emphasizes speed over compression, allowing images to be saved faster at the cost of a slightly bigger file size. This tradeoff is especially significant when saving multiple PNG files, like with MovieWriter, which has been in need of a speed boost. Read more in libpng's code.
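To illustrate the kind of tradeoff involved, here is a minimal standard-library sketch. It is not Godot's or libpng's code. It writes the same pixels as a PNG at two zlib compression levels, always using filter type 0 (None), so it does not reproduce libpng's adaptive row filtering on the default path. As far as I can tell from libpng's source, the fast flag skips row filtering and lowers the compression level, but the exact levels used here (6 and 3) and the helper names are assumptions for illustration only.

```python
import struct
import time
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_chunk(tag: bytes, data: bytes) -> bytes:
    """Build one PNG chunk: big-endian length, tag, data, CRC-32 of tag + data."""
    crc = zlib.crc32(tag + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + tag + data + struct.pack(">I", crc)


def encode_png(width: int, height: int, rgb: bytes, level: int) -> bytes:
    """Encode 8-bit RGB pixels as a PNG, with filter type 0 (None) on every row."""
    stride = width * 3
    rows = (b"\x00" + rgb[y * stride:(y + 1) * stride] for y in range(height))
    # IHDR: width, height, bit depth 8, color type 2 (RGB), then the
    # compression method, filter method and interlace fields, all 0.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 2, 0, 0, 0)
    idat = zlib.compress(b"".join(rows), level)
    return (
        PNG_SIGNATURE
        + png_chunk(b"IHDR", ihdr)
        + png_chunk(b"IDAT", idat)
        + png_chunk(b"IEND", b"")
    )


def make_test_image(width: int, height: int) -> bytes:
    """Deterministic synthetic RGB pattern (a stand-in for a rendered frame)."""
    pixels = bytearray()
    for y in range(height):
        for x in range(width):
            pixels += bytes(((x * y) % 256, (x + y) % 256, (x ^ y) % 256))
    return bytes(pixels)


if __name__ == "__main__":
    width, height = 256, 256
    rgb = make_test_image(width, height)
    # Levels are illustrative assumptions, not values read from Godot.
    for label, level in (("default-like", 6), ("fast-like", 3)):
        start = time.perf_counter()
        png = encode_png(width, height, rgb, level)
        elapsed_ms = (time.perf_counter() - start) * 1000
        print(f"{label}: level={level}, {len(png)} bytes, {elapsed_ms:.1f} ms")
```

Both outputs decode to identical pixels; only encoding time and file size differ. Actual numbers depend heavily on image content, which is why the benchmarks below use real MovieWriter output.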
Initially, my idea was to just make this a project setting. However, in the PR that this one supersedes, it was recommended to make it a function parameter instead (I still had to add a project setting for MovieWriter).
To maintain expected behavior, I set the default value of the parameter to false, including for MovieWriter.
My motivation for this PR is that my app needs a faster MovieWriter. My only option is to use lossless image sequences, and the only option Godot offers at the moment, PNG, is incredibly slow. I've been waiting for different solutions, but I decided to try making one myself.
As you can see in the benchmarks below, it's not that big of a difference (I recommend others to try benchmarking as well, since in the past others have gotten much better results), but I think it's still worth it, especially considering how slow PNG MovieWriter is at the moment.
Here are my benchmarks:
PC specs: Windows 10 64-bit, SSD, 16 GB RAM 2400MHz, dedicated Radeon RX 580 8 GB, Intel Core i5-9600KF CPU at 3.70GHz
Project settings: Forward+ (I think D3D12)
Executable used: godot.windows.editor.dev.x86_64.console
Single image benchmarks
Tests 1a (1080p without PNG optimization)
Time: 417ms + 393ms + 388ms + 394ms + 388ms = 396ms average
Size: 89.257 KB + 89.229 KB + 89.227 KB + 89.300 KB + 89.331 KB = 89.2688 KB average
Tests 1b (1080p with PNG optimization)
Time: 124ms + 133ms + 138ms + 121ms + 125ms = 128.2ms average
Size: 134.905 KB + 134.87 KB + 134.87 KB + 134.822 KB + 134.822 KB = 134.8578 KB average
All generated images in the tests above were loaded afterwards in GDScript. I got no errors there, nor when opening them externally, but further tests are encouraged.
MovieWriter benchmarks
All tests below done at 60 FPS, with a duration of 10 seconds, displaying a moving shader.
Test 1a (720p without PNG optimization)
600 frames at 60 FPS (movie length: 00:00:10:00), recorded in 00:01:21 (12% of real-time speed).
CPU render time: 0.09 seconds (average: 0.14 ms/frame)
GPU render time: 0.38 seconds (average: 0.63 ms/frame)
Encoding time: 71.22 seconds (average: 118.70 ms/frame)
Size of all frames: 12.4 MB (13.7 MB on disk)
Test 1b (720p with PNG optimization)
600 frames at 60 FPS (movie length: 00:00:10:00), recorded in 00:00:26 (38% of real-time speed).
CPU render time: 0.09 seconds (average: 0.14 ms/frame)
GPU render time: 0.37 seconds (average: 0.61 ms/frame)
Encoding time: 16.49 seconds (average: 27.49 ms/frame)
Size of all frames: 21.9 MB (23.1 MB on disk)
Test 2a (1080p without PNG optimization)
600 frames at 60 FPS (movie length: 00:00:10:00), recorded in 00:04:28 (3% of real-time speed).
CPU render time: 0.08 seconds (average: 0.14 ms/frame)
GPU render time: 0.79 seconds (average: 1.31 ms/frame)
Encoding time: 190.38 seconds (average: 317.31 ms/frame)
Size of all frames: 107 MB (108 MB on disk)
Test 2b (1080p with PNG optimization)
600 frames at 60 FPS (movie length: 00:00:10:00), recorded in 00:01:57 (8% of real-time speed).
CPU render time: 0.08 seconds (average: 0.14 ms/frame)
GPU render time: 0.80 seconds (average: 1.34 ms/frame)
Encoding time: 39.22 seconds (average: 65.37 ms/frame)
Size of all frames: 95.8 MB (97 MB on disk) (not sure why this is smaller)
Test 3a (720p without PNG optimization - more complex shader)
600 frames at 60 FPS (movie length: 00:00:10:00), recorded in 00:06:40 (2% of real-time speed).
CPU render time: 0.09 seconds (average: 0.15 ms/frame)
GPU render time: 0.61 seconds (average: 1.02 ms/frame)
Encoding time: 389.90 seconds (average: 649.84 ms/frame)
Size of all frames: 314 MB (315 MB on disk)
Test 3b (720p with PNG optimization - more complex shader)
600 frames at 60 FPS (movie length: 00:00:10:00), recorded in 00:01:22 (12% of real-time speed).
CPU render time: 0.09 seconds (average: 0.14 ms/frame)
GPU render time: 0.60 seconds (average: 1.00 ms/frame)
Encoding time: 72.36 seconds (average: 120.61 ms/frame)
Size of all frames: 388 MB (390 MB on disk)
Test 4a (1080p without PNG optimization - more complex shader)
600 frames at 60 FPS (movie length: 00:00:10:00), recorded in 00:13:14 (1% of real-time speed).
CPU render time: 0.08 seconds (average: 0.14 ms/frame)
GPU render time: 1.28 seconds (average: 2.13 ms/frame)
Encoding time: 714.14 seconds (average: 1190.23 ms/frame)
Size of all frames: 556 MB (557 MB on disk)
Test 4b (1080p with PNG optimization - more complex shader)
600 frames at 60 FPS (movie length: 00:00:10:00), recorded in 00:03:27 (4% of real-time speed).
CPU render time: 0.08 seconds (average: 0.14 ms/frame)
GPU render time: 1.27 seconds (average: 2.11 ms/frame)
Encoding time: 128.58 seconds (average: 214.30 ms/frame)
Size of all frames: 748 MB (749 MB on disk)
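To make the numbers above easier to compare, here is a small sketch that derives the encoding speedup and the file size ratio for each test pair. The figures are copied from the results above; the `compare` helper and the `RESULTS` layout are hypothetical, introduced only for this example.

```python
# (encoding seconds, total size in MB) for the "a" (baseline) and "b" (fast)
# runs, copied from the MovieWriter results above.
RESULTS = {
    "Test 1 (720p)": ((71.22, 12.4), (16.49, 21.9)),
    "Test 2 (1080p)": ((190.38, 107.0), (39.22, 95.8)),
    "Test 3 (720p, complex shader)": ((389.90, 314.0), (72.36, 388.0)),
    "Test 4 (1080p, complex shader)": ((714.14, 556.0), (128.58, 748.0)),
}


def compare(baseline, fast):
    """Return (encoding speedup, size ratio) of the fast run versus the baseline."""
    base_time, base_size = baseline
    fast_time, fast_size = fast
    return base_time / fast_time, fast_size / base_size


if __name__ == "__main__":
    for name, (baseline, fast) in RESULTS.items():
        speedup, size_ratio = compare(baseline, fast)
        print(f"{name}: {speedup:.2f}x faster encoding, {size_ratio:.2f}x the size")
```

A size ratio below 1.0 (as in test 2) means the fast run produced smaller files than the baseline.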
--
Last tests for now: did test 3 again. Compiled with `production=yes dev_build=no debug_symbols=no optimize=speed_trace`.
3a:
600 frames at 60 FPS (movie length: 00:00:10:00), recorded in 00:03:07 (5% of real-time speed).
CPU render time: 0.03 seconds (average: 0.06 ms/frame)
GPU render time: 1.25 seconds (average: 2.08 ms/frame)
Encoding time: 182.51 seconds (average: 304.19 ms/frame)
Same size
3b:
600 frames at 60 FPS (movie length: 00:00:10:00), recorded in 00:00:35 (28% of real-time speed).
CPU render time: 0.04 seconds (average: 0.06 ms/frame)
GPU render time: 1.26 seconds (average: 2.10 ms/frame)
Encoding time: 30.37 seconds (average: 50.62 ms/frame)
Same size
Also single image, same build:
`fast_save = false`: time = ~128ms, size = ~207 KB
`fast_save = true`: time = ~52ms, size = ~234 KB
Overall better results on these last tests.
Final notes:
This is definitely out of my comfort zone, but I really want an improvement to the MovieWriter. I believe this can be helpful to many others as well, since I'm far from the only one who has had issues with MovieWriter. Please let me know if I have missed anything, and I'll try my best to solve it.