One of my #video works was exhibited a few months ago, and playback was horribly jittery - I supposed at the time it was an incompatibility between the #ffmpeg I used to encode it and the #apple laptop used to play it back.
Tonight I finally found (and fixed) the cause in my own #opengl code: it turns out that
glfwSwapBuffers(...);
glReadPixels(...);
is the wrong order to do those two things in: it leads to jumbled frame order in the output stream, or worse (either undefined behaviour, as allowed by the specification, or a driver bug; I haven't dug deeper to find out which). The correct order is:
glReadPixels(...);
glfwSwapBuffers(...);
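For the record, the working capture loop is shaped roughly like this (a sketch, not the exact repository code; width, height, frame, out are placeholders; assumes the default double-buffered framebuffer, where glReadPixels reads from GL_BACK):
/* inside the render loop: grab the frame *before* the swap invalidates the back buffer */
glReadPixels(0, 0, width, height, GL_RGBA, GL_UNSIGNED_BYTE, frame);
fwrite(frame, 4 * width * height, 1, out); /* e.g. a pipe into ffmpeg -f rawvideo */
glfwSwapBuffers(window);
glfwPollEvents();
/* note: glReadPixels returns rows bottom-to-top; flip with -vf vflip on the ffmpeg side */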
Probably the galaxy-brain move in this whole thing would be to render into an off-screen framebuffer object instead of the default framebuffer; then all windowing-system buffer-swapping issues become irrelevant.
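Something like this, roughly (an untested sketch; width, height, frame are placeholders again):
GLuint fbo, tex;
glGenTextures(1, &tex);
glBindTexture(GL_TEXTURE_2D, tex);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, width, height, 0, GL_RGBA, GL_UNSIGNED_BYTE, NULL);
glGenFramebuffers(1, &fbo);
glBindFramebuffer(GL_FRAMEBUFFER, fbo);
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, tex, 0);
/* ... render the scene as usual ... */
glReadBuffer(GL_COLOR_ATTACHMENT0);
glReadPixels(0, 0, width, height, GL_RGBA, GL_UNSIGNED_BYTE, frame);
/* no buffer swap involved at all on this path */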
The broken video: https://mathr.co.uk/meshwalk/.broken/meshwalk-2.0.mp4 (250MB)
The fixed version: https://mathr.co.uk/meshwalk/meshwalk-2.0.mp4 (340MB)
The fixed lossless #rgb version, which is 10x the size but doesn't look blurry (I'm looking at you, #yuv420p chroma subsampling): https://mathr.co.uk/meshwalk/meshwalk-2.0.mov (3GB)
I should probably have called the fixed version "meshwalk-2.1", but I'm too lazy to redo the title screen.
Working on meshwalk-3.0, 360 video with spatial sound (1st order Ambisonics so far, maybe 3rd order later), using spherical Voronoi cells for the walking mesh. The movement of the mesh in this version is partly based on the idea of potential wells for crystal formation, and is realized by updating a collection of nearest neighbours to each node using neighbours of neighbours (and hoping that nothing moves so fast that the algorithm breaks).
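Schematically, the neighbour update looks something like this (a sketch of the idea, not the actual repository code; node and select_k_nearest are made-up names):
#define K 8
typedef struct { double p[3]; int nn[K]; } node; /* unit-sphere position + K nearest neighbour indices */

void update_neighbours(node *nodes, int i) {
  /* candidates: current neighbours plus their neighbours */
  int cand[K + K * K], ncand = 0;
  for (int a = 0; a < K; ++a) {
    int j = nodes[i].nn[a];
    cand[ncand++] = j;
    for (int b = 0; b < K; ++b)
      cand[ncand++] = nodes[j].nn[b];
  }
  /* keep the K distinct candidates closest to nodes[i].p; this only stays
     correct while nodes move a small fraction of a cell per timestep */
  select_k_nearest(nodes, i, cand, ncand);
}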
uploaded the meshwalk-3.0 video to youtube (I know, bad google, use an ad blocker; can anyone recommend a peertube instance or similar that would let me upload >1GB high-res 360 vids with spatial sound and supports performant in-browser playback? I should probably self-host...)
wrote a blog post about meshwalk-3.0:
https://mathr.co.uk/blog/2018-11-28_meshwalk-3.0.html
source code in my dr1 repository:
https://code.mathr.co.uk/dr1/blob/HEAD:/meshwalk/meshwalk-3.0.c
my GPU struggles a bit rendering Voronoi cells with 1000 nodes at 8192x4096 (low framerate; 200 nodes was ok), maybe I should upload a spatial data structure... or maybe I overflowed a cache and uploading more data will only make it worse...
Alternatively, how about rendering a low resolution version using the naive O(N W H) algorithm, then refining it with a few O(W H) passes - if the initial pass is at a high enough resolution to have at least one pixel for each cell (and the cells are not too long and thin, hmmm), doubling the resolution needs only the local information around each pixel. Could store (nearest cell center position and index) in a vec4 in each texel, and do a final colouring pass from index to RGB.
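In CPU-flavoured C, one doubling pass would look something like this (the real thing would be a GLSL pass storing (position, index) vec4 texels; seed_distance is a hypothetical helper, and the longitude wrap-around for equirectangular is omitted):
#include <math.h> /* for INFINITY */

/* one resolution-doubling pass: each fine texel looks at the 3x3 coarse
   neighbourhood around its parent texel and keeps the nearest seed */
void refine_pass(const int *coarse, int cw, int ch, int *fine) {
  for (int y = 0; y < 2 * ch; ++y)
    for (int x = 0; x < 2 * cw; ++x) {
      int best = -1;
      double bestd = INFINITY;
      for (int dy = -1; dy <= 1; ++dy)
        for (int dx = -1; dx <= 1; ++dx) {
          int cx = x / 2 + dx, cy = y / 2 + dy;
          if (cx < 0 || cy < 0 || cx >= cw || cy >= ch) continue;
          int s = coarse[cy * cw + cx];
          double d = seed_distance(s, x, y); /* hypothetical: distance from fine texel (x,y) to seed s */
          if (d < bestd) { bestd = d; best = s; }
        }
      fine[y * (2 * cw) + x] = best;
    }
}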
With 4000 nodes I needed to increase the neighbourhood from 8 (this image is with 16) to avoid the breaking/overlap as described in the blog post. I suspect an alternative solution is to reduce the node movement speed when I increase the node count, so that it is relative to the cell size.
With this high node count and speed there are interesting emergent dynamics with locally-dense pockets between less-dense regions.
But it doesn't run at a realtime framerate any more: my CPU takes 1m15s to simulate 1min at 60fps, before doing any rendering at all. With rendering:
512x256: 47fps
1024x512: 46fps
2048x1024: 32fps
4096x2048: 8fps
8192x4096: 2fps
Reducing the neighbourhood count speeds simulation up enough to be realtime, but rendering is the bottleneck.
Meanwhile I optimized the sound implementation (which was overloading my CPU), dropping calculation of oscillators whose volume is less than -60dB.
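i.e. roughly this, with -60dB converted to an amplitude threshold of 10^(-60/20) = 0.001 (osc and render_oscillator are stand-in names, not the actual code):
const double threshold = pow(10.0, -60.0 / 20.0); /* -60dB = 0.001 amplitude */
for (int o = 0; o < noscillators; ++o) {
  if (osc[o].amplitude < threshold)
    continue; /* inaudible, skip the whole oscillator calculation */
  render_oscillator(&osc[o], out, nframes);
}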
meshwalk-3.1 video!
(better with headphones)
https://youtu.be/QxiEQUS5Xvo
Instead of generating the Voronoi diagram of 4000 cells at the target output resolution of 8192x4096, I now generate a coarse Voronoi diagram at 512x256, then refine it in 4 resolution-doubling passes, each of which uses a small local neighbourhood around each pixel, before a final colouring pass.
Now renders at 35fps, which is a big improvement over 1.3fps.
The bottleneck is now the sound: I can't render the 4k oscillators in real time. I wonder if I can use the magic of JACK's (jackdmp version) multi-core support to split my embarrassingly parallel audio code into several JACK clients and merge their outputs at the input of Ambix. Need to read up on it; I don't know exactly how jackdmp works...
@mathr looking forward to your findings.
docs say:
> jackdmp use a new client activation model that allows simultaneous client execution (on a smp machine) when parallel clients exist in the graph (client that have the same inputs). This activation model allows to better use available CPU on a smp machine, but also works on mono-processor machine.
I guess that means I can have 4 clients, each with no inputs, that will execute in parallel. Will try it!
@paul it works! meshwalk now creates 4 JACK clients each using ~30% of a CPU core for 1000 oscillators, no XRUNs.
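For anyone curious, the shape of it is roughly this (a minimal sketch, not the actual meshwalk code; the oscillator rendering itself is elided):
#include <stdio.h>
#include <unistd.h>
#include <jack/jack.h>
#define NCLIENTS 4

static jack_port_t *out_port[NCLIENTS];

/* no input ports, so jackdmp can run all four process callbacks in parallel */
static int process(jack_nframes_t nframes, void *arg) {
  float *out = jack_port_get_buffer(*(jack_port_t **) arg, nframes);
  for (jack_nframes_t i = 0; i < nframes; ++i)
    out[i] = 0; /* ... sum this client's 1000 oscillators here ... */
  return 0;
}

int main(void) {
  for (int c = 0; c < NCLIENTS; ++c) {
    char name[64];
    snprintf(name, sizeof(name), "meshwalk-%d", c);
    jack_client_t *client = jack_client_open(name, JackNullOption, NULL);
    out_port[c] = jack_port_register(client, "out",
        JACK_DEFAULT_AUDIO_TYPE, JackPortIsOutput, 0);
    jack_set_process_callback(client, process, &out_port[c]);
    jack_activate(client);
  }
  for (;;) sleep(1); /* outputs get connected and merged at the Ambix inputs */
}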
Added a compile-time control to set the preferred distance between nodes of different colours, relative to the average distance between nodes evenly spaced on the sphere. At 2.0 the structure is more uniform, at 0.5 it is more clustered with patches of high and low density.
However, the mesh breaks more often when the distance is set too low, and the sound is too loud (and overloads the CPU) when it is set too high.
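The control is just a ratio scaled by the expected node spacing; schematically, with made-up names (the spacing estimate is one rough choice among several):
#include <math.h>
/* preferred distance between different-coloured nodes, as a ratio of the
   average spacing of N nodes spread evenly on the unit sphere */
#define DISTANCE_RATIO 1.0 /* 2.0 -> more uniform, 0.5 -> more clustered */
double average_spacing(int N) {
  return sqrt(4.0 * M_PI / N); /* square root of the area per node */
}
/* potential well centred at DISTANCE_RATIO * average_spacing(N) */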