One of my #video works was exhibited a few months ago, and it was horribly jittery/bad - I supposed at the time it was an incompatibility between the #ffmpeg I used to encode it and the #apple laptop used to play it back.
Tonight I finally found (and fixed) the cause in my own #opengl code: it turns out that
is the wrong order to do those two things in, and will lead to jumbled frame order in the output stream or worse (either undefined behaviour as allowed by the specification, or a driver bug, I haven't dug deeper to know which is the case). The correct order is:
Probably the galaxybrain in this whole thing would be to use an off-screen framebuffer object instead of the default framebuffer, then all windowing system buffer swapping issues are irrelevant.
The broken video: https://mathr.co.uk/meshwalk/.broken/meshwalk-2.0.mp4 (250MB)
The fixed version: https://mathr.co.uk/meshwalk/meshwalk-2.0.mp4 (340MB)
I should probably have called the fixed version "meshwalk-2.1", but I'm too lazy to redo the title screen.
meshwalk-3.0 preview image!
uploaded meshwalk-3.0 video to youtube (I know, bad google, use ad blocker, recommend me a peertube instance or so that would let me upload >1GB highres 360 vids with spatial sound and supports performant in-browser playback? I should probably self-host..)
wrote a blog post about meshwalk-3.0:
source code in my dr1 repository:
my GPU struggles a bit rendering Voronoi cells with 1000 nodes at 8192x4096 (low framerate, 200 nodes was ok), maybe I should upload a spatial data structure... or maybe I overflowed a cache and uploading more data will only make it worse...
Alternatively, how about rendering a low resolution version using naive O(N W H), then refining it using a few O(W H) passes - if the initial pass is high resolution enough to have at least one pixel for each cell (and cells are not too long and thin, hmmm), doubling the resolution needs only the local information around each pixel. Could store (nearest cell center position and index) in a vec4 in each texel, and do a final colouring pass from index to RGB.
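A minimal CPU sketch of that coarse-then-refine idea, under loud assumptions: it works on a flat unit square rather than the sphere, stores only the site index per pixel (not the position in a vec4), and uses a 3x3 parent neighbourhood as the "local information". The function names `nearest`, `coarse_pass` and `refine` are made up for illustration, none of this is the meshwalk shader code.

```python
def nearest(sites, x, y, candidates):
    """Index of the candidate site closest to (x, y), by squared distance."""
    return min(candidates,
               key=lambda k: (sites[k][0] - x) ** 2 + (sites[k][1] - y) ** 2)


def coarse_pass(sites, w, h):
    """Naive O(N W H) pass: test every site at every pixel centre."""
    everything = range(len(sites))
    return [[nearest(sites, (i + 0.5) / w, (j + 0.5) / h, everything)
             for i in range(w)]
            for j in range(h)]


def refine(sites, grid):
    """Double the resolution of an index grid.

    Each fine pixel only tests the sites found in the 3x3 block of coarse
    pixels around its parent (clamped at the edges), so the cost per pixel
    does not depend on the total number of sites.
    """
    h, w = len(grid), len(grid[0])
    fine = []
    for j in range(2 * h):
        row = []
        for i in range(2 * w):
            pi, pj = i // 2, j // 2
            candidates = {
                grid[min(max(pj + dj, 0), h - 1)][min(max(pi + di, 0), w - 1)]
                for dj in (-1, 0, 1)
                for di in (-1, 0, 1)
            }
            x = (i + 0.5) / (2 * w)
            y = (j + 0.5) / (2 * h)
            row.append(nearest(sites, x, y, candidates))
        fine.append(row)
    return fine


# Four sites, one per quadrant: a 4x4 coarse grid already sees every cell.
sites = [(0.25, 0.25), (0.75, 0.25), (0.25, 0.75), (0.75, 0.75)]
coarse = coarse_pass(sites, 4, 4)
fine = refine(sites, coarse)
print(len(fine[0]), len(fine))
```

The refined grid is only exact where the true nearest site already appears somewhere in the parent's 3x3 block, which is the "at least one pixel per cell, cells not too long and thin" condition in disguise.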
With 4000 nodes I needed to increase the neighbourhood size from 8 (this image uses 16) to avoid the breaking/overlap described in the blog post. I suspect an alternative solution is to reduce the node movement speed when I increase the node count, so that it is relative to the cell size.
With this high node count and speed there are interesting emergent dynamics with locally-dense pockets between less-dense regions.
But it doesn't run at a realtime framerate any more: my CPU takes 1m15s to simulate 1min at 60fps, before even doing any rendering. With rendering:
Reducing the neighbourhood count speeds simulation up enough to be realtime, but rendering is the bottleneck.
With a neighbourhood of 12, the simulation runs at 82fps, which is fast enough. Seems not to break, though I only simulated for a few minutes so far.
I thought I had a good algorithm to detect breakage (maximum distance between neighbours) but that doesn't work in all cases (it has to be very broken for that test to trigger).
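For reference, the test described above amounts to something like this sketch. It assumes nodes are stored as 3D points and each node has a list of neighbour indices; the names `max_neighbour_distance` and `looks_broken` and the `limit` parameter are hypothetical, not taken from the meshwalk source.

```python
import math


def max_neighbour_distance(positions, neighbours):
    """Largest straight-line distance between any node and one of its neighbours.

    Raises ValueError if no node has any neighbours.
    """
    return max(math.dist(positions[i], positions[j])
               for i, nbrs in enumerate(neighbours)
               for j in nbrs)


def looks_broken(positions, neighbours, limit):
    """Flag the mesh when some neighbour link is longer than `limit`."""
    return max_neighbour_distance(positions, neighbours) > limit


# Three nodes on the unit sphere; node 2 is linked to the antipodal node 0.
positions = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (-1.0, 0.0, 0.0)]
neighbours = [[1], [0], [0]]
print(looks_broken(positions, neighbours, limit=1.5))
```

A single global maximum like this only reacts once at least one link is stretched past the limit, which fits the observation that the mesh has to be very broken before it triggers.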
Added compile-time control to set the preferred distance between nodes of different colours, relative to the average distance between nodes evenly spaced on the sphere. At 2.0 the structure is more uniform, at 0.5 it is more clustered with patches of high and low density.
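A rough sketch of what "relative to the average distance" could mean, assuming the spacing is estimated by giving each node an equal share of the sphere's area and taking the square root. That estimate, and the names `mean_spacing`, `preferred_distance` and `speed_for`, are assumptions for illustration; the actual formula in the code may differ.

```python
import math


def mean_spacing(n_nodes, radius=1.0):
    """Approximate spacing of n evenly spread nodes: sqrt(sphere area / n)."""
    return radius * math.sqrt(4.0 * math.pi / n_nodes)


def preferred_distance(n_nodes, factor):
    """Preferred distance between differently coloured nodes (factor 0.5 to 2.0 above)."""
    return factor * mean_spacing(n_nodes)


def speed_for(n_nodes, cells_per_step):
    """Node speed expressed in cell sizes, so it shrinks as the node count grows."""
    return cells_per_step * mean_spacing(n_nodes)


for factor in (0.5, 2.0):
    print(factor, preferred_distance(4000, factor))
```

Under this estimate, quadrupling the node count halves the spacing, so a speed tied to it would halve too.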
However the mesh breaks more often when setting the distance too low, and the sound is too loud (and overloading CPU) when setting the distance too high.
(better with headphones)
Instead of generating the Voronoi diagram of 4000 cells at the target output resolution of 8192x4096, I now generate a coarse Voronoi diagram at 512x256, then refine it in 4 resolution-doubling passes, each of which uses a small local neighbourhood around each pixel, before a final colouring pass.
Now renders at 35fps, which is a big improvement from 1.3fps.
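A back-of-envelope count of distance tests shows where the saving comes from. The figure of 9 candidates per pixel in the refinement passes is an assumption (the real neighbourhood size isn't stated here), and the helper names are hypothetical.

```python
def pass_sizes(w, h, doublings):
    """Resolutions of the refinement passes, each double the previous one."""
    return [(w << k, h << k) for k in range(1, doublings + 1)]


def distance_tests(n_sites, w, h, doublings, candidates_per_pixel):
    """Coarse brute-force pass plus refinement passes with a fixed local cost."""
    coarse = n_sites * w * h
    refinement = sum(pw * ph * candidates_per_pixel
                     for pw, ph in pass_sizes(w, h, doublings))
    return coarse + refinement


naive = 4000 * 8192 * 4096  # every site tested at every output pixel
staged = distance_tests(4000, 512, 256, 4, candidates_per_pixel=9)

print(pass_sizes(512, 256, 4))  # last entry is the 8192x4096 target
print(naive, staged, naive / staged)
```

The measured speedup (1.3fps to 35fps) is smaller than this count alone would suggest, so memory traffic and the extra passes presumably cost something too.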
The bottleneck is now the sound: I can't render the 4k oscillators in real time. I wonder if I can use the magic of JACK (jackdmp version) multi-core support to split my embarrassingly parallel audio code into several JACK clients and merge their outputs at the input of Ambix. Need to read up on it, don't know how jackdmp works exactly...
@mathr looking forward to your findings.
> jackdmp use a new client activation model that allows simultaneous client execution (on a smp machine) when parallel clients exist in the graph (client that have the same inputs). This activation model allows to better use available CPU on a smp machine, but also works on mono-processor machine.
I guess that means I can have 4 clients, each with no inputs, that will execute in parallel. Will try it!
@paul it works! meshwalk now creates 4 JACK clients each using ~30% of a CPU core for 1000 oscillators, no XRUNs.
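Why the split is safe, as a toy sketch: the mix is just a sum, so each client can sum its own slice of oscillators and the partial outputs add up to the same signal. This is plain Python with no JACK calls; the frequencies, the single sample time and the names `split_evenly` and `mix` are invented for illustration.

```python
import math


def split_evenly(n_items, n_parts):
    """Contiguous (start, stop) ranges covering n_items in n_parts chunks."""
    base, extra = divmod(n_items, n_parts)
    bounds, start = [], 0
    for part in range(n_parts):
        stop = start + base + (1 if part < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def mix(freqs, t):
    """One output sample: the sum of unit-amplitude sine oscillators at time t."""
    return sum(math.sin(2.0 * math.pi * f * t) for f in freqs)


freqs = [100.0 + k for k in range(4000)]  # made-up oscillator bank
chunks = split_evenly(len(freqs), 4)      # one chunk per client
partials = [mix(freqs[a:b], 0.001) for a, b in chunks]

print(chunks)
print(sum(partials), mix(freqs, 0.001))
```

In the real setup the summing happens in the JACK graph where the client outputs are merged, not in the clients themselves.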