may start late this year (or even be cancelled) as I'm between renderers (the new one I'm working on is so much more efficient than the old ones that I don't want to use the old ones, and the new one isn't ready yet)

Mandelbrot set acceleration 

At a "hard" location suggested by Fractal Universe on the forums, the BLA method finished in 37seconds while when I stopped KF after an hour it had only reached 2% progress on the pixel iterations, and that was with the iteration count reduced from 2100100100 to 500100100 as otherwise it ran out of memory on my 32GB desktop...

This new method is magic, massive thanks to Zhuoran on the forum for dropping the two hints (rebasing when |Z+z|<|z| and using bivariate linear interpolation), albeit without great detail or source code for the latter.

Putting on hold proof of correctness using error bounds as I don't understand it enough (and hoping someone else can do it and/or teach me).

Next to investigate is applying the method to other escape time fractals like the Burning Ship. I suppose A and B would become 2x2 real matrices instead of complex numbers, not sure what matrix norm to use when calculating the safe radius. Needs to take into account distance from axis for abs-folding, as well as from origin for squaring nonlinearity...

Show thread

Mandelbrot set acceleration 

turns out the glitchy images I was getting were down to bad assumptions when looking up the data in the tables, adding extra data to the tables helped me diagnose where indexes where getting mixed up.... very lots of time wasted thinking I had bad "it's ok to do linear here" radius calculations....

rough benchmarks at 1920x1080:

"Flake" by Dinkydau
kf-2.15.4 : 5.1s
kf-2.15.5pre : 2.6s
bla : 0.5s

"Evolution of Trees" by Dinkydau
kf-2.15.4 : 1m21s
kf-2.15.5pre : 37s
bla : 2.4s

this is just with the linear approximation without error control, so results may turn out bad for some locations. the backward error analysis stuff is something I may revisit later (must remember to use relative error in output and absolute error in input: relative error in input is no good if inputs -> 0 like c does at the reference)

Show thread

Mandelbrot set acceleration 

Better is merging BLAs with the quadratic terms too so the backwards error radius can be computed. I think I'm doing it wrong because it's only valid for a narrow |c| range instead of all c in image...

Show thread

exhibition opening tonight!

Friday 26th November 2021 7pm

73 Tontine St
CT20 1JR,
United Kingdom

computer-based works by me
curated by @netzzz

Mandelbrot set acceleration 

Maybe it would work to have a merging strategy.

For example, 1-step BLA is easy to compute. Merging neighbouring BLAs is possible:

ZA z + ZB c = YA(XA z + XB c) + YB c




Z skip = X skip + Y skip


Z radius = min(X radius, Y radius / (|XA| + 1))

is probably wrong but I would need paper to work it out...

Then build tables of M 1-steps, M/2 2-steps, M/4 4-steps etc by merging neighbouring BLAs without overlaps. So the total table size is O(M), which equals the work needed to construct it. Much better than O(M^2)

When iterating pixels, try larger steps (with smaller radius) first if applicable, which means at most log M to search through. This was O(M) in the previous implementation. It may take O(log L) steps to skip L iterations which previously could have been 1 step, but the reduction in search time may be worth it.

Show thread

Mandelbrot set acceleration 

Seems the table gets fully populated even when computed on demand.

The backwards error control is expensive but seems to work (correct image first try); but skipping is pessimised: now takes 7mins vs 20secs with the earlier version with fudge factor to avoid bad images.

Wondering if there is some dynamic programming technique to apply to reduce required work from O(M^2) to O(M log M).

For example product_m^n 2z_i = product_1^n 2z_i / product_1^{m-1} 2 z_i. so a precomputed table of running products costing O(M) could reduce cost of sub products from O(n-m) to O(1).

Iteration of B -> 2 Z_i B + 1 starting from m until n gives
sum_k=m^{n+1} product_j=k^n 2Z_j
where the empty product gives 1.
Applying the previous identity, this is
Sum_k=m^n+1 product_j=1^n 2z_j / product_j=1^k-1 2z_j
(product_j=1^n 2z_j) (sum_k=m^n+1 1/product_j=1^k-1 2zj)
P[n] (S[n+1] - S[m-1])
P[n] = 2 Z_n P[n-1]
S[n] = S[n-1] + 1/P[n-1]
can be computed in O(M)

Show thread

Mandelbrot set acceleration 



|z| := cut off radius
m := starting iteration
h := image size in pixels
k := pixel spacing
A := 1
B := 0
D := 0
E := 0
F := 0
l := 0
r := 4
while (m + l < M)
Z := Z[m + l]
A' := 2 Z A
B' := 2 Z B + 1
D' := 2 Z D + A^2
E' := 2 (Z E + A B)
F' := 2 Z F + B^2
R := max{ |D|, |E|/2, |F| }
a := R h
b := R h^2 k
c := R h^3 k^2 - |B| h k
r' := (-b +/- sqrt(b^2 - 4 a c)) / 2 a
if r' < r
Table[m].insert_front{l, r, A, B)
if r < |z|
r := r'
A := A'
B := B'
D := D'
E := E'
F := F'
l := l + 1


- doing table computation up front with |z| = 0 takes O(M^2) time, which is infeasible for large M (even though it can be parallelized)

- doing table computation on demand makes parallelism much more complicated, but subsequent read-only accesses should be fast/simple - this is ok as long as Write Once Read Many applies)

something like this perhaps

T := atomic read Table[m]
if T == nullptr
acquire mutex lock for Table[m]
T := atomic read Table[m]
if T == nullptr
// we are the first thread
T = compute table
Table[m] := atomic write T
// other thread got there first
unlock mutex for Table[m]
use T

...(3/3) (maybe more later when I've actually implemented all of it)

Show thread

Mandelbrot set acceleration 


h := image size in pixels
k := pixel spacing
so c is bounded by h k

R (|z|^2 + |z| |c| + |c|^2) / (|B| |c|) < 1 / h
R h |z|^2 + R h |c| |z| + R h |c|^2 - |B| |c| < 0
which is quadratic in |z|, so we can solve for |z| < r

want to find r such that this is valid for any |c| < h k, so can compute only once per image. then for all |z| < r, error in z -> A z + B c is small.


C := view center (high precision)
M := reference count limit
N := iteration count limit
Zoom := magnifaction factor relative to [-2,2] x [-2,2]


Z[M] low precision array
Z := 0 (high precision)
m := 0
while (|Z| < 2 && m < M)
Z[m] := round(Z)
Z := Z^2 + c
m := m + 1


struct TableItem
int k
real r
complex A, B
Table[M] : list<TableItem>


c := pixel coordinates (relative to C) * pixel_spacing
z := 0
n := 0
m := 0
while (|Z[m] + z| < 2 && n < N)
// perturbed iteration
z := (2 Z[m] + z) z + c
n := n + 1
m := m + 1
// rebase
if (|Z[m] + z| < z)
z := Z[m] + z
m := 0
if no Table[m]
// compute Table[m] until r < |z|
search Table[m] for smallest T->r > |z|
if T->k > 0
z := T->A * z + T->B * c
n := n + T->k
m := m + T->k
// rebase
if (|Z[m] + z| < z)
z := Z[m] + z
m := 0


Show thread

Mandelbrot set acceleration 

Z -> Z^2 + C
high precision reference orbit

z -> (2 Z+ z) z + c
low precision deltas relative to high precision orbit

|Z + z| < |z|
when this occurs, replace z with Z + z and reset the reference iteration count to 0 (this avoids glitches)

z -> A z + B c
A -> 2 Z A
B -> 2 Z B + 1
bivariate linear approximation to skip many iterations at once

|Z + z| >> |z|
|z| << (|B| |c| - |Z|) / (|A| + 1)
stopping condition based on ball arithmetic

flaw: this needs an arbitrary "fudge factor", 0.1 works for Dinkydau's Flake, 0.01 is required for Dinkydau's Evolution of Trees

solution (untested so far)
z -> A z + B c + D z^2 + E zc + F c^2
D -> 2 Z D + A^2
E -> 2 (Z E + A B)
F -> 2 Z F + B^2
R = max{ |D|, |E|/2, |F| }
z = A z + B c + R (|z^2| + |zc| + |c^2|)
Taylor remainder bound

z = (1 + e_z) (A z + B c)
= A z + B (1 + e_c) c
e_z =R (|z^2| + |zc| + |c^2|) / (A z + B c)
e_c = (A z + B c) e_z / (B c)
= R (|z^2| + |zc| + |c^2|) / (B c)
Backward error

For good enough image, want (1 + e_c) c to be in the same pixel as c: (1 + e_c) c < c + pixel_spacing so e_c < pixel_spacing / c. Now c is bounded by image size in pixels * pixel spacing, so want e_c < 1 / (image size in pixels)


I found a nice reference for methods:

> Rigorous Analysis of Nonlinear Motion in Particle Accelerators
> Kyoko Makino
> PhD thesis
> Department of Physics and Astronomy
> Michigan State University
> 1998

It uses regular real box intervals (two endpoints), but wasn't too hard to convert to complex ball intervals (midpoint and radius).

My code:

I implemented knighty's root finding algorithm (in the context of period detection in the set) on top of the primitive operations from the thesis, and now I think I understand how it works a bit more. I have to say knighty's original code that I copy/pasted was rather impenetrable:

Show thread

I wrote exrmean because I found a way to emulate Fragmentarium-style multiple with : render a zoom out sequence with the zoom size set to 1 (so each image shows the same location).

With jitter enabled, the pseudorandom seed for subpixel offsets is incremented each frame, so the images will be slightly different.

Averaging them all (in linear light, as used in EXR) will give a final image with higher quality than any individual frame, without having needed to render a huge image and downscale it (the previous option).

Enable "reuse reference" to avoid calculating it each subframe (unfortunately the series approximation is recalculated, which is not optimal).

Probably better than using vast amounts of temporary space for EXR files in this method would be to extend KF to do the subframe rendering itself. That'd be much more work than this quick hack though.

Show thread

I updated with a new tool exrmean to find the of a sequence of image files. Only works with float16 (half precision) R G B data, other channels are ignored. All images must have the same dimensions. Accumulation is done in float32 (single precision), output is float16 again.

Show thread
Show older

Welcome to, an instance for discussions around cultural freedom, experimental, new media art, net and computational culture, and things like that.

<svg xmlns="" id="hometownlogo" x="0px" y="0px" viewBox="25 40 50 20" width="100%" height="100%"><g><path d="M55.9,53.9H35.3c-0.7,0-1.3,0.6-1.3,1.3s0.6,1.3,1.3,1.3h20.6c0.7,0,1.3-0.6,1.3-1.3S56.6,53.9,55.9,53.9z"/><path d="M55.9,58.2H35.3c-0.7,0-1.3,0.6-1.3,1.3s0.6,1.3,1.3,1.3h20.6c0.7,0,1.3-0.6,1.3-1.3S56.6,58.2,55.9,58.2z"/><path d="M55.9,62.6H35.3c-0.7,0-1.3,0.6-1.3,1.3s0.6,1.3,1.3,1.3h20.6c0.7,0,1.3-0.6,1.3-1.3S56.6,62.6,55.9,62.6z"/><path d="M64.8,53.9c-0.7,0-1.3,0.6-1.3,1.3v8.8c0,0.7,0.6,1.3,1.3,1.3s1.3-0.6,1.3-1.3v-8.8C66,54.4,65.4,53.9,64.8,53.9z"/><path d="M60.4,53.9c-0.7,0-1.3,0.6-1.3,1.3v8.8c0,0.7,0.6,1.3,1.3,1.3s1.3-0.6,1.3-1.3v-8.8C61.6,54.4,61.1,53.9,60.4,53.9z"/><path d="M63.7,48.3c1.3-0.7,2-2.5,2-5.6c0-3.6-0.9-7.8-3.3-7.8s-3.3,4.2-3.3,7.8c0,3.1,0.7,4.9,2,5.6v2.4c0,0.7,0.6,1.3,1.3,1.3 s1.3-0.6,1.3-1.3V48.3z M62.4,37.8c0.4,0.8,0.8,2.5,0.8,4.9c0,2.5-0.5,3.4-0.8,3.4s-0.8-0.9-0.8-3.4C61.7,40.3,62.1,38.6,62.4,37.8 z"/><path d="M57,42.7c0-0.1-0.1-0.1-0.1-0.2l-3.2-4.1c-0.2-0.3-0.6-0.5-1-0.5h-1.6v-1.9c0-0.7-0.6-1.3-1.3-1.3s-1.3,0.6-1.3,1.3V38 h-3.9h-1.1h-5.2c-0.4,0-0.7,0.2-1,0.5l-3.2,4.1c0,0.1-0.1,0.1-0.1,0.2c0,0-0.1,0.1-0.1,0.1C34,43,34,43.2,34,43.3v7.4 c0,0.7,0.6,1.3,1.3,1.3h5.2h7.4h8c0.7,0,1.3-0.6,1.3-1.3v-7.4c0-0.2,0-0.3-0.1-0.4C57,42.8,57,42.8,57,42.7z M41.7,49.5h-5.2v-4.9 h10.2v4.9H41.7z M48.5,42.1l-1.2-1.6h4.8l1.2,1.6H48.5z M44.1,40.5l1.2,1.6h-7.5l1.2-1.6H44.1z M49.2,44.6h5.5v4.9h-5.5V44.6z"/></g></svg>