Installed AMD's OpenCL implementation (minus the kernel DKMS stuff) from their repository (something ROCm-3.8 or similar). The `clinfo` it bundled in /opt worked great, but `/usr/bin/clinfo` would not find the AMD implementation, and neither would most other OpenCL applications. The solution was to create a text file under `/etc` pointing to the relevant shared library:

```
$ cat /etc/OpenCL/vendors/amdocl64.icd
/opt/rocm-3.8.0/opencl/lib/libamdocl64.so
```
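For checking this kind of setup, here is a minimal sketch that lists the `.icd` files in a vendors directory and reports whether the library each one names actually exists. It assumes the usual Linux ICD loader convention (one library path or name on the first line of each file); `scan_icd_dir` and `IcdEntry` are hypothetical names made up for illustration, and it needs C++17 for `std::filesystem`.

```cpp
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// One .icd file and the library it names (hypothetical helper type).
struct IcdEntry {
    fs::path icd_file;     // e.g. /etc/OpenCL/vendors/amdocl64.icd
    std::string library;   // first line of the file
    bool library_found;    // true only for absolute paths that exist
};

// Read every *.icd file in `dir` and check the library path it contains.
std::vector<IcdEntry> scan_icd_dir(const fs::path &dir) {
    std::vector<IcdEntry> entries;
    std::error_code ec;
    if (!fs::is_directory(dir, ec)) return entries;
    for (const auto &item : fs::directory_iterator(dir, ec)) {
        if (item.path().extension() != ".icd") continue;
        std::ifstream in(item.path());
        std::string line;
        std::getline(in, line);
        // Drop trailing whitespace such as a stray '\r' or spaces.
        while (!line.empty() &&
               (line.back() == '\r' || line.back() == ' ' || line.back() == '\t')) {
            line.pop_back();
        }
        // A bare library name is resolved by the dynamic loader at run time,
        // so only absolute paths can be verified here.
        const fs::path lib(line);
        const bool found = !line.empty() && lib.is_absolute() && fs::exists(lib, ec);
        entries.push_back({item.path(), line, found});
    }
    return entries;
}
```

Calling `scan_icd_dir("/etc/OpenCL/vendors")` and printing the entries with `library_found == false` would point at a stale or mistyped path; an empty result suggests no vendor files are registered at all.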

It seems something is not happy when I do OpenCL in a different thread: I get random problems that seem to go away if I run it on the main thread (which harms user interface responsiveness, but so be it).

Rough timings are that my AMD Radeon RX 580 GPU running OpenCL is about the same overall speed in this workload as my AMD Ryzen 2700X CPU running compiled C++, even with all the back and forth host<->device memory copies that I haven't optimized yet.

A generic C++ build on an Intel Core2Duo runs at about half the speed of the POCL CPU OpenCL implementation. Maybe I can drop my attempts at SIMD (which benefit strongly from non-portable `-march=native`, as I haven't figured out runtime CPU detection and compiling multiple versions) and punt that to the runtime compiler(s). A fallback in case of no OpenCL might still be handy though...
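On the runtime CPU detection mentioned above, one possible approach (a sketch, not what this project does) is to compile the same function twice with different per-function target options and pick one at startup. This assumes GCC or Clang on x86, where `__attribute__((target("avx2")))` and `__builtin_cpu_supports` are available; `sum_squares_*` and `pick_sum_squares` are hypothetical names for illustration, and other compilers or architectures simply get the generic version.

```cpp
#include <cstddef>

// Portable baseline, compiled with whatever flags the build uses.
static double sum_squares_generic(const double *x, std::size_t n) {
    double s = 0.0;
    for (std::size_t i = 0; i < n; ++i) s += x[i] * x[i];
    return s;
}

#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
// Same source, but the compiler is allowed to emit AVX2 code for this
// function only, so the rest of the binary stays portable.
__attribute__((target("avx2")))
static double sum_squares_avx2(const double *x, std::size_t n) {
    double s = 0.0;
    for (std::size_t i = 0; i < n; ++i) s += x[i] * x[i];
    return s;
}
#define HAVE_AVX2_VARIANT 1
#endif

using sum_squares_fn = double (*)(const double *, std::size_t);

// Choose an implementation once, based on what the running CPU supports.
sum_squares_fn pick_sum_squares() {
#ifdef HAVE_AVX2_VARIANT
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2")) return sum_squares_avx2;
#endif
    return sum_squares_generic;
}
```

The function pointer would typically be selected once at startup and reused, so the detection cost is not paid per call. This also keeps a plain fallback around for machines without the extension, much like keeping a non-OpenCL path.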

A build from clean takes ~45 mins wall-clock time on my ancient laptop. A large chunk of that is spent compiling `formula.cpp` with `-DPASSA`, which corresponds to "perturbation with SIMD and derivatives". Will rip out all the SIMD things now and see how much it improves.

Turns out I just needed to `make SIMD=0` with no code changes necessary. Now build from clean takes ~15mins, which is still long but most rebuilds during development hopefully don't need to recompile all of it.
