AFAICT this does not quite produce true binaries, but rather interprets C++ via Cling, is that right? And the docs only claim that C++-like speeds are achieved for PyPy. If there are any performance benchmarks for CPython 3, I can't find them. That's the real question - few people combine Python and C++ just for the fun of it.
EDIT: some benchmarks are available in this paper, linked from TFA: https://wlav.web.cern.ch/wlav/Cppyy_LavrijsenDutta_PyHPC16.p... But they don't really answer my question. The benchmarks mostly measure the overhead of wrapping C++ rather than comparing against a pure Python implementation. Some of them involve I/O, which is maybe not so interesting, and some don't even have a pure CPython implementation. Where they do, the speeds are very close. But then the paper is from 2016, and a lot may have changed.
I suggest having a look at the pyproject and src/faebryk/core/cpp.
[0] https://github.com/wjakob/nanobind [1] https://github.com/atopile/atopile
This only goes so far - if you try to bind, e.g., O(10k) methods using nanobind (or pybind11, for that matter) you will be compiling for a very long time. For example, I have a Threadripper, and with a single large TU (translation unit) the build took about 60 minutes (because a single TU compiles on a single thread). I had to "shard" my nanobind source to get it down to a "reasonable" ~10 minutes.
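For reference, the usual way to shard is to keep the `NB_MODULE` entry point in one small TU and spread the actual `def` calls across many TUs that compile in parallel. A minimal sketch, assuming nanobind; the names (`ext`, `bind_core`, `bind_util`) are hypothetical:

```cpp
// bindings_core.cpp -- one shard, its own translation unit
#include <nanobind/nanobind.h>
namespace nb = nanobind;

void bind_core(nb::module_ &m) {
    m.def("add", [](int a, int b) { return a + b; });
    // ...hundreds or thousands more defs live here...
}

// bindings_util.cpp would define bind_util(nb::module_ &) the same way.

// module.cpp -- the only TU that defines the module itself
void bind_core(nb::module_ &);
void bind_util(nb::module_ &);

NB_MODULE(ext, m) {
    bind_core(m);
    bind_util(m);
}
```

Each shard is an independent TU, so `make -j` (or ninja) can compile them on separate cores.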
If k means 1000 then O(10000) is the same as O(1). Perhaps you meant "approximately 10k"?
Thanks!
Then you sprinkle some cdef around and you get a bit faster again. You rewrite your algorithm so it's more "stateful C" style, which is not so much the Python way, and it gets a little faster. But not that much.
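The "sprinkle some cdef" stage typically looks something like this (a hypothetical hot loop, not from any particular codebase): C-typed locals and a typed memoryview keep the loop from boxing every value into a Python object.

```cython
# Cython: typed locals avoid creating a Python int/float per iteration.
def total(double[:] xs):            # typed memoryview, e.g. over a NumPy array
    cdef double acc = 0.0
    cdef Py_ssize_t i
    for i in range(xs.shape[0]):
        acc += xs[i]                # C-level indexing and addition
    return acc
```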
So then, to make real gains, you have to go into the weeds of what is going on. Look at the Cython bottlenecks, usually the spots where Cython has to fall back to interacting with the Python interpreter. You may go down the rabbit hole of Cython directives, switching off things like overflow checks etc. IME this is a lot of trial and error and isn't always intuitive. All of this is done in a language that, by this point, is superficially similar to Python but might as well not be. The slowness no longer comes from algorithmic logic or Python semantics but from the places where Cython escapes out to the Python interpreter.
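Concretely, those directives can be set per file or per function; the sketch below shows both forms (the function itself is just an illustrative dot product):

```cython
# cython: boundscheck=False, wraparound=False, overflowcheck=False
# File-level directives above; the decorators below do the same per function.
# Each disabled check removes a spot where Cython would escape to the interpreter.
cimport cython

@cython.boundscheck(False)   # no bounds check on a[i], b[i]
@cython.wraparound(False)    # no negative-index handling
cdef double dot(double[:] a, double[:] b):
    cdef double acc = 0.0
    cdef Py_ssize_t i
    for i in range(a.shape[0]):
        acc += a[i] * b[i]
    return acc
```

Running `cython -a` on the file highlights the remaining lines that still call into the interpreter, which is the usual way to find the next bottleneck.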
At this point, C++ may offer a respite, if you are familiar with the language, because the performance tradeoffs are right there in the code in front of you. You get no head start in terms of Pythonic syntax, but otherwise you are writing pure C++, and it's much easier to reason about the performance.
I would imagine that very well written Cython is close in performance to C++, but for someone who knows a bit of C++ and only occasionally writes Cython, C++ is much easier to make fast.
> AFAICT this does not quite produce true binaries, but rather interprets C++ via Cling, is that right?
Yes, but to be very clear: it's not designed to interpret arbitrary C++, just calls, constructors, and field accesses. The point is binding. Also, it can use either Cling or clang-repl.
Cppyy – Automatic Python-C++ bindings - https://news.ycombinator.com/item?id=19848450 - May 2019 (22 comments)