everyone knows that writing assembly is a fool's errand
I think this is misrepresenting the advice. I would argue the following:
Writing your whole program in assembly typically won't result in faster code than C or Rust. This is because well-written, readable, maintainable assembly will usually be slower than what a compiler produces. Even if you try to be fairly clever, the compiler will almost always do a better job unless you are taking the time to carefully profile every line that you write.
The compiler will evolve over time; your hand-written assembly will not. So even if your assembly is faster initially, you will need to revisit it as hardware evolves.
Obviously you will need different assembly for every instruction set.
I don't think anyone ever said "don't try to optimize small sections of code, you won't beat the compiler". Of course you can beat the compiler. But it will require significant upfront and maintenance cost to beat the compiler over time. That cost isn't worth it for 99.9% of code. But when applied judiciously, hand-written assembly can deliver improvements where it matters.
The conclusion should be: start by writing everything in a high-level language. Then optimize your algorithms and eliminate performance bugs. Then, once you have picked the low-hanging fruit, consider spending the time to profile and optimize your hottest code in assembly.
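To make the "profile first" step concrete, here is a rough sketch in Python using the standard cProfile and pstats modules. The functions parse, checksum, pipeline and hottest_function are made-up names for illustration, and the workload is a toy; the only point is that you measure where the time goes before deciding what, if anything, deserves hand-tuning.

```python
import cProfile
import pstats


def parse(n):
    # Hypothetical cheap stage: builds the input list.
    return list(range(n))


def checksum(values):
    # Hypothetical expensive stage: deliberately quadratic work.
    total = 0
    for a in values:
        for b in values:
            total += (a ^ b) & 1
    return total


def pipeline(n):
    # The "whole program": a cheap stage feeding an expensive one.
    return checksum(parse(n))


def hottest_function(func, *args):
    """Profile func(*args) and return the name of the function
    that accumulated the most self time (time excluding callees)."""
    profiler = cProfile.Profile()
    profiler.enable()
    func(*args)
    profiler.disable()
    stats = pstats.Stats(profiler)
    # stats.stats maps (file, line, name) -> (cc, nc, tottime, cumtime, callers)
    (_, _, name), _ = max(stats.stats.items(), key=lambda item: item[1][2])
    return name


if __name__ == "__main__":
    print(hottest_function(pipeline, 300))
```

Only the function this reports is a candidate for the expensive treatment, and even then a better algorithm is usually the first thing to try.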
I did a bunch of other experiments, which didn't make things faster:
Also particularly interesting: what didn't work.
They have the blog post date in the title, but I don't see it on the page, neither in the header nor at the bottom.