I have a similar build for similar reasons. It works great, though I use Windows, so no driver issues (VR introduces too much jank with Linux). Notes below.
CUDA is essential. Definitely the right call paying the Nvidia tax.
My Gigabyte 4090 works for LLM stuff without a second card, and has no coil whine I can hear. I use Alpaca 4-bit entirely in VRAM, and SDXL runs like a dream. I only have 48GB of RAM total, but VRAM is pretty much always the limiting factor. If I understand correctly, it works best when you have at least as much spare RAM as VRAM while you’re loading the model, but after that the computation is all on the GPU if you have a 4090. Moving layers to the CPU/RAM drops performance fast. I have an A4000 in another machine that I was planning to add with a riser cable, but I haven’t bothered because I didn’t end up needing it.
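For a rough sense of why VRAM is the limit and why 4-bit matters, here's a back-of-the-envelope sketch. It only counts the weights (parameters times bits per weight) and adds a flat headroom for context and activations. The 2 GiB headroom and the 7B/13B/30B sizes are my own illustrative assumptions, not measurements, and real usage varies by loader and context length.

```python
def weight_gib(params_billion: float, bits: int) -> float:
    """Approximate size of the model weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits / 8
    return total_bytes / 2**30


def fits_in_vram(params_billion: float, bits: int, vram_gib: float,
                 headroom_gib: float = 2.0) -> bool:
    """True if weights plus an assumed flat headroom fit in VRAM.

    headroom_gib is a guess standing in for KV cache, activations,
    and framework overhead; tune it for your own setup.
    """
    return weight_gib(params_billion, bits) + headroom_gib <= vram_gib


if __name__ == "__main__":
    VRAM_GIB = 24  # a 4090 has 24 GB of VRAM
    for params in (7, 13, 30):      # illustrative model sizes, in billions
        for bits in (16, 8, 4):     # fp16, 8-bit, 4-bit
            size = weight_gib(params, bits)
            verdict = "fits" if fits_in_vram(params, bits, VRAM_GIB) else "too big"
            print(f"{params}B @ {bits}-bit: {size:.1f} GiB weights -> {verdict}")
```

By this estimate a 30B model only fits on a single 24GB card at 4-bit, which is roughly why quantization took so much pressure off the hardware requirements.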
Leaving the upgrade path open is a solid choice. The space is so volatile that it’s impossible to predict what the requirements will be like in six months. They could even go down, like they did when 4-bit quantization arrived.
I use an external DAC, so I can’t speak to the whine there. External DACs aren’t that expensive, though.
This is based on a misunderstanding of how prices are set. The price is set based on what the market can bear. Costs pretty much only determine if the thing is worth making, given that.
It’s the same reason rent doesn’t go down when property taxes do. I mention this not to tear you down, but because it’s a common argument for bad policy.