PSA: There is a super easy-to-use app called "GPT4ALL" that lets literally anyone run an open-source LLM on their local machine (as long as you don't go crazy and try to run a huge model, that is).
Smaller models run at pretty good speeds on my RX 5700 XT. Yes, you read that right: it can use Vulkan to run them on an AMD GPU.
Tested on a laptop with a 4 GB RTX 2050, works perfectly fine. Also tested on an i5-3570 & RX 480 machine, works fine too.
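For anyone who'd rather script it than click around in the GUI, GPT4All also ships a `gpt4all` pip package that uses the same backend. Rough sketch below; the model filename is just an example (use whatever you downloaded in the app), and `device="gpu"` is what kicks in the Vulkan path, including on AMD cards:

```python
def chat_once(prompt: str,
              model_name: str = "Llama-3.2-1B-Instruct-Q4_0.gguf") -> str:
    """Load a local GGUF model via GPT4All and answer one prompt."""
    # pip install gpt4all  (import kept local so the helper is cheap to define)
    from gpt4all import GPT4All

    # "gpu" selects a Vulkan-capable device; falls back to CPU if unavailable.
    # model_name is an example filename, not a guarantee it exists on your disk.
    model = GPT4All(model_name, device="gpu")
    with model.chat_session():
        return model.generate(prompt, max_tokens=128)

if __name__ == "__main__":
    print(chat_once("Say hello in five words."))
```

First run will download the model if it isn't already in GPT4All's model folder, so expect a wait on small-VRAM machines.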
What are your settings and setup?