FSR 4 has been in development for 9-12 months already, and one of the biggest focuses is improving battery life for handhelds
tomshardware.com
He specifically cited bad battery life on the ROG Ally and Lenovo Legion Go, saying that getting only one hour of battery life isn't enough. The Steam Deck (especially the OLED model) does a lot better battery-wise, but improving power efficiency should really help with any games that max out the Deck's power budget.
Here is my view and a small timeline:
One technical reason why FSR 1 isn't very good but works in everything is that it is the only version that just takes your current frame and upscales it. All the newer ones are temporal, like TAA, and use data from multiple previous frames.
Very simplified, they "jiggle" the camera each frame to a different position so that they can gather extra data to use, but that requires being implemented in the game engine directly.
Kind of.
The big thing that actually defines FSR2 is that it has access to a bunch more data, particularly the depth buffer, motion vectors, and also, as you said, uses data from previous frames.
The camera jiggle is mostly just to avoid shimmering when the camera is stationary.
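To make the "jiggle" a bit more concrete, here is a minimal sketch of how per-frame sub-pixel offsets can be generated. It uses a Halton(2, 3) low-discrepancy sequence, which is a common choice for temporal techniques. The function names and the [-0.5, 0.5) pixel range are my own assumptions for illustration, not taken from any FSR source code.

```python
def halton(index: int, base: int) -> float:
    """Return the index-th (1-based) element of the Halton sequence for `base`."""
    result = 0.0
    fraction = 1.0 / base
    while index > 0:
        # Take the next digit of `index` in the given base and mirror it
        # around the radix point.
        result += (index % base) * fraction
        index //= base
        fraction /= base
    return result


def jitter_offsets(count: int) -> list[tuple[float, float]]:
    """Hypothetical helper: one (x, y) sub-pixel offset per frame.

    Offsets are in pixel units, centred on zero, so each value lies in
    [-0.5, 0.5). Base 2 drives x and base 3 drives y.
    """
    return [(halton(i, 2) - 0.5, halton(i, 3) - 0.5) for i in range(1, count + 1)]


if __name__ == "__main__":
    for frame, (dx, dy) in enumerate(jitter_offsets(8)):
        print(f"frame {frame}: shift camera by ({dx:+.3f}, {dy:+.3f}) px")
```

Each frame samples a slightly different spot inside every pixel, so over several frames the upscaler accumulates more detail than a single frame contains, even when the camera is not moving.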
I am curious as to why they would offload any AI tasks to another chip? I just did a super quick search for upscaling models on GitHub (https://github.com/marcan/cl-waifu2x/tree/master/models) and they are tiny as far as AI models go.
It's the rendering bit that takes all the complex maths, and if that is reduced, it would leave plenty of room for running a baby AI. Granted, the method I linked to was only doing 29k pixels per second, but they said it wasn't GPU optimized. (FSR4 is going to be fully GPU optimized, I am sure of it.)
If the rendered image has only 85% of the pixels of a 4K image (3840x2160, about 8.3 million pixels), that leaves roughly 1.2 million pixels per frame that need to be computed by the upscaler, and it still seems plausible to keep everything on the GPU.
With all of that blurted out, is FSR4 AI going to be offloaded to something else? It seems like there would be significant technical challenges in creating another data bus that would also have to sync with memory and the GPU, so that AI compute could be offloaded at speeds that don't risk creating additional lag. (I am just hypothesizing, btw.)
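For anyone who wants to check the pixel arithmetic in the comment above, here is a quick back-of-the-envelope sketch. The 60 fps target is my own assumption (the comment doesn't give a frame rate), and the 85% figure is treated as a share of total pixels rather than a per-axis render scale.

```python
# 4K UHD resolution.
UHD_WIDTH, UHD_HEIGHT = 3840, 2160

# Assumption from the comment: 85% of the pixels are rendered natively,
# so the remaining 15% have to be produced by the upscaler.
UPSCALED_PERCENT = 15

# Assumption (not in the comment): a 60 fps target.
TARGET_FPS = 60

# Throughput quoted in the comment for the non-GPU-optimized upscaler.
QUOTED_PIXELS_PER_SECOND = 29_000

total_pixels = UHD_WIDTH * UHD_HEIGHT
upscaled_per_frame = total_pixels * UPSCALED_PERCENT // 100
upscaled_per_second = upscaled_per_frame * TARGET_FPS
required_speedup = upscaled_per_second / QUOTED_PIXELS_PER_SECOND

if __name__ == "__main__":
    print(f"total 4K pixels:          {total_pixels:,}")
    print(f"upscaled pixels/frame:    {upscaled_per_frame:,}")
    print(f"upscaled pixels/second:   {upscaled_per_second:,}")
    print(f"speedup needed vs. 29k/s: about {required_speedup:,.0f}x")
```

Under those assumptions the upscaler has to fill in about 1.24 million pixels per frame, or roughly 75 million per second, which is a few thousand times the 29k pixels per second quoted for the unoptimized implementation.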
The thing with “AI” or better still, ML cores, is that they’re very specialized. Apple has been slapping ML cores into all of their chips since the iPhone 8 not because they are super powerful, but because they can do some things (that the hardware would have no problem doing anyway) while sipping power. Don’t think of AI in terms of the requirements of huge LLMs like ChatGPT that need data centers; think of it like a hardware video decoder. With the decoder, a device can easily play 1080p video; going with raw CPU power instead, it might manage 480p. It’s why you can watch hours of video on your phone, but try doing anything that hits the CPU and the battery melts.
Edit: my example has been bothering me for days now. I want to clarify to avoid any possible misunderstanding that hardware video decoding has nothing to do with AI, it’s just another very specialized chip.
Well, Nvidia and Intel do that too, and I think Sony added an AI chip to the PS5 Pro for their new AI upscaler as well. We can already run AI calculations on our GPUs without AI acceleration, but that is not as fast. I have no numbers for you, only the logic that software optimized for dedicated AI chips should run faster and more efficiently, without slowing down the regular GPU work. Intel is in a hybrid state, where they support both. One version of XeSS can run on all GPUs, but it is worse than the XeSS version specialized for Intel GPUs with their dedicated AI accelerators.
Those upscalers you linked only upscale non-interactive video or single frames, right? An AI upscaler for live gameplay takes much more into consideration, like menus, or which parts of the image are background. This information is provided by the game itself, so it's a drastically different approach from plain image upscaling, which wouldn't be any different from FSR 1 in that case. But I have no clue about numbers or how it compares to a solution like that.
I don't think this is a decision they made recently; they were probably planning it long before they even started on FSR 4, and they have (allegedly) already been working on it for 12 months or so. I think AMD "needs" to do this AI offloading because the market demands it, the traditional solution didn't work out as hoped, and maybe also in cooperation with Valve, Microsoft and other vendors. On the other hand, this AI accelerator could be used for things other than upscaling as well, as Nvidia has demonstrated.