FSR 4 has been in development for 9-12 months already, and one of the biggest focuses is improving battery life for handhelds

Steam Deck@sopuli.xyz – 128 points – 2 weeks ago

AMD plans for FSR4 to be fully AI-based — designed to improve quality and maximize power efficiency

He specifically cited bad battery life on the ROG Ally and Lenovo Legion Go, saying that getting only one hour of battery life isn't enough. The Steam Deck (especially the OLED model) does a lot better battery-wise, but improved power efficiency should really help with any games that max out the Deck's power.

Here is my view and a small timeline:

FSR 1 (Jun 2021): Post-processing. Can be used with any game and any graphics card on any system. Quality is not very good, but developers do not need to support it for it to be usable.
FSR 2 (Mar 2022): Analytical and game-specific. Analyzes in-game content to produce better output than FSR 1. Can only be used with games that have integrated support for it. Still system- and graphics-card-agnostic.
FSR 3 (Sep 2023): Improved version of FSR 2, so the previous point applies here too, but it has a few more features and should produce better quality. It arrived late and was controversial at launch.
FSR 4 (maybe 2025): AI-based and hardware-dependent. Not much is known, but we can expect it to require some form of AI chip on the GPU. We don't know whether it will be usable with other vendors' GPUs that have such a chip or be restricted to AMD cards. As it is analytical, it requires games to support it, so it's game-specific as well. It's expected to have superior quality over FSR 3, maybe rivaling XeSS or even DLSS. But the focus seems to be on low-powered, weaker hardware, where it would benefit the most.

One technical reason why FSR 1 isn't very good but works in everything is that it is the only version that just takes your current frame and upscales it. All the newer ones are temporal, like TAA, and use data from multiple previous frames.
Very simplified, they "jiggle" the camera each frame to a different position so that they can gather extra data to use, but that requires being implemented in the game engine directly.
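To make the "jiggle" concrete, here is a minimal sketch of how per-frame sub-pixel offsets can be generated. It uses a Halton(2, 3) sequence, which is a common choice for temporal jitter; the function names and the cycle length of 8 are assumptions for illustration, not code taken from FSR.

```python
def halton(index, base):
    """Return element `index` (1-based) of the Halton sequence in `base`."""
    result = 0.0
    fraction = 1.0
    i = index
    while i > 0:
        fraction /= base
        result += fraction * (i % base)
        i //= base
    return result


def jitter_offsets(count):
    """Sub-pixel (x, y) camera offsets in [-0.5, 0.5), one per frame.

    `count` is a hypothetical cycle length; a real engine picks it
    based on the upscaling ratio.
    """
    return [(halton(i, 2) - 0.5, halton(i, 3) - 0.5)
            for i in range(1, count + 1)]


if __name__ == "__main__":
    for frame, (dx, dy) in enumerate(jitter_offsets(8)):
        print(f"frame {frame}: shift projection by ({dx:+.3f}, {dy:+.3f}) px")
```

The engine has to apply these offsets to its projection matrix before rendering each frame, which is why a post-process filter cannot add this on its own.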

Kind of.

The big thing that actually defines FSR2 is that it has access to a bunch more data, particularly the depth buffer, motion vectors, and also, as you said, uses data from previous frames.

The camera jiggle is mostly just to avoid shimmering when the camera is stationary.
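As a rough illustration of why motion vectors matter, here is a toy 1D sketch of reusing the previous frame: each pixel looks up where it was last frame and blends that history with the new sample. This is not FSR 2's actual algorithm (which also uses depth to reject stale history, among other things); the function name, the integer motion vectors, and the blend weight `alpha` are all assumptions.

```python
def reproject_and_blend(current, history, motion, alpha=0.1):
    """Blend the current frame with reprojected history.

    current, history: lists of pixel values (a 1D "image").
    motion: per-pixel integer offsets; pixel x was at x - motion[x]
            in the previous frame (hypothetical convention).
    alpha: weight of the new sample; the rest comes from history.
    """
    width = len(current)
    output = []
    for x in range(width):
        previous_x = x - motion[x]
        if 0 <= previous_x < width:
            value = alpha * current[x] + (1 - alpha) * history[previous_x]
        else:
            # No valid history (pixel came from off-screen): use the new sample.
            value = current[x]
        output.append(value)
    return output


if __name__ == "__main__":
    history = [0.0, 1.0, 0.0, 0.0]   # bright pixel at x=1 last frame
    current = [0.0, 0.0, 1.0, 0.0]   # it moved one pixel to the right
    motion = [1, 1, 1, 1]
    print(reproject_and_blend(current, history, motion))
```

Without the motion vectors, the history at x=2 would be black and the moving pixel would smear or ghost, which is the kind of data a pure post-process upscaler never sees.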

I am curious as to why they would offload any AI tasks to another chip? I just did a super quick search for upscaling models on GitHub (https://github.com/marcan/cl-waifu2x/tree/master/models) and they are tiny as far as AI models go.

It's the rendering bit that takes all the complex maths, and if that is reduced, it would leave plenty of room for running a baby AI. Granted, the method I linked to was only doing 29k pixels per second, but they said it wasn't GPU optimized. (FSR4 is going to be fully GPU optimized, I am sure of it.)

If the rendered image is only 85% of a 4K image, the remaining 15% is ~1.2 million pixels that need to be computed, and it still seems plausible to keep everything on the GPU.
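For anyone who wants to check the numbers, here is a quick back-of-the-envelope sketch. It assumes "4K" means 3840x2160 and that the ~1.2 million figure is the 15% of pixels not rendered natively; the 60 fps target is my own assumption for comparison, not something from the article.

```python
UHD_WIDTH, UHD_HEIGHT = 3840, 2160  # assumed meaning of "4K"


def pixels_to_fill(rendered_percent):
    """Pixels per frame that are not rendered natively and must be generated."""
    total = UHD_WIDTH * UHD_HEIGHT
    return total * (100 - rendered_percent) // 100


missing = pixels_to_fill(85)              # 1,244,160 pixels per frame
seconds_at_quoted_rate = missing / 29_000  # at the 29k pixels/s quoted above
needed_per_second = missing * 60           # throughput for an assumed 60 fps

if __name__ == "__main__":
    print(f"pixels to fill per frame: {missing:,}")
    print(f"time per frame at 29k px/s: {seconds_at_quoted_rate:.1f} s")
    print(f"throughput needed at 60 fps: {needed_per_second:,} px/s")
```

So the unoptimized rate would need tens of seconds per frame, and a real-time version has to be a few thousand times faster, which is the gap GPU optimization or dedicated hardware would have to close.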

With all of that blurted out, is FSR4 AI going to be offloaded to something else? It seems like there would be significant technical challenges in creating another data bus that would also have to sync with memory and the GPU, so that offloading AI compute runs at speeds that don't risk creating additional lag. (I am just hypothesizing, btw.)

The thing with “AI” cores, or better still, ML cores, is that they’re very specialized. Apple has been putting ML cores in all of its chips since the iPhone 8 not because they are super powerful, but because they can do some things (that the hardware would have no problem doing anyway) while sipping power. You don’t have to think of AI in terms of huge LLMs like ChatGPT that require data centers; think of it like a hardware video decoder. With the decoder, this thing could easily play 1080p video; going with raw CPU power instead, 480p. It’s why you can watch hours of video on your phone, but try doing anything that hits the CPU and the battery melts.

Edit: my example has been bothering me for days now. I want to clarify to avoid any possible misunderstanding that hardware video decoding has nothing to do with AI, it’s just another very specialized chip.

Well, Nvidia and Intel do that too, and I think Sony added an AI chip to the PS5 Pro for their new AI upscaler as well. We can already run AI calculations on our GPUs without AI acceleration, but that is not as fast. I have no numbers for you, only the logic that software optimized for dedicated AI chips should run faster and more efficiently, without slowing down the regular GPU work. Intel is in a hybrid state, where they support both. One version of XeSS can run on all GPUs, but it is worse than the XeSS version specialized for Intel GPUs with their dedicated AI accelerators.

Those upscalers you linked only upscale non-interactive video or single frames, right? An AI upscaler for live gameplay takes much more into consideration, like menus, or which parts of the image are background. This information is programmed into the game, so it's a drastically different approach from plain image upscaling, which wouldn't be any different from FSR 1 in that case. But I have no clue about the numbers or how it compares to a solution like that.

I don't think this is a decision they made just recently; they were probably planning it long before they even started on FSR 4, plus they have already been working on it for 12 months or so (allegedly). I think AMD "needs" to do this AI offloading because the market demands it, the traditional solution didn't work out as hoped, and maybe it's happening in cooperation with Valve, Microsoft and other vendors. On the other hand, this AI accelerator could be used for things other than upscaling as well, as Nvidia demonstrated.
