IDirect3DDevice9::StretchRect is used to stretch-blit between video memory surfaces. It is implemented by calling IWineD3DSurfaceImpl_Blt, which first tries IWineD3DSurfaceImpl_BltOverride to accelerate the blit. However, BltOverride will not accelerate the blit if neither surface is a swapchain surface or active render target 0; it instead falls back to a sysmem->sysmem blit with possible conversions. This appears to happen between every post-processing (screen-space shader) pass with Oblivion Graphics Extender and Morrowind Graphics Extender, and it causes a severe loss in performance.
After some testing and talking with Henri, it seems a number of checks aren't needed if FBO blits are available, particularly the swapchain/active render target checks. The attached patch makes BltOverride check for FBO blits earlier, which helps it catch more cases where blits can be accelerated. It provides a significant improvement with the aforementioned programs.
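To make the reordering concrete, here is a minimal sketch of the decision only. It is not the actual wined3d code: the types, field names, and function names are all hypothetical, and the "legacy GL" path is an assumption standing in for whatever non-FBO acceleration BltOverride performs.

```c
#include <stdbool.h>

/* Hypothetical model of the path selection; not the real wined3d structures. */
enum blit_path { BLIT_PATH_FBO, BLIT_PATH_LEGACY_GL, BLIT_PATH_SYSMEM };

struct surface_info {
    bool on_swapchain;      /* surface belongs to a swapchain */
    bool is_render_target0; /* surface is the active render target 0 */
};

/* True if the surface passes the swapchain / render target 0 check. */
static bool passes_target_check(const struct surface_info *s)
{
    return s->on_swapchain || s->is_render_target0;
}

/* Order as described for the existing code: the swapchain / render target
 * check runs first, so two plain offscreen surfaces always take the slow
 * sysmem path, even when FBO blits are available. */
static enum blit_path choose_path_current(const struct surface_info *src,
                                          const struct surface_info *dst,
                                          bool fbo_blit_available)
{
    if (!passes_target_check(src) && !passes_target_check(dst))
        return BLIT_PATH_SYSMEM;
    return fbo_blit_available ? BLIT_PATH_FBO : BLIT_PATH_LEGACY_GL;
}

/* Order as proposed by the patch: the FBO blit check runs first, so the
 * swapchain / render target check only matters when FBO blits are missing. */
static enum blit_path choose_path_patched(const struct surface_info *src,
                                          const struct surface_info *dst,
                                          bool fbo_blit_available)
{
    if (fbo_blit_available)
        return BLIT_PATH_FBO;
    if (!passes_target_check(src) && !passes_target_check(dst))
        return BLIT_PATH_SYSMEM;
    return BLIT_PATH_LEGACY_GL;
}
```

In this model, a blit between two offscreen render targets (the post-processing case) goes from BLIT_PATH_SYSMEM to BLIT_PATH_FBO when FBO blits are available, and behaves as before when they are not.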
However, I'm not sure of the full consequences of this change, particularly for earlier DirectX versions, hence this request for comments.