On Tue, Dec 22, 2009 at 6:44 PM, Henri Verbeet hverbeet@gmail.com wrote:
2009/12/22 Stefan Dösinger stefan@codeweavers.com:
+ conv = ((FVF & WINED3DFVF_POSITION_MASK) == WINED3DFVF_XYZRHW ) || (FVF & (WINED3DFVF_DIFFUSE | WINED3DFVF_SPECULAR));
hr = buffer_init(object, This, Size, Usage, WINED3DFMT_VERTEXDATA,
- Pool, GL_ARRAY_BUFFER_ARB, NULL, parent, parent_ops);
+ Pool, GL_ARRAY_BUFFER_ARB, NULL, parent, parent_ops, conv);
This looks questionable: we use the FVF to determine that the buffer is going to need conversion, but don't pass that FVF to the buffer itself. Shouldn't this just use the existing code in buffer.c to determine when we need conversion in the first place, and just drop the VBO when the overhead becomes too large? Note that if we have EXT_vertex_array_bgra, we don't need conversion for the color data at all.
The code mentions that no VBO is created when conversion is needed, because doing the conversion in VBO memory, combined with uploading and drawStridedFast, is slower than drawStridedSlow. The buffer object extensions discourage performing many operations on buffer memory because it is typically uncached. Have you tried doing the conversion in a normal memory buffer and compared the performance with doing the same in VBO memory? It might make sense to do the conversion in a normal memory buffer and memcpy the contents to a VBO. That way you still profit from the async uploads to the GPU, while the conversion itself runs on ordinary system memory.
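To make the suggestion a bit more concrete, here is a rough sketch of what I mean. It is not tested against wined3d; convert_into_staging() and swap_red_blue() are made-up names, not existing wined3d functions, and I am assuming the only fixup needed is the D3DCOLOR red/blue swap on one color per vertex. The GL upload is only indicated in a comment so the snippet builds without a GL context.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Swap the red and blue channels of one packed color.
 * A D3DCOLOR is 0xAARRGGBB; the result is 0xAABBGGRR, which on a
 * little-endian machine is laid out in memory as R, G, B, A bytes. */
static uint32_t swap_red_blue(uint32_t c)
{
    return (c & 0xff00ff00u)            /* keep alpha and green */
         | ((c & 0x00ff0000u) >> 16)    /* red moves to the low byte */
         | ((c & 0x000000ffu) << 16);   /* blue moves to bits 16-23 */
}

/* Hypothetical helper, not a wined3d function: copy `count` vertices of
 * `stride` bytes from `src` into the system-memory buffer `staging`,
 * fixing up the color found `color_offset` bytes into each vertex. */
static void convert_into_staging(unsigned char *staging, const unsigned char *src,
                                 size_t count, size_t stride, size_t color_offset)
{
    size_t i;

    assert(color_offset + sizeof(uint32_t) <= stride);

    /* One bulk copy into normal memory, then convert in place there. */
    memcpy(staging, src, count * stride);
    for (i = 0; i < count; ++i)
    {
        unsigned char *p = staging + i * stride + color_offset;
        uint32_t color;

        memcpy(&color, p, sizeof(color));   /* avoids unaligned access */
        color = swap_red_blue(color);
        memcpy(p, &color, sizeof(color));
    }

    /* With a GL context current and the VBO bound, the finished staging
     * buffer would then go to the driver in a single call, e.g.
     * glBufferSubDataARB(GL_ARRAY_BUFFER_ARB, 0, count * stride, staging);
     * that step is left out here so the sketch builds without GL. */
}
```

The point is that all reads and writes during conversion touch normal memory, and the VBO only ever sees one sequential write. Whether that actually beats drawStridedSlow would have to be measured.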
Roderick