2009/4/21 Tobias Jakobi liquid.acid@gmx.net:
+struct ps_np2fixup_t {
- unsigned char idx[MAX_FRAGMENT_SAMPLERS]; /* indices to the real constant */
- GLfloat* const_cache; /* constant cache for fast reloading (without readback) */
- WORD swz; /* bitfield used to determine if we have to swizzle the constant */
- WORD num_consts;
+};
- I'm sure I mentioned this before, but please don't add code without using it. - Why is this in struct ps_compiled_shader? This really looks like something that should be internal to the backend. - What is the point of const_cache? - Writing "GLfloat* const_cache;" is legal C, but really doesn't make a lot of sense. The * is part of the declarator rather than the base type. An example like "int* p, q;" should make this more obvious. - swz is redundant. "prog->np2Fixup_data->swz & (1 << i)" really isn't more efficient than "idx & 1"
- I'm sure I mentioned this before, but please don't add code without
using it.
I'll merge it with one of the other patches then.
- Why is this in struct ps_compiled_shader? This really looks like
something that should be internal to the backend.
No, that's definitely not the case. The struct's data is used both in GLSL and ARB mode (I'll submit the ARB patches later) - so it makes sense to put it into ps_compiled_shader.
For ARB mode this is essential since something like the prog_link struct from GLSL doesn't exist there. The original source of this struct is my port of the fixup code to ARB, but introducing the constant packing for GLSL makes more data necessary - data that should be shared.
I can post the next patchset for review if you don't believe me ;)
- What is the point of const_cache?
Reloading the np2fixup constants.
Before I was declaring vec2 uniforms to store the tex dimensions. Stefan found out that the compilers are inefficent and map a vec2 to a vec4.
So now I'm using vec4s only, but one vec4 is used for two textures. So x and y store width and height of texture i, and z and w store the width and height of texture j.
In load_np2fixup_constants I have to use glUniform4fvARB now to set the values of the uniform. I can't simply create a float[4], zero it, set the components that belong to the currently selected texture and upload it, since it might overwrite a texture dimension pair that was previously uploaded.
However while writing this, I have another idea to solve this without using the cache.
- Writing "GLfloat* const_cache;" is legal C, but really doesn't make
a lot of sense. The * is part of the declarator rather than the base type. An example like "int* p, q;" should make this more obvious.
Your example is bad coding practice IMHO - i would never declare variables with different types on the same line.
So your point is that I should write GLfloat *const_cache; Well, at least from my point of view that's purely a matter of taste. :)
- swz is redundant. "prog->np2Fixup_data->swz & (1 << i)" really isn't
more efficient than "idx & 1"
As far as I can see that doesn't do the same.
2009/4/22 Tobias Jakobi liquid.acid@gmx.net:
- Why is this in struct ps_compiled_shader? This really looks like
something that should be internal to the backend.
No, that's definitely not the case. The struct's data is used both in GLSL and ARB mode (I'll submit the ARB patches later) - so it makes sense to put it into ps_compiled_shader.
Sure, but it's still internal bookkeeping of the backends. Everything the backend needs from the outside world is the np2_fixup field from struct ps_compile_args.
For ARB mode this is essential since something like the prog_link struct from GLSL doesn't exist there.
It's not terribly hard to introduce something like that. That could mean adding a handle table to the ARB backend, or expanding prgId to be a pointer, but that's ok.
- Writing "GLfloat* const_cache;" is legal C, but really doesn't make
a lot of sense. The * is part of the declarator rather than the base type. An example like "int* p, q;" should make this more obvious.
Your example is bad coding practice IMHO - i would never declare variables with different types on the same line.
The point is that writing "GLfloat* const_cache;" implies the * is part of the type instead of the declarator. C doesn't work that way, it's misleading at best.
- swz is redundant. "prog->np2Fixup_data->swz & (1 << i)" really isn't
more efficient than "idx & 1"
As far as I can see that doesn't do the same.
Not in the current code, no. But if you store the index of the fixup parameter instead of the index of the packed uniform the parameter is stored in, you can derive the index of the packed uniform as (idx >> 1) and the position within that uniform as (idx & 1).
Henri Verbeet wrote:
2009/4/22 Tobias Jakobi liquid.acid@gmx.net:
- Why is this in struct ps_compiled_shader? This really looks like
something that should be internal to the backend.
No, that's definitely not the case. The struct's data is used both in GLSL and ARB mode (I'll submit the ARB patches later) - so it makes sense to put it into ps_compiled_shader.
Sure, but it's still internal bookkeeping of the backends. Everything the backend needs from the outside world is the np2_fixup field from struct ps_compile_args.
For ARB mode this is essential since something like the prog_link struct from GLSL doesn't exist there.
It's not terribly hard to introduce something like that. That could mean adding a handle table to the ARB backend, or expanding prgId to be a pointer, but that's ok.
So, is that a suggestion? I don't think that's for me to decide.
You should make clear if I should implement it this way or use another approach. From my point of view it makes sense to share the data, so putting this into ps_compiled_shader is my currently best option.
But if you say, that the current approach is a no-go - I'll try the pointer-approach.
So, how should I do it?
- Writing "GLfloat* const_cache;" is legal C, but really doesn't make
a lot of sense. The * is part of the declarator rather than the base type. An example like "int* p, q;" should make this more obvious.
Your example is bad coding practice IMHO - i would never declare variables with different types on the same line.
The point is that writing "GLfloat* const_cache;" implies the * is part of the type instead of the declarator. C doesn't work that way, it's misleading at best.
IMHO that's splitting hairs.
You could apply the same this for pointer casting, forbidding something like var = (TYPE*)var2; and instead enforce var = (TYPE *)var2;
However, const_cache is now gone - so this shouldn't be a problem anymore ;)
- swz is redundant. "prog->np2Fixup_data->swz & (1 << i)" really isn't
more efficient than "idx & 1"
As far as I can see that doesn't do the same.
Not in the current code, no. But if you store the index of the fixup parameter instead of the index of the packed uniform the parameter is stored in, you can derive the index of the packed uniform as (idx >>
- and the position within that uniform as (idx & 1).
I don't think I can follow you.
I tried some similar approach in the beginning, but it I think it failed if tex1 and tex3 need fixup, but tex2 does not. The approach didn't like "holes" in the indices. However we can't assume that all texture samples that need fixup are consecutive.
2009/4/22 Tobias Jakobi liquid.acid@gmx.net:
Henri Verbeet wrote:
Sure, but it's still internal bookkeeping of the backends. Everything the backend needs from the outside world is the np2_fixup field from struct ps_compile_args.
For ARB mode this is essential since something like the prog_link struct from GLSL doesn't exist there.
It's not terribly hard to introduce something like that. That could mean adding a handle table to the ARB backend, or expanding prgId to be a pointer, but that's ok.
So, is that a suggestion? I don't think that's for me to decide.
You should make clear if I should implement it this way or use another approach. From my point of view it makes sense to share the data, so putting this into ps_compiled_shader is my currently best option.
But if you say, that the current approach is a no-go - I'll try the pointer-approach.
So, how should I do it?
A handle table would probably make the most sense. A pointer would work for ARB, but for GLSL that's just extra indirection that isn't really needed.
The point is that writing "GLfloat* const_cache;" implies the * is part of the type instead of the declarator. C doesn't work that way, it's misleading at best.
IMHO that's splitting hairs.
You could apply the same this for pointer casting, forbidding something like var = (TYPE*)var2; and instead enforce var = (TYPE *)var2;
I think that's better, yes.
Not in the current code, no. But if you store the index of the fixup parameter instead of the index of the packed uniform the parameter is stored in, you can derive the index of the packed uniform as (idx >>
- and the position within that uniform as (idx & 1).
I don't think I can follow you.
I tried some similar approach in the beginning, but it I think it failed if tex1 and tex3 need fixup, but tex2 does not. The approach didn't like "holes" in the indices. However we can't assume that all texture samples that need fixup are consecutive.
Tex fixup parameter for tex1 would have idx 0, the one for tex3 would have idx 1. These would then be packed as PsamplerNP2Fixup0.xy and PsamplerNP2Fixup0.zw.
Henri Verbeet wrote:
A handle table would probably make the most sense. A pointer would work for ARB, but for GLSL that's just extra indirection that isn't really needed.
Sry, I have to correct myself.
It won't work that way.
Setting up the data of the np2 fixup data struct is done in shader_generate_glsl_declarations, where I don't have any access to a glsl_shader_prog_link structure. I might not even exist, since the entries of prog_link are all setup by set_glsl_shader_program.
Furthermore the np2 fixup data struct doesn't really belong in prog_link, it has nothing to do with the GLSL program object itself but is a feature of the fragment shader.
That does not apply to np2Fixup_location, which has to be part of the GLSL object since the uniform locations are based on the program and not the individual components.
That is not the case for the np2 fixup data, which is unique to the fragment shader - and doesn't change when creating a different vshader, fshader/pshader combination.
Not in the current code, no. But if you store the index of the fixup parameter instead of the index of the packed uniform the parameter is stored in, you can derive the index of the packed uniform as (idx >>
- and the position within that uniform as (idx & 1).
I don't think I can follow you.
I tried some similar approach in the beginning, but it I think it failed if tex1 and tex3 need fixup, but tex2 does not. The approach didn't like "holes" in the indices. However we can't assume that all texture samples that need fixup are consecutive.
Tex fixup parameter for tex1 would have idx 0, the one for tex3 would have idx 1. These would then be packed as PsamplerNP2Fixup0.xy and PsamplerNP2Fixup0.zw.
OK, I think I have figured out how to get it work.
I my previous approach didn't work so well, because I was adressing the uniforms in a different way. I also switched to a uniform array, to make it easier to update the texture dimensions in one go. Fewer GL calls that way.
Henri Verbeet wrote:
Tex fixup parameter for tex1 would have idx 0, the one for tex3 would have idx 1. These would then be packed as PsamplerNP2Fixup0.xy and PsamplerNP2Fixup0.zw.
OK, forget that as well.
Like I replied to Stefan about the uchar thing: The code is only active for FX cards which have a constant limit of 256. (or it could become active for cards below a FX)
With the approach without swz, this would mean changing the unsigned char array to a unsigned short one, which would double the size.
I tried to keep the whole struct as small as possible, the current size if now 20 bytes (16 bytes for the indices, and 4 bytes for swz and num_consts).
Adding the code for ARB would introduce 2 additional bytes (active bitflag, since we can figure out quite well if there are still enough free constants in ARB), but that's it.
Performancewise the current approach shouldn't be a problem. I think we should focus on size here, and that's why I don't think changing this will do any good.
Am Mittwoch, 22. April 2009 18:39:40 schrieb Henri Verbeet:
For ARB mode this is essential since something like the prog_link struct from GLSL doesn't exist there.
It's not terribly hard to introduce something like that. That could mean adding a handle table to the ARB backend, or expanding prgId to be a pointer, but that's ok.
Just add my 2c.
ARB doesn't need a prog_link structure because Vertex and Fragment programs are independent, so it would be bad to add some struct that defines a combination of vertex and pixel shader(and not needed for this patch's purpose).
What we'd end up with is a clone of ps_compiled_shader. Probably the best thing is to remove the GLuint prgId from ps_compiled_shader and replace it with a void *backend_priv_data to store per-shader data with backend specific contents.
</mode = pedantic>