DarkBASIC Professional Discussion / A memblock moment - writing RGBA colours with integers

GrumpyOne
16
Years of Service
User Offline
Joined: 27th Nov 2007
Location: London, UK
Posted: 13th Aug 2014 13:17
I am having a memblock moment.... mental block whilst writing memblocks. Now I have always used the rgb() functions whilst creating memblock colours, then used a bitwise command to add alpha if I need to. This time, for once, I tried an intuitive alternative...mainly to see which was faster and I have run into something that doesn't seem to make sense (i.e. that challenges how I thought things work).

In the code below I am building a 1D permutation texture (for a shader). This simply takes a random value (0-255) in an array and writes it into a 256x1 texture. I use a simple integer multiplication to shift the bits to write it to RGBA; however, it is writing the wrong byte...in rather a nonsense manner. I'd guess this is something to do with casting from integer to dword? It is either this or I have been blisteringly stupid somewhere. I've got another way to make this work, but I'd love to know why this doesn't.




GrumpyOne - the natural state of the programmer - Forester Pro (Tree & Plant Creator), Medusa Pro (Rock Creator), Mr Normal (Normal Map Generator) http://www.hptware.co.uk

Libervurto
17
Years of Service
User Offline
Joined: 30th Jun 2006
Location: On Toast
Posted: 13th Aug 2014 14:48 Edited at: 13th Aug 2014 14:58
I haven't examined your code properly, but here are two things that I always get tripped up on (in DBC at least; I assume it is the same in DBP):

1. Remember to convert the bit-depth to bytes.
2. Colours are stored little-endian. (BGRA I think)

here is my function for drawing a dot on a memblock bitmap
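As a rough illustration of the two points above (a Python sketch, not Libervurto's DarkBASIC function), the snippet below converts the bit depth to bytes per pixel and writes the colour bytes in B, G, R, A order. The 12-byte header of three 32-bit values (width, height, depth) is an assumption about the memblock layout, and `make_bitmap_memblock` and `dot` are hypothetical names.

```python
import struct

HEADER_BYTES = 12  # assumption: width, height, depth stored as three 32-bit values


def make_bitmap_memblock(width, height, depth_bits=32):
    """Build a zero-filled buffer laid out like the assumed memblock format."""
    bytes_per_pixel = depth_bits // 8  # convert bit depth to bytes
    header = struct.pack("<III", width, height, depth_bits)
    return bytearray(header) + bytearray(width * height * bytes_per_pixel)


def dot(memblock, x, y, blue, green, red, alpha=255):
    """Write one 32-bit pixel at (x, y) and return the byte offset used."""
    width, _height, depth_bits = struct.unpack_from("<III", memblock, 0)
    bytes_per_pixel = depth_bits // 8
    offset = HEADER_BYTES + (y * width + x) * bytes_per_pixel
    # Little-endian storage: the lowest byte (blue) comes first in memory.
    memblock[offset:offset + 4] = bytes((blue, green, red, alpha))
    return offset
```

Reading the four bytes back as one little-endian 32-bit value gives the familiar 0xAARRGGBB number, which is why the same pixel can be described as both "ARGB" and "BGRA".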


Formerly OBese87.
Rudolpho
18
Years of Service
User Offline
Joined: 28th Dec 2005
Location: Sweden
Posted: 13th Aug 2014 15:14
Like Libervurto says, you've just got the wrong order (you're writing in ABGR, it expects BGRA). Also you've left a 1 out at the beginning of 16777216 (2^24) so your red (or actually alpha) value will not be written to the right place.
Bitshifting should be faster than multiplying by the way. Also beware that your approach will cause unexpected behaviour (ie. overwriting other bits) if your perm values fall outside the 0..255 range.
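A small Python sketch of these points (not the original DBPro code; the function names are made up for illustration): packing with shifts and masks, the equivalent multipliers including the full 16777216, and what happens when a value falls outside 0..255.

```python
import struct


def pack_argb(alpha, red, green, blue):
    """Pack four channels into one 32-bit value, masking each to 0..255."""
    return (((alpha & 0xFF) << 24) | ((red & 0xFF) << 16)
            | ((green & 0xFF) << 8) | (blue & 0xFF))


def pack_by_multiplying(alpha, red, green, blue):
    """Same layout using multiplication; no protection against overflow."""
    return alpha * 16777216 + red * 65536 + green * 256 + blue


colour = pack_argb(255, 1, 2, 3)
# Stored little-endian, the bytes appear in memory as B, G, R, A.
memory_bytes = struct.pack("<I", colour)

# A green value of 300 spills into the red byte when multiplying.
spilled = pack_by_multiplying(0, 0, 300, 0)
```

Note that masking with `& 0xFF` wraps out-of-range values rather than clamping them, so it only prevents damage to neighbouring channels; it does not make the value meaningful.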


"Why do programmers get Halloween and Christmas mixed up?"
Libervurto
17
Years of Service
User Offline
Joined: 30th Jun 2006
Location: On Toast
Posted: 13th Aug 2014 15:15
Yes bitshifting is very handy. I wish DBC had it.

Formerly OBese87.
Green Gandalf
VIP Member
19
Years of Service
User Offline
Joined: 3rd Jan 2005
Playing: Malevolence:Sword of Ahkranox, Skyrim, Civ6.
Posted: 13th Aug 2014 15:40
Yes, colour is stored as ARGB so, as Rudolpho says, your four versions are writing first to the low order byte (B), then to the second lowest (G), and so on.

I haven't tested but I'd expect bit shifting to be faster.



TheComet
16
Years of Service
User Offline
Joined: 18th Oct 2007
Location: I`m under ur bridge eating ur goatz.
Posted: 13th Aug 2014 17:58 Edited at: 13th Aug 2014 17:59
Quote: "I am building a 1D permutation texture"

what does that mean?

Use hex values and bitwise operations, everything becomes so much clearer.

Hex colour values are in the form of:


Examples:


Your code adapted:


GrumpyOne
16
Years of Service
User Offline
Joined: 27th Nov 2007
Location: London, UK
Posted: 13th Aug 2014 18:39
Thanks everyone for the snappy answers. Wrong order certainly explains it.

I've amended TheComet's very helpful snippet. It seems the shifts should be in bits rather than bytes (I am sure you were testing me):
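To make the bits-versus-bytes point concrete, here is a hedged Python sketch (not the amended DBPro snippet itself): each channel sits 8 bits above the previous one, so the shift amounts are 0, 8, 16 and 24 rather than 0, 1, 2 and 3. The names `SHIFT_BITS` and `texel` are hypothetical.

```python
# Shift amounts in bits for each channel of a 0xAARRGGBB value.
SHIFT_BITS = {"b": 0, "g": 8, "r": 16, "a": 24}


def texel(value, channel):
    """Place an 8-bit value into the chosen channel of a 32-bit colour."""
    return (value & 0xFF) << SHIFT_BITS[channel]


# Shifting by a byte count (3) instead of a bit count (24) lands in the wrong place.
wrong = 0xAB << 3
right = texel(0xAB, "a")
```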



A 1D permutation texture is just a texture of height 1 filled with random numbers used to generate a pseudorandom number... or rather that's what I THINK it is from what I've read. Wouldn't 1D random texture be a nicer term? Jargon, jargon, jargon.


Green Gandalf
VIP Member
19
Years of Service
User Offline
Joined: 3rd Jan 2005
Playing: Malevolence:Sword of Ahkranox, Skyrim, Civ6.
Posted: 14th Aug 2014 02:00
Quote: "Wouldn't 1D random texture be a nicer term"


I'm sure it would be less precise. If you're trying to replicate Ken Perlin's use of permutations to get an efficient implementation of Perlin noise then it needs to be a permutation not merely random.



GrumpyOne
16
Years of Service
User Offline
Joined: 27th Nov 2007
Location: London, UK
Posted: 14th Aug 2014 23:56
GG,

Very good guess, this is exactly how I am torturing myself right now. After several days I have managed to get something that looks like perlin noise, but for some bizarre reason won't form a seamless texture.

Currently trying to get my grad vectors to repeat periodically, which should theoretically do it...however, 18 hours of trying it in different ways has not resulted in any joy so far. In fact I have been staring at multiple versions of the texture below for several hours now



What really doesn't help much is that Ken Perlin and virtually everyone else who has written a tutorial on the subject, leaves out several steps in the implementation (like how to calculate the permutation texture for example). Ho hum...where would the achievement be if every tutorial was complete.

Best,
GrumpyOne


GrumpyOne
16
Years of Service
User Offline
Joined: 27th Nov 2007
Location: London, UK
Posted: 15th Aug 2014 00:02
Btw my permutation table is the numbers 0 to 255 in random order, rather than 256 random numbers. I think this is the right way to do it.
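A minimal Python sketch of that distinction, assuming a plain shuffle is acceptable (the seed parameter and the name `make_permutation` are illustrative, not from the thread): every value from 0 to 255 appears exactly once, which is not guaranteed when drawing 256 independent random numbers.

```python
import random


def make_permutation(seed=0, size=256):
    """Return the integers 0..size-1 in a seeded random order."""
    rng = random.Random(seed)
    perm = list(range(size))
    rng.shuffle(perm)  # in-place shuffle keeps each value exactly once
    return perm


perm = make_permutation(seed=42)
```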


JackDawson
12
Years of Service
User Offline
Joined: 12th Jul 2011
Location:
Posted: 15th Aug 2014 00:08
@ GrumpyOne

Take a look at this.. it can make seamless planets. I tested it and it does work.

http://forum.thegamecreators.com/?m=forum_view&t=189861&b=18
GrumpyOne
16
Years of Service
User Offline
Joined: 27th Nov 2007
Location: London, UK
Posted: 15th Aug 2014 00:30
Jack,

Thanks for this, but don't tempt me to give up! I am so nearly there. It is worth it, put it this way, the shader will currently generate a perlin noise texture 1024x1024 in... 200 millisecs. Just have to make it seamless.

GrumpyOne


Green Gandalf
VIP Member
19
Years of Service
User Offline
Joined: 3rd Jan 2005
Playing: Malevolence:Sword of Ahkranox, Skyrim, Civ6.
Posted: 15th Aug 2014 02:39
Quote: "Just have to make it seamless."


It can definitely be done - I've done it in a shader (but my implementation was a messy attempt to avoid having any source images at all, which led to other difficulties which your method overcomes). I'll see if I can dig it out - at this late hour I can't recall exactly how I made it seamless.


Quote: "What really doesn't help much is that Ken Perlin and virtually everyone else who has written a tutorial on the subject, leaves out several steps in the implementation (like how to calculate the permutation texture for example)."


I had exactly the same problem with his tutorials and ended up doing something rather different.



TheComet
16
Years of Service
User Offline
Joined: 18th Oct 2007
Location: I`m under ur bridge eating ur goatz.
Posted: 15th Aug 2014 11:53 Edited at: 15th Aug 2014 11:54
Quote: "It seems the shifts should be in bits rather than bytes"


derp... You're right

As to seamless, the easiest method is to generate an image with half the length and width of the target image, then mirror that image four times:
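The mirroring idea can be sketched in a few lines of Python, treating the image as a list of rows (an illustration only; `mirror_tile` is a hypothetical helper, not code from the thread):

```python
def mirror_tile(tile):
    """Build a 2x-wide, 2x-tall image from a tile and its mirror images."""
    # Mirror each row horizontally, then mirror the whole block vertically.
    top = [row + row[::-1] for row in tile]
    return top + top[::-1]


tiled = mirror_tile([[1, 2],
                     [3, 4]])
```

Because the left and right columns (and the top and bottom rows) of the result are copies of each other, the outer edges wrap without a visible jump.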


GrumpyOne
16
Years of Service
User Offline
Joined: 27th Nov 2007
Location: London, UK
Posted: 15th Aug 2014 12:50
Hi Comet,

That would sure do it, and this is how I create seamless images in photoshop. The problem of course is the outer part of your image is seamless, the inner seams through the middle aren't unless the original image was seamless.

What I am trying to do is implement Ken Perlin's noise algorithms from GPU Gems 2:

http://http.developer.nvidia.com/GPUGems2/gpugems2_chapter26.html

The algorithm generates perlin noise by subdividing the image into cells, then generating gradient vectors at the corner of each cell (one per corner but random). All four vectors in the cell are blended together to generate a smooth surface. The influence of each gradient at any point in a cell is determined by the dot product between the gradient vector at a corner and a vector from the cell corner to the point in the cell. You can see that in my image above: there are 16x16 cells in that image. Now, as long as the 4 gradient vectors are the same, two cells should be identical (hence the repeat every 8 cells in my image). Consequently, if we make the gradient vectors periodic, the texture should be seamless (which it isn't). The shader snippet is shown below. I've added some extra comments to make life easier:



numCells and numRepeats are globals.

Right now I have confirmed that the gradients at each point are indeed periodic...so why they don't blend to make a seamless texture is....puzzling. The modulus seems to have caused a negative shift in the output by 1 pixel...this might be the reason...or there may be an issue in the lerp blend.
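For comparison, here is a CPU-side Python sketch of 2D gradient noise with periodic gradients. It is not the HLSL shader from this post: it looks gradients up in a dictionary instead of a permutation texture, and all names (`make_gradients`, `periodic_noise`, `period`) are hypothetical. The key assumption is that every corner lookup, including the "+1" corners, is wrapped with the same modulus.

```python
import math
import random


def make_gradients(period, seed=0):
    """One random unit gradient per lattice point of a period x period grid."""
    rng = random.Random(seed)
    grads = {}
    for iy in range(period):
        for ix in range(period):
            angle = rng.uniform(0.0, 2.0 * math.pi)
            grads[(ix, iy)] = (math.cos(angle), math.sin(angle))
    return grads


def fade(t):
    """Perlin's quintic fade curve."""
    return t * t * t * (t * (t * 6.0 - 15.0) + 10.0)


def lerp(a, b, t):
    return a + t * (b - a)


def periodic_noise(x, y, grads, period):
    """Gradient noise that repeats every `period` cells in x and y."""
    x0, y0 = math.floor(x), math.floor(y)
    fx, fy = x - x0, y - y0

    def corner(dx, dy):
        # Wrap the lattice index so the +1 corners match the first column/row.
        gx, gy = grads[((x0 + dx) % period, (y0 + dy) % period)]
        # Dot product of the gradient with the vector from corner to point.
        return gx * (fx - dx) + gy * (fy - dy)

    u, v = fade(fx), fade(fy)
    top = lerp(corner(0, 0), corner(1, 0), u)
    bottom = lerp(corner(0, 1), corner(1, 1), u)
    return lerp(top, bottom, v)
```

With this arrangement the value at x = 0 and at x = period agree for any y, which is the property a seamless tile needs.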

Determined to solve it


Green Gandalf
VIP Member
19
Years of Service
User Offline
Joined: 3rd Jan 2005
Playing: Malevolence:Sword of Ahkranox, Skyrim, Civ6.
Posted: 15th Aug 2014 13:12 Edited at: 15th Aug 2014 13:13
Quote: "As to seamless, the easiest method is to generate an image with half the length and width of the target image, then mirror that image four times:"


Yes, but such images look obviously mirrored - and the change in gradient is abrupt and obvious along the seams. See image below using GrumpyOne's source image:





GrumpyOne
16
Years of Service
User Offline
Joined: 27th Nov 2007
Location: London, UK
Posted: 15th Aug 2014 14:38
Hmm...

Is it odd when you know what is wrong, and are convinced you know how to fix it...and it doesn't work.

Have tracked it down to the periodicity of the +one corner gradients in my lerp function being out of phase with the first set of corner gradients. It must be something to do with the modulus.


Green Gandalf
VIP Member
19
Years of Service
User Offline
Joined: 3rd Jan 2005
Playing: Malevolence:Sword of Ahkranox, Skyrim, Civ6.
Posted: 15th Aug 2014 14:51
Don't forget that the corner gradients must all be identical (I think).

I've finally tracked down my old Perlin noise shader (it wasn't where I expected it to be) and posted a demo on your Perlin noise thread. Sounds like you need to see how I made it seamless - the rest is best viewed as work in progress.

My version seems to run quite fast and is designed as an object shader at the moment and doesn't use any source textures - which may be why I'm having problems with it with some choices of settings.

I'm very interested to see how you get on with this.



JackDawson
12
Years of Service
User Offline
Joined: 12th Jul 2011
Location:
Posted: 15th Aug 2014 17:38
Might I suggest using 128x128 tiles (square), running your simplex or even perlin noise on them (Simplex is less overhead on the hardware), and then adding an algorithm that checks the 4 edges of the texture and compares each with its opposite side. I say 128x128 because it's smaller and easier to work with until you are comfortable with larger-scale ones where you can use threading.

Normally for an unlimited terrain, you would compare one end of the texture (or heightmap) with the other end of another texture (or heightmap). That one end would be the basis for the other. If you must stitch two different heights (textures) together, you would lay them side by side and use smoothing techniques before chopping it out of memory.

I hope this makes sense to you. It's not an easy subject to explain.
TheComet
16
Years of Service
User Offline
Joined: 18th Oct 2007
Location: I`m under ur bridge eating ur goatz.
Posted: 15th Aug 2014 18:22
Why not do it the way GIMP does it?

Here's the above image made seamless:


GIMP does it by making a copy of the image, offsetting it by half of its size in both directions, and then linearly blending both images together where the blend weight is proportional to the distance to the center of the image.
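A rough Python sketch of the technique as described in this post, for a greyscale image stored as a list of rows. The exact weighting is an assumption: here the blend weight is the larger of the normalised horizontal and vertical distances to the centre, and GIMP's real implementation may differ. `make_seamless` is a hypothetical name, and the image is assumed to be at least 2x2.

```python
def make_seamless(image):
    """Blend an image with a copy of itself offset by half its size."""
    height, width = len(image), len(image[0])
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            # Copy offset by half the size in both directions, with wrap-around.
            shifted = image[(y + height // 2) % height][(x + width // 2) % width]
            # Weight grows linearly from 0 near the centre to 1 at the border.
            w = max(abs(x - cx) / cx, abs(y - cy) / cy)
            row.append((1.0 - w) * image[y][x] + w * shifted)
        out.append(row)
    return out
```

At the border the result consists entirely of the offset copy, whose wrap-around neighbours were adjacent in the middle of the original image, so the outer edges tile without a jump.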

It's easy to see if you make a face seamless, for example:


JackDawson
12
Years of Service
User Offline
Joined: 12th Jul 2011
Location:
Posted: 15th Aug 2014 19:14
TheComet,

The reason I wouldn't use that technique is exactly what your two pictures are showing: the creasing in the centers. GG showed the problem as well.

Finding out what the ends are and then stitching them together and then using a smoothing pattern can get rid of the edging is the only solution I have seen that works. WGLfx showed it in his CPP code for his perlin noise plugin.
Green Gandalf
VIP Member
19
Years of Service
User Offline
Joined: 3rd Jan 2005
Playing: Malevolence:Sword of Ahkranox, Skyrim, Civ6.
Posted: 15th Aug 2014 20:41
Quote: "Finding out what the ends are and then stitching them together and then using a smoothing pattern can get rid of the edging is the only solution I have seen that works."


Have you seen the one I posted here:

seamless texturing utility

That's very similar to what TheComet describes but adds a necessary contrast adjustment.



JackDawson
12
Years of Service
User Offline
Joined: 12th Jul 2011
Location:
Posted: 16th Aug 2014 02:24
Yea that's a really nice texture plugin.
Green Gandalf
VIP Member
19
Years of Service
User Offline
Joined: 3rd Jan 2005
Playing: Malevolence:Sword of Ahkranox, Skyrim, Civ6.
Posted: 16th Aug 2014 02:26


I've got a shader version in the pipeline which does the same thing but much faster.



JackDawson
12
Years of Service
User Offline
Joined: 12th Jul 2011
Location:
Posted: 16th Aug 2014 02:43
Very cool. But can you add to that plugin the ability of a terrain like patching ? 1 seamless texture is fairly easy to do, but I have a fascination for tiled textures that blend together without that repetitive pattern look.
Green Gandalf
VIP Member
19
Years of Service
User Offline
Joined: 3rd Jan 2005
Playing: Malevolence:Sword of Ahkranox, Skyrim, Civ6.
Posted: 16th Aug 2014 13:11
Quote: "But can you add to that plugin the ability of a terrain like patching ? 1 seamless texture is fairly easy to do, but I have a fascination for tiled textures that blend together without that repetitive pattern look."


Food for thought - and something along those lines was already on my "to do" list (which seems to be forever getting longer). If you have FPSC(R) then take a look at the terrain shader that comes with it. That shader uses a technique similar to what I have in mind - and in turn is similar to my demo here high resolution texturing.

It should be a simple matter to add additional blending between two (or more?) textures to achieve your patching effect. The contrast enhancement should probably be added too - but I haven't tried that in this context yet.



JackDawson
12
Years of Service
User Offline
Joined: 12th Jul 2011
Location:
Posted: 16th Aug 2014 23:01
Interesting stuff. Thanks GG.
Phaelax
DBPro Master
21
Years of Service
User Offline
Joined: 16th Apr 2003
Location: Metropia
Posted: 16th Aug 2014 23:40
If I recall, perlin noise should automatically be seamless due to how the algorithm works, provided the image size is a power of 2.

Green Gandalf
VIP Member
19
Years of Service
User Offline
Joined: 3rd Jan 2005
Playing: Malevolence:Sword of Ahkranox, Skyrim, Civ6.
Posted: 16th Aug 2014 23:44
It isn't automatically - you need to make sure that the tangents on opposite edges are equal.



GrumpyOne
16
Years of Service
User Offline
Joined: 27th Nov 2007
Location: London, UK
Posted: 17th Aug 2014 03:58 Edited at: 17th Aug 2014 04:01
Hi Guys,

Well I found a solution to make it seamless....although it isn't elegant, but it does work. I used logic statements to wrap the gradient vectors, which is a bit slow, but it still does a 1024x1024 pixel perlin noise image in ~100 millisecs. HLSL shader 3.0 code below:



What I was doing wrong, just in case you were interested, was getting confused over the components of the pointers (AA and PN) to the gradient texture, in terms of the image. This is illustrated below.



Actually this algorithm can generate 3D noise, however, it isn't seamless along the z-axis (mainly because I am using it to generate a 2D texture).


Green Gandalf
VIP Member
19
Years of Service
User Offline
Joined: 3rd Jan 2005
Playing: Malevolence:Sword of Ahkranox, Skyrim, Civ6.
Posted: 17th Aug 2014 20:40 Edited at: 17th Aug 2014 20:42
Quote: "It isn't automatically - you need to make sure that the tangents on opposite edges are equal."


Just realised that is exactly the problem with my shader code.

Not sure what will be the best solution at the moment - either fix my existing shader or fix the seams afterwards.

Quote: "Well I found a solution to make it seamless....although it isn't elegant, but it does work."


Well done! I have a horrible feeling that a solution in my method is going to be equally inelegant. I hate fiddling with messy logic conditions just to get edge constraints to work - but if it has to be done so be it.

I don't know why I thought my code made my images seamless - I must have been misled by the fact they often appeared to be seamless so I was looking elsewhere in my code. But, as Sherlock Holmes said: "When you have eliminated the impossible, whatever remains, however improbable, must be the truth."

I hope you didn't spend time trying to see why my version was seamless (which it isn't).



GrumpyOne
16
Years of Service
User Offline
Joined: 27th Nov 2007
Location: London, UK
Posted: 18th Aug 2014 02:51
GG,

Don't worry...I was pretty obsessed with making mine work (I say mine...but I mean Ken Perlin's, optimised via an anonymous XNA coder, unoptimised by me). Anyway, I can't count the number of bits of code I have stolen from you in the past

There probably is a more elegant way of doing it. The problem lay in swapping components of a 4D vector used as pointers to the gradient texture. If they were in 4 separate floats then I could use modulus statements instead of the nasty logic blocks.

From what I pretend to understand, the GPU doesn't like logic blocks. It does one of three things: (1) all 16 pixel processors run in parallel if they all evaluate the same way, (2) some pixel processors will have to wait whilst the others do their logic block statements and vice versa, (3) all pixel processors run all statements and discard the ones they don't need (not sure how it does this...but I guess the compiler sorts it out beforehand). Anyway, logic blocks suck on parallel processors.
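The trade-off between a logic block and a modulus can be shown with a tiny Python sketch (illustrative only; shader code would use the HLSL equivalents, and both function names are hypothetical). For indices that overshoot by less than one period, the two forms give the same result, but the modulus version has no branch.

```python
def wrap_with_branch(index, period):
    """Wrap an index using an if block (valid for 0 <= index < 2 * period)."""
    if index >= period:
        return index - period
    return index


def wrap_with_modulus(index, period):
    """Wrap an index without branching."""
    return index % period
```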


JackDawson
12
Years of Service
User Offline
Joined: 12th Jul 2011
Location:
Posted: 18th Aug 2014 19:24 Edited at: 18th Aug 2014 19:29
Quote: "From what I pretend to understand, the GPU doesn't like logic blocks."


Yea, I am under the impression from your statement that you are unaware of what Logic Blocks are.

https://en.wikipedia.org/wiki/Logical_block_addressing

Logical Block Addressing ( LBA ) is what is used to keep track of Hard Drive sectors. It's a way to know what is where on the Hard Drive.

I believe what you want is DMA or RDMA.
Example info from NVidia video cards. AMD video cards do something similar.
http://docs.nvidia.com/cuda/gpudirect-rdma/index.html

GLSL and HLSL have to have a way to keep up with memory. Otherwise they wouldn't function correctly. Keep in mind, GLSL and HLSL have to be compiled before they are shoved to the GPU. Once they are on the GPU, the cores of the GPU execute the program. And you can have multiple of these all running on the GPU at once. Think of them as miniature programs all running simultaneously.
Green Gandalf
VIP Member
19
Years of Service
User Offline
Joined: 3rd Jan 2005
Playing: Malevolence:Sword of Ahkranox, Skyrim, Civ6.
Posted: 19th Aug 2014 03:13
I'm fairly sure GrumpyOne is referring to standard if/else blocks etc - in which case I'm sure he's right. I looked at the asm listing provided by Dark Shader and a simple block of the form:



had been unrolled so the full shader asm listing was about 1400 instructions! And there's the synchronisation problem GrumpyOne mentioned as well. That was before I added the code to wrap the edges correctly - which has only been partly successful so far.

Rather weirdly, the current working, but still slightly bugged, version takes forever to load into DBPro but once loaded it runs like a charm. Dark Shader (DS) struggles with it as well. First time I've encountered that with both versions of DS. At least DBPro runs it whereas DS often chokes itself.

Quote: "Don't worry...I was pretty obsessed with making mine work"


Ditto.

I still haven't convincingly identified why my "corrected" versions still exhibit the seam issue. The first octave is OK now but the higher ones often have seams - but not always. This makes me suspect that the issue could be simply a floating point precision issue because the opposite edge coefficients are calculated in two slightly different ways. The random number generator I'm using is designed so that input arguments which are very close numerically can give completely different random values as output.

But I'm not convinced because I would expect the same problem to manifest itself between each octave tile in the image but doesn't appear to. A silly coding oversight is still a possibility but where? Anyway, when I've run out of ideas I'll post a simplified cleaned up version to highlight the issue in the hope that someone else will have an idea.

By the way, I understand how you feel when you mentioned you'd spent a day staring at umpteen different versions of perfect noisy images - but all with seams.



JackDawson
12
Years of Service
User Offline
Joined: 12th Jul 2011
Location:
Posted: 19th Aug 2014 04:04
Quote: "I'm fairly sure GrumpyOne is referring to standard if/else blocks etc - in which case I'm sure he's right."


I should have examined this conversation closer it seems. My apologies then.
GrumpyOne
16
Years of Service
User Offline
Joined: 27th Nov 2007
Location: London, UK
Posted: 20th Aug 2014 00:27 Edited at: 20th Aug 2014 00:29
Actually had a similar problem with my shader. It was fine with just one Octave, but with higher octaves seams appeared. Turned out I was scaling the higher octaves correctly in the function I had been staring at for hours, but I had a rogue modulus statement hiding elsewhere in my gradient texture look-up...a leftover from when I was testing it with just one octave.

The moral of the story "If the code looks as if it absolutely has to be right, then maybe the bug is elsewhere."

Jack...no problem....I did mean the coding "if then else" kind of logic block.

GPUs are strange things...for example....I today made a classic CPU programmer mistake. To find the max and min RGB values in a texture I wrote a two pass technique, the first pass pixel shader stored the max and min RGB in a global, the second pass pixel shader normalized all the pixels according to the globals. Sounds like it should work right? Nope....I forgot that each "instance" of the pixel shader runs on a single pixel, the globals don't appear to be accessible by the other instances of the pixel shader, even in the second pass. Unless GG knows a clever way to do this I may have to resort to memblocks.

edit: actually there is a way...just search all pixels in the texture in every instance of the pixel shader...however...would be very slow.


TheComet
16
Years of Service
User Offline
Joined: 18th Oct 2007
Location: I`m under ur bridge eating ur goatz.
Posted: 20th Aug 2014 02:01 Edited at: 20th Aug 2014 02:10
Well yeah, if everything is happening asynchronously how are you supposed to compare values with one another? They're not guaranteed to exist. You can't do a convolution asynchronously. Doesn't work.

The best thing you can do to speed that process up is to use a butterfly algorithm.

O(n²/2) becomes O(log2(n)).


For a 1920x1080 picture that's approximately 21 passes.
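One way to read this is as a pairwise (tree) reduction, where each pass halves the number of candidates. The Python sketch below is an interpretation rather than TheComet's code: it counts the passes and performs the same halving on a list. On a GPU each pass would be a render to a smaller target; here it is a plain loop, and the names are hypothetical.

```python
def count_passes(n):
    """Number of halving passes needed to reduce n values to one."""
    passes = 0
    while n > 1:
        n = (n + 1) // 2  # each pass compares pairs, rounding up for odd counts
        passes += 1
    return passes


def reduce_max(values):
    """Find the maximum by repeatedly comparing element i with element i + half."""
    data = list(values)
    while len(data) > 1:
        half = (len(data) + 1) // 2
        data = [
            max(data[i], data[i + half]) if i + half < len(data) else data[i]
            for i in range(half)
        ]
    return data[0]


passes_for_1080p = count_passes(1920 * 1080)
```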

Green Gandalf
VIP Member
19
Years of Service
User Offline
Joined: 3rd Jan 2005
Playing: Malevolence:Sword of Ahkranox, Skyrim, Civ6.
Posted: 20th Aug 2014 02:08
Quote: "The moral of the story "If the code looks as if it absolutely has to be right, then maybe the bug is elsewhere.""


Quite possible.

I've taken the case of the second octave, i.e. using 2x2 sub tiles and hard coded all the adjoining and wrapped edges. Result - the image is perfectly seamless with every choice of seed values I've tested. I've tried comparing the two sets of code, i.e. the hard-coded version and the calculated version and just can't see where they differ.

Your comment suggests that the calculated version is using a variable or structure which is calculated incorrectly elsewhere. I suppose one possibility is to see what happens when I replace the calculated lines of code with the hardcoded lines one by one. That might highlight where things go wrong. However, I tried a simpler thing first. I compared the output images to see where they differed. I expected (hoped?) just one sub tile would differ - but no, all four sub tiles differ from the hardcoded version. A bit puzzling - and infuriating.

[Aside: by hard-coded I mean writing things like



instead of



]

Quote: "edit: actually there is a way...just search all pixels in the texture in every instance of the pixel shader...however...would be very slow."


A better solution might be to exploit the parallel aspect of the GPU. Do several passes to a target image blending just a portion of the image using the MAX blend operation. See, for example, (from the DX9 SDK)

Quote: "D3DBLENDOP Enumeration
Defines the supported blend operations. See Remarks for definitions of terms.

Syntax
typedef enum D3DBLENDOP {
D3DBLENDOP_ADD = 1,
D3DBLENDOP_SUBTRACT = 2,
D3DBLENDOP_REVSUBTRACT = 3,
D3DBLENDOP_MIN = 4,
D3DBLENDOP_MAX = 5,
D3DBLENDOP_FORCE_DWORD = 0x7fffffff
} D3DBLENDOP, *LPD3DBLENDOP;

Constants
D3DBLENDOP_ADD
The result is the destination added to the source. Result = Source + Destination

D3DBLENDOP_SUBTRACT
The result is the destination subtracted from to the source. Result = Source - Destination

D3DBLENDOP_REVSUBTRACT
The result is the source subtracted from the destination. Result = Destination - Source

D3DBLENDOP_MIN
The result is the minimum of the source and destination. Result = MIN(Source, Destination)

D3DBLENDOP_MAX
The result is the maximum of the source and destination. Result = MAX(Source, Destination)

D3DBLENDOP_FORCE_DWORD
Forces this enumeration to compile to 32 bits in size. Without this value, some compilers would allow this enumeration to compile to a size other than 32 bits. This value is not used.

"


You could sub-divide the image into, say, a 4x4 grid of sub-images and blend each one with the result of blending the previous one. At the end of those 16 renders you'll have a much smaller image to search through. I've no idea whether that would be a practical way of doing things though.
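A CPU-side Python sketch of that idea, assuming the image dimensions divide evenly by the grid size (the function name and the list-of-rows image format are illustrative). Each sub-image is combined into a small running result with an element-wise maximum, standing in for the MAX blend operation; the same approach works for the minimum.

```python
def max_blend_tiles(image, grid=4):
    """Fold a grid x grid set of sub-images into one tile using element-wise max."""
    height, width = len(image), len(image[0])
    tile_h, tile_w = height // grid, width // grid
    result = [[float("-inf")] * tile_w for _ in range(tile_h)]
    for gy in range(grid):
        for gx in range(grid):
            # One "render": blend this sub-image onto the running result.
            for y in range(tile_h):
                for x in range(tile_w):
                    src = image[gy * tile_h + y][gx * tile_w + x]
                    result[y][x] = max(result[y][x], src)
    return result
```

The reduced tile still contains the overall maximum, so the final search only has to cover one sixteenth of the original pixels.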

Have fun.


