How would I make this code faster?

Programmer X

Joined: 14th Nov 2007

Posted: 3rd Apr 2014 22:00 Edited at: 3rd Apr 2014 22:36

Rem Project: hash decompress
Rem Created: Thursday, March 20, 2014

Rem ***** Main Source File *****

rem setup to hold answers

Dim Answers(3)

rem vector values

xaxa=40
xbxb=40
xcxc=40
minval=10010
lasttotal=1000
total=1000
rem above is starting variables

DecompressVB:
rem when you decompress only yansy is 8bit->40bit or 5 to 1TEV and 5 to 1 2TEV on 16 bit
rem TEVs 3-4
rem 104 bits totalgetans(TEVADD(AngleA,256,AngleA,0,0,2))
rem Setup fourray which is 4x-4x^2 for[1/256,256/256]
global Dim fourray(256)
for AngC=1 to 256
fourray(AngC)=getans(TEVADD(AngC,256,AngC,0,0,2))
print AngC
next AngC
print "D2"
rem Show it has completed

rem search for value in spherical coordinates that matches (R,U,V) however Sin and Cos are approximated with TEV functions.

for AA=1 to 256
for BB=1 to 256
for CC=1 to 256
if AA<256 and AA>0 and BB<256 and BB>0 and CC<256 and CC>0
rem zansz=(CC*1.0*(256-AA))/256
zansz=CC*TEVSUB(BB,0,BB,0,0,4)
yansy=CC*TEVSUB(AA,256,AA,0,0,2)*TEVSUB(BB,256,BB,0,0,2)
rem yansy=CC*1.0/2*twosincossolved(AA,BB)
rem xansy=(CC*1.0/2*getans(TEVADD(0,fourray(AA),fourray(AA),yansy,0,3))*256/fourray(AA))/256
xansy=CC*TEVSUB(AA,256,AA,0,0,2)*TEVSUB(BB,0,BB,0,0,4)
lastval=total

total=abs(xaxa-xansy)+abs(xbxb-yansy)+abs(xcxc-zansz)

rem compare the total difference with the minimum difference.

if total<minval and AA<256 and AA>0 and BB<256 and BB>0 and CC<256 and CC>0

rem Output what angles it would take to get closest to input vectors

Answers(1)=AA
Answers(2)=BB
Answers(3)=CC
minval=total

rem print error

print minval
endif
endif

rem do for all values of R,U,V
next CC
next BB
next AA
do
rem Print the value that appears at points and allows you to change x,y,z to check. xaxa is to avoid duplicates in other programs.
repeat
text 400,10,str$(xaxa)
text 400,30,str$(xbxb)
text 400,50,str$(xcxc)
text 400,70,str$(Answers(1))
text 400,90,str$(Answers(2))
text 400,110,str$(Answers(3))
if upkey()=1
xaxa=xaxa+1
cls
endif
if downkey()=1
xaxa=xaxa-1
cls
endif
if xaxa>256 then xaxa=1
if leftkey()=1
xbxb=xbxb+1
cls
endif
if rightkey()=1
xbxb=xbxb-1
cls
endif
if xbxb>256 then xbxb=1
if returnkey()=1
xcxc=xcxc+1
cls
endif
if shiftkey()=1
xcxc=xcxc-1
cls
endif
if xcxc>256 then xcxc=1
wait 100
until controlkey()=1
if Answers(1)<256 and Answers(1)>0 and Answers(2)<256 and Answers(2)>0 and Answers(3)<256 and Answers(3)>0
zansz=(Answers(3)*1.0*(256-Answers(1)))/256
yansy=Answers(3)*1.0/2*twosincossolved(Answers(1),Answers(2))
xansy=(Answers(3)*1.0/2*getans(TEVADD(0,fourray(Answers(1)),fourray(Answers(1)),yansy,0,3))*256/fourray(Answers(1)))/256
endif
print zansz
print yansy
print xansy
wait key
loop
function twosincossolved(quadA,quadB)

rem table (4S-4S^2) where needed [1,0]
rem table (remember to half for 2T) (1-(2T-T^2)) where needed [1,0]
rem 2*-1/2*b+a*1/2
ans#=TEVSUB(quadA,quadB,128,256,0,3)
rem ans=getans(ans#)
endfunction ans#
function TEVADD(a,b,c,d,k,l)
a#=a
b#=b
c#=c
d#=d
k#=k
a#=a#*1.0/256
b#=b#*1.0/256
c#=c#*1.0/256
d#=d#*1.0/256
k#=k#*1.0/256
l#=2^(l-2)
stepA#=(1.0-c#)
stepB#=(stepA#*a#)
stepB#=getans(stepB#)*1.0/256
stepBB#=(c#*b#)
stepBB#=getans(stepBB#)*1.0/256
stepBBB#=stepBB#+stepB#
stepBBB#=getans(stepBBB#)*1.0/256
stepC#=d#+stepBBB#
stepC#=getans(stepC#)*1.0/256
stepD#=stepC#+k#
stepD#=getans(stepD#)*1.0/256
stepE#=stepD#*l#
endfunction stepE#
function TEVSUB(a,b,c,d,k,l)
a#=a
b#=b
c#=c
d#=d
k#=k
a#=a#*1.0/256
b#=b#*1.0/256
c#=c#*1.0/256
d#=d#*1.0/256
k#=k#*1.0/256
l#=2^(l-2)
stepA#=(1.0-c#)
stepB#=(stepA#*a#)
stepB#=getans(stepB#)*1.0/256
stepBB#=(c#*b#)
stepBB#=getans(stepBB#)*1.0/256
stepBBB#=stepBB#+stepB#
stepBBB#=getans(stepBBB#)*1.0/256
stepC#=d#-stepBBB#
stepC#=getans(stepC#)*1.0/256
stepD#=stepC#+k#
stepD#=getans(stepD#)*1.0/256
stepE#=stepD#*l#
endfunction stepE#
function getans(z#)
m=0
remstart
while z#*256>m
m=m+1
endwhile
remend
m=ceil(256*z#)
endfunction m

Programmer X

Posted: 3rd Apr 2014 22:20

I want to use this to generate a full hash table, but it takes forever.

tiffer

Joined: 6th Apr 2006

Location: Scotland

Posted: 3rd Apr 2014 22:22

It would be very hard for someone else to read this as there's no indentation and very few notes.

Barry Pythagoras

Joined: 14th Mar 2014

Posted: 3rd Apr 2014 23:02 Edited at: 3rd Apr 2014 23:04

Well, this part doesn't need to check whether AA, BB or CC are within certain bounds, because the for/next loops already restrict AA, BB and CC to the range 1 to 256...

for AA=1 to 256
for BB=1 to 256
for CC=1 to 256

What's this for?...

if AA<256 and AA>0 and BB<256 and BB>0 and CC<256 and CC>0

And this?...

if total<minval and AA<256 and AA>0 and BB<256 and BB>0 and CC<256 and CC>0
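
One caveat, offered as an untested sketch: inside these loops the `>0` tests can never fail, but the `<256` tests do skip the value 256. Assuming that skip is intended, ending each loop at 255 has the same effect and lets both `if` bounds tests go, which removes the repeated comparisons from every pass of the inner loop. The loop body is left as a placeholder comment here.

```darkbasic
rem Sketch only: loop limits replace the bounds tests
rem (assumption: 256 is meant to be skipped, as the original "<256" tests do)
for AA=1 to 255
  for BB=1 to 255
    for CC=1 to 255
      rem compute xansy, yansy, zansz and total here, exactly as before,
      rem then compare with "if total<minval" and nothing else
    next CC
  next BB
next AA
```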

Programmer X

Posted: 3rd Apr 2014 23:06

It has to go through all angles K*pi, where K ranges over [1/256, 256/256], to find which angle yields the closest solution.

Barry Pythagoras

Posted: 3rd Apr 2014 23:11

Are AA, BB, and CC changing though? They seem just to be counting.

Programmer X

Posted: 3rd Apr 2014 23:33

They change if I change xaxa, xbxb and xcxc. For example, if I wanted a table for every value, I could just look up CompressThis(xaxa,xbxb,xcxc), where CompressThis is an array or a faster data structure.

Lukas W

Joined: 5th Sep 2003

Location: Sweden

Posted: 3rd Apr 2014 23:43

Here it is, slightly faster:

+ Code Snippet

Rem Project: hash decompress
Rem Created: Thursday, March 20, 2014

Rem ***** Main Source File *****
    Sync On
    Sync Rate 0
    Sync

rem setup to hold answers
    Dim Answers(3)

rem vector values
    xaxa = 40
    xbxb = 40
    xcxc = 40
    
    minval    = 10010
    lasttotal =  1000
    total     =  1000
rem above is starting variables

DecompressVB:
    rem when you decompress only yansy is 8bit->40bit or 5 to 1TEV and 5 to 1 2TEV on 16 bit
    rem TEVs 3-4
    rem 104 bits totalgetans(TEVADD(AngleA,256,AngleA,0,0,2))

rem Setup fourray which is 4x-4x^2 for[1/256,256/256]
    Global DIM fourray(256)
    
    For AngC = 1 To 256
        fourray( AngC ) = getans(  TEVADD( AngC, 256, AngC, 0, 0, 2 )  )
        Print AngC
    Next AngC
rem Show it has completed
    Print "D2" : Sync

rem search for value in spherical coordinates that matches (R,U,V) however Sin and Cos are approximated with TEV functions.
    For AA = 1 To 256
        val2# = TEVSUB( AA, 256, AA, 0, 0, 2 )
            
        For BB = 1 To 256
            val1# = TEVSUB( BB,   0, BB, 0, 0, 4 )
            val3# = TEVSUB( BB, 256, BB, 0, 0, 2 )
            val4# = TEVSUB( BB,   0, BB, 0, 0, 4 )

            For CC = 1 To 256
                    If AA<256 and AA>0 and BB<256 and BB>0 and CC<256 and CC>0
                        zansz = CC * val1#
                        yansy = CC * val2# * val3#
                        xansy = CC * val2# * val4#
                
                        lastval = total
                        
                        total   = ABS(xaxa-xansy) + ABS(xbxb-yansy) + ABS(xcxc-zansz)

                        rem compare the total difference with the minimum difference.
                        If total < minval
        
                            rem Output what angles it would take to get closest to input vectors
                            Answers( 1 ) = AA
                            Answers( 2 ) = BB
                            Answers( 3 ) = CC
                            minval = total
        
                            rem print error
                                Print minval
                        EndIf
                    EndIf

            next CC
        next BB
    next AA

DO
    CLS
    
    rem Print the value that appears at points and allows you to change x,y,z to check. xaxa is to avoid duplicates in other programs.
        repeat
            text 400,10,str$(xaxa)
            text 400,30,str$(xbxb)
            text 400,50,str$(xcxc)
            text 400,70,str$(Answers(1))
            text 400,90,str$(Answers(2))
            text 400,110,str$(Answers(3))
            if upkey()=1
            xaxa=xaxa+1
            cls
            endif
            if downkey()=1
            xaxa=xaxa-1
            cls
            endif
            if xaxa>256 then xaxa=1
            if leftkey()=1
            xbxb=xbxb+1
            cls
            endif
            if rightkey()=1
            xbxb=xbxb-1
            cls
            endif
            if xbxb>256 then xbxb=1
            if returnkey()=1
            xcxc=xcxc+1
            cls
            endif
            if shiftkey()=1
            xcxc=xcxc-1
            cls
            endif
            if xcxc>256 then xcxc=1
            wait 100
            SYNC
        until controlkey()=1


    If Answers(1) < 256 && Answers(1) > 0 && Answers(2) < 256 && Answers(2) > 0 && Answers(3) < 256 && Answers(3)>0
        zansz=(Answers(3)*1.0*(256-Answers(1)))/256
        yansy=Answers(3)*1.0/2*twosincossolved(Answers(1),Answers(2))
        xansy=(Answers(3)*1.0/2*getans(TEVADD(0,fourray(Answers(1)),fourray(Answers(1)),yansy,0,3))*256/fourray(Answers(1)))/256
    EndIf
    
    Print zansz
    Print yansy
    Print xansy

    SYNC
LOOP
function twosincossolved(quadA,quadB)

rem table (4S-4S^2) where needed [1,0]
rem table (remember to half for 2T) (1-(2T-T^2)) where needed [1,0]
rem 2*-1/2*b+a*1/2
ans#=TEVSUB(quadA,quadB,128,256,0,3)
rem ans=getans(ans#)
endfunction ans#
function TEVADD(a,b,c,d,k,l)
a#=a
b#=b
c#=c
d#=d
k#=k
a#=a#*1.0/256
b#=b#*1.0/256
c#=c#*1.0/256
d#=d#*1.0/256
k#=k#*1.0/256
l#=2^(l-2)
stepA#=(1.0-c#)
stepB#=(stepA#*a#)
stepB#=getans(stepB#)*1.0/256
stepBB#=(c#*b#)
stepBB#=getans(stepBB#)*1.0/256
stepBBB#=stepBB#+stepB#
stepBBB#=getans(stepBBB#)*1.0/256
stepC#=d#+stepBBB#
stepC#=getans(stepC#)*1.0/256
stepD#=stepC#+k#
stepD#=getans(stepD#)*1.0/256
stepE#=stepD#*l#
endfunction stepE#


function TEVSUB(a#,b#,c#,d#,k#,l)
        a# = a# * 1.0/256
        b# = b# * 1.0/256
        c# = c# * 1.0/256
        d# = d# * 1.0/256
        k# = k# * 1.0/256
        l# = 2^(l-2)
        
        stepA# = (1.0 - c#)
        
        stepB1# = getans( (stepA#*a#) )         * 1.0/256
        stepB2# = getans( (c# * b#) )           * 1.0/256
        stepB3# = getans( (stepB2# + stepB1#) ) * 1.0/256

        stepC# = getans( (d# - stepB3#) ) * 1.0/256
        stepD# = getans( (stepC# + k#) ) * 1.0/256        
        
        stepE# = stepD#*l#
endfunction stepE#


function getans(z#) : m = ceil( 256*z# )
endfunction m

Programmer X

Posted: 4th Apr 2014 00:18

Thanks.

Kevin Picone

Joined: 27th Aug 2002

Location: Australia

Posted: 4th Apr 2014 07:29 Edited at: 4th Apr 2014 18:55

Unfortunately, none of the TGC compilers (DB, DBPro and AppGameKit) pre-solve literals in expressions the way you might expect.

So in terms of operations, code like the function below includes a lot of redundancy, such as 1.0/256 for example. That means there's a lot of extra floating point division going on in the function, which you can either compute once or rip out completely.

Original Function:

+ Code Snippet

function TEVADD(a,b,c,d,k,l)
  a#=a
  b#=b
  c#=c
  d#=d
  k#=k
  a#=a#*1.0/256
  b#=b#*1.0/256
  c#=c#*1.0/256
  d#=d#*1.0/256
  k#=k#*1.0/256
  l#=2^(l-2)
  stepA#=(1.0-c#)
  stepB#=(stepA#*a#)
  stepB#=getans(stepB#)*1.0/256
  stepBB#=(c#*b#)
  stepBB#=getans(stepBB#)*1.0/256
  stepBBB#=stepBB#+stepB#
  stepBBB#=getans(stepBBB#)*1.0/256
  stepC#=d#+stepBBB#
  stepC#=getans(stepC#)*1.0/256
  stepD#=stepC#+k#
  stepD#=getans(stepD#)*1.0/256
  stepE#=stepD#*l#
endfunction stepE#

Quick Tweak of function:

+ Code Snippet

function TEVADD(a,b,c,d,k,l)
  rem you could really compute this once outside of the function
  Scalar#=1.0/256
 
  a#=a
  b#=b
  c#=c
  d#=d
  k#=k
  a#=a#*Scalar#
  b#=b#*Scalar#
  c#=c#*Scalar#
  d#=d#*Scalar#
  k#=k#*Scalar#

  rem DBpro can't solve these at compile time, so it's doing this power every time you call this function
  l#=2^(l-2)

  stepA#=(1.0-c#)
  stepB#=(stepA#*a#)
  stepB#=getans(stepB#)*Scalar#

  stepBB#=(c#*b#)
  stepBB#=getans(stepBB#)*Scalar#

  stepBBB#=stepBB#+stepB#
  stepBBB#=getans(stepBBB#)*Scalar#

  stepC#=d#+stepBBB#
  stepC#=getans(stepC#)*Scalar#

  stepD#=stepC#+k#
  stepD#=getans(stepD#)*Scalar#

  stepE#=stepD#*l#
endfunction stepE#

Just by doing a little reshuffle, we've pulled 9 of the 10 floating point divisions from that call; the only one left is the single 1.0/256 that sets Scalar#.
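
The power noted in the comment above could be handled the same way. As a rough sketch, assuming `l` only ever takes the values 2, 3 and 4 (the only ones passed anywhere in the code posted in this thread), a small table can be filled once at start-up. `LScale#` is a made-up name, and whether an array read actually beats `^` in DBPro is something to time rather than take on trust.

```darkbasic
rem Sketch only: precompute 2^(l-2) for l = 2..4 (LScale# is a hypothetical name)
global dim LScale#(4)
for p=2 to 4
  LScale#(p)=2^(p-2)
next p

rem inside TEVADD/TEVSUB, "l#=2^(l-2)" would then become "l#=LScale#(l)"
print LScale#(2)
print LScale#(3)
print LScale#(4)
wait key
end
```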

Other things you could look at are inlining the getans(z#) function calls to avoid the call overhead and, of course, computing out any literal expressions manually throughout the rest of the program.
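
As a sketch of what that inlining might look like (untested, and assuming getans stays as posted, i.e. it only returns ceil(256*z#)), each `getans(x#)*Scalar#` becomes `ceil(256*x#)*Scalar#`, so the five function calls disappear from TEVADD:

```darkbasic
rem Sketch only: TEVADD with every getans() call replaced by its body
print TEVADD(128,256,128,0,0,2)
wait key
end

function TEVADD(a,b,c,d,k,l)
  Scalar#=1.0/256
  a#=a
  b#=b
  c#=c
  d#=d
  k#=k
  a#=a#*Scalar#
  b#=b#*Scalar#
  c#=c#*Scalar#
  d#=d#*Scalar#
  k#=k#*Scalar#
  l#=2^(l-2)

  rem each line below used to read getans(x#)*Scalar#
  stepB#=ceil(256*((1.0-c#)*a#))*Scalar#
  stepBB#=ceil(256*(c#*b#))*Scalar#
  stepBBB#=ceil(256*(stepBB#+stepB#))*Scalar#
  stepC#=ceil(256*(d#+stepBBB#))*Scalar#
  stepD#=ceil(256*(stepC#+k#))*Scalar#

  stepE#=stepD#*l#
endfunction stepE#
```

The same substitution would apply to TEVSUB, with the `d#-stepBBB#` subtraction kept in place of the addition.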

You might be able to remove some bogus casting also by changing this,

+ Code Snippet

  a#=a
  b#=b
  c#=c
  d#=d
  k#=k
  a#=a#*Scalar#
  b#=b#*Scalar#
  c#=c#*Scalar#
  d#=d#*Scalar#
  k#=k#*Scalar#

To this,

+ Code Snippet


  a#=a*Scalar#
  b#=b*Scalar#
  c#=c*Scalar#
  d#=d*Scalar#
  k#=k*Scalar#

The FPU can load integer and float data onto the FPU stack/registers directly. Now, assuming DBpro generates this type of code for such statements (i.e. Int * Float), you've removed another 5 bogus moves and casts (int to float) from the function.
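
Since that rests on an assumption about the generated code, it may be worth measuring. The following is only a rough timing sketch: the loop count `n` is arbitrary, and it assumes the millisecond resolution of timer() is fine enough to show a difference at that count.

```darkbasic
rem Timing sketch: copy-then-scale versus scaling the integer directly
Scalar#=1.0/256
n=5000000

t0=timer()
for i=1 to n
  x#=i
  y#=x#*Scalar#
next i
t1=timer()
for i=1 to n
  y#=i*Scalar#
next i
t2=timer()

castms=t1-t0
directms=t2-t1
print "copy then scale (ms): ";castms
print "scale directly (ms):  ";directms
wait key
end
```

If the two figures come out about the same, the compiler is probably doing the conversion either way and the change is purely cosmetic.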
