BorlandTalk.com Forum Index BorlandTalk.com
Borland discussion newsgroups
 
Archives   FAQFAQ   SearchSearch   MemberlistMemberlist   UsergroupsUsergroups   RegisterRegister 
 ProfileProfile   Log in to check your private messagesLog in to check your private messages   Log inLog in 

Fastcode CharPosRev B&V 0.4.0

 
Post new topic   Reply to topic    BorlandTalk.com Forum Index -> Delphi Language BASM
View previous topic :: View next topic  
Author Message
Dennis
Guest





PostPosted: Sat Nov 25, 2006 9:11 am    Post subject: Fastcode CharPosRev B&V 0.4.0 Reply with quote



Hi

Lets play a little with simple loop unrolling.

First I pick the fastest of my non-unrolled functions for each CPU.

On Opteron it is CharPosRev_DKC_Pas_8

CharPosRev_DKC_Pas_8_c 0 1447 2036 3483

CharPosRev_DKC_Pas_7_a 8 1469 2065 3534

CharPosRev_DKC_Pas_4_b 0 1697 2602 4299

CharPosRev_DKC_Pas_1_c 4 1706 2622 4328

CharPosRev_DKC_Pas_5_d 4 1712 2618 4330

CharPosRev_DKC_Pas_3_d 0 1723 2608 4331

CharPosRev_DKC_Pas_2_d 0 1735 2619 4354

CharPosRev_DKC_Pas_6_b 0 1735 2623 4358

function CharPosRev_DKC_Pas_8_a(SearchChar : Char; const S: string) :
Integer;
var
Str, StrStart : PChar;
LengthS : Integer;

begin
Result := 0;
LengthS := Length(S);
if LengthS > 0 then
begin
StrStart := Pointer(S);
Str := StrStart + LengthS - 1;//Points to last char in S
repeat
if Str^ = SearchChar then
Break;
Dec(Str);
until (Str < StrStart);
Result := Str-StrStart+1;
end;
end;

Best regards
Dennis Kjaer Christensen

----------------------------------------------------------------------------
----
Jeg beskyttes af den gratis SPAMfighter til privatbrugere.
Den har indtil videre sparet mig for at få 5740 spam-mails
Betalende brugere får ikke denne besked i deres e-mails.
Hent en gratis SPAMfighter her.
Back to top
Dennis
Guest





PostPosted: Sat Nov 25, 2006 3:23 pm    Post subject: Re: Fastcode CharPosRev B&V 0.4.0 Reply with quote



Hi

Simple loop unrolling goes like this. Take the loop

repeat
if Str^ = SearchChar then
Break;
Dec(Str);
until (Str < StrStart);

duplicate it

repeat
if Str^ = SearchChar then
Break;
Dec(Str);
until (Str < StrStart);

if Str^ = SearchChar then
Break;
Dec(Str);
until (Str < StrStart);

The loop break at this condition (Str < StrStart); and the first until is
converted into

if (Str < StrStart) then
Break

The unrolled loop becomes

repeat
if Str^ = SearchChar then
Break;
Dec(Str);

if (Str < StrStart) then
Break
if Str^ = SearchChar then
Break;
Dec(Str);
until (Str < StrStart);

and the entire function becomes

function CharPosRev_DKC_Pas_101_a(SearchChar : Char; const S: string) :
Integer;
var
Str, StrStart : PChar;
LengthS : Integer;

begin
Result := 0;
LengthS := Length(S);
if LengthS > 0 then
begin
StrStart := Pointer(S);
Str := StrStart + LengthS - 1;//Points to last char in S
repeat
if Str^ = SearchChar then
Break;
Dec(Str);
if (Str < StrStart) then
Break;
if Str^ = SearchChar then
Break;
Dec(Str);
until (Str < StrStart);
Result := Str-StrStart+1;
end;
end;

I made a copy of _8 and called it _100

The logic now goes that _100 is zero times unrolled, _101 is one time
unrolled etc.

Best regards
Dennis Kjaer Christensen

----------------------------------------------------------------------------
----
Jeg beskyttes af den gratis SPAMfighter til privatbrugere.
Den har indtil videre sparet mig for at få 5740 spam-mails
Betalende brugere får ikke denne besked i deres e-mails.
Hent en gratis SPAMfighter her.
Back to top
Dennis
Guest





PostPosted: Sat Nov 25, 2006 3:47 pm    Post subject: Re: Fastcode CharPosRev B&V 0.4.0 Reply with quote



Hi

The next step in unrolling is simply to copy paste the block

if Str^ = SearchChar then
Break;
Dec(Str);
if (Str < StrStart) then
Break;

function CharPosRev_DKC_Pas_102_a(SearchChar : Char; const S: string) :
Integer;
var
Str, StrStart : PChar;
LengthS : Integer;

begin
Result := 0;
LengthS := Length(S);
if LengthS > 0 then
begin
StrStart := Pointer(S);
Str := StrStart + LengthS - 1;//Points to last char in S
repeat
if Str^ = SearchChar then
Break;
Dec(Str);
if (Str < StrStart) then
Break;
if Str^ = SearchChar then
Break;
Dec(Str);
if (Str < StrStart) then
Break;
if Str^ = SearchChar then
Break;
Dec(Str);
until (Str < StrStart);
Result := Str-StrStart+1;
end;
end;

What are the benefits of this optimization? All it does is to convert a
backward taken branch on each loop iteration into some forward non taken
branches. This can remove misaligned branch target penalties. Is the cost of
a taken correctly predicted branch the same as a nontaken correctly
predicted branch? On which CPU's - I do not know.

I call this simple unrolling because it does not reduce the number of times
the loop termination condition is evaluated. Real unrolling proccesors
2,4,8,16 bytes for each time the loop condition is checked. This is of
course faster, but not always possible. For AnsiString where we know the
string size up front it is possible, but for zero terminated strings it is
only partly possible. It is not possible if we do not allow reading past the
end of the string. In short - the functions I present here will not be
winners, they will just allow us to learn something. I expect none of them
to be faster than John's _2 function on any CPU's. Also they are much bigger
and will not be winners on any size penalty targets.

Best regards
Dennis Kjaer Christensen

----------------------------------------------------------------------------
----
Jeg beskyttes af den gratis SPAMfighter til privatbrugere.
Den har indtil videre sparet mig for at få 5740 spam-mails
Betalende brugere får ikke denne besked i deres e-mails.
Hent en gratis SPAMfighter her.
Back to top
Dennis
Guest





PostPosted: Sat Nov 25, 2006 4:42 pm    Post subject: Re: Fastcode CharPosRev B&V 0.4.0 Reply with quote

Hi

Here are some numbers

Unrollcount 0 1 2 3 4 5 6 7 8 9
Opteron 3509 3385 3257 3278 3294 3338 3459 3019 3303 3112
Yonah 3018 3068 2975 2935 2951 2912 2870 2845 2857 2815
Dothan 2900 2890 2907 2835 2861 2813 2757 2693 2708 2713
Northwood 2177 2133 2032 2096 2036 2016 2009 2000 1979 2014
Prescott 2744 2654 2604 2577 2560 2523 2535 2525 2514 2495


Spreadsheet is here

http://tech.groups.yahoo.com/group/fastcodeproject/files/CharPosRev/Version%
200.4.0/

Best regards
Dennis Kjaer Christensen

----------------------------------------------------------------------------
----
Jeg beskyttes af den gratis SPAMfighter til privatbrugere.
Den har indtil videre sparet mig for at få 5740 spam-mails
Betalende brugere får ikke denne besked i deres e-mails.
Hent en gratis SPAMfighter her.
Back to top
Dennis
Guest





PostPosted: Sat Nov 25, 2006 4:45 pm    Post subject: Re: Fastcode CharPosRev B&V 0.4.0 Reply with quote

Hi

Released at

http://tech.groups.yahoo.com/group/fastcodeproject/files/CharPosRev/Version%
200.4.0/

This release is dedicated to the unrolling experiment and has the relevant
functions enabled.

A real release with all functions coming very soon

Best regards
Dennis Kjaer Christensen

----------------------------------------------------------------------------
----
Jeg beskyttes af den gratis SPAMfighter til privatbrugere.
Den har indtil videre sparet mig for at få 5740 spam-mails
Betalende brugere får ikke denne besked i deres e-mails.
Hent en gratis SPAMfighter her.
Back to top
Davy Landman
Guest





PostPosted: Sat Nov 25, 2006 5:52 pm    Post subject: Re: Fastcode CharPosRev B&V 0.4.0 Reply with quote

Hi Dennis,

I would like to see my _6 version in the release too, because I'm not sure
if my _7 is faster on all cpu's...
( _6 uses more movzx edx,[esi+ecx+$01..$03] and _7 more movzx
edx,[esi+ecx-$01..$03])

(so just out of curiosity .. please include that one too? Smile)

Thanks,
Davy
"Dennis" <marianndkc (AT) home3 (DOT) gvdnet.dk> wrote in message
news:45681ec7$2 (AT) newsgroups (DOT) borland.com...
Quote:
Hi

Released at

http://tech.groups.yahoo.com/group/fastcodeproject/files/CharPosRev/Version%
200.4.0/

This release is dedicated to the unrolling experiment and has the relevant
functions enabled.

A real release with all functions coming very soon

Best regards
Dennis Kjaer Christensen

----------------------------------------------------------------------------
----
Jeg beskyttes af den gratis SPAMfighter til privatbrugere.
Den har indtil videre sparet mig for at få 5740 spam-mails
Betalende brugere får ikke denne besked i deres e-mails.
Hent en gratis SPAMfighter her.

Back to top
Dennis
Guest





PostPosted: Sat Nov 25, 2006 7:14 pm    Post subject: Re: Fastcode CharPosRev B&V 0.4.0 Reply with quote

Hi Davy

Sure - I just missed it - sorry.

Best regards
Dennis Kjaer Christensen

----------------------------------------------------------------------------
----
Jeg beskyttes af den gratis SPAMfighter til privatbrugere.
Den har indtil videre sparet mig for at få 5740 spam-mails
Betalende brugere får ikke denne besked i deres e-mails.
Hent en gratis SPAMfighter her.
Back to top
Dennis
Guest





PostPosted: Sat Nov 25, 2006 7:23 pm    Post subject: Re: Fastcode CharPosRev B&V 0.4.0 Reply with quote

Hi Davy

I did not include it because it borrows to much from Johns function.

I prefer that we make our own functions and also fix any issues in our own
functions.

Best regards
Dennis Kjaer Christensen

----------------------------------------------------------------------------
----
Jeg beskyttes af den gratis SPAMfighter til privatbrugere.
Den har indtil videre sparet mig for at få 5740 spam-mails
Betalende brugere får ikke denne besked i deres e-mails.
Hent en gratis SPAMfighter her.
Back to top
Dennis
Guest





PostPosted: Sat Nov 25, 2006 7:58 pm    Post subject: Re: Fastcode CharPosRev B&V 0.4.0 Reply with quote

Hi John and Davy

Quote:
I did not include it because it borrows to much from Johns function.

Please correct me if I am wrong.

Best regards
Dennis Kjaer Christensen


----------------------------------------------------------------------------
----
Jeg beskyttes af den gratis SPAMfighter til privatbrugere.
Den har indtil videre sparet mig for at få 5740 spam-mails
Betalende brugere får ikke denne besked i deres e-mails.
Hent en gratis SPAMfighter her.
Back to top
Davy Landman
Guest





PostPosted: Sat Nov 25, 2006 7:58 pm    Post subject: Re: Fastcode CharPosRev B&V 0.4.0 Reply with quote

Quote:
Sorry
No worries, I see you doing a lot of good work


Quote:

I did not include it because it borrows to much from Johns function.

Okay, I totaly agree on that one.


that's why I included the comment with that version...

Regards,
Davy Landman
"Dennis" <marianndkc (AT) home3 (DOT) gvdnet.dk> wrote in message
news:456843d5$1 (AT) newsgroups (DOT) borland.com...
Quote:
Hi Davy

I did not include it because it borrows to much from Johns function.

I prefer that we make our own functions and also fix any issues in our own
functions.

Best regards
Dennis Kjaer Christensen

----------------------------------------------------------------------------
----
Jeg beskyttes af den gratis SPAMfighter til privatbrugere.
Den har indtil videre sparet mig for at få 5740 spam-mails
Betalende brugere får ikke denne besked i deres e-mails.
Hent en gratis SPAMfighter her.

Back to top
Dennis
Guest





PostPosted: Sat Nov 25, 2006 9:47 pm    Post subject: Re: Fastcode CharPosRev B&V 0.4.0 Reply with quote

Hi

Interesting results ;-)

All processors benefit from unrolling.

Opteron benefits most, but the curve has some weird bends. This is probably
due to misalignment of the branch target when execution breaks out of the
loop. Opteron could probably benefit further from unrolling, but the other
processors will benefit only little from going about 9 times unrolling.

Both Prescott and Northwood are running at 2800 MHz, but Northwood is much
faster.

Dothan is sligthly faster than Yonah at the same clock = 1733 MHz.

Opteron is slowest but it runs at 1400 MHz only.

Best regards
Dennis Kjaer Christensen

----------------------------------------------------------------------------
----
Jeg beskyttes af den gratis SPAMfighter til privatbrugere.
Den har indtil videre sparet mig for at få 5740 spam-mails
Betalende brugere får ikke denne besked i deres e-mails.
Hent en gratis SPAMfighter her.
Back to top
Display posts from previous:   
Post new topic   Reply to topic    BorlandTalk.com Forum Index -> Delphi Language BASM All times are GMT
Page 1 of 1

 
Jump to:  
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum


Powered by phpBB © 2001, 2006 phpBB Group
SEO toolkit © 2004-2006 webmedic.