BorlandTalk.com Forum Index BorlandTalk.com
Borland discussion newsgroups
 
Archives   FAQFAQ   SearchSearch   MemberlistMemberlist   UsergroupsUsergroups   RegisterRegister 
 ProfileProfile   Log in to check your private messagesLog in to check your private messages   Log inLog in 

Fastcode MM B&V 0.39
Goto page 1, 2  Next
 
Post new topic   Reply to topic    BorlandTalk.com Forum Index -> Delphi Language BASM
View previous topic :: View next topic  
Author Message
Dennis
Guest





PostPosted: Tue May 17, 2005 6:41 am    Post subject: Fastcode MM B&V 0.39 Reply with quote



Hi All

I think that Validate11 should be changed a little. Do you agree?

function TMMValidation.Validate11 : Boolean;
var
I, J, K, L, NoOfStrings : Cardinal;
SomeArray : array of Cardinal;
StringArray : array[1..517] of Ansistring;
TempS : AnsiString;
MemoryStatus : TMemoryStatus;
const
BYTESTOALLOCATEMAX : Integer = 320001;

begin
Result := True;

//GlobalMemoryStatus(MemoryStatus);//!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!Move this
try
for I := 0 to BYTESTOALLOCATEMAX-1 do
begin
NoOfStrings := Length(StringArray);
//Clear a string sometimes
if I mod 5 = 0 then
StringArray[Random(NoOfStrings) + 1] := '';
//Grow a string with an 'A'
StringArray[(I mod NoOfStrings) + 1] := StringArray[(I mod
Cardinal(Length(StringArray)))+1] + 'A';
//Grow SomeArray if place for it

GlobalMemoryStatus(MemoryStatus);//!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!To here
if MemoryStatus.dwAvailVirtual >= I+1 then
SetLength(SomeArray, I+1);
SomeArray[I] := I;
//Validate that SomeArray data are not changed
for J := 0 to I do
begin
if SomeArray[J] <> J then
begin
Result := False;
Exit;
end;
end;
//Validate that all strings contains 'A's and nothing else
for K := 1 to Length(StringArray) do
begin
TempS := StringArray[K];
for L := 1 to Length(TempS) do
begin
if TempS[L] <> 'A' then
begin
Result := False;
Exit;
end;
end;
end;
end;
except
Result := False;
end;
end;

Best regards
Dennis


Back to top
Dennis
Guest





PostPosted: Tue May 17, 2005 8:19 am    Post subject: Re: Fastcode MM B&V 0.39 Reply with quote



Validation of big downsized is missing and therefore I suggest inclusion of
this validation function

//Based on Validate11 but validating big array downsize combined with many
small strings resize

function TMMValidation.Validate26 : Boolean;
var
I, J, K, L, I2, NoOfStrings : Cardinal;
SomeArray : array of Cardinal;
StringArray : array[1..10000] of Ansistring;
TempS : AnsiString;
MemoryStatus : TMemoryStatus;
SomeArraySize : Cardinal;
SomeArraySizeFP : Double;
const
NOOFRUNS : Integer = 2000;
SOMEARRAYMAXSIZE : Integer = 100*1024*1024;
SHRINKFACTOR : Double = 0.95;
begin
Result := True;
GlobalMemoryStatus(MemoryStatus);
SomeArraySize := SOMEARRAYMAXSIZE;
SomeArraySizeFP := SomeArraySize;
if MemoryStatus.dwAvailVirtual >= SomeArraySize then
SetLength(SomeArray, SomeArraySize);
//Fill SomeArray
for I2 := 0 to SomeArraySize-1 do
SomeArray[I2] := I2;
try
for I := 0 to NOOFRUNS-1 do
begin
NoOfStrings := Length(StringArray);
//Clear a string sometimes
if I mod 5 = 0 then
StringArray[Random(NoOfStrings) + 1] := '';
//Grow a string with an 'A'
StringArray[(I mod NoOfStrings) + 1] := StringArray[(I mod
Cardinal(Length(StringArray)))+1] + 'A';
//Shrink SomeArray
SomeArraySizeFP := SomeArraySizeFP * SHRINKFACTOR;
SomeArraySize := Round(SomeArraySizeFP);
//SomeArray length always > 0
if SomeArraySize <= 0 then
SomeArraySize := 1;
SetLength(SomeArray, SomeArraySize);
//Validate that SomeArray data are not changed
for J := 0 to SomeArraySize-1 do
begin
if SomeArray[J] <> J then
begin
Result := False;
Exit;
end;
end;
//Validate that all strings contains 'A's and nothing else
for K := 1 to Length(StringArray) do
begin
TempS := StringArray[K];
for L := 1 to Length(TempS) do
begin
if TempS[L] <> 'A' then
begin
Result := False;
Exit;
end;
end;
end;
end;
except
Result := False;
end;
end;

Dennis


Back to top
Dennis
Guest





PostPosted: Tue May 17, 2005 8:36 am    Post subject: Re: Fastcode MM B&V 0.39 Reply with quote



All MM's except Recycler passes this new validation

Regards
Dennis


Back to top
Dennis
Guest





PostPosted: Fri May 20, 2005 7:24 pm    Post subject: Re: Fastcode MM B&V 0.39 Reply with quote

Hi

Who volunteers to build this release?

Regards
Dennis


Back to top
Dennis
Guest





PostPosted: Fri May 20, 2005 7:41 pm    Post subject: Re: Fastcode MM B&V 0.39 Reply with quote

Todo list that must be done to reach version 1.0:

-MultiThread validation
-BPL validation?
-Dll validation in B&V tool?
-bmMemoryAccessSpeed benchmark with non serial memory access pattern
(Selection sort on array of ??)?
-A new RecyclerMM
-FastMM4
-CPUID based selection of Move function for Move benchmarks
-CPUID based selection of FillChar function for MultiThreadFillChar
benchmark
-Add missing Dan Downs replay
-Add missing Nathanial Wools replay
-Eric Grange suggestions

Mimic a realistic memory usage in the Replay Benchmark.
Some ideas from previous posts (from Eric, if I remember well):
- Fillchar on every new allocation
- Reading randomly in previous allocations
- Filling every 64th byte (to be sure we're using all cache lines)

-Fix messed collumn display when running benchmark all
-Settle Validate23 discussion?
-etc ... help me ;-)

Regards
Dennis


Back to top
RainMaker
Guest





PostPosted: Sat May 21, 2005 9:54 am    Post subject: A Req to the FastCode Team Reply with quote

Hi,
please can you post some tutorials for a beginners users ;)

thanks

Back to top
Dennis
Guest





PostPosted: Sat May 21, 2005 10:13 am    Post subject: Re: Fastcode MM B&V 0.39 Reply with quote

Hi

I have posted a new proposal for a validation. It triggers the fragmentation
issue in the RTL MM.

It is a stripped down version of a simulator I made to simulate the decline
in the number of longhair carriers in a population on shorthaired cats under
different breeding rules.

http://www.gugs-cats.dk/Diverse.htm

I tested most MM's and they all pass, with the exception of the RTL MM.

Should we include it in the B&V?

Regards
Dennis


Back to top
Dennis
Guest





PostPosted: Sat May 21, 2005 10:17 am    Post subject: Re: A Req to the FastCode Team Reply with quote

Hi

Quote:
please can you post some tutorials for a beginners users Wink

Yes.

Add the MM as the very first unit in the uses clause of your .dpr.

program LongHairSimulator;

uses
BucketMem.pas//A replacement MM
Forms,
MainUnit in 'MainUnit.pas' {MainForm};

{$R *.res}

begin
Application.Initialize;
Application.CreateForm(TMainForm, MainForm);
Application.Run;
end.

Regards
Dennis



Back to top
Dennis
Guest





PostPosted: Sat May 21, 2005 3:02 pm    Post subject: Re: Fastcode MM B&V 0.39 Reply with quote

Hi

Why is this benchmark called "DownSizeTest" ?

for n := 1 to 50 do // loop added to have more than 1000 MTicks for
this benchmark
begin
{Allocate a lot of strings}
SetLength(FStrings, 50000);
for i := 0 to high(FStrings) do
begin
{Grab a 20K block}
SetLength(FStrings[i], 20000);
{Reduce the size to 1 byte}
SetLength(FStrings[i], 1);
end;
{Update the peak address space usage}
UpdateUsageStatistics;
end;

In the first interation it allocates 20000 bytes for each string and then
downsize it to 1 byte + 8 byte.

This we could call alloc + downsize benchmark

in the next iteration it upsizes all strings from 1+8 byte to 20000 bytes
and then downsizes them again

This we could call upsize + downsize benchmark

The overall name of the benchmark should be UpsizeAndDownsize benchmark?

Regards
Dennis


Back to top
Dennis
Guest





PostPosted: Sun May 22, 2005 6:19 pm    Post subject: Re: Fastcode MM B&V 0.39 Reply with quote

Hi

I just created a benchmark

Quote:
-bmMemoryAccessSpeed benchmark with non serial memory access pattern
(Selection sort on array of Integer)

unit SortIntArrayBenchmark1Unit;

interface

uses Windows, BenchmarkClassUnit, Classes, Math;

type

TSortIntArrayThreads = class(TFastcodeMMBenchmark)
public
procedure RunBenchmark; override;
class function GetBenchmarkName: string; override;
class function GetBenchmarkDescription: string; override;
class function GetSpeedWeight: Double; override;
class function GetCategory: TBenchmarkCategory; override;
end;

implementation

uses SysUtils;

type

TSortIntArrayThread = class(TThread)
FBenchmark: TFastcodeMMBenchmark;
procedure Execute; override;
end;

procedure TSortIntArrayThread.Execute;
var
IntArray : array of Integer;
Size, I1, I2, I3, IndexMax, Temp, Max : Integer;
const
MINSIZE : Integer = 500;
MAXSIZE : Integer = 1500;

begin
for Size := MINSIZE to MAXSIZE do
begin
SetLength(IntArray, Size);
//Fill array with random values
for I1 := 0 to Size-1 do
begin
IntArray[I1] := Random(100);
end;
//Sort array just to create an acces pattern
for I2 := 0 to Size-2 do
begin
//Find biggest element in unsorted part of array
Max := IntArray[I2];
IndexMax := I2;
for I3 := I2+1 to Size-1 do
begin
if IntArray[I3] > Max then
begin
Max := IntArray[I3];
IndexMax := I3;
end;
end;
//Swap current element with biggest remaining element
Temp := IntArray[I2];
IntArray[I2] := IntArray[IndexMax];
IntArray[IndexMax] := Temp;
end;
end;
//"Free" array
SetLength(IntArray, 0);
FBenchmark.UpdateUsageStatistics;
end;

class function TSortIntArrayThreads.GetBenchmarkDescription: string;
begin
Result := 'A benchmark that measures read and write speed to an array of
Integer. '
+ 'Access pattern is created by selection sorting array of random
values. '
+ 'Measures memory usage after all blocks have been freed. '
+ 'Benchmark submitted by Dennis Kjaer Christensen.';
end;

class function TSortIntArrayThreads.GetBenchmarkName: string;
begin
Result := 'SortIntegerArrayBenchmark';
end;

class function TSortIntArrayThreads.GetCategory: TBenchmarkCategory;
begin
Result := bmMemoryAccessSpeed;
end;

class function TSortIntArrayThreads.GetSpeedWeight: Double;
begin
Result := 0.75;
end;

procedure TSortIntArrayThreads.RunBenchmark;
var
SortIntArrayThread : TSortIntArrayThread;

begin
inherited;
SortIntArrayThread := TSortIntArrayThread.Create(True);
SortIntArrayThread.FreeOnTerminate := False;
SortIntArrayThread.FBenchmark := Self;
SortIntArrayThread.Resume;
SortIntArrayThread.WaitFor;
SortIntArrayThread.Free;
end;

end.

Regards
Dennis



Back to top
Dennis
Guest





PostPosted: Mon May 23, 2005 2:35 pm    Post subject: Re: Fastcode MM B&V 0.39 Reply with quote

Hi

Here is a new benchmark. Could you help me make the dyn array 16 byte
aligned if the MM 16 byte aligns?

ExtArray : array of TExtended; //Not aligned due to 8 byte
offset
of first element

Regards
Dennis

unit SortExtendedArrayBenchmark1Unit;

interface

uses Windows, BenchmarkClassUnit, Classes, Math;

type

TSortExtendedArrayThreads = class(TFastcodeMMBenchmark)
public
procedure RunBenchmark; override;
class function GetBenchmarkName: string; override;
class function GetBenchmarkDescription: string; override;
class function GetSpeedWeight: Double; override;
class function GetCategory: TBenchmarkCategory; override;
end;

implementation

uses SysUtils;

type

TSortExtendedArrayThread = class(TThread)
FBenchmark: TFastcodeMMBenchmark;
procedure Execute; override;
end;

TExtended = record
X : Extended;
Pad1, Pad2, Pad3, Pad4, Pad5, Pad6 : Byte;
end;

procedure TSortExtendedArrayThread.Execute;
var
ExtArray : array of TExtended; //Not aligned due to 8 byte
offset of first element
Size, I1, I2, I3, IndexMax, RunNo : Integer;
Temp, Max : Extended;
const
MAXRUNNO : Integer = 10;
MAXELEMENTVALUE : Integer = MAXINT;
MINSIZE : Integer = 100;
MAXSIZE : Integer = 10000;

begin
for RunNo := 1 to MAXRUNNO do
begin
Size := Random(MAXSIZE-MINSIZE) + MINSIZE;
SetLength(ExtArray, Size);
//Fill array with random values
for I1 := 0 to Size-1 do
begin
ExtArray[I1].X := Random(MAXELEMENTVALUE);
end;
//Sort array just to create an acces pattern
for I2 := Size-1 downto 1 do
begin
//Find biggest element in unsorted part of array
Max := ExtArray[I2].X;
IndexMax := I2;
for I3 := I2-1 downto 0 do
begin
if ExtArray[I3].X > Max then
begin
Max := ExtArray[I3].X;
IndexMax := I3;
end;
end;
//Swap current element with biggest remaining element
Temp := ExtArray[I2].X;
ExtArray[I2].X := ExtArray[IndexMax].X;
ExtArray[IndexMax].X := Temp;
end;
end;
//"Free" array
SetLength(ExtArray, 0);
FBenchmark.UpdateUsageStatistics;
end;

class function TSortExtendedArrayThreads.GetBenchmarkDescription: string;
begin
Result := 'A benchmark that measures read and write speed to an array of
Extendeds. '
+ 'The Extended type is padded to fill 16 byte. Then it is
possible to align it '
+ 'Bonus is given for 16 byte alignment of array'
+ 'Access pattern is created by X sorting array of random
values.
'
+ 'Measures memory usage after all blocks have been freed. '
+ 'Benchmark submitted by Dennis Kjaer Christensen.';
end;

class function TSortExtendedArrayThreads.GetBenchmarkName: string;
begin
Result := 'SortExtendedArrayBenchmark';
end;

class function TSortExtendedArrayThreads.GetCategory: TBenchmarkCategory;
begin
Result := bmMemoryAccessSpeed;
end;

class function TSortExtendedArrayThreads.GetSpeedWeight: Double;
begin
Result := 0.75;
end;

procedure TSortExtendedArrayThreads.RunBenchmark;
var
SortExtendedArrayThread1, SortExtendedArrayThread2 :
TSortExtendedArrayThread;
SortExtendedArrayThread3, SortExtendedArrayThread4 :
TSortExtendedArrayThread;

begin
inherited;
SortExtendedArrayThread1 := TSortExtendedArrayThread.Create(True);
SortExtendedArrayThread2 := TSortExtendedArrayThread.Create(True);
SortExtendedArrayThread3 := TSortExtendedArrayThread.Create(True);
SortExtendedArrayThread4 := TSortExtendedArrayThread.Create(True);
SortExtendedArrayThread1.FreeOnTerminate := False;
SortExtendedArrayThread2.FreeOnTerminate := False;
SortExtendedArrayThread3.FreeOnTerminate := False;
SortExtendedArrayThread4.FreeOnTerminate := False;
SortExtendedArrayThread1.Priority := tpLower;
SortExtendedArrayThread2.Priority := tpNormal;
SortExtendedArrayThread3.Priority := tpHigher;
SortExtendedArrayThread4.Priority := tpHighest;
SortExtendedArrayThread1.FBenchmark := Self;
SortExtendedArrayThread2.FBenchmark := Self;
SortExtendedArrayThread3.FBenchmark := Self;
SortExtendedArrayThread4.FBenchmark := Self;
SortExtendedArrayThread1.Resume;
SortExtendedArrayThread2.Resume;
SortExtendedArrayThread3.Resume;
SortExtendedArrayThread4.Resume;
SortExtendedArrayThread1.WaitFor;
SortExtendedArrayThread2.WaitFor;
SortExtendedArrayThread3.WaitFor;
SortExtendedArrayThread4.WaitFor;
SortExtendedArrayThread1.Free;
SortExtendedArrayThread2.Free;
SortExtendedArrayThread3.Free;
SortExtendedArrayThread4.Free;
end;

end.


--
Jeg beskyttes af den gratis SPAMfighter til privatbrugere.
Den har indtil videre sparet mig for at få 180 spam-mails.
Betalende brugere får ikke denne besked i deres e-mails.
Hent den gratis her: www.spamfighter.dk


Back to top
Dennis
Guest





PostPosted: Mon May 23, 2005 8:04 pm    Post subject: Re: Fastcode MM B&V 0.39 Reply with quote

Hi

A very unimportant release ;-)

I hope very much that somebody will make a release soon. No matter how small
it is.

Regards
Dennis

unit DKC_IA32_MM_Unit;

//Version 0.9 23-5-2005

interface

implementation

uses
Windows;

type
TAllocType = record
PBlock : Pointer;
InternalSize : Integer;
ExternalSize : Integer;
SmallAlloc : Boolean;
end;

TAllocTypeArray = array[0..1000000] of TAllocType;
PAllocTypeArray = ^TAllocTypeArray;

var
Heap: THandle;
RTLCriticalSection : TRTLCriticalSection;
LastUsedIndexGlobal : Integer;
AllocArraySize : integer;
NoOfLivePointers: integer;
AllocArray : PAllocTypeArray;

const
ALLOCARRAYINITIALSIZE : Integer = 1000;
SPLITSIZE : Integer = 100*1024*1024;
FLAGS : Cardinal = HEAP_NO_SERIALIZE;
SMALLMOVESIZE = 36;//For JOH Move
SHRINKSIZE : Integer = 200;
GROWSIZE : Integer = 200;
MAXNOOFDEFRAGROUNDS : Integer = 100;
//For upsize
OVERALLOCPERCENTAGESMALLUPSIZE : Double = 2;
OVERALLOCPERCENTAGEBIGUPSIZE : Double = 1.1;
OVERALLOCEXTRA : Integer = 32;
//For downsize
OVERALLOCPERCENTAGESMALLDOWNSIZE : Double = 1/2;
OVERALLOCPERCENTAGEBIGDOWNSIZE : Double = 1/1.1;

{-------------------------------------------------------------------------}
{Perform Forward Move of 0..36 Bytes}
{On Entry, ECX = Count, EAX = Source+Count, EDX = Dest+Count. Destroys ECX}
procedure SmallForwardMove_9;
asm
jmp dword ptr [@@FwdJumpTable+ecx*4]
nop {Align Jump Table}
@@FwdJumpTable:
dd @@Done {Removes need to test for zero size move}
dd @@Fwd01, @@Fwd02, @@Fwd03, @@Fwd04, @@Fwd05, @@Fwd06, @@Fwd07,
@@Fwd08
dd @@Fwd09, @@Fwd10, @@Fwd11, @@Fwd12, @@Fwd13, @@Fwd14, @@Fwd15,
@@Fwd16
dd @@Fwd17, @@Fwd18, @@Fwd19, @@Fwd20, @@Fwd21, @@Fwd22, @@Fwd23,
@@Fwd24
dd @@Fwd25, @@Fwd26, @@Fwd27, @@Fwd28, @@Fwd29, @@Fwd30, @@Fwd31,
@@Fwd32
dd @@Fwd33, @@Fwd34, @@Fwd35, @@Fwd36
@@Fwd36:
mov ecx, [eax-36]
mov [edx-36], ecx
@@Fwd32:
mov ecx, [eax-32]
mov [edx-32], ecx
@@Fwd28:
mov ecx, [eax-28]
mov [edx-28], ecx
@@Fwd24:
mov ecx, [eax-24]
mov [edx-24], ecx
@@Fwd20:
mov ecx, [eax-20]
mov [edx-20], ecx
@@Fwd16:
mov ecx, [eax-16]
mov [edx-16], ecx
@@Fwd12:
mov ecx, [eax-12]
mov [edx-12], ecx
@@Fwd08:
mov ecx, [eax-8]
mov [edx-8], ecx
@@Fwd04:
mov ecx, [eax-4]
mov [edx-4], ecx
ret
nop
@@Fwd35:
mov ecx, [eax-35]
mov [edx-35], ecx
@@Fwd31:
mov ecx, [eax-31]
mov [edx-31], ecx
@@Fwd27:
mov ecx, [eax-27]
mov [edx-27], ecx
@@Fwd23:
mov ecx, [eax-23]
mov [edx-23], ecx
@@Fwd19:
mov ecx, [eax-19]
mov [edx-19], ecx
@@Fwd15:
mov ecx, [eax-15]
mov [edx-15], ecx
@@Fwd11:
mov ecx, [eax-11]
mov [edx-11], ecx
@@Fwd07:
mov ecx, [eax-7]
mov [edx-7], ecx
mov ecx, [eax-4]
mov [edx-4], ecx
ret
nop
@@Fwd03:
movzx ecx, word ptr [eax-3]
mov [edx-3], cx
movzx ecx, byte ptr [eax-1]
mov [edx-1], cl
ret
@@Fwd34:
mov ecx, [eax-34]
mov [edx-34], ecx
@@Fwd30:
mov ecx, [eax-30]
mov [edx-30], ecx
@@Fwd26:
mov ecx, [eax-26]
mov [edx-26], ecx
@@Fwd22:
mov ecx, [eax-22]
mov [edx-22], ecx
@@Fwd18:
mov ecx, [eax-18]
mov [edx-18], ecx
@@Fwd14:
mov ecx, [eax-14]
mov [edx-14], ecx
@@Fwd10:
mov ecx, [eax-10]
mov [edx-10], ecx
@@Fwd06:
mov ecx, [eax-6]
mov [edx-6], ecx
@@Fwd02:
movzx ecx, word ptr [eax-2]
mov [edx-2], cx
ret
nop
nop
nop
@@Fwd33:
mov ecx, [eax-33]
mov [edx-33], ecx
@@Fwd29:
mov ecx, [eax-29]
mov [edx-29], ecx
@@Fwd25:
mov ecx, [eax-25]
mov [edx-25], ecx
@@Fwd21:
mov ecx, [eax-21]
mov [edx-21], ecx
@@Fwd17:
mov ecx, [eax-17]
mov [edx-17], ecx
@@Fwd13:
mov ecx, [eax-13]
mov [edx-13], ecx
@@Fwd09:
mov ecx, [eax-9]
mov [edx-9], ecx
@@Fwd05:
mov ecx, [eax-5]
mov [edx-5], ecx
@@Fwd01:
movzx ecx, byte ptr [eax-1]
mov [edx-1], cl
ret
@@Done:
end; {SmallForwardMove}

{-------------------------------------------------------------------------}
{Perform Backward Move of 0..36 Bytes}
{On Entry, ECX = Count, EAX = Source, EDX = Dest. Destroys ECX}
procedure SmallBackwardMove_9;
asm
jmp dword ptr [@@BwdJumpTable+ecx*4]
nop {Align Jump Table}
@@BwdJumpTable:
dd @@Done {Removes need to test for zero size move}
dd @@Bwd01, @@Bwd02, @@Bwd03, @@Bwd04, @@Bwd05, @@Bwd06, @@Bwd07,
@@Bwd08
dd @@Bwd09, @@Bwd10, @@Bwd11, @@Bwd12, @@Bwd13, @@Bwd14, @@Bwd15,
@@Bwd16
dd @@Bwd17, @@Bwd18, @@Bwd19, @@Bwd20, @@Bwd21, @@Bwd22, @@Bwd23,
@@Bwd24
dd @@Bwd25, @@Bwd26, @@Bwd27, @@Bwd28, @@Bwd29, @@Bwd30, @@Bwd31,
@@Bwd32
dd @@Bwd33, @@Bwd34, @@Bwd35, @@Bwd36
@@Bwd36:
mov ecx, [eax+32]
mov [edx+32], ecx
@@Bwd32:
mov ecx, [eax+28]
mov [edx+28], ecx
@@Bwd28:
mov ecx, [eax+24]
mov [edx+24], ecx
@@Bwd24:
mov ecx, [eax+20]
mov [edx+20], ecx
@@Bwd20:
mov ecx, [eax+16]
mov [edx+16], ecx
@@Bwd16:
mov ecx, [eax+12]
mov [edx+12], ecx
@@Bwd12:
mov ecx, [eax+8]
mov [edx+8], ecx
@@Bwd08:
mov ecx, [eax+4]
mov [edx+4], ecx
@@Bwd04:
mov ecx, [eax]
mov [edx], ecx
ret
nop
nop
nop
@@Bwd35:
mov ecx, [eax+31]
mov [edx+31], ecx
@@Bwd31:
mov ecx, [eax+27]
mov [edx+27], ecx
@@Bwd27:
mov ecx, [eax+23]
mov [edx+23], ecx
@@Bwd23:
mov ecx, [eax+19]
mov [edx+19], ecx
@@Bwd19:
mov ecx, [eax+15]
mov [edx+15], ecx
@@Bwd15:
mov ecx, [eax+11]
mov [edx+11], ecx
@@Bwd11:
mov ecx, [eax+7]
mov [edx+7], ecx
@@Bwd07:
mov ecx, [eax+3]
mov [edx+3], ecx
mov ecx, [eax]
mov [edx], ecx
ret
nop
nop
nop
@@Bwd03:
movzx ecx, word ptr [eax+1]
mov [edx+1], cx
movzx ecx, byte ptr [eax]
mov [edx], cl
ret
nop
nop
@@Bwd34:
mov ecx, [eax+30]
mov [edx+30], ecx
@@Bwd30:
mov ecx, [eax+26]
mov [edx+26], ecx
@@Bwd26:
mov ecx, [eax+22]
mov [edx+22], ecx
@@Bwd22:
mov ecx, [eax+18]
mov [edx+18], ecx
@@Bwd18:
mov ecx, [eax+14]
mov [edx+14], ecx
@@Bwd14:
mov ecx, [eax+10]
mov [edx+10], ecx
@@Bwd10:
mov ecx, [eax+6]
mov [edx+6], ecx
@@Bwd06:
mov ecx, [eax+2]
mov [edx+2], ecx
@@Bwd02:
movzx ecx, word ptr [eax]
mov [edx], cx
ret
nop
@@Bwd33:
mov ecx, [eax+29]
mov [edx+29], ecx
@@Bwd29:
mov ecx, [eax+25]
mov [edx+25], ecx
@@Bwd25:
mov ecx, [eax+21]
mov [edx+21], ecx
@@Bwd21:
mov ecx, [eax+17]
mov [edx+17], ecx
@@Bwd17:
mov ecx, [eax+13]
mov [edx+13], ecx
@@Bwd13:
mov ecx, [eax+9]
mov [edx+9], ecx
@@Bwd09:
mov ecx, [eax+5]
mov [edx+5], ecx
@@Bwd05:
mov ecx, [eax+1]
mov [edx+1], ecx
@@Bwd01:
movzx ecx, byte ptr[eax]
mov [edx], cl
ret
nop
nop
@@Done:
end; {SmallBackwardMove}

{-------------------------------------------------------------------------}
{Move ECX Bytes from EAX to EDX, where EAX > EDX and ECX > 36
(SMALLMOVESIZE)}
procedure Forwards_IA32_9;
asm
push edx
fild qword ptr [eax]
lea eax, [eax+ecx-8]
lea ecx, [ecx+edx-8]
fild qword ptr [eax]
push ecx
neg ecx
and edx, -8
lea ecx, [ecx+edx+8]
pop edx
@FwdLoop:
fild qword ptr [eax+ecx]
fistp qword ptr [edx+ecx]
add ecx, 8
jl @FwdLoop
fistp qword ptr [edx]
pop edx
fistp qword ptr [edx]
end; {Forwards_IA32}

{-------------------------------------------------------------------------}
{Move ECX Bytes from EAX to EDX, where EAX < EDX and ECX > 36
(SMALLMOVESIZE)}
procedure Backwards_IA32_9;
asm
sub ecx, 8
push ecx
fild qword ptr [eax+ecx] {Last 8}
fild qword ptr [eax] {First 8}
add ecx, edx
and ecx, -8
sub ecx, edx
@BwdLoop:
fild qword ptr [eax+ecx]
fistp qword ptr [edx+ecx]
sub ecx, 8
jg @BwdLoop
pop ecx
fistp qword ptr [edx] {First 8}
fistp qword ptr [edx+ecx] {Last 8}
end; {Backwards_IA32}

{-------------------------------------------------------------------------}
{Move using IA32 Instruction Set Only}
procedure MoveJOH_IA32_9(const Source; var Dest; Count : Integer);
asm
cmp ecx, SMALLMOVESIZE
ja @Large {Count > SMALLMOVESIZE or Count < 0}
cmp eax, edx
jbe @SmallCheck
add eax, ecx
add edx, ecx
jmp SmallForwardMove_9
@SmallCheck:
jne SmallBackwardMove_9
ret {For Compatibility with Delphi's move for Source = Dest}
@Large:
jng @Done {For Compatibility with Delphi's move for Count < 0}
cmp eax, edx
ja Forwards_IA32_9
je @Done {For Compatibility with Delphi's move for Source = Dest}
sub edx, ecx
cmp eax, edx
lea edx, [edx+ecx]
jna Forwards_IA32_9
jmp Backwards_IA32_9 {Source/Dest Overlap}
@Done:
end; {MoveJOH_IA32}

procedure InitializeAllocArray(StartIndex, StopIndex : Integer);
var
Index : Integer;

begin
{ if (StartIndex < 0) or (StartIndex > AllocArraySize-1) then
RunError(203);
if (StopIndex < 0) or (StopIndex > AllocArraySize-1) then
RunError(203);}
for Index := StartIndex to StopIndex do
begin
AllocArray[Index].PBlock := nil;
//AllocArray[Index].Active := False;
AllocArray[Index].InternalSize := 0;
AllocArray[Index].ExternalSize := 0;
AllocArray[Index].SmallAlloc := True
end;
end;

function GrowAllocArray(var AllocArray : PAllocTypeArray; OldSize : Integer)
: Integer;
var
NewSize : Integer;

begin
NewSize := OldSize + GROWSIZE;
AllocArray := HeapRealloc(Heap, FLAGS, AllocArray , NewSize *
SizeOf(TAllocType));
if AllocArray = nil then
begin
//Try with a smaller GROWSIZE
NewSize := OldSize + 1;
AllocArray := HeapRealloc(Heap, FLAGS, AllocArray , NewSize *
SizeOf(TAllocType));
if AllocArray <> nil then
begin
InitializeAllocArray(OldSize, NewSize-1);
Result := NewSize;
end
else
begin
RunError(203);
Result := 0;//For compiler
end;
end
else
begin
InitializeAllocArray(OldSize, NewSize-1);
Result := NewSize;
end;
end;

function ShrinkAllocArray(var AllocArray : PAllocTypeArray; OldSize :
Integer) : Integer;
var
HighIndex, NewSize, Index, LowIndex, NoOfDefragRounds : Integer;

begin
NoOfDefragRounds := 0;
HighIndex := AllocArraySize;
LowIndex := 0;
repeat
//Find highest used index
repeat
Dec(HighIndex);
until(AllocArray[HighIndex].PBlock <> nil);
//Find lowest unused index
repeat
Inc(LowIndex);
until(AllocArray[LowIndex].PBlock = nil);
if LowIndex < HighIndex then
begin
//Copy HighIndex to LowIndex
AllocArray[LowIndex].PBlock := AllocArray[HighIndex].PBlock;
AllocArray[LowIndex].InternalSize := AllocArray[HighIndex].InternalSize;
AllocArray[LowIndex].ExternalSize := AllocArray[HighIndex].ExternalSize;
AllocArray[LowIndex].SmallAlloc := AllocArray[HighIndex].SmallAlloc;
//Clear HighIndex
AllocArray[HighIndex].PBlock := nil;
AllocArray[HighIndex].InternalSize := 0;
AllocArray[HighIndex].ExternalSize := 0;
AllocArray[HighIndex].SmallAlloc := True
end
else
//No more to defrag
Break;
Inc(NoOfDefragRounds)
until(NoOfDefragRounds >= MAXNOOFDEFRAGROUNDS);
//Find highest used index
Index := HighIndex;
repeat
Dec(Index);
until(AllocArray[Index].PBlock <> nil);
NewSize := OldSize - SHRINKSIZE;
//Do not shrink below any used entries
if NewSize < Index+1 then
NewSize := Index+1;
if NewSize < OldSize then
begin
AllocArray := HeapRealloc(Heap, FLAGS, AllocArray , NewSize *
SizeOf(TAllocType));
LastUsedIndexGlobal := NewSize-1;
end;
if AllocArray = nil then
RunError(203);
Result := NewSize;
end;

function GetSize(Index : Integer) : Integer; overload;
begin
{if (Index < 0) or (Index > AllocArraySize-1) then
begin
Result := 0; //For compiler
RunError(203);
Exit;
end;}
Result := AllocArray[Index].InternalSize;
end;

function GetExternalSize(Index : Integer) : Integer;
begin
{if (Index < 0) or (Index > AllocArraySize-1) then
begin
Result := 0; //For compiler
RunError(203);
Exit;
end;}
Result := AllocArray[Index].ExternalSize;
end;

procedure SetSize(Index, NewSize, ExternalSize : Integer);
begin
{ if (Index < 0) or (Index > AllocArraySize-1) then
begin
RunError(203);
Exit;
end;}
AllocArray[Index].InternalSize := NewSize;
AllocArray[Index].ExternalSize := ExternalSize;
end;

function GetSmallAlloc(Index : Integer) : Boolean; overload;
begin
{ if (Index < 0) or (Index > AllocArraySize-1) then
begin
Result := True; //For compiler
RunError(203);
Exit;
end;}
Result := AllocArray[Index].SmallAlloc;
end;

procedure AddToAllocTypeArray(P : Pointer; SmallAlloc : Boolean; Size,
ExternalSize : Integer);
var
Index1, Index2 : Integer;

begin
Inc(NoOfLivePointers);
if NoOfLivePointers > AllocArraySize then
AllocArraySize := GrowAllocArray(AllocArray, AllocArraySize);
Index1 := LastUsedIndexGlobal;
Index2 := LastUsedIndexGlobal+1;
if Index1 < 0 then
Index1 := 0;
if Index2 > AllocArraySize-1 then
Index2 := AllocArraySize-1;
repeat
//if (AllocArray[Index1].Active = False) then
if (AllocArray[Index1].PBlock = nil) then
begin
LastUsedIndexGlobal := Index1;
AllocArray[Index1].PBlock := P;
AllocArray[Index1].InternalSize := Size;
AllocArray[Index1].ExternalSize := ExternalSize;
AllocArray[Index1].SmallAlloc := SmallAlloc;
Exit;
end;
if (AllocArray[Index2].PBlock = nil) then
begin
LastUsedIndexGlobal := Index2;
AllocArray[Index2].PBlock := P;
AllocArray[Index2].InternalSize := Size;
AllocArray[Index2].ExternalSize := ExternalSize;
AllocArray[Index2].SmallAlloc := SmallAlloc;
Exit;
end;
Dec(Index1);
Inc(Index2);
if (Index1 < 0) and (Index2 > AllocArraySize-1) then
begin
//Did not find space for pointer
RunError(203);
Exit;
end;
if Index1 < 0 then
Index1 := 0;
if Index2 > AllocArraySize-1 then
Index2 := AllocArraySize-1;
until(False);
end;

procedure RemoveFromAllocTypeArray(Index : Integer); overload;
begin
{ if (Index < 0) or (Index > AllocArraySize-1) then
begin
RunError(203);
end;}
AllocArray[Index].PBlock := nil;
AllocArray[Index].InternalSize := 0;
AllocArray[Index].ExternalSize := 0;
AllocArray[Index].SmallAlloc := True;
Dec(NoOfLivePointers);
if NoOfLivePointers < AllocArraySize-SHRINKSIZE then
AllocArraySize := ShrinkAllocArray(AllocArray, AllocArraySize);
end;

function GetIndex(P : Pointer) : Integer;
var
Index1, Index2 : Integer;

begin
if LastUsedIndexGlobal > AllocArraySize-2 then
LastUsedIndexGlobal := AllocArraySize-2;
Index1 := LastUsedIndexGlobal;
Index2 := LastUsedIndexGlobal+1;
{ if Index1 < 0 then
Index1 := 0;
if Index2 > AllocArraySize-1 then
Index2 := AllocArraySize-1;}
repeat
if (AllocArray[Index1].PBlock = P) then
begin
LastUsedIndexGlobal := Index1;
Result := Index1;
Exit;
end;
if (AllocArray[Index2].PBlock = P) then
begin
LastUsedIndexGlobal := Index2;
Result := Index2;
Exit;
end;
Dec(Index1);
Inc(Index2);
if (Index1 < 0) and (Index2 > AllocArraySize-1) then
begin
//Did not find pointer
RunError(203);
Result := -1;
Exit;
end;
if Index1 < 0 then
Index1 := 0;
if Index2 > AllocArraySize-1 then
Index2 := AllocArraySize-1;
until(False);
end;

function DKCGetMem(Size: Integer): Pointer;
var
OverSize : Integer;

begin
EnterCriticalSection(RTLCriticalSection);
if Size < SPLITSIZE then
begin
OverSize := Size + OVERALLOCEXTRA;
Result := HeapAlloc(Heap, FLAGS, OverSize);
AddToAllocTypeArray(Result, True, OverSize, Size);
end
else
begin
Result := VirtualAlloc(nil, Size, MEM_COMMIT+MEM_TOP_DOWN,
PAGE_READWRITE);
AddToAllocTypeArray(Result, False, Size, Size);
end;
LeaveCriticalSection(RTLCriticalSection);
end;

function DKCFreeMem(Ptr: Pointer): Integer;
var
Res : Boolean;
Index : Integer;

begin
EnterCriticalSection(RTLCriticalSection);
Index := GetIndex(Ptr);
if GetSmallAlloc(Index) then
begin
if HeapFree(Heap, FLAGS, Ptr) then
begin
RemoveFromAllocTypeArray(Index);
Result := 0;
if HeapCompact(Heap, FLAGS) = 0 then
RunError(203);
end
else
Result := 1;
end
else
begin
Res := VirtualFree(Ptr, 0, MEM_RELEASE);
if Res then
begin
RemoveFromAllocTypeArray(Index);
Result := 0;
end
else
Result := 1;
end;
LeaveCriticalSection(RTLCriticalSection);
end;

function DKCReallocMem(Ptr: Pointer; Size: Integer): Pointer;
var
OldIndex, OldSize, NewSize, OldExternalSize, NewOverSize : Integer;
NewPtr, OldPtr : Pointer;

begin
EnterCriticalSection(RTLCriticalSection);
NewSize := Size;
OldPtr := Ptr;
OldIndex := GetIndex(Ptr);
OldSize := GetSize(OldIndex);
if GetSmallAlloc(OldIndex) then
begin
if NewSize < SPLITSIZE then
begin
//Realloc small as small
if (NewSize > OldSize) then //Upsize
begin
//Alloc more than requested
NewOverSize := Round(NewSize * OVERALLOCPERCENTAGESMALLUPSIZE) +
OVERALLOCEXTRA;
Result := HeapRealloc(Heap, FLAGS, OldPtr, NewOverSize);
end
else if (NewSize < Round(OldSize * OVERALLOCPERCENTAGESMALLDOWNSIZE) -
OVERALLOCEXTRA) then //Downsize
begin
//Allocate requested size
NewOverSize := NewSize;
Result := HeapRealloc(Heap, FLAGS, OldPtr, NewOverSize);
end
else
begin
//OverSize did not change because no realloc took place
NewOverSize := OldSize;
Result := OldPtr;
end;
if Result = OldPtr then
begin
SetSize(OldIndex, NewOverSize, NewSize);
end
else
begin
RemoveFromAllocTypeArray(OldIndex);
AddToAllocTypeArray(Result, True, NewOverSize, NewSize);
end;
end
else
begin
//Realloc small as big
//Get new block
Result := VirtualAlloc(nil, NewSize, MEM_COMMIT+MEM_TOP_DOWN,
PAGE_READWRITE);
NewPtr := Result;
if Result <> nil then
begin
AddToAllocTypeArray(NewPtr, False, NewSize, NewSize);
OldExternalSize := GetExternalSize(OldIndex);
MoveJOH_IA32_9(OldPtr^, NewPtr^, OldExternalSize);
if not HeapFree(Heap, FLAGS, OldPtr) then
RunError(203)
else
RemoveFromAllocTypeArray(OldIndex);
end
else
RunError(203);
end;
end
else
begin
if Size > SPLITSIZE then
begin
//Realloc big as big
//Is realloc needed
if NewSize > OldSize then //Upsize
begin
//Alloc more than requested
NewOverSize := Round(NewSize * OVERALLOCPERCENTAGEBIGUPSIZE);
Result := VirtualAlloc(OldPtr, NewOverSize, MEM_COMMIT or
MEM_TOP_DOWN, PAGE_READWRITE);
end
else if (NewSize < Round(OldSize * OVERALLOCPERCENTAGEBIGDOWNSIZE))
then //Downsize
begin
//Alloc more than requested
NewOverSize := Round(NewSize * OVERALLOCPERCENTAGEBIGUPSIZE);
Result := VirtualAlloc(OldPtr, NewOverSize, MEM_COMMIT or
MEM_TOP_DOWN, PAGE_READWRITE);
end
else
begin
//OverSize did not change because no realloc took place
NewOverSize := OldSize;
Result := OldPtr;
end;
if Result <> nil then
begin
if Result = OldPtr then
begin
SetSize(OldIndex, NewOverSize, NewSize);
end
else
begin
NewPtr := Result;
if NewSize > OldSize then
begin
OldExternalSize := GetExternalSize(OldIndex);
Move(OldPtr^, NewPtr^, OldExternalSize);
end;
RemoveFromAllocTypeArray(OldIndex);
AddToAllocTypeArray(Result, False, NewOverSize, NewSize);
end;
end
else
begin
//Realloc failed. Try get a new block
Result := VirtualAlloc(nil, NewOverSize, MEM_COMMIT+MEM_TOP_DOWN,
PAGE_READWRITE);
NewPtr := Result;
if Result <> nil then
begin
AddToAllocTypeArray(NewPtr, False, NewOverSize, NewSize);
OldExternalSize := GetExternalSize(OldIndex);
MoveJOH_IA32_9(OldPtr^, NewPtr^, OldExternalSize);
if not VirtualFree(OldPtr, 0, MEM_RELEASE) then
RunError(203)
else
RemoveFromAllocTypeArray(OldIndex);
end
else
RunError(203);
end;
end
else
begin
//Realloc big as small
Result := HeapAlloc(Heap, FLAGS, NewSize);
NewPtr := Result;
AddToAllocTypeArray(NewPtr, True, NewSize, NewSize);
Move(OldPtr^, NewPtr^, NewSize);
if not VirtualFree(OldPtr, 0, MEM_RELEASE) then
RunError(203)
else
RemoveFromAllocTypeArray(OldIndex);
end;
end;
LeaveCriticalSection(RTLCriticalSection);
end;

procedure InitMemoryManager;
resourcestring
sError = 'DKC_IA32_MM_Unit must be first unit used by the project';
var
MemMgr: TMemoryManager;

begin
NoOfLivePointers := 0;
Heap := HeapCreate(FLAGS, 0, 0);
if Heap = 0 then
RunError(203); // out of memory
Assert(AllocMemCount = 0, sError);
MemMgr.GetMem := DKCGetMem;
MemMgr.FreeMem := DKCFreeMem;
MemMgr.ReallocMem := DKCReallocMem;
SetMemoryManager(MemMgr);
InitializeCriticalSection(RTLCriticalSection);
EnterCriticalSection(RTLCriticalSection);
AllocArraySize := GROWSIZE;
AllocArray := HeapAlloc(Heap, FLAGS, AllocArraySize * SizeOf(TAllocType));
InitializeAllocArray(0, AllocArraySize-1);
LeaveCriticalSection(RTLCriticalSection);
end;

initialization

InitMemoryManager;

finalization

if Heap <> 0 then
HeapDestroy(Heap);
DeleteCriticalSection(RTLCriticalSection);

end.


Back to top
Dennis
Guest





PostPosted: Tue May 24, 2005 1:02 pm    Post subject: Re: Fastcode MM B&V 0.39 Reply with quote

Hi

Released in attachments

This is done:

-New DKCIA32MM.
-Validate11 fix
-Validate25 = Multithread validation (no cross thread interaction) added
-SortIntegerArrayBenchmark added (-bmMemoryAccessSpeed benchmark with non
serial memory access pattern)
-Validate26 added

Regards
Dennis


Back to top
Dan Downs
Guest





PostPosted: Tue May 24, 2005 3:38 pm    Post subject: Re: Fastcode MM B&V 0.39 Reply with quote

Quote:
Released in attachments


Oh thats just great!..... I just get done running all the 0.38 benchmarks on
a dual P3 500 and a new version gets posted. <g> Time to recompile.

The DKCIA32 MM and the EWCMM seemed to deadlock. Unfortunitly I didn't write
down which benchmark each one was on, so I'm rerunning them now.

Here's a summary of results from a dual P3 500mhz windows 2003 server, 768mb
ram. The dual 3ghz xeon is gone.

Average Total Performance: (Scaled so that the winner = 100%)
BucketMM : 97.0
BucketMM_Asm : 100.0
FastMM2 : 39.5
FastMM3 : 95.1
NexusMM : 67.5
RTLMM : 39.8
TopMM : 44.6
WinMem : 35.0

Average Speed Performance: (Scaled so that the winner = 100%)
BucketMM : 94.8
BucketMM_Asm : 100.0
FastMM2 : 29.7
FastMM3 : 93.0
NexusMM : 55.9
RTLMM : 29.7
TopMM : 56.4
WinMem : 44.7

Average Memory Performance: (Scaled so that the winner = 100%)
BucketMM : 99.2
BucketMM_Asm : 100.0
FastMM2 : 46.7
FastMM3 : 97.3
NexusMM : 76.7
RTLMM : 47.7
TopMM : 25.6
WinMem : 19.4



DD



Back to top
Dennis
Guest





PostPosted: Tue May 24, 2005 6:55 pm    Post subject: Re: Fastcode MM B&V 0.39 Reply with quote

Hi Dan

Quote:
Oh thats just great!..... I just get done running all the 0.38 benchmarks
on
a dual P3 500 and a new version gets posted. <g> Time to recompile.

It always goes like this ;-)

Quote:
The DKCIA32 MM and the EWCMM seemed to deadlock. Unfortunitly I didn't
write
down which benchmark each one was on, so I'm rerunning them now.

DKCIA32 MM is just very slow in some benchmarks and very very slow in others
:-)

Regards
Dennis



Back to top
Display posts from previous:   
Post new topic   Reply to topic    BorlandTalk.com Forum Index -> Delphi Language BASM All times are GMT
Goto page 1, 2  Next
Page 1 of 2

 
Jump to:  
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum


Powered by phpBB © 2001, 2006 phpBB Group
SEO toolkit © 2004-2006 webmedic.