 |
BorlandTalk.com Borland discussion newsgroups
|
| View previous topic :: View next topic |
| Author |
Message |
Dennis Guest
|
Posted: Tue May 17, 2005 6:41 am Post subject: Fastcode MM B&V 0.39 |
|
|
Hi All
I think that Validate11 should be changed a little. Do you agree?
function TMMValidation.Validate11 : Boolean;
var
I, J, K, L, NoOfStrings : Cardinal;
SomeArray : array of Cardinal;
StringArray : array[1..517] of Ansistring;
TempS : AnsiString;
MemoryStatus : TMemoryStatus;
const
BYTESTOALLOCATEMAX : Integer = 320001;
begin
Result := True;
//GlobalMemoryStatus(MemoryStatus);//!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!Move this
try
for I := 0 to BYTESTOALLOCATEMAX-1 do
begin
NoOfStrings := Length(StringArray);
//Clear a string sometimes
if I mod 5 = 0 then
StringArray[Random(NoOfStrings) + 1] := '';
//Grow a string with an 'A'
StringArray[(I mod NoOfStrings) + 1] := StringArray[(I mod
Cardinal(Length(StringArray)))+1] + 'A';
//Grow SomeArray if place for it
GlobalMemoryStatus(MemoryStatus);//!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!To here
if MemoryStatus.dwAvailVirtual >= I+1 then
SetLength(SomeArray, I+1);
SomeArray[I] := I;
//Validate that SomeArray data are not changed
for J := 0 to I do
begin
if SomeArray[J] <> J then
begin
Result := False;
Exit;
end;
end;
//Validate that all strings contains 'A's and nothing else
for K := 1 to Length(StringArray) do
begin
TempS := StringArray[K];
for L := 1 to Length(TempS) do
begin
if TempS[L] <> 'A' then
begin
Result := False;
Exit;
end;
end;
end;
end;
except
Result := False;
end;
end;
Best regards
Dennis
|
|
| Back to top |
|
 |
Dennis Guest
|
Posted: Tue May 17, 2005 8:19 am Post subject: Re: Fastcode MM B&V 0.39 |
|
|
Validation of big downsized is missing and therefore I suggest inclusion of
this validation function
//Based on Validate11 but validating big array downsize combined with many
small strings resize
function TMMValidation.Validate26 : Boolean;
var
I, J, K, L, I2, NoOfStrings : Cardinal;
SomeArray : array of Cardinal;
StringArray : array[1..10000] of Ansistring;
TempS : AnsiString;
MemoryStatus : TMemoryStatus;
SomeArraySize : Cardinal;
SomeArraySizeFP : Double;
const
NOOFRUNS : Integer = 2000;
SOMEARRAYMAXSIZE : Integer = 100*1024*1024;
SHRINKFACTOR : Double = 0.95;
begin
Result := True;
GlobalMemoryStatus(MemoryStatus);
SomeArraySize := SOMEARRAYMAXSIZE;
SomeArraySizeFP := SomeArraySize;
if MemoryStatus.dwAvailVirtual >= SomeArraySize then
SetLength(SomeArray, SomeArraySize);
//Fill SomeArray
for I2 := 0 to SomeArraySize-1 do
SomeArray[I2] := I2;
try
for I := 0 to NOOFRUNS-1 do
begin
NoOfStrings := Length(StringArray);
//Clear a string sometimes
if I mod 5 = 0 then
StringArray[Random(NoOfStrings) + 1] := '';
//Grow a string with an 'A'
StringArray[(I mod NoOfStrings) + 1] := StringArray[(I mod
Cardinal(Length(StringArray)))+1] + 'A';
//Shrink SomeArray
SomeArraySizeFP := SomeArraySizeFP * SHRINKFACTOR;
SomeArraySize := Round(SomeArraySizeFP);
//SomeArray length always > 0
if SomeArraySize <= 0 then
SomeArraySize := 1;
SetLength(SomeArray, SomeArraySize);
//Validate that SomeArray data are not changed
for J := 0 to SomeArraySize-1 do
begin
if SomeArray[J] <> J then
begin
Result := False;
Exit;
end;
end;
//Validate that all strings contains 'A's and nothing else
for K := 1 to Length(StringArray) do
begin
TempS := StringArray[K];
for L := 1 to Length(TempS) do
begin
if TempS[L] <> 'A' then
begin
Result := False;
Exit;
end;
end;
end;
end;
except
Result := False;
end;
end;
Dennis
|
|
| Back to top |
|
 |
Dennis Guest
|
Posted: Tue May 17, 2005 8:36 am Post subject: Re: Fastcode MM B&V 0.39 |
|
|
All MM's except Recycler passes this new validation
Regards
Dennis
|
|
| Back to top |
|
 |
Dennis Guest
|
Posted: Fri May 20, 2005 7:24 pm Post subject: Re: Fastcode MM B&V 0.39 |
|
|
Hi
Who volunteers to build this release?
Regards
Dennis
|
|
| Back to top |
|
 |
Dennis Guest
|
Posted: Fri May 20, 2005 7:41 pm Post subject: Re: Fastcode MM B&V 0.39 |
|
|
Todo list that must be done to reach version 1.0:
-MultiThread validation
-BPL validation?
-Dll validation in B&V tool?
-bmMemoryAccessSpeed benchmark with non serial memory access pattern
(Selection sort on array of ??)?
-A new RecyclerMM
-FastMM4
-CPUID based selection of Move function for Move benchmarks
-CPUID based selection of FillChar function for MultiThreadFillChar
benchmark
-Add missing Dan Downs replay
-Add missing Nathanial Wools replay
-Eric Grange suggestions
Mimic a realistic memory usage in the Replay Benchmark.
Some ideas from previous posts (from Eric, if I remember well):
- Fillchar on every new allocation
- Reading randomly in previous allocations
- Filling every 64th byte (to be sure we're using all cache lines)
-Fix messed collumn display when running benchmark all
-Settle Validate23 discussion?
-etc ... help me ;-)
Regards
Dennis
|
|
| Back to top |
|
 |
RainMaker Guest
|
Posted: Sat May 21, 2005 9:54 am Post subject: A Req to the FastCode Team |
|
|
Hi,
please can you post some tutorials for a beginners users ;)
thanks
|
|
| Back to top |
|
 |
Dennis Guest
|
Posted: Sat May 21, 2005 10:13 am Post subject: Re: Fastcode MM B&V 0.39 |
|
|
Hi
I have posted a new proposal for a validation. It triggers the fragmentation
issue in the RTL MM.
It is a stripped down version of a simulator I made to simulate the decline
in the number of longhair carriers in a population on shorthaired cats under
different breeding rules.
http://www.gugs-cats.dk/Diverse.htm
I tested most MM's and they all pass, with the exception of the RTL MM.
Should we include it in the B&V?
Regards
Dennis
|
|
| Back to top |
|
 |
Dennis Guest
|
Posted: Sat May 21, 2005 10:17 am Post subject: Re: A Req to the FastCode Team |
|
|
Hi
| Quote: | please can you post some tutorials for a beginners users
|
Yes.
Add the MM as the very first unit in the uses clause of your .dpr.
program LongHairSimulator;
uses
BucketMem.pas//A replacement MM
Forms,
MainUnit in 'MainUnit.pas' {MainForm};
{$R *.res}
begin
Application.Initialize;
Application.CreateForm(TMainForm, MainForm);
Application.Run;
end.
Regards
Dennis
|
|
| Back to top |
|
 |
Dennis Guest
|
Posted: Sat May 21, 2005 3:02 pm Post subject: Re: Fastcode MM B&V 0.39 |
|
|
Hi
Why is this benchmark called "DownSizeTest" ?
for n := 1 to 50 do // loop added to have more than 1000 MTicks for
this benchmark
begin
{Allocate a lot of strings}
SetLength(FStrings, 50000);
for i := 0 to high(FStrings) do
begin
{Grab a 20K block}
SetLength(FStrings[i], 20000);
{Reduce the size to 1 byte}
SetLength(FStrings[i], 1);
end;
{Update the peak address space usage}
UpdateUsageStatistics;
end;
In the first interation it allocates 20000 bytes for each string and then
downsize it to 1 byte + 8 byte.
This we could call alloc + downsize benchmark
in the next iteration it upsizes all strings from 1+8 byte to 20000 bytes
and then downsizes them again
This we could call upsize + downsize benchmark
The overall name of the benchmark should be UpsizeAndDownsize benchmark?
Regards
Dennis
|
|
| Back to top |
|
 |
Dennis Guest
|
Posted: Sun May 22, 2005 6:19 pm Post subject: Re: Fastcode MM B&V 0.39 |
|
|
Hi
I just created a benchmark
| Quote: | -bmMemoryAccessSpeed benchmark with non serial memory access pattern
(Selection sort on array of Integer)
|
unit SortIntArrayBenchmark1Unit;
interface
uses Windows, BenchmarkClassUnit, Classes, Math;
type
TSortIntArrayThreads = class(TFastcodeMMBenchmark)
public
procedure RunBenchmark; override;
class function GetBenchmarkName: string; override;
class function GetBenchmarkDescription: string; override;
class function GetSpeedWeight: Double; override;
class function GetCategory: TBenchmarkCategory; override;
end;
implementation
uses SysUtils;
type
TSortIntArrayThread = class(TThread)
FBenchmark: TFastcodeMMBenchmark;
procedure Execute; override;
end;
procedure TSortIntArrayThread.Execute;
var
IntArray : array of Integer;
Size, I1, I2, I3, IndexMax, Temp, Max : Integer;
const
MINSIZE : Integer = 500;
MAXSIZE : Integer = 1500;
begin
for Size := MINSIZE to MAXSIZE do
begin
SetLength(IntArray, Size);
//Fill array with random values
for I1 := 0 to Size-1 do
begin
IntArray[I1] := Random(100);
end;
//Sort array just to create an acces pattern
for I2 := 0 to Size-2 do
begin
//Find biggest element in unsorted part of array
Max := IntArray[I2];
IndexMax := I2;
for I3 := I2+1 to Size-1 do
begin
if IntArray[I3] > Max then
begin
Max := IntArray[I3];
IndexMax := I3;
end;
end;
//Swap current element with biggest remaining element
Temp := IntArray[I2];
IntArray[I2] := IntArray[IndexMax];
IntArray[IndexMax] := Temp;
end;
end;
//"Free" array
SetLength(IntArray, 0);
FBenchmark.UpdateUsageStatistics;
end;
class function TSortIntArrayThreads.GetBenchmarkDescription: string;
begin
Result := 'A benchmark that measures read and write speed to an array of
Integer. '
+ 'Access pattern is created by selection sorting array of random
values. '
+ 'Measures memory usage after all blocks have been freed. '
+ 'Benchmark submitted by Dennis Kjaer Christensen.';
end;
class function TSortIntArrayThreads.GetBenchmarkName: string;
begin
Result := 'SortIntegerArrayBenchmark';
end;
class function TSortIntArrayThreads.GetCategory: TBenchmarkCategory;
begin
Result := bmMemoryAccessSpeed;
end;
class function TSortIntArrayThreads.GetSpeedWeight: Double;
begin
Result := 0.75;
end;
procedure TSortIntArrayThreads.RunBenchmark;
var
SortIntArrayThread : TSortIntArrayThread;
begin
inherited;
SortIntArrayThread := TSortIntArrayThread.Create(True);
SortIntArrayThread.FreeOnTerminate := False;
SortIntArrayThread.FBenchmark := Self;
SortIntArrayThread.Resume;
SortIntArrayThread.WaitFor;
SortIntArrayThread.Free;
end;
end.
Regards
Dennis
|
|
| Back to top |
|
 |
Dennis Guest
|
Posted: Mon May 23, 2005 2:35 pm Post subject: Re: Fastcode MM B&V 0.39 |
|
|
Hi
Here is a new benchmark. Could you help me make the dyn array 16 byte
aligned if the MM 16 byte aligns?
ExtArray : array of TExtended; //Not aligned due to 8 byte
offset
of first element
Regards
Dennis
unit SortExtendedArrayBenchmark1Unit;
interface
uses Windows, BenchmarkClassUnit, Classes, Math;
type
TSortExtendedArrayThreads = class(TFastcodeMMBenchmark)
public
procedure RunBenchmark; override;
class function GetBenchmarkName: string; override;
class function GetBenchmarkDescription: string; override;
class function GetSpeedWeight: Double; override;
class function GetCategory: TBenchmarkCategory; override;
end;
implementation
uses SysUtils;
type
TSortExtendedArrayThread = class(TThread)
FBenchmark: TFastcodeMMBenchmark;
procedure Execute; override;
end;
TExtended = record
X : Extended;
Pad1, Pad2, Pad3, Pad4, Pad5, Pad6 : Byte;
end;
procedure TSortExtendedArrayThread.Execute;
var
ExtArray : array of TExtended; //Not aligned due to 8 byte
offset of first element
Size, I1, I2, I3, IndexMax, RunNo : Integer;
Temp, Max : Extended;
const
MAXRUNNO : Integer = 10;
MAXELEMENTVALUE : Integer = MAXINT;
MINSIZE : Integer = 100;
MAXSIZE : Integer = 10000;
begin
for RunNo := 1 to MAXRUNNO do
begin
Size := Random(MAXSIZE-MINSIZE) + MINSIZE;
SetLength(ExtArray, Size);
//Fill array with random values
for I1 := 0 to Size-1 do
begin
ExtArray[I1].X := Random(MAXELEMENTVALUE);
end;
//Sort array just to create an acces pattern
for I2 := Size-1 downto 1 do
begin
//Find biggest element in unsorted part of array
Max := ExtArray[I2].X;
IndexMax := I2;
for I3 := I2-1 downto 0 do
begin
if ExtArray[I3].X > Max then
begin
Max := ExtArray[I3].X;
IndexMax := I3;
end;
end;
//Swap current element with biggest remaining element
Temp := ExtArray[I2].X;
ExtArray[I2].X := ExtArray[IndexMax].X;
ExtArray[IndexMax].X := Temp;
end;
end;
//"Free" array
SetLength(ExtArray, 0);
FBenchmark.UpdateUsageStatistics;
end;
class function TSortExtendedArrayThreads.GetBenchmarkDescription: string;
begin
Result := 'A benchmark that measures read and write speed to an array of
Extendeds. '
+ 'The Extended type is padded to fill 16 byte. Then it is
possible to align it '
+ 'Bonus is given for 16 byte alignment of array'
+ 'Access pattern is created by X sorting array of random
values.
'
+ 'Measures memory usage after all blocks have been freed. '
+ 'Benchmark submitted by Dennis Kjaer Christensen.';
end;
class function TSortExtendedArrayThreads.GetBenchmarkName: string;
begin
Result := 'SortExtendedArrayBenchmark';
end;
class function TSortExtendedArrayThreads.GetCategory: TBenchmarkCategory;
begin
Result := bmMemoryAccessSpeed;
end;
class function TSortExtendedArrayThreads.GetSpeedWeight: Double;
begin
Result := 0.75;
end;
procedure TSortExtendedArrayThreads.RunBenchmark;
var
SortExtendedArrayThread1, SortExtendedArrayThread2 :
TSortExtendedArrayThread;
SortExtendedArrayThread3, SortExtendedArrayThread4 :
TSortExtendedArrayThread;
begin
inherited;
SortExtendedArrayThread1 := TSortExtendedArrayThread.Create(True);
SortExtendedArrayThread2 := TSortExtendedArrayThread.Create(True);
SortExtendedArrayThread3 := TSortExtendedArrayThread.Create(True);
SortExtendedArrayThread4 := TSortExtendedArrayThread.Create(True);
SortExtendedArrayThread1.FreeOnTerminate := False;
SortExtendedArrayThread2.FreeOnTerminate := False;
SortExtendedArrayThread3.FreeOnTerminate := False;
SortExtendedArrayThread4.FreeOnTerminate := False;
SortExtendedArrayThread1.Priority := tpLower;
SortExtendedArrayThread2.Priority := tpNormal;
SortExtendedArrayThread3.Priority := tpHigher;
SortExtendedArrayThread4.Priority := tpHighest;
SortExtendedArrayThread1.FBenchmark := Self;
SortExtendedArrayThread2.FBenchmark := Self;
SortExtendedArrayThread3.FBenchmark := Self;
SortExtendedArrayThread4.FBenchmark := Self;
SortExtendedArrayThread1.Resume;
SortExtendedArrayThread2.Resume;
SortExtendedArrayThread3.Resume;
SortExtendedArrayThread4.Resume;
SortExtendedArrayThread1.WaitFor;
SortExtendedArrayThread2.WaitFor;
SortExtendedArrayThread3.WaitFor;
SortExtendedArrayThread4.WaitFor;
SortExtendedArrayThread1.Free;
SortExtendedArrayThread2.Free;
SortExtendedArrayThread3.Free;
SortExtendedArrayThread4.Free;
end;
end.
--
Jeg beskyttes af den gratis SPAMfighter til privatbrugere.
Den har indtil videre sparet mig for at få 180 spam-mails.
Betalende brugere får ikke denne besked i deres e-mails.
Hent den gratis her: www.spamfighter.dk
|
|
| Back to top |
|
 |
Dennis Guest
|
Posted: Mon May 23, 2005 8:04 pm Post subject: Re: Fastcode MM B&V 0.39 |
|
|
Hi
A very unimportant release ;-)
I hope very much that somebody will make a release soon. No matter how small
it is.
Regards
Dennis
unit DKC_IA32_MM_Unit;
//Version 0.9 23-5-2005
interface
implementation
uses
Windows;
type
TAllocType = record
PBlock : Pointer;
InternalSize : Integer;
ExternalSize : Integer;
SmallAlloc : Boolean;
end;
TAllocTypeArray = array[0..1000000] of TAllocType;
PAllocTypeArray = ^TAllocTypeArray;
var
Heap: THandle;
RTLCriticalSection : TRTLCriticalSection;
LastUsedIndexGlobal : Integer;
AllocArraySize : integer;
NoOfLivePointers: integer;
AllocArray : PAllocTypeArray;
const
ALLOCARRAYINITIALSIZE : Integer = 1000;
SPLITSIZE : Integer = 100*1024*1024;
FLAGS : Cardinal = HEAP_NO_SERIALIZE;
SMALLMOVESIZE = 36;//For JOH Move
SHRINKSIZE : Integer = 200;
GROWSIZE : Integer = 200;
MAXNOOFDEFRAGROUNDS : Integer = 100;
//For upsize
OVERALLOCPERCENTAGESMALLUPSIZE : Double = 2;
OVERALLOCPERCENTAGEBIGUPSIZE : Double = 1.1;
OVERALLOCEXTRA : Integer = 32;
//For downsize
OVERALLOCPERCENTAGESMALLDOWNSIZE : Double = 1/2;
OVERALLOCPERCENTAGEBIGDOWNSIZE : Double = 1/1.1;
{-------------------------------------------------------------------------}
{Perform Forward Move of 0..36 Bytes}
{On Entry, ECX = Count, EAX = Source+Count, EDX = Dest+Count. Destroys ECX}
procedure SmallForwardMove_9;
asm
jmp dword ptr [@@FwdJumpTable+ecx*4]
nop {Align Jump Table}
@@FwdJumpTable:
dd @@Done {Removes need to test for zero size move}
dd @@Fwd01, @@Fwd02, @@Fwd03, @@Fwd04, @@Fwd05, @@Fwd06, @@Fwd07,
@@Fwd08
dd @@Fwd09, @@Fwd10, @@Fwd11, @@Fwd12, @@Fwd13, @@Fwd14, @@Fwd15,
@@Fwd16
dd @@Fwd17, @@Fwd18, @@Fwd19, @@Fwd20, @@Fwd21, @@Fwd22, @@Fwd23,
@@Fwd24
dd @@Fwd25, @@Fwd26, @@Fwd27, @@Fwd28, @@Fwd29, @@Fwd30, @@Fwd31,
@@Fwd32
dd @@Fwd33, @@Fwd34, @@Fwd35, @@Fwd36
@@Fwd36:
mov ecx, [eax-36]
mov [edx-36], ecx
@@Fwd32:
mov ecx, [eax-32]
mov [edx-32], ecx
@@Fwd28:
mov ecx, [eax-28]
mov [edx-28], ecx
@@Fwd24:
mov ecx, [eax-24]
mov [edx-24], ecx
@@Fwd20:
mov ecx, [eax-20]
mov [edx-20], ecx
@@Fwd16:
mov ecx, [eax-16]
mov [edx-16], ecx
@@Fwd12:
mov ecx, [eax-12]
mov [edx-12], ecx
@@Fwd08:
mov ecx, [eax-8]
mov [edx-8], ecx
@@Fwd04:
mov ecx, [eax-4]
mov [edx-4], ecx
ret
nop
@@Fwd35:
mov ecx, [eax-35]
mov [edx-35], ecx
@@Fwd31:
mov ecx, [eax-31]
mov [edx-31], ecx
@@Fwd27:
mov ecx, [eax-27]
mov [edx-27], ecx
@@Fwd23:
mov ecx, [eax-23]
mov [edx-23], ecx
@@Fwd19:
mov ecx, [eax-19]
mov [edx-19], ecx
@@Fwd15:
mov ecx, [eax-15]
mov [edx-15], ecx
@@Fwd11:
mov ecx, [eax-11]
mov [edx-11], ecx
@@Fwd07:
mov ecx, [eax-7]
mov [edx-7], ecx
mov ecx, [eax-4]
mov [edx-4], ecx
ret
nop
@@Fwd03:
movzx ecx, word ptr [eax-3]
mov [edx-3], cx
movzx ecx, byte ptr [eax-1]
mov [edx-1], cl
ret
@@Fwd34:
mov ecx, [eax-34]
mov [edx-34], ecx
@@Fwd30:
mov ecx, [eax-30]
mov [edx-30], ecx
@@Fwd26:
mov ecx, [eax-26]
mov [edx-26], ecx
@@Fwd22:
mov ecx, [eax-22]
mov [edx-22], ecx
@@Fwd18:
mov ecx, [eax-18]
mov [edx-18], ecx
@@Fwd14:
mov ecx, [eax-14]
mov [edx-14], ecx
@@Fwd10:
mov ecx, [eax-10]
mov [edx-10], ecx
@@Fwd06:
mov ecx, [eax-6]
mov [edx-6], ecx
@@Fwd02:
movzx ecx, word ptr [eax-2]
mov [edx-2], cx
ret
nop
nop
nop
@@Fwd33:
mov ecx, [eax-33]
mov [edx-33], ecx
@@Fwd29:
mov ecx, [eax-29]
mov [edx-29], ecx
@@Fwd25:
mov ecx, [eax-25]
mov [edx-25], ecx
@@Fwd21:
mov ecx, [eax-21]
mov [edx-21], ecx
@@Fwd17:
mov ecx, [eax-17]
mov [edx-17], ecx
@@Fwd13:
mov ecx, [eax-13]
mov [edx-13], ecx
@@Fwd09:
mov ecx, [eax-9]
mov [edx-9], ecx
@@Fwd05:
mov ecx, [eax-5]
mov [edx-5], ecx
@@Fwd01:
movzx ecx, byte ptr [eax-1]
mov [edx-1], cl
ret
@@Done:
end; {SmallForwardMove}
{-------------------------------------------------------------------------}
{Perform Backward Move of 0..36 Bytes}
{On Entry, ECX = Count, EAX = Source, EDX = Dest. Destroys ECX}
procedure SmallBackwardMove_9;
asm
jmp dword ptr [@@BwdJumpTable+ecx*4]
nop {Align Jump Table}
@@BwdJumpTable:
dd @@Done {Removes need to test for zero size move}
dd @@Bwd01, @@Bwd02, @@Bwd03, @@Bwd04, @@Bwd05, @@Bwd06, @@Bwd07,
@@Bwd08
dd @@Bwd09, @@Bwd10, @@Bwd11, @@Bwd12, @@Bwd13, @@Bwd14, @@Bwd15,
@@Bwd16
dd @@Bwd17, @@Bwd18, @@Bwd19, @@Bwd20, @@Bwd21, @@Bwd22, @@Bwd23,
@@Bwd24
dd @@Bwd25, @@Bwd26, @@Bwd27, @@Bwd28, @@Bwd29, @@Bwd30, @@Bwd31,
@@Bwd32
dd @@Bwd33, @@Bwd34, @@Bwd35, @@Bwd36
@@Bwd36:
mov ecx, [eax+32]
mov [edx+32], ecx
@@Bwd32:
mov ecx, [eax+28]
mov [edx+28], ecx
@@Bwd28:
mov ecx, [eax+24]
mov [edx+24], ecx
@@Bwd24:
mov ecx, [eax+20]
mov [edx+20], ecx
@@Bwd20:
mov ecx, [eax+16]
mov [edx+16], ecx
@@Bwd16:
mov ecx, [eax+12]
mov [edx+12], ecx
@@Bwd12:
mov ecx, [eax+8]
mov [edx+8], ecx
@@Bwd08:
mov ecx, [eax+4]
mov [edx+4], ecx
@@Bwd04:
mov ecx, [eax]
mov [edx], ecx
ret
nop
nop
nop
@@Bwd35:
mov ecx, [eax+31]
mov [edx+31], ecx
@@Bwd31:
mov ecx, [eax+27]
mov [edx+27], ecx
@@Bwd27:
mov ecx, [eax+23]
mov [edx+23], ecx
@@Bwd23:
mov ecx, [eax+19]
mov [edx+19], ecx
@@Bwd19:
mov ecx, [eax+15]
mov [edx+15], ecx
@@Bwd15:
mov ecx, [eax+11]
mov [edx+11], ecx
@@Bwd11:
mov ecx, [eax+7]
mov [edx+7], ecx
@@Bwd07:
mov ecx, [eax+3]
mov [edx+3], ecx
mov ecx, [eax]
mov [edx], ecx
ret
nop
nop
nop
@@Bwd03:
movzx ecx, word ptr [eax+1]
mov [edx+1], cx
movzx ecx, byte ptr [eax]
mov [edx], cl
ret
nop
nop
@@Bwd34:
mov ecx, [eax+30]
mov [edx+30], ecx
@@Bwd30:
mov ecx, [eax+26]
mov [edx+26], ecx
@@Bwd26:
mov ecx, [eax+22]
mov [edx+22], ecx
@@Bwd22:
mov ecx, [eax+18]
mov [edx+18], ecx
@@Bwd18:
mov ecx, [eax+14]
mov [edx+14], ecx
@@Bwd14:
mov ecx, [eax+10]
mov [edx+10], ecx
@@Bwd10:
mov ecx, [eax+6]
mov [edx+6], ecx
@@Bwd06:
mov ecx, [eax+2]
mov [edx+2], ecx
@@Bwd02:
movzx ecx, word ptr [eax]
mov [edx], cx
ret
nop
@@Bwd33:
mov ecx, [eax+29]
mov [edx+29], ecx
@@Bwd29:
mov ecx, [eax+25]
mov [edx+25], ecx
@@Bwd25:
mov ecx, [eax+21]
mov [edx+21], ecx
@@Bwd21:
mov ecx, [eax+17]
mov [edx+17], ecx
@@Bwd17:
mov ecx, [eax+13]
mov [edx+13], ecx
@@Bwd13:
mov ecx, [eax+9]
mov [edx+9], ecx
@@Bwd09:
mov ecx, [eax+5]
mov [edx+5], ecx
@@Bwd05:
mov ecx, [eax+1]
mov [edx+1], ecx
@@Bwd01:
movzx ecx, byte ptr[eax]
mov [edx], cl
ret
nop
nop
@@Done:
end; {SmallBackwardMove}
{-------------------------------------------------------------------------}
{Move ECX Bytes from EAX to EDX, where EAX > EDX and ECX > 36
(SMALLMOVESIZE)}
procedure Forwards_IA32_9;
asm
push edx
fild qword ptr [eax]
lea eax, [eax+ecx-8]
lea ecx, [ecx+edx-8]
fild qword ptr [eax]
push ecx
neg ecx
and edx, -8
lea ecx, [ecx+edx+8]
pop edx
@FwdLoop:
fild qword ptr [eax+ecx]
fistp qword ptr [edx+ecx]
add ecx, 8
jl @FwdLoop
fistp qword ptr [edx]
pop edx
fistp qword ptr [edx]
end; {Forwards_IA32}
{-------------------------------------------------------------------------}
{Move ECX Bytes from EAX to EDX, where EAX < EDX and ECX > 36
(SMALLMOVESIZE)}
procedure Backwards_IA32_9;
asm
sub ecx, 8
push ecx
fild qword ptr [eax+ecx] {Last 8}
fild qword ptr [eax] {First 8}
add ecx, edx
and ecx, -8
sub ecx, edx
@BwdLoop:
fild qword ptr [eax+ecx]
fistp qword ptr [edx+ecx]
sub ecx, 8
jg @BwdLoop
pop ecx
fistp qword ptr [edx] {First 8}
fistp qword ptr [edx+ecx] {Last 8}
end; {Backwards_IA32}
{-------------------------------------------------------------------------}
{Move using IA32 Instruction Set Only}
procedure MoveJOH_IA32_9(const Source; var Dest; Count : Integer);
asm
cmp ecx, SMALLMOVESIZE
ja @Large {Count > SMALLMOVESIZE or Count < 0}
cmp eax, edx
jbe @SmallCheck
add eax, ecx
add edx, ecx
jmp SmallForwardMove_9
@SmallCheck:
jne SmallBackwardMove_9
ret {For Compatibility with Delphi's move for Source = Dest}
@Large:
jng @Done {For Compatibility with Delphi's move for Count < 0}
cmp eax, edx
ja Forwards_IA32_9
je @Done {For Compatibility with Delphi's move for Source = Dest}
sub edx, ecx
cmp eax, edx
lea edx, [edx+ecx]
jna Forwards_IA32_9
jmp Backwards_IA32_9 {Source/Dest Overlap}
@Done:
end; {MoveJOH_IA32}
procedure InitializeAllocArray(StartIndex, StopIndex : Integer);
var
Index : Integer;
begin
{ if (StartIndex < 0) or (StartIndex > AllocArraySize-1) then
RunError(203);
if (StopIndex < 0) or (StopIndex > AllocArraySize-1) then
RunError(203);}
for Index := StartIndex to StopIndex do
begin
AllocArray[Index].PBlock := nil;
//AllocArray[Index].Active := False;
AllocArray[Index].InternalSize := 0;
AllocArray[Index].ExternalSize := 0;
AllocArray[Index].SmallAlloc := True
end;
end;
function GrowAllocArray(var AllocArray : PAllocTypeArray; OldSize : Integer)
: Integer;
var
NewSize : Integer;
begin
NewSize := OldSize + GROWSIZE;
AllocArray := HeapRealloc(Heap, FLAGS, AllocArray , NewSize *
SizeOf(TAllocType));
if AllocArray = nil then
begin
//Try with a smaller GROWSIZE
NewSize := OldSize + 1;
AllocArray := HeapRealloc(Heap, FLAGS, AllocArray , NewSize *
SizeOf(TAllocType));
if AllocArray <> nil then
begin
InitializeAllocArray(OldSize, NewSize-1);
Result := NewSize;
end
else
begin
RunError(203);
Result := 0;//For compiler
end;
end
else
begin
InitializeAllocArray(OldSize, NewSize-1);
Result := NewSize;
end;
end;
function ShrinkAllocArray(var AllocArray : PAllocTypeArray; OldSize :
Integer) : Integer;
var
HighIndex, NewSize, Index, LowIndex, NoOfDefragRounds : Integer;
begin
NoOfDefragRounds := 0;
HighIndex := AllocArraySize;
LowIndex := 0;
repeat
//Find highest used index
repeat
Dec(HighIndex);
until(AllocArray[HighIndex].PBlock <> nil);
//Find lowest unused index
repeat
Inc(LowIndex);
until(AllocArray[LowIndex].PBlock = nil);
if LowIndex < HighIndex then
begin
//Copy HighIndex to LowIndex
AllocArray[LowIndex].PBlock := AllocArray[HighIndex].PBlock;
AllocArray[LowIndex].InternalSize := AllocArray[HighIndex].InternalSize;
AllocArray[LowIndex].ExternalSize := AllocArray[HighIndex].ExternalSize;
AllocArray[LowIndex].SmallAlloc := AllocArray[HighIndex].SmallAlloc;
//Clear HighIndex
AllocArray[HighIndex].PBlock := nil;
AllocArray[HighIndex].InternalSize := 0;
AllocArray[HighIndex].ExternalSize := 0;
AllocArray[HighIndex].SmallAlloc := True
end
else
//No more to defrag
Break;
Inc(NoOfDefragRounds)
until(NoOfDefragRounds >= MAXNOOFDEFRAGROUNDS);
//Find highest used index
Index := HighIndex;
repeat
Dec(Index);
until(AllocArray[Index].PBlock <> nil);
NewSize := OldSize - SHRINKSIZE;
//Do not shrink below any used entries
if NewSize < Index+1 then
NewSize := Index+1;
if NewSize < OldSize then
begin
AllocArray := HeapRealloc(Heap, FLAGS, AllocArray , NewSize *
SizeOf(TAllocType));
LastUsedIndexGlobal := NewSize-1;
end;
if AllocArray = nil then
RunError(203);
Result := NewSize;
end;
function GetSize(Index : Integer) : Integer; overload;
begin
{if (Index < 0) or (Index > AllocArraySize-1) then
begin
Result := 0; //For compiler
RunError(203);
Exit;
end;}
Result := AllocArray[Index].InternalSize;
end;
function GetExternalSize(Index : Integer) : Integer;
begin
{if (Index < 0) or (Index > AllocArraySize-1) then
begin
Result := 0; //For compiler
RunError(203);
Exit;
end;}
Result := AllocArray[Index].ExternalSize;
end;
procedure SetSize(Index, NewSize, ExternalSize : Integer);
begin
{ if (Index < 0) or (Index > AllocArraySize-1) then
begin
RunError(203);
Exit;
end;}
AllocArray[Index].InternalSize := NewSize;
AllocArray[Index].ExternalSize := ExternalSize;
end;
function GetSmallAlloc(Index : Integer) : Boolean; overload;
begin
{ if (Index < 0) or (Index > AllocArraySize-1) then
begin
Result := True; //For compiler
RunError(203);
Exit;
end;}
Result := AllocArray[Index].SmallAlloc;
end;
procedure AddToAllocTypeArray(P : Pointer; SmallAlloc : Boolean; Size,
ExternalSize : Integer);
var
Index1, Index2 : Integer;
begin
Inc(NoOfLivePointers);
if NoOfLivePointers > AllocArraySize then
AllocArraySize := GrowAllocArray(AllocArray, AllocArraySize);
Index1 := LastUsedIndexGlobal;
Index2 := LastUsedIndexGlobal+1;
if Index1 < 0 then
Index1 := 0;
if Index2 > AllocArraySize-1 then
Index2 := AllocArraySize-1;
repeat
//if (AllocArray[Index1].Active = False) then
if (AllocArray[Index1].PBlock = nil) then
begin
LastUsedIndexGlobal := Index1;
AllocArray[Index1].PBlock := P;
AllocArray[Index1].InternalSize := Size;
AllocArray[Index1].ExternalSize := ExternalSize;
AllocArray[Index1].SmallAlloc := SmallAlloc;
Exit;
end;
if (AllocArray[Index2].PBlock = nil) then
begin
LastUsedIndexGlobal := Index2;
AllocArray[Index2].PBlock := P;
AllocArray[Index2].InternalSize := Size;
AllocArray[Index2].ExternalSize := ExternalSize;
AllocArray[Index2].SmallAlloc := SmallAlloc;
Exit;
end;
Dec(Index1);
Inc(Index2);
if (Index1 < 0) and (Index2 > AllocArraySize-1) then
begin
//Did not find space for pointer
RunError(203);
Exit;
end;
if Index1 < 0 then
Index1 := 0;
if Index2 > AllocArraySize-1 then
Index2 := AllocArraySize-1;
until(False);
end;
procedure RemoveFromAllocTypeArray(Index : Integer); overload;
begin
{ if (Index < 0) or (Index > AllocArraySize-1) then
begin
RunError(203);
end;}
AllocArray[Index].PBlock := nil;
AllocArray[Index].InternalSize := 0;
AllocArray[Index].ExternalSize := 0;
AllocArray[Index].SmallAlloc := True;
Dec(NoOfLivePointers);
if NoOfLivePointers < AllocArraySize-SHRINKSIZE then
AllocArraySize := ShrinkAllocArray(AllocArray, AllocArraySize);
end;
function GetIndex(P : Pointer) : Integer;
var
Index1, Index2 : Integer;
begin
if LastUsedIndexGlobal > AllocArraySize-2 then
LastUsedIndexGlobal := AllocArraySize-2;
Index1 := LastUsedIndexGlobal;
Index2 := LastUsedIndexGlobal+1;
{ if Index1 < 0 then
Index1 := 0;
if Index2 > AllocArraySize-1 then
Index2 := AllocArraySize-1;}
repeat
if (AllocArray[Index1].PBlock = P) then
begin
LastUsedIndexGlobal := Index1;
Result := Index1;
Exit;
end;
if (AllocArray[Index2].PBlock = P) then
begin
LastUsedIndexGlobal := Index2;
Result := Index2;
Exit;
end;
Dec(Index1);
Inc(Index2);
if (Index1 < 0) and (Index2 > AllocArraySize-1) then
begin
//Did not find pointer
RunError(203);
Result := -1;
Exit;
end;
if Index1 < 0 then
Index1 := 0;
if Index2 > AllocArraySize-1 then
Index2 := AllocArraySize-1;
until(False);
end;
function DKCGetMem(Size: Integer): Pointer;
var
OverSize : Integer;
begin
EnterCriticalSection(RTLCriticalSection);
if Size < SPLITSIZE then
begin
OverSize := Size + OVERALLOCEXTRA;
Result := HeapAlloc(Heap, FLAGS, OverSize);
AddToAllocTypeArray(Result, True, OverSize, Size);
end
else
begin
Result := VirtualAlloc(nil, Size, MEM_COMMIT+MEM_TOP_DOWN,
PAGE_READWRITE);
AddToAllocTypeArray(Result, False, Size, Size);
end;
LeaveCriticalSection(RTLCriticalSection);
end;
function DKCFreeMem(Ptr: Pointer): Integer;
var
Res : Boolean;
Index : Integer;
begin
EnterCriticalSection(RTLCriticalSection);
Index := GetIndex(Ptr);
if GetSmallAlloc(Index) then
begin
if HeapFree(Heap, FLAGS, Ptr) then
begin
RemoveFromAllocTypeArray(Index);
Result := 0;
if HeapCompact(Heap, FLAGS) = 0 then
RunError(203);
end
else
Result := 1;
end
else
begin
Res := VirtualFree(Ptr, 0, MEM_RELEASE);
if Res then
begin
RemoveFromAllocTypeArray(Index);
Result := 0;
end
else
Result := 1;
end;
LeaveCriticalSection(RTLCriticalSection);
end;
function DKCReallocMem(Ptr: Pointer; Size: Integer): Pointer;
var
OldIndex, OldSize, NewSize, OldExternalSize, NewOverSize : Integer;
NewPtr, OldPtr : Pointer;
begin
EnterCriticalSection(RTLCriticalSection);
NewSize := Size;
OldPtr := Ptr;
OldIndex := GetIndex(Ptr);
OldSize := GetSize(OldIndex);
if GetSmallAlloc(OldIndex) then
begin
if NewSize < SPLITSIZE then
begin
//Realloc small as small
if (NewSize > OldSize) then //Upsize
begin
//Alloc more than requested
NewOverSize := Round(NewSize * OVERALLOCPERCENTAGESMALLUPSIZE) +
OVERALLOCEXTRA;
Result := HeapRealloc(Heap, FLAGS, OldPtr, NewOverSize);
end
else if (NewSize < Round(OldSize * OVERALLOCPERCENTAGESMALLDOWNSIZE) -
OVERALLOCEXTRA) then //Downsize
begin
//Allocate requested size
NewOverSize := NewSize;
Result := HeapRealloc(Heap, FLAGS, OldPtr, NewOverSize);
end
else
begin
//OverSize did not change because no realloc took place
NewOverSize := OldSize;
Result := OldPtr;
end;
if Result = OldPtr then
begin
SetSize(OldIndex, NewOverSize, NewSize);
end
else
begin
RemoveFromAllocTypeArray(OldIndex);
AddToAllocTypeArray(Result, True, NewOverSize, NewSize);
end;
end
else
begin
//Realloc small as big
//Get new block
Result := VirtualAlloc(nil, NewSize, MEM_COMMIT+MEM_TOP_DOWN,
PAGE_READWRITE);
NewPtr := Result;
if Result <> nil then
begin
AddToAllocTypeArray(NewPtr, False, NewSize, NewSize);
OldExternalSize := GetExternalSize(OldIndex);
MoveJOH_IA32_9(OldPtr^, NewPtr^, OldExternalSize);
if not HeapFree(Heap, FLAGS, OldPtr) then
RunError(203)
else
RemoveFromAllocTypeArray(OldIndex);
end
else
RunError(203);
end;
end
else
begin
if Size > SPLITSIZE then
begin
//Realloc big as big
//Is realloc needed
if NewSize > OldSize then //Upsize
begin
//Alloc more than requested
NewOverSize := Round(NewSize * OVERALLOCPERCENTAGEBIGUPSIZE);
Result := VirtualAlloc(OldPtr, NewOverSize, MEM_COMMIT or
MEM_TOP_DOWN, PAGE_READWRITE);
end
else if (NewSize < Round(OldSize * OVERALLOCPERCENTAGEBIGDOWNSIZE))
then //Downsize
begin
//Alloc more than requested
NewOverSize := Round(NewSize * OVERALLOCPERCENTAGEBIGUPSIZE);
Result := VirtualAlloc(OldPtr, NewOverSize, MEM_COMMIT or
MEM_TOP_DOWN, PAGE_READWRITE);
end
else
begin
//OverSize did not change because no realloc took place
NewOverSize := OldSize;
Result := OldPtr;
end;
if Result <> nil then
begin
if Result = OldPtr then
begin
SetSize(OldIndex, NewOverSize, NewSize);
end
else
begin
NewPtr := Result;
if NewSize > OldSize then
begin
OldExternalSize := GetExternalSize(OldIndex);
Move(OldPtr^, NewPtr^, OldExternalSize);
end;
RemoveFromAllocTypeArray(OldIndex);
AddToAllocTypeArray(Result, False, NewOverSize, NewSize);
end;
end
else
begin
//Realloc failed. Try get a new block
Result := VirtualAlloc(nil, NewOverSize, MEM_COMMIT+MEM_TOP_DOWN,
PAGE_READWRITE);
NewPtr := Result;
if Result <> nil then
begin
AddToAllocTypeArray(NewPtr, False, NewOverSize, NewSize);
OldExternalSize := GetExternalSize(OldIndex);
MoveJOH_IA32_9(OldPtr^, NewPtr^, OldExternalSize);
if not VirtualFree(OldPtr, 0, MEM_RELEASE) then
RunError(203)
else
RemoveFromAllocTypeArray(OldIndex);
end
else
RunError(203);
end;
end
else
begin
//Realloc big as small
Result := HeapAlloc(Heap, FLAGS, NewSize);
NewPtr := Result;
AddToAllocTypeArray(NewPtr, True, NewSize, NewSize);
Move(OldPtr^, NewPtr^, NewSize);
if not VirtualFree(OldPtr, 0, MEM_RELEASE) then
RunError(203)
else
RemoveFromAllocTypeArray(OldIndex);
end;
end;
LeaveCriticalSection(RTLCriticalSection);
end;
procedure InitMemoryManager;
resourcestring
sError = 'DKC_IA32_MM_Unit must be first unit used by the project';
var
MemMgr: TMemoryManager;
begin
NoOfLivePointers := 0;
Heap := HeapCreate(FLAGS, 0, 0);
if Heap = 0 then
RunError(203); // out of memory
Assert(AllocMemCount = 0, sError);
MemMgr.GetMem := DKCGetMem;
MemMgr.FreeMem := DKCFreeMem;
MemMgr.ReallocMem := DKCReallocMem;
SetMemoryManager(MemMgr);
InitializeCriticalSection(RTLCriticalSection);
EnterCriticalSection(RTLCriticalSection);
AllocArraySize := GROWSIZE;
AllocArray := HeapAlloc(Heap, FLAGS, AllocArraySize * SizeOf(TAllocType));
InitializeAllocArray(0, AllocArraySize-1);
LeaveCriticalSection(RTLCriticalSection);
end;
initialization
InitMemoryManager;
finalization
if Heap <> 0 then
HeapDestroy(Heap);
DeleteCriticalSection(RTLCriticalSection);
end.
|
|
| Back to top |
|
 |
Dennis Guest
|
Posted: Tue May 24, 2005 1:02 pm Post subject: Re: Fastcode MM B&V 0.39 |
|
|
Hi
Released in attachments
This is done:
-New DKCIA32MM.
-Validate11 fix
-Validate25 = Multithread validation (no cross thread interaction) added
-SortIntegerArrayBenchmark added (-bmMemoryAccessSpeed benchmark with non
serial memory access pattern)
-Validate26 added
Regards
Dennis
|
|
| Back to top |
|
 |
Dan Downs Guest
|
Posted: Tue May 24, 2005 3:38 pm Post subject: Re: Fastcode MM B&V 0.39 |
|
|
| Quote: | Released in attachments
|
Oh thats just great!..... I just get done running all the 0.38 benchmarks on
a dual P3 500 and a new version gets posted. <g> Time to recompile.
The DKCIA32 MM and the EWCMM seemed to deadlock. Unfortunitly I didn't write
down which benchmark each one was on, so I'm rerunning them now.
Here's a summary of results from a dual P3 500mhz windows 2003 server, 768mb
ram. The dual 3ghz xeon is gone.
Average Total Performance: (Scaled so that the winner = 100%)
BucketMM : 97.0
BucketMM_Asm : 100.0
FastMM2 : 39.5
FastMM3 : 95.1
NexusMM : 67.5
RTLMM : 39.8
TopMM : 44.6
WinMem : 35.0
Average Speed Performance: (Scaled so that the winner = 100%)
BucketMM : 94.8
BucketMM_Asm : 100.0
FastMM2 : 29.7
FastMM3 : 93.0
NexusMM : 55.9
RTLMM : 29.7
TopMM : 56.4
WinMem : 44.7
Average Memory Performance: (Scaled so that the winner = 100%)
BucketMM : 99.2
BucketMM_Asm : 100.0
FastMM2 : 46.7
FastMM3 : 97.3
NexusMM : 76.7
RTLMM : 47.7
TopMM : 25.6
WinMem : 19.4
DD
|
|
| Back to top |
|
 |
Dennis Guest
|
Posted: Tue May 24, 2005 6:55 pm Post subject: Re: Fastcode MM B&V 0.39 |
|
|
Hi Dan
| Quote: | Oh thats just great!..... I just get done running all the 0.38 benchmarks
on
a dual P3 500 and a new version gets posted. <g> Time to recompile.
|
It always goes like this ;-)
| Quote: | The DKCIA32 MM and the EWCMM seemed to deadlock. Unfortunitly I didn't
write
down which benchmark each one was on, so I'm rerunning them now.
|
DKCIA32 MM is just very slow in some benchmarks and very very slow in others
:-)
Regards
Dennis
|
|
| Back to top |
|
 |
|
|
You cannot post new topics in this forum You cannot reply to topics in this forum You cannot edit your posts in this forum You cannot delete your posts in this forum You cannot vote in polls in this forum
|
|