MM B&V Benchmark weights

 
BorlandTalk.com Forum Index -> Delphi Language BASM
Robert Houdart
Guest

Posted: Thu Feb 24, 2005 1:43 am    Post subject: MM B&V Benchmark weights



Hi Pierre, Eric, Ivo and others,

To enable Dennis to release B&V 1.0, it may be time we fix some rules for
how to weigh the different benchmarks in the final results.
What follows is an attempt to get some meaningful results out of the growing
number of benchmarks.

There appear to be three main categories of benchmarks:
A) Reallocation (Realloc) benchmarks
B) Allocation (GetMem / FreeMem) benchmarks
C) Replay benchmarks of real-world applications
I would suggest a 20 % - 40 % - 40 % split in the final score for the three
categories.

Every benchmark runs in either a single-threaded or a multi-threaded
context.
I would suggest a 70 % weight for multi-threaded and a 30 % weight for
single-threaded benchmarks.

This would give the following split:
- Single-thread Reallocation: 6 %
- Multi-thread Reallocation: 14 %
- Single-thread Allocation: 12 %
- Multi-thread Allocation: 28 %
- Single-thread real-world replay: 12 %
- Multi-thread real-world replay: 28 %
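
The split above is simply the product of each category weight and each threading weight. A minimal sketch of that arithmetic follows; Python is used purely for illustration and is not taken from the B&V tool, and the names `CATEGORY_WEIGHTS`, `THREADING_WEIGHTS` and `combined_weights` are hypothetical.

```python
# Weights proposed in the post, in percent of the final score.
CATEGORY_WEIGHTS = {"reallocation": 20, "allocation": 40, "replay": 40}
THREADING_WEIGHTS = {"single": 30, "multi": 70}


def combined_weights():
    """Return {(category, threading): percent of the final score}."""
    return {
        (category, threading): category_pct * threading_pct / 100
        for category, category_pct in CATEGORY_WEIGHTS.items()
        for threading, threading_pct in THREADING_WEIGHTS.items()
    }


# Print the six combined weights, e.g. "single-thread reallocation: 6 %".
for (category, threading), pct in combined_weights().items():
    print(f"{threading}-thread {category}: {pct:g} %")
```

The six values add up to 100 %, so changing either the category split or the threading split keeps the total consistent as long as each split itself sums to 100 %.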

Each individual benchmark is calibrated to yield 100 % for the best
performing entry on a reference computer. The benchmark score obtained by
the MM is then multiplied by its relative importance in the total score.
As an example, if there are 2 multi-thread allocation benchmarks, each one
would count for 28 % / 2 = 14 % in the final score. If an MM scored 75 %
on one of them, that benchmark would contribute 75 % of 14 % = 10.5 % to the
final score.
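
The worked example can be written as a small function. This is only a sketch of the arithmetic described above, assuming the benchmarks within one group share that group's weight equally; `benchmark_contribution` and its parameter names are hypothetical and not part of the B&V tool.

```python
def benchmark_contribution(group_weight_pct, benchmarks_in_group, score_pct):
    """Percent of the final score contributed by one benchmark.

    group_weight_pct:    weight of the benchmark's group (e.g. 28 for
                         multi-thread allocation)
    benchmarks_in_group: number of benchmarks sharing that weight equally
    score_pct:           the MM's calibrated score (100 = best entry on the
                         reference computer)
    """
    per_benchmark_weight = group_weight_pct / benchmarks_in_group
    return per_benchmark_weight * score_pct / 100


# The example from the post: 2 multi-thread allocation benchmarks, score 75 %.
contribution = benchmark_contribution(28, 2, 75)
print(contribution)  # 10.5
```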

All this can very easily be programmed in the B&V tool; we only need to
assign a category to each individual benchmark.

Any suggestions or comments?

Robert


Pierre le Riche
Guest

Posted: Thu Feb 24, 2005 6:35 am    Post subject: Re: MM B&V Benchmark weights



Hi Robert,

Quote:
To enable Dennis to release B&V 1.0, it may be time we fix some rules for
how to weigh the different benchmarks in the final results.

I think we also need to extend the scoring mechanism so that different
benchmarks can specify how the execution time and peak memory usage should
be factored into the score. For the fragmentation benchmarks, for example,
the execution time is of little importance to what is being tested. Also,
for the block up- and downsize tests the CPU time is not important.

I propose we change the base benchmark class so that it has a virtual method
that calculates the relative score given the clock count and peak address
space usage.
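
One way to picture this proposal is a base class with an overridable scoring method. The sketch below is in Python for illustration only and makes several assumptions that are not in the post: the class and method names are hypothetical, each ratio is taken as the reference (best) value divided by the measured value, and the default multiplies the speed and memory ratios, which a later post in this thread describes as the current behaviour. The fragmentation override ignores execution time entirely, which is a simplification of "of little importance".

```python
class Benchmark:
    """Hypothetical stand-in for the base benchmark class."""

    def __init__(self, ref_clock_count, ref_peak_address_space):
        # Reference values: the best entry's results on the reference computer.
        self.ref_clock_count = ref_clock_count
        self.ref_peak_address_space = ref_peak_address_space

    def speed_ratio(self, clock_count):
        # 1.0 means as fast as the reference; lower means slower.
        return self.ref_clock_count / clock_count

    def memory_ratio(self, peak_address_space):
        # 1.0 means as compact as the reference; lower means more memory used.
        return self.ref_peak_address_space / peak_address_space

    def relative_score(self, clock_count, peak_address_space):
        # Overridable ("virtual") method: the default multiplies both ratios.
        return 100 * self.speed_ratio(clock_count) * self.memory_ratio(peak_address_space)


class FragmentationBenchmark(Benchmark):
    def relative_score(self, clock_count, peak_address_space):
        # This benchmark scores on address space usage only.
        return 100 * self.memory_ratio(peak_address_space)


default_score = Benchmark(1000, 50).relative_score(2000, 100)
fragmentation_score = FragmentationBenchmark(1000, 50).relative_score(2000, 100)
```

With these made-up inputs (twice the reference time, twice the reference memory) the default gives 25 and the fragmentation override gives 50, which shows how much the choice of formula moves a score.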

Regards,
Pierre



Robert Houdart
Guest

Posted: Thu Feb 24, 2005 11:04 am    Post subject: Re: MM B&V Benchmark weights




"Pierre le Riche" <pleriche (AT) hotmail (DOT) com> wrote

Quote:
I think we also need to extend the scoring mechanism so that different
benchmarks can specify how the execution time and peak memory usage should
be factored into the score. For the fragmentation benchmarks, for example,
the execution time is of little importance to what is being tested. Also,
for the block up- and downsize tests the CPU time is not important.

I propose we change the base benchmark class so that it has a virtual
method that calculates the relative score given the clock count and peak
address space usage.

There's a real danger we'll fall into endless discussions about which factor
we should use for each individual benchmark. We can hardly expect MM authors
to be unbiased, or to have had the same experiences with real-world
applications.

As an example, I feel that speed does matter a lot in the fragmentation
benchmark. It mimics a very real situation that occurs quite often, e.g.
when using a TMemoryStream or with Midas.dll and the MidasMemPatch. Although
its original intent may have been to show that the RTL MM fragments badly,
its significance goes way beyond that.

For simplicity's sake, I suggest that we stick to the original formula.

Robert



Pierre le Riche
Guest

Posted: Thu Feb 24, 2005 12:17 pm    Post subject: Re: MM B&V Benchmark weights

Hi Robert,

Quote:
For simplicity's sake, I suggest that we stick to the original formula.

Then some benchmarks need to be fixed to bring out the intended scoring bias
towards either speed or memory size. Many of the benchmarks are so short
that the variance in the CPU time is too large for accurate measurements.

For an application that never uses more than for example 64MB, memory usage
should rarely be an issue provided that the usage does not grow unbounded.
For such applications raw speed is by far the biggest consideration.

Conversely, for applications that move around large chunks of data the
memory efficiency of the MM may be a bigger factor (avoiding disk swapping,
etc.).

Some MMs have an initial overhead of lookup tables, etc. I feel those MMs
are unfairly penalised throughout all the benchmarks for this cost -
especially in benchmarks that allocate very little memory. These lookup
tables are a one-time cost and should not weigh as heavily in the scoring as
they do. In situations where memory consumption is important (i.e. in
benchmarks that allocate a lot of memory) this one-time cost is usually
insignificant, yet these MMs are penalised in benchmarks where memory
consumption is largely irrelevant.

I had a look at your replay benchmarks. They allocate in the region of 50MB
maximum. Surely you wouldn't mind if usage was 20% more if it got you an
extra 10% in speed? If it allocated 1GB then it would have been an entirely
different situation and you would probably want to sacrifice some speed for
a significant reduction in memory usage. Unfortunately our scoring system
does not accommodate this.

So to summarise: Either we must drop the "relative performance" score, and
work with two separate scores - the address space usage and the speed, or we
must do a serious rethink of the way in which the relative performance is
calculated.

Having each benchmark calculate its own "relative performance" is still the
best option IMO. We could have the benchmark pick a weight between say 30%
and 70% for the memory usage and then the speed comprises the rest of the
score.

If there are no objections, I'll do this tonight. If we cannot agree on a
weighting for a particular benchmark, then we leave it at 50%/50%.
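
A rough sketch of such a per-benchmark weighting, in Python for illustration only. It assumes both inputs are already relative scores in percent (100 = best entry), restricts the memory weight to the 30 % to 70 % range mentioned above, and defaults to 50 %/50 %. The function and constant names are hypothetical.

```python
MIN_MEMORY_WEIGHT_PCT = 30
MAX_MEMORY_WEIGHT_PCT = 70
DEFAULT_MEMORY_WEIGHT_PCT = 50  # used when no weighting is agreed on


def weighted_score(speed_pct, memory_pct,
                   memory_weight_pct=DEFAULT_MEMORY_WEIGHT_PCT):
    """Combine relative speed and memory scores into one relative score."""
    if not MIN_MEMORY_WEIGHT_PCT <= memory_weight_pct <= MAX_MEMORY_WEIGHT_PCT:
        raise ValueError("memory weight must be between 30 and 70 percent")
    speed_weight_pct = 100 - memory_weight_pct  # speed takes the rest
    return (memory_weight_pct * memory_pct + speed_weight_pct * speed_pct) / 100


# Made-up scores: a fast but memory-hungry MM on a speed-biased benchmark.
speed_biased = weighted_score(100, 50, memory_weight_pct=30)
default_split = weighted_score(100, 50)
```

With these made-up scores the speed-biased benchmark rates the MM at 85 and the default 50 %/50 % split at 75.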

Regards,
Pierre



Robert Houdart
Guest

Posted: Thu Feb 24, 2005 12:53 pm    Post subject: Re: MM B&V Benchmark weights

Hello Pierre,

Quote:
Some MMs have an initial overhead of lookup tables, etc. I feel those MMs
are unfairly penalised throughout all the benchmarks for this cost -


True, but fairly marginal. The benchmark overhead and the impact of previous
benchmarks are usually much larger than this.

Quote:
I had a look at your replay benchmarks. They allocate in the region of
50MB maximum.

In fact they allocate less than 10 MB, the overhead of the replay benchmark
is about 30 MB in this case. These replays really should be used in a
multi-threaded replay with between 8 and 16 of these threads running
simultaneously.

Quote:
So to summarise: Either we must drop the "relative performance" score, and
work with two separate scores - the address space usage and the speed, or
we must do a serious rethink of the way in which the relative performance
is calculated.

Having each benchmark calculate its own "relative performance" is still
the best option IMO. We could have the benchmark pick a weight between say
30% and 70% for the memory usage and then the speed comprises the rest of
the score.

A very good idea. If I understand correctly, it involves the following:
1) Compute a relative speed performance,
2) Compute a relative memory performance,
3) Add up the two (instead of multiplying them, which is what we currently do).
Let's stick to 50 % speed / 50 % memory; this avoids a lot of
discussions.

I like this a lot; it will give a much fairer chance to MultiMM, TopMM etc.
They will no longer see their score divided by 10 simply because they did
not survive the previous benchmarks very well.
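
The difference between multiplying and adding the two relative scores can be seen with a small sketch. Python is used for illustration only, the function names are hypothetical, and the input numbers are made up: an MM that matches the best speed (100 %) but whose relative memory score is only 10 %.

```python
def multiplicative_score(speed_pct, memory_pct):
    """Relative score when the two relative scores are multiplied."""
    return speed_pct * memory_pct / 100


def additive_score(speed_pct, memory_pct):
    """Relative score when the two are averaged with a 50 %/50 % split."""
    return (speed_pct + memory_pct) / 2


multiplied = multiplicative_score(100, 10)  # the poor memory score divides the result by 10
added = additive_score(100, 10)             # the poor memory score costs at most half
```

Under multiplication this MM ends up with 10, while the 50 %/50 % sum gives 55.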

Looking forward to the next release :-)
Cheers,
Robert



All times are GMT