BorlandTalk.com
Borland discussion newsgroups
 

Optimisation

 
BorlandTalk.com Forum Index -> Delphi Language BASM

Sanyin
Guest
Posted: Tue May 15, 2007 7:21 pm    Post subject: Optimisation

I want to optimize this function.
Any tips? MMX, SSE or asm?

function YCCKtoColor32(Y, Cb, Cr, K: integer): tcolor32;
var
  _k, _y, _cb, _cr: Integer;
  val: single;
  gm: single;
begin
  with tcolor32entry(RESULT) do
  begin
    gm := 0.65;
    _k := 220 - K;
    _y := 255 - Y;
    _cb := 255 - Cb;
    _cr := 255 - Cr;

    val := _y + 1.402 * (_cr - 128) - _k;
    val := (val - 128) * gm + 128;
    if val < 0 then r := 0 else
    if val > 255 then r := 255 else
      r := round(val + 0.5);

    val := _y - 0.34414 * (_cb - 128) - 0.71414 * (_cr - 128) - _k;
    val := (val - 128) * gm + 128;
    if val < 0 then g := 0 else
    if val > 255 then g := 255 else
      g := round(val + 0.5);

    val := _y + 1.772 * (_cb - 128) - _k;
    val := (val - 128) * gm + 128;
    if val < 0 then b := 0 else
    if val > 255 then b := 255 else
      b := round(val + 0.5);

    a := 255;
  end
end;

Bob Gonder
Guest
Posted: Tue May 15, 2007 11:34 pm    Post subject: Re: Optimisation



Sanyin wrote:

Quote:
I want to optimize this:
Any tips?MMX, SSE or asm?

I would start by optimizing the basic math:

Quote:
function YCCKtoColor32(Y, Cb, Cr, K: integer): tcolor32;
var _k,_y,_cb,_cr: Integer;
val: single;
gm: single;
begin
with tcolor32entry(RESULT) do
begin
gm:=0.65;
_k := 220 - K;
_y := 255 - Y;
_cb := 255 - Cb;
_cr := 255 - Cr;

val := _y + 1.402 * (_cr - 128) - _k;
val := (val - 128) * gm + 128;

Expand
val = (255 - Y + 1.402 * (255 - Cr - 128) - 220 + K - 128) * 0.65 + 128
Rearrange
val = (K - Y + 1.402 * (255 - 128 - Cr) + 255 - 220 - 128) * 0.65 + 128
Simplify
val = (K - Y + 1.402 * (127 - Cr) - 93) * 0.65 + 128
Distribute
val = (K - Y + 1.402 * 127 - 1.402 * Cr - 93) * 0.65 + 128
Simplify
val = (K - Y - 1.402 * Cr - 93 + 178.054) * 0.65 + 128
Simplify/Distribute
val = (K - Y - 1.402 * Cr) * 0.65 + 85.054 * 0.65 + 128
Simplify
val = (K - Y - 1.402 * Cr) * 0.65 + 183.2851
Fold the + 0.5 from round(val + 0.5) into the constant and clamp the integer.
(R is a separate Integer here. Pascal is case-insensitive, so in real code
it needs a name that differs from the r field.)
R := round((K - Y - 1.402 * Cr) * 0.65 + 183.7851);
if R < 0 then r := 0 else
if R > 255 then r := 255 else
r := R;

Then I'd look into getting rid of the floats.
((1000*K - 1000*Y - 1000*1.402*Cr)*100*0.65 + 100*1000*183.7851)/100/1000;
Or
R := ((1000 * (K - Y) - 1402 * Cr) * 65 + 18378510) / 100000

But that adds one multiply and one divide which might negate the
advantage over float. If you could come up with a workable
power-of-two instead of 100 and 1000, then the extra mul and div would
become efficient shifts.
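
A minimal sketch of that power-of-two idea for the red channel, assuming 16 fractional bits in place of the factors 100 and 1000. The names RedFixed, cShift, cGain and so on are made up for illustration. The extra cHalf term stands in for the Round() call, and the result can still differ from the float version by one level right at a rounding boundary.

```pascal
program FixedPointRed;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

const
  cShift  = 16;
  cScale  = 1 shl cShift;                   // 65536
  cGain   = Round(0.65 * cScale);           // 0.65 as a scaled integer
  cCrGain = Round(1.402 * 0.65 * cScale);   // 1.402 * 0.65 = 0.9113
  cOffset = Round(183.7851 * cScale);       // constant from the derivation above
  cHalf   = cScale div 2;                   // replaces Round() before the shift

// Red channel of the simplified formula, using only integer mul/add/shift.
function RedFixed(Y, Cr, K: Integer): Integer;
var
  Acc: Integer;
begin
  // For byte inputs |Acc| stays below 2^25, so 32 bits are enough.
  Acc := (K - Y) * cGain - Cr * cCrGain + cOffset + cHalf;
  if Acc < 0 then
    Result := 0                   // test the sign first: shr is a logical shift
  else
  begin
    Result := Acc shr cShift;     // drop the fractional bits
    if Result > 255 then
      Result := 255;
  end;
end;

begin
  WriteLn(RedFixed(0, 0, 0));
  WriteLn(RedFixed(128, 128, 128));
  WriteLn(RedFixed(255, 255, 0));
  WriteLn(RedFixed(0, 0, 255));
end.
```

The green and blue channels would follow the same pattern with their own coefficients and offsets.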

Davy Landman
Guest
Posted: Wed May 16, 2007 1:42 am    Post subject: Re: Optimisation



Quote:
R:= (( 1000 * (K-Y) - 1402*Cr) * 65 +18378510 )/100000
And in Delphi you could even make that faster by replacing the /100000 with
*(1/100000)

Kind regards,
Davy

sdasdasf
Guest
Posted: Wed May 16, 2007 1:42 am    Post subject: Re: Optimisation

"Bob Gonder" <notbg (AT) notmindspring (DOT) invalid> wrote in message
news:5orj43l661ajbj8lvqm9ivj20tsfqpik9a (AT) 4ax (DOT) com...
Quote:
...
But that adds one multiply and one divide which might negate the
advantage over float. If you could come up with a workable
power-of-two instead of 100 and 1000, then the extra mul and div would
become efficient shifts.

Fixed math? (16.16)

Bob Gonder
Guest
Posted: Wed May 16, 2007 1:47 am    Post subject: Re: Optimisation

Davy Landman wrote:

Quote:
R:= (( 1000 * (K-Y) - 1402*Cr) * 65 +18378510 )/100000
and in Delphi you could even make that faster by replacing the /100000 by
*(1/100000)

Ummm, except we were trying to get away from floats?

Davy Landman
Guest
Posted: Wed May 16, 2007 3:48 am    Post subject: Re: Optimisation

Quote:
Ummm,, except we were trying to get away from floats?
Very true!

But if the 1000 must remain, then the multiplication is faster.
(shameless plug:
http://delphi-snippets.blogspot.com/2005/08/getting-possibly-500-speed-gain-on.html)

I agree that if it can be replaced by a shift... that's a lot faster :)

Regards,
Davy

Michael Hansen
Guest
Posted: Wed May 16, 2007 1:54 pm    Post subject: Re: Optimisation

Bob Gonder wrote:
Quote:
...
Then I'd look into getting rid of the floats.
(1000* K - 1000*Y-1000*1.402*Cr)*100*0.65+100*1000*183.7851)/100/1000;
Or
R:= (( 1000 * (K-Y) - 1402*Cr) * 65 +18378510 )/100000

But that adds one multiply and one divide which might negate the
advantage over float. If you could come up with a workable
power-of-two instead of 100 and 1000, then the extra mul and div would
become efficient shifts.

Since the result channels are bytes, my guess is that the precision of those
constants is not really going to have any influence anyway. Choosing another
fixed-point precision level could very well work then (with rounding taken
into account).

/Michael
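
One way to test that guess is to brute-force the red channel and compare a fixed-point version against the float formula at a few precision levels. The sketch below is only an assumption-laden check: MaxDiff, RedFloat and the chosen shifts (8, 10, 16) are illustrative, and only the simplified red-channel formula from earlier in the thread is covered.

```pascal
program FixedPrecisionCheck;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

function Clamp(V: Integer): Integer;
begin
  if V < 0 then Result := 0
  else if V > 255 then Result := 255
  else Result := V;
end;

// Float reference: simplified red-channel formula, with D = K - Y.
function RedFloat(D, Cr: Integer): Integer;
begin
  Result := Clamp(Round((D - 1.402 * Cr) * 0.65 + 183.7851));
end;

// Largest deviation from the float reference when the constants are
// stored with Shift fractional bits.
function MaxDiff(Shift: Integer): Integer;
var
  Scale, Gain, CrGain, Bias, Acc, D, Cr, Fx, Diff: Integer;
begin
  Scale  := 1 shl Shift;
  Gain   := Round(0.65 * Scale);
  CrGain := Round(1.402 * 0.65 * Scale);
  Bias   := Round(183.7851 * Scale) + Scale div 2;  // + half for rounding
  Result := 0;
  for D := -255 to 255 do
    for Cr := 0 to 255 do
    begin
      Acc := D * Gain - Cr * CrGain + Bias;
      if Acc < 0 then
        Fx := 0                      // shr is logical, so handle negatives here
      else
        Fx := Clamp(Acc shr Shift);
      Diff := Abs(Fx - RedFloat(D, Cr));
      if Diff > Result then
        Result := Diff;
    end;
end;

begin
  WriteLn('shift  8: max diff ', MaxDiff(8));
  WriteLn('shift 10: max diff ', MaxDiff(10));
  WriteLn('shift 16: max diff ', MaxDiff(16));
end.
```

With 16 fractional bits the accumulated coefficient error is far below one level, so any difference that shows up comes from values sitting right on a rounding boundary.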

Nils Haeck
Guest
Posted: Wed May 16, 2007 4:51 pm    Post subject: Re: Optimisation

From my Jpeg library:

No multiplies, just int additions, shifts and table lookups. If you want to
use this code you'll have to acknowledge with my name and company (Nils
Haeck, www.simdesign.nl)

(call InitYCbCrToRGBTables in initialization section)

Hope that helps,

Nils

const
  cColorConvScale = 1 shl 10;
  // Chroma centring offsets (coefficient * 128), pre-scaled
  c__toR = Round( -179.456 * cColorConvScale);  // -1.402 * 128
  c__toG = Round(135.53664 * cColorConvScale);  // about (0.34414 + 0.71414) * 128
  c__toB = Round( -226.816 * cColorConvScale);  // -1.772 * 128

var
  cY_toRT: array[0..255] of integer;
  cCrtoRT: array[0..255] of integer;
  cCbtoGT: array[0..255] of integer;
  cCrtoGT: array[0..255] of integer;
  cCbtoBT: array[0..255] of integer;

procedure InitYCbCrToRGBTables;
var
  i: integer;
begin
  for i := 0 to 255 do begin
    cY_toRT[i] := Round(        1 * cColorConvScale * i);
    cCrtoRT[i] := Round(    1.402 * cColorConvScale * i);
    cCbtoGT[i] := Round( -0.34414 * cColorConvScale * i);
    cCrtoGT[i] := Round( -0.71414 * cColorConvScale * i);
    cCbtoBT[i] := Round(    1.772 * cColorConvScale * i);
  end;
end;

procedure TsdTransformYCCKToRGB.Transform(Source, Dest: pointer; Count: integer);
// YCCK is a colorspace where the CMY part of CMYK is first converted to RGB, then
// transformed to YCbCr as usual. The K part is appended without any changes.
// To transform back, we do the YCbCr -> RGB transform, then add K
var
  R, G, B, Y, Cb, Cr, K: PByte;
  Yi, Ki, Ri, Gi, Bi: integer;

  function RangeLimit(A: integer): integer;
  begin
    // Delphi seems to convert the "div" here to SAR just fine (D7), so we
    // don't use ASM but plain pascal
    Result := A div (1 shl 10);
    if Result < 0 then
      Result := 0
    else
      if Result > 255 then
        Result := 255;
  end;

begin
  // Source pixels are 4 bytes each: Y, Cb, Cr, K
  Y := Source;
  Cb := Source; inc(Cb);
  Cr := Source; inc(Cr, 2);
  K := Source; inc(K, 3);
  // RGB is laid out in memory as BGR
  B := Dest;
  G := Dest; inc(G);
  R := Dest; inc(R, 2);
  // Repeat Count times..
  while Count > 0 do
  begin
    // Do the conversion in int
    Yi := cY_toRT[Y^];
    Ki := cY_toRT[K^];
    // The components seem to be inverted. Seems to be a Photoshop problem.
    Ri := -Yi - cCrtoRT[Cr^] - c__toR + Ki;
    Gi := -Yi - cCbToGT[Cb^] - cCrtoGT[Cr^] - c__toG + Ki;
    Bi := -Yi - cCbtoBT[Cb^] - c__toB + Ki;
    R^ := RangeLimit(Ri);
    G^ := RangeLimit(Gi);
    B^ := RangeLimit(Bi);
    // Advance pointers
    inc(Y, 4); inc(Cb, 4); inc(Cr, 4); inc(K, 4);
    inc(R, 3); inc(G, 3); inc(B, 3);
    dec(Count);
  end;
end;




Sanyin
Guest
Posted: Wed May 16, 2007 6:20 pm    Post subject: Re: Optimisation OT

I don't know about that Adobe problem (inverted YCCK).
Saving with Photoshop 7, CS, CS2 etc. produces a YCCK JPEG?
I think it's correct; at least with my functions there are no problems
decoding it.

Sanyin
Guest
Posted: Wed May 16, 2007 6:23 pm    Post subject: Re: Optimisation

"Nils Haeck" <bla (AT) bla (DOT) com> wrote in message
news:464aede1$1 (AT) newsgroups (DOT) borland.com...
Quote:
From my Jpeg library:

No multiplies, just int additions, shifts and table lookups. If you want
to use this code you'll have to acknowledge with my name and company (Nils
Haeck, www.simdesign.nl)

(call InitYCbCrToRGBTables in initialization section)

Your code makes very dark images!

Nils Haeck
Guest
Posted: Thu May 17, 2007 3:46 am    Post subject: Re: Optimisation OT

Can you send me some test images by email?

Nils

Nils Haeck
Guest
Posted: Thu May 17, 2007 4:00 am    Post subject: Re: Optimisation

Quote:
gm:=0.65;
_k := 220 - K;

Where do 0.65 and 220 come from? Did you find them in some documentation,
or did you experiment with these values? Do you have a link to any
documentation describing this?

Nils

Nils Haeck
Guest
Posted: Thu May 17, 2007 4:48 am    Post subject: Re: Optimisation OT

This conversion should be correct.. can you test? It adds a few int
multiplications, which can be avoided by adding another set of tables. But I
don't want to do that before I know for sure this one works.

Nils

procedure TsdTransformYCCKToRGBAdobe.Transform(Source, Dest: pointer; Count: integer);
// YCCK to RGB for Adobe images is different. First, the Y, Cr and Cb are inverted,
// and k* = 220 - K. The normal YCbCr to RGB is then applied. As a last step,
// the values are scaled by 0.65 around 128
const
  c0_65: integer = round(0.65 * cColorConvScale);
  c44_8: integer = round(44.8 * cColorConvScale); // 128 - 0.65 * 128
var
  R, G, B, Y, Cb, Cr, K: PByte;
  Yi, Ki, Ri, Gi, Bi, Cbi, Cri: integer;

  function ScaleAndRangeLimit(A: integer): integer;
  begin
    // First the scaling
    A := (A * c0_65) div cColorConvScale + c44_8;
    // Undo fixed precision and range limit
    Result := A div cColorConvScale;
    if Result < 0 then
      Result := 0
    else
      if Result > 255 then
        Result := 255;
  end;

begin
  Y := Source;
  Cb := Source; inc(Cb);
  Cr := Source; inc(Cr, 2);
  K := Source; inc(K, 3);
  // RGB is laid out in memory as BGR
  B := Dest;
  G := Dest; inc(G);
  R := Dest; inc(R, 2);
  // Repeat Count times..
  while Count > 0 do
  begin
    // Do the conversion in int
    Yi := cY_toRT[255 - Y^];
    Cbi := 255 - Cb^;
    Cri := 255 - Cr^;
    Ki := (220 - K^) * cColorConvScale;
    Ri := Yi + cCrtoRT[Cri] + c__toR - Ki;
    Gi := Yi + cCbToGT[Cbi] + cCrtoGT[Cri] + c__toG - Ki;
    Bi := Yi + cCbtoBT[Cbi] + c__toB - Ki;
    R^ := ScaleAndRangeLimit(Ri);
    G^ := ScaleAndRangeLimit(Gi);
    B^ := ScaleAndRangeLimit(Bi);
    // Advance pointers
    inc(Y, 4); inc(Cb, 4); inc(Cr, 4); inc(K, 4);
    inc(R, 3); inc(G, 3); inc(B, 3);
    dec(Count);
  end;
end;
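
The "another set of tables" idea mentioned above could look roughly like the following. Because the 0.65 scaling is linear, it can be folded into the tables together with the 255 - x and 220 - K inversions, which leaves only lookups, additions and one div per channel. This is a sketch under assumptions, not code from the library: all names (YTab, KTab, YCCKToRGB, ...) are hypothetical, it truncates like the routine above instead of rounding, and the green offset is computed from the coefficients rather than taken from the posted constant.

```pascal
program AdobeYcckTables;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

const
  cScale = 1 shl 10;   // same 10-bit fixed-point scale as in the post
  cGamma = 0.65;       // contrast factor applied around 128

var
  // Each table already contains the inversion and the 0.65 factor.
  YTab, KTab, CrToR, CbToG, CrToG, CbToB: array[0..255] of Integer;
  OffR, OffG, OffB: Integer;
  PR, PG, PB: Integer;

procedure InitAdobeTables;
var
  i: Integer;
begin
  for i := 0 to 255 do
  begin
    YTab[i]  := Round(cGamma * cScale * (255 - i));
    KTab[i]  := Round(cGamma * cScale * (220 - i));
    CrToR[i] := Round(cGamma * 1.402   * cScale * (255 - i));
    CbToG[i] := Round(cGamma * 0.34414 * cScale * (255 - i));
    CrToG[i] := Round(cGamma * 0.71414 * cScale * (255 - i));
    CbToB[i] := Round(cGamma * 1.772   * cScale * (255 - i));
  end;
  // Chroma centring (coefficient * 128) times 0.65, plus 128 - 0.65 * 128 = 44.8
  OffR := Round((44.8 - cGamma * 1.402 * 128) * cScale);
  OffG := Round((44.8 + cGamma * (0.34414 + 0.71414) * 128) * cScale);
  OffB := Round((44.8 - cGamma * 1.772 * 128) * cScale);
end;

// Undo the fixed-point scale and clamp to the byte range.
function Limit(A: Integer): Integer;
begin
  Result := A div cScale;
  if Result < 0 then
    Result := 0
  else if Result > 255 then
    Result := 255;
end;

procedure YCCKToRGB(Y, Cb, Cr, K: Byte; var R, G, B: Integer);
var
  Base: Integer;
begin
  Base := YTab[Y] - KTab[K];          // shared by all three channels
  R := Limit(Base + CrToR[Cr] + OffR);
  G := Limit(Base - CbToG[Cb] - CrToG[Cr] + OffG);
  B := Limit(Base + CbToB[Cb] + OffB);
end;

begin
  InitAdobeTables;
  YCCKToRGB(0, 0, 0, 0, PR, PG, PB);
  WriteLn(PR, ' ', PG, ' ', PB);
end.
```

The six tables cost 6 KB of memory. Whether that beats two or three integer multiplies per channel depends on the CPU and cache behaviour, so it is worth measuring before committing to either form.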

Sanyin
Guest
Posted: Fri May 18, 2007 8:03 pm    Post subject: Re: Optimisation OT

I think it should be done with color management.

Nils Haeck
Guest
Posted: Sat May 19, 2007 8:11 am    Post subject: Re: Optimisation OT

Well, most CMYK JPEGs I have seen do indeed contain an ICC colour profile. You
can extract that with my lib, but then you need something like Little CMS to
actually use it. ICC colour profiles in these CMYK JPEGs can be quite large,
spread over e.g. 15 meta tags each containing 32K or so. So indeed there may be
valuable information in there :)

Nils