 |
BorlandTalk.com Borland discussion newsgroups
|
| Author |
Message |
Sanyin Guest
|
Posted: Tue May 15, 2007 7:21 pm Post subject: Optimisation |
|
|
I want to optimize this:
Any tips? MMX, SSE, or asm?
function YCCKtoColor32(Y, Cb, Cr, K: integer): tcolor32;
var
  _k, _y, _cb, _cr: Integer;
  val: single;
  gm: single;
begin
  with tcolor32entry(RESULT) do
  begin
    gm := 0.65;
    _k := 220 - K;
    _y := 255 - Y;
    _cb := 255 - Cb;
    _cr := 255 - Cr;
    val := _y + 1.402 * (_cr - 128) - _k;
    val := (val - 128) * gm + 128;
    if val < 0 then r := 0 else
    if val > 255 then r := 255 else
      r := round(val + 0.5);
    val := _y - 0.34414 * (_cb - 128) - 0.71414 * (_cr - 128) - _k;
    val := (val - 128) * gm + 128;
    if val < 0 then g := 0 else
    if val > 255 then g := 255 else
      g := round(val + 0.5);
    val := _y + 1.772 * (_cb - 128) - _k;
    val := (val - 128) * gm + 128;
    if val < 0 then b := 0 else
    if val > 255 then b := 255 else
      b := round(val + 0.5);
    a := 255;
  end
end; |
|
|
 |
Bob Gonder Guest
|
Posted: Tue May 15, 2007 11:34 pm Post subject: Re: Optimisation |
|
|
Sanyin wrote:
| Quote: | I want to optimize this:
Any tips? MMX, SSE, or asm?
|
I would start by optimizing the basic math:
| Quote: | function YCCKtoColor32(Y, Cb, Cr, K: integer): tcolor32;
var
  _k, _y, _cb, _cr: Integer;
  val: single;
  gm: single;
begin
  with tcolor32entry(RESULT) do
  begin
    gm := 0.65;
    _k := 220 - K;
    _y := 255 - Y;
    _cb := 255 - Cb;
    _cr := 255 - Cr;
    val := _y + 1.402 * (_cr - 128) - _k;
    val := (val - 128) * gm + 128;
|
Expand
val = (255 - Y + 1.402 * (255 - Cr - 128) - 220 + K - 128) * 0.65 + 128
Rearrange
val = (K - Y + 1.402 * (255 - 128 - Cr) + 255 - 220 - 128) * 0.65 + 128
Simplify
val = (K - Y + 1.402 * (127 - Cr) - 93) * 0.65 + 128
Distribute
val = (K - Y + 1.402 * 127 - 1.402 * Cr - 93) * 0.65 + 128
Simplify
val = (K - Y - 1.402 * Cr - 93 + 178.054) * 0.65 + 128
Simplify/Distribute
val = (K - Y - 1.402 * Cr) * 0.65 + 85.054 * 0.65 + 128
Simplify
val = (K - Y - 1.402 * Cr) * 0.65 + 183.2851
Fold the + 0.5 from round(val + 0.5) into the constant
// Ri is an Integer temporary; Pascal is case-insensitive, so it cannot be named R
Ri := round((K - Y - 1.402 * Cr) * 0.65 + 183.7851);
if Ri < 0 then r := 0 else
if Ri > 255 then r := 255 else
r := Ri;
Then I'd look into getting rid of the floats.
((1000*K - 1000*Y - 1000*1.402*Cr)*100*0.65 + 100*1000*183.7851)/100/1000;
Or
R:= (( 1000 * (K-Y) - 1402*Cr) * 65 +18378510 )/100000
But that adds one multiply and one divide, which might negate the
advantage over float. If you could come up with a workable
power-of-two scale instead of 100 and 1000, then the extra mul and div would
become efficient shifts. |
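A minimal sketch of the power-of-two idea for the red channel only, assuming 16 fractional bits. The names FixBits, FixOne, cGainFix, cCrFix, cOffsFix and RedFixed are hypothetical; the constants come from the simplified formula above (0.65, 0.65 * 1.402 = 0.9113, and 183.7851). It divides by a power of two with div instead of using shr, because shr on a negative Integer is a logical shift in Delphi.

```pascal
program FixedPointRedSketch;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$IFDEF MSWINDOWS}{$APPTYPE CONSOLE}{$ENDIF}

const
  // Hypothetical names; 16 fractional bits is an assumed choice.
  FixBits  = 16;
  FixOne   = 1 shl FixBits;
  cGainFix = Round(0.65 * FixOne);          // 0.65
  cCrFix   = Round(0.65 * 1.402 * FixOne);  // 0.9113
  cOffsFix = Round(183.7851 * FixOne);      // 183.2851 + 0.5

// Red channel only; green and blue would follow the same pattern.
function RedFixed(Y, Cr, K: Integer): Byte;
var
  Ri: Integer;
begin
  // div truncates toward zero; negative results are clamped to 0 anyway.
  Ri := (cGainFix * (K - Y) - cCrFix * Cr + cOffsFix) div FixOne;
  if Ri < 0 then
    Result := 0
  else if Ri > 255 then
    Result := 255
  else
    Result := Ri;
end;

begin
  WriteLn(RedFixed(0, 0, 0));
  WriteLn(RedFixed(255, 255, 0));
  WriteLn(RedFixed(0, 0, 255));
end.
```

Because the 0.5 is already folded into the offset, the truncating division rounds val to the nearest integer, so the result can differ by one from the original round(val + 0.5).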
|
|
 |
Davy Landman Guest
|
Posted: Wed May 16, 2007 1:42 am Post subject: Re: Optimisation |
|
|
| Quote: | R:= (( 1000 * (K-Y) - 1402*Cr) * 65 +18378510 )/100000
|
And in Delphi you could even make that faster by replacing the /100000 with
*(1/100000).
Kind regards,
Davy |
|
|
 |
sdasdasf Guest
|
Posted: Wed May 16, 2007 1:42 am Post subject: Re: Optimisation |
|
|
"Bob Gonder" <notbg (AT) notmindspring (DOT) invalid> wrote in message
news:5orj43l661ajbj8lvqm9ivj20tsfqpik9a (AT) 4ax (DOT) com...
| Quote: | ...
If you could come up with a workable
power-of-two instead of 100 and 1000, then the extra mul and div would
become efficient shifts.
|
Fixed-point math? (16.16) |
|
|
 |
Bob Gonder Guest
|
Posted: Wed May 16, 2007 1:47 am Post subject: Re: Optimisation |
|
|
Davy Landman wrote:
| Quote: | R:= (( 1000 * (K-Y) - 1402*Cr) * 65 +18378510 )/100000
and in Delphi you could even make that faster by replacing the /100000 by
*(1/100000)
|
Ummm, except we were trying to get away from floats? |
|
|
 |
Davy Landman Guest
|
|
|
 |
Michael Hansen Guest
|
Posted: Wed May 16, 2007 1:54 pm Post subject: Re: Optimisation |
|
|
Bob Gonder wrote:
| Quote: | ...
Then I'd look into getting rid of the floats.
((1000*K - 1000*Y - 1000*1.402*Cr)*100*0.65 + 100*1000*183.7851)/100/1000;
Or
R:= (( 1000 * (K-Y) - 1402*Cr) * 65 +18378510 )/100000
But that adds one multiply and one divide which might negate the
advantage over float. If you could come up with a workable
power-of-two instead of 100 and 1000, then the extra mul and div would
become efficient shifts.
|
Since the result channels are bytes, my guess is that the precision of those
constants is not really going to have any influence anyway. Choosing another
fixed-point precision level could very well work then (with rounding taken
into account).
/Michael |
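One way to check that guess is to measure it. The sketch below is an illustration only: MaxRedError and ClampByte are hypothetical names, and it covers just the red channel of the simplified formula quoted earlier in the thread. It compares a fixed-point version with a given number of fractional bits against the floating-point value rounded to nearest, over every Y, Cr and K byte value, and prints the largest difference found.

```pascal
program FixedPointPrecisionSketch;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$IFDEF MSWINDOWS}{$APPTYPE CONSOLE}{$ENDIF}

function ClampByte(V: Integer): Integer;
begin
  if V < 0 then
    Result := 0
  else if V > 255 then
    Result := 255
  else
    Result := V;
end;

// Largest difference between the floating-point red channel (rounded to
// nearest) and a fixed-point version with Bits fractional bits.
function MaxRedError(Bits: Integer): Integer;
var
  One, cGain, cCr, cOffs: Integer;
  Y, Cr, K, FloatR, FixedR, Diff: Integer;
begin
  One   := 1 shl Bits;
  cGain := Round(0.65 * One);
  cCr   := Round(0.9113 * One);    // 0.65 * 1.402
  cOffs := Round(183.7851 * One);  // 183.2851 plus 0.5 so truncation rounds
  Result := 0;
  for Y := 0 to 255 do
    for Cr := 0 to 255 do
      for K := 0 to 255 do
      begin
        FloatR := ClampByte(Round((K - Y - 1.402 * Cr) * 0.65 + 183.2851));
        FixedR := ClampByte((cGain * (K - Y) - cCr * Cr + cOffs) div One);
        Diff := Abs(FloatR - FixedR);
        if Diff > Result then
          Result := Diff;
      end;
end;

begin
  WriteLn('8 bits:  ', MaxRedError(8));
  WriteLn('10 bits: ', MaxRedError(10));
  WriteLn('16 bits: ', MaxRedError(16));
end.
```

If the printed maximum stays at one level or less for a small bit count, the coarser scale is probably good enough for byte output; that is a judgment call for the application, not something this sketch decides.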
|
|
 |
Nils Haeck Guest
|
Posted: Wed May 16, 2007 4:51 pm Post subject: Re: Optimisation |
|
|
From my JPEG library:
No multiplies, just int additions, shifts and table lookups. If you want to
use this code you'll have to credit my name and company (Nils
Haeck, www.simdesign.nl).
(Call InitYCbCrToRGBTables in the initialization section.)
Hope that helps,
Nils
const
  cColorConvScale = 1 shl 10;
  c__toR = Round( -179.456 * cColorConvScale); // -1.402 * 128
  c__toG = Round(135.45984 * cColorConvScale); // (0.34414 + 0.71414) * 128
  c__toB = Round( -226.816 * cColorConvScale); // -1.772 * 128
var
  cY_toRT: array[0..255] of integer;
  cCrtoRT: array[0..255] of integer;
  cCbtoGT: array[0..255] of integer;
  cCrtoGT: array[0..255] of integer;
  cCbtoBT: array[0..255] of integer;
procedure InitYCbCrToRGBTables;
var
  i: integer;
begin
  for i := 0 to 255 do begin
    cY_toRT[i] := Round( 1 * cColorConvScale * i);
    cCrtoRT[i] := Round( 1.402 * cColorConvScale * i);
    cCbtoGT[i] := Round( -0.34414 * cColorConvScale * i);
    cCrtoGT[i] := Round( -0.71414 * cColorConvScale * i);
    cCbtoBT[i] := Round( 1.772 * cColorConvScale * i);
  end;
end;
procedure TsdTransformYCCKToRGB.Transform(Source, Dest: pointer; Count: integer);
// YCCK is a colorspace where the CMY part of CMYK is first converted to RGB,
// then transformed to YCbCr as usual. The K part is appended without any changes.
// To transform back, we do the YCbCr -> RGB transform, then add K
var
  R, G, B, Y, Cb, Cr, K: PByte;
  Yi, Ki, Ri, Gi, Bi: integer;
  function RangeLimit(A: integer): integer;
  begin
    // Delphi seems to convert the "div" here to SAR just fine (D7), so we
    // don't use ASM but plain pascal
    Result := A div (1 shl 10);
    if Result < 0 then
      Result := 0
    else
      if Result > 255 then
        Result := 255;
  end;
begin
  Y := Source;
  Cb := Source; inc(Cb);
  Cr := Source; inc(Cr, 2);
  K := Source; inc(K, 3);
  // RGB is laid out in memory as BGR
  B := Dest;
  G := Dest; inc(G);
  R := Dest; inc(R, 2);
  // Repeat Count times..
  while Count > 0 do
  begin
    // Do the conversion in int
    Yi := cY_toRT[Y^];
    Ki := cY_toRT[K^];
    // The components seem to be inverted. Seems to be a Photoshop problem.
    Ri := -Yi - cCrtoRT[Cr^] - c__toR + Ki;
    Gi := -Yi - cCbToGT[Cb^] - cCrtoGT[Cr^] - c__toG + Ki;
    Bi := -Yi - cCbtoBT[Cb^] - c__toB + Ki;
    R^ := RangeLimit(Ri);
    G^ := RangeLimit(Gi);
    B^ := RangeLimit(Bi);
    // Advance pointers
    inc(Y, 4); inc(Cb, 4); inc(Cr, 4); inc(K, 4);
    inc(R, 3); inc(G, 3); inc(B, 3);
    dec(Count);
  end;
end;
|
|
|
 |
Sanyin Guest
|
Posted: Wed May 16, 2007 6:20 pm Post subject: Re: Optimisation OT |
|
|
I don't know about that Adobe problem (inverted YCCK).
Does saving with Photoshop 7, CS, CS2, etc. produce a YCCK JPEG?
I think it's correct; at least with my functions there are no problems
decoding it. |
|
|
 |
Sanyin Guest
|
Posted: Wed May 16, 2007 6:23 pm Post subject: Re: Optimisation |
|
|
"Nils Haeck" <bla (AT) bla (DOT) com> wrote in message
news:464aede1$1 (AT) newsgroups (DOT) borland.com...
| Quote: | From my Jpeg library:
No multiplies, just int additions, shifts and table lookups. If you want
to use this code you'll have to acknowledge with my name and company (Nils
Haeck, www.simdesign.nl)
(call InitYCbCrToRGBTables in initialization section)
|
Your code makes very dark images! |
|
|
 |
Nils Haeck Guest
|
Posted: Thu May 17, 2007 3:46 am Post subject: Re: Optimisation OT |
|
|
Can you send me some test images by email?
Nils |
|
|
 |
Nils Haeck Guest
|
Posted: Thu May 17, 2007 4:00 am Post subject: Re: Optimisation |
|
|
| Quote: | gm:=0.65;
_k := 220 - K;
|
Where do 0.65 and 220 come from? Did you find them in some documentation,
or did you experiment with these values? Do you have a link to any
documentation describing this?
Nils |
|
|
 |
Nils Haeck Guest
|
Posted: Thu May 17, 2007 4:48 am Post subject: Re: Optimisation OT |
|
|
This conversion should be correct. Can you test it? It adds a few int
multiplications, which can be avoided by adding another set of tables. But I
don't want to do that before I know for sure that this one works.
Nils
procedure TsdTransformYCCKToRGBAdobe.Transform(Source, Dest: pointer; Count: integer);
// YCCK to RGB for Adobe images is different. First, the Y, Cr and Cb are
// inverted, and k* = 220 - K. The normal YCbCr to RGB is then applied. As a
// last step, the values are scaled by 0.65 around 128
const
  c0_65: integer = round(0.65 * cColorConvScale);
  c44_8: integer = round(44.8 * cColorConvScale); // 128 - 0.65 * 128
var
  R, G, B, Y, Cb, Cr, K: PByte;
  Yi, Ki, Ri, Gi, Bi, Cbi, Cri: integer;
  function ScaleAndRangeLimit(A: integer): integer;
  begin
    // First the scaling
    A := (A * c0_65) div cColorConvScale + c44_8;
    // Undo fixed precision and range limit
    Result := A div cColorConvScale;
    if Result < 0 then
      Result := 0
    else
      if Result > 255 then
        Result := 255;
  end;
begin
  Y := Source;
  Cb := Source; inc(Cb);
  Cr := Source; inc(Cr, 2);
  K := Source; inc(K, 3);
  // RGB is laid out in memory as BGR
  B := Dest;
  G := Dest; inc(G);
  R := Dest; inc(R, 2);
  // Repeat Count times..
  while Count > 0 do
  begin
    // Do the conversion in int
    Yi := cY_toRT[255 - Y^];
    Cbi := 255 - Cb^;
    Cri := 255 - Cr^;
    Ki := (220 - K^) * cColorConvScale;
    Ri := Yi + cCrtoRT[Cri] + c__toR - Ki;
    Gi := Yi + cCbToGT[Cbi] + cCrtoGT[Cri] + c__toG - Ki;
    Bi := Yi + cCbtoBT[Cbi] + c__toB - Ki;
    R^ := ScaleAndRangeLimit(Ri);
    G^ := ScaleAndRangeLimit(Gi);
    B^ := ScaleAndRangeLimit(Bi);
    // Advance pointers
    inc(Y, 4); inc(Cb, 4); inc(Cr, 4); inc(K, 4);
    inc(R, 3); inc(G, 3); inc(B, 3);
    dec(Count);
  end;
end; |
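The "another set of tables" idea can be sketched as follows. This is an assumption about how such tables might look, not code from the library: all names (cScale, cGain, tY, tK, tCrR, tCbG, tCrG, tCbB, InitPrescaledTables, ConvertPixel) are hypothetical. The inversion (255 - i, 220 - i) and the 0.65 scale are baked into each table entry, and the 44.8 offset is folded into the per-channel constants, so the inner loop needs only lookups, additions and one division by the scale.

```pascal
program PrescaledTablesSketch;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$IFDEF MSWINDOWS}{$APPTYPE CONSOLE}{$ENDIF}

const
  cScale = 1 shl 10;  // same 10-bit fixed-point scale as cColorConvScale
  cGain  = 0.65;      // the "gm" factor from the original function

var
  tY, tK, tCrR, tCbG, tCrG, tCbB: array[0..255] of Integer;
  cConstR, cConstG, cConstB: Integer;
  R, G, B: Byte;

procedure InitPrescaledTables;
var
  i: Integer;
begin
  // Constant terms: 0.65 * (128-offset term) plus 128 - 0.65 * 128 = 44.8
  cConstR := Round((44.8 - cGain * 179.456) * cScale);
  cConstG := Round((44.8 + cGain * 135.45984) * cScale);
  cConstB := Round((44.8 - cGain * 226.816) * cScale);
  for i := 0 to 255 do
  begin
    tY[i]   := Round(cGain * cScale * (255 - i));
    tK[i]   := Round(cGain * cScale * (220 - i));
    tCrR[i] := Round(1.402 * cGain * cScale * (255 - i));
    tCbG[i] := Round(-0.34414 * cGain * cScale * (255 - i));
    tCrG[i] := Round(-0.71414 * cGain * cScale * (255 - i));
    tCbB[i] := Round(1.772 * cGain * cScale * (255 - i));
  end;
end;

function Clamp(A: Integer): Byte;
begin
  A := A div cScale;  // undo the fixed-point scale
  if A < 0 then
    Result := 0
  else if A > 255 then
    Result := 255
  else
    Result := A;
end;

procedure ConvertPixel(Y, Cb, Cr, K: Byte; var OutR, OutG, OutB: Byte);
var
  Base: Integer;
begin
  Base := tY[Y] - tK[K];  // shared by all three channels
  OutR := Clamp(Base + tCrR[Cr] + cConstR);
  OutG := Clamp(Base + tCbG[Cb] + tCrG[Cr] + cConstG);
  OutB := Clamp(Base + tCbB[Cb] + cConstB);
end;

begin
  InitPrescaledTables;
  ConvertPixel(0, 128, 128, 0, R, G, B);
  WriteLn(R, ' ', G, ' ', B);
end.
```

The trade-off is six 1 KB tables instead of five, plus one extra table for K, in exchange for removing the multiplication in the scaling step. Each table entry is rounded separately, so results may differ by one level from the multiply-based version.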
|
|
 |
Sanyin Guest
|
Posted: Fri May 18, 2007 8:03 pm Post subject: Re: Optimisation OT |
|
|
| I think it should be done with color management. |
|
|
 |
Nils Haeck Guest
|
Posted: Sat May 19, 2007 8:11 am Post subject: Re: Optimisation OT |
|
|
Well, most CMYK JPEGs I have seen do contain an ICC colour profile. You can extract
that with my lib, but then you need something like Little CMS to actually
use it. ICC colour profiles in these CMYK JPEGs can be quite large, spread
over e.g. 15 meta tags each containing 32K or so. So indeed there may be
valuable information in there :)
Nils |
|
|
 |