BorlandTalk.com Forum Index BorlandTalk.com
Borland discussion newsgroups
 
Archives   FAQFAQ   SearchSearch   MemberlistMemberlist   UsergroupsUsergroups   RegisterRegister 
 ProfileProfile   Log in to check your private messagesLog in to check your private messages   Log inLog in 

[V. Vassiliev] ADO data transformations

 
Post new topic   Reply to topic    BorlandTalk.com Forum Index -> C++ Builder Databases (ADO)
View previous topic :: View next topic  
Author Message
Vladimir Stefanovic
Guest





PostPosted: Thu Feb 17, 2005 6:54 pm    Post subject: [V. Vassiliev] ADO data transformations Reply with quote



Hi,

I have MS Access database (it's a dictionary), which generally
consists of three tables, which I have shown below. I want to
transform these three tables into single one and I already made
a utility for that. Unfortunately, my routine is too slow for 400.000
rows. Just to mention that I have nothing against the design with
three tables, I just want to make some speed tests.

Can someone help me with some advices of how to accomlish
the transformation.

[TableRelations]
Struct:
Language1_ID (number)
Language2_ID (number)

Sample:
1000 5000
1000 5001
1000 5002
1001 6000
1002 6001

[TableLanguage1]
Struct:
Language1_ID (number)
Language1_Word (Text)
Language1_Bool (Yes/No)

Sample:
1000 word1 Yes
1001 word2 Yes
1002 word3 Yes

[TableLanguage2]
Struct:
Language2_ID (number)
Language2_Word (Text)
Language2_Bool (Yes/No)

Sample:
5000 word4 No
5001 word5 No
5002 word6 No
6000 word7 No
6001 word8 No

Essencially, that means:

Translation: Language1->Language2:
word1 = word4,word5,word6
word2 = word7
word3 = word8

Translation: Language2->Language1:
word4 = word1
word5 = word1
word6 = word1
word7 = word2
word8 = word3

I want to make one table to look like this:

[TableOnlyOne]
Struct:
Word (Text)
Selected (Yes/No)
Meaning (Text)

Sample:
word1 Yes word4;word5;word6
word2 Yes word7
word3 Yes word8
word4 No word1
word5 No word1
word6 No word1
word7 No word2
word8 No word3

This is my routine /I changed some variable names without testing
so some simple anomalies are possible, anyway the routine worked
in the original/:

The problem is that after 50000 records it works 1 row/sec, and I
have 400000. Do you have any advice?

void __fastcall TForm1::Button1Click(TObject *Sender)
{
int BrLang1, BrLang2;
AnsiString RecLang1, RecLang2;
Variant Lang1LookupResults, Lang2LookupResults;

Variant Lang1_locvalues[2], Lang2_locvalues[2];

Lang1_locvalues[1] = Variant( 1 );
Lang2_locvalues[1] = Variant( 0 );

TLocateOptions SearchOptions;
SearchOptions.Clear();
SearchOptions = SearchOptions << loCaseInsensitive;

// Controls Disabled Here

ADOTableRelations->First();
while ( !ADOTableRelations->Eof )
{
// Take the Relations ID's
BrLang1 = ADOTableRelations->FieldValues["ID_Lang1"];
BrLang2 = ADOTableRelations->FieldValues["ID_Lang2"];

// Find Language1 word
Lang1LookupResults = ADOTableLang1->Lookup( "ID_Lang1", BrLang1,
"Language1_Word" );
RecLang1 = Lang1LookupResults;

// Find Language2 word
Lang2LookupResults = ADOTableLang2->Lookup( "ID_Lang2", BrLang2,
"Language2_Word" );
RecLang2 = Lang2LookupResults;

// Append to new row or add to existing
Lang1_locvalues[0] = Variant( RecLang1 );
if ( ADOTableOnlyOne->Locate( "Word;Selected",
VarArrayOf(Lang1_locvalues, 1), SearchOptions ) == false )
{
ADOTableOnlyOne->Insert();
ADOTableOnlyOne->FieldValues["Word"] = RecLang1;
ADOTableOnlyOne->FieldValues["Selected"] = 1;
ADOTableOnlyOne->FieldValues["Meaning"] = RecLang2;
ADOTableOnlyOne->Post();
}
else
{
ADOTableOnlyOne->Edit();
AnsiString Current = ADOTableOnlyOne->FieldValues["Meaning"];
ADOTableOnlyOne->FieldValues["Meaning"] = Current + ";" + RecLang2;
ADOTableOnlyOne->Post();
}

// Append to new row or add to existing
Lang2_locvalues[0] = Variant( RecLang2 );
if ( ADOTableOnlyOne->Locate( "Word;Selected",
VarArrayOf(Lang2_locvalues, 1), SearchOptions ) == false )
{
ADOTableOnlyOne->Insert();
ADOTableOnlyOne->FieldValues["Word"] = RecLang2;
ADOTableOnlyOne->FieldValues["Selected"] = 0;
ADOTableOnlyOne->FieldValues["Meaning"] = RecLang1;
ADOTableOnlyOne->Post();
}
else
{
ADOTableOnlyOne->Edit();
AnsiString Current = ADOTableOnlyOne->FieldValues["Meaning"];
ADOTableOnlyOne->FieldValues["Meaning"] = Current + ";" + RecLang1;
ADOTableOnlyOne->Post();
}

// Progress bars, etc ...

ADOTableRelations->Next();
}

// Controls Enabled Here
}


--
Best regards,
Vladimir Stefanovic


Back to top
Vladimir Stefanovic
Guest





PostPosted: Fri Feb 18, 2005 11:20 am    Post subject: Re: [V. Vassiliev] ADO data transformations Reply with quote



Maybe I should delete every processed record? That
will definitely slow down the process but I expect that
progress will be paced and predictable.


--
Best regards,
Vladimir Stefanovic
"Vladimir Stefanovic" <antivari (AT) po (DOT) sbb.co.yu> wrote

Quote:
Hi,

I have MS Access database (it's a dictionary), which generally
consists of three tables, which I have shown below. I want to
transform these three tables into single one and I already made
a utility for that. Unfortunately, my routine is too slow for 400.000
rows. Just to mention that I have nothing against the design with
three tables, I just want to make some speed tests.

Can someone help me with some advices of how to accomlish
the transformation.

[TableRelations]
Struct:
Language1_ID (number)
Language2_ID (number)

Sample:
1000 5000
1000 5001
1000 5002
1001 6000
1002 6001

[TableLanguage1]
Struct:
Language1_ID (number)
Language1_Word (Text)
Language1_Bool (Yes/No)

Sample:
1000 word1 Yes
1001 word2 Yes
1002 word3 Yes

[TableLanguage2]
Struct:
Language2_ID (number)
Language2_Word (Text)
Language2_Bool (Yes/No)

Sample:
5000 word4 No
5001 word5 No
5002 word6 No
6000 word7 No
6001 word8 No

Essencially, that means:

Translation: Language1->Language2:
word1 = word4,word5,word6
word2 = word7
word3 = word8

Translation: Language2->Language1:
word4 = word1
word5 = word1
word6 = word1
word7 = word2
word8 = word3

I want to make one table to look like this:

[TableOnlyOne]
Struct:
Word (Text)
Selected (Yes/No)
Meaning (Text)

Sample:
word1 Yes word4;word5;word6
word2 Yes word7
word3 Yes word8
word4 No word1
word5 No word1
word6 No word1
word7 No word2
word8 No word3

This is my routine /I changed some variable names without testing
so some simple anomalies are possible, anyway the routine worked
in the original/:

The problem is that after 50000 records it works 1 row/sec, and I
have 400000. Do you have any advice?

void __fastcall TForm1::Button1Click(TObject *Sender)
{
int BrLang1, BrLang2;
AnsiString RecLang1, RecLang2;
Variant Lang1LookupResults, Lang2LookupResults;

Variant Lang1_locvalues[2], Lang2_locvalues[2];

Lang1_locvalues[1] = Variant( 1 );
Lang2_locvalues[1] = Variant( 0 );

TLocateOptions SearchOptions;
SearchOptions.Clear();
SearchOptions = SearchOptions << loCaseInsensitive;

// Controls Disabled Here

ADOTableRelations->First();
while ( !ADOTableRelations->Eof )
{
// Take the Relations ID's
BrLang1 = ADOTableRelations->FieldValues["ID_Lang1"];
BrLang2 = ADOTableRelations->FieldValues["ID_Lang2"];

// Find Language1 word
Lang1LookupResults = ADOTableLang1->Lookup( "ID_Lang1", BrLang1,
"Language1_Word" );
RecLang1 = Lang1LookupResults;

// Find Language2 word
Lang2LookupResults = ADOTableLang2->Lookup( "ID_Lang2", BrLang2,
"Language2_Word" );
RecLang2 = Lang2LookupResults;

// Append to new row or add to existing
Lang1_locvalues[0] = Variant( RecLang1 );
if ( ADOTableOnlyOne->Locate( "Word;Selected",
VarArrayOf(Lang1_locvalues, 1), SearchOptions ) == false )
{
ADOTableOnlyOne->Insert();
ADOTableOnlyOne->FieldValues["Word"] = RecLang1;
ADOTableOnlyOne->FieldValues["Selected"] = 1;
ADOTableOnlyOne->FieldValues["Meaning"] = RecLang2;
ADOTableOnlyOne->Post();
}
else
{
ADOTableOnlyOne->Edit();
AnsiString Current = ADOTableOnlyOne->FieldValues["Meaning"];
ADOTableOnlyOne->FieldValues["Meaning"] = Current + ";" +
RecLang2;
ADOTableOnlyOne->Post();
}

// Append to new row or add to existing
Lang2_locvalues[0] = Variant( RecLang2 );
if ( ADOTableOnlyOne->Locate( "Word;Selected",
VarArrayOf(Lang2_locvalues, 1), SearchOptions ) == false )
{
ADOTableOnlyOne->Insert();
ADOTableOnlyOne->FieldValues["Word"] = RecLang2;
ADOTableOnlyOne->FieldValues["Selected"] = 0;
ADOTableOnlyOne->FieldValues["Meaning"] = RecLang1;
ADOTableOnlyOne->Post();
}
else
{
ADOTableOnlyOne->Edit();
AnsiString Current = ADOTableOnlyOne->FieldValues["Meaning"];
ADOTableOnlyOne->FieldValues["Meaning"] = Current + ";" +
RecLang1;
ADOTableOnlyOne->Post();
}

// Progress bars, etc ...

ADOTableRelations->Next();
}

// Controls Enabled Here
}


--
Best regards,
Vladimir Stefanovic




Back to top
Viatcheslav V. Vassiliev
Guest





PostPosted: Sat Feb 19, 2005 2:34 pm    Post subject: Re: [V. Vassiliev] ADO data transformations Reply with quote



If you can consider to not concat values from table lang2 and get result as:

word1 Yes word4
word1 Yes word5
word1 Yes word6
word2 Yes word7
word3 Yes word8
word4 No word1
word5 No word1
word6 No word1
word7 No word2
word8 No word3

then all you need is to execute 2 SQL commands (table and column names are
slightly changed):

insert into Result
select lang1_word as word, true as selected, lang2_word as meaning
from lang1
inner join (Relations
inner join lang2
on Relations.lang2_id = lang2.lang2_id)
on lang1.lang1_id = Relations.lang1_id

insert into Result
select lang2_word as word, false as selected, lang1_word as meaning
from lang2
inner join (Relations
inner join lang1
on Relations.lang1_id = lang1.lang1_id)
on lang2.lang2_id = Relations.lang2_id

This design (not concat) is more flexible - if you need all meanings for
word1, you just execute "select * from result where word = 'word1'" - if
this operation is performed frequently, index column word in table results.
This design may simplify future modifications too. In my test all string
columns are varchar(50).

Executing these 2 commands should be quite fast - be sure to have indexes on
all id columns (primary keys in tables lang1 and lang2 and non-unique
indexes in relations table).

If you must follow your design (with concat) - from tables lang1 and lang2
you need only 2 columns - ID and Word. Fetch both tables into TStringList
(string is for column Word, object - for ID), sort these string list by ID
(using custom sort) and work with these string lists (search needed id using
sort order) - execute only inserts into database using TableDirect = true
and server-side cursor, or using TADOCommand.

//------------------------------------------
Regards,
Vassiliev V. V.
http://www.managed-vcl.com - using .Net objects in Delphi for Win32 +
ADO.Net
http://www.oledbdirect.com - The fastest way to access MS SQL Server,
MS Jet (Access) and Interbase (through OLEDB)

"Vladimir Stefanovic" <antivari (AT) po (DOT) sbb.co.yu> сообщил/сообщила в новостях
следующее: news:4214e840 (AT) newsgroups (DOT) borland.com...
Quote:
Hi,

I have MS Access database (it's a dictionary), which generally
consists of three tables, which I have shown below. I want to
transform these three tables into single one and I already made
a utility for that. Unfortunately, my routine is too slow for 400.000
rows. Just to mention that I have nothing against the design with
three tables, I just want to make some speed tests.

Can someone help me with some advices of how to accomlish
the transformation.

[TableRelations]
Struct:
Language1_ID (number)
Language2_ID (number)

Sample:
1000 5000
1000 5001
1000 5002
1001 6000
1002 6001

[TableLanguage1]
Struct:
Language1_ID (number)
Language1_Word (Text)
Language1_Bool (Yes/No)

Sample:
1000 word1 Yes
1001 word2 Yes
1002 word3 Yes

[TableLanguage2]
Struct:
Language2_ID (number)
Language2_Word (Text)
Language2_Bool (Yes/No)

Sample:
5000 word4 No
5001 word5 No
5002 word6 No
6000 word7 No
6001 word8 No

Essencially, that means:

Translation: Language1->Language2:
word1 = word4,word5,word6
word2 = word7
word3 = word8

Translation: Language2->Language1:
word4 = word1
word5 = word1
word6 = word1
word7 = word2
word8 = word3

I want to make one table to look like this:

[TableOnlyOne]
Struct:
Word (Text)
Selected (Yes/No)
Meaning (Text)

Sample:
word1 Yes word4;word5;word6
word2 Yes word7
word3 Yes word8
word4 No word1
word5 No word1
word6 No word1
word7 No word2
word8 No word3

This is my routine /I changed some variable names without testing
so some simple anomalies are possible, anyway the routine worked
in the original/:

The problem is that after 50000 records it works 1 row/sec, and I
have 400000. Do you have any advice?

void __fastcall TForm1::Button1Click(TObject *Sender)
{
int BrLang1, BrLang2;
AnsiString RecLang1, RecLang2;
Variant Lang1LookupResults, Lang2LookupResults;

Variant Lang1_locvalues[2], Lang2_locvalues[2];

Lang1_locvalues[1] = Variant( 1 );
Lang2_locvalues[1] = Variant( 0 );

TLocateOptions SearchOptions;
SearchOptions.Clear();
SearchOptions = SearchOptions << loCaseInsensitive;

// Controls Disabled Here

ADOTableRelations->First();
while ( !ADOTableRelations->Eof )
{
// Take the Relations ID's
BrLang1 = ADOTableRelations->FieldValues["ID_Lang1"];
BrLang2 = ADOTableRelations->FieldValues["ID_Lang2"];

// Find Language1 word
Lang1LookupResults = ADOTableLang1->Lookup( "ID_Lang1", BrLang1,
"Language1_Word" );
RecLang1 = Lang1LookupResults;

// Find Language2 word
Lang2LookupResults = ADOTableLang2->Lookup( "ID_Lang2", BrLang2,
"Language2_Word" );
RecLang2 = Lang2LookupResults;

// Append to new row or add to existing
Lang1_locvalues[0] = Variant( RecLang1 );
if ( ADOTableOnlyOne->Locate( "Word;Selected",
VarArrayOf(Lang1_locvalues, 1), SearchOptions ) == false )
{
ADOTableOnlyOne->Insert();
ADOTableOnlyOne->FieldValues["Word"] = RecLang1;
ADOTableOnlyOne->FieldValues["Selected"] = 1;
ADOTableOnlyOne->FieldValues["Meaning"] = RecLang2;
ADOTableOnlyOne->Post();
}
else
{
ADOTableOnlyOne->Edit();
AnsiString Current = ADOTableOnlyOne->FieldValues["Meaning"];
ADOTableOnlyOne->FieldValues["Meaning"] = Current + ";" +
RecLang2;
ADOTableOnlyOne->Post();
}

// Append to new row or add to existing
Lang2_locvalues[0] = Variant( RecLang2 );
if ( ADOTableOnlyOne->Locate( "Word;Selected",
VarArrayOf(Lang2_locvalues, 1), SearchOptions ) == false )
{
ADOTableOnlyOne->Insert();
ADOTableOnlyOne->FieldValues["Word"] = RecLang2;
ADOTableOnlyOne->FieldValues["Selected"] = 0;
ADOTableOnlyOne->FieldValues["Meaning"] = RecLang1;
ADOTableOnlyOne->Post();
}
else
{
ADOTableOnlyOne->Edit();
AnsiString Current = ADOTableOnlyOne->FieldValues["Meaning"];
ADOTableOnlyOne->FieldValues["Meaning"] = Current + ";" +
RecLang1;
ADOTableOnlyOne->Post();
}

// Progress bars, etc ...

ADOTableRelations->Next();
}

// Controls Enabled Here
}


--
Best regards,
Vladimir Stefanovic




Back to top
Vladimir Stefanovic
Guest





PostPosted: Sat Feb 19, 2005 3:03 pm    Post subject: Re: [V. Vassiliev] ADO data transformations Reply with quote

"Viatcheslav V. Vassiliev" wrote

Quote:
If you can consider to not concat values from table lang2 and
get result as [...]

Yes, I will definitely try the concept you suggested.


Quote:
If you must follow your design (with concat) - from tables lang1
and lang2 you need only 2 columns - ID and Word. Fetch both
tables into TStringList (string is for column Word, object - for
ID), sort these string list by ID (using custom sort) and work
with these string lists (search needed id using sort order) - execute
only inserts into database using TableDirect = true and server-side
cursor, or using TADOCommand.

Ok, thanks a lot for your help. I'll try this concept too.



--
Best regards,
Vladimir Stefanovic




Back to top
Display posts from previous:   
Post new topic   Reply to topic    BorlandTalk.com Forum Index -> C++ Builder Databases (ADO) All times are GMT
Page 1 of 1

 
Jump to:  
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum


Powered by phpBB © 2001, 2006 phpBB Group
SEO toolkit © 2004-2006 webmedic.