BorlandTalk.com: Borland discussion newsgroups

Help with Text File Processing Problem

 
Forum: Delphi (General)

cloudzero (Guest), posted Tue Nov 23, 2004 12:10 am

Hi gang,
After a very long absence from programming, I've been persuaded to
build a text processing program. I've come up against a sticky little
problem that I hope you can help on.

This routine is expected to read two text files and create two more.
The primary file is the original file, containing perhaps several tens
of thousands of lines. The secondary file contains a shorter list of
lines.

The routine needs to read the primary file a line at a time and
compare each line to the lines found in the secondary file. If a line
from the secondary file is found within the line from the primary
file, that line needs to go to the quarantine file; otherwise, it
needs to go to the
target file. I have tried several incarnations of the code below with
varying degrees of failure.

I'm sure I'm missing something really fundamental and yes I know I can
shorten a few lines..

Thanks in advance.

begin
  // original source text file
  AssignFile(sourcefile, PrimaryDialogue.filename);
  Reset(sourcefile);

  // output file with lines containing banned word
  assignFile(quarantined, 'quarantined.txt');
  rewrite(quarantined);

  // remaining lines
  assignFile(targetfile, savedialogue.FileName);
  rewrite(targetfile);
  while not EOF(sourcefile) do begin
    readln(sourcefile, buffer2);

    // text file containing list of banned words.
    AssignFile(blacklist, SecondaryDialogue.filename);
    Reset(blacklist);
    while not EOF(blacklist) do begin
      readln(blacklist, buffer);
      textpos:=pos((buffer2),(buffer));
      if textpos <> 0 then
        Writeln(quarantined, buffer2)
      else
        Writeln(targetfile, buffer2);
    end; // if
  end; // while
end; // while
end; // if
closefile(sourcefile);
CloseFile(targetfile);
CloseFile(blacklist);
CloseFile(quarantined);

Alan Mead (Guest), posted Tue Nov 23, 2004 8:28 am

On Mon, 22 Nov 2004 16:10:41 -0800, cloudzero wrote:

Quote:
target file. I have tried several incarnations of the code below with
varying degrees of failure.

I'm sure I'm missing something really fundamental and yes I know I can
shorten a few lines..

Could you be more specific about how it fails? One problem: it looks
like you open the SecondaryDialogue file and re-read the banned words
list for each line of the sourcefile. That seems wrong. You should read
the short list of banned words into memory once, before you read the
source file. Otherwise, the program will take forever.

You could read them into a TStringList (indeed, TStringList.LoadFromFile
will read them for you).
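
A minimal sketch of that idea, assuming the banned words sit one per line in a plain text file. The file name 'blacklist.txt' and the constant BlacklistName are made up for illustration; in the posted code the name would come from SecondaryDialogue.FileName.

```pascal
program LoadBlacklist;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

const
  BlacklistName = 'blacklist.txt'; // hypothetical file name (assumption)

var
  Blacklist: TStringList;
  i: Integer;
begin
  if not FileExists(BlacklistName) then
  begin
    Writeln('File not found: ', BlacklistName);
    Halt(1);
  end;

  Blacklist := TStringList.Create;
  try
    // Reads the whole file once; each line becomes one list entry.
    Blacklist.LoadFromFile(BlacklistName);

    // The list now lives in memory and can be scanned for every
    // source line without touching the disk again.
    for i := 0 to Blacklist.Count - 1 do
      Writeln(i, ': ', Blacklist[i]);
  finally
    Blacklist.Free;
  end;
end.
```

Loading the list once also removes the need to Reset and close the blacklist file inside the main loop.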

-Alan


cloudzero (Guest), posted Tue Nov 23, 2004 5:27 pm

Hi Alan, and thanks for replying :^)

The code filters out just the first occurrence of a banned word; the
rest are ignored and passed on to the safe list. In addition to this,
if I have, say, 10 lines in my banned list, the first matching line is
passed, but then 9 other copies of that line are passed through to the
safe list. All other lines are multiplied 10 times too.

Glenn




Alan Mead (Guest), posted Wed Nov 24, 2004 4:37 am

On Tue, 23 Nov 2004 09:27:24 -0800, cloudzero wrote:

Quote:
The code filters out just the first occurrence of a banned word; the
rest are ignored and passed on to the safe list. In addition to this,
if I have, say, 10 lines in my banned list, the first matching line is
passed, but then 9 other copies of that line are passed through to the
safe list. All other lines are multiplied 10 times too.

I would have to run your code on your data to really see what it's
doing...

I'm still thinking that you are reading the files in the wrong order... I
guess that's why you have duplicated garbage output.
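
To spell out the duplication: in the posted loop the Writeln sits inside the inner loop over the banned list, so every source line is written once per banned word, which would explain the 10 copies. Below is a rough sketch of one way to restructure it, deciding once per source line after the whole list has been checked. The helper ContainsBanned and the sample data are made up for illustration, and in-memory lists stand in for the real files.

```pascal
program FilterDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  Classes;

// True if Line contains any non-empty entry of Banned (case-sensitive).
// ContainsBanned is a hypothetical helper, not part of the posted code.
function ContainsBanned(const Line: string; Banned: TStrings): Boolean;
var
  i: Integer;
begin
  Result := False;
  for i := 0 to Banned.Count - 1 do
    if (Banned[i] <> '') and (Pos(Banned[i], Line) > 0) then
    begin
      Result := True;
      Exit; // one match is enough; stop scanning the list
    end;
end;

var
  Source, Banned, Target, Quarantined: TStringList;
  i: Integer;
begin
  Source := TStringList.Create;
  Banned := TStringList.Create;
  Target := TStringList.Create;
  Quarantined := TStringList.Create;
  try
    // Made-up sample data standing in for the two input files.
    Source.Add('the quick brown fox');
    Source.Add('a lazy dog');
    Source.Add('nothing to see here');
    Banned.Add('fox');
    Banned.Add('dog');

    // Each source line is written exactly once, to exactly one list.
    for i := 0 to Source.Count - 1 do
      if ContainsBanned(Source[i], Banned) then
        Quarantined.Add(Source[i])
      else
        Target.Add(Source[i]);

    Writeln('quarantined: ', Quarantined.Count);
    Writeln('target: ', Target.Count);
  finally
    Quarantined.Free;
    Target.Free;
    Banned.Free;
    Source.Free;
  end;
end.
```

With this sample data two lines end up quarantined and one goes to the target list. For real files, the same loop could read the source with Readln and write with Writeln, or the two result lists could be written out with SaveToFile.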

But I see what looks like a clear error: your arguments to pos are
backwards... it's pos(needle,haystack) (look for string needle in string
haystack).
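
A tiny check of the argument order, using a made-up line and word. Pos returns the 1-based position of the first argument inside the second, or 0 when it is not found, and the comparison is case-sensitive.

```pascal
program PosOrder;
{$APPTYPE CONSOLE}

const
  Line = 'the quick brown fox'; // made-up source line
  Word = 'fox';                 // made-up banned word

begin
  // Correct order: look for the banned word inside the source line.
  Writeln(Pos(Word, Line)); // prints 17

  // Reversed order: looks for the whole line inside the short word,
  // which can only succeed when the two strings are identical.
  Writeln(Pos(Line, Word)); // prints 0
end.
```

In the posted code that means the call should be pos(buffer, buffer2), since buffer holds the banned word and buffer2 holds the source line.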

Why don't you fix those problems and, if the code still doesn't work,
re-post the new code and describe how it fails? If you had a couple of
made-up lines of data, that would help someone help you (but don't post
a lot of garbage or swear words here).

-Alan




cloudzero (Guest), posted Wed Nov 24, 2004 5:18 pm

Thanks Alan, I'll go away and rework the code.

Glenn

