BorlandTalk.com Borland discussion newsgroups
desp (Guest)
Posted: Mon Apr 30, 2007 1:31 am
Post subject: Need some suggestions from the experts here, for a fast algo
Hello, I was wondering if some of you could share your knowledge here.
I'm trying to write an algorithm to do the following (very) quickly:
I have a file (around 10 GB in size) containing random letters A->Z (like ASDFSDA).
If the input is "ABD", I find the first "A", then find the distance from that A
to the first "B" after it (say 5 chars). I then need to check whether "D" is the
same distance from B as B is from A (i.e. at B+5 chars).
What is the best way to go about this? The file is STATIC and constant,
and will not change.
Any ideas/theories on how to do this would be greatly appreciated!

John Herbster (Guest)
Posted: Mon Apr 30, 2007 2:10 am
Post subject: Re: Need some suggestions from the experts here, for a fast
"desp" <sabrwolf (AT) gmail (DOT) com> wrote:
Quote:
    I'm trying to write an algorithm to (very) quickly
    do this: I have a file (around 10 GB in size) with
    random letters A->Z (like ASDFSDA). If the input is
    "ABD", I find the first "A", then find the distance
    from that A to the first "B" (say 5 chars). I then need
    to check if "D" is the same distance from B as A to B
    (like B+5 chars). ...
Is the result to be just true or false, or if not, then
what results are needed?
Is the file to be treated as circular?
--JohnH

Bob Gonder (Guest)
Posted: Mon Apr 30, 2007 4:52 am
Post subject: Re: Need some suggestions from the experts here, for a fast
desp wrote:
Quote:
    I have a file (around 10 GB in size) with random letters A->Z (like ASDFSDA).
    If the input is "ABD", I find the first "A", then find the distance from that A
    to the first "B" (say 5 chars). I then need to check if "D" is the same
    distance from B as A to B (like B+5 chars).
So,
    AOffset = scan( array, 0, AVal )              // first A in the file
    BOffset = scan( array, AOffset+1, BVal )      // first B after that A
    return CVal == array[ BOffset * 2 - AOffset ] // i.e. BOffset + (BOffset - AOffset)
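A minimal in-memory sketch of the three steps above, in Python. The function name `check_pattern` and the single-byte arguments are made up for illustration, and it assumes the data fits in memory, so it ignores the file-size issue entirely. It also assumes there is no wrap-around at the end of the data.

```python
def check_pattern(data: bytes, a: bytes, b: bytes, c: bytes) -> bool:
    """Return True if c sits as far past the first b as that b sits past the first a."""
    a_off = data.find(a)               # first A
    if a_off < 0:
        return False
    b_off = data.find(b, a_off + 1)    # first B after that A
    if b_off < 0:
        return False
    c_off = 2 * b_off - a_off          # B + (B - A)
    if c_off >= len(data):             # assumption: no wrap-around, so past the end is a miss
        return False
    return data[c_off:c_off + 1] == c


# A at 0, B at 5, so the third letter is checked at offset 10.
print(check_pattern(b"AxxxxBxxxxD", b"A", b"B", b"D"))  # True
```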
The tricky parts are that array[] is a file, and that offsets may exceed the
4 GB limit of a 32-bit value.
I would be tempted to make xOffset a two-part value, Sector and
Offset. Maybe decide that a Sector is 1 MB in size, so disk reads
would be 1 MB each. Offset would of course be the offset within the 1 MB
Sector. (I seem to remember someone mentioning that 32 KB is the optimal
read size, rather than 1 MB, but then you'd be limiting yourself to
128 TB.)
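The Sector/Offset idea could be sketched roughly as follows, again in Python. `SECTOR_SIZE`, `split_offset` and this version of `scan` are illustrative names, not anything from the thread beyond the pseudocode's `scan`. Python integers do not overflow, so the split is kept here only to mirror the scheme; in a language with 32-bit integers it is what lets positions go past 4 GB. On the 128 TB remark: assuming a 32-bit sector number, 32 KB sectors reach 2^32 x 32 KB = 128 TB, while 1 MB sectors would reach 4 PB.

```python
import io

SECTOR_SIZE = 1 << 20  # 1 MB per read, as suggested above (an assumption, tune as needed)


def split_offset(pos, sector_size=SECTOR_SIZE):
    """Split an absolute file position into (sector number, offset within the sector)."""
    return divmod(pos, sector_size)


def scan(f, start, value, sector_size=SECTOR_SIZE):
    """Return the absolute position of the first `value` byte at or after `start`, or -1."""
    sector, offset = split_offset(start, sector_size)
    while True:
        f.seek(sector * sector_size)
        block = f.read(sector_size)      # one sector-sized read per iteration
        if not block:                    # end of file reached
            return -1
        hit = block.find(value, offset)
        if hit >= 0:
            return sector * sector_size + hit
        sector, offset = sector + 1, 0   # continue at the start of the next sector


# Tiny demo with an 8-byte "sector" so the scan crosses several sector boundaries.
demo = io.BytesIO(b"x" * 10 + b"A" + b"y" * 20 + b"B")
a_off = scan(demo, 0, b"A", sector_size=8)
b_off = scan(demo, a_off + 1, b"B", sector_size=8)
```

The same `scan` works on a file opened with `open(path, "rb")`; the final check then needs one more `seek` to `2 * b_off - a_off` and a one-byte read.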

Robert Marquardt (Guest)
Posted: Mon Apr 30, 2007 8:11 am
Post subject: Re: Need some suggestions from the experts here, for a fast
desp wrote:
Quote:
    I have a file (around 10 GB in size)
This is the main problem. File I/O will dominate, so familiarize yourself
with CreateFileMapping and MapViewOfFile.
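For illustration, here is a rough sketch of the memory-mapped approach using Python's standard `mmap` module, which on Windows is built on the same file-mapping mechanism. The function name `check_pattern_mapped` is hypothetical. Mapping a 10 GB file in one piece assumes a 64-bit process; a 32-bit process would have to map smaller views in turn.

```python
import mmap


def check_pattern_mapped(path: str, a: bytes, b: bytes, c: bytes) -> bool:
    """Map the file read-only and run the A/B/C distance check on the mapping."""
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
        a_off = m.find(a)                # first A
        if a_off < 0:
            return False
        b_off = m.find(b, a_off + 1)     # first B after that A
        if b_off < 0:
            return False
        c_off = 2 * b_off - a_off        # B + (B - A)
        if c_off >= len(m):              # assumption: no wrap-around at end of file
            return False
        return m[c_off:c_off + 1] == c
```

The operating system pages the file in on demand, so only the parts that are actually touched get read from disk.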

Jonathan Benedicto (Guest)
Posted: Mon Apr 30, 2007 7:10 pm
Post subject: Re: Need some suggestions from the experts here, for a fast
Bob Gonder wrote:
Quote:
    (I seem to remember someone mentioning that 32 KB is the optimal
    read size, rather than 1 MB, but then you'd be limiting yourself to
    128 TB.)
I believe that for NTFS, 256 KB is the optimal read size, with 64 KB coming in
second.
Jon
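Read-size claims like these are easy to check on the actual machine. The sketch below is one hedged way to do that in Python; `time_sequential_read` is a made-up helper name. The timings it returns depend heavily on the disk, the file system and the OS cache (a second pass over the same file is usually served from memory), so treat the results as indicative only.

```python
import time


def time_sequential_read(path: str, chunk_size: int) -> tuple:
    """Read the whole file in chunk_size pieces; return (bytes_read, elapsed_seconds)."""
    total = 0
    start = time.perf_counter()
    # buffering=0 gives a raw file object, so each read() is passed straight to the OS
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total, time.perf_counter() - start
```

Calling it with 32 KB, 64 KB, 256 KB and 1 MB chunk sizes on the real data file would show which size works best on that particular setup.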

JBR (Guest)
Posted: Mon Apr 30, 2007 8:51 pm
Post subject: Re: Need some suggestions from the experts here, for a fast
desp wrote:
Quote:
    Hello, I was wondering if some of you could share your knowledge here.
    I'm trying to write an algorithm to do the following (very) quickly:
    I have a file (around 10 GB in size) containing random letters A->Z (like ASDFSDA).
    If the input is "ABD", I find the first "A", then find the distance from that A
    to the first "B" after it (say 5 chars). I then need to check whether "D" is the
    same distance from B as B is from A (i.e. at B+5 chars).
    What is the best way to go about this? The file is STATIC and constant,
    and will not change.
    Any ideas/theories on how to do this would be greatly appreciated!
Is this in any way related to DNA analysis?

desp (Guest)
Posted: Mon Apr 30, 2007 11:13 pm
Post subject: Re: Need some suggestions from the experts here, for a fast
----- Original Message -----
From: "John Herbster" <herb-sci1_AT_sbcglobal.net>
Newsgroups: borland.public.delphi.language.basm
Sent: Monday, April 30, 2007 12:10 AM
Subject: Re: Need some suggestions from the experts here, for a fast algorithem :)
Quote:
    "desp" <sabrwolf (AT) gmail (DOT) com> wrote:
        I'm trying to write an algorithm to (very) quickly
        do this: I have a file (around 10 GB in size) with
        random letters A->Z (like ASDFSDA). If the input is
        "ABD", I find the first "A", then find the distance
        from that A to the first "B" (say 5 chars). I then need
        to check if "D" is the same distance from B as A to B
        (like B+5 chars). ...
    Is the result to be just true or false, or if not, then
    what results are needed?
    Is the file to be treated as circular?
    --JohnH
The result should be true/false for each query.
No, the file will not be treated as circular.
"Is this in any way related to DNA analysis?"
Nope.
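One consequence of these answers, read literally: if the file never changes and every query starts from the first A in the file and the first B after it, then the outcome depends only on the (A, B) pair, so all 26 x 26 cases can be worked out once. The Python sketch below rests on that reading of the problem, which is an assumption; `build_table` and `query` are illustrative names. It takes `bytes` for simplicity, but a read-only memory map of the real file should work the same way, since it supports the same `find` and indexing.

```python
import string


def build_table(data) -> dict:
    """Map each (a, b) letter pair to the byte found at 2*B - A, or None if there is none."""
    letters = string.ascii_uppercase.encode()
    table = {}
    for a in letters:                                    # a and b are integer byte values
        a_off = data.find(bytes([a]))                    # first A in the data
        for b in letters:
            b_off = data.find(bytes([b]), a_off + 1) if a_off >= 0 else -1
            c_off = 2 * b_off - a_off                    # B + (B - A)
            if b_off >= 0 and c_off < len(data):         # not circular: past the end is a miss
                table[(a, b)] = data[c_off]
            else:
                table[(a, b)] = None
    return table


def query(table: dict, q: str) -> bool:
    """Answer a three-letter query such as "ABD" with a single table lookup."""
    a, b, c = q.encode()
    return table.get((a, b)) == c


table = build_table(b"QAxxxxBxxxxD")
print(query(table, "ABD"))  # True: A at 1, B at 6, and offset 11 holds D
```

If the letters really are roughly uniformly random, the first occurrences should all lie near the start of the file, so building the table would touch only a small part of the 10 GB.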