BorlandTalk.com Forum Index BorlandTalk.com
Borland discussion newsgroups
 
Archives   FAQFAQ   SearchSearch   MemberlistMemberlist   UsergroupsUsergroups   RegisterRegister 
 ProfileProfile   Log in to check your private messagesLog in to check your private messages   Log inLog in 

Folding Editor

 
Post new topic   Reply to topic    BorlandTalk.com Forum Index -> comp.lang.pascal.delphi.misc
View previous topic :: View next topic  
Author Message
Hans-Peter Diettrich
Guest





PostPosted: Thu Jan 19, 2006 4:20 am    Post subject: Folding Editor Reply with quote



Hi there,

after some tries I have found an very simple algorithm for handling
folded (collapsed, hidden, ...) blocks in an editor. Please correct me
if I missed something:

A foldable block is described by start and size (e.g. by line numbers).
All blocks are stored in a list (array), sorted by the starting value.
The block state can be stored in a second array, of the same length.

The most frequently used function maps an row number, e.g. in a control,
to a line number in the underlying file:

The line number is initialized with the row number.
The blocks are examined from lowest to highest one, and for every
collapsed block:
1) the block size (number of hidden lines) is added to the line number, and
2) all hidden blocks (ending before the end of this block) are skipped.
The search stops at the first block, that starts after the actual line
number.

Can it really be so simple?


Unfortunately this algorithm is slow, O(n), and the typical use is for a
sequence of row/line numbers in an OnPaint of a control. Consequently
I'm looking for an faster algorithm, or for an abbreviation that uses
the previously computed line number as the starting point of the next
calculation.

Any ideas?


Possibly some special requirements or extensions should be considered:

A file can be folded in multiple levels, e.g. in the lowest level those
lines are hidden, which are excluded by conditional compilation, and in
a higher level the control structures etc. can be folded. More levels
can occur when included files shall be inlined as foldable blocks...

Multiple views of the same file should be supported, based on the same
block definitions, but with a possibly different visibility. That's why
I consider to separate the block states from the block definitions, with
the additional advantage of a neatly aligned block definition
TBlock = record start, count: integer; end; //8 bytes


More problems will arise in the future, like syncing multiple views when
the file(s) or blocks are changed by editing. I also consider to use the
block definitions for kind of an project explorer, and more useful
features for an IDE. But that's another chapter of the never ending
story, for now I'm only looking for comments or suggestions on the above
algorithm and procedures.

TIA
DoDi
Back to top
Maarten Wiltink
Guest





PostPosted: Thu Jan 19, 2006 9:18 am    Post subject: Re: Folding Editor Reply with quote



"Hans-Peter Diettrich" <DrDiettrich (AT) nowhere (DOT) nix> wrote

[...]
Quote:
A foldable block is described by start and size (e.g. by line numbers).
All blocks are stored in a list (array), sorted by the starting value.
The block state can be stored in a second array, of the same length.

Here, I thought "single array of record", but you covered that below.
The state is per-view, the definition is per-file.


Quote:
The most frequently used function maps an row number, e.g. in a control,
to a line number in the underlying file:

The line number is initialized with the row number.
The blocks are examined from lowest to highest one, and for every
collapsed block:
1) the block size (number of hidden lines) is added to the line number,
and 2) all hidden blocks (ending before the end of this block) are
skipped. The search stops at the first block, that starts after the
actual line number.

Can it really be so simple?

Yes. Sounds correct to me. Not necessarily optimal, but that's for
version 1.1.


Quote:
Unfortunately this algorithm is slow, O(n), and the typical use is for
a sequence of row/line numbers in an OnPaint of a control. Consequently
I'm looking for an faster algorithm, or for an abbreviation that uses
the previously computed line number as the starting point of the next
calculation.

Any ideas?

Probably some scheme to cache the row-to-line mapping on every visible
line (row). With a valid flag, and on-idle precompute.

If you expand a collapsed block, one row will be replaced by N rows (I
guess). Mappings on subsequent rows will need to be adjusted by N-1,
and mappings computed incrementally for the inserted rows. It sounds like
you want quick access to nested blocks, so perhaps a tree would be better
than a flat list.

Do you keep a block pointer with a row that represents a collapsed block?

Groetjes,
Maarten Wiltink



Back to top
Hans-Peter Diettrich
Guest





PostPosted: Thu Jan 19, 2006 8:04 pm    Post subject: Re: Folding Editor Reply with quote



Maarten Wiltink schrieb:

Quote:
Can it really be so simple?


Yes. Sounds correct to me. Not necessarily optimal, but that's for
version 1.1.

Perhaps I'll keep it that simple, with a single array instead of an
tree. The computation should not take too long, so that O(n) may be
acceptable in practice.

Quote:

Unfortunately this algorithm is slow, O(n), and the typical use is for
a sequence of row/line numbers in an OnPaint of a control. Consequently
I'm looking for an faster algorithm, or for an abbreviation that uses
the previously computed line number as the starting point of the next
calculation.

Any ideas?


Probably some scheme to cache the row-to-line mapping on every visible
line (row). With a valid flag, and on-idle precompute.

I consider to cache the local variables from the last call, of for the
TopRow of the control, in the data structure that also contains the
Folded flags array. Then the loop can be initialized with the proper
block index etc., bypassing the processing of all preceding blocks. On
significant changes (scrolling, insertion/deletion of lines) the cache
can be invalidated explicitly. Many more adjustments may be required
when the blocks move due to some action...

Quote:

If you expand a collapsed block, one row will be replaced by N rows (I
guess). Mappings on subsequent rows will need to be adjusted by N-1,
and mappings computed incrementally for the inserted rows. It sounds like
you want quick access to nested blocks, so perhaps a tree would be better
than a flat list.

Can you imagine any other use for the mapping procedure, except the
row-to-line mapping in painting the text, or when a row is selected in
the GUI? These cases IMO will not deserve much optimization, perhaps the
time spent in the maintenance of more complex data structures will slow
down many (other) operations much more?

Quote:

Do you keep a block pointer with a row that represents a collapsed block?

Not now. Some links may be required later, when the defintions and uses
of symbols come into play (hyperlinks...). Then every line in the file
StringList can have an associated object, with more background information.

I intend to restrict the row-to-line mapping to the GUI; access to the
block list is already O(log(n)), based on a given line number. In this
case a tree structure would slow down this lookup, instead of speeding
it up. Similar considerations apply to relative block locations, so that
insertions or deletions of lines would not require to update all the
line numbers in the following blocks - OTOH such an organization would
prevent a binary search in the block list, as is possible in the single
flat block array...


My primary goal is an analyzation and translation tool for projects of
various languages. Restricting file access to read-only will simplify
many procedures considerably, and I'll leave it to other people to
extend the base classes into an fault-tolerant code editor - such tools
are already available. When a project consists of many files, memory
limitations may require to discard and rebuild any detailed data
structures on demand. That's why I want to keep memory usage as low as
possible, to reduce the chance for later bottlenecks or race conditions.


Thanks for your comments :-)

DoDi

Back to top
Maarten Wiltink
Guest





PostPosted: Fri Jan 20, 2006 9:17 am    Post subject: Re: Folding Editor Reply with quote

"Hans-Peter Diettrich" <DrDiettrich (AT) nowhere (DOT) nix> wrote

[...]
Quote:
On significant changes (scrolling, insertion/deletion of lines) the
cache can be invalidated explicitly.

If scrolling becomes a significant change, it's time to go back to
something simpler. Inserting or deleting lines is allowed to be more
significant, but I know from experience that it's very annoying when
it takes too long in a large file.


Quote:
Can you imagine any other use for the mapping procedure, except the
row-to-line mapping in painting the text, or when a row is selected in
the GUI? These cases IMO will not deserve much optimization, perhaps
the time spent in the maintenance of more complex data structures will
slow down many (other) operations much more?

Wherever the actual line number is needed, I'd say. That would be many
places: status bar display, searching, error reporting.

The use of more complex datastructures need not cost more time updating
them, but it should (or you shouldn't use them) result in more efficient
retrieval.


Quote:
[...] every line in the file StringList can have an associated
object, with more background information.

But many languages aren't line-oriented?


Quote:
[...] access to the block list is already O(log(n)), based on a
given line number.

I hadn't realised that. Skipping over blocks inside a hidden block
is still O(n), though.


Quote:
In this case a tree structure would slow down this lookup, instead
of speeding it up.

That's why I asked about the block pointer with every collapsed-block
row. Constant access time.

Groetjes,
Maarten Wiltink



Back to top
Hans-Peter Diettrich
Guest





PostPosted: Sat Jan 21, 2006 10:28 am    Post subject: Re: Folding Editor Reply with quote

Maarten Wiltink schrieb:

Quote:
If scrolling becomes a significant change, it's time to go back to
something simpler. Inserting or deleting lines is allowed to be more
significant, but I know from experience that it's very annoying when
it takes too long in a large file.

Okay, for now I'll leave the implementation open for later optimization.
I only have to take care, that later optimization will be possible,
without changing the base classes.


Quote:
Can you imagine any other use for the mapping procedure, except the
row-to-line mapping in painting the text, or when a row is selected in
the GUI? These cases IMO will not deserve much optimization, perhaps
the time spent in the maintenance of more complex data structures will
slow down many (other) operations much more?


Wherever the actual line number is needed, I'd say. That would be many
places: status bar display, searching, error reporting.

In most of these cases the line number is required, not the row number
in the display. Okay, scrolling to a certain line, e.g. the location of
an error, will require inverse mapping; at the same time it must be
assured that all surrounding blocks are unfolded.

Quote:

The use of more complex datastructures need not cost more time updating
them, but it should (or you shouldn't use them) result in more efficient
retrieval.

ACK.


Quote:
[...] every line in the file StringList can have an associated
object, with more background information.


But many languages aren't line-oriented?

Does this really matter? The display at least is line oriented, so that
everything selectable from, or otherwise related to a line, can be
described in one place.

At least for now I don't intend to collapse columns. There exist other
language features, which suggest the construction of a preprocessed
version of certain files, for display purposes. Such a preprocessing can
consume much time, as can be seen when compiling C code ;-)


Quote:
[...] access to the block list is already O(log(n)), based on a
given line number.


I hadn't realised that. Skipping over blocks inside a hidden block
is still O(n), though.

No, I just found that skipping to the next sibling block will also be
possible in O(log(n)), because this action also is based on invariant
line numbers (the ending line number of every block is known!).


Quote:
In this case a tree structure would slow down this lookup, instead
of speeding it up.


That's why I asked about the block pointer with every collapsed-block
row. Constant access time.

Can you become a bit more concrete? I'm not sure that I understand your
idea.


DoDi

Back to top
Stephen Posey
Guest





PostPosted: Sun Jan 22, 2006 11:41 pm    Post subject: Re: Folding Editor Reply with quote

On Thu, 19 Jan 2006 05:20:07 +0100, Hans-Peter Diettrich
<DrDiettrich (AT) nowhere (DOT) nix> wrote:

Quote:
Hi there,

after some tries I have found an very simple algorithm for handling
folded (collapsed, hidden, ...) blocks in an editor. Please correct me
if I missed something:

A foldable block is described by start and size (e.g. by line numbers).
All blocks are stored in a list (array), sorted by the starting value.
The block state can be stored in a second array, of the same length.

The most frequently used function maps an row number, e.g. in a control,
to a line number in the underlying file:

The line number is initialized with the row number.
The blocks are examined from lowest to highest one, and for every
collapsed block:
1) the block size (number of hidden lines) is added to the line number, and
2) all hidden blocks (ending before the end of this block) are skipped.
The search stops at the first block, that starts after the actual line
number.

Can it really be so simple?

<snipz>

Have you given any thought to how you'll track changes to the sections
(insertions and deletions)?

That can change your block lengths in arbitrary and unpredictable
ways.

Realize you'll at least have to handle insertions from the keyboard
and the clipboard; possibly also drag and drop.

Deletions can occur from the keyboard or as part of clipboard
operations; or due to drag and drop as well.

Stephen Posey
[email]slposey (AT) concentric (DOT) net[/email]


Back to top
Hans-Peter Diettrich
Guest





PostPosted: Mon Jan 23, 2006 12:32 pm    Post subject: Re: Folding Editor Reply with quote

Stephen Posey schrieb:

Quote:
Have you given any thought to how you'll track changes to the sections
(insertions and deletions)?

Yes, of course, even if in the first implementation the files are
read-only, for simplicity.


Quote:
That can change your block lengths in arbitrary and unpredictable
ways.

Realize you'll at least have to handle insertions from the keyboard
and the clipboard; possibly also drag and drop.

The number of inserted or deleted lines can be used immediately, to
update the block descriptors. When the length of an block becomes
negative, the block can be deleted immediately, or it must be skipped in
the evaluation of the block list.

With automatic creation of blocks, e.g. for every begin-end pair in
source code, a rescan of the affected blocks or even larger parts of the
text may be required. With a chance for unpaired begins or ends...

Syntax highlighting requires to break down the displayed text into
tokens. Currently I do this on the fly, whenever a line is displayed,
with a simple line-oriented scanner, that recognizes comments, literals,
identifiers, keywords and "other" tokens. This scanner could add
information about possible block anchors to every line, e.g. for
"begin", "end", "procedure". When that information changes, a rescan of
the text is required.

More updates are required for the hyperlinks, i.e. for jumps to the
definition of a symbol, and back again. Here both the position of the
links as well as the position of the targets can be affected by changes
to the text. Since Pascal is quite easy to parse, it may be possible to
rescan an entire file in the background, after every change. This is
what I have in mind, when I say that I want to keep the structural
information simple and compact, so that memory management (creation and
destruction of objects...) can be kept at a minimum. Sourcecode of other
languages is not so fast to parse, and so I decided to disallow changes
to the text, for now. Instead I'll concentrate on the implementation of
the always required functionality, so that other people can extend that
framework as they like, without reinventing the wheel.

With regards to a model-view-controller (MVC) classification, I'm
implementing a viewer component, and a basic model for the handling of
files and symbol definitions, sufficient to make the viewer work. The
remaining tasks, like the handling of changes to the source text, IMO
falls into the controller branch of the MVC model, and doesn't affect
the other branches.

DoDi


Back to top
Maarten Wiltink
Guest





PostPosted: Mon Jan 23, 2006 12:42 pm    Post subject: Re: Folding Editor Reply with quote

"Hans-Peter Diettrich" <DrDiettrich (AT) nowhere (DOT) nix> wrote

Quote:
Maarten Wiltink schrieb:

[...] every line in the file StringList can have an associated
object, with more background information.

But many languages aren't line-oriented?

Does this really matter? The display at least is line oriented, so that
everything selectable from, or otherwise related to a line, can be
described in one place.

I don't know if it would matter. As a matter of problem-solving style,
my first attempt would be to associate objects with the source at the
same granularity as the object's scope. There would just seem to be a
mismatch hanging data off lines when they apply to part of the line.
Perhaps even several lines.

I would expect to introduce new problems that require working around. I
don't mind solving hard problems when they're intrinsic. I do mind
everything I have to solve that could have been avoided.


Quote:
[...] access to the block list is already O(log(n)), based on a
given line number.

I hadn't realised that. Skipping over blocks inside a hidden block
is still O(n), though.

No, I just found that skipping to the next sibling block will also be
possible in O(log(n)), because this action also is based on invariant
line numbers (the ending line number of every block is known!).

But they aren't sorted in the primary block list, are they? Does that
mean you need an extra list, sorted in ending lines?


[...]
Quote:
That's why I asked about the block pointer with every collapsed-block
row. Constant access time.

Can you become a bit more concrete? I'm not sure that I understand your
idea.

Assuming you have the primary block list with starting and ending lines,
sorted on starting line, and you want to go from a row number to a line
number, you need to look it up in the block list. This is an O(log n)
operation, a binary search.

If you store the list index of the closest surrounding block with every
line, it becomes O(1). I said it of rows representing collapsed blocks,
but it applies to all rows.

Groetjes,
Maarten Wiltink



Back to top
Hans-Peter Diettrich
Guest





PostPosted: Mon Jan 23, 2006 1:19 pm    Post subject: Re: Folding Editor Reply with quote

Now I have some problems with the base class(es) for a folding
viewer/editor component.

The component shall allow to display multiple files, just like the
Delphi IDE does. Consequently the actually displayed contents should be
separated from the component itself, so that the contents can be
exchanged whenever the user selects an different file, from the Tab
control or by following a hyperlink. For this purpose I've designed a
TLines class, which not only provides the text to the component, but
which also stores the state of the viewer (caret position...), so that
with the activation of an different content also the viewer state can be
restored to the last state of that content.

My current problems are the internals and the construction of an
instance of TLines, which is independent from a specific content
(file...) class, so that any application can implement the content
management as desired. How would you manage, that the basic TLines class
will be independent from other content classes, and how to determine
which concrete TLines subclass shall be used for the constrution of such
an object?

DoDi
Back to top
Maarten Wiltink
Guest





PostPosted: Mon Jan 23, 2006 2:06 pm    Post subject: Re: Folding Editor Reply with quote

"Hans-Peter Diettrich" <DrDiettrich (AT) nowhere (DOT) nix> wrote


Quote:
[...] I've designed a TLines class, ...

My current problems are the internals and the construction of an
instance of TLines, which is independent from a specific content
(file...) class, so that any application can implement the content
management as desired. How would you manage, that the basic TLines
class will be independent from other content classes, and how to
determine which concrete TLines subclass shall be used for the
constrution of such an object?

Does TLines derive from TStrings?

Is it possible to tell a TMemo what TStrings subclass to use for its
Lines property?

Groetjes,
Maarten Wiltink



Back to top
Display posts from previous:   
Post new topic   Reply to topic    BorlandTalk.com Forum Index -> comp.lang.pascal.delphi.misc All times are GMT
Page 1 of 1

 
Jump to:  
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum


Powered by phpBB © 2001, 2006 phpBB Group
SEO toolkit © 2004-2006 webmedic.