 |
BorlandTalk.com Borland discussion newsgroups
|
| View previous topic :: View next topic |
| Author |
Message |
Phil Manger Guest
|
Posted: Tue Dec 19, 2006 2:20 am Post subject: AnsiString.c_str() |
|
|
I recently encountered a problem with the AnsiString.c_str() function
that proved all the more frustrating because I couldn't reproduce it in
experimental application.
Here's what I was doing: I had derived a component from
TCustomMaskEdit, and one of the things I want to do within the derived
component is check EditText to see if it contains nothing but mask
characters and underscores. I figured the easiest way to do this was to
put the characters I was looking for in a set, and then assign the
pointer returned from c_str() to a char* pointer, and then check each
character one-by-one by incrementing the pointer to the end of the string.
What I was getting, through, was garbage. I ran it through the
debugger, watching EditText and *ptr, and although EditText always
displayed the correct text, *ptr would return some strange character.
I know AnsiString uses reference counting (actually, one of its more
desirable features) and it is therefore dangerous to try to use a
pointer returned by c_str() to manipulate its contents, but I figured as
long as I was just looking and not touching there would be no problem.
Especially since the loop to evaluate the individual characters
immediately followed the call to c_str() and there was no time for
EditText to go off and do whatever it is it does.
Incidentally, I've used pointers returned by c_str() to check characters
dozens of times before without any problem. I'm wondering if in this
case it was because I was dealing with a VCL object.
I could not reproduce the problem in another application. I found a
workaround, but it seems like a waste of memory and coding effort. The
workaround is to allocate a buffer large enough to hold the string, and
then copy it into the buffer. The pointer returned the expected values
when I did that.
Phil |
|
| Back to top |
|
 |
Chris Uzdavinis (TeamB) Guest
|
Posted: Tue Dec 19, 2006 3:20 am Post subject: Re: AnsiString.c_str() |
|
|
Phil Manger <pmanger (AT) comcast (DOT) net> writes:
| Quote: | I know AnsiString uses reference counting (actually, one of its more
desirable features) and it is therefore dangerous to try to use a
pointer returned by c_str() to manipulate its contents, but I figured
as long as I was just looking and not touching there would be no
problem. Especially since the loop to evaluate the individual
characters immediately followed the call to c_str() and there was no
time for EditText to go off and do whatever it is it does.
|
I think you're calling c_str() on a temporary AnsiString, which is
created to hold the results, but is the only reference to the memory.
You extract the pointer, then the AnsiString goes out of scope, the
data is deleted, and your pointer is left pointing to garbage.
Instead, you need to keep a reference to the AnsiString, so declare a
local AnsiString which is initialized to EditText, and then you can
use a pointer into it, since your local AnsiString will ensure the
memory isn't deleted.
--
Chris (TeamB); |
|
| Back to top |
|
 |
Old Wolf Guest
|
Posted: Tue Dec 19, 2006 3:46 am Post subject: Re: AnsiString.c_str() |
|
|
Phil Manger <pmanger (AT) comcast (DOT) net> wrote:
| Quote: | I know AnsiString uses reference counting (actually, one of
its more desirable features)
|
Well, ref counting makes them not be thread-safe, and they can
be slower at runtime because when you change one, it has to
check whether to 'fork' or not. So these days most people
consider it a disadvantage.
| Quote: | and it is therefore dangerous to try to use a pointer
returned by c_str() to manipulate its contents,
|
You should treat that pointer as if it were "const char *" ,
ie. only use it for reading.
| Quote: | but I figured as long as I was just looking and not touching
there would be no problem. Especially since the loop to
evaluate the individual characters immediately followed the
call to c_str() and there was no time for EditText to go off
and do whatever it is it does.
|
If your code was something like:
char const *ptr = foo->EditText.c_str();
then you are in trouble. In their wisdom, Borland made things
like EditText secretly translate into function calls which
return the AnsiString by value. So you are actually pointing
into a temporary object which only exists for the duration of
the above declaration. As soon as you go onto the next line,
the string no longer exists.
It isn't possible to point directly into the text of the edit
control, or whatever it is.
If you had been compiling with CodeGuard it may have told you
that you had accessed invalid memory at this point.
You can "hold on" to the temporary string like this:
AnsiString const &text = foo->EditText;
which will give the temporary string a name, so it will
continue to exist until that name goes out of scope. |
|
| Back to top |
|
 |
Chris Uzdavinis (TeamB) Guest
|
Posted: Tue Dec 19, 2006 4:23 am Post subject: Re: AnsiString.c_str() |
|
|
"Old Wolf" <oldwolf (AT) inspire (DOT) net.nz> writes:
| Quote: | You can "hold on" to the temporary string like this:
AnsiString const &text = foo->EditText;
|
This is true, but it just feels safer to make a copy. Especially
considering that a copying refcounted object is extremely lightweight.
AnsiString text = foo->EditText;
char const * ptr = text.c_str();
// use ptr safely
I'm not sure I'm comfortable storing a reference to the temporary to
extend its life. It should work, but compilers have varying levels of
conformance and this seems like an area that might not be as reliable
as a copy-constructed local variable. (Then again, BCB may handle it
perfectly fine, but knowing how it handles some other
reference-implementation details in a questionable manner (like
allowing a temporary to be bound to a non-const reference) it just
feels brittle.
--
Chris (TeamB); |
|
| Back to top |
|
 |
Old Wolf Guest
|
Posted: Tue Dec 19, 2006 4:46 am Post subject: Re: AnsiString.c_str() |
|
|
Chris Uzdavinis (TeamB) wrote:
| Quote: | "Old Wolf" <oldwolf (AT) inspire (DOT) net.nz> writes:
You can "hold on" to the temporary string like this:
AnsiString const &text = foo->EditText;
This is true, but it just feels safer to make a copy.
Especially considering that a copying refcounted object is
extremely lightweight.
|
Retaining the original object is even more lightweight than
creating a second object and destroying the original.
If the compiler supports RVO then it might do the right thing
either way, but as you point out later, I wouldn't trust
Borland to do that :)
| Quote: | AnsiString text = foo->EditText;
char const * ptr = text.c_str();
// use ptr safely
I'm not sure I'm comfortable storing a reference to the
temporary to extend its life.
It should work, but compilers have varying levels of
conformance and this seems like an area that might not be as
reliable as a copy-constructed local variable. (Then again,
BCB may handle it perfectly fine, but knowing how it handles
some other reference-implementation details in a questionable
manner it just feels brittle.
|
Well, properties are a Borland extension so there are no other
compilers to refer to
I'm sure the compiler implements naming of temporaries
correctly when it is returned from a function call, but it is
conceivable that they muck it up with __properties, like how
they mucked up +=.
You could probably test it by looking at the generated assembly,
but that isn't a guarantee either, viz. the bug where
destructors can be called twice for array __properties.
| Quote: | (like allowing a temporary to be bound to a non-const reference)
|
Probably done because MSVC allowed that, and I suppose Borland
want to emulate MSVC as close as possible so that it's easier
to get transition customers. |
|
| Back to top |
|
 |
Phil Manger Guest
|
Posted: Tue Dec 19, 2006 8:04 am Post subject: Re: AnsiString.c_str() |
|
|
Old Wolf wrote:
| Quote: |
If your code was something like:
char const *ptr = foo->EditText.c_str();
then you are in trouble. In their wisdom, Borland made things
like EditText secretly translate into function calls which
return the AnsiString by value. So you are actually pointing
into a temporary object which only exists for the duration of
the above declaration. As soon as you go onto the next line,
the string no longer exists.
|
That was exactly what my code looked like. And I'm now kicking myself
for forgetting that "properties" are really function calls in disguise
and that a property like EditText is really the return value of a
function like GetEditText() or whatever it's called. That said however,
since AnsiStrings count references, and EditText is (according to
Borland's documentation) a value and not a reference, I would assume
that the return value shares the "real" data buffer with GetEditText(),
and even though the return value of EditText is freed once we grab the
data pointer, the underlying data buffer is not.
| Quote: |
If you had been compiling with CodeGuard it may have told you
that you had accessed invalid memory at this point.
|
I did compile with CodeGuard, and that's what it told me (actually it
told me I was trying to access memory that had been freed), but I could
not figure out why.
| Quote: |
You can "hold on" to the temporary string like this:
AnsiString const &text = foo->EditText;
which will give the temporary string a name, so it will
continue to exist until that name goes out of scope.
|
What I actually did is this:
char *buf;
buf = new char [EditText.Length() + 1];
strcpy(buf, EditText.c_str());
Borland says to do this with an AnsiString, but what about an AnsiString
that is a "property" of a vcl class? I haven't looked at the internal
workings of strcpy(), but since it's standard Ansi C, I'd presume it
just does something like
strcpy(char *dst, char *src) {
ptr1 = src;
ptr2 = dst;
while (*ptr1) *ptr2++ = *ptr1++;
}
If so, wouldn't this lead to the same problem of accessing memory that
had been freed?
Looking over my code in this and other projects, I noticed I've made
this mistake before. I guess I'd better go back and fix things before
something else blows up.
Phil |
|
| Back to top |
|
 |
Old Wolf Guest
|
Posted: Tue Dec 19, 2006 8:40 am Post subject: Re: AnsiString.c_str() |
|
|
Phil Manger <pmanger (AT) comcast (DOT) net> wrote:
| Quote: | Old Wolf wrote:
You can "hold on" to the temporary string like this:
AnsiString const &text = foo->EditText;
which will give the temporary string a name, so it will
continue to exist until that name goes out of scope.
What I actually did is this:
char *buf;
buf = new char [EditText.Length() + 1];
strcpy(buf, EditText.c_str());
wouldn't this lead to the same problem of accessing memory that
had been freed?
|
No -- the temporary object created when you access the property,
persists until the end of the statement that it's in. So it is
still alive when you are in the strcpy.
The only possible gripe I'd have here is that EditText might
change in between the two calls (hopefully it wouldn't, but
with a function call you never know).
I advise against the code you posted just now: because manual
memory management introduces problems you don't need (what if
you forget to call delete[], or calling EditText.c_str() throws
an exception?)
It is simplest, as Chris stated, to make a new AnsiString in the
local scope so that it will exist for longer.
In general, don't use "new char[]". use std::vector or
boost::array instead. They automatically allocate and free
their memory, so they can't go wrong. |
|
| Back to top |
|
 |
Remy Lebeau (TeamB) Guest
|
Posted: Tue Dec 19, 2006 8:43 am Post subject: Re: AnsiString.c_str() |
|
|
"Phil Manger" <pmanger (AT) comcast (DOT) net> wrote in message
news:458748BE.4000202 (AT) comcast (DOT) net...
| Quote: | I'm now kicking myself for forgetting that "properties" are really
function
calls in disguise
|
Not always.
| Quote: | and that a property like EditText is really the return value of a
function like GetEditText() or whatever it's called.
|
That is where documentation and header files come into play. EditText is
documented as the following declaration:
__property AnsiString EditText = {read=GetEditText, write=SetEditText};
| Quote: | That said however, since AnsiStrings count references, and EditText is
(according to Borland's documentation) a value and not a reference, I
would assume that the return value shares the "real" data buffer with
GetEditText()
|
You assume incorrectly. Most AnsiString properties that use getter methods
do so because they have to make a fresh copy of the underlying data each
time. For edit controls (including TMaskEdit), that means issuing the
WM_GETTEXT message to the control's window. Which means that the resulting
AnsiString always has a reference count of 1 because it is not sharing an
internal AnsiString, but is a new AnsiString being constructoed from an
internal OS buffer.
| Quote: | even though the return value of EditText is freed once we grab the
data pointer, the underlying data buffer is not.
|
That is almost never the case. In most cases, the AnsiString that is
returned is holding an original buffer that contains a copy of the data.
Which is why it is usually dangerous to hold on to a char* pointer from a
call to c_str() on a temporary AnsiString - the buffer itself is just as
temporary as the AnsiString to wraps it.
| Quote: | What I actually did is this:
char *buf;
buf = new char [EditText.Length() + 1];
strcpy(buf, EditText.c_str());
|
You don't need to allocate a new buffer. Simply use an AnsiString
(non-reference) variable to increment the original buffer's reference count:
AnsiString buf = EditText;
| Quote: | Borland says to do this with an AnsiString, but what about an AnsiString
that is a "property" of a vcl class?
snip
If so, wouldn't this lead to the same problem of accessing memory that
had been freed?
|
No, because the AnsiString that you are copying the data from has not been
freed yet. You are accessing the c_str() pointer in the same statement that
is calling strcpy(). The temporary AnsiString that c_str() is being called
on is still in scope until the statement is finished. Had you called
c_str() in strcpy() in separate statements, then you would be accessing
invalid memory again (unless you assign the temporary AnsiString to a local
AnsiString to keep the data in scope).
Gambit |
|
| Back to top |
|
 |
Phil Manger Guest
|
Posted: Tue Dec 19, 2006 8:44 am Post subject: Re: AnsiString.c_str() |
|
|
Phil Manger wrote:
| Quote: |
What I actually did is this:
char *buf;
buf = new char [EditText.Length() + 1];
strcpy(buf, EditText.c_str());
Borland says to do this with an AnsiString, but what about an AnsiString
that is a "property" of a vcl class? I haven't looked at the internal
workings of strcpy(), but since it's standard Ansi C, I'd presume it
just does something like
strcpy(char *dst, char *src) {
ptr1 = src;
ptr2 = dst;
while (*ptr1) *ptr2++ = *ptr1++;
}
If so, wouldn't this lead to the same problem of accessing memory that
had been freed?
|
The following is taken directly from Borland's on-line documentation:
| Quote: | If you need a persistent pointer, you MUST copy the string into its own buffer:
char* cp = new char[ Edit1->Text.Length() + 1 ];
strcpy( cp, Edit1->Text.c_str() );
|
It's not stated explicitly in the example, but presumably Edit1 is a
TEdit object, which would make Text a property, i.e., the return value
of a function.
Anybody have any ideas on this? I'm worried now that my fix might not
have fixed anything, except superficially.
Phil |
|
| Back to top |
|
 |
Phil Manger Guest
|
Posted: Tue Dec 19, 2006 9:10 am Post subject: Re: AnsiString.c_str() |
|
|
Remy Lebeau (TeamB) wrote:
| Quote: |
That is where documentation and header files come into play. EditText is
documented as the following declaration:
__property AnsiString EditText = {read=GetEditText, write=SetEditText};
|
Thanks for the reminder...if I'd looked at the documentation recently
I'd have remembered that. And you're right: not all properties are
disguised functions.
| Quote: | You don't need to allocate a new buffer. Simply use an AnsiString
(non-reference) variable to increment the original buffer's reference count:
AnsiString buf = EditText;
Yes, I can see that since we're just incrementing the reference count, |
this would involve less overhead than copying the characters into a buffer.
Thanks all for enlightening me.
Phil |
|
| Back to top |
|
 |
Phil Manger Guest
|
Posted: Tue Dec 19, 2006 9:10 am Post subject: Re: AnsiString.c_str() |
|
|
Remy Lebeau (TeamB) wrote:
| Quote: |
That is where documentation and header files come into play. EditText is
documented as the following declaration:
__property AnsiString EditText = {read=GetEditText, write=SetEditText};
|
Thanks for the reminder...if I'd looked at the documentation recently
I'd have remembered that. And you're right: not all properties are
disguised functions.
| Quote: | You don't need to allocate a new buffer. Simply use an AnsiString
(non-reference) variable to increment the original buffer's reference count:
AnsiString buf = EditText;
Yes, I can see that since we're just incrementing the reference count, |
this would involve less overhead than copying the characters into a buffer.
Thanks all for enlightening me.
Phil |
|
| Back to top |
|
 |
Phil Manger Guest
|
Posted: Tue Dec 19, 2006 9:10 am Post subject: Re: AnsiString.c_str() |
|
|
Remy Lebeau (TeamB) wrote:
| Quote: |
That is where documentation and header files come into play. EditText is
documented as the following declaration:
__property AnsiString EditText = {read=GetEditText, write=SetEditText};
|
Thanks for the reminder...if I'd looked at the documentation recently
I'd have remembered that. And you're right: not all properties are
disguised functions.
| Quote: | You don't need to allocate a new buffer. Simply use an AnsiString
(non-reference) variable to increment the original buffer's reference count:
AnsiString buf = EditText;
Yes, I can see that since we're just incrementing the reference count, |
this would involve less overhead than copying the characters into a buffer.
Thanks all for enlightening me.
Phil |
|
| Back to top |
|
 |
Remy Lebeau (TeamB) Guest
|
Posted: Tue Dec 19, 2006 9:10 am Post subject: Re: AnsiString.c_str() |
|
|
"Phil Manger" <pmanger (AT) comcast (DOT) net> wrote in message
news:45875208.8020707 (AT) comcast (DOT) net...
| Quote: | The following is taken directly from Borland's on-line documentation:
|
Borland's examples tend to have issues/bugs in them more times than not.
| Quote: | char* cp = new char[ Edit1->Text.Length() + 1 ];
strcpy( cp, Edit1->Text.c_str() );
|
That is not an efficient use of the Text property. Because the property is
being queried twice, temporary memory is being allocated and freed twce in
addition to the memory you are allocating yourself. At the very least, the
code should be using the GetTextLen() method instead of querying the Text
property twice:
char* cp = new char[ Edit1->GetTextLen() + 1 ];
strcpy( cp, Edit1->Text.c_str() );
//...
delete[] cp;
Even better, eliminate the Text property altogether and use GetTextBuf()
directly instead (which the Text property uses internally):
int len = Edit1->GetTextLen();
char* cp = new char[ len + 1 ];
Edit1->GetTextBuf( cp, len + 1 );
//...
delete[] cp;
But, since the above is EXACTLY what the Text property does internally,
there is no point in allocating your own buffer when the VCL can do it for
you automatically.
| Quote: | It's not stated explicitly in the example, but presumably Edit1 is a
TEdit object, which would make Text a property, i.e., the return value
of a function.
|
That is exactly what is happening.
Gambit |
|
| Back to top |
|
 |
Chris Uzdavinis (TeamB) Guest
|
Posted: Tue Dec 19, 2006 8:47 pm Post subject: Re: AnsiString.c_str() |
|
|
Phil Manger <pmanger (AT) comcast (DOT) net> writes:
| Quote: | The following is taken directly from Borland's on-line documentation:
If you need a persistent pointer, you MUST copy the string into its
own buffer: char* cp = new char[ Edit1->Text.Length() + 1 ];
strcpy( cp, Edit1->Text.c_str() );
|
That's a bad example, and bad advice in their documentation.
Especially since "MUST" is in capital letters, as if there are no
other choices.
We could quibble over the condition that introduces the example, "if
you need a PERSISTENT pointer...", and then you know it must be heap
allocated, but it could still be managed by an AnsiString object, or
an std::string, which both heap allocate and manage the memory for
you. But if you store a pointer into an object, you have to ensure
the object's lifetime is at least as long as the pointer you're
storing, and that you don't do operations on the string that might
invalidate the pointer.
| Quote: | It's not stated explicitly in the example, but presumably Edit1 is a
TEdit object, which would make Text a property, i.e., the return value
of a function.
Anybody have any ideas on this? I'm worried now that my fix might not
have fixed anything, except superficially.
|
using new[] isn't wrong, it's just a lower-level and error prone
solution that has better alternatives. If an exception is thrown
before you delete[] the pointer, for example, it will be leaked. It's
also easy to accidently use delete instead of delete[] on it, or do
other things to mismanage the pointer.
--
Chris (TeamB); |
|
| Back to top |
|
 |
|
|
You cannot post new topics in this forum You cannot reply to topics in this forum You cannot edit your posts in this forum You cannot delete your posts in this forum You cannot vote in polls in this forum
|
|