BorlandTalk.com Forum Index BorlandTalk.com
Borland discussion newsgroups
 

How does a www server know it's not a web client?

 
Forum: Delphi Internet Winsock

Vincent Delporte
Posted: Sun Aug 06, 2006 8:10 am    Post subject: How does a www server know it's not a web client?



Hi

I naively used Indy's Get() to download the contents of a web
site and parse out its data, but the server could tell it was a script
doing the navigation instead of a web browser.

How does a server tell the difference? Is it because a browser sends
information such as the PC's OS, screen resolution, etc.? Is there a
way for Indy or any other Delphi-friendly Internet component to fake a
web client?

Thank you.

Francois PIETTE [ICS - MidWare]
Posted: Sun Aug 06, 2006 2:08 pm    Post subject: Re: How does a www server know it's not a web client?



Quote:
I naively used Indy's Get() to download the contents of a web
site and parse out its data, but the server could tell it was a script
doing the navigation instead of a web browser.

How does a server tell the difference? Is it because a browser sends
information such as the PC's OS, screen resolution, etc.? Is there a
way for Indy or any other Delphi-friendly Internet component to fake a
web client?

You have to reproduce exactly the same request that a browser sends. Pay
attention to header lines and cookies. Use a sniffer or other spy tool (such
as SocketSpy http://www.overbyte.Be, follow UserMade link and then search
SocketSpy) to see what request your browser really sends and build the same.
Cookies are a two-step process: you go to some page, grab the cookie and
then send the cookie back for other pages.

Contribute to the SSL Effort. Visit http://www.overbyte.be/eng/ssl.html
--
francois.piette (AT) overbyte (DOT) be
The author for the freeware multi-tier middleware MidWare
The author of the freeware Internet Component Suite (ICS)
http://www.overbyte.be
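
The two-step cookie process described in the post above can be sketched as follows. This is only an illustration in Python using the standard library, not Indy or ICS code; the helper name cookie_header_from_set_cookie and the sample cookie values are hypothetical. It takes the Set-Cookie values a server returned for the first page and builds the Cookie header value to send with later requests.

```python
from http.cookies import SimpleCookie


def cookie_header_from_set_cookie(set_cookie_values):
    """Turn Set-Cookie header values into a single Cookie header value.

    set_cookie_values: list of raw Set-Cookie strings received from the
    first page (step 1). The return value is what a client would send
    back in the Cookie header on later requests (step 2).
    """
    jar = SimpleCookie()
    for raw in set_cookie_values:
        # load() parses "name=value; Path=/; ..." and keeps the attributes
        # (Path, HttpOnly, ...) separate from the name/value pair.
        jar.load(raw)
    # Only name=value pairs go back to the server, joined with "; ".
    return "; ".join(f"{name}={morsel.value}" for name, morsel in jar.items())
```

For example, passing ["session=abc123; Path=/; HttpOnly"] yields the string to place after "Cookie: " in the next request. A real client would also have to honour the cookie's path, domain, and expiry, which this sketch ignores.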

Lucian
Posted: Sun Aug 06, 2006 7:04 pm    Post subject: Re: How does a www server know it's not a web client?



Quote:
tool (such as SocketSpy http://www.overbyte.Be, follow UserMade link
and then search SocketSpy)

It seems TWSocketServer is missing. Could you make the compiled exe
available instead?


regards,

Lucian

Francois PIETTE [ICS - MidWare]
Posted: Sun Aug 06, 2006 7:12 pm    Post subject: Re: How does a www server know it's not a web client?

Quote:
tool (such as SocketSpy http://www.overbyte.Be, follow UserMade link
and then search SocketSpy)

It seems TWSocketServer is missing. Could you make the compiled exe
available instead?

TWSocketServer is one of the ICS components. Download ICS.ZIP and install it.
It takes only a few minutes: just unzip ICS.ZIP in a directory of your choice
(respect the directory structure in the zip when unzipping), add
subdirectory vc32 to your Delphi search path, locate IcsDel70.dpk (or
whatever number corresponds to your Delphi version), build it and install
it. Then reopen the SocketSpy project and build it.


Lucian
Posted: Sun Aug 06, 2006 7:29 pm    Post subject: Re: How does a www server know it's not a web client?

Quote:
TWSocketServer is one of the ICS components. Download ICS.ZIP and
install it. It takes only a few minutes: just unzip ICS.ZIP in a

I downloaded OverbyteIcsV6beta.zip. Is that it?

Lucian

Francois PIETTE [ICS - MidWare]
Posted: Sun Aug 06, 2006 7:36 pm    Post subject: Re: How does a www server know it's not a web client?

Quote:
TWSocketServer is one of the ICS components. Download ICS.ZIP and
install it. It takes only a few minutes: just unzip ICS.ZIP in a

I downloaded OverbyteIcsV6beta.zip. Is that it?

You downloaded the version 6 beta.
I think SocketSpy was built with the released version 5. The file is ICS.ZIP.
The link is in red in the middle of the ICS page and is labelled "Download the
latest ICS-V5 Distribution". You can't miss it!


Lucian
Posted: Sun Aug 06, 2006 7:40 pm    Post subject: Re: How does a www server know it's not a web client?

Ok, the thing compiles here. How do I use it to see the headers sent
from my IE? Basically I am trying to automatically check some web-sites
for newer versions of some pdf files. One specific website uses https.
I want to see exactly what the request looks like, because IE can download
the file without a problem, but I fail using Indy.

regards,

Lucian

Francois PIETTE [ICS - MidWare]
Posted: Sun Aug 06, 2006 7:58 pm    Post subject: Re: How does a www server know it's not a web client?

Quote:
Ok, the thing compiles here. How do I use it to see the headers sent
from my IE?

Run SocketSpy on the same computer as IE and configure IE to use
127.0.0.1 as its proxy.
Configure SocketSpy to listen on port 80 and to connect to the host you
give in the URL in IE.
When IE sends its request, it connects to SocketSpy and sends the request there.
SocketSpy displays the request in a form (you can easily grab the displayed
data and write it to a file should you want to examine it further).
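
The idea of pointing the browser at a local listener to see what it really sends can be sketched as below. This is not SocketSpy: it is a minimal Python illustration that accepts a single connection, returns the raw request head as text, and does not forward anything to the real host. The function names open_listener and capture_one_request are hypothetical.

```python
import socket


def open_listener(host="127.0.0.1", port=0):
    """Create a listening TCP socket; port=0 lets the OS pick a free port."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen(1)
    return listener


def capture_one_request(listener, max_bytes=65536):
    """Accept one connection and return the raw request line and headers."""
    conn, _addr = listener.accept()
    with conn:
        data = b""
        # The request head ends at the first empty line (CRLF CRLF).
        while b"\r\n\r\n" not in data and len(data) < max_bytes:
            chunk = conn.recv(4096)
            if not chunk:
                break
            data += chunk
        # Answer with an empty response so the client does not hang.
        conn.sendall(b"HTTP/1.0 204 No Content\r\n\r\n")
    # ISO-8859-1 maps every byte to a character, so decoding never fails.
    return data.decode("iso-8859-1")
```

With the browser's proxy set to 127.0.0.1 and the port chosen for the listener, printing the returned text shows the request line, the User-Agent, any cookies, and the other header lines that a scripted request would need to reproduce.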


Quote:
Basically I am trying to automatically check some web-sites
for newer versions of some pdf files. One specific website uses https.

HTTPS is another problem. You need an HTTPS-enabled HTTP component. Both Indy
and ICS have one.
But spying on the HTTPS connection will not reveal anything, since it is
encrypted!


Jamie Dale
Posted: Sun Aug 06, 2006 8:42 pm    Post subject: Re: How does a www server know it's not a web client?

Actually, I thought it was a matter of the header identifying the client
type, like this:
Sent: User-Agent: Mozilla/3.0 (compatible; Indy Library)


That's what my IdHTTP component sends in the header, and I never get any
problems.

I'm pretty sure it's the User-Agent that you need to specify. That's
generally the main way that web servers determine which client is
connected to them via HTTP. Screen size, OS, and other details are generally
handled through cookies, and for that reason cookies can be disabled so that
you do not report this information back. The only thing generally identified
to web servers is the User-Agent.

I'd suggest you try looking into this.

Jamie

"Francois PIETTE [ICS - MidWare]" <francois.piette (AT) overbyte (DOT) be> wrote in
message news:44d5b1f2$1 (AT) newsgroups (DOT) borland.com...
Quote:
I naively used Indy's Get() to download the contents of a web
site and parse out its data, but the server could tell it was a script
doing the navigation instead of a web browser.

How does a server tell the difference? Is it because a browser sends
information such as the PC's OS, screen resolution, etc.? Is there a
way for Indy or any other Delphi-friendly Internet component to fake a
web client?

You have to reproduce exactly the same request that a browser sends. Pay
attention to header lines and cookies. Use a sniffer or other spy tool
(such as SocketSpy http://www.overbyte.Be, follow UserMade link and then
search SocketSpy) to see what request your browser really sends and build
the same. Cookies are a two-step process: you go to some page, grab the
cookie and then send the cookie back for other pages.




Vincent Delporte
Posted: Mon Aug 07, 2006 2:45 am    Post subject: Re: How does a www server know it's not a web client?

On Sun, 6 Aug 2006 11:08:07 +0200, "Francois PIETTE [ICS - MidWare]"
<francois.piette (AT) overbyte (DOT) be> wrote:
Quote:
You have to reproduce exactly the same request that a browser sends.

OK, it's too much work for that project. I'll put that on the
backburner. Thanks for the info.

Remy Lebeau (TeamB)
Posted: Mon Aug 07, 2006 11:45 pm    Post subject: Re: How does a www server know it's not a web client?

"Vincent Delporte" <justask (AT) acme (DOT) com> wrote in message
news:2b5bd2pg9v19j6268dq8vppveokvmhgo7g (AT) 4ax (DOT) com...

Quote:
How does a server tell the difference?

Web browsers identify themselves by including the 'User-Agent' header in
their requests. You can use TIdHTTP's Request.UserAgent property to mimic
the ID of any web browser. Programs like GetRight do this, for instance.
Some web servers provide browser-specific content, so they look for the
'User-Agent' header.


Gambit
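
As a language-neutral illustration of the point above, here is a minimal Python sketch that sets a browser-like 'User-Agent' header on a request before it is sent. It is not Indy code; in Indy the post's Request.UserAgent property plays the same role. The function name build_browser_like_request and the example User-Agent string are assumptions chosen for the sketch.

```python
import urllib.request

# Example value only: a string of the kind Internet Explorer 6 sent.
EXAMPLE_BROWSER_UA = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"


def build_browser_like_request(url, user_agent=EXAMPLE_BROWSER_UA):
    """Build (but do not send) a request carrying a browser-like User-Agent."""
    request = urllib.request.Request(url)
    # Without this, urllib announces itself as "Python-urllib/x.y",
    # much as Indy announces "Mozilla/3.0 (compatible; Indy Library)".
    request.add_header("User-Agent", user_agent)
    return request
```

Passing the returned object to urllib.request.urlopen would send the request. Servers that check more than the User-Agent (cookies, other header lines) would still need those reproduced as well.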

Jamie Dale
Posted: Tue Aug 08, 2006 6:55 pm    Post subject: Re: How does a www server know it's not a web client?

Quote:
Web browsers identify themselves by including the 'User-Agent' header in
their requests. You can use TIdHTTP's Request.UserAgent property to mimic
the ID of any web browser. Programs like GetRight do this, for instance.
Some web servers provide browser-specific content, so they look for the
'User-Agent' header.

Which is what I was getting at in my reply too.

Jamie

Lucian
Posted: Fri Aug 11, 2006 5:03 am    Post subject: Re: How does a www server know it's not a web client?

Quote:
HTTPS is another problem. You need an HTTPS-enabled HTTP component.
Both Indy and ICS have one. But spying on the HTTPS connection will
not reveal anything, since it is encrypted!

I got my stuff working with direct WinInet calls. No third-party components.

Lucian

All times are GMT.

 