Vincent Delporte Guest
|
Posted: Sun Aug 06, 2006 8:10 am Post subject: How does a www server know it's not a web client? |
|
|
Hi,
I naively used Indy's Get() to download the contents of a web
site and parse out its data, but the server could tell it was a script
doing the navigation instead of a web browser.
How does a server tell the difference? Is it because a browser sends
information such as the PC's OS, screen resolution, etc.? Is there a
way for Indy or any other Delphi-friendly Internet component to fake a
web client?
Thank you. |
|
Francois PIETTE [ICS - MidWare] Guest
|
Posted: Sun Aug 06, 2006 2:08 pm Post subject: Re: How does a www server know it's not a web client? |
|
|
| Quote: | I naively used Indy's Get() to download the contents of a web
site and parse out its data, but the server could tell it was a script
doing the navigation instead of a web browser.
How does a server tell the difference? Is it because a browser sends
information such as the PC's OS, screen resolution, etc.? Is there a
way for Indy or any other Delphi-friendly Internet component to fake a
web client?
|
You have to reproduce exactly the same request that a browser sends. Pay
attention to header lines and cookies. Use a sniffer or other spy tool (such
as SocketSpy: go to http://www.overbyte.Be, follow the UserMade link and then
search for SocketSpy) to see what request your browser really sends, and build
the same one.
Cookies are a two-step process: you go to some page, grab the cookie, and
then send the cookie back for other pages.
Contribute to the SSL Effort. Visit http://www.overbyte.be/eng/ssl.html
--
francois.piette (AT) overbyte (DOT) be
The author for the freeware multi-tier middleware MidWare
The author of the freeware Internet Component Suite (ICS)
http://www.overbyte.be |
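As a rough illustration of the two-step cookie process described in the post above, here is a minimal sketch in Python (standard library only) rather than Delphi/Indy. Everything site-specific in it is an assumption made up for the demo: the local test server, the cookie name `session`, the `/data` path, and the browser-like header values. A real site would need the headers your sniffer actually shows.

```python
import http.cookiejar
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for a site that insists on a session cookie."""

    def do_GET(self):
        if self.path == "/":
            # Step 1: the first page hands out a cookie.
            self.send_response(200)
            self.send_header("Set-Cookie", "session=abc123; Path=/")
            self.end_headers()
            self.wfile.write(b"welcome")
        elif "session=abc123" in self.headers.get("Cookie", ""):
            # Step 2: other pages are served only if the cookie comes back.
            self.send_response(200)
            self.end_headers()
            self.wfile.write(b"data")
        else:
            self.send_response(403)
            self.end_headers()
            self.wfile.write(b"no cookie")

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch_with_cookies(base_url):
    """Visit the first page, keep its cookie, then request a second page."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({}),  # ignore any system proxy for this local demo
        urllib.request.HTTPCookieProcessor(jar),  # stores and resends cookies
    )
    # Assumed browser-like headers; copy the real ones from your sniffer.
    opener.addheaders = [
        ("User-Agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"),
        ("Accept", "text/html,*/*"),
    ]
    with opener.open(base_url + "/") as first:
        first.read()
    with opener.open(base_url + "/data") as second:
        return second.read()


def run_demo():
    """Start the local test server, run the two-step fetch, then shut down."""
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return fetch_with_cookies("http://127.0.0.1:%d" % server.server_port)
    finally:
        server.shutdown()
        server.server_close()
```

Calling `run_demo()` performs both steps against the local server. Without the cookie processor, the second request would be answered with 403 by this demo server.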
|
Lucian Guest
|
Posted: Sun Aug 06, 2006 7:04 pm Post subject: Re: How does a www server know it's not a web client? |
|
|
It seems it's missing TWSocketServer. Could you make the compiled exe
available instead?
regards,
Lucian |
|
Francois PIETTE [ICS - MidWare] Guest
|
Posted: Sun Aug 06, 2006 7:12 pm Post subject: Re: How does a www server know it's not a web client? |
|
|
| Quote: | tool (such as SocketSpy: go to http://www.overbyte.Be, follow the UserMade link
and then search for SocketSpy)
It seems it's missing TWSocketServer. Could you make the compiled exe
available instead?
|
TWSocketServer is one of the ICS components. Download ICS.ZIP and install it.
It takes only a few minutes: just unzip ICS.ZIP in a directory of your choice
(preserve the directory structure in the zip when unzipping), add the
subdirectory vc32 to your Delphi search path, locate IcsDel70.dpk (or
whichever number corresponds to your Delphi version), build it and install
it. Then reopen the SocketSpy project and build it.
|
Lucian Guest
|
Posted: Sun Aug 06, 2006 7:29 pm Post subject: Re: How does a www server know it's not a web client? |
|
|
| Quote: | TWSocketServer is one of the ICS components. Download ICS.ZIP and
install it. It takes only a few minutes: just unzip ICS.ZIP in a
|
I downloaded OverbyteIcsV6beta.zip. Is that it?
Lucian |
|
Francois PIETTE [ICS - MidWare] Guest
|
Posted: Sun Aug 06, 2006 7:36 pm Post subject: Re: How does a www server know it's not a web client? |
|
|
| Quote: | TWSocketServer is one of the ICS components. Download ICS.ZIP and
install it. It takes only a few minutes: just unzip ICS.ZIP in a
I downloaded OverbyteIcsV6beta.zip. Is that it?
|
You downloaded the version 6 beta.
I think SocketSpy was built with the released version 5. The file is ICS.ZIP.
The link is in red in the middle of the ICS page and is labelled "Download the
latest ICS-V5 Distribution". You can't miss it!
|
Lucian Guest
|
Posted: Sun Aug 06, 2006 7:40 pm Post subject: Re: How does a www server know it's not a web client? |
|
|
Ok, the thing compiles here. How do I use it to see the headers sent
from my IE? Basically I am trying to automatically check some web-sites
for newer versions of some pdf files. One specific website uses https.
I want to see exactly what the request looks like, because IE can download
the file with no problem, but it fails when I use Indy.
regards,
Lucian |
|
Francois PIETTE [ICS - MidWare] Guest
|
Posted: Sun Aug 06, 2006 7:58 pm Post subject: Re: How does a www server know it's not a web client? |
|
|
| Quote: | Ok, the thing compiles here. How do I use it to see the headers sent
from my IE?
|
You run SocketSpy on the same computer as IE and configure IE to use
127.0.0.1 as its proxy.
You configure SocketSpy to listen on port 80 and to connect to the host you
give in the URL in IE.
When IE sends its request, it will connect to SocketSpy and send the request there.
SocketSpy will display the request in a form (you can easily grab the data
displayed and write it to a file should you want to examine it further).
| Quote: | Basically I am trying to automatically check some web-sites
for newer versions of some pdf files. One specific website uses https.
|
HTTPS is another problem. You need an HTTPS-enabled HTTP component. Both Indy
and ICS have one.
But spying on the HTTPS connection will not reveal anything, since it is
encrypted!
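The spying idea described above (a local listener that shows whatever request a client sends) can be sketched in a few lines of Python. This is not SocketSpy and does not relay traffic to the real host: it is a hypothetical stand-in that accepts one plain-HTTP connection on 127.0.0.1, records the raw request, and answers with a canned reply. The function names and the `/some/page` path are made up for the demo, and a Python client stands in for IE. As noted above, an HTTPS request captured this way would only show encrypted bytes.

```python
import socket
import threading
import urllib.request


def capture_one_request(listener, captured):
    """Accept one connection, record the raw request, send a tiny reply."""
    conn, _ = listener.accept()
    with conn:
        data = b""
        # Read until the blank line that ends the HTTP header block.
        while b"\r\n\r\n" not in data:
            chunk = conn.recv(4096)
            if not chunk:
                break
            data += chunk
        captured.append(data)
        conn.sendall(b"HTTP/1.0 200 OK\r\nContent-Length: 2\r\n\r\nok")


def spy_on_client(user_agent):
    """Return the raw text that a urllib client sends to a local listener."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.settimeout(5)  # do not wait forever if the client never connects
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    captured = []
    worker = threading.Thread(target=capture_one_request, args=(listener, captured))
    worker.start()
    try:
        request = urllib.request.Request(
            "http://127.0.0.1:%d/some/page" % port,
            headers={"User-Agent": user_agent},
        )
        # Empty ProxyHandler: talk to the listener directly, ignoring system proxies.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        with opener.open(request) as response:
            response.read()
    finally:
        worker.join()
        listener.close()
    return captured[0].decode("latin-1")
```

Printing the result of `spy_on_client("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)")` shows the request line followed by every header the client sent, which is the information you would compare against what the browser sends.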
|
Jamie Dale Guest
|
Posted: Sun Aug 06, 2006 8:42 pm Post subject: Re: How does a www server know it's not a web client? |
|
|
Actually, I thought it was a matter of the request header identifying the
client type, like this:
Sent: User-Agent: Mozilla/3.0 (compatible; Indy Library)
That's what my IdHTTP component sends in the header, and I never get any
problems.
I'm pretty sure it's a User-Agent issue and that you need to specify one. That's
generally the main way web servers determine which client is
connected to them via HTTP. Screen size, OS, and other details are generally
handled by cookies, and since cookies can be disabled, you do not have to
report this information back. The only thing that is generally identified to
web servers is the User-Agent.
I'd suggest you try looking into this.
Jamie
|
|
Vincent Delporte Guest
|
Posted: Mon Aug 07, 2006 2:45 am Post subject: Re: How does a www server know it's not a web client? |
|
|
On Sun, 6 Aug 2006 11:08:07 +0200, "Francois PIETTE [ICS - MidWare]"
<francois.piette (AT) overbyte (DOT) be> wrote:
| Quote: | You have to reproduce the exact same request as a browser is doing.
|
OK, that's too much work for this project. I'll put it on the
back burner. Thanks for the info. |
|
Remy Lebeau (TeamB) Guest
|
Posted: Mon Aug 07, 2006 11:45 pm Post subject: Re: How does a www server know it's not a web client? |
|
|
"Vincent Delporte" <justask (AT) acme (DOT) com> wrote in message
news:2b5bd2pg9v19j6268dq8vppveokvmhgo7g (AT) 4ax (DOT) com...
| Quote: | How does a server tell the difference?
|
Web browsers identify themselves by including the 'User-Agent' header in
their requests. You can use TIdHTTP's Request.UserAgent property to mimic
the ID of any web browser. Programs like GetRight do this, for instance.
Some web servers provide browser-specific content, so they look for the
'User-Agent' header.
Gambit |
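For readers who want to see the User-Agent idea from the post above in isolation, here is a small Python sketch. It is only an analogue of setting TIdHTTP's Request.UserAgent, not Indy code, and the browser string and the helper name `browser_like_request` are assumptions chosen for illustration.

```python
import urllib.request

# Example browser identification string; any browser's string could be used.
BROWSER_UA = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"


def browser_like_request(url, user_agent=BROWSER_UA):
    """Build a GET request that identifies itself with the given User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


request = browser_like_request("http://example.com/")
# urllib stores header names in capitalized form, hence "User-agent" here.
print(request.get_header("User-agent"))
```

Passing the request to `urllib.request.urlopen` would send it with that header in place of the library's default identification. Servers that also check cookies or other headers may still need the fuller approach discussed earlier in the thread.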
|
Jamie Dale Guest
|
Posted: Tue Aug 08, 2006 6:55 pm Post subject: Re: How does a www server know it's not a web client? |
|
|
| Quote: | Web browsers identify themselves by including the 'User-Agent' header in
their requests. You can use TIdHTTP's Request.UserAgent property to mimic
the ID of any web browser. Programs like GetRight do this, for instance.
Some web servers provide browser-specific content, so they look for the
'User-Agent' header.
|
Which is what I was getting at in my reply, too.
Jamie |
|
Lucian Guest
|
Posted: Fri Aug 11, 2006 5:03 am Post subject: Re: How does a www server know it's not a web client? |
|
|
| Quote: | HTTPS is another problem. You need an HTTPS-enabled HTTP component.
Both Indy and ICS have one. But spying on the HTTPS connection will
not reveal anything, since it is encrypted!
|
I got my stuff working. WinInet direct calls. No third parties.
Lucian |
|