Confusion regarding following a link

Taha Masood (taha.masood@streaming-networks.com)
Fri, 20 Apr 2001 11:37:17 -0700


Hi folks,

I am a little confused about HTTP and would appreciate it if someone could help me sort it out.
The problem is as follows:

Say I give the following URL to a browser:
http://directory.google.com/Top/Computers/Algorithms/

It builds a request that, apart from other things, contains the following lines, which are the ones I am interested in here:

GET /Top/Computers/Algorithms/ HTTP/1.1
Host: directory.google.com


Fine, the server responds with an HTML page,
and I render the HTML in my GUI.
The HTML contains the following link:

<a href="/Top/Science/Math/Applications/Communication_Theory/Cryptography/Algorithms/">Cryptography</a>

Now my confusion is this: if my user "clicks" on the hyperlink above, what request should I generate?

Until now I have classified the situation into three cases:

  Whenever we are viewing a certain page on the web and try
  to follow a link to another page, there can be three cases. For all the
  cases, the current page is, say: www.abc.com/help/u1/myHelp.html

  FIRST CASE:
  The link I try to follow is: "/yourHelp.com"
  Effective URL should be:
  www.abc.com/help/u1/yourHelp.com

  SECOND CASE:
  The link I try to follow is: "../../TopLevelHelp.com"
  Effective URL should be:
  www.abc.com/TopLevelHelp.com

  THIRD CASE:
  The link I try to follow is: "www.beta.com/OtherHelp.com"
  Effective URL should be:
  www.beta.com/OtherHelp.com
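
To make the three cases above concrete, here is a minimal sketch in Python. It is only a hypothetical reconstruction of the scheme as described: the function name `effective_url` is an assumption, and any link that does not start with "/" or "../" is treated as the THIRD CASE because only three cases are named.

```python
def effective_url(current: str, link: str) -> str:
    """Hypothetical reconstruction of the three-case scheme described above."""
    # Directory of the current page: everything before the last "/".
    base_dir = current.rsplit("/", 1)[0]

    if link.startswith("/"):
        # FIRST CASE: the link is appended to the current directory.
        return base_dir + link

    if link.startswith("../"):
        # SECOND CASE: each leading "../" removes one directory level.
        parts = base_dir.split("/")
        while link.startswith("../"):
            link = link[3:]
            parts.pop()
        return "/".join(parts + [link])

    # THIRD CASE: the link is taken as a complete URL on another host.
    return link


current = "www.abc.com/help/u1/myHelp.html"
for link in ("/yourHelp.com", "../../TopLevelHelp.com", "www.beta.com/OtherHelp.com"):
    print(link, "->", effective_url(current, link))
```

Run on the example page, the sketch gives the three effective URLs listed in the cases.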

I had implemented a little parser in my application. It is given the URL of the resource currently being displayed and the link we are trying to follow, which is given in the HTML in the href attribute of the <a> tag, and it returns the effective URL that actually has to be shown. From that URL I separate the host part and the relative part, build an HTTP request, and pass it on to the server. It has worked fine until now, but I encountered an error today that led me to believe that I was probably NOT understanding things properly.
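
The step of separating the host part from the relative part and building the request can be sketched as follows. This is a minimal illustration, not the application's actual code: `build_request` is a hypothetical name, the splitting is done with `urlsplit` from Python's standard `urllib.parse`, and the URL is assumed to include the "http://" scheme so that the host can be recognised.

```python
from urllib.parse import urlsplit


def build_request(url: str) -> str:
    """Build a bare GET request from an absolute http URL (illustrative only)."""
    parts = urlsplit(url)
    # An empty path means the root resource.
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    # Request line, Host header, then the blank line that ends the headers.
    return f"GET {path} HTTP/1.1\r\nHost: {parts.netloc}\r\n\r\n"


print(build_request("http://directory.google.com/Top/Computers/Algorithms/"))
```

For the directory page mentioned earlier this produces the same request line and Host header as the ones quoted at the start of this message.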

The problem occurred when I got to the page:

http://directory.google.com/Top/Computers/Algorithms/

The page contains this line of HTML:
<a href="/Top/Science/Math/Applications/Communication_Theory/Cryptography/Algorithms/">Cryptography</a>

Now when my user "clicks" on the hyperlink above, according to my cases this falls into the FIRST CASE, and the effective URL I build is:

http://directory.google.com/Top/Computers/Algorithms/Top/Science/Math/Applications/Communication_Theory/Cryptography/Algorithms/

Fine, so I separate the host part and the relative part and build the HTTP request:

GET /Top/Computers/Algorithms/Top/Science/Math/Applications/Communication_Theory/Cryptography/Algorithms/ HTTP/1.1
Host: directory.google.com

The server replies that this resource does not exist.

When I follow the same link in MS Internet Explorer, the request it generates is:

GET /Top/Science/Math/Applications/Communication_Theory/Cryptography/Algorithms/ HTTP/1.1
Host: directory.google.com
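
For comparison, here is a minimal sketch using `urljoin` from Python's standard `urllib.parse`, which resolves a link against the URL of the current page. It is an outside illustration, not part of the application described here. For this link, which begins with "/", it keeps the host and replaces the whole path, so the resulting path matches the one in the Internet Explorer request above.

```python
from urllib.parse import urljoin, urlsplit

# URL of the page currently displayed, and the href value of the link.
base = "http://directory.google.com/Top/Computers/Algorithms/"
link = "/Top/Science/Math/Applications/Communication_Theory/Cryptography/Algorithms/"

# Resolve the link against the current page, then take the path
# that would go on the GET request line.
resolved = urljoin(base, link)
request_path = urlsplit(resolved).path

print(resolved)
print(request_path)
```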


What are the general rules for following links? Which part of the RFC covers this?

I would really appreciate it if someone could explain this to me.

Thanks in advance,

Regards,
Taha








