[E-Lang] remote comms: Timeouts and Connection Failure

Karp, Alan alan_karp@hp.com
Fri, 20 Apr 2001 09:55:29 -0700


MarcS wrote:
> 
> This ratio of many application failures per "connection" failure could be
> caused by either bad code or good connections;
>

I didn't say "lost connections"; I said "undetected lost connections".
Connections get dropped all the time, but I almost always get a null message
on the socket (or a dial tone in your case).  That's a BURP (see my earlier
note for a definition), which I know how to deal with.

It's the NAP that's a problem.  Distinguishing a slow application from an
erroneous one requires a default policy that has nothing to do with
keepalives.  And it's not just the application.  A NAP could be caused by
message queue overflow in the network, excessive retries by a router, and
so on.  Keepalives, being short, are often optimized differently than the
reply I'm looking for, so a keepalive getting through says little about
whether the reply will.
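
To make the distinction concrete, here is a minimal sketch in Python (not E),
assuming that a BURP shows up as an empty read or a connection error on the
socket and that a NAP shows up only as silence.  The function name
wait_for_reply, the outcome labels, and the timeout value are hypothetical,
chosen for illustration; the timeout is the application's own default policy
and is independent of any keepalive setting.

```python
import socket


def wait_for_reply(sock, timeout_s):
    """Classify the outcome of waiting for one reply on a connected socket.

    Returns (outcome, data), where outcome is "REPLY", "BURP", or "NAP".
    The labels follow this thread's terms as read here (an assumption):
    BURP = a detected lost connection, NAP = no answer at all.
    """
    # Application-level policy: how long we are willing to wait.
    sock.settimeout(timeout_s)
    try:
        data = sock.recv(4096)
    except socket.timeout:
        # Nothing arrived in time. The peer may be slow, broken, or the
        # reply may be stuck in the network; we cannot tell which.
        return ("NAP", b"")
    except ConnectionError:
        # The transport reported the failure (reset, broken pipe, ...).
        return ("BURP", b"")
    if data == b"":
        # An empty read is the "null message": the peer closed the connection.
        return ("BURP", b"")
    return ("REPLY", data)
```

The BURP branches are easy because the transport tells us what happened.  The
NAP branch is where the policy lives: the caller still has to decide whether
to retry, wait longer, or give up, and no keepalive result answers that.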

_________________________
Alan Karp
Principal Scientist
Decision Technology Department
Hewlett-Packard Laboratories MS 1U-2
1501 Page Mill Road
Palo Alto, CA 94304
(650) 857-3967, fax (650) 857-6278
https://ecardfile.com/id/Alan_Karp
http://www.hpl.hp.com/personal/Alan_Karp/