U.S. patent application number 12/313434 was filed with the patent office on 2009-10-08 for reliable isp access cloud state detection method and apparatus.
Invention is credited to Sajit Bhaskaran, Prashanth Krishnamorthy, Anmol Kumar.
Application Number: 20090252044 (Appl. No. 12/313434)
Family ID: 41133164
Filed Date: 2009-10-08

United States Patent Application 20090252044
Kind Code: A1
Bhaskaran; Sajit; et al.
October 8, 2009
Reliable ISP Access Cloud state detection method and apparatus
Abstract
A Multi-Homing System is equipped with an Adaptive ISP Access
Cloud State Detection apparatus (ACSD) that improves the
reliability of the availability of digital connections (links)
between computer sites, such as a Computer Premises Network and the
Internet, in which such connections are made by connecting through
a multiplicity of ISP Access Clouds (links). Reliability is
improved over prior art methods by using data elements of Internet
Protocol datagrams, e.g. record fields or bits of fields, that are
regularly and normally exchanged between the ISP Access Clouds and
the CPN without creating additional data traffic. Data Elements
from each ISP Access Cloud are used by processing functions of the
ACSD to test for conditions that indicate that it may be in
a DOWN status. When a DOWN status is suspected, other functions in
the ACSD initiate transmission of a set of PROBE packets that can
reliably determine if the suspect link is actually DOWN or merely
giving a response that would be interpreted as DOWN by prior art
methods.
Inventors: Bhaskaran; Sajit; (Sunnyvale, CA); Kumar; Anmol; (Santa Clara, CA); Krishnamorthy; Prashanth; (Santa Clara, CA)
Correspondence Address: GEORGE M STERES, 20200 PIERCE RD, SARATOGA, CA 95070, US
Family ID: 41133164
Appl. No.: 12/313434
Filed: November 20, 2008

Related U.S. Patent Documents

Application Number | Filing Date | Patent Number
11012554 | Dec 14, 2004 |
12313434 | |

Current U.S. Class: 370/248
Current CPC Class: H04L 43/0817 20130101; H04L 43/0811 20130101
Class at Publication: 370/248
International Class: H04L 12/26 20060101 H04L 12/26
Claims
1. An adaptive network communications system communicating through
a 1.sup.st plurality of communication access paths to a 2.sup.nd
plurality of destinations each identified by a destination address
that uses a subset of observed round trip times (RTT) for a set of
messages sent through said 1.sup.st plurality of communication
access paths to detect UP or DOWN status of communication through
any of said communication access paths, said system comprising: a.
means for determining said Up or Down status of one of said
communications paths as Up when more than one message directed to
one of said destinations through said one of said communication
paths is acknowledged by said destination within a wait delay
period, TIMEOUT, and determining said Up or Down status of said one
of said communications paths as Down when said more than one
message directed to said one of said destinations through said one
of said communication paths is not acknowledged by said destination
within said wait delay period, TIMEOUT, wherein TIMEOUT equals a
multiple of the maximum RTT of said subset of observed times.
2. The network communications system set forth in claim 1, in which
said multiple is a small number.
3. The network communications system set forth in claim 2, in which
said number is about 3.
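The timeout rule of claims 1-3 can be illustrated with a short sketch (illustrative Python only, not the claimed implementation; the function names and the acknowledgement-count inputs are hypothetical):

```python
def compute_timeout(observed_rtts, multiple=3):
    """TIMEOUT equals a small multiple (claim 3: about 3) of the
    maximum RTT in the subset of observed round trip times."""
    return multiple * max(observed_rtts)

def link_status(acks_received, elapsed, timeout):
    """UP when more than one message is acknowledged within the
    wait delay period TIMEOUT; DOWN otherwise."""
    if acks_received > 1 and elapsed <= timeout:
        return "UP"
    return "DOWN"

# Example: observed RTTs in seconds for one access path
timeout = compute_timeout([0.020, 0.045, 0.031])   # 3 x 0.045 = 0.135 s
status = link_status(acks_received=2, elapsed=0.100, timeout=timeout)
```

Because TIMEOUT adapts to the measured RTTs of the path itself, a slow but working link is not misclassified as DOWN by a fixed timer.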
4. An adaptive MULTIHOMING SYSTEM for reliable monitoring and
reporting UP/DOWN status of a plurality N of ISP-ACCESS CLOUDs
connected to said adaptive MULTIHOMING SYSTEM, comprising: a. an
ACCESS CLOUD STATE DETECTION apparatus comprising: b. a permanent
memory store PM-1; c. a dynamic memory store DM-1; d. a computing
resource CR-1; comprising: e. a CPU & I/O chip set; f. a
Control Program and a plurality of Program_Modules in cooperation
with said CPU & I/O chip set that control the ACCESS CLOUD
STATE DETECTION apparatus to execute operations, comprising: i. for each one
of said N ISP-A/C's, continually and continuously read packet
data flow information for packets directed to and from each one of
said N ISP-A/C's; ii. on a first condition for a packet directed to
and from each ISP-A/C-n of said N ISP-A/C's, store said directed
packet's flow information in a TCP-Flow-Table; iii. on a second
condition for a packet directed to each ISP-ACCESS CLOUD-n of said
N ISP-ACCESS CLOUDs, add TCP destination IP address of said packet
to and delete an old TCP destination IP address from, a list of TCP
destination IP addresses allocated to a memory portion DYNAMIC SEED
LIST in said dynamic memory; iv. on a third condition for packets
directed to an ISP-ACCESS CLOUD-n of said N ISP-ACCESS CLOUDs,
update a permanent memory portion DEFAULT SEED LIST in said memory
store PM-1 with said list of TCP destination IP addresses stored in
Dynamic Memory-1; v. for each inbound packet from each ISP-ACCESS
CLOUD-n of said N ISP-ACCESS CLOUDs, determine the inbound byte
count, and add said count to a corresponding ISP-ACCESS CLOUD-n
inbound-byte-counter, and on a regular, periodic interval of
integer k-seconds, compare its inbound-byte-counter value at
time=t, [ISP-n(t)], with its inbound-byte-counter value at time=t+k,
[ISP-n(t+k)], and if said values are the same, set a Blackout_Hint
variable to YES, and initiate an UP/DOWN status test of said
ISP-ACCESS CLOUD-n to determine if said ISP-ACCESS CLOUD-n is UP or
DOWN. vi. on a fourth condition for packets directed to a TCP
destination IP address, send a set PROBE of packet sequences, to
said TCP destination IP address through each one of said N
ISP-ACCESS CLOUDs, measure a Round Trip Time value to and from said
TCP destination IP address for each one of said N ISP-ACCESS
CLOUDs, and store said value for each one of said N ISP-ACCESS
CLOUDs and each said destination address in a ROUND TRIP TIME
table; vii. wherein: 1. each PROBE set probe packet has a TCP
source port number that is a number P, such that flow information
<X, Y, P, Q> for each said probe packet is not found in said
TCP-Flow-Table; 2. wherein P is found by: a. said ACCESS CLOUD
STATE DETECTION computing resource CR-1 sets variables X, Y, Q
given by X=TCP source IP address, Y=TCP destination IP address, Q=a TCP
destination port; b. said ACCESS CLOUD STATE DETECTION computing
resource CR-1 instantiates a label GENERATE: and calls a random
number generator procedure RANDOM (P); c. said ACCESS CLOUD STATE
DETECTION computing resource CR-1 calls a Program Module SEARCH
Flow Table for <X, Y, P, Q>, in which variable flow-found=YES
if <X, Y, P, Q> is found in said Flow Table, and then CR-1
sends control back to said label GENERATE; ELSE CR-1 sets
flow-found=NO and sets TCP source port=P.
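The GENERATE loop of clause vii can be sketched as follows (an illustrative Python sketch under the assumption that the TCP-Flow-Table can be modeled as a set of <X, Y, P, Q> tuples; names and the port range are hypothetical):

```python
import random

def pick_probe_source_port(flow_table, x, y, q):
    """Choose a TCP source port P such that the flow <X, Y, P, Q>
    is not already present in the TCP-Flow-Table (the GENERATE /
    RANDOM(P) / SEARCH loop of claim 4, clause vii)."""
    while True:
        p = random.randint(1024, 65535)      # RANDOM(P)
        if (x, y, p, q) not in flow_table:   # SEARCH Flow Table
            return p                         # flow-found = NO; use P

# Example: one existing flow to a web server on port 80
flows = {("10.0.0.2", "128.186.5.2", 12344, 80)}
port = pick_probe_source_port(flows, "10.0.0.2", "128.186.5.2", 80)
```

Selecting a port absent from the flow table guarantees each probe creates a genuinely new flow, which is what lets the probe detect a router whose flow cache is full.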
Description
BACKGROUND OF THE INVENTION
Field of the Invention: Connections To The Internet
[0001] FIG. 1 illustrates a typical Customer Premises Network (CPN)
1-100, communicating with the Internet 1-101. The CPN connects to
the Internet in the typical manner, through a set of Internet
Service Providers, i.e., the ISP Access Clouds ISP-1, ISP-2, . . .
ISP-n. The term Access Cloud (often referred to as an Internet
link) is used here to distinguish from other, more general terms
that have been used to denote Internet connections, but those more
general terms also may introduce different, unwanted
connotations.
[0002] The elements of the entire Internet-ISP Access Cloud
connections-CPN system include: The Internet represented as the
upper cloud icon INTERNET, the ISP Access Clouds [ISP-1, -2, -3, -4],
and the Customer Premises Network that includes a prior art
Multihoming System (MHS) connecting the ISP access clouds to
Customer User Equipment (CPE). The CPE usually has a Customer-owned
Hub, Switch or Router connected to a multiplicity of Customer USER
servers, computers, work stations and the like, represented here by
USER-1, USER-2, . . . USER-m. The Customer Premises Equipment (CPE) resides in
the CPN, as does some ISP-owned equipment, indicated by the overlap
between the ISP Access Clouds and the CPN.
[0003] Each ISP-n Access Cloud has a communication path or
connection for Internet traffic (indicated by double-headed arrow
ISP-n) that is identified as such by the MHS. As shown in FIG. 2
below, the ISP-n connection from the MHS to the Access cloud is
usually a single router (router-n) owned by the particular ISP but
located in the customer premises; a "last mile" link, e.g., T1 or DSL,
connecting router-n to a phone company central office or ISP-n
point of presence, an Internet router at the ISP-n point of
presence (Aggregation-router-n), and all the neighboring routers
belonging to ISP-n up to the point where ISP-n connects to another
ISP. Each ISP-n, router-n combination is represented by the `ISP
Access cloud` icon named, e.g., ISP-1. As shown in FIG. 1, each ISP
Access cloud, ISP-n, forms a uniquely identified communication path
between the MHS and the Internet.
[0004] The communication path ISP-1 through the first ISP Access
Cloud consists of the first link or connection to the MHS (the
overlap of the Access Cloud and the Customer Premises) and a second
link or connection to the Internet (the overlap of the Access Cloud
and the Internet cloud).
[0005] On the other side of the MHS there are connections to the
CPE in the CPN. In the example shown in FIG. 1, customer premises
equipment (USER-1, USER-2, . . . USER-M) accesses Internet traffic
(double-headed arrows) by separate connections to the MHS through a
router, hub or switch. Each of the connections to the MHS from
USER equipment may also include a separate firewall (not
shown).
[0006] Each of the MHS-access cloud connections may also have
Ethernet switches, routers or hubs interposed between.
[0007] The Access Clouds are shown partly shared by the Internet
and partly shared with the CPN indicating that equipment
identifying each ISP is distributed, with some Customer Premises
equipment (e.g., usually a router) located in the customer premises
1-104. In the CPN of FIG. 1 Multihoming system (MHS) 1-106 is the
entity within the CPN directly communicating with the ISP Access
Clouds on the one side and CPN User equipment USER-1, USER-2, . . .
USER-M (servers, PCs, workstations, etc.) communicating directly
with the MHS.
[0008] FIG. 1 represents what is typically found in a CPN ranging
from a moderate size to an enterprise-wide Customer Premises Network
incorporating a Multihoming System (MHS) connected to the Internet
through a parallel multiplicity of ISP Access Clouds (links).
[0009] Definition of an ISP Access Cloud
[0010] Referring now to FIG. 2, a more detailed diagram of a
typical ISP Access Cloud 1b-100 is shown. An Access Cloud is
that collection of elements, which are jointly responsible for
delivering Internet traffic to and from the Customer Premises
Network 1-100. The first four elements of that collection are a
series or chain including, in this example, Customer-owned Ethernet
switch 1b-102, ISP-owned customer premises router 1b-104, a Telco
facility 1b-106 providing a wide area line (DSL, T1, T3, Wireless,
etc.), and an ISP point of presence router 1b-108. Note that in most
cases, some Customer Premises Equipment (CPE) 1b-102, although
physically located at a Customer site, will belong to the ISP
Access Cloud 1b-100. After the router 1b-108, communication to the
rest of the Internet proceeds by parallel paths, e.g., ISP backbone
routers 1b-110, -112. If any one element of the series chain in an
ISP Access Cloud fails, Internet traffic will not be successfully
routed through the ISP Access Cloud to the Customer Premises
Network. Hence the entire ISP Access Cloud forms a single
reliability chain.
[0011] FIG. 2 does not cover all cases exhaustively, as ISP Access
Clouds are extremely diverse; however it is typical. What is common
in all cases is that many routers 1b-108, -110, -112, Ethernet
switches 1b-102, and sometimes phone company switching equipment
1b-106, are involved in the reliability chain, some on customer
premises, some on Incumbent Local Exchange Carrier (ILEC) premises,
with the majority of routers 1b-108, -110, -112 being on ISP
premises.
[0012] In terms of reliability, an ISP Access Cloud can be in only
one of two states: UP or DOWN.
[0013] In the UP state, when all the elements in the reliability
chain are functioning, Internet traffic is successfully delivered
to multiple destinations in each direction.
[0014] When at least one element in the reliability chain fails,
the ISP Access Cloud will be in the DOWN state.
[0015] Note that unlike traditional networks prior to the Internet,
the reliability chain spans multiple domains of responsibility. In
FIG. 2, there are 3 domains: a) the customer (who owns and controls
the CPE router and CPE Ethernet switch), b) the Incumbent Local
Exchange Carrier who delivers T1 or DSL lines wholesale to an ISP
(Telco facility 1b-112), and c) the ISPs themselves (including the
ISP's hub, switch or router, e.g., router 1b-104).
[0016] Typically, the MHS maintains a list of User-ISP Addresses
(UIA-1, UIA-2, . . . UIA-m), which is a subset of the Internet's
Destination IP address list. For the particular CPN 1-100, the
Internet `cloud` includes a Designated List of active ISPs
(servers) denoted as ISP [N]. The members of that set may be
enumerated as ISP-n, for n ranging from 1 to N.
[0017] Elements of Typical ISP Access Cloud
[0018] FIG. 2 shows elements of a typical ISP Access Cloud. These
elements include: more routers to the Internet; ISP backbone
router(s); an ISP Point of Presence router; a Telco facility with a
wide area line (e.g., DSL, T1, T3, or a Wireless link); and, on the
Customer Premises, a Router, an Ethernet Switch, and the Customer
Premises Network with its MHS, router & Users.
[0019] Prior Art Internet Connection Reliability Measures
[0020] Periodic ICMP requests to Fixed IP List Configured by
User
[0021] Some existing prior art in ISP Access Cloud status detection
involves sending periodic ICMP (Internet Control Message
Protocol) Echo requests to the fixed list of IP addresses, which is
maintained and stored by the CPN, generally in the MHS unit
memory storage system. This is a common process well known in the
art.
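In outline, the prior-art check amounts to the following (an illustrative Python sketch; `send_icmp_echo` is a hypothetical placeholder for a real ping implementation):

```python
def prior_art_status(fixed_ip_list, send_icmp_echo):
    """One round of the prior-art check: send an ICMP Echo request to
    every address on the fixed, user-configured list through the ISP;
    the ISP is declared DOWN when no address answers."""
    if any(send_icmp_echo(ip) for ip in fixed_ip_list):
        return "UP"
    return "DOWN"

# Example with a stub responder standing in for the network:
responders = {"128.186.5.2"}   # addresses that would answer a ping
status = prior_art_status(["128.186.5.2", "193.2.3.4"],
                          lambda ip: ip in responders)
# If the ISP's routers block ICMP, every echo fails and the check
# reports DOWN even though the link is actually UP (a false DOWN).
```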
[0022] Drawbacks of Prior Art Reliability Measures
[0023] When these requests are sent through a specific ISP, and
fail to elicit an ICMP Echo response, that ISP is declared down. A
major drawback of the ICMP request approach in previous systems is
that it is unreliable in common situations.
[0024] One common situation arises because of router blocking of
ICMP packets. Many ISPs configure their routers to block (i.e.
drop) ICMP request packets, especially during times when the
Internet as a whole or a single ISP is experiencing problems.
[0025] When this happens, the ICMP requests will time out and the
User's MHS will falsely conclude that the ISP is DOWN, even though
it is really UP.
[0026] A second drawback of previous systems is that the user has
to configure a list of destination ISP addresses that need to be
checked. The User usually configures this fixed list as part of
their normal setup and/or operation procedures. This is an extra
burden on system operations personnel.
[0027] A third drawback of such previous systems is that once the
list of ISP destination addresses is generated, the list is fixed.
Over some sustained time period, some or all of the machines
supporting the addresses on the fixed list can be taken out of
service and be replaced by a machine with a different address
providing the same communication path. In that case a false DOWN
indication would be detected by an MHS relying on the ICMP
packet.
[0028] A fourth drawback in the previous systems is non-randomness
of flows with systems relying on cache storage of flows. The ICMP
requests involve fixed values in the IP address fields that do not
change over time. Because of this the following class of fault
conditions will not be detected by such a system. Under hostile
conditions on the Internet, sometimes these caches storing flows
fill up, and new flows are no longer admitted into the router. Old
flows will continue to appear to function though, including the
ICMP request and response packets. The multihoming system in this
case will report a false UP status, i.e., it will fail to detect a
true ISP-Access Cloud DOWN status.
[0029] It is highly desirable to have a reliable method of
detecting the communication status of a network connection as UP or
DOWN in the presence of the conditions described above.
[0030] A system of reliably verifying UP/DOWN status of a
particular ISP is greatly desired and would provide more robust
Internet communications for users and suppliers.
BRIEF SUMMARY OF THE INVENTION
[0031] One object of the present invention is to provide a method
and apparatus to reliably detect ISP Access Cloud states as
either UP or DOWN.
[0032] A second object of the invention is to provide an
auto-learning and adaptive approach for generating a User list of
ISP addresses to check for reliable connections thereby removing
that burden from User network system operations, freeing the
customer of the time and effort to create and maintain a meaningful
list.
[0033] It is an advantage to the User of the present invention
that it provides a method and apparatus that completely solves this
problem.
[0034] Another object of the invention is to provide a multihoming
system that automatically learns and caches the most recently used
destination IP addresses. This keeps the list of addresses `fresh`,
i.e., those most currently active and thus less likely to be taken
out of service, automatically removing old addresses that are more
likely to become `stale`, and subject to false DOWN status
indications as in prior art systems.
[0035] Another object of the invention is to randomize selection of
flows in such a way that even if an ISP Access Cloud device's
internal tables become full, such that it prevents new user
sessions from accessing the Internet, then auto-detection and
auto-recovery from that condition is possible.
BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWING
[0036] FIG. 1 is a diagram of a typical Customer Premises Network
incorporating a Prior Art Multihoming System (MHS) connected to the
Internet through a multiplicity of ISP Access Clouds (links).
[0037] FIG. 2 illustrates a typical ISP Access Cloud shown in FIG.
1.
[0038] FIG. 3 shows the Customer Premises Network of FIG. 1
connected to the Internet through an Adaptive Multihoming System
having an embedded ISP-ACSD in accordance with an aspect of the
present invention.
[0039] FIG. 4 is a detailed block diagram of the ISP-ACSD shown in
FIG. 3.
[0040] FIG. 5A is an exemplary flow chart for a Power_On_Sequence
program module used to verify ISP-ACSD UP or Down status in the
ACSD of FIG. 4.
[0041] FIG. 5B depicts an AUTO DET & SEED LIST UPDATE program
module for the Computing Resource CR-1 shown in FIG. 4.
[0042] FIG. 5C shows a flow chart of a PROBE Sequence program
module used to verify ISP-ACSD UP or Down status in the ACSD of
FIG. 4.
[0043] FIG. 5D is a flow chart of a Random TCP Source Port
Selection program module used to verify ISP-ACSD UP or Down status
in the ACSD of FIG. 4.
[0044] FIG. 5E depicts an Update Inbound Packet Byte Count program
module used to verify ISP-ACSD UP or Down status in the ACSD of
FIG. 4.
[0045] FIG. 5F illustrates a HINT_DOWN_DET and Aggressive Probe
program module used to verify ISP-ACSD UP or Down status in the
ACSD of FIG. 4.
[0046] FIG. 6 is a DFSL diagram used to verify ISP-ACSD UP or Down
status in the ISP ACSD of FIG. 4.
[0047] FIG. 7 illustrates DFSL & Dynamic Seed List Adaptation
program module used to verify ISP-ACSD UP or Down status in the
ACSD of FIG. 4.
[0048] FIG. 8A depicts the standard IP Header Format commonly used
in the art.
[0049] FIG. 8B depicts the standard TCP Header Format commonly used
in the art.
DETAILED DESCRIPTION OF THE INVENTION
A Top Level View of an Embodiment of the Invention
[0050] Referring now to FIG. 3 there is shown a top-level block
diagram of an embodiment of an Adaptive MHS (A-MHS) 2-104
according to the present invention located in a CPN as is the prior
art MHS in the diagram of FIG. 1.
[0051] The A-MHS is adapted to incorporate an embodiment of an
Access Cloud Status Detector according to the present invention,
the embodiment shown as ISP-ACSD.
[0052] The CPN system is typically connected to a multiplicity of
separate ISP Access Clouds. Each ISP has an identifier (a name or a
number) that is unique within the MHS system. This is also well
known in the art.
[0053] The adapted MHS has first connection means CM-1
communicating with Users' computer equipment 1b-104 (User-1,
User-2) at the User site and second connection means (ISP-1, -2, . .
. of FIG. 1), represented here by arrow CM-2, communicating with the
Internet 1-102 of FIG. 1.
[0054] First connection means CM-1 generally includes one or more
User hubs, switches or routers connecting multiple Internet access
request sources, e.g., User-1, User-2 . . . (User computers,
servers and the like) to the A-MHS.
[0055] Second connection means CM-2 consists of a multiplicity of ISP
Access Clouds. The ISP Access Clouds previously have often been
referred to as access links, or sometimes as access ports. Such
connections are generally configured as servers, e.g., ISP-1, -2,
-3, -4 with respective routers (router-1, -2, -3, -4).
[0056] The ISP-ACSD 2-100 and internal elements 2-104 of the MHS
communicate data and control commands through an internal ACSD
connection represented by arrow 2-106.
[0057] In FIG. 3 there is shown a more detailed block diagram 300
of the ISP-ACSD 2-100 of FIG. 2.
[0058] The ISP-ACSD includes Non-volatile storage memory PM-1
(Permanent Memory Storage space allocated to Default Seed List data
DFSL), dynamic memory storage DM-1, and a computing resource CR-1
with a control program CF-1, a data bus DB-1 and read/write/control
bus R/W&C-1 connecting between the computing resource CR-1 and
the memories. DB-1 and R/W&C-s also connect to the internal MHS
functions as shown on FIG. 2. I/O interfaces 1/0-1 and 1/0-2 shown
in FIG. 3 connect the MHS internal functions 2-104 to the Internet
and User equipment through CM-1 and CM-2 of FIG. 2.
[0059] Although preferred embodiments of the present invention are
described as including a computing processor module, the invention
is understood to apply to multihoming solutions that include
either single or multi-processor computing modules.
[0060] FIG. 3 shows the same kind of structure as the Prior Art of
FIG. 1, except that FIG. 3 illustrates an adapted MHS (A-MHS)
including an embodiment of the ISP-ACSD invention that replaces the
prior art MHS of FIG. 1.
[0061] Elements in FIG. 3 having the same identifying reference
characters are the same as in FIG. 1, and include:
INTERNET, ISP-n Access Clouds, CPN, customer-owned equipment such
as USER-1, . . . USER-m, I/O-1, I/O-2, and the Customer Hub,
Switch or Router.
[0062] Elements in FIG. 3 different than in FIG. 1 are:
The ISP-ACSD, and the ACSD CONNECTION to the MHS elements
cooperating with ISP-ACSD.
[0063] Detail Block Diagram of ACSD; FIG. 4
[0064] FIG. 4 illustrates a block diagram of the Adaptive MHS of
FIG. 3, adapted to incorporate an embodiment of the present
invention, and specifically a preferred embodiment shown in FIG. 2
as ISP-ACSD embedded in the MHS.
[0065] The ISP-ACSD embodiment of FIG. 4 includes:
[0066] PM-1: Non-Volatile-Memory-1 is Permanent Memory Storage with
space allocated to DFSL data and the Adaptation_Complete_Flag used in
the Power_On_Sequence module (described below).
[0067] Dynamic Memory-1 is dynamic memory with dynamic storage
space (SSDM) allocated to Dynamic Seed List data, and to Round Trip
Time History data (the RTT table), Inbound_Byte_Counters for
storing Inbound Packet Byte counts for ISP access clouds 1-N, and
storage space DM-FLAGS for various flags used as described
below.
[0068] Other elements of the ACSD include: COMPUTING RESOURCE CR-1,
typically a CPU & I/O chip set connected to a DATA BUS (D-BUS)
and a CONTROL BUS (C-BUS) that communicate with memories DM-1 and
PM-1.
[0069] The D-BUS and C-BUS also connect through INTERFACE (INTF-1)
to selected MHS elements (E-MHS) that are generally inherent in the
MHS. The pertinent connections and MHS elements (E-MHS) are those
that provide data values, flags, register contents, drivers and the
like that the ACSD and the adaptive MHS utilize in performing their
functions in embodiments of the present invention.
[0070] Knowledgeable computer networking hardware and software
design practitioners are familiar with the needed MHS elements
(E-MHS) and how to structure the INTF-1 in order to design, build
and operate a particular implementation of the present invention.
The MHS elements (E-MHS) and the INTF-1 required for a particular
embodiment of the present invention will become clear from the
detailed description of the ACSD invention's structure, operation
and its relationship to the A-MHS which follows.
[0071] The ACSD Computing Resource CR-1 operates the ISP-ACSD
control Program CP-1. The CP-1 includes a number of Program Modules
& Procedures (PRMP-1, 2, 3 . . . ) described below that enables
the ACSD to provide the features and benefits of the present
invention with the A-MHS.
[0072] The ACSD has a Read/Write memory configuration including a
permanent or non-volatile part, PM-1, and a high-speed dynamic
part, DM-1.
[0073] The permanent (or durably persistent) read-write digital
memory store, i.e., Non-Volatile-Memory-1 (PM-1) is allocated to
store Default Seed List (DFSL) data, and permanent memory Flags
(PM-Flags) data indefinitely with power off.
[0074] Dynamic Memory-1 DM-1 is allocated to store Dynamic Seed
List data and a Round-Trip-Time-History table (RTT) for storing
Round-Trip-Time-History data (described below).
[0075] The ACSD has an internal Control Bus and an internal
Read/Write Data bus. The Control_Bus transmits Control Commands to,
and from, all units connected to it. The Control_Commands
transmitted and received by units connected on the control_bus
includes Read/Write Control and Request commands for reading and
writing data on the Data_Bus.
[0076] Knowledgeable practitioners of the computer arts can
configure particular implementations of PRMP modules to run on one
or another of a number of well-known operating systems, for example
Unix.TM., Linux.TM. or Microsoft Windows.TM., by understanding the
detailed description of the present invention that follows.
[0077] The D-BUS communicates Read and Write data (RWDATA) to and
from the units connected to it, i.e., the memories PM-1 & DM-1,
the Computing Resource CR-1, and through the Interface INTF-1, to
the MHS elements.
[0078] In a similar manner, the C-BUS communicates Read & Write
and Control Commands (R/W&C) to the units connected to it,
i.e., the memories PM-1 & DM-1, the Computing Resource CR-1,
and through the Interface INTF-1, to the MHS elements.
[0079] The communication links I/O-1, I/O-2 to the MHS 1-104 are
connected so that ALL traffic from the customer premises users
User-1, User-2, . . . User-n must pass through it before being transmitted by
the MHS 1-104 to the Internet 1-102. As a consequence, ALL inbound
and outbound web traffic 2-108 will pass through the MHS.
[0080] ACSD Control and Data Communication with the MHS
[0081] The ACSD communicates with the MHS elements 2-104 through
the interface INTF-1 so that the MHS 1-104 will detect web browser
traffic originating from that customer site that is destined for
Web servers these specific customers normally access. This is
observed by the system 1-104 as IP traffic 2-108 destined to the
well-known TCP port 80.
[0082] The ACSD CONTROL_PROGRAM uses a New Address Detection
module, described below, to capture the source TCP port, the source
IP address and the destination IP address of all outbound IP
traffic requests (i.e., flows where the destination TCP port is 80)
and the time of the destination address request and stores them in
an internal Destination Traffic state table. See table 1,
below.
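A minimal sketch of this capture step (illustrative Python; the packet is modeled as a dict of the header fields named above, and the state table as a dict keyed by flow, both simplifications for the sketch):

```python
def record_outbound_flow(pkt, state_table):
    """Capture the source TCP port, source IP, destination IP and the
    time of the request for outbound flows whose destination TCP port
    is 80, and store them in the Destination Traffic state table."""
    if pkt["dest_port"] != 80:
        return                    # not an outbound web request; ignore
    key = (pkt["src_port"], pkt["dest_port"], pkt["src_ip"], pkt["dest_ip"])
    state_table[key] = pkt["request_time"]   # Address_Request_Time

table = {}
record_outbound_flow({"src_port": 12344, "dest_port": 80,
                      "src_ip": "64.3.4.5", "dest_ip": "128.186.5.2",
                      "request_time": 0.0}, table)
```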
[0083] Stale Address Aging Algorithm
[0084] An Aging_Algorithm (not shown), for aging Destination
Traffic state table entries (see Table 1 below), periodically examines
the entries in the Destination Traffic state table and deletes
those that have become stale, i.e., those whose Address_Request_Time
value indicates their age exceeds some Address_Age_Time_Limit, beyond
which entries are considered stale.
[0085] It is well known that in the case of Web traffic, packet
flows tend to be extremely short lived, so the aging and deletion
of stale flows is important. Otherwise more memory storage space
must be allocated to store otherwise stale entries. To persons
schooled in the art of building systems like an MHS, or a firewall,
or a router, there are numerous techniques, algorithms and methods
that are widely known and available for the creation and
organization of such state tables and for creating such
Aging_Algorithms. Any of a number of such techniques, algorithms
and methods will do. The Table 1 below depicts an example of part
of one such table for the ACSD.
TABLE 1. Internal State Table example: outbound dest ip

Src Port | Dest Port | Src IP | Dest IP | Address_Request_Time
12344 | 80 | 64.3.4.5 | 128.186.5.2 | T1
13425 | 80 | 65.6.7.2 | 193.2.3.4 | T2
10347 | 80 | 64.3.4.5 | 66.125.23.129 | T3
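One possible Aging_Algorithm over a table like Table 1 is sketched below (illustrative Python only; the table is modeled as a dict mapping each flow to its Address_Request_Time in seconds, and the age limit is a hypothetical choice):

```python
def age_out_stale_entries(state_table, now, age_limit):
    """Delete Destination Traffic state table entries whose
    Address_Request_Time shows an age beyond Address_Age_Time_Limit."""
    stale = [flow for flow, t in state_table.items() if now - t > age_limit]
    for flow in stale:
        del state_table[flow]

# Example: the first entry is 600 s old and is aged out
table = {
    (12344, 80, "64.3.4.5", "128.186.5.2"): 100.0,  # Address_Request_Time T1
    (13425, 80, "65.6.7.2", "193.2.3.4"): 650.0,    # T2
}
age_out_stale_entries(table, now=700.0, age_limit=300.0)
```

As the text notes, web flows are short lived, so without such aging the table would grow without bound.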
[0086] Default Seed List of IP Destinations: General
Description.
[0087] The Adapted MHS system 1-104 would generally come to a
User's site from a manufacturer or supplier by having Default Seed
List data installed in the ISP-ACSD unit. Referring to FIG. 6 and
again to FIG. 2, the Default Seed List 600 (DFSL) is a list of IP
addresses 602, each of which is a known active server on the
Internet that a web browser can expect to connect to. A preferred
method of supplying an initial Default Seed List is to store the
Default Seed List data in a storage space allocated on permanent
(non-volatile) media, e.g., PM-1. Other forms of persistent, but
alterable, memory, e.g., a hard disk, EEPROM, Flash Memory and the
like, may also be used.
[0088] The computing resource CR-1 is typically a PC board (or
boards) containing a CPU, memory & chip set that runs a control
program CP-1. The program CP-1 includes a set of control program
modules, listed in Table 1 and described below.
[0089] Program Modules in ISP-ACSD:
[0090] Control Program Modules
[0091] A representative Power_On_Sequence Program Module 500 for
the CP-1 shown in FIG. 4 is shown in FIG. 5A.
[0092] Power-On update of seed lists: Refresh of Adaptive Dynamic
Seed List (Access Cloud IP Destinations). At every power-on or reset
event 502, a copy of the default seed list in permanent storage
(DFSL) is made in dynamic memory (DYSL) 506. A simplified sequence
for the Power_On_Sequence Program Module 500 is shown below:
[0093] 502: Power-On or Reset event;
[0094] 504: MHS Program Starts;
[0095] 506: Copy DFSL from Non-Volatile-Memory-1 to Dynamic Seed
List in Dynamic Memory-1; Copy Adaptation_Complete_Flag from
Non-Volatile-Memory-1 location PM-Flags to Dynamic Memory-1
location DM-Flags.
[0096] 508: end of Initialization (or Reset).
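Steps 502-508 can be sketched as follows (illustrative Python; modeling both memory stores as dicts is an assumption made for the sketch):

```python
def power_on_sequence(permanent_memory):
    """Steps 502-508: at power-on or reset, copy the Default Seed List
    (DFSL) and the Adaptation_Complete_Flag from non-volatile memory
    (PM-1) into dynamic memory (DM-1)."""
    return {
        "DYSL": list(permanent_memory["DFSL"]),          # step 506: copy list
        "DM-Flags": dict(permanent_memory["PM-Flags"]),  # copy flag(s)
    }                                                    # step 508: done

pm = {"DFSL": ["128.186.5.2", "193.2.3.4"],
      "PM-Flags": {"Adaptation_Complete_Flag": "NO"}}
dm = power_on_sequence(pm)
```

Copying (rather than referencing) the DFSL matters: the dynamic list is then free to adapt without disturbing the non-volatile default until an explicit DFSL update.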
[0097] DFSL to DYSL Transfer
[0098] Every time the A-MHS system of FIG. 3 is powered on, the
Power-Up Sequencer module in the ACSD retrieves the latest copy of
the DFSL from the Permanent (non-volatile) Memory and stores a copy
as the Dynamic Seed List into dynamic memory DM-1.
[0099] Continuous Update of DYSL
[0100] As the customer's web traffic is observed, the dynamic memory
list is constantly updated with recently observed traffic, so
that the seed list of IP addresses may eventually disappear,
leaving only the 256 most recently accessed IP addresses in dynamic
memory. This update is done by an Auto Detect & Seed List
Update module 520, one of the modules PRMP in the CP-1 of FIG.
4.
[0101] Auto Det & Seed List Update
[0102] Auto Det & Seed List Update Program Module (FIG. 5B).
[0103] FIG. 5B is a pseudo-code flow-chart for the Auto Det &
Seed List Update program module 520, one of the modules PRMP shown
in FIG. 4. It can also be referred to as Web Traffic Detection (or
New Address Detection) and Seed List Update.
[0104] 502: MHS starts;
[0105] 522: next packet received;
[0106] 524: test if the received packet is an outbound TCP SYN
directed at TCP port 80; if YES, branch to step 528; if NO, branch
to step 526;
[0107] 526 (inbound packet data store operation): [0108] read
packet data from received packet; [0109] store packet data in
specified TCP Flow Table; [0110] branch to step 522;
[0111] 528 (test for RTT table): [0112] is the destination IP
address found in the RTT table?; [0113] if YES branch to step 526;
[0114] if NO branch to step 529;
[0115] 529 (update Seed Lists & flag, initiate RTT measurement):
[0116] add this IP address to the DYSL; [0117] if
Adaptation_Complete_Flag=NO, then: [0118] update the DFSL; [0119]
add this address to the DFSL; [0120] delete one old IP address from
the DFSL; [0121] increment COUNT by 1;
[0122] if COUNT=256, then: set Adaptation_Complete_Flag=YES in both
PM-FLAGS and DM-FLAGS;
[0123] add this IP address to the RTT table;
[0124] branch to 526;
[0125] END of module;
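The packet test and seed-list update of steps 524-529 can be sketched as follows. The class and its data structures are illustrative assumptions (the 256-entry limit follows the text), not the specification's implementation:

```python
# Sketch of Auto Det & Seed List Update (steps 524-529). The 256-entry
# limit follows the text; class and field names are illustrative.
class SeedListUpdater:
    LIMIT = 256

    def __init__(self, dfsl):
        self.dfsl = list(dfsl)      # Default Seed List (persistent copy)
        self.dysl = list(dfsl)      # Dynamic Seed List
        self.rtt_table = {}         # dest ip -> per-ISP RTT samples
        self.count = 0
        self.adaptation_complete = False

    def on_packet(self, outbound, syn, dest_port, dest_ip):
        # Step 524: only outbound TCP SYNs to port 80 are of interest.
        if not (outbound and syn and dest_port == 80):
            return
        # Step 528: destination already in the RTT table -> nothing to learn.
        if dest_ip in self.rtt_table:
            return
        # Step 529: update seed lists and flag, start an RTT table entry.
        self.dysl.append(dest_ip)
        if not self.adaptation_complete:
            self.dfsl.pop(0)            # delete one old IP address
            self.dfsl.append(dest_ip)   # add the new address
            self.count += 1
            if self.count == self.LIMIT:
                self.adaptation_complete = True
        self.rtt_table[dest_ip] = {}    # RTT measurement initiated here
```

Note how only new outbound SYNs to port 80 ever modify the lists; inbound packets and repeat destinations fall through to the ordinary data-store path.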
[0126] Referring to FIG. 5C, there is shown a diagram of the test
packet sequence PROBE SEQ.
[0127] Referring also to FIG. 8A and FIG. 8B, there is shown the
well-known IP and TCP Header formats, which are provided here for
convenient reference in describing operations, record fields and
their values.
[0128] The field definitions for the IP & TCP headers are well
known, but are repeated here for convenience:
[0129] The following abbreviations are used for the different
fields of the TCP and IP header:
TABLE-US-00002 TABLE 2 TCP/IP abbreviations
  ACK field: 32-bit acknowledgement number
  src ip: 32-bit IP source address
  dest ip: 32-bit IP destination address
  src port: 16-bit TCP source port
  dest port: 16-bit TCP destination port
  SYN, ACK, FIN: single-bit fields defined in the TCP Header
[0130] The probe sequence PROBE is a sequence of packets 530, shown
in FIG. 5C, that performs the following steps: [0131] 532: Send SYN;
[0132] 534: wait until SYN ACK is received; [0133] 536: Send FIN
ACK; [0134] 538: wait until FIN ACK is received.
[0135] The values in the Header fields: dest ip, src port, and dest
port are assigned according to the following list:
TABLE-US-00003 TABLE 3 Header field values
  1. send SYN (src ip = X, dest ip = Y, src port = RANDOM, dest port = 80)
  2. receive SYN ACK (src ip = Y, dest ip = X, src port = 80, dest port = RANDOM)
  3. send FIN ACK (src ip = X, dest ip = Y, src port = RANDOM, dest port = 80)
  4. receive FIN ACK (src ip = Y, dest ip = X, src port = 80, dest port = RANDOM)
[0136] Values of X and Y are received by the MHS from the
requesting USER equipment in the usual manner well known in the
art.
[0137] The value for RANDOM is generated by the RND SEL program
module described elsewhere.
[0138] A SYN is sent by setting the single bit SYN field to 1.
[0139] A SYN ACK is sent by setting both single bits SYN and ACK
fields to 1, a FIN is sent by setting the FIN bit to 1, and a FIN
ACK is sent by setting both FIN and ACK bits to 1. In sending out
the initial SYN probe the 32-bit Sequence number in the TCP packet
header is picked as a random 32-bit number by the RND SEL program
module.
[0140] In sending any ACK packet, the ACK field is computed by
adding 1 to the received 32-bit sequence number in the packet
being acknowledged.
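Assuming only the field assignments of Table 3 and the sequence-number rules above, the PROBE exchange can be modeled as plain data. This is a sketch of the expected field values, not a working packet injector:

```python
# Data-only sketch of the PROBE exchange of Table 3 (steps 532-538);
# no packets are actually transmitted. Tuples: (event, src ip, src port,
# dest ip, dest port, sequence number or None).
import random

def build_probe_sequence(x, y):
    """Return the four expected packet field sets for src ip X, dest ip Y."""
    p = random.randint(1024, 65535)    # RANDOM source port (RND SEL module)
    isn = random.getrandbits(32)       # random 32-bit initial sequence number
    return [
        ("send SYN",        x, p, y, 80, isn),
        ("receive SYN ACK", y, 80, x, p, None),
        ("send FIN ACK",    x, p, y, 80, None),
        ("receive FIN ACK", y, 80, x, p, None),
    ]

def ack_number(received_seq):
    # ACK field = received 32-bit sequence number + 1 (modulo 2**32).
    return (received_seq + 1) % 2**32
```

The model makes the symmetry of Table 3 explicit: the two receive steps simply swap the address/port pairs of the corresponding send steps.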
[0141] The probe packet sequence PROBE is sent both during the
Normal_sampling operation and when the Access Cloud State Detector
suspects an ISP-ACCESS CLOUD is DOWN, but the mode of sending is
modified by the HINT_DOWN_DET module described elsewhere.
[0142] For RTT measurement the PROBE packet sequence is sent as an
IP datagram by the ACSD. This improves the reliability of the probe
because 6 packets are sent instead of only 2 as in prior art ICMP
probes. Means for transmitting an IP datagram are well known in the
art.
[0143] The PROBE Sequence Set 530 exchanges 6 data packets. The
prior art ICMP protocol exchanges only REQ & ACK packets.
[0144] These PROBE sequence sets are sent via all possible ISP paths
for each new destination IP address, at the time the user sends a
web browser request to a new destination (address); in other words,
the sampling of round trip time measurements via all possible
ISP paths is done in an event-driven manner, each and every time
the User Equipment sends a new web connection request, and only
if it is a new destination not found in the Web traffic RTT
table.
[0145] It should be clearly understood that the destination IP
addresses are not the same as the ISP addresses for the "ISP paths"
(i.e., the ISP Access Clouds) in this description.
[0146] "All" in this instance means those ISP paths known to the
User Site's Equipment, to clarify this a little further, as noted
above the description of the MHS the ISPs are all listed to the
system, either enumerated by distinct numbers as 1, 2, 3 etc or by
a finite set of unique names. In the case of the latter, the names
are translated to unique internal numbers 1, 2, 3 . . . as is well
known in the art.
[0147] As described elsewhere above, the individual destination IP
addresses are stored in and retrieved from the DYSL by one or
another of numerous well-known means that need not be enumerated
here.
[0148] To understand "ISP Path", refer to FIG. 1, which depicts an
MHS connected to different ISPs via Router-1, Router-2, Router-3
and Router-4. By sending the probe sequence to Router-1, for
example, the path via ISP 1 is selected. For each destination Web
address, a probe sequence is sent via Router-1's destination MAC
address, and therefore traverses the ISP-1 Access Cloud and
eventually reaches the web server owning the web address. The
active web servers then participate in the TCP-based probe
sequence. The likelihood of the servers being active is very high
because the list includes only the most recently used servers.
[0149] Note that the round trip time RTT is the time elapsed in
milliseconds between steps 1 and 2 in the probe sequence PROBE
above, that is, the time elapsed between sending a TCP SYN and
receiving a TCP SYN ACK from the same address.
[0150] This results in the RTT table, shown as the RTT History Time
Table (Table 4) below, where the entry in each ISP column shows the
sampled round trip time in milliseconds.
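An RTT table of this shape, with one sampled value per ISP column, might be represented as below. The values are taken from Table 4 later in this description; the lookup helper is an illustrative addition, not part of the specification:

```python
# Sketch of the RTT table: destination IP -> sampled RTT (ms) per ISP path.
rtt_table = {
    "65.12.3.4": {"ISP 1": 25,  "ISP 2": 35,  "ISP 3": 28, "ISP 4": 41},
    "129.1.3.8": {"ISP 1": 112, "ISP 2": 134, "ISP 3": 45, "ISP 4": 98},
}

def fastest_isp(dest_ip):
    """Return the ISP path with the smallest sampled round trip time."""
    samples = rtt_table[dest_ip]
    return min(samples, key=samples.get)
```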
[0151] FIG. 5D, illustrates a flow chart for a Randomized Source
TCP Port Selection Program Module, referenced in FIG. 4 as one of
the PRMP modules.
[0152] Each new set of probes must use a new random set of source
TCP ports. It is unacceptable to use a fixed set of TCP port
numbers as source port numbers, as this results in failure to
detect an important subset of failures: when an ISP router
gets into a stuck condition because its state tables are full (it
can handle old connections but it cannot add new connections into
its cache).
[0153] FIG. 5D is a pseudo-code flow chart for a program module,
Rnd_Tcp_Sel 540, one of the PRMP modules in the embodiment of the
present invention shown in FIG. 4, that provides a random number
used for each probe set PROBE. Procedure RND_TCP_SEL randomizes TCP
port addresses to ensure ISP-ACSD router caches will not go `stale`
and cause a false UP status to be reported when new requests are
not accepted and caches are full, as in the prior art.
[0154] The RND_TCP_SEL module makes use of TCP flows stored in a
flow_state_table (not shown). The flow_state_table is typically
located in one of the MHS ELEMENTS shown in FIG. 4. The generation,
control and use of TCP flows and flow_state_tables is well
understood in the art.
[0155] The steps of the RND_TCP_SEL module include:
[0156] 542: Choose dest ip (the destination IP address) from the
Dynamic Seed List, DYSL.
[0157] 544: Next, the procedure select src TCP port is called:
[0158] L1: Generate a random 16-bit number P and search the flow
state table using
[0159] the flow: <X,Y,P,Q>
[0160] IF flow <X,Y,P,Q> is found in the flow state table,
then
[0161] SET flow_found=YES;
[0162] GOTO L1 (pick another random 16-bit number and repeat the
search of the flow table); ELSE: SET flow_found=NO;
[0163] SET src port=P;
[0164] END procedure.
[0165] The Procedure 540 generates a random 16-bit number P and
searches the flow state table using the flow: <X,Y,P,Q>. If
the flow <X,Y,P,Q> is found in the flow state table, then
variable flow_found is set equal to YES and the Procedure branches
back to label L1, where it picks another random 16-bit number and
repeats the search of the flow table.
[0166] If variable flow_found is equal to NO, the new RANDOM number
P is safe to use as a source TCP port in the PROBE sequence.
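The procedure above can be sketched directly; the set-based flow_state_table is an illustrative stand-in for the MHS flow table:

```python
# Sketch of Procedure RND_TCP_SEL (540): retry random 16-bit ports until
# the flow <X, Y, P, Q> is not already present in the flow state table.
import random

def rnd_tcp_sel(flow_state_table, x, y, q=80):
    """Return a source TCP port P such that (x, y, P, q) is a new flow."""
    while True:
        p = random.getrandbits(16)                 # label L1: candidate port
        if (x, y, p, q) not in flow_state_table:   # flow_found = NO
            return p                               # safe for the PROBE sequence

flows = {("10.0.0.2", "65.12.3.4", 5000, 80)}      # one existing flow
port = rnd_tcp_sel(flows, "10.0.0.2", "65.12.3.4")
```

Because the flow table holds far fewer than 2^16 flows in practice, the retry loop terminates almost immediately.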
[0167] A TCP flow is typically stored in an internal flow state
table described elsewhere and located typically in one of the
cooperating elements of MHS E-MHS, and looks like: <X,Y,P,Q>.
[0168] The three parameters X=src ip, Y=dest ip, and Q=dest TCP
port=80 are known by the MHS and ACSD prior to the procedure
call.
[0169] The fourth parameter, P=src TCP port, is obtained from the
RND_TCP_SEL Procedure.
[0170] NOTE: The source IP address is any of the active source IP
addresses from the Customer Premises Network that have recently
communicated with any outside Web server.
[0171] FIG. 5E illustrates a flow chart 550 describing another of
the PRMP modules of FIG. 4, the Update_Inbound_Packet_Byte_Count
module.
[0172] At step 552, module 550 receives a packet from the
ISP-Access Cloud and begins the update of the inbound byte counter.
A following step 554 determines the identity of the ISP-n that
transmitted the packet. Next, at step 556, the procedure 550
determines the byte count of the inbound packet, and adds the
packet byte count to the inbound byte counter related to the ISP-n
at step 558.
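Steps 552-558 reduce to a per-ISP accumulator; the dict-based counters here are an illustrative layout, not the specification's storage:

```python
# Sketch of Update_Inbound_Packet_Byte_Count (steps 552-558) for 4 ISPs.
inbound_byte_counter = {0: 0, 1: 0, 2: 0, 3: 0}    # IN(0)..IN(3)

def update_inbound_byte_count(isp_n, packet_bytes):
    """Step 556/558: add this inbound packet's byte count to ISP-n's counter."""
    inbound_byte_counter[isp_n] += len(packet_bytes)

update_inbound_byte_count(1, b"\x00" * 40)   # a 40-byte packet via ISP 1
update_inbound_byte_count(1, b"\x00" * 60)   # a 60-byte packet via ISP 1
```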
[0173] With reference again to FIG. 4 and to FIG. 5E, the A-MHS
continuously monitors Inbound_Byte_count for inbound packets from
the ISP-Access Clouds and stores Inbound_Byte_count data in the
Inbound_Byte_Counter located in Dynamic Memory-1.
[0174] Alternatively, the Inbound_Byte_count may be stored in other
registers or memory locations. For example, it may be stored in
Inbound_Byte_Counter memory locations or registers allocated within
the Cooperating Elements of MHS indicated in FIG. 4.
[0175] In the method of the present invention, no packets leave the
MHS that would otherwise contribute extra traffic to what
might already be a busy network when all the ISP Access Clouds are
reliably working, i.e., receiving and sending Internet traffic
to/from the MHS. Only if there is a suspect ISP will the ACSD
initiate extra traffic to reliably detect the UP/DOWN status for
the suspect ISP-Access Cloud.
[0176] Instead, the Update_Inbound_Packet_Byte_Count counters are
maintained for each ISP. These counters will frequently already be
part of typical MHS systems. In such cases, they can be used here
as part of the COOPERATING ELEMENTS of FIG. 4.
[0177] If the Update_Inbound_Packet_Byte_Count counters do not
exist in the MHS with which the ACSD device cooperates, then the
counters will alternatively be implemented in memory DM-1 as shown
in FIG. 4 or as additional registers (not shown).
[0178] If there are 4 ISPs 0, 1, 2 and 3, then there will be 4
Inbound Byte counters maintained, for example: IN(0), IN(1), IN(2),
IN(3).
[0179] Every Internet packet communicating between any ISP-Access
Cloud and the CPN transits the A-MHS. Cooperating with the ACSD,
the A-MHS examines the byte length of each packet. For inbound
packets, the A-MHS determines the Inbound_Byte_count length and
from which ISP-n it is received, and calls the
Update_Inbound_Packet_Byte_Count module.
[0180] Update_Inbound_Packet_Byte_Count then adds the
Inbound_Byte_count to the corresponding ISP-n inbound_byte_counter
and exits until the next inbound packet is received.
[0181] FIG. 5F: HINT_DOWN_DET Module
[0182] The HINT_DOWN_DET module 560 is shown in FIG. 5F and relies
on the Inbound_Byte_count data in the inbound_byte_counter to
decide if an ISP-Access Cloud is suspect.
[0183] The ACSD Control_Program uses the
Update_Inbound_Packet_Byte_Count program module and the
HINT_DOWN_DET module to continuously and reliably verify the
UP/DOWN status of each of the ISP-n Access Clouds with minimal
invasive loading of Internet traffic.
[0184] The HINT_DOWN_DET procedure begins with a normal_sampling
step for sampling the inbound_byte_counters when there is no hint
of an unreliable ISP-n Access Cloud. The Normal_sampling step
periodically examines each inbound_byte_counter for each ISP-n.
[0185] Referring to FIG. 5F, the separate HINT_DOWN_DET process in
the A-MHS normally samples all inbound_byte_counters periodically,
e.g., once per second at step 561, then branches to step 562. After
sampling the byte count, at any time prior to some predetermined
interval (e.g., an interval equal to or greater than a variable
TIMEOUT), e.g., every 3 seconds, step 562 returns to the normal
sampling step 561. After the TIMEOUT expires, step 562 branches to
step 564, where the module HINT_DOWN_DET checks the inbound byte
counters, for each ISP, to see if there is a difference in the byte
count between the last two entries.
[0186] For example, letting k represent time in seconds:
Compute Inbound_Byte_count(k+3)-Inbound_Byte_count(k)=Count_Difference.
[0187] If the Count_Difference is not zero at step 564, then there
has been Internet traffic activity coming from the ISP-n. This is a
good indication that the ISP-n Access Cloud is working and the
status is UP; step 564 will branch to step 566, which sets Blackout
Hint=zero, and returns to normal sampling at step 561.
[0188] If the Count Difference is zero, then this is a hint that
the corresponding ISP Access Cloud might be in state DOWN.
[0189] At step 564 the HINT_DOWN_DET module sets the variable
Blackout Hint to YES, and branches to step 568, where it starts a
PROBE sequence to more reliably verify the suspect Access Cloud
status.
TABLE-US-00004
BEGIN PROCEDURE:
Step 561 Normal_sampling:
  For n=1 to N: DO Sample ISP-n inbound_byte_counter for Inbound_Byte_count (Count(k));
  advance to Consecutive Interval Count Step;
Step 562 Consecutive Interval Count Step:
  3 consecutive 1-second intervals? If YES goto Count Compare Step; If NO return to Normal_sampling;
Step 564 Count Compare Step:
  For ISP-n, DO: IF Count(k+3) - Count(k) = 0 THEN Blackout Hint = YES; goto AGGRESSIVE_PROBE step;
Step 566:
  ELSE return to Normal_sampling;
Step 568 AGGRESSIVE_PROBE step:
  set TIMER; send out Np probe sets via the ISP-n suspected of blackout;
  SET alarm signal Blackout Hint = YES; advance to TIMER=TIMEOUT compare step;
Step 570 TIMER=TIMEOUT compare step:
  TIMER = TIMEOUT and all Np probe sets fail? If NO goto CLEAR HINT step 574; If YES goto DECLARE DOWN step 572;
Step 572 DECLARE DOWN step:
  Declare the ISP Link State as DOWN; return to Normal_sampling step;
Step 574 CLEAR HINT step:
  Clear the Blackout Hint, and return to normal sampling 561;
END
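The Count Compare step (564) is the heart of the procedure; a sketch assuming one byte-count sample per second per ISP (function and parameter names are illustrative):

```python
# Sketch of the Count Compare step (564): a zero difference across the
# 3-second window raises the Blackout Hint. Names are illustrative.
def blackout_hint(counter_samples, window=3):
    """counter_samples: inbound byte counts, one per 1-second interval.

    Returns True when Count(k+window) - Count(k) == 0, i.e. no inbound
    traffic arrived during the last `window` consecutive intervals.
    """
    if len(counter_samples) < window + 1:
        return False                  # not enough consecutive samples yet
    return counter_samples[-1] - counter_samples[-1 - window] == 0
```

Since the counters only ever increase, a zero difference across the window is exactly the "no inbound traffic" condition that triggers the Aggressive Probe.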
[0190] At the end of the Byte_count_period, for each of the ISPs
from 1 to N, the Inbound_Byte_count at the end of the period is
compared to the Inbound_Byte_count
[0191] at the beginning of the period. If the difference is not
zero for all the ISPs, this is a good indication that they are all
working and can be relied on. In that case, normal sampling is
continued.
[0192] The Hint Down Detect method of the present invention does
not introduce any traffic that is not already there as long as all
ISP Access Clouds are indicating regular
[0193] traffic by continuously increasing Inbound_Byte_Counts.
This is in contrast to prior art status detection methods that
require extra Internet traffic to frequently and regularly probe
each Access Cloud.
[0194] If the difference is zero for any one of the ISP-n Access
Clouds, this is a hint that the ISP-n may be down, since it is
unlikely that there would be no activity for such a long period.
[0195] When the Inbound_Byte_count difference for an ISP-n is zero,
the process branches to the Aggressive_Probe step. In the
AGGRESSIVE_PROBE step, the ACSD causes the MHS to send out Np probe
sets via the ISP-n suspected of blackout; starts a timer TIMER; and
sets an alarm signal Blackout_Hint=YES.
[0196] At the next step, when the TIMER reaches a (predetermined)
wait delay TIMEOUT, the ACSD checks the status of every single
probe set of the Np probe sets. If every single probe set of that
group of Np probes failed then the ACSD sets the ISP Link State
status for that ISP as DOWN (e.g., Set ISP-n_Link_State=Down) and
returns to the Normal_Sampling step.
[0197] The primary advantage of this process in the present
invention is that it only adds extra traffic to the Internet
traffic flow when there is a Hint Down detection. This makes the
A-MHS system more efficient than prior art systems in terms of
Internet traffic flow without sacrificing reliability.
[0198] If not every probe set of the Np group failed, the
HINT_DOWN_DET process branches instead to the step where the ACSD
clears Blackout_Hint and returns to Normal_Sampling.
[0199] Adaptive Seed List
[0200] FIG. 6 illustrates an example of an Adaptive Seed List 600
(Default or Dynamic) for an embodiment of the present invention
such as the ISP-ACSD of FIG. 4.
[0201] This example of an Adaptive DFSL consists of 256 Internet
ISP addresses 602. The DFSL could be larger or smaller than 256
depending on factors of initial and
[0202] operational cost and convenience for the maker and
end-user.
[0203] When the manufacturer first configures the ACSD system, the
permanent memory PM-1 of FIG. 1 is loaded with an INITIAL DFSL
before shipping to the end-user. The INITIAL DFSL would be
populated with a collection of Internet ISP addresses that are well
known and likely to be used by most end-users. For a given User
operating environment, any of a large number of popular sites could
be employed.
[0204] Adaptive Replacement of DEST IP Addresses
[0205] FIG. 7A and FIG. 7B illustrate the adaptive replacement of
old dest ip addresses with new ones in the DFSL and the Dynamic
Seed List for the ISP-ACSD of FIG. 4.
[0206] At initial power-on, the initial default list (DFSL) is
copied from Non-Volatile-Memory-1 into the Dynamic Seed List in
Dynamic Memory-1.
[0207] When a new destination Web address (new dest ip address) is
learned by the ACSD, it replaces one of the old dest ip addresses
stored in the Dynamic Seed List in Dynamic Memory-1.
[0208] If Adaptation_Complete_Flag is set to NO, the same old
dest ip address in the DFSL in Non-Volatile-Memory-1 is also
updated with the new dest ip address.
[0209] If Adaptation_Complete_Flag is YES, the DFSL is not updated;
otherwise it too is updated as shown here.
[0210] When 256 new entries have filled the Dynamic List and have
been used to update the Default List, the
Adaptation_Complete_Flag is set to YES.
[0211] When new destination IP addresses are observed, they are
used to replace the `old` addresses in the default `seed list` in
permanent storage and in dynamic memory. Because most sites access
web servers frequently, over time the default seed list becomes
replaced with a new seed list that is adapted to a specific
customer site. As soon as 256 specific new IP addresses are
learned in this way, the seed list becomes fixed again, and is
stored in permanent storage.
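Under the assumption that both lists behave as fixed-size FIFOs, the replacement lifecycle of FIG. 7A and FIG. 7B might look like this sketch (names and the deque representation are illustrative):

```python
# Sketch of the adaptive replacement of FIG. 7A/7B: the DYSL is always
# updated, the DFSL only until the limit of replacements is reached,
# after which the seed list is frozen (Adaptation_Complete_Flag = YES).
from collections import deque

def learn_address(dysl, dfsl, state, new_ip, limit=256):
    """Replace the oldest dest ip in the DYSL (and DFSL while adapting)."""
    dysl.popleft()
    dysl.append(new_ip)
    if not state["adaptation_complete"]:
        dfsl.popleft()
        dfsl.append(new_ip)
        state["count"] += 1
        if state["count"] == limit:
            state["adaptation_complete"] = True   # DFSL now site-specific

dysl = deque(["a.old", "b.old"])                  # toy 2-entry lists
dfsl = deque(dysl)
state = {"count": 0, "adaptation_complete": False}
learn_address(dysl, dfsl, state, "67.33.124.23", limit=2)
```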
[0212] FIG. 8A & 8B show the conventional IP & TCP Header
format and are repeated here for convenient reference.
TABLE-US-00005 TABLE 4 RTT History Time Table
  Destination IP address  ISP 1  ISP 2  ISP 3  ISP 4  . . .  ISP(m)
  65.12.3.4                 25     35     28     41
  129.1.3.8                112    134     45     98
  67.123.54.2               32     28     31     43
  68.34.12.55               45     51     67     29
  67.33.124.23              55     34     28    112
[0213] The length of the RTT table is implementation dependent and
not germane to the description of the present invention. It could
be up to 4,000 entries or even more, if desired.
[0214] Round Trip Time History Table & Update.
[0215] The means for implementation of the RTT table in the ACSD
invention is well known in the art and needs no further
explanation other than that already given here.
[0216] For the dynamic memory list, and for each destination IP
address, the probe sequence PROBE is sent and the round trip time
RTT sampled (in milliseconds) by measuring the time elapsed between
TCP SYN sent and TCP SYN ACK received.
[0217] ISP Access Cloud Down Hint Detection
[0218] A common way to detect link down is to constantly send out
probe traffic. This is to be avoided, as the user is paying for
useful bandwidth that should not be overused by a link-down
detection technique.
[0219] Instead, a hint that the link may be down is detected in the
following method. The system implements counters of bytes seen,
to and from each ISP router. If there are 2 ISP routers, 1 and 2,
let In-1 and In-2 be the inbound byte counts, and let O-1 and O-2
be the outbound byte counts. Sample these values every 1 second. If
in 3 consecutive 1-second intervals these counters do not
increment, this suggests inactivity, and inactivity might be a
result of a link failure. Declare a Blackout Hint state (set
Blackout Hint State=1) for the ISP identifier.
[0220] Aggressive Probe Method
[0221] Upon detection of a Blackout Hint (Blackout Hint State=1),
the system transitions to Aggressive Probes (e.g., an Aggressive
Probe sub-routine) of the ISP (Access Port) in question; let us say
its identifier is ISPX. Then send up to N (max of N=256) probes
(Probe Sequence Sets, acronym PSS) to different destination IP
addresses. N is set by default to 10. For IP address i, i=1 to N,
find the max RTT for ISPX in this set, i.e., max RTT=max of RTT(i),
i=1 to N. Set Timeout=3 times max RTT.
[0222] As an example, in Table 4, if ISPX was actually ISP 1, the
max RTT is 112 and the Timeout is set to 336 milliseconds.
[0223] Send out all N=10 probes, one after another. As a general
rule, the spacing between these needs to be reasonably short, and
all N of them should complete in at most 2 seconds.
[0224] Now, start the timer and wait for the Timeout seconds to
expire. Check for the matching SYN ACK responses (SYN packets
are sent, SYN ACK packets are expected in response from the
target).
[0225] If, and only if, out of N probes, zero SYN ACK responses
were received, then we can declare ISPX Link State is DOWN (SET
ISPX Link State=DOWN).
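Assuming the RTT samples of Table 4 and the all-probes-fail rule above, the timeout computation and DOWN decision can be sketched (names are illustrative):

```python
# Sketch of the Aggressive Probe decision: Timeout = 3 x max RTT for ISPX,
# and DOWN only if zero of the N probes drew a SYN ACK response.
def aggressive_probe_verdict(rtt_samples_ms, syn_ack_received):
    """rtt_samples_ms: prior RTTs via ISPX; syn_ack_received: bool per probe."""
    timeout_ms = 3 * max(rtt_samples_ms)
    is_down = not any(syn_ack_received)   # DOWN iff no probe was answered
    return timeout_ms, is_down

# ISP 1 column of the RTT History Table: max RTT = 112 ms -> Timeout = 336 ms.
timeout, down = aggressive_probe_verdict([25, 112, 32, 45, 55], [False] * 10)
```

A single SYN ACK from any of the N destinations is enough to clear the hint, which is what makes the all-fail rule conservative.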
[0226] Mathematical Proof of Reliability
[0227] The following is a mathematical proof showing improved
reliability for embodiments of the present A-MHS invention.
[0228] We use Chebyshev's Inequality:
P(|X-mean|>k*STDDEV)<1/(square of k)
[0229] Here, X is the actual time that it takes for the probe
sequence to elicit a response from the probed address.
[0230] Since there are (at least) 10 probes, there will be an equal
number of (at least) 10 random variables X(1), . . . , X(10).
[0231] The probability of a false detection of ISP Access Cloud
failure is the probability that all the probe samples returned
unusually late, i.e. all of them return only after 3 times Worst
Case RTT has elapsed.
[0232] Note that for each of the 10 probe destinations, the RTT
sample size is just 1; therefore, sample value=Mean.
[0233] For the standard deviation, since we have no other
information, we have to estimate it as a reasonable positive
number, in order to make use of Chebyshev's inequality
correctly.
[0234] Hence let us estimate the Std Deviation=Sample Value. It has
been found from a sampling of observations of these RTT values that
they tend to have a standard deviation that is much smaller than
any individual sample value, so this estimate is a
conservative one.
[0235] Hence, for each X(i), let u(i) be the corresponding mean
and standard deviation.
[0236] Let Worst Case RTT=u.
[0237] From
P(X-u>k*u)<1/(square of k)
we derive
P(X>(k+1)*u)<1/(square of k)
[0238] Also, since each u(i)<u, we obtain
P(X(i)>(k+1)*u)<P(X(i)>(k+1)*u(i))<1/(square of k)
[0239] Since we set our timeout for all 10 probes to be 3 times
Worst Case RTT, the (k+1) factor=3, and hence k=2.
[0240] The probability that a single probe timed out falsely (i.e.
the probe actually returned after 3 times worst case RTT) is less
than 1/4.
[0241] Therefore, the probability that all 10 probes timed out
falsely is less than
(1/4)**10 (i.e., 1/4 to the power of 10), which is about 0.00000095.
[0242] Therefore, the reliability of the method is greater than
(1-0.00000095)=0.99999905.
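The arithmetic of the proof can be checked numerically; Python is used here only as a calculator:

```python
# Numeric check of the reliability bound: Timeout = 3 x worst-case RTT gives
# (k+1) = 3, so k = 2 in Chebyshev's inequality.
k = 2
single_false_timeout = 1 / k**2                  # per-probe bound: 1/4
all_false_timeout = single_false_timeout ** 10   # all 10 probes late
reliability = 1 - all_false_timeout
```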
[0243] Note that although preferred embodiments of the present
invention can be described as including a single computing
processor system and specific program modules for enabling the
various features and benefits of the present invention, the
invention is understood to apply to Adaptive MHS systems and
adaptive ACSD units that include either single or multi-processor
computing modules. It is also understood that the functions and
features of the various Program Modules of the present invention
can also be implemented in hard-wired circuitry, e.g., large-scale
FPGAs, ASICs and the like.
* * * * *