U.S. patent number 10,848,552 [Application Number 15/994,073] was granted by the patent office on 2020-11-24 for determining whether to perform address translation to forward a service request or deny a service request based on blocked service attributes in an ip table in a container-based computing cluster management system.
This patent grant is currently assigned to Hewlett Packard Enterprise Development LP. The grantee listed for this patent is Hewlett Packard Enterprise Development LP. Invention is credited to Koteswara Rao Kelam, Sangeeta Maurya, John Joseph McCann, III, Syed Ahmed Mohiuddin Peerzade, Selvakumar Sanmugam, Pradeep Sathasivam, Praveen Kumar Shimoga Manjunatha, Sonu Sudhakaran, Krishna Mouli Tankala, Ravikumar Vallabhu.
United States Patent 10,848,552
Shimoga Manjunatha, et al.
November 24, 2020

Determining whether to perform address translation to forward a service request or deny a service request based on blocked service attributes in an IP table in a container-based computing cluster management system
Abstract
In an example, a container cluster management system includes a
cluster manager providing access to services provided by containers
within a container cluster and a plurality of nodes. Each node has
access to an IP table, and is to forward a service request for a
service received via the cluster manager to at least one container
sub-cluster by translating a destination address of the service
request to an IP address of a container sub-cluster. At least one
of the nodes comprises a proxy manager to manage an IP table of the node, and a service firewall to add a service-specific rule to the IP table.
Inventors: Shimoga Manjunatha; Praveen Kumar (Bangalore, Karnataka, IN), Sudhakaran; Sonu (Bangalore, Karnataka, IN), Sanmugam; Selvakumar (Bangalore, Karnataka, IN), Tankala; Krishna Mouli (Bangalore, Karnataka, IN), Sathasivam; Pradeep (Bangalore, Karnataka, IN), McCann, III; John Joseph (Fort Collins, CO), Maurya; Sangeeta (Bangalore, Karnataka, IN), Kelam; Koteswara Rao (Bangalore, Karnataka, IN), Mohiuddin Peerzade; Syed Ahmed (Bangalore, Karnataka, IN), Vallabhu; Ravikumar (Bangalore, Karnataka, IN)
Applicant: Hewlett Packard Enterprise Development LP (Houston, TX, US)
Assignee: Hewlett Packard Enterprise Development LP (Houston, TX)
Family ID: 1000005204986
Appl. No.: 15/994,073
Filed: May 31, 2018
Prior Publication Data: US 20190306231 A1, published Oct 3, 2019
Foreign Application Priority Data: Mar 29, 2018 (IN) 201841012010
Current U.S. Class: 1/1
Current CPC Class: H04L 61/2007 (20130101); H04L 61/255 (20130101); H04L 67/16 (20130101); G06F 9/5083 (20130101); H04L 63/0281 (20130101); H04L 67/1031 (20130101); H04L 63/0263 (20130101); H04L 67/1034 (20130101); H04L 67/32 (20130101)
Current International Class: H04L 29/08 (20060101); H04L 29/12 (20060101); G06F 9/50 (20060101); H04L 29/06 (20060101)
References Cited
U.S. Patent Documents
Foreign Patent Documents
CN 103957237, Jul 2014
CN 106888254, Jun 2017
WO 2012/057956, May 2012
Other References
Amazon Web Services, Inc., "Elastic Load Balancing," 2017, pp. 1-7. cited by applicant.
Armstrong, D., et al., "Towards Energy Aware Cloud Computing Application Construction," Jun. 23, 2017, pp. 1-23. cited by applicant.
Banerjee, P. K., "Deploying a Service on a Kubernetes Cluster," Mar. 8, 2016, pp. 1-10. cited by applicant.
Bhatia, J., "A Dynamic Model for Load Balancing in Cloud Infrastructure," NIRMA University Journal of Engineering and Technology, Jan.-Jun. 2015, pp. 15-19, vol. 4, No. 1. cited by applicant.
F5 Networks, Inc., "Load Balancing 101: Nuts and Bolts," White Paper, May 10, 2017, pp. 1-8. cited by applicant.
Kabar, N., "Docker Reference Architecture: Service Discovery and Load Balancing with Docker Universal Control Plane (UCP 1.1)," 2017, pp. 1-16. cited by applicant.
OpenShift, "Application Health," 2017, pp. 1-2 [online], retrieved from the Internet on Dec. 5, 2017 at URL: <docs.openshift.com/container-platform/3.4/dev_guide/application_health.html>. cited by applicant.
OpenShift, "Using a Service External IP to Get Traffic into the Cluster," 2017, pp. 1-7 [online], retrieved from the Internet on Dec. 5, 2017 at URL: <docs.openshift.com/container-platform/3.4/dev_guide/expose_service/expose_internal_ip_service.html>. cited by applicant.
Scholes, M., "Deploy an App Into Kubernetes Using Advanced Application Services," Jun. 22, 2017, pp. 1-11. cited by applicant.
Slideshare.net, "Kubernetes and Software Load Balancers," Mar. 24, 2017, pp. 1-10 [online], retrieved from the Internet on Dec. 5, 2017 at URL: <slideshare.net/haproxytech/kubernetes-and-software-load-balancers-73598367>. cited by applicant.
OpenStack, "Load-Balancing Policy," Dec. 4, 2017, pp. 1-6. cited by applicant.
Total Uptime, "Load Balancing and Failover Between AWS Regions," Sep. 19, 2015, pp. 1-3 [online], Total Uptime Technologies, LLC, retrieved from the Internet on Dec. 5, 2017 at URL: <totaluptime.com/load-balancing-and-failover-between-aws-regions/>. cited by applicant.
Churchman, M., "Load-Balancing in Kubernetes," Aug. 14, 2017, pp. 1-5, retrieved from the Internet on Apr. 20, 2020 at URL: <web.archive.org/web/20180327002500/http://rancher.com/load-balancing-in-kubernetes/>. cited by applicant.
Coles et al., "Rapid Node Reallocation Between Virtual Clusters for Data Intensive Utility Computing," 2006 IEEE International Conference on Cluster Computing, Sep. 1, 2006, 10 pages. cited by applicant.
European Search Report and Search Opinion received for EP Application No. 19165945.7, dated Jul. 8, 2019, 11 pages. cited by applicant.
European Search Report and Search Opinion received for EP Application No. 19165946.5, dated Jul. 10, 2019, 10 pages. cited by applicant.
Primary Examiner: Murray; Daniel C.
Attorney, Agent or Firm: Hewlett Packard Enterprise Patent
Department
Claims
The invention claimed is:
1. A method comprising: receiving, at a node executing on a
hardware processor in a container-based computing cluster, a
request for a service; determining, by the node, if the request for
the service comprises blocked service attributes in an IP table of
the node; in response to the request not comprising blocked service
attributes, performing, by the node, an address translation to
forward the request to a container sub-cluster of the
container-based computing cluster; and in response to the request
comprising blocked service attributes, denying the request by the
node.
2. A method according to claim 1, further comprising advertising at
least one of an availability and an accessibility of an IP address
of a service to a network.
3. A method according to claim 1, further comprising: adding, by a
service firewall of the node, service-specific rules to the IP
table.
4. A method according to claim 3, further comprising establishing a
cluster manager and a plurality of nodes to provide the
container-based computing cluster; detecting a service creation event; and adding, by the node, service-specific rules to the IP table following detection of the service creation event.
5. A method according to claim 3, wherein the service-specific
rules restrict address translation to requests specifying a single predetermined port and/or a single predetermined protocol.
6. The method according to claim 3, wherein the service-specific rules added to the IP table include a deny rule for accessing the service using a first protocol or a first port.
7. A method according to claim 1, further comprising, in an event
of a failure of a first node, reassigning an IP address of the
first node to a second node by updating a mapping table associating
a virtual router associated with the IP address such that the IP
address is associated with the second node.
8. A method according to claim 7, further comprising, in an event
that the first node resumes service, reassigning the IP address
from the second node to the first node.
9. The method according to claim 1, further comprising monitoring,
by a service monitor of the node, for at least one external
service; and updating, by a service firewall of the node, the IP
table in response to detection of an external service by the
service monitor.
10. A non-transitory machine readable medium storing instructions
that, when executed by a processor on which a node of a
container-based computing cluster is executing, cause the node to:
receive a request for a service; determine if the request for the
service comprises blocked service attributes in an IP table of the
node; respond to the request not comprising blocked service
attributes by performing an address translation to forward the
request to a container sub-cluster of the container-based computing
cluster; and respond to the request comprising blocked service
attributes by denying the request.
11. The non-transitory machine readable medium of claim 10 storing
instructions that, when executed, cause the node to advertise at
least one of an availability and an accessibility of an IP address
of a service to a network.
12. The non-transitory machine readable medium of claim 10 storing
instructions that, when executed, cause the node to add, by a
service firewall, service-specific rules to the IP table.
13. The non-transitory machine readable medium of claim 12, wherein
the service-specific rules restrict address translation to requests
specifying a single predetermined port or a single predetermined
protocol.
14. The non-transitory machine readable medium of claim 10 storing
instructions that, when executed, cause the node to respond to a
failure of a first node by reassigning an IP address of the first
node to a second node by updating a mapping table associating a
virtual router associated with the IP address such that the IP
address is associated with the second node.
15. The non-transitory machine readable medium of claim 14 storing
instructions that, when executed, cause the node to respond to the
first node resuming service by reassigning the IP address from the
second node to the first node.
16. A system comprising: a processing resource; and a
non-transitory machine readable medium storing instructions that,
when executed by the processing resource, cause the processing
resource to: receive a request for a service, determine if the
request for the service comprises blocked service attributes in an
IP table of the node, respond to the request not comprising blocked
service attributes by performing an address translation to forward
the request to a container sub-cluster of the container-based
computing cluster, and respond to the request comprising blocked
service attributes by denying the request.
17. The system of claim 16, wherein the non-transitory machine
readable medium stores instructions that, when executed, cause the
processing resource to advertise at least one of an availability
and an accessibility of an IP address of a service to a
network.
18. The system of claim 16, wherein the non-transitory machine
readable medium stores instructions that, when executed, cause the
processing resource to add, by a service firewall, service-specific
rules to the IP table.
19. The system of claim 18, wherein the service-specific rules
restrict address translation to requests specifying a single
predetermined port or a single predetermined protocol.
Description
BACKGROUND
Containerized computing systems provide a readily scalable
architecture for the individual deployment of a plurality of
programs which may utilise the same underlying resource.
BRIEF DESCRIPTION OF DRAWINGS
Non-limiting examples will now be described with reference to the
accompanying drawings, in which:
FIG. 1 is a block diagram of an example of a containerized
computing management system;
FIG. 2 is a flow chart of an example of a method of managing a
containerized computing system;
FIG. 3 is a block diagram of an example of a worker node of a
containerized computing system;
FIG. 4 is a block diagram of another example of a worker node of a
containerized computing system;
FIG. 5 is a flow chart of another example of a method of managing a
containerized computing system;
FIG. 6 is a flow chart of an example of a method of reassigning IP
addresses in a containerized computing system; and
FIG. 7 is a block diagram of another example of a containerized
computing management system.
DETAILED DESCRIPTION
The following discussion is directed to various examples of the
disclosure. The examples disclosed herein should not be
interpreted, or otherwise used, as limiting the scope of the
disclosure, including the claims. In addition, the following
description has broad application, and the discussion of any
example is meant only to be descriptive of that example, and not
intended to intimate that the scope of the disclosure, including
the claims, is limited to that example. Throughout the present
disclosure, the terms "a" and "an" are intended to denote at least
one of a particular element. In addition, as used herein, the term
"includes" means includes but not limited to. The term "based on"
means based at least in part on.
Some computing systems employ `containerization`. Containerization
can take place at the operating system level. In some examples,
mutually isolated computing instances, known as containers (or in
some examples, by other terms such as virtualisation engines or
partitions), operate as separate computers from the point of view
of programs deployed thereon. While a deployed program may utilise,
and be aware of, the resources of its container, it may generally
be unaware of the resources of any other container, even where an
underlying physical resource is shared.
Thus, a computing resource such as a computer, a server, a memory
or the like, may have part of its resources allocated to one
container and another part allocated to another. Programs running
within containers (and in some examples, there may be several
programs running within each container) have access only to the
resources allocated to the container. Such computing resources
allow for ease of scalability and accessibility of the same
underlying resource by mutually distrusting entities with little
additional overhead. An example of a container manager and deployment system is Kubernetes.
Solutions for load balancing in containerized computing systems
include plugins utilising elastic load balancers to provide
external access for containerized services. Such plugins are
specialised, advanced features which are specifically tailored to a
particular implementation/service provider. Moreover, such
solutions may introduce latency into the deployment of
containerized services.
In examples described herein, a processing resource may include,
for example, one processing resource or multiple processing
resources included in a single computing device or distributed
across multiple computing devices. As used herein, a "processing
resource" may be at least one of a central processing unit (CPU), a
semiconductor-based microprocessor, a graphics processing unit
(GPU), a field-programmable gate array (FPGA) configured to
retrieve and execute instructions, other electronic circuitry
suitable for the retrieval and execution of instructions stored on
a machine-readable storage medium, or a combination thereof. In
examples described herein, entities may communicate with one
another via direct connections, via one or more computer networks,
or a combination thereof. In examples described herein, a computer
network may include, for example, a local area network (LAN), a
virtual LAN (VLAN), a wireless local area network (WLAN), a virtual
private network (VPN), the Internet, or the like, or a combination
thereof.
In examples described herein, a "node" entity is a virtualised
processing resource, which may run on all or part of a computing
device, such as a server, storage array, storage device, desktop or
laptop computer, switch, router, or any other processing device or
equipment including a processing resource.
FIG. 1 is an example of a container cluster management system 100
comprising a cluster manager 102 and a plurality of nodes 104a-e.
In use of the system 100, the cluster manager 102 provides access
to services provided by containers 110.
Each node 104 of the plurality of nodes 104a-e has access to an IP
table. The IP table may, for example be stored in a storage
resource such as a volatile or non-volatile memory which the node
104 can access. In some examples, a node 104 may be provided as a
virtual machine on a processing device and the IP table may be
stored in a storage resource of the processing device.
At least one (and in the illustrated example, each) node 104a-e
comprises a proxy manager 106a-e and a service firewall 108a-e.
Although shown separately herein, a service firewall 108 may
constitute part of a proxy manager 106. In use of the system 100,
each proxy manager 106 manages the IP table of its node and each
service firewall 108 adds service-specific rules to the IP table of
its node. While each node is shown as comprising a service firewall
108, this may not be the case in all examples. In some examples,
some nodes of the system 100 may comprise a service firewall 108
whereas other nodes of the system 100 may not.
A service may comprise a predetermined set of `pods`, where a pod
is a logical host of a set of containers or, expressed another way,
a pod comprises a sub-cluster of related containers. For example, a
pod's containers may be co-located and co-scheduled, and run in a
shared context. Containers within a pod may share an IP address
and/or port space, and may be able to communicate with one another
(whereas, generally, containers in different pods may have distinct
IP addresses and are not typically in direct communication with one
another, instead communicating via Pod IP addresses and the like).
Applications deployed within a pod may have access to shared
`volumes`, which are usually directories, in some examples holding
data. Such volumes may be logically considered to be part of a pod,
and may have the same life cycle as a pod.
To consider a particular example, a pod may comprise frontend and
backend containers, where the frontend containers may provide user
interfaces and the like and the backend containers may provide
databases, data processing and the like. The containers may work
together to provide a service.
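By way of a hedged illustration only (the manifest below is a minimal sketch; the pod name, container names and images are assumptions, not taken from this disclosure), such a two-container pod might be declared in a Kubernetes manifest and created as follows:

cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: example-pod            # hypothetical pod providing one service
spec:
  containers:
  - name: frontend             # serves the user interface
    image: nginx:1.21
    ports:
    - containerPort: 80
  - name: backend              # provides the database/data processing
    image: example/backend:1.0 # hypothetical image name
EOF

Both containers then share the pod's IP address and port space, consistent with the description above.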
A pod (as well as an individual container) may be a temporary
configuration. Pods may be created, assigned a unique ID, and
scheduled to nodes 104 where they remain until termination
(according to restart policy) or deletion. In some examples, if a
node 104 fails, the pods scheduled to that node 104 may be
scheduled for deletion, after a timeout period.
In use of the system 100, a node 104 forwards service requests for
a service received via the cluster manager 102 to at least one
container sub-cluster by translating the destination address of the
service request to an IP address of a container sub-cluster (which
may comprise one or more pods). For example, this may utilise
Destination Network Address Translation (DNAT) and redirect the
incoming traffic to the pod or pods which make up the service
identified by the IP address. In some examples, the pods have
different IP addresses and, where there is more than one pod, this
redirection may be intended to forward traffic to a given pod with
a predetermined probability (e.g. an equal probability). A pod's
reply may be routed back to a service IP address, i.e. the node
104, and then forwarded thereby to a client.
In this example, the proxy manager 106 manages the IP table of the
node and the service firewall 108 adds at least one
service-specific rule to the IP table.
This allows load balancing of services through pre-routing Destination Network Address Translation (DNAT) rules. In some
examples, the at least one rule may comprise a `deny` rule. A deny
rule may prevent access to a service via any port and/or using any
protocol other than an expected port/protocol. For example, a
service may be accessible using an IP address and at least one of a
specified port and/or a specified protocol. A deny rule may be
configured so as to prevent user access to ports other than the
specified port, even if the request comprises the `correct` IP
address of the service. Another rule may be provided for protocols
other than the specified protocol (for example, User Datagram
Protocol (udp), or Transmission Control Protocol (tcp), or any
other network protocol). There may be a single specified protocol
and/or port for each service.
In some examples, a service may be detected on creation, or may be
detected on making its first request to a node 104. In some
examples, whenever a service is created, at least one address
translation rule (e.g. a DNAT rule, which may comprise a rule for
carrying out an address translation) and at least one deny rule may
be added to the IP table.
Such a solution de-couples load balancing from a provider (e.g. a
cloud provider), and can be handled within the cluster manager 102.
The IP table may be a Linux IP table, which may increase the
portability and ease of scalability of the solution.
There may be minimal lag in implementing the rules; updating the IP
table is an efficient process and therefore the solution may be
lightweight and easily maintainable. In some examples, deny rules
may be structured so as to deny access to a service via any port
other than a predetermined port (in some examples, a single predetermined port). In such an example, security may be enhanced. In some examples, deny rules may be structured so as to deny access to a service using any protocol other than a predetermined protocol (in some examples, a single predetermined protocol). In some examples,
deny rules may be structured so as to deny access to a service
using any protocol or port combination other than a predetermined
protocol and port combination. The predetermined port/protocol may
be specified in a service specification, which may be stored or
accessible to the cluster manager 102.
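As a hedged sketch of such a service specification (assuming a Kubernetes-style Service object; the service name and selector are illustrative, and the external address is chosen to match the worked example later in this description), the single specified protocol and port might be declared as follows:

cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: Service
metadata:
  name: nginx                  # hypothetical service name
spec:
  selector:
    app: nginx                 # pods backing the service
  externalIPs:
  - 10.142.137.200             # external IP address of the service
  ports:
  - protocol: TCP              # the single specified protocol...
    port: 80                   # ...and the single specified port
EOF

Requests to 10.142.137.200 using any other port or protocol would then fall to the deny rules described above.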
Each of the cluster manager 102, nodes 104a-e, proxy manager 106
and/or the service firewall 108 may be any combination of hardware
and programming to implement the described functionalities. In
examples described herein, such combinations of hardware and
programming may be implemented in a number of different ways. For
example, programming may be processing resource executable
instructions stored on at least one non-transitory machine-readable
storage medium and the hardware may include at least one processing
resource to execute those instructions. In some examples, the
hardware may also include other electronic circuitry to at least
partially implement at least one of the cluster manager 102, nodes
104a-e, proxy manager 106 and/or the service firewall 108. In some
examples, the at least one machine-readable storage medium may
store instructions that, when executed by the at least one
processing resource, at least partially implement some or all of
the cluster manager 102, nodes 104a-e, proxy manager 106 and/or the
service firewall 108. In such examples, a computing device at least
partially implementing the cluster manager 102 and/or a
node 104a-e may include the at least one machine-readable storage
medium storing the instructions and the at least one processing
resource to execute the instructions. In other examples, cluster
manager 102, nodes 104a-e, proxy manager 106 and/or the service
firewall 108 may be implemented by electronic circuitry.
FIG. 2 is a flowchart showing an example of a method, which may be
a method of managing a container-based computing cluster. Block 202
comprises receiving, at a node of a container-based computing
cluster, a request for a service. Block 204 comprises determining
if the service request comprises blocked service attributes. For
example, this may comprise determining if a request for a service
is a request to access the service via a port or protocol which is
listed as a blocked port or protocol in a deny rule of an IP table
of the node. In other examples, a port or protocol may be defined as an accepted port or protocol, and only such ports or protocols may be accepted for forwarding. When the port or protocol is not a blocked port or protocol, an address translation is performed to forward the request to a container sub-cluster of the
container-based computing cluster in block 206. Otherwise (i.e.
when the request is a blocked request), block 208 comprises denying
the request for the service.
For example, a service request may be made by referring to a
protocol, an IP address of the service and the port. Such a request
may have the format: <PROTOCOL>://<IP
Address>:<Port>, and the identity of the IP address as
well as the port and/or protocol may be defined in a service
specification. Port(s) and/or protocol(s) other than the specified
port and/or protocol may be specified (positively or negatively)
within one or a set of deny rules.
For example, the IP table may be initialised as follows:

A: sudo iptables -t nat -N KUBE-MARK-DROP-LB

This creates a new chain called "KUBE-MARK-DROP-LB".

B: sudo iptables -t nat -A KUBE-MARK-DROP-LB -j MARK --set-xmark 0x8000/0x8000

This rule marks packets with a 32 bit value (in this example, 0x8000, but this value could be different in other examples) identifying them as packets to be dropped. This value is the value utilised in Kubernetes to identify and drop packets, although in principle another value could be used.

C: sudo iptables -t nat -A KUBE-MARK-DROP-LB -j ACCEPT

Packets in this rule chain are then accepted, so that they will be forwarded to the next rule set for onward processing, where the mark applied at B causes them to be dropped.
Then, on creation of a service, the following code may be executed:

D: sudo iptables -t nat -N <SERVICE-NAME_NAMESPACE>

This creates a new chain with a unique name based on a combination of the service name and a namespace value. This chain will identify packets to forward to the KUBE-MARK-DROP-LB chain described above.

E: sudo iptables -t nat -I PREROUTING --destination <ExtIP> -j <SERVICE-NAME_NAMESPACE>

This inserts a rule in a rule chain "PREROUTING", which captures and redirects any packet with a destination address <ExtIP> to the new chain <SERVICE-NAME_NAMESPACE>.

F: sudo iptables -t nat -A <SERVICE-NAME_NAMESPACE> -p <PROTOCOL> -m multiport --dports <SER-PORT,NODE-PORT> -j RETURN

This appends a rule to the new chain <SERVICE-NAME_NAMESPACE> which returns the processing of the packet to the standard processing route for address translation if the protocol matches <PROTOCOL> and the destination ports match <SER-PORT,NODE-PORT>.

G: sudo iptables -t nat -A <SERVICE-NAME_NAMESPACE> -j KUBE-MARK-DROP-LB

This serves to forward any packets which have not been redirected for normal processing at F to the "KUBE-MARK-DROP-LB" chain described above. The packet will be marked using the value 0x8000 (although in principle other markings could be used), such that it will be dropped.
In other words, a packet directed to a particular external IP
address (ExtIP) will be captured by rule E and handled by the chain
<SERVICE-NAME_NAMESPACE>. If this packet specifies a protocol
which matches <PROTOCOL> and destination ports which match
<SER-PORT, NODE-PORT>, it is accepted and forwarded for
address translation by rule F. If however it does not, it will be
forwarded to the chain KUBE-MARK-DROP-LB, where it will be marked with the value 0x8000 and dropped.
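Although not part of the rule set above, the effect of rules D to G can be checked with a routine iptables listing (offered here as a hedged aside; the placeholder chain name follows the convention used above):

sudo iptables -t nat -L PREROUTING -n --line-numbers
sudo iptables -t nat -L <SERVICE-NAME_NAMESPACE> -n

The first command should show the redirect inserted by rule E; the second should show the RETURN rule for the accepted protocol and ports followed by the jump to KUBE-MARK-DROP-LB.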
Therefore, if a request received in block 202 has an IP address as
defined in the service specification but with a different port or
protocol, such deny rule(s) may be applied such that the request is
dropped (block 208). In some examples, as shown in the rule set
above, there may be a single port or protocol which is allowed
(i.e. not in a deny rule) for each service as identified by its IP
address. I.e. the blocked ports and protocols are described
negatively, with reference to an accepted port and protocol.
Such a method may be carried out by a node 104 of a system 100 as
described in relation to FIG. 1.
FIG. 3 shows an example of a worker node 300 for deployment in a
container cluster management system. The worker node 300 comprises
storage 302 to store an IP table. For example, this may comprise
non-volatile and/or volatile memory. The worker node 300 further
comprises a proxy manager 304, which, when the node 300 is
deployed, manages an IP table of the node stored in the storage
302, and a service firewall 306. In this example, when the node 300
is deployed, the service firewall 306 blocks a service request for
a service by adding a deny rule for the service to the IP table. In
some examples, the service firewall 306 is provided by the proxy
manager 304.
The worker node 300 further comprises a processing resource 308 to
forward a service request to at least one container sub-cluster by
translating the destination address of the service request to an IP
address of a container sub-cluster (e.g. a `pod` as described
above) using the IP table, unless the service request is a blocked
service request. Such a node 300 may provide a node 104 of FIG. 1,
or may carry out the method of FIG. 2.
FIG. 4 is another example of a worker node 400. In addition to the
storage 302, proxy manager 304, service firewall 306 and processing
resource 308, the node 400 further comprises a service monitor 402
and a node health monitor 404.
The service monitor 402 monitors for at least one external service.
On detection of a service, the service firewall 306 is triggered to
update the IP table in response to the detection. Detection of a
service may for example comprise receiving a notification, which
may be generated by a container management system with which the
service monitor or associated apparatus is registered or associated
(for example, the cluster manager 102). Such notifications may be
generated for update and delete events, as well as for create
events. In some examples, the service monitor 402 may comprise a
component of the service firewall 306. In some examples, the node
health monitor 404 monitors the operational status of the node 400
and provides a healthcheck message indicative of normal operation
when in a normal operational state. The message may for example be
provided to a cluster manager, such as the cluster manager 102 of
FIG. 1, and/or to at least one other node 104. The node health
monitor 404 may also, in some examples, monitor the health of other
nodes 104, 400.
Each of the node 400, proxy manager 304, service firewall 306,
processing resources 308, service monitor 402 and the node health
monitor 404 may be any combination of hardware and programming to
implement the described functionalities. In examples described
herein, such combinations of hardware and programming may be
implemented in a number of different ways. For example, programming
may be processing resource executable instructions stored on at
least one non-transitory machine-readable storage medium and the
hardware may include at least one processing resource to execute
those instructions. In some examples, the hardware may also include
other electronic circuitry to at least partially implement at least
one of the node 400, proxy manager 304, service firewall 306,
processing resources 308, service monitor 402 and the node health
monitor 404. In some examples, the at least one machine-readable
storage medium may store instructions that, when executed by the at
least one processing resource, at least partially implement some or
all of the node 400, proxy manager 304, service firewall 306,
processing resources 308, service monitor 402 and the node health
monitor 404. In such examples, a computing device at least
partially implementing the node 400 may include the at least one
machine-readable storage medium storing the instructions and the at
least one processing resource to execute the instructions. In other
examples, the node 400, proxy manager 304, service firewall 306,
processing resources 308, service monitor 402 and/or the node
health monitor 404 may be implemented by electronic circuitry.
FIG. 5 is an example of a method, which may in some examples be
carried out at least in part by a node 400.
The method comprises, in block 502, establishing a cluster manager
and a plurality of nodes. Block 504 comprises detecting a service
creation event, for example on receipt of a notification from a
cluster management entity such as a cluster manager 102. Block 506
comprises adding address translation rules for a first detected
service to an IP table of a node. In some examples, the address
translation rules may be added at a plurality of nodes. In block
508, service-specific `deny` or `block` rules are added to the IP
table of at least one node (and in some examples, for a plurality
of nodes) for the service.
For the sake of example, a service-specific DNAT rule added in block 506 may comprise, in a Kubernetes container system:

-A KUBE-SERVICES -d 10.142.137.200/32 -p tcp -m comment --comment "default/nginx: external IP" -m tcp --dport 80 -m physdev ! --physdev-is-in -m addrtype ! --src-type LOCAL -j KUBE-SVC-4N57TFCL4MD7ZTDA
This rule checks if a request is received on IP address
10.142.137.200 and with protocol tcp and on port 80. If the request
is received on IP address 10.142.137.200 but with a protocol other
than tcp and/or a port other than port 80, the packet may be
diverted to other rule chains as set out above.
If the packet is accepted, the request is forwarded to another rule
set, KUBE-SVC-4N57TFCL4MD7ZTDA, in this example comprising the three
rules set out below. In turn, if an IP table has been established
in a manner to drop packets which are associated with that IP
address, but associated with any protocol other than tcp and any
port other than port 80, such packets will be dropped.
In this example, the three rules of the further rule set are:

-A KUBE-SVC-4N57TFCL4MD7ZTDA -m comment --comment "default/nginx:" -m statistic --mode random --probability 0.25000000000 -j KUBE-SEP-YWXOX2RATT4U5XWP

-A KUBE-SVC-4N57TFCL4MD7ZTDA -m comment --comment "default/nginx:" -m statistic --mode random --probability 0.33332999982 -j KUBE-SEP-62W5WRRO7TZHHJIR

-A KUBE-SVC-4N57TFCL4MD7ZTDA -m comment --comment "default/nginx:" -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-XL4V4VLBKROVH7NR
This rule set directs the request to a rule of a second rule set on
a probability basis. The probability is decided based on the number
of backend processing entities. The second rule set, in which the
address translation is carried out, may be as follows:
-A KUBE-SEP-YWXOX2RATT4U5XWP -p tcp -m comment --comment "default/nginx:" -m tcp -j DNAT --to-destination 10.32.0.4:80

-A KUBE-SEP-XL4V4VLBKROVH7NR -p tcp -m comment --comment "default/nginx:" -m tcp -j DNAT --to-destination 10.36.0.5:80

-A KUBE-SEP-62W5WRRO7TZHHJIR -p tcp -m comment --comment "default/nginx:" -m tcp -j DNAT --to-destination 10.36.0.4:80
Each of the above three rules of the second rule set redirects the
request to a particular backend.
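A brief note on the arithmetic, supplied here for clarity rather than taken from the source: in statistic-based chains of this kind, the k-th of n probability rules is typically given matching probability 1/(n-k+1). Compounded, each backend then receives an equal 1/n share of requests: the first rule takes 1/n of the traffic, the second takes 1/(n-1) of the remainder, and so on, with the final rule taking whatever is left. The probabilities shown above (0.25, 0.333 . . . , 0.5) follow this pattern.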
Service specific deny rules may, for example, deny access to a
service unless the request specifies a protocol and/or port which
is not associated with a deny rule.
Block 510 comprises detecting a failure of a node and block 512
comprises reassigning the IP address of the failed node to another
node. While in this example, node failure is detected, in another
example, it may be determined that a node has become unhealthy.
This provides for high-availability of the cluster. This may
comprise updating a mapping table. In some examples, this may
comprise updating a virtual router to node mapping table (a
Node_ID-to-VRID mapping table, as described in greater detail
below).
Block 514 comprises advertising at least one of an availability and
an accessibility of an IP address of a service to a network.
Advertising an availability may comprise sending or responding to a
Virtual Router Redundancy Protocol (VRRP) advertisement. This may
be carried out at any time, and in some examples periodically.
An example of reassigning an IP address, for example as set out in
relation to blocks 510 and 512, is now discussed. Such methods may
be used in conjunction with the described service specific rules
such as deny rules, or separately therefrom. In other words, the
methods of block 510 and 512 may be provided separately from the
methods described in blocks 502 to 508 and in block 514. In examples, there may be n nodes, each of which has IP addresses
configured for a specific service, and this information may be held
in a mapping table. Consider a scenario where:
Service-1 and service-2 have their IP addresses configured on
node-1.
Service-3 and service-4 have their IP addresses configured on
node-2.
Service-5 and service-6 have their IP addresses configured on
node-3.
If one of the nodes fails (which may be a total failure or a partial failure, i.e. the node is unhealthy), the IP addresses which were
configured on the failed node may be configured on any one of the
other two nodes. In this way high availability is achieved. An
example of reassigning an IP address, for example in the event that a node fails, or becomes unhealthy, may comprise utilising a
mapping table associated with a node, as is now discussed with
reference to FIG. 6.
In block 602, nodes are associated with Virtual Routers via a mapping table.
In one example, each load balancing node 104, 400 (i.e. each node comprising a proxy manager 106 and a service firewall 108 as described above) is associated with at least one virtual router. There may be a virtual router associated with each
service (although, as described below, there may be multiple
instances of each virtual router).
In one example, each load balancing node 104 may act as a Master
node for a virtual router. In some examples, at least one node 104
may act as a back-up node for a virtual router. In some examples,
each node acts as a potential back-up node for each virtual
router.
In some examples, if there are n worker nodes 104, 400, there may
be n instances of a particular virtual router, where n is an
integer. Each instance is provided on a different node 104, 400,
and one of the nodes acts as the master node for that virtual
router. In some examples, cluster manager 102 may determine which
node 104 is the master node for each virtual router.
For example in a case where there are three load-balancing/worker
nodes, a first node, Node_ID1, may initially be the master node for
a virtual router VR_ID1, which is associated with a first service,
and a back-up node for virtual routers VR_ID2 and VR_ID3 which are
associated with a second and third service respectively. A second
node, Node_ID2, may initially be the master node for VR_ID2, and a
back-up node for VR_ID1 and VR_ID3. A third node, Node_ID3, may
initially be the master node for VR_ID3, and a back-up node for
VR_ID1 and VR_ID2.
The mapping between master nodes and virtual routers may be held in
a Node_ID-to-VRID mapping table, which may in some examples be
managed by the cluster manager 102. The cluster manager 102 may
also determine which external IP address(es) is/are associated with
each virtual router in a VRID-to-IP address mapping table, for
example as configuration information. In some examples, the
configuration may also define which interface to use to send
advertisements and/or a virtual IP address for each instance of the
virtual router.
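For illustration only (the node, router and address identifiers below are assumptions extending the three-node example above, not values from the source), the two mapping tables might initially read as follows.

Node_ID-to-VRID mapping table: Node_ID1: master of VR_ID1, back-up for VR_ID2 and VR_ID3; Node_ID2: master of VR_ID2, back-up for VR_ID1 and VR_ID3; Node_ID3: master of VR_ID3, back-up for VR_ID1 and VR_ID2.

VRID-to-IP address mapping table: VR_ID1: 10.0.0.1; VR_ID2: 10.0.0.2; VR_ID3: 10.0.0.3.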
Each load balancing node 104 may be configured to monitor the
mappings between the VR IDs and the IP addresses. In the event of a
change (e.g. a node goes out of service), the `mastership` of the
virtual router for which that node is the master node may be
reassigned to another node. In some examples, the new master node
may be determined using an `election` process between the
nodes.
For example, in the case that the first node, Node_ID1, goes out of
service (block 604), the second node, Node_ID2, may be elected to
act as master for virtual router VR_ID1 (and therefore is the
master node for both virtual routers VR_ID1 and VR_ID2). The
VRID-to-IP address mapping table may then be updated (block
606).
The IP addresses associated with the failed node may also be
reassigned to the new `master` node, based on the VRID-to-IP
address mapping table. In practice, to effect this change, a
background process (in some examples, a so-called `daemon`) may be
reloaded with the new configuration.
The VRID-to-IP address mapping table may change if there are
changes in services. For example, if a new service is added then
the IP address of that service is mapped to a VRID and appended to
the list of IP addresses that are already mapped to that VRID.
If a service is deleted then the VRID-to-IP address mapping table
may be updated by removing the corresponding IP address.
In some examples, if the originally designated master node for a
virtual router resumes service, it will `reclaim` its mastership of
the virtual router, and the IP addresses associated with that
router by the VRID-to-IP mapping table will migrate back to the
originally designated master node. This may for example be achieved
by setting a priority level associated with each virtual router on
each node, such that the originally designated node has a higher
priority level than the other nodes.
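As a hedged sketch of this priority mechanism (keepalived, named later in this disclosure, is one background process implementing VRRP; the interface, router ID, priority values and address below are illustrative assumptions rather than values from the source), the instance of a virtual router on its originally designated master node might be configured, and the daemon reloaded, as follows:

sudo tee -a /etc/keepalived/keepalived.conf >/dev/null <<'EOF'
vrrp_instance VR_ID1 {
    state MASTER        # originally designated master for this virtual router
    interface eth0      # interface used to send VRRP advertisements
    virtual_router_id 1
    priority 150        # back-up nodes would use a lower value (e.g. 100), so
                        # the original master reclaims mastership on recovery
    advert_int 1
    virtual_ipaddress {
        10.0.0.10       # service IP address(es) mapped to this VRID
    }
}
EOF
sudo systemctl reload keepalived  # reload the daemon with the new configuration

Because the recovered node advertises the highest priority for VR_ID1, the election described above resolves in its favour and the associated IP addresses migrate back to it.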
In such an example, a background process may, by reassigning the
`mastership` of a virtual router, cause the associated IP addresses
to be reconfigured on a new node. This is relatively simple to
implement, and provides for a high-availability cluster. For
example, it may automatically allow a load to be rebalanced when a
node comes back on line.
In the example above, this may mean that in the case that the first
node Node_ID1 comes back into service (block 608), it may reclaim
the mastership for virtual router VR_ID1 from the second node
Node_ID2. In such an example, the Node_ID-to-VRID mapping table may
be altered to reflect the change (block 610).
FIG. 7 shows another example of a container cluster management
system 700, in this example comprising the cluster manager 102 of
FIG. 1 and a plurality of nodes 400 as described in relation to
FIG. 4. In this example, the container cluster management system
700 further comprises a high availability configuration manager 702
to, in the event of a failure of a node, reassign the IP address of
the failed node to another node. For example, this may use a process such as keepalived or proxy ARP. In some examples, this may
comprise executing a method as described in FIG. 6. In addition,
the container cluster management system 700 comprises an
advertisement agent 704 to advertise at least one of an
availability of the node (for example, as a Virtual Router
Redundancy Protocol (VRRP) advertisement/response) and an
accessibility of an IP address of a service to a network, for
example as described in relation to block 514 above.
The high availability configuration manager 702 may be configured as a background process (e.g. at least one `daemon`, such as keepalived), which responds to changes in a node_ID-to-VRID mapping
table by reassigning the mastership of a virtual router to another
node (for example, following an election of another node, or some
other designation process), and thereby configuring the IP
addresses associated with that virtual router on another node. In
some examples, the background process may be reloaded to effect the
changes. In some examples, at least some functions of the high
availability configuration manager 702 may be provided on a node
400, which may be a load balancing node 104 as described above. In
other examples, other types of worker node (e.g. those without a
service firewall) may be provided in the container cluster
management system 700 in addition to or instead of the nodes as described in FIGS. 1, 3 and 4.
Examples in the present disclosure can be provided as methods,
systems or machine readable instructions, such as any combination
of software, hardware, firmware or the like.
In some examples, a machine readable medium may be provided which
comprises instructions to cause a processing resource to perform
the functionalities described above in relation to one or more
blocks of FIG. 2 or FIG. 5. In some examples, the instructions may
comprise the programming of any of the functional modules described
in relation to FIGS. 1, 3, 4 and 7. In such examples, the processing resource may comprise the hardware of any of the
functional modules described in relation to FIGS. 1, 3, 4 and 7. As
used herein, a "machine readable storage medium" may be any
electronic, magnetic, optical, or other physical storage apparatus
to contain or store information such as executable instructions,
data, and the like. For example, any machine-readable storage
medium described herein may be any of Random Access Memory (RAM),
volatile memory, non-volatile memory, flash memory, a storage drive
(e.g., a hard disk drive (HDD)), a solid state drive, any type of
storage disc (e.g., a compact disc, a DVD, etc.), and the like, or
a combination thereof. Further, any machine-readable storage medium
described herein may be non-transitory. In examples described
herein, a machine-readable storage medium or media may be part of
an article (or article of manufacture). An article or article of
manufacture may refer to any manufactured single component or
multiple components.
The machine readable instructions may, for example, be executed by
a general purpose computer, a special purpose computer, an embedded
processor or processors of other programmable data processing
devices to realize the functions described in the description and
diagrams. In particular, a processor or processing circuitry may
execute the machine readable instructions. Thus functional modules
of the apparatus may be implemented by a processing resource
executing machine readable instructions stored in a memory, or a
processing resource operating in accordance with instructions
embedded in logic circuitry. The methods and functional modules may
all be performed by a single processor or divided amongst several
processors.
Such machine readable instructions may also be stored in a computer
readable storage medium that can guide the computer or other
programmable data processing devices to operate in a specific
mode.
Machine readable instructions may also be loaded onto a computer or
other programmable data processing devices, so that the computer or
other programmable data processing devices perform a series of
operations to produce computer-implemented processing, thus the
instructions executed on the computer or other programmable devices
realize functions specified by flow(s) in the flow charts and/or
block(s) in the block diagrams.
Further, the teachings herein may be implemented in the form of a
computer software product, the computer software product being
stored in a storage medium and comprising a plurality of
instructions for making a computer device implement the methods
recited in the examples of the present disclosure.
The present disclosure is described with reference to flow charts
and block diagrams of the method, devices and systems according to
examples of the present disclosure. Although the flow diagrams
described above show a specific order of execution, the order of
execution may differ from that which is depicted. Blocks described
in relation to one flow chart may be combined with those of another
flow chart. It shall be understood that at least some flows and/or
blocks in the flow charts and/or block diagrams, as well as
combinations of the flows and/or diagrams in the flow charts and/or
block diagrams can be realized by machine readable
instructions.
While the method, apparatus and related aspects have been described
with reference to certain examples, various modifications, changes,
omissions, and substitutions can be made without departing from the
spirit of the present disclosure. It is intended, therefore, that
the method, apparatus and related aspects be limited by the scope
of the following claims and their equivalents. It should be noted
that the above-mentioned examples illustrate rather than limit what
is described herein, and that those skilled in the art will be able
to design many alternative implementations without departing from
the scope of the appended claims. Features described in relation to
one example may be combined with features of another example.
The word "comprising" does not exclude the presence of elements
other than those listed in a claim, "a" or "an" does not exclude a
plurality, and a single processor or other processing resource may
fulfil the functions of several units recited in the claims.
The features of any dependent claim may be combined with the
features of any of the independent claims or other dependent
claims, in any combination.
* * * * *