U.S. patent application number 17/526,297 was filed with the patent office on 2021-11-15 and published on 2022-05-19 as publication number US 2022/0156112 A1 for a method and system for storing snapshots in hyper-converged infrastructure.
The applicant listed for this patent is Diamanti Inc. The invention is credited to Sambasiva Rao Bandarupalli, Shilpa Mayanna, Abhay Kumar Singh, and Deepak Tawri.
United States Patent Application Publication
Publication Number: US 2022/0156112 A1
Application Number: 17/526,297
Publication Date: May 19, 2022
Inventors: Singh, Abhay Kumar; et al.

METHOD AND SYSTEM FOR STORING SNAPSHOTS IN HYPER-CONVERGED INFRASTRUCTURE
Abstract
In an embodiment of the present disclosure, a processor receives
a request to create a volume, wherein a volume placement policy
comprises a plurality of scheduler algorithms, each of the
scheduler algorithms selecting one or more worker nodes from a
plurality of worker nodes for volume storage; determines, based on
output from the plurality of scheduler algorithms, the one or more
worker nodes, wherein output of each of the plurality of scheduler
algorithms is assigned a weight in determining the one or more
worker nodes; and causes a node agent in each of the one or more
worker nodes to create the volume.
Inventors: Singh, Abhay Kumar (San Jose, CA); Mayanna, Shilpa (San Jose, CA); Bandarupalli, Sambasiva Rao (Fremont, CA); Tawri, Deepak (Pune, IN)
Applicant: Diamanti Inc., San Jose, CA, US
Appl. No.: 17/526,297
Filed: November 15, 2021
Related U.S. Patent Documents

Application Number    Filing Date
63/114,295            Nov 16, 2020
63/194,611            May 28, 2021
International Class: G06F 9/48 (20060101); G06F 3/06 (20060101)
Claims
1. A method, comprising: receiving, by a processor, a request to
create a volume in accordance with a volume placement policy that
comprises a plurality of scheduler algorithms, each scheduler
algorithm of the plurality of scheduler algorithms to select one or
more worker nodes from a plurality of worker nodes for volume
storage; determining, by the processor based on output from the
plurality of scheduler algorithms, the one or more worker nodes,
wherein output of each of the plurality of scheduler algorithms is
assigned a weight in determining the one or more worker nodes; and
causing, by the processor, a node agent in each of the one or more
worker nodes to create the volume.
2. The method of claim 1, wherein the volume comprises a snapshot
volume, wherein the snapshot volume comprises a replicated volume,
wherein the plurality of worker nodes are associated with a
plurality of clusters corresponding to a common tenant, wherein
weights assigned to the plurality of scheduler algorithms are the
same, wherein at least one scheduler algorithm other than the
plurality of scheduler algorithms is assigned a weight less than
the weights assigned to the plurality of scheduler algorithms,
wherein the volume placement policy selects one or more clusters
from among the plurality of clusters for volume creation, and
wherein the plurality of scheduler algorithms comprise a plurality
of: a least recently used scheduler algorithm, a balance storage
resource scheduler algorithm, a skip initiator scheduler algorithm,
a most frequently used scheduler algorithm, a selector based
scheduler algorithm, an all plexes scheduler algorithm, and a round
robin scheduler algorithm.
3. The method of claim 1, wherein the volume comprises a mirrored
application volume, wherein the plurality of worker nodes are
associated with a common cluster corresponding to a common tenant,
wherein weights assigned to the plurality of scheduler algorithms
are different, and wherein the plurality of scheduler algorithms
comprise a plurality of: a least recently used scheduler algorithm,
a balance storage resource scheduler algorithm, a skip initiator
scheduler algorithm, a most frequently used scheduler algorithm, a
selector based scheduler algorithm, an all plexes scheduler
algorithm, and a round robin scheduler algorithm.
4. The method of claim 1, wherein the volume placement policy
comprises first and second placement policies and wherein the
determining comprises: determining, by the processor based on an
output of the first placement policy, a target cluster from among a
plurality of clusters for volume storage; and determining, by the
processor based upon an output of the second placement policy, the
one or more worker nodes within the target cluster for volume
storage, the one or more worker nodes being part of the target
cluster.
5. The method of claim 4, wherein the first and second placement
policies assign weights to different scheduler algorithms.
6. The method of claim 4, wherein the first and second placement
policies assign different weights to common scheduler
algorithms.
7. The method of claim 1, wherein the volume comprises a snapshot
volume, wherein the plurality of scheduler algorithms in the volume
placement policy perform one or more of: limiting a number of mirrors
for snapshots, minimizing storage space needed for a snapshot,
maximizing snapshot storage distribution across the plurality of
worker nodes, dedicating the one or more worker nodes for a
plurality of snapshots, distributing snapshot volume storage
substantially uniformly across the plurality of worker nodes,
distributing snapshot volume storage giving more priority to worker
nodes in the plurality of worker nodes that have more storage space
available, selecting the worker node used for creation and lifecycle
management of snapshot volumes to maintain Service Level Agreements
(SLAs) for an application corresponding to the volume, and selecting
worker node storage resources to achieve a predetermined Recovery
Point Objective (RPO) and/or Recovery Time Objective (RTO) for
application recovery with the processor automatically deciding on
placement and type of snapshot volume(s) to be created depending on
the RPO and/or RTO to be achieved, wherein a first worker node of
the plurality of worker nodes comprises a first weighted score, a
second worker node of the plurality of worker nodes comprises a
different second weighted score, and wherein the one or more worker
nodes comprises the first but not the second worker node.
8. A server comprising: a communication interface to transmit and
receive communications; a processor coupled with the communication
interface; and a computer readable medium, coupled with and
readable by the processor and storing therein a set of instructions
that, when executed by the processor, causes the processor to:
receive a request to create a volume, wherein a volume placement
policy comprises a plurality of scheduler algorithms, each
scheduler algorithm of the plurality of scheduler algorithms to
select one or more worker nodes from among a plurality of worker
nodes for volume storage; determine, based on an output from the
plurality of scheduler algorithms, the one or more worker nodes,
wherein an output of each of the plurality of scheduler algorithms
is assigned a weight in determining the one or more worker nodes;
and instruct a node agent in each of the one or more worker nodes
to create the volume.
9. The server of claim 8, wherein the volume comprises a snapshot
volume, wherein the snapshot volume comprises a replicated volume,
wherein the plurality of worker nodes are associated with a
plurality of clusters corresponding to a common tenant, wherein
weights assigned to the plurality of scheduler algorithms are the
same, wherein at least one scheduler algorithm other than the
plurality of scheduler algorithms is assigned a weight different
than the weights assigned to the plurality of scheduler algorithms,
wherein the volume placement policy selects one or more clusters
from among the plurality of clusters for volume creation, and
wherein the plurality of scheduler algorithms comprise a plurality
of: a least recently used scheduler algorithm, a balance storage
resource scheduler algorithm, a skip initiator scheduler algorithm,
a most frequently used scheduler algorithm, a selector based
scheduler algorithm, an all plexes scheduler algorithm, and a round
robin scheduler algorithm.
10. The server of claim 8, wherein the volume comprises a mirrored
application volume, wherein the plurality of worker nodes are
associated with a common cluster corresponding to a common tenant,
wherein weights assigned to the plurality of scheduler algorithms
are different, and wherein the plurality of scheduler algorithms
comprise a plurality of: a least recently used scheduler algorithm,
a balance storage resource scheduler algorithm, a skip initiator
scheduler algorithm, a most frequently used scheduler algorithm, a
selector based scheduler algorithm, an all plexes scheduler
algorithm, and a round robin scheduler algorithm.
11. The server of claim 8, wherein the volume placement policy
comprises first and second placement policies and wherein
determining the one or more worker nodes comprises: determining,
based on an output of the first placement policy, a target cluster
from among a plurality of clusters for volume storage; and
determining, based upon an output of the second placement policy,
the one or more worker nodes within the target cluster for volume
storage, the one or more worker nodes being part of the target
cluster.
12. The server of claim 11, wherein the first and second placement
policies assign weights to different scheduler algorithms.
13. The server of claim 11, wherein the first and second placement
policies assign different weights to common scheduler
algorithms.
14. The server of claim 8, wherein the volume comprises a snapshot
volume, wherein the plurality of scheduler algorithms in the volume
placement policy perform one or more of: limiting a number of mirrors
for snapshots, minimizing storage space needed for a snapshot,
maximizing snapshot storage distribution across the plurality of
worker nodes, dedicating the one or more worker nodes for a
plurality of snapshots, distributing snapshot volume storage
substantially uniformly across the plurality of worker nodes,
distributing snapshot volume storage giving more priority to worker
nodes in the plurality of worker nodes that have more storage space
available, selecting the worker node used for creation and lifecycle
management of snapshot volumes to maintain Service Level Agreements
(SLAs) for an application corresponding to the volume, and selecting
worker node storage resources to achieve a predetermined Recovery
Point Objective (RPO) and/or Recovery Time Objective (RTO) for
application recovery with the processor automatically deciding on
placement and type of snapshot volume(s) to be created depending on
the RPO and/or RTO to be achieved, wherein a first worker node of
the plurality of worker nodes comprises a first weighted score, a
second worker node of the plurality of worker nodes comprises a
different second weighted score, and wherein the one or more worker
nodes comprises the first but not the second worker node.
15. A method, comprising: receiving, by a processor, a request to
create a volume in accordance with a volume placement policy that
comprises a plurality of scheduler algorithms, each scheduler
algorithm of the plurality of scheduler algorithms to select one or
more clusters of worker nodes from among a plurality of worker node
clusters corresponding to a common tenant for volume storage;
determining, by the processor based on an output from the plurality
of scheduler algorithms, the one or more worker node clusters,
wherein an output of each of the plurality of scheduler algorithms
is assigned a weight in determining the one or more worker node
clusters; and causing, by the processor, a node agent in each of
the one or more worker node clusters to create the volume on a
worker node in each of the one or more worker node clusters.
16. The method of claim 15, wherein the volume comprises a snapshot
volume, wherein the snapshot volume comprises a replicated volume,
wherein a plurality of worker nodes are associated with the
plurality of worker node clusters, wherein weights assigned to the
plurality of scheduler algorithms are the same, wherein at least
one scheduler algorithm other than the plurality of scheduler
algorithms is assigned a weight less than the weights assigned to
the plurality of scheduler algorithms, wherein the volume placement
policy selects one or more worker nodes within a common cluster
from among the plurality of worker nodes for volume creation, and
wherein the plurality of scheduler algorithms comprise a plurality
of: a least recently used scheduler algorithm, a balance storage
resource scheduler algorithm, a skip initiator scheduler algorithm,
a most frequently used scheduler algorithm, a selector based
scheduler algorithm, an all plexes scheduler algorithm, and a round
robin scheduler algorithm.
17. The method of claim 15, wherein the volume comprises a mirrored
application volume, wherein the plurality of worker node clusters
are associated with a plurality of worker nodes, wherein weights
assigned to the plurality of scheduler algorithms are different,
and wherein the plurality of scheduler algorithms comprise a
plurality of: a least recently used scheduler algorithm, a balance
storage resource scheduler algorithm, a skip initiator scheduler
algorithm, a most frequently used scheduler algorithm, a selector
based scheduler algorithm, an all plexes scheduler algorithm, and a
round robin scheduler algorithm.
18. The method of claim 15, wherein the volume placement policy
comprises first and second placement policies and wherein the
determining of the one or more worker node clusters comprises: the
processor determining, based on an output of the first placement
policy, a target worker node cluster from among a plurality of
worker node clusters for volume storage; and the processor
determining, based upon an output of the second placement policy,
one or more worker nodes within the target cluster for volume
storage, the one or more worker nodes being part of the target
cluster.
19. The method of claim 18, wherein the first and second placement
policies assign weights to different scheduler algorithms.
20. The method of claim 18, wherein the first and second placement
policies assign different weights to common scheduler
algorithms.
21. The method of claim 15, wherein the volume comprises a snapshot
volume, wherein the plurality of scheduler algorithms in the volume
placement policy perform one or more of: limiting a number of mirrors
for snapshots, minimizing storage space needed for a snapshot,
maximizing snapshot storage distribution across a plurality of
worker nodes, dedicating one or more worker nodes for a plurality of
snapshots, distributing snapshot volume storage substantially
uniformly across the plurality of worker nodes, distributing
snapshot volume storage giving more priority to worker nodes in the
plurality of worker nodes that have more storage space available,
selecting the worker node used for creation and lifecycle management
of snapshot volumes to maintain Service Level Agreements (SLAs) for
an application corresponding to the volume, and selecting worker node
storage resources to achieve a predetermined Recovery Point
Objective (RPO) and/or Recovery Time Objective (RTO) for
application recovery with the processor automatically deciding on
placement and type of snapshot volume(s) to be created depending on
the RPO and/or RTO to be achieved, wherein a first worker node
cluster of the plurality of worker node clusters comprises a first
weighted score, a second worker node cluster of the plurality of
worker node clusters comprises a different second weighted score,
and wherein the one or more worker node clusters comprises the
first but not the second worker node cluster.
Description
CROSS REFERENCE TO RELATED APPLICATION
[0001] The present application claims the benefit of U.S.
Provisional Application Ser. No. 63/114,295, filed Nov. 16, 2020,
entitled "Method and System for Managing Cloud Resources," and U.S.
Provisional Application Ser. No. 63/194,611, filed May 28, 2021,
entitled "Method and System for Storing Snapshots in
Hyper-Converged Infrastructure," both of which are incorporated
herein by this reference in their entireties.
FIELD
[0002] The invention relates generally to distributed processing
systems and particularly to cloud computing systems.
BACKGROUND
[0003] Storage snapshots are point-in-time copies of volumes and an
integral part of a data protection strategy. A snapshot could be a
full copy of the volume, or it could be space-optimized and share
unmodified blocks with the volume. In the case of a hyper-converged
infrastructure, volume data is located on direct-attached drives of
compute or worker nodes. Multiple copies of each data block are
maintained across nodes for mirrored volumes. Mirrored volumes
provide data redundancy in the event of a drive or node
failure.
[0004] Space-optimized snapshots are co-located with their parent
volumes since unmodified blocks are shared. For example, a snapshot
for an N-way mirrored volume can be non-mirrored or have 1 to N-1
additional mirrors.
SUMMARY
[0005] In certain embodiments, the present disclosure relates to a
method that includes the steps of: receiving, by a processor, a request to create a
volume in accordance with a volume placement policy that comprises
a plurality of scheduler algorithms, each of which selects one or
more worker nodes from a plurality of worker nodes for volume
storage; determining, based on output from the plurality of
scheduler algorithms, the one or more worker nodes, with an output
of each of the plurality of scheduler algorithms being assigned a
weight in determining the one or more worker nodes; and causing, by
the processor, a node agent in each of the one or more worker nodes
to create the volume.
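The weighted combination of scheduler outputs described in this summary can be illustrated with a short Python sketch. The node attributes, scoring functions, and weights below are hypothetical stand-ins for the scheduler algorithms named later in this disclosure; the sketch only shows how per-algorithm scores might be weighted and summed to rank candidate worker nodes.

from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass(frozen=True)
class WorkerNode:
    name: str
    free_bytes: int
    volume_count: int


# Each scheduler algorithm scores every candidate node; higher is better.
# These scorers are illustrative stand-ins for algorithms such as
# "balance storage resource" or "least recently used".
def balance_storage_score(node: WorkerNode) -> float:
    return node.free_bytes / (1024 ** 3)   # prefer nodes with more free space


def least_loaded_score(node: WorkerNode) -> float:
    return -float(node.volume_count)       # prefer nodes holding fewer volumes


@dataclass
class VolumePlacementPolicy:
    # Maps each scheduler algorithm to the weight its output carries.
    weighted_algorithms: Dict[Callable[[WorkerNode], float], float] = field(
        default_factory=dict)

    def select_nodes(self, nodes: List[WorkerNode], count: int) -> List[WorkerNode]:
        def weighted_score(node: WorkerNode) -> float:
            return sum(weight * algorithm(node)
                       for algorithm, weight in self.weighted_algorithms.items())
        return sorted(nodes, key=weighted_score, reverse=True)[:count]


if __name__ == "__main__":
    nodes = [
        WorkerNode("worker-1", free_bytes=500 * 1024 ** 3, volume_count=12),
        WorkerNode("worker-2", free_bytes=200 * 1024 ** 3, volume_count=3),
        WorkerNode("worker-3", free_bytes=800 * 1024 ** 3, volume_count=20),
    ]
    policy = VolumePlacementPolicy({balance_storage_score: 0.7,
                                    least_loaded_score: 0.3})
    # A node agent on each selected node would then be caused to create the volume.
    for node in policy.select_nodes(nodes, count=2):
        print(node.name)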
[0006] In some embodiments, a server includes: a communication
interface to transmit and receive communications; a processor
coupled with the communication interface; and a computer readable
medium, coupled with and readable by the processor and storing
therein a set of instructions. The set of instructions, when
executed by the processor, causes the processor to: receive a
request to create a volume, wherein a volume placement policy
comprises a plurality of scheduler algorithms, each of which
selects one or more worker nodes from among a plurality of worker
nodes for volume storage; determine, based on an output from the
plurality of scheduler algorithms, the one or more worker nodes,
wherein an output of each of the plurality of scheduler algorithms
is assigned a weight in determining the one or more worker nodes;
and instruct a node agent in each of the one or more worker nodes
to create the volume.
[0007] In some embodiments, a method includes the steps of:
receiving, by a processor, a request to create a volume in
accordance with a volume placement policy that comprises a
plurality of scheduler algorithms, each of which selects one or
more clusters of worker nodes from among a plurality of worker node
clusters corresponding to a common tenant for volume storage;
determining, by the processor based on an output from the plurality
of scheduler algorithms, the one or more worker node clusters, with
an output of each of the plurality of scheduler algorithms being
assigned a weight in determining the one or more worker node
clusters; and causing, by the processor, a node agent in each of
the one or more worker node clusters to create the volume on a
worker node in each of the one or more worker node clusters.
[0008] In some embodiments, a server includes: a communication
interface to transmit and receive communications; a processor
coupled with the communication interface; and a computer readable
medium, coupled with and readable by the processor and storing
therein a set of instructions. The set of instructions, when
executed by the processor, causes the processor to: receive a
request to create a volume in accordance with a volume placement
policy that comprises a plurality of scheduler algorithms, each of
which selects one or more clusters of worker nodes from among a
plurality of worker node clusters corresponding to a common tenant
for volume storage; determine, based on an output from the
plurality of scheduler algorithms, the one or more worker node
clusters, with an output of each of the plurality of scheduler
algorithms being assigned a weight in determining the one or more
worker node clusters; and cause a node agent in each of the one or
more worker node clusters to create the volume on a worker node in
each of the one or more worker node clusters.
[0009] The present invention can provide a number of advantages
depending on the particular configuration. For example, the present
disclosure can provide a flexible architecture to allow users to
configure their snapshot volume placement depending on their needs.
Users can have a number of policies and can choose among them at
the time of snapshot volume creation.
[0010] These and other advantages will be apparent from the
disclosure of the invention(s) contained herein.
[0011] The phrases "at least one", "one or more", and "and/or" are
open-ended expressions that are both conjunctive and disjunctive in
operation. For example, each of the expressions "at least one of A,
B and C", "at least one of A, B, or C", "one or more of A, B, and
C", "one or more of A, B, or C" and "A, B, and/or C" means: A
alone, B alone, C alone, A and B together, A and C together, B and
C together, or A, B and C together.
[0012] The term "a" or "an" entity refers to one or more of that
entity. As such, the terms "a" (or "an"), "one or more" and "at
least one" can be used interchangeably herein. It is also notable
that the terms "comprising", "including", and "having" can be used
interchangeably.
[0013] The term "application containerization" may be used to refer
to an operating system-level virtualization method that deploys and
runs distributed applications or virtualized applications (e.g.,
containerized or virtual machine-based applications) without
launching an entire virtual machine for each application. Multiple
isolated applications or services may run on a single host and
access the same operating system kernel.
[0014] The term "automatic" and variations thereof may refer to any
process or operation done without material human input when the
process or operation is performed. However, a process or operation
can be automatic, even though performance of the process or
operation uses material or immaterial human input, if the input is
received before performance of the process or operation. Human
input is deemed to be material if such input influences how the
process or operation will be performed. Human input that consents
to the performance of the process or operation is not deemed to be
"material".
[0015] The term "computer-readable medium" may refer to any
tangible storage and/or transmission medium that participate in
providing instructions to a processor for execution. Such a medium
may take many forms, including but not limited to, non-volatile
media, volatile media, and transmission media. Non-volatile media
includes, for example, NVRAM, or magnetic or optical disks.
Volatile media includes dynamic memory, such as main memory. Common
forms of computer-readable media include, for example, a floppy
disk, a flexible disk, hard disk, magnetic tape, or any other
magnetic medium, magneto-optical medium, a CD-ROM, any other
optical medium, punch cards, paper tape, any other physical medium
with patterns of holes, a RAM, a PROM, an EPROM, a FLASH-EPROM, a
solid state medium like a memory card, any other memory chip or
cartridge, a carrier wave as described hereinafter, or any other
medium from which a computer can read. A digital file attachment to
e-mail or other self-contained information archive or set of
archives is considered a distribution medium equivalent to a
tangible storage medium. When the computer-readable media is
configured as a database, it is to be understood that the database
may be any type of database, such as relational, hierarchical,
object-oriented, and/or the like. Accordingly, the invention is
considered to include a tangible storage medium or distribution
medium and prior art-recognized equivalents and successor media, in
which the software implementations of the present invention are
stored.
[0016] The term "cluster" may refer to a group of multiple worker
nodes that deploy, run and manage containerized or VM-based
applications and a master node that controls and monitors the
worker nodes. A cluster can have an internal and/or external
network address (e.g., DNS name or IP address) to enable
communication between containers or services and/or with other
internal or external network nodes.
[0017] The term "container" may refer to a form of operating system
virtualization that enables multiple applications to share an
operating system by isolating processes and controlling the amount
of processing resources (e.g., central processing unit (CPU),
graphics processing unit (GPU), etc.), memory, and disk those
processes can access. While containers, like virtual machines, share
common underlying hardware, containers, unlike virtual machines,
share an underlying, virtualized operating system kernel and
do not run separate operating system instances.
[0018] The terms "determine", "calculate" and "compute," and
variations thereof are used interchangeably and include any type of
methodology, process, mathematical operation or technique.
[0019] The term "deployment" may refer to control of the creation,
state and/or running of containerized or VM-based applications. It
can specify how many replicas of a pod should run on the cluster.
If a pod fails, the deployment may be configured to create a new
pod.
[0020] The term "domain" may refer to a set of objects that define
the extent of all infrastructure under management within a single
context. Infrastructure may be physical or virtual, hosted
on-premises or in a public cloud. Domains may be configured to be
mutually exclusive, meaning there is no overlap between the
infrastructure within any two domains.
[0021] The term "domain cluster" may refer to the primary
management cluster. This may be the first cluster provisioned.
[0022] The term "Istio service mesh" may refer to a service mesh
layer for containers that adds a sidecar container to each cluster
that configures, monitors, and manages interactions between the
other containers.
[0023] The term "Knative" may refer to a platform that sits on top
of containers and enables developers to build a container and run
it as a software service or as a serverless function. It can enable
automatic transformation of source code into a clone container or
functions; that is, Knative may automatically containerize code and
orchestrate containers, such as by configuration and scripting
(generating configuration files, installing dependencies,
managing logging and tracing, and writing continuous
integration/continuous deployment (CI/CD) scripts). Knative can
perform these tasks through build (which transforms stored source
code from a prior container instance into a clone container or
function), serve (which runs containers as scalable services and
performs configuration and service routing), and event (which
enables specific events to trigger container-based services or
functions).
[0024] The term "master node" may refer to the node that controls
and monitors worker nodes. The master node may run a scheduler
service that automates when and where containers are deployed based
on developer-set deployment requirements and available computing
capacity.
[0025] The term "module" may refer to any known or later developed
hardware, software, firmware, artificial intelligence, fuzzy logic,
or combination of hardware and software that is capable of
performing the functionality associated with that element. Also,
while the invention is described in terms of exemplary embodiments,
it should be appreciated that individual aspects of the invention
can be separately claimed.
[0026] The term "namespace" may refer to a set of signs (names)
that are used to identify and refer to objects of various kinds. In
Kubernetes, for example, there are three primary namespaces:
default, kube-system (used for Kubernetes components), and
kube-public (used for public resources). Namespaces are intended
for use in environments with many users spread across multiple
teams, or projects. Namespaces may not be nested inside one
another, and each Kubernetes resource may be configured to only be
in one
namespace. Namespaces may provide a way to divide cluster resources
between multiple users (via resource quota). The extension of
namespaces in the present disclosure is discussed at page 9 of
Exhibit "A". At a high level, the extension to namespaces enables
multiple virtual clusters (or namespaces) backed by a common set of
physical (Kubernetes) clusters.
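As a concrete, hedged illustration of the namespace construct, the following sketch creates and lists namespaces with the official Kubernetes Python client; the namespace name, labels, and the assumption that a kubeconfig for a reachable cluster is available are illustrative only.

from kubernetes import client, config

# Assumes a reachable cluster and a local kubeconfig; both are illustrative.
config.load_kube_config()
core_v1 = client.CoreV1Api()

# Create an isolated namespace, e.g., one per team or per virtual cluster.
namespace = client.V1Namespace(
    metadata=client.V1ObjectMeta(name="team-a", labels={"tenant": "tenant-1"})
)
core_v1.create_namespace(namespace)

# Resource names must be unique within "team-a" but may repeat in other namespaces.
for ns in core_v1.list_namespace().items:
    print(ns.metadata.name)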
[0027] The term "pods" may refer to groups of containers that share
the same compute resources and the same network.
[0028] The term "project" may refer to a set of objects within a
tenant that contains applications. A project may act as an
authorization target and allow administrators to set policies
around sets of applications to govern resource usage, cluster
access, security levels, and the like. The project construct can
enable authorization (e.g., Role Based Access Control or RBAC),
application management, and the like within a project. In one
implementation, a project is an extension of Kubernetes' use of
namespaces for isolation, resource allocation and basic
authorization on a cluster basis. Project may extend the namespace
concept by grouping together multiple namespaces in the same
cluster or across multiple clusters. Stated differently, projects
can run applications on one cluster or on multiple clusters. The
resources are allocated per project basis.
[0029] The term "project administrator" or "project admin" or PA
may refer to the entity or entities responsible for adding members
to a project, managing users within a project, managing applications that
are part of a project, specifying new policies to be enforced in a
project (e.g., with respect to uptime, SLAs, and overall health of
deployed applications), etc.
[0030] The term "project member" or PM may refer to the entity or
entities responsible for deploying applications on Kubernetes in a
project, responsible for uptime, service level agreements ("SLAs"),
and overall health of deployed applications. The PM may not have
permission to add a user to a project.
[0031] The term "project viewer" or PV may refer to the interface
that enables a user to view all applications, logs, events, and
other objects in a project.
[0032] The term "resource", when used with reference to Kubernetes,
may refer to an endpoint in the Kubernetes API that stores a
collection of API objects of a certain kind; for example, the
built-in pods resource contains a collection of pod objects.
[0033] The term "serverless computing" may refer to a way of
deploying code that enables cloud native applications to bring up
the code as needed; that is, it can scale it up or down as demand
fluctuates and take the code down when not in use. In contrast,
conventional applications deploy an ongoing instance of code that
sits idle while waiting for requests.
[0034] The term "service" may refer to an abstraction, which
defines a logical set of pods and a policy by which to access them
(sometimes this pattern is called a micro-service).
[0035] The term "service provider" or SP may refer to the entity
that manages the physical/virtual infrastructure in domains. In one
implementation, a service provider manages an entire node inventory
and tenant provisioning and management. Initially a service
provider manages one domain.
[0036] The term "service provider persona" may refer to the entity
responsible for hardware and tenant provisioning or management.
[0037] The term "snapshot" may refer to a point-in-time copy of a
volume. A snapshot can be used either to provision a new volume
(pre-populated with the snapshot data) or to restore an existing
volume to a previous state (represented by the snapshot).
[0038] The term "tenant" may refer to an organizational construct
or logical grouping used to represent an explicit set of resources
(e.g., physical infrastructure such as CPUs, GPUs, memory, storage,
network, cloud clusters, people, etc.) within a domain. Tenants
"reside" within infrastructure managed by a service provider. By
default, individual tenants do not overlap or share anything with
other tenants; that is, each tenant can be data isolated,
physically isolated, and runtime isolated from other tenants by
defining resource scopes devoted to each tenant. Stated
differently, a first tenant can have a set of resources, resource
capabilities, and/or resource capacities that is different from
that of a second tenant. Service providers assign worker nodes to a
tenant, and the tenant admin forms the clusters from the worker
nodes.
[0039] The term "tenant administrator" or "tenant admin" or TA may
refer to the entity responsible for managing an infrastructure
assigned to a tenant. The tenant administrator is responsible for
cluster management, project provisioning, providing user access to
projects, application deployment, specifying new policies to be
enforced in a tenant, etc.
[0040] The term "tenant cluster" may refer to clusters of resources
assigned to each tenant upon which user workloads run. The domain
cluster performs lifecycle management of the tenant clusters.
[0041] The term "virtual machine" may refer to a server abstracted
from underlying computer hardware so as to enable a physical server
to run multiple virtual machines or a single virtual machine that
spans more than one server. Each virtual machine typically runs its
own operating system instance to permit isolation of each
application in its own virtual machine, reducing the chance that
applications running on common underlying physical hardware will
impact each other.
[0042] The term "volume" may refer to an ephemeral or persistent
volume of memory of a selected size that is created from a
distributed storage pool of memory. A volume may comprise a
directory and data on disk or in another container and be
associated with a volume driver. In some implementations, the
volume is a virtual drive and multiple virtual drives can create
multiple volumes. When a volume is created, a scheduler may
automatically select an optimum node on which to create the volume.
A "mirrored volume" refers to synchronous cluster-local data
protection while a "replicated volume" refers to asynchronous
cross-cluster data protection.
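The distinction drawn here between mirrored (synchronous, cluster-local) and replicated (asynchronous, cross-cluster) volumes can be captured in a small data model. The following Python sketch is illustrative only; the field and volume names are hypothetical and not part of the disclosure.

from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Protection(Enum):
    NONE = "none"
    MIRRORED = "mirrored"      # synchronous copies within one cluster
    REPLICATED = "replicated"  # asynchronous copies across clusters


@dataclass
class Volume:
    name: str
    size_gib: int
    protection: Protection = Protection.NONE
    mirror_nodes: List[str] = field(default_factory=list)      # same-cluster mirrors
    replica_clusters: List[str] = field(default_factory=list)  # remote replicas


# A two-way mirrored application volume and a cross-cluster replicated copy.
app_volume = Volume("pg-data", 100, Protection.MIRRORED,
                    mirror_nodes=["worker-1", "worker-2"])
replica_volume = Volume("pg-data-replica", 100, Protection.REPLICATED,
                        replica_clusters=["cluster-west"])
print(app_volume)
print(replica_volume)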
[0043] The term "worker node" may refer to the compute resources
and network(s) that deploy, run, and manage containerized or
VM-based applications. Each worker node contains the services to
manage the networking between the containers, communicate with the
master node, and assign resources to the scheduled containers. Each
worker node can include a tool that is used to manage the
containers, such as Docker, and a software agent called a Kubelet
that receives and executes orders from the master node (e.g., the
master API server). The Kubelet is a primary node agent which
executes on each worker node inside the cluster. The Kubelet
receives the pod specifications through an API server, executes
the containers associated with the pods, and ensures that the
containers described in the pods are running and healthy. If the
Kubelet notices any issue with the pods running on the worker
node, it tries to restart the pod on the same node; if the
issue is with the worker node itself, the master node detects
the node failure and recreates the pods on another
healthy node.
[0044] The preceding is a simplified summary of the invention to
provide an understanding of some aspects of the invention. This
summary is neither an extensive nor exhaustive overview of the
invention and its various embodiments. It is intended neither to
identify key or critical elements of the invention nor to delineate
the scope of the invention but to present selected concepts of the
invention in a simplified form as an introduction to the more
detailed description presented below. As will be appreciated, other
embodiments of the invention are possible utilizing, alone or in
combination, one or more of the features set forth above or
described in detail below.
BRIEF DESCRIPTION OF THE DRAWINGS
[0045] FIG. 1 is a block diagram of a cloud-based architecture
according to an embodiment of this disclosure;
[0046] FIG. 2 is a block diagram of an embodiment of the
application management server;
[0047] FIG. 3 is a block diagram of a cloud-based architecture
according to an embodiment of this disclosure;
[0048] FIG. 4 is a block diagram of a volume storage schema
according to an embodiment of this disclosure;
[0049] FIG. 5 depicts a snapshot mapping schema according to an
embodiment of the disclosure;
[0050] FIG. 6A is a screenshot of a volume create request according
to an embodiment of the disclosure;
[0051] FIG. 6B is a screenshot of a snapshot create request
according to an embodiment of the disclosure;
[0052] FIG. 7A is a screenshot of a snapshot placement policy
create request with one selected policy according to an embodiment
of the disclosure;
[0053] FIG. 7B is a screenshot of a snapshot placement policy
create request with multiple placement policies according to an
embodiment of the disclosure;
[0054] FIG. 7C is a screenshot of a snapshot placement policy
create request with a placement policy option according to an
embodiment of the disclosure;
[0055] FIG. 8A is a screenshot of a snapshot describe request
according to an embodiment of the disclosure;
[0056] FIG. 8B is a screenshot of a snapshot placement policy
describe request according to an embodiment of the disclosure;
[0057] FIG. 8C is a screenshot of a response to the snapshot
placement policy describe request of FIG. 8B;
[0058] FIG. 9 is a flow chart depicting disaster recovery
controller logic according to an embodiment of this disclosure;
and
[0059] FIG. 10 is a flow chart depicting an authentication method
according to an embodiment of this disclosure.
DETAILED DESCRIPTION
Overview
[0060] The present disclosure is directed to a multi-cloud platform
that can provide a single management console from which
customers manage cloud-native applications, clusters, and data
using a policy-based management framework. The platform can be
provided as a hosted service that is either managed centrally or
deployed in customer environments. The customers could be
enterprise customers or service providers. The platform can manage
applications across multiple Kubernetes clusters, which could be
residing on-premises or in the cloud or combinations thereof (e.g.,
hybrid cloud implementations). The platform can abstract
core networking and storage services on premises and in the cloud
for stateful and stateless applications.
[0061] The platform can migrate data (including application and
snapshot volumes) and applications to any desired set of resources
and provide failover of stateful applications from on premises to the cloud
or within the cloud. As will be appreciated, an "application
volume" refers to a volume in active use by an application while a
"snapshot volume" refers to a volume that is imaged from an
application volume at a specific point in time. It can provide
instant snapshot volumes of containers or application volumes
(e.g., mirroring or replicating of application or snapshot
volumes), backup, and stateful or stateless application disaster
recovery (DR) and data protection (DP).
[0062] The platform can automatically balance and adjust placement
and lifecycle management of snapshot and application volumes
depending on a user specified policy. Exemplary policies applied
within a selected cluster to select one or more worker nodes for
mirrored snapshot and application volume placement include:
limiting a number of mirrors for volumes (e.g., no mirror, two-way
mirroring, as many mirrors as the volume-storing worker nodes,
etc.), substantially minimizing storage space needed for a volume,
substantially maximizing volume storage distribution across worker
nodes, dedicating one or more worker nodes for all volume storage,
distributing volume storage substantially evenly and/or uniformly
across worker nodes, distributing volume storage giving more
priority to worker nodes that have more space available, selecting
the worker node storage resources used for creation and lifecycle
management of parent or linked clone volumes so as to minimize
"noisy neighbor", i.e. maintain Service Level Agreements (SLAs) for
applications or micro-services which are using the parent or
application volume(s), selecting worker node storage resources so
as to achieve a predetermined Recovery Point Objective (RPO) and/or
Recovery Time Objective (RTO) for application recovery with the
system automatically deciding on the placement and type of
volume(s) to be created depending on the RPO and/or RTO to be
achieved, and the like. Typically, RPO designates the variable
amount of data that will be lost or will have to be re-entered
during network downtime. RTO designates the amount of "real time"
that can pass before the disruption begins to seriously and
unacceptably impede the flow of normal business operations.
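A minimal sketch of how RPO and RTO targets might drive the type and placement of the volumes to be created is shown below. The thresholds and decision rules are invented for illustration; a policy engine as described above would derive them from measured replication and restore characteristics.

from dataclasses import dataclass


@dataclass
class RecoveryTargets:
    rpo_minutes: int  # tolerable data-loss window
    rto_minutes: int  # tolerable restore time


def choose_protection(targets: RecoveryTargets) -> dict:
    """Pick a snapshot/volume strategy for the given RPO/RTO targets.

    The thresholds below are invented for illustration; a real policy engine
    would derive them from measured replication and restore times.
    """
    plan = {"snapshot_interval_minutes": targets.rpo_minutes}
    if targets.rpo_minutes <= 5:
        plan["volume_type"] = "mirrored"       # synchronous, near-zero data loss
    else:
        plan["volume_type"] = "replicated"     # asynchronous copies suffice
    if targets.rto_minutes <= 15:
        plan["placement"] = "dedicated_nodes"  # warm copies for fast attach
    else:
        plan["placement"] = "most_free_space"  # optimize for capacity instead
    return plan


print(choose_protection(RecoveryTargets(rpo_minutes=5, rto_minutes=60)))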
[0063] These policies can be applied not only within a cluster but
also across clusters. Exemplary policies applied across clusters to
select a cluster for replicated snapshot or application volume
placement include: a user policy specifying one or more clusters
for storing volumes, limiting a number of mirrors across clusters
for volume storage (e.g., no mirror, two-way mirroring, as many
mirrors as the volume-storing clusters, etc.), substantially
minimizing storage space needed across clusters for volume
placement, substantially maximizing volume distribution across
clusters, dedicating one or more clusters for storing all volumes,
distributing volume storage substantially evenly and/or uniformly
across clusters (e.g., giving more priority to clusters that have
more storage space available), selecting, among multiple clusters,
cluster storage resources used for creation and lifecycle
management of parent or linked clone volumes to substantially
minimize "noisy neighbor", i.e. maintain SLAs for applications
and/or micro-services which are using the parent or application
volume(s), and selecting, among multiple clusters, cluster storage
resources so as to achieve a predetermined RPO and/or RTO for
application recovery with the system automatically deciding on the
placement and type of volume(s) to be created depending on the RPO
and/or RTO to be achieved.
[0064] As will be appreciated, other user-specified policies will
be envisioned by one of ordinary skill in the art for management of
clusters and/or worker nodes.
[0065] The platform can enable organizations to deliver a
high-productivity Platform-as-a-Service (PaaS) that addresses
multiple infrastructure-related and operations-related tasks and
issues surrounding cloud-native development. It can support many
container application platforms besides or in addition to
Kubernetes, such as Red Hat OpenShift, Docker, and other
Kubernetes distributions, whether hosted or on-premises.
[0066] While this disclosure is discussed with reference to the
Kubernetes container platform, it is to be appreciated that the
concepts disclosed herein apply to other container platforms, such
as Microsoft Azure.TM., Amazon Web Services.TM. (AWS), Open
Container Initiative (OCI), CoreOS, and Canonical (Ubuntu)
LXD.TM..
The Multi-Cloud Platform
[0067] FIG. 1 depicts an embodiment of a multi-cloud platform
according to the present disclosure. The multi-cloud platform 100
is in communication, via network 128, with one or more tenant
clusters 132a, . . . Each tenant cluster 132a, . . . corresponds to
multiple tenants 136a, b, . . . , with each of the multiple tenants
136a, b, . . . in turn corresponding to a plurality of projects
140a, b, . . and worker node clusters 144a, b, . . . Each
containerized or VM-based application 148a, b, . . . n in each
project 140a, b, . . . utilizes the worker node resources in one or
more of the clusters 144a, b, . . . .
[0068] To manage the tenant clusters 132a . . . the multi-cloud
platform 100 is associated with a domain cluster 104 and comprises
an application management server 108 and associated data storage
110 and master application programming interface (API) server 114,
which is part of the master node (not shown) and associated data
storage 112. The application management server 108 communicates
with an application programming interface (API) server 152 assigned
to the tenant clusters 132a . . . to manage the associated tenant
cluster 132a . . . In some implementations, each cluster has a
controller or control plane that is different from the application
management server 108.
[0069] The servers 108 and 114 can be implemented as a physical (or
bare-metal) server or cloud server. As will be appreciated, a cloud
server is a physical and/or virtual infrastructure that performs
application- and information-processing and storage. Cloud servers are
commonly created using virtualization software to divide a physical
(bare metal) server into multiple virtual servers. The cloud server
can use an infrastructure-as-a-service (IaaS) model to process
workloads and store information.
[0070] The application management server 108 performs tenant
cluster management using two management planes or levels, namely an
infrastructure and application management layer 120 and stateful
and application services layer 124. The stateful and application
services layer 124 can abstract network and storage resources to
provide global control and persistence, span on-premises and cloud
resources, and provide intelligent placement of workloads based on
logical data locality and block storage capacity. These layers are
discussed in detail in connection with FIG. 2.
[0071] The API servers 114 and 152, which effectively act as
gateways to the clusters, are commonly each implemented as a
Kubernetes API server that implements a RESTful API over HTTP,
performs all API operations, and is responsible for storing API
objects into a persistent storage backend. Because all of the API
server's persistent state is stored in storage external to the API
server (one or both of the databases 110 and 112 in the case of the
master API server 114), the server itself is typically stateless and
can be replicated to handle request load and provide fault tolerance.
The API servers commonly provide API management (the process by which
APIs are exposed and managed by the server), request processing (the
target set of functionality that processes individual API requests
from a client), and internal control loops (the internals responsible
for background operations necessary to the successful operation of
the API server).
[0072] In one implementation, the API server receives HTTPS
requests from kubectl or any automation that sends requests to a
Kubernetes cluster. Users access the cluster using the API server 152,
which stores all the API objects in an etcd data structure. As
will be appreciated, etcd is a consistent and highly-available
key-value store used as Kubernetes' backing store for all cluster data.
The master API server 114 receives HTTPS requests from the user
interface (UI) or dmctl, providing a single endpoint of contact
for all UI functionality. It typically validates each request and
sends it to the API server 152.
shown) can reside on each tenant cluster and perform actions in
each cluster. Domain cluster components can use Kubernetes native
or CustomResourceDefinitions (CRD) objects to communicate with the
API server 152 in the tenant cluster. The agent controller can
handle the CRD objects.
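A hedged sketch of how a domain-cluster component might hand a custom resource to a tenant cluster's API server is shown below, using the Kubernetes Python client. The group, version, kind, and plural of the custom resource are hypothetical; the disclosure does not specify a particular CRD schema.

from kubernetes import client, config

# Assumes kubeconfig access to a tenant cluster's API server; the group,
# version, kind, and plural of the custom resource below are hypothetical.
config.load_kube_config()
custom_api = client.CustomObjectsApi()

snapshot_policy = {
    "apiVersion": "example.io/v1alpha1",
    "kind": "SnapshotPolicy",
    "metadata": {"name": "hourly-snapshots", "namespace": "default"},
    "spec": {"volumeSelector": {"app": "postgres"}, "intervalMinutes": 60},
}

# A domain-cluster component could hand this object to the tenant cluster's
# API server; an agent controller watching the CRD would then act on it.
custom_api.create_namespaced_custom_object(
    group="example.io",
    version="v1alpha1",
    namespace="default",
    plural="snapshotpolicies",
    body=snapshot_policy,
)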
[0073] In one implementation, the tenant clusters can run
controllers such as an HNC controller, storage agent controller, or
agent controller. The communication between domain cluster
components and a tenant cluster is via the API server 152 on the
tenant cluster. The applications on the domain cluster 104 can
communicate with applications 148 on tenant clusters 144 and the
applications 148 on one tenant cluster 144 can communicate with
applications 148 on another tenant cluster 144 to implement
specific functionality.
[0074] Data storage 110 is normally configured as a database and
stores data structures necessary to implement the functions of the
application management server 108. For example, data storage 110
comprises objects and associated definitions corresponding to each
tenant, cluster 144, and project, and references to the associated
cluster definitions in data storage 112. Other objects/definitions
include networks and endpoints (for data networks), volumes
(created from a distributed data storage pool on demand), mirrored
volumes (created to have mirrored copies on one or more other
nodes), snapshot volumes (a point-in-time image of a corresponding
set of volume data), linked clones (volumes created from snapshot
volumes are called linked clones of the parent volume and share
data blocks with the corresponding snapshot volume until the linked
clone blocks are modified), namespaces, access permissions and
credentials, and other service-related objects.
[0075] Namespaces enable the use of multiple virtual clusters
backed by a common physical cluster. The virtual clusters are
defined by namespaces. Names of resources are unique within a
namespace but not across namespaces. In this manner, namespaces
allow division of cluster resources between multiple users.
Namespaces are also used to manage access to application and
service-related Kubernetes objects, such as pods, services,
replication controllers, deployments, and other objects that are
created in namespaces.
[0076] Data storage 112 includes the data structures enabling
cluster management by the master API server 114. In one
implementation, data storage 112 is configured as a distributed
key-value lightweight database, such as an etcd key value store. In
Kubernetes, it is a central database for storing the current
cluster state at any point in time and is also used to store
configuration details such as subnets, configuration maps, etc.
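For illustration, the following sketch reads and writes a piece of cluster state through the python-etcd3 client; the endpoint and key layout are assumptions made for the example and do not reflect the key scheme Kubernetes itself maintains in etcd.

import json

import etcd3  # python-etcd3 client; assumes an etcd endpoint is reachable

etcd = etcd3.client(host="127.0.0.1", port=2379)

# Store one piece of cluster state under a hierarchical key (the layout is
# illustrative; Kubernetes manages its own key scheme internally).
etcd.put("/clusters/tenant-1/nodes/worker-1",
         json.dumps({"ready": True, "free_gib": 512}))

value, _metadata = etcd.get("/clusters/tenant-1/nodes/worker-1")
print(json.loads(value))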
[0077] The communication network 128, in some embodiments, can be
any trusted or untrusted computer network, such as a WAN or LAN.
The Internet is an example of the communication network 128 that
constitutes an IP network consisting of many computers, computing
networks, and other communication devices located all over the
world. Other examples of the communication network 128 include,
without limitation, an Integrated Services Digital Network (ISDN),
the Public Switched Telephone Network (PSTN), a cellular network,
and any other type of packet-switched or circuit-switched network
known in the art. In some embodiments, the communication network
128 may be administered by a Mobile Network Operator (MNO). It
should be appreciated that the communication network 128 need not
be limited to any one network type, and instead may be comprised of
a number of different networks and/or network types. Moreover, the
communication network 128 may comprise a number of different
communication media such as coaxial cable, copper cable/wire,
fiber-optic cable, antennas for transmitting/receiving wireless
messages, wireless access points, routers, and combinations
thereof.
[0078] With reference now to FIG. 2, additional details of the
application management server 108 will be described in accordance
with embodiments of the present disclosure. The server 108 is shown
to include processor(s) 204, memory 208, and communication
interfaces 212a . . . n. These resources may enable functionality
of the server 108 as will be described herein.
[0079] The processor(s) 204 can correspond to one or many computer
processing devices. For instance, the processor(s) 204 may be
provided as silicon, as a Field Programmable Gate Array (FPGA), an
Application-Specific Integrated Circuit (ASIC), any other type of
Integrated Circuit (IC) chip, a collection of IC chips, or the
like. As a more specific example, the processor(s) 204 may be
provided as a microcontroller, microprocessor, Central Processing
Unit (CPU), or plurality of microprocessors that are configured to
execute the instruction sets stored in memory 208. Upon executing
the instruction sets stored in memory 208, the processor(s) 204
enable various centralized management functions over the tenant
clusters.
[0080] The memory 208 may include any type of computer memory
device or collection of computer memory devices. The memory 208 may
include volatile and/or non-volatile memory devices. Non-limiting
examples of memory 208 include Random Access Memory (RAM), Read
Only Memory (ROM), flash memory, Electronically-Erasable
Programmable ROM (EEPROM), Dynamic RAM (DRAM), etc. The memory 208
may be configured to store the instruction sets depicted in
addition to temporarily storing data for the processor(s) 204 to
execute various types of routines or functions.
[0081] The communication interfaces 212a . . . n may provide the
server 108 with the ability to send and receive communication
packets (e.g., requests) or the like over the network 128. The
communication interfaces 212a . . . n may be provided as a network
interface card (NIC), a network port, drivers for the same, and the
like. Communications between the components of the server 108 and
other devices connected to the network 128 may all flow through the
communication interfaces 212a . . . n. In some embodiments, the
communication interfaces 212a . . . n may be provided in a single
physical component or set of components, but may correspond to
different communication channels (e.g., software-defined channels,
frequency-defined channels, amplitude-defined channels, etc.) that
are used to send/receive different communications to the master API
server 114 or API server 152.
[0082] The illustrative instruction sets that may be stored in
memory 208 include, without limitation, in the infrastructure and
application management layer (management plane) 120, the project
controller 216, data protection/disaster recovery controller 220,
domain/tenant cluster controller 224, policy controller 228, tenant
controller 232, and application controller 236 and, in the stateful
data and application services layer (data plane) 124, the distributed
storage controller 244, network controller 248, data protection
(DP)/disaster recovery (DR) controller 252, logical and physical
drives 256, container integration 260, and scheduler 264. Functions of the
application management server 108 enabled by these various
instruction sets are described below. Although not depicted, the
memory 208 may include instructions that enable the processor(s)
204 to store data into and retrieve data from data storage 110 and
112.
[0083] It should be appreciated that the instruction sets depicted
in FIG. 2 may be combined (partially or completely) with other
instruction sets or may be further separated into additional and
different instruction sets, depending upon configuration
preferences for the server 108. Said another way, the particular
instruction sets depicted in FIG. 2 should not be construed as
limiting embodiments described herein.
[0084] In some embodiments, the instructions for the project
controller 216, when executed by processor(s), may enable the
server 108 to control, on a project-by-project basis, resource
utilization based on project members and to control things such as
authorization of resources within a project or across other
projects using network access control list (ACL) policies. The
project groups resources such as memory, CPU, storage, and network,
and defines quotas for these resources. The project members view
or consume resources based on authorization policies. The projects
could be on only one cluster or span across multiple or different
clusters.
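A minimal sketch of per-project resource quotas, assuming the project is backed by a namespace, is shown below using the Kubernetes Python client; the project name and quota values are illustrative only.

from kubernetes import client, config

# Illustrative only: enforce a per-project quota by attaching a ResourceQuota
# to the namespace assumed to back the project.
config.load_kube_config()
core_v1 = client.CoreV1Api()

quota = client.V1ResourceQuota(
    metadata=client.V1ObjectMeta(name="project-alpha-quota"),
    spec=client.V1ResourceQuotaSpec(
        hard={
            "requests.cpu": "8",
            "requests.memory": "32Gi",
            "persistentvolumeclaims": "20",
        }
    ),
)
core_v1.create_namespaced_resource_quota(namespace="project-alpha", body=quota)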
[0085] In some embodiments, instructions for the application
mobility and disaster recovery controller 220 (at the management
plane) and the data protection disaster recovery/DP 252 (at the
data plane), when executed by processor(s), may enable the server
108 to implement containerized or VM-based application migration
from one cluster to another cluster using migration agent
controllers on individual clusters. The controller 252 could
include a snapshot subcontroller to take periodic snapshots and a
cluster health monitoring subcontroller running at the domain
cluster to monitor health of containers.
[0086] In some embodiments, the instructions for the domain/tenant
cluster controller 224, when executed by processor(s), may enable
the server 108 to control provisioning of cloud-specific clusters
and manage their native Kubernetes clusters. Other cluster
operations that can be controlled include adopting an existing
cluster, removing the cluster from the server 108, upgrading a
cluster, creating the cluster, and destroying the cluster.
[0087] In some embodiments, instructions for the policy controller
228, when executed by the processor(s), may enable the server 108
to effect policy-based management, whose goal is to capture user
intent via templates and enforce those templates declaratively for
different applications, nodes, and clusters. A policy may be specified
for application or snapshot volume storage. The policy controller can
manage policy definitions and propagate them to individual clusters.
The policy controller can interpret the policies and provide the
policy enforcement configuration to the corresponding feature-specific
controllers. The policy controller could run at the tenant cluster or
at the master node based on functionality. To implement a snapshot
policy, the policy manager can determine which application volumes
need snapshots and generate the snapshot controller configuration,
which will be propagated to the cluster hosting the volumes. The
snapshot controller in the cluster will continue taking periodic
snapshots per the configuration requirements. In the case of a cluster
health monitoring controller, the policy controller can send the
configuration to that health monitoring controller on the domain
cluster itself.
[0088] Other examples of policy control include application policy
management (e.g., containerized or VM-based application placement,
failover, migration, and dynamic resource management), storage
policy management (e.g., controlling the snapshot policy, backup
policy, replication policy, encryption policy, etc. for an
application), network policy management,
security policies, performance policies, access control lists, and
policy updates.
[0089] In some embodiments, instructions for the application
controller 236, when executed by the processor(s), may enable the
server 108 to deploy applications; effect application
failover/fallback, application cloning, and cluster cloning; and
monitor applications. In one implementation, the application
controller enables users to launch their applications from the
server 108 on individual clusters or a set of clusters using a
Kubectl command.
[0090] In some embodiments, the instructions for the distributed
storage controller 244 and scheduler 264, when executed by
processor(s), may enable the server 108 to perform storage
configuration, management and operations such as storage
migration/replication/backup/snapshots, etc. By way of example, the
distributed storage controller 244 and scheduler 264 can work with
the policy controller 228 to manage and apply snapshot and
application volume placement policies, such as limiting the number of
mirrors for application and snapshot volumes (e.g., no mirror,
two-way mirroring, as many mirrors as the volume-storing worker
nodes, etc.), minimizing the storage space needed for a snapshot or
application volume, maximizing snapshot or application volume storage
distribution across clusters and/or worker nodes, dedicating one or
more clusters and/or worker nodes to all snapshot and/or application
volumes, distributing snapshot and application volume storage
substantially evenly across clusters and/or worker nodes,
distributing snapshot and application volume storage unevenly across
clusters and/or worker nodes giving more priority to clusters and/or
worker nodes that have more space available, selecting the cluster
and/or worker node storage resources used for creation and lifecycle
management of snapshot and application volumes so as to maintain SLAs
for applications or micro-services which are using the parent
volume(s), selecting cluster and/or worker node storage resources so
as to achieve a certain RPO and/or RTO for recovery with the
distributed storage controller automatically deciding on the
placement and type of snapshot volume(s) to be created depending on
the RPO/RTO to be achieved, and the like. These
policies can be applied not only by the stateful data and
application services or data plane layer 124 within a cluster but
also by the infrastructure and application management or management
plane across clusters.
[0091] In some embodiments, the instructions for the networker
controller 248, when executed by processor(s), may enable the
server 108 to enable multi-cluster or container networking
(particularly at the data link and network layers) in which
services or applications run mostly on one cluster and, for high
availability reasons, use another cluster either on premises or on
the public cloud. The service or application can migrate to other
clusters upon user request or for other reasons. In most
implementations, services run in one cluster at a time. The networker
controller 248 can also enable services to use different clusters
simultaneously and enable communication across the clusters. The
networker controller 248 can attach one or more interfaces
(programmed to have a specific performance configuration) to a
selected container while maintaining isolation between management
and data networks. This can be done by each container having the
ability to request one or more interfaces on specified data
networks.
[0092] In some embodiments, the instructions for the logical drives
408a-n when executed by processor(s), may enable the server 108 to
provide a common API (via the Container Network Interface) for
connecting containers to an external network and expose (via the
Container Storage Interface) arbitrary block and file storage
systems to containerized or VM-based workloads. In some
implementations, CSI can expose arbitrary block and file storage
systems to containerized workloads on Container Orchestration
Systems (COs), such as Kubernetes and AWS.
[0093] In some embodiments, the instructions for the container
integration 260, when executed by processor(s), may enable the
server 108 to provide (via OpenShift) a cloud-based container
platform that is both containerization software and a
platform-as-a-service (PaaS).
[0094] FIG. 3 illustrates the operations of the scheduler 264 and
distributed storage controller 244 in more detail. The application
server 108 is in communication, via network 128, with a plurality
of worker nodes 300a-n. While FIG. 3 depicts the master API server
separate from the worker nodes, in some implementations the same
node can act as both a master and worker node.
[0095] The database 112 is depicted as an "/etc distributed" or
etcd key value store that stores physical data as key-value pairs
in a persistent b+tree. Each revision of the etcd key value store's
state typically contains only the delta from a previous revision
for storage efficiency. A single revision may correspond to
multiple keys in the tree. The key of a key-value pair is a 3-tuple
(major, sub, type). The database 112, in this implementation,
stores the entire state of a cluster: that is, it stores the
cluster's configuration, specifications, and the statuses of the
running workloads. In Kubernetes in particular, etcd's "watch"
function monitors the data so that Kubernetes can reconfigure itself
when changes occur.
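By way of non-limiting illustration, the revision-and-delta behavior described above can be modeled with a few lines of Python. This is only a simplified sketch of the described key scheme, not etcd's actual implementation or client API; the RevisionStore name and its methods are hypothetical.

    # Minimal sketch (assumption: a simplified in-memory model of the described
    # key scheme, not etcd itself). Keys are 3-tuples (major, sub, type); each
    # revision stores only the delta from the previous revision.
    class RevisionStore:
        def __init__(self):
            self.revisions = []  # list of {key_tuple: value} deltas

        def commit(self, delta):
            """Append a new revision containing only the changed keys."""
            self.revisions.append(dict(delta))

        def get(self, key, revision=None):
            """Resolve a key at a revision by walking the deltas backward."""
            upto = len(self.revisions) if revision is None else revision
            for rev in reversed(self.revisions[:upto]):
                if key in rev:
                    return rev[key]
            return None

    store = RevisionStore()
    store.commit({(1, 0, "put"): "pod-spec-v1"})
    store.commit({(1, 1, "put"): "pod-spec-v2"})  # delta only; prior keys untouched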
[0096] The worker nodes 300a-n can be part of a common cluster or
different clusters 144, the same or different projects 140, and/or
the same or different tenant clusters 132, depending on the
implementation. The worker nodes 300 comprise the compute
resources, drives on which volumes are created for applications,
and network(s) that deploy, run, and manage containerized or
VM-based applications. For example, a first worker node 300a
comprises an application 148a, a node agent 304, and a database 308
containing storage resources. The node agent 304, or Kubelet in
Kubernetes, runs on each worker node and ensures that all
containers are running and healthy in a pod and makes any
configuration changes on the worker nodes. The database 308 or
other data storage resource corresponds to the pod associated with
the worker node (e.g., the database 308 for the first worker node
300a is identified as "P0" for pod 0, the database 308 for the second
worker node 300b is identified as "P1" for pod 1, and the database
308 for the nth worker node 300n is identified as "P2" for pod 2).
Each database 308 in the first and second worker nodes 300a and 300b
is shown to include a volume associated with the respective
application 148a or 148b. The volume in the nth worker node 300n,
depending on the implementation, could be associated with either of
the applications 148a or 148b. As will be appreciated, an application's
volume can be divided among the storage resources of multiple
worker nodes and is not limited to the storage resources of the
worker node running the application.
[0097] The master API server 112, in response to user requests to
instantiate an application or create an application or snapshot
volume 312, records the request in the etcd database 112, and, in
response, the scheduler 264 determines on which database(s) 308 the
volume should be created in accordance with placement policies
specified by the policy controller 228. For example, the placement
policy can select the worker node having the least amount of storage
resources consumed at that point in time, the worker node required
for optimal operation of the selected application 148, or the worker
node selected by the user.
[0098] FIG. 4 depicts a volume layout. A snapshot volume 400 is
divided into subvolumes SV.sub.1, SV.sub.2, . . . SV.sub.n 404 to
match blocks reserved on the physical drive or the database 308. Each
subvolume is associated with a logical drive 256a-n, which associates
the respective subvolume with a selected physical address on a
database 308 for the corresponding block, where the data in that
subvolume is stored.
[0099] FIG. 5 further illustrates mapping of subvolumes for
discrete snapshot volumes-1 and -2 and the current logical drive
(LD) 408 onto physical drives in database 308 in a hyper-converged
infrastructure. As will be appreciated, a logical drive or LD
typically represents the smallest logical object in a volume
layout. Each worker node has one or more drives for creating
volumes for each application. The current LD represents the current
state or image of the volume in use by the application while the
snapshot volumes-1 and -2 represent prior states or images of the
volume at specific points in time. Though any number of maps may be
employed, an L0 and L1 map are typically used to map volumes onto
the physical drives and enable an application to locate on the
physical drive the particular block containing selected data. The
map 500 for the snapshot volume-1 comprises a logical subvolume
address 504 and, for each logical subvolume address 504, a
corresponding block address 508 on a physical drive. Thus, the
first logical subvolume address as shown by the arrow maps to a
physical drive address "pblk-a". The same relationship exists for
the snapshot volume-2 map 512 and current LD map 516.
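By way of non-limiting illustration, the maps 500, 512, and 516 can be thought of as simple lookup tables from logical subvolume address to physical block address. The following Python sketch uses the addresses named in the text; the dictionary representation and the resolve helper are assumptions for illustration only.

    # Minimal sketch (assumption: each map is a dict from logical subvolume
    # address to a physical block address, as in the maps 500, 512, and 516).
    snapshot1_map = {"sv1": "pblk-a", "svn": "pblk-n"}    # snapshot volume-1
    snapshot2_map = {"sv1": "pblk-x", "svn": "pblk-n"}    # snapshot volume-2
    current_ld_map = {"sv1": "pblk-x", "svn": "pblk-nn"}  # current LD

    def resolve(volume_map, subvolume):
        """Return the physical drive address holding a subvolume's data."""
        return volume_map[subvolume]

    assert resolve(snapshot1_map, "sv1") == "pblk-a"
    assert resolve(current_ld_map, "svn") == "pblk-nn"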
[0100] In some embodiments, the distributed storage controller 244
creates a storage space that optimizes storage for the snapshot
volume to be created. If only a few gigabytes have been written in
the volume, the distributed storage controller 244 allocates blocks
only for the data written up to that point in time. When the
distributed storage controller 244 creates a snapshot volume, it
reserves some space for that snapshot volume on the selected worker
node. At that point in time, the distributed storage controller 244
determines the usage of the volume on that particular worker node and
how much space is required for the snapshot volume to be created. If
a prior snapshot volume is already stored on the selected worker node
and the distributed storage controller 244 desires to create a second
snapshot volume, the distributed storage controller 244 first
determines the storage space required for creating the second
snapshot volume given that a prior snapshot volume is already stored
on the worker node. The distributed storage controller 244 determines
the differences, updates, or changes made to the volume since the
prior snapshot volume was created and allocates only the storage
space required for storing those differences, updates, or changes on
the selected worker node. Stated simply, the distributed storage
controller 244 tries to create or reserve only the space needed for
the second snapshot volume, though the volume itself could be huge.
In this manner, the distributed storage controller 244 can create
multiple snapshot volumes on the selected worker node even for larger
volumes.
[0101] By way of example, assume that the distributed storage
controller 244 desires to create a 100 gigabyte volume to store
data. The distributed storage controller 244 does not immediately
assign blocks for all 100 gigabytes of data but determines how many
blocks are required to store the data and locates the available
blocks on the worker node drives. The distributed storage
controller 244 then maps each block to the corresponding physical
address on the drive. Each logical block has a corresponding
physical block on the physical drive or storage media. The blocks
do not need to be contiguous but can be located anywhere. The map
enables the application to determine where each and every block is
located so that it can collect and assemble the blocks as
needed.
[0102] The operation of the distributed storage controller 244 is
further illustrated by FIG. 5. FIG. 5 illustrates that each
snapshot volume is space-optimized and shares unmodified blocks
with, or is a linked clone of, the parent volume and, if
applicable, a preceding snapshot volume. The selected first
subvolume 504 as shown by arrow maps to a block at physical drive
address "pblk-a" 508 at the time of the first snapshot. When the
second snapshot is taken, the same first subvolume 504 has changed
and now maps to a different block at a different physical drive
address "pblk-x" 508. The current LD map 516 shows that the first
subvolume 504 has not changed and still maps to the same block at
physical drive address "pblk-x" 508. In contrast, an nth subvolume
504 at the time of the first and second snapshots maps to a common
block at physical drive address "pblk-n" 508. However, the nth
subvolume in the current LD map has changed and now maps to a new
block at a different physical drive address "pblk-nn" 508. In other
words, only blocks that have changed since a preceding snapshot
volume are assigned new physical drive addresses in the later
snapshot volume; the blocks that have not changed retain the same
physical drive addresses as in the earlier snapshot volume. In this
way, the storage
required for a later snapshot volume is reduced.
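By way of non-limiting illustration, the space sharing described above resembles a copy-on-write map derivation. The Python sketch below is a simplified model under that assumption; compute_snapshot_map is a hypothetical helper, not a function of the disclosed system.

    # Minimal copy-on-write sketch (assumption): a new snapshot map reuses the
    # parent's physical block addresses and records new addresses only for the
    # blocks that changed since the preceding snapshot.
    def compute_snapshot_map(parent_map, changed_blocks):
        """changed_blocks: {subvolume: new physical address} since the parent."""
        new_map = dict(parent_map)       # share unchanged blocks with the parent
        new_map.update(changed_blocks)   # allocate only the changed blocks
        return new_map

    snapshot1 = {"sv1": "pblk-a", "svn": "pblk-n"}
    snapshot2 = compute_snapshot_map(snapshot1, {"sv1": "pblk-x"})
    # snapshot2 == {"sv1": "pblk-x", "svn": "pblk-n"}; only one new block stored.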
[0103] FIGS. 6A-6B, 7A-7C, and 8A-8C depict various user
commands.
[0104] FIG. 6A depicts a volume create request 600 that creates a
volume with three different mirrors on three different worker nodes
in a selected cluster. The request includes the command to create a
volume with three mirrors (m3) and additional fields for volume name,
size, node identity, labels, phase, and status. A volume describe
request (not shown) describes the three worker nodes on which the
volume is created and contains fields including volume name, volume
size, worker node name or identity, phase, attached to, device path,
performance tier (e.g., best effort), and scheduled plexes/actual
plexes, and, for each plex, the further fields of volume name and,
for each volume name, worker node name, state, condition,
out-of-sync-age, and resync-progress.
[0105] FIG. 6B depicts a snapshot create request 650 that creates a
(typically read-only) snapshot volume of an application volume's LDs.
The snapshot create request 650 creates a snapshot volume for
an application volume that already exists in the system. The
request does not specify any placement policy or scheduler
algorithms or mirror count, thereby permitting the snapshot volume
to be created on any one of the available worker nodes where space
exists. It will be created, by default, with a single mirror. The
request makes the snapshot volume with the same layout as the
parent application volume. Snapshot logical drives are created by
copying L0 blocks from the parent volume logical drives, and the
ownership of L1 blocks is transferred to the snapshot volume. The
snapshot volume shares layout L1 and physical data blocks with the
application volume and is therefore a linked clone of the
application volume. The snapshot volume owns the L1 block with
owner=1, while the parent application volume shares the L1 block with
owner=0 in its L0 block. Block ownership information is maintained
at two levels, the first level being the L1 block level in the L0
block and the second level being the physical block level in the L1
block. The request 650 includes the command to create a snapshot of
the selected "src vol1" and the following additional fields for
snapshot volume name, snapshot volume size, node identity, labels,
parent volume name, attached-to, device-path, phase, and age.
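By way of non-limiting illustration, the two-level ownership information can be modeled as follows. This is a simplified, assumed representation; the field names are illustrative and do not reflect the actual on-disk layout.

    # Minimal sketch (assumption): a simplified model of the two-level block
    # ownership described above; field names are illustrative only.
    parent_l0_block = {
        "l1_refs": [{"l1_id": 7, "owner": 0}],  # parent now shares L1 block 7
    }
    snapshot_l0_block = {
        "l1_refs": [{"l1_id": 7, "owner": 1}],  # snapshot owns L1 block 7
    }
    l1_block_7 = {
        "pblk_refs": [{"pblk": "pblk-a", "owner": 1}],  # physical-block ownership
    }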
[0106] FIG. 8A depicts a snapshot describe request 800 for a
specified snapshot volume. The request includes the following
fields: snapshot volume name (e.g., "snap1"), size (21.51 GB),
encryption (true or false), worker node name ("appserv19"), node
selector (none selected), phase (available), status (available),
age (27 seconds), scheduled plexes/actual plexes (1/1), and parent
volume name ("vol1"). The request further includes the following
field for related plexes: plex name ("snap1.p0"), node name
("appserv19"), and state (up).
[0107] As shown by FIG. 7C, the snapshot create request 780 can
include an option prefix to create a snapshot volume using
policy-based management. The request 780 requests creation of a
snapshot volume of "source vol1" using a placement policy
(identified as custom).
[0108] Policies can be created based on assigning weight to one or
more scheduler algorithms in connection with snapshot or
application volume mirroring (or synchronous replication). In
creating a placement policy, the user can select one or more
scheduler algorithms from a list of scheduler algorithms and, for
each selected scheduler algorithm, assign a weight.
[0109] By way of illustration, FIG. 7A depicts a snapshot placement
policy create request 700 that creates a selected snapshot
placement policy, e.g., storage optimized. FIG. 7A also depicts a
snapshot placement policy describe storage request 720 that
provides detailed information about a requested snapshot placement
policy (e.g., the storage optimized placement policy). As shown in
FIG. 7A, the storage optimized placement policy assigns a weight of
"1" to the BalanceStorageResourceUsage scheduler algorithm and "0"
to the other scheduler algorithms. In another example shown in
FIGS. 8B and 8C, the snapshot placement policy describe request 840
describes a default placement policy, which as shown by response
860 assigns a weight of "1" to the LeastRecentlyUsed scheduler
algorithm but "0" to each of the BalanceStorageResourceUsage,
SkipInitiator, MostFrequentlyUsed, SelectorBased, and AllPlexes
scheduler algorithms.
[0110] While the requests 700 and 840 depict placement policies
using only a single scheduler algorithm for volume mirroring, FIG.
7B depicts a snapshot placement policy create request 740 to create
a custom placement policy for mirroring with the user assigning
selected weights to multiple scheduler algorithms. In the request
740, the user has assigned a weight of "1" to the
BalanceStorageResourceUsage scheduler algorithm and a weight of
"10" to the SkipInitiator scheduler algorithm. As shown by the
response 760 to the request, the user has, by implication, assigned
a weight of "0" to the remaining LeastRecentlyUsed,
MostFrequentlyUsed, SelectorBased, and AllPlexes scheduler
algorithms.
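By way of non-limiting illustration, each of the above placement policies can be viewed as a mapping from scheduler algorithm to weight. The following Python sketch restates the weights described in FIGS. 7A, 7B, and 8B-8C in dictionary form; the dictionary representation itself is an assumption, not the product's policy syntax.

    # Illustrative sketch (assumption): placement policies modeled as
    # algorithm-to-weight mappings, mirroring the policies described above.
    storage_optimized_policy = {
        "BalanceStorageResourceUsage": 1,
        "LeastRecentlyUsed": 0, "SkipInitiator": 0,
        "MostFrequentlyUsed": 0, "SelectorBased": 0, "AllPlexes": 0,
    }
    default_policy = {
        "LeastRecentlyUsed": 1,
        "BalanceStorageResourceUsage": 0, "SkipInitiator": 0,
        "MostFrequentlyUsed": 0, "SelectorBased": 0, "AllPlexes": 0,
    }
    custom_policy = {
        "BalanceStorageResourceUsage": 1, "SkipInitiator": 10,
        "LeastRecentlyUsed": 0, "MostFrequentlyUsed": 0,
        "SelectorBased": 0, "AllPlexes": 0,
    }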
[0111] The BalanceStorageResourceUsage scheduler algorithm compares
the memory storage in use for each worker node of a selected set of
worker nodes and selects for snapshot or application volume storage
the worker node that has the least amount of storage in use at that
point in time. Stated differently, the algorithm compares the storage
not in use for each worker node of a selected set of worker nodes and
selects for snapshot or application volume storage the worker node
having the greatest amount of storage not in use (or highest amount
of free storage). Assume worker node 1 has one terabyte of storage in
use while worker nodes 2 and 3 each have 100 gigabytes of storage in
use. When the algorithm attempts to balance storage resource usage,
the snapshot volume will not be created on worker node 1 because it
has the maximum or highest storage in use when compared to the other
worker nodes. The algorithm attempts to balance between worker nodes
2 and 3. In other words, a first snapshot volume will be stored on
worker node 2, and a second, later snapshot volume will be stored on
worker node 3. In this manner, the algorithm performs a round-robin
between worker nodes 2 and 3 with respect to later snapshot volume
storage. Worker node 1 is not scored high until the storage resource
usage becomes balanced across all the worker nodes in the selected
cluster(s).
The scoring will be based on the current storage resource
consumption across the selected set of worker nodes.
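By way of non-limiting illustration, the following Python sketch scores worker nodes by free storage, consistent with the example above. The exact scoring formula is not given in the text, so the formula, node names, and fields here are assumptions.

    # Illustrative sketch (assumption: score is simply the free storage in bytes;
    # the disclosure does not give the exact scoring formula).
    def balance_storage_resource_usage(nodes):
        """nodes: {name: {"capacity": bytes, "used": bytes}} -> {name: score}."""
        return {name: n["capacity"] - n["used"] for name, n in nodes.items()}

    nodes = {
        "node1": {"capacity": 2 * 10**12, "used": 1 * 10**12},   # 1 TB in use
        "node2": {"capacity": 2 * 10**12, "used": 100 * 10**9},  # 100 GB in use
        "node3": {"capacity": 2 * 10**12, "used": 100 * 10**9},  # 100 GB in use
    }
    scores = balance_storage_resource_usage(nodes)
    best = max(scores, key=scores.get)  # node2 or node3, never node1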
[0112] The SkipInitiator scheduler algorithm skips the worker
node(s) executing the application from consideration as a location
for snapshot or application volume storage. The application
associated with the application volume of which the snapshot is to
be taken runs on a worker node known as the initiator node. The
user can decide not to create the snapshot volume on the initiator
node. For example, the user may create the snapshot volume for a
backup workflow and not want the backup stored on the worker node on
which the application is running, because a malfunction of the
initiator node would impact both the application and the snapshot
volume to be used in disaster recovery and restarting the
application. This algorithm enables the user to skip the initiator
node at all points in time when creating the snapshot or application
volume; a snapshot so specified will thus always skip the initiator
node. As will be appreciated, a snapshot volume is used for multiple
purposes, including data protection, which is available locally on
the node. Snapshot volumes can also be used to take a backup of an
application's data from a cluster and store the backups on the cloud.
A backup application can be run to take locally stored snapshot
volume data and copy it onto the cloud. When a user runs a backup
application, this algorithm treats it as a high-priority
application that should not be impacted by backup snapshot volume
data being collocated on a common worker node with the backup
application.
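By way of non-limiting illustration, the SkipInitiator behavior can be sketched as a simple filter over the candidate worker nodes. The function and node names below are hypothetical.

    # Illustrative sketch (assumption: the algorithm simply removes the initiator
    # node from the candidate set; scoring details are not given in the text).
    def skip_initiator(candidate_nodes, initiator_node):
        """Return the candidate nodes minus the node running the application."""
        return [n for n in candidate_nodes if n != initiator_node]

    candidates = skip_initiator(["node1", "node2", "node3"], initiator_node="node1")
    # candidates == ["node2", "node3"]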
[0113] The LeastRecentlyUsed scheduler algorithm selects for the
snapshot or application volume storage location the worker node of
a set of worker nodes that is least recently used for snapshot or
application volume storage (e.g., not used within a predetermined
time period or last used when compared to other worker nodes in the
cluster(s)), respectively. For instance if a first snapshot volume
were created on worker node 1 in the set of worker nodes, a second
snapshot volume will not be stored on worker node 1. When an
application volume but no prior snapshot volume exists on worker
nodes 2 and 3 in the set of worker nodes, the algorithm would
select the least recently used worker node for creating the
snapshot volume so that worker node 2 and 3 would have a higher
score or likelihood of being selected for storing the second
snapshot volume than worker node 1. Conversely, when both worker
nodes 2 and 3 in the set of worker nodes have been used for
snapshot volume storage at a time before worker node 1, the
algorithm would select the least recently used worker node for
creating the snapshot volume so that the least recently used of
worker node 2 and 3 for snapshot volume storage would have a higher
score or likelihood of being selected for storing the second
snapshot volume than worker node 1.
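By way of non-limiting illustration, the following sketch assumes that "least recently used" is tracked with a last-used timestamp per worker node, with never-used nodes preferred; that bookkeeping is an assumption, as the text does not specify it.

    # Illustrative sketch (assumption: last-used timestamps per node; a node
    # never used for volume storage sorts before any previously used node).
    def least_recently_used(last_used):
        """last_used: {node: last-used timestamp, or None if never used}."""
        def key(node):
            ts = last_used[node]
            return ts if ts is not None else float("-inf")
        return min(last_used, key=key)

    choice = least_recently_used(
        {"node1": 1700000300, "node2": 1700000100, "node3": None}
    )
    # choice == "node3": never-used node wins; otherwise the oldest timestamp.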
[0114] The MostFrequentlyUsed scheduler algorithm selects, for
snapshot or application volume storage, the worker node of a set of
worker nodes that was used last (or most recently) for creating a
snapshot or application volume, respectively. If worker node 1 was
picked last for a snapshot volume placement, the algorithm tries to
always pick worker node 1 for later snapshot volume placement,
until worker node 1 runs out of storage space for creating snapshot
volumes, in which case the algorithm moves to a next worker node in
the set of worker nodes.
[0115] The AllPlexes scheduler algorithm could be characterized as
"all mirrors," as it creates a snapshot volume on all the worker
nodes in the set of worker nodes where the selected application
volume's mirrors exist. The user typically selects the scheduler
algorithm when the user desires the snapshot volume to be highly
available and be available on all worker nodes on the system.
[0116] A round robin scheduler algorithm (not shown) stores
snapshot or application volumes among the worker nodes in a set of
worker nodes on a rotating basis in a specified order. The
specified order can be, for instance, a circular order in which
each worker node in the set of worker nodes receives one snapshot
or application volume per cycle. The round robin algorithm can
include round robin algorithm variants, such as weighted round
robin (e.g., classical weighted round robin, interleaved weighted
round robin, etc.) and deficit round robin or deficit weighted
round robin.
[0117] The SelectorBased scheduler algorithm gives the user the
option to specify on which worker nodes the user wants the snapshot
or application volume created. For example, assume there are three
worker nodes 1, 2, and 3 in the set of worker nodes. This algorithm
enables the user to specify, for volume 1, that he or she wants all
snapshot volumes only on worker node 1 because worker node 1 has high
availability. If the user specifies a worker node, the policy assigns
the highest placement weight to that node.
[0118] As will be appreciated, other scheduling algorithms may be
used for snapshot or application volume mirroring depending on the
application.
[0119] In formulating a custom placement policy within a selected
cluster 144, the scheduler algorithms can be used alone or
together, as determined by the user's assigned weights to the
algorithms. A weight of "0" causes the associated scheduler
algorithm not to be executed in selecting one or more worker nodes
for snapshot or application volume placement. The relative weights
of the weighted scheduler algorithms determine the relative
priorities of the weighted scheduler algorithms in worker node
selection. For example, assume a user, in defining a custom placement
policy, selects two scheduler algorithms having different weights.
The scheduler first attempts to apply both scheduler algorithms
consistently and, when they produce conflicting or inconsistent
placement selections, uses the weights as an arbiter. Each worker
node is assigned a score based on the aggregate output of both
scheduler algorithms. While any scoring algorithm can be employed,
one approach uses each of the scheduler algorithms to assign a score
to each worker node; each score is then multiplied by the
corresponding algorithm's weight to produce a weight-adjusted score.
This is done for each scheduler algorithm, and the weight-adjusted
scores are summed for each worker node. The worker node(s) having the
highest total scores are selected for
snapshot or application volume storage. If the user has a hard
storage placement requirement, then the user can choose only one
scheduler algorithm or, alternatively, assign the scheduler
algorithm that preferentially selects that worker node the highest
weight. This can provide the user with substantial flexibility and
simplicity in formulating custom snapshot and application placement
policies.
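By way of non-limiting illustration, the weighted-sum approach can be sketched as follows, using the weights and selections of the worked example in the next paragraph. The per-algorithm score of "1" for each selected node follows that example rather than a prescribed formula, and the function and node names are hypothetical.

    # Illustrative sketch (assumption: each algorithm returns a score of 1 for
    # the nodes it selects, per the worked example; real scores may differ).
    def combine_weighted_scores(selections, weights):
        """selections: {algorithm: {node: score}}; weights: {algorithm: weight}."""
        totals = {}
        for algorithm, node_scores in selections.items():
            for node, score in node_scores.items():
                totals[node] = totals.get(node, 0) + weights[algorithm] * score
        return totals

    weights = {"BalanceStorageResourceUsage": 1, "SkipInitiator": 10}
    selections = {
        "BalanceStorageResourceUsage": {"node1": 1, "node2": 1},
        "SkipInitiator": {"node1": 1, "node3": 1},
    }
    totals = combine_weighted_scores(selections, weights)
    # totals == {"node1": 11, "node2": 1, "node3": 10}; node1 is selected.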
[0120] To illustrate how worker nodes are selected using multiple
weighted scheduler algorithms, a first example will be discussed
with reference to FIG. 7B. The user has assigned a first weight of
"1" to the BalanceStorageResourceUsage scheduler algorithm and a
second weight of "10" to the SkipInitiator scheduler algorithm. If
BalanceStorageResourceUsage scheduler algorithm selects first and
second worker nodes, each is assigned a score of "1". If
SkipInitiator scheduler algorithm selects the first worker node and
a third worker node, each is assigned a score of "1". The total
weighted score for the first worker node is "11" (the sum of (10)×(1)
and (1)×(1)), for the second worker node is "1", and for the third
worker node is "10". The custom placement
policy would thus cause the scheduler to select the first worker
node as the placement location for a snapshot or application
volume.
[0121] The policy can be further modified by a user rule that any
worker node having a weighted score of at least a minimum amount is
selected as the placement location for a snapshot or application
volume. In
the prior example, a user rule setting the minimum amount as "10"
would cause the scheduler to select the first and third worker
nodes as the placement location for the snapshot or application
volume.
[0122] The placement policy can also be applied at the project 140
and/or cluster 144 level to select one or more clusters from among
a plurality of clusters for snapshot or application volume
asynchronous replication. When applied at the project 140 and/or
cluster 144 level, the placement policy can be used to select, for
a respective tenant and project, one or more clusters for snapshot
or application volume placement by treating each cluster as a
worker node in the description above.
[0123] As will be appreciated, there can be tiers or levels of
placement policies. With reference to the prior paragraph, the
scheduler algorithm could apply a first placement policy to select,
for a given tenant and project, one or more clusters 144 from among
plural clusters for snapshot or application volume asynchronous
replication, and a second placement policy to select, within a
cluster, one or more worker nodes from among plural worker nodes
for the snapshot or application volume asynchronous replication.
Apart from the object of the placement policies, the policies can
be the same or different. For example, a
BalanceStorageResourceUsage scheduler algorithm can be used at the
tenant and project level to select one or more clusters and the
SkipInitiator scheduler algorithm can be used at the cluster level
to select one or more worker nodes within the one or more clusters
selected by the BalanceStorageResourceUsage scheduler algorithm.
Alternatively, for snapshot or application volume placement a first
custom policy can assign a first set of weights to the
BalanceStorageResourceUsage and SkipInitiator scheduler algorithms
to select, at the tenant and project levels, one or more clusters
from among plural clusters and a second custom placement policy
assigning a different second set of weights to the
BalanceStorageResourceUsage and SkipInitiator scheduler algorithms
can be used at the cluster level to select one or more worker nodes
from among plural worker nodes within the one or more clusters
selected by the first custom placement policy.
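By way of non-limiting illustration, the tiered application of placement policies can be sketched as two passes of the same scoring machinery, first over clusters and then over worker nodes within the selected cluster. The helper below is hypothetical and assumes precomputed scores (for example, from the combine_weighted_scores sketch above).

    # Illustrative sketch (assumption): apply a cluster-level policy first, then
    # a node-level policy within the chosen cluster; helper names are hypothetical.
    def place_volume(cluster_scores, node_scores_by_cluster):
        """cluster_scores: {cluster: score}; node_scores_by_cluster: nested dict."""
        best_cluster = max(cluster_scores, key=cluster_scores.get)
        node_scores = node_scores_by_cluster[best_cluster]
        best_node = max(node_scores, key=node_scores.get)
        return best_cluster, best_node

    cluster, node = place_volume(
        {"cluster-a": 5, "cluster-b": 2},
        {"cluster-a": {"node1": 11, "node2": 1}, "cluster-b": {"node3": 10}},
    )
    # (cluster, node) == ("cluster-a", "node1")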
Method of Operation of the Multi-Cloud Platform
[0124] With reference to FIGS. 3 and 9-10, an embodiment of a
process to select a worker node within a cluster for snapshot
volume placement will be discussed. While the process is discussed
with reference to snapshot volume placement, it is to be understood
that it could also be applied to selecting a cluster and/or worker
node for application volume mirroring. While the process is
discussed with reference to selecting a worker node within a
cluster for snapshot volume placement, it is to be understood that
it could also be applied to selecting a cluster from among multiple
clusters for snapshot volume mirroring.
[0125] With reference to FIG. 9, the master API server 114 receives
from a user (e.g., issued by the user through a Command Line
Interface (CLI)) a snapshot create request for a specified
application's volume (step 904). An exemplary snapshot creation
request is depicted in FIG. 6B.
[0126] In step 908, the master API server 114 records the snapshot
creation request in the etcd database 112 and sets the snapshot
creation request to the pending state.
[0127] In step 912, the scheduler 264, in response to the snapshot
creation request in the pending state, identifies relevant snapshot
placement policy(s) previously selected by the user. The scheduler
264 watches the etcd database 112 for any snapshot requests that
are in the pending state.
[0128] In decision diamond 916, the scheduler 264 determines
whether the relevant policy(s) stipulate that more than one scheduler
algorithm applies.
[0129] When the relevant policy(s) specify that more than one
scheduler algorithm applies, the scheduler 264, in step 920,
determines, for each selected or weighted scheduler algorithm, the
worker nodes specified by the algorithm.
[0130] In step 924, the scheduler 264 assigns a weighted score or
ranking to each worker node as noted above.
[0131] In step 928, the scheduler 264 selects the worker node(s)
having the highest weighted score or (depending on the
implementation) at least a minimum weighted score threshold as the
worker node(s) to be used for snapshot volume placement.
[0132] When the relevant policy(s) specify that only one scheduler
algorithm applies, the scheduler 264, in step 928, selects the
worker node(s) stipulated by the output of the scheduler
algorithm.
[0133] In step 930, the scheduler 264 updates the etcd database 112
with the selected worker node(s) for the snapshot volume and sets
the snapshot volume to the scheduled state and the snapshot create
request to the completed state.
[0134] In response to the snapshot volume being in the scheduled
state, the distributed storage controller 244, in step 1004, sends
a quiesce request to the worker node(s) 300 where the application
148 corresponding to the snapshot volume is executing. The
distributed storage controller 244 watches the etcd database 112
for any snapshot volumes that are in scheduled state.
[0135] In step 1008, the distributed storage controller 244
receives a response from the node agent 304 in the worker node(s)
300 after quiescing.
[0136] The distributed storage controller 244, in step 1012, sends
a snapshot create request to the worker node(s) selected by the
scheduler 264 based on the relevant placement policies.
[0137] In step 1016, the node agent 304 of each selected worker
node 300 interacts with the quiesced worker node(s) and/or any
worker node where the application volume for the application is
stored (e.g., mirrored) to create a snapshot volume 312 and returns
a response indicating that the snapshot volume 312 has been
created.
[0138] In step 1020, the distributed storage controller 244 sends a
resume request to the worker node(s) where the application is
executing to resume execution.
[0139] In step 1024, the distributed storage controller 244 updates
the database 112 to set the snapshot volume to the available
state.
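By way of non-limiting illustration, the flow of steps 904 through 1024 can be summarized in the following Python-style sketch. The object and method names are hypothetical, and the watch-driven, asynchronous controllers are flattened into direct calls purely for readability.

    # Illustrative control-flow sketch (assumption): hypothetical duck-typed
    # objects stand in for the etcd database, scheduler, and storage controller.
    def create_snapshot(etcd, scheduler, storage_controller, request):
        etcd.record(request, state="pending")                         # steps 904-908
        policies = scheduler.lookup_policies(request)                 # step 912
        if len(policies.weighted_algorithms) > 1:                     # diamond 916
            candidates = scheduler.run_algorithms(policies)           # step 920
            scores = scheduler.weighted_scores(candidates, policies)  # step 924
            nodes = scheduler.select_best(scores)                     # step 928
        else:
            nodes = scheduler.run_single_algorithm(policies)          # step 928
        etcd.update(request, nodes=nodes, state="scheduled")          # step 930
        storage_controller.quiesce(request.app_nodes)                 # steps 1004-1008
        storage_controller.create_snapshot_on(nodes)                  # steps 1012-1016
        storage_controller.resume(request.app_nodes)                  # step 1020
        etcd.update(request, state="available")                       # step 1024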
[0140] The exemplary systems and methods of this invention have
been described in relation to cloud computing. However, to avoid
unnecessarily obscuring the present invention, the preceding
description omits a number of known structures and devices. This
omission is not to be construed as a limitation of the scope of the
claimed invention. Specific details are set forth to provide an
understanding of the present invention. It should however be
appreciated that the present invention may be practiced in a
variety of ways beyond the specific detail set forth herein.
[0141] Furthermore, while the exemplary embodiments illustrated
herein show the various components of the system collocated,
certain components of the system can be located remotely, at
distant portions of a distributed network, such as a LAN and/or the
Internet, or within a dedicated system. Thus, it should be
appreciated that the components of the system can be combined into
one or more devices, such as a server, or collocated on a
particular node of a distributed network, such as an analog and/or
digital telecommunications network, a packet-switch network, or a
circuit-switched network. It will be appreciated from the preceding
description, and for reasons of computational efficiency, that the
components of the system can be arranged at any location within a
distributed network of components without affecting the operation
of the system. For example, the various components can be located
in a switch such as a PBX and media server, gateway, in one or more
communications devices, at one or more users' premises, or some
combination thereof. Similarly, one or more functional portions of
the system could be distributed between a telecommunications
device(s) and an associated computing device.
[0142] Furthermore, it should be appreciated that the various links
connecting the elements can be wired or wireless links, or any
combination thereof, or any other known or later developed
element(s) that is capable of supplying and/or communicating data
to and from the connected elements. These wired or wireless links
can also be secure links and may be capable of communicating
encrypted information. Transmission media used as links, for
example, can be any suitable carrier for electrical signals,
including coaxial cables, copper wire and fiber optics, and may
take the form of acoustic or light waves, such as those generated
during radio-wave and infra-red data communications.
[0143] Also, while the flowcharts have been discussed and
illustrated in relation to a particular sequence of events, it
should be appreciated that changes, additions, and omissions to
this sequence can occur without materially affecting the operation
of the invention.
[0144] A number of variations and modifications of the invention
can be used. It would be possible to provide for some features of
the invention without providing others.
[0145] In one embodiment, the systems and methods of this invention
can be implemented in conjunction with a special purpose computer,
a programmed microprocessor or microcontroller and peripheral
integrated circuit element(s), an ASIC or other integrated circuit,
a digital signal processor, a hard-wired electronic or logic
circuit such as discrete element circuit, a programmable logic
device or gate array such as PLD, PLA, FPGA, PAL, special purpose
computer, any comparable means, or the like. In general, any
device(s) or means capable of implementing the methodology
illustrated herein can be used to implement the various aspects of
this invention. Exemplary hardware that can be used for the present
invention includes computers, handheld devices, telephones (e.g.,
cellular, Internet enabled, digital, analog, hybrids, and others),
and other hardware known in the art. Some of these devices include
processors (e.g., a single or multiple microprocessors), memory,
nonvolatile storage, input devices, and output devices.
Furthermore, alternative software implementations including, but
not limited to, distributed processing or component/object
distributed processing, parallel processing, or virtual machine
processing can also be constructed to implement the methods
described herein.
[0146] In yet another embodiment, the disclosed methods may be
readily implemented in conjunction with software using object or
object-oriented software development environments that provide
portable source code that can be used on a variety of computer or
workstation platforms. Alternatively, the disclosed system may be
implemented partially or fully in hardware using standard logic
circuits or VLSI design. Whether software or hardware is used to
implement the systems in accordance with this invention is
dependent on the speed and/or efficiency requirements of the
system, the particular function, and the particular software or
hardware systems or microprocessor or microcomputer systems being
utilized.
[0147] In yet another embodiment, the disclosed methods may be
partially implemented in software that can be stored on a storage
medium, executed on programmed general-purpose computer with the
cooperation of a controller and memory, a special purpose computer,
a microprocessor, or the like. In these instances, the systems and
methods of this invention can be implemented as a program embedded on
a personal computer such as an applet, JAVA.RTM. or CGI script, as a
resource residing on a server or computer workstation, as a routine
embedded in a dedicated measurement system, system component, or
the like. The system can also be implemented by physically
incorporating the system and/or method into a software and/or
hardware system.
[0148] Although the present invention describes components and
functions implemented in the embodiments with reference to
particular standards and protocols, the invention is not limited to
such standards and protocols. Other similar standards and protocols
not mentioned herein are in existence and are considered to be
included in the present invention. Moreover, the standards and
protocols mentioned herein and other similar standards and
protocols not mentioned herein are periodically superseded by
faster or more effective equivalents having essentially the same
functions. Such replacement standards and protocols having the same
functions are considered equivalents included in the present
invention.
[0149] The present invention, in various embodiments,
configurations, and aspects, includes components, methods,
processes, systems and/or apparatus substantially as depicted and
described herein, including various embodiments, subcombinations,
and subsets thereof. Those of skill in the art will understand how
to make and use the present invention after understanding the
present disclosure. The present invention, in various embodiments,
configurations, and aspects, includes providing devices and
processes in the absence of items not depicted and/or described
herein or in various embodiments, configurations, or aspects
hereof, including in the absence of such items as may have been
used in previous devices or processes, e.g., for improving
performance, achieving ease, and/or reducing cost of
implementation.
[0150] The foregoing discussion of the invention has been presented
for purposes of illustration and description. The foregoing is not
intended to limit the invention to the form or forms disclosed
herein. In the foregoing Detailed Description for example, various
features of the invention are grouped together in one or more
embodiments, configurations, or aspects for the purpose of
streamlining the disclosure. The features of the embodiments,
configurations, or aspects of the invention may be combined in
alternate embodiments, configurations, or aspects other than those
discussed above. This method of disclosure is not to be interpreted
as reflecting an intention that the claimed invention requires more
features than are expressly recited in each claim. Rather, as the
following claims reflect, inventive aspects lie in less than all
features of a single foregoing disclosed embodiment, configuration,
or aspect. Thus, the following claims are hereby incorporated into
this Detailed Description, with each claim standing on its own as a
separate preferred embodiment of the invention.
[0151] Moreover, though the description of the invention has
included description of one or more embodiments, configurations, or
aspects and certain variations and modifications, other variations,
combinations, and modifications are within the scope of the
invention, e.g., as may be within the skill and knowledge of those
in the art, after understanding the present disclosure. It is
intended to obtain rights which include alternative embodiments,
configurations, or aspects to the extent permitted, including
alternate, interchangeable and/or equivalent structures, functions,
ranges or steps to those claimed, whether or not such alternate,
interchangeable and/or equivalent structures, functions, ranges or
steps are disclosed herein, and without intending to publicly
dedicate any patentable subject matter.
* * * * *