U.S. patent application number 15/188279 was filed with the patent office on 2016-06-21 and published on 2017-12-21 under publication number 20170366604 for load balancing back-end application services utilizing derivative-based cluster metrics.
The applicant listed for this patent is Facebook, Inc. Invention is credited to Mark Warren McDuff.
Application Number: 15/188279
Publication Number: 20170366604
Kind Code: A1
Family ID: 60659976
Filed: 2016-06-21
Published: 2017-12-21
United States Patent Application 20170366604
McDuff; Mark Warren
December 21, 2017
LOAD BALANCING BACK-END APPLICATION SERVICES UTILIZING
DERIVATIVE-BASED CLUSTER METRICS
Abstract
Some embodiments include a back-end routing engine. The engine
can receive traffic data that characterizes an amount of service
requests from front-end servers to a server group of one or more
back-end servers that corresponds to a geographical tier in a
server group hierarchy. The engine can receive metric measurements
in a performance metric dimension for the server group and a
performance threshold corresponding to the performance metric
dimension and the geographical tier. The engine can estimate a
linear derivative between variable traffic data and variable
performance metric in the performance metric dimension based on
collected sample points respectively representing the traffic data
and the metric measurements. The engine can then compute, based on
the linear derivative and the performance threshold, a threshold
traffic capacity of the server group. The engine can then generate
a routing table based on the threshold traffic capacity.
Inventors: McDuff; Mark Warren (Seattle, WA)
Applicant: Facebook, Inc. (Menlo Park, CA, US)
Family ID: 60659976
Appl. No.: 15/188279
Filed: June 21, 2016
Current U.S. Class: 1/1
Current CPC Class: H04L 41/147 (20130101); H04L 43/16 (20130101); H04L 43/067 (20130101); H04L 47/823 (20130101); H04L 43/0817 (20130101); H04L 67/1002 (20130101); H04L 43/0876 (20130101); H04L 47/822 (20130101); H04L 67/42 (20130101)
International Class: H04L 29/08 (20060101); H04L 12/24 (20060101); H04L 12/26 (20060101); H04L 12/911 (20130101)
Claims
1. A computer-implemented method, comprising: receiving timestamped
traffic data that characterizes an amount of application service
requests from one or more front-end servers of a social networking
system to a server group of one or more back-end servers in the
social networking system, and wherein the server group corresponds
to a geographical tier in a server group hierarchy of application
service back-end servers in the social networking system; receiving
one or more timestamped metric measurements in a performance metric
dimension and a performance metric threshold corresponding to the
performance metric dimension and the geographical tier, wherein the
timestamped metric measurements are for the server group;
estimating a linear derivative between variable traffic data and
variable performance metric in the performance metric dimension
based on collected sample points respectively representing the
received timestamped traffic data and the received timestamped
metric measurements at different time points; computing, based on
the estimated linear derivative, a threshold traffic capacity of
the server group before the variable performance metric is
estimated to exceed the performance metric threshold; and
generating, based on the threshold traffic capacity, a routing
table for access by a front-end server to facilitate selection of a
back-end server destination from amongst the application service
back-end servers, wherein the threshold traffic capacity determines
how frequently the front-end server selects a target back-end
server outside of the server group as the back-end server
destination.
2. The computer-implemented method of claim 1, wherein said
generating the routing table includes publishing the routing table
to the front-end servers; wherein the one or more timestamped
metric measurements are collected periodically and the routing
table is updated periodically; and wherein the timestamped traffic
data is received from the front-end servers or from a traffic data
collection server.
3. The computer-implemented method of claim 1, further comprising
receiving multiple performance metric thresholds corresponding to
multiple geographical tiers in the server group hierarchy; and
wherein the routing table is configured to specify routing logic
corresponding to the geographical tiers.
4. The computer-implemented method of claim 1, further comprising:
specifying a blackhole threshold for the geographical tier or to
all tiers of the server group hierarchy; and wherein said
generating the routing table includes configuring the routing table
to route traffic away from a back-end server responsive to an
estimated metric measurement of the server group crossing the
blackhole threshold.
5. The computer-implemented method of claim 1, further comprising:
specifying a traffic movement limit for the geographical tier or to
all tiers of the server group hierarchy; and wherein said
generating the routing table includes configuring the routing table
such that the generated routing table moves traffic to or away from
the server group no more than the traffic movement limit as
compared to a last iteration of the routing table.
6. The computer-implemented method of claim 1, further comprising:
specifying a maximum traffic amount for the server group; wherein
said generating the routing table includes: configuring, in a first
interval, the routing table to distribute traffic without exceeding
the maximum traffic amount for the server group; and configuring,
in a second interval following the first interval, the routing
table to distribute traffic without regard to the maximum traffic
amount.
7. The computer-implemented method of claim 1, wherein said
estimating, computing, and generating are performed by a back-end
routing engine running on a standalone server system separate from
the front-end servers and the back-end servers.
8. The computer-implemented method of claim 1, wherein the server
group is a server cluster that includes at least a back-end server,
a datacenter that includes at least a server cluster, or a region
that includes at least a datacenter.
9. The computer-implemented method of claim 1, further comprising
expiring at least a subset of the collected sample points according
to a decay function based on timestamp data of the collected sample
points.
10. The computer-implemented method of claim 1, further comprising
reducing one or more weights of at least a subset of the collected
sample points according to a decay function based on timestamp data
of the collected sample points, wherein said estimating includes
adjusting the linear derivative based on a sample point
proportional to a weight of the sample point.
11. The computer-implemented method of claim 1, wherein the
variable traffic data and the variable performance metric have a
negative linear relationship.
12. The computer-implemented method of claim 1, further comprising
receiving metric measurements and performance metric thresholds
from different server groups of back-end servers corresponding to
different geographical tiers in the server group hierarchy.
13. The computer-implemented method of claim 1, further comprising:
computing an estimated performance metric of the server group based
on the estimated linear derivative and a current traffic amount to
the server group; and wherein the routing table is configured to
instruct the front-end server to route traffic outside of the
server group responsive to the estimated performance metric
exceeding the performance metric threshold associated with the
geographical tier.
14. The computer-implemented method of claim 13, wherein the
routing table is configured to instruct the front-end server to
route traffic outside of the server group and in another server
group within the geographical tier responsive to the estimated
performance metric exceeding the performance metric threshold
associated with the geographical tier.
15. The computer-implemented method of claim 1, wherein the linear
derivative is estimated based on statistical summaries of data
points from different back-end servers in the server group.
16. A computer readable data storage memory storing
computer-executable instructions that, when executed by a computer
system, cause the computer system to perform a computer-implemented
method, the instructions comprising: instructions for receiving
timestamped traffic data that characterizes an amount of application
service requests from one or more front-end servers of a social
networking system to a server group of one or more back-end servers
in the social networking system, and wherein the server group
corresponds to a geographical tier in a server group hierarchy of
application service back-end servers in the social networking
system; instructions for receiving one or more timestamped metric
measurements in a performance metric dimension and a performance
metric threshold corresponding to the performance metric dimension
and the geographical tier, wherein the timestamped metric
measurements are for the server group; instructions for estimating
a linear derivative between variable traffic data and variable
performance metric in the performance metric dimension based on
collected sample points respectively representing the received
timestamped traffic data and the received timestamped metric
measurements at different time points; instructions for computing,
based on the estimated linear derivative, a threshold traffic
capacity of the server group before the variable performance metric
is estimated to exceed the performance metric threshold; and
instructions for generating, based on the threshold traffic
capacity, a routing table for access by a front-end server to
facilitate selection of a back-end server from amongst the
application service back-end servers, wherein the threshold traffic
capacity determines how frequently the front-end server selects
a target back-end server outside of the server group as the
back-end server destination, and wherein the application service
back-end servers are potential traffic destinations in the routing
table.
17. The computer readable data storage memory of claim 16, wherein
the instructions further comprise: instructions for determining
whether the server group stops reporting further timestamped metric
measurements; and instructions for updating the routing table to
route traffic away from the server group.
18. The computer readable data storage memory of claim 16, wherein
the front-end servers are computer servers that process user device
requests to the social networking system and the back-end servers
are computer servers that correspond to one or more application
services that process data for the front-end servers.
19. A computer system, comprising: non-transitory memory configured
to store executable instructions; one or more processors configured
by the executable instructions to: receive timestamped traffic data
that characterizes an amount of application service requests from one
or more front-end servers of a social networking system to a server
group of one or more back-end servers in the social networking
system, and wherein the server group corresponds to a geographical
tier in a server group hierarchy of application service back-end
servers in the social networking system; receive one or more
timestamped metric measurements in a performance metric dimension
and a performance metric threshold corresponding to the performance
metric dimension and the geographical tier, wherein the timestamped
metric measurements are for the server group; collect and pair the
timestamped traffic data and the timestamped metric measurements
into sample points based on timestamps of the timestamped traffic
data and the timestamped metric measurements; estimate a linear
derivative between variable traffic data and variable performance
metric in the performance metric dimension based on collected
sample points respectively representing the received timestamped
traffic data and the received timestamped metric measurements at
different time points; compute, based on the estimated linear
derivative, a threshold traffic capacity of the server group before
the variable performance metric is estimated to exceed the
performance metric threshold; and generate, based on the threshold
traffic capacity, a routing table for access by a front-end server
to facilitate selection of a back-end server from amongst the
application service back-end servers, wherein the threshold traffic
capacity determines how frequently the front-end server selects
a target back-end server outside of the server group as the
back-end server destination.
20. The computer system of claim 19, wherein the one or more
processors are configured to receive multiple performance metric
thresholds for the geographical tier and a single application
service type, wherein the multiple performance metric thresholds
correspond to multiple performance metric dimensions, and wherein
the one or more processors are configured to: compute multiple
traffic capacities of the server group, each representing traffic
before the variable performance metric of a respective performance
metric dimension is estimated to exceed its performance metric
threshold; and select a lowest of the multiple traffic capacities
as the threshold traffic capacity.
Description
BACKGROUND
[0001] Social networking services are accessed by users to
communicate with each other, share their interests, upload images
and videos, create new relationships, etc. Social networking
services typically operate in a distributed computing environment
with data being distributed among one or more server clusters that
are each located in one of multiple data centers. A server cluster
is a grouping of server computing devices ("servers"). When a user
of a social networking service sends a query to request and/or
process data, a front-end load balancing server can be the first
server to receive the user request. The front-end load balancing
server usually routes the request to a front-end server (e.g., web
server) in one of the server clusters. The front-end server can
identify one or more back-end application services to help respond
to the user request. The front-end server generates one or more
service requests to the back-end application services based on a
routing table published by a back-end routing service.
[0002] A conventional back-end routing service faces different
challenges than the front-end load balancing server. For example,
polling performance metrics from back-end servers is much more
resource intensive than polling the performance metrics of
front-end servers. The conventional back-end routing service
publishes, periodically, a routing table based on known front-end
service request statistics and back-end server performance
statistics. The conventional back-end routing service generally
reduces the amount of traffic to a back-end server whenever that
back-end server is over its capacity limit. This conventional
technique causes the back-end servers to frequently experience
oscillations where, for example, an available back-end server
oftentimes receives a sudden overload of service requests followed
by a drought of application service requests, leading to sub-optimal
usage.
BRIEF DESCRIPTION OF THE DRAWINGS
[0003] FIG. 1 is a block diagram illustrating an overview of a social
networking system in which some embodiments may operate.
[0004] FIG. 2 is a data flow diagram illustrating a back-end
routing engine, in accordance with various embodiments.
[0005] FIG. 3 is a tree diagram illustrating a server group
hierarchy, in accordance with various embodiments.
[0006] FIG. 4 is a flow chart illustrating a method of operating a
back-end routing server system, in accordance with various
embodiments.
[0007] FIG. 5 is a block diagram of an example of a computing
device, which may represent one or more computing devices or servers
described herein, in accordance with various embodiments.
[0008] The figures depict various embodiments of this disclosure
for purposes of illustration only. One skilled in the art will
readily recognize from the following discussion that alternative
embodiments of the structures and methods illustrated herein may be
employed without departing from the principles of embodiments
described herein.
DETAILED DESCRIPTION
[0009] Various embodiments are directed to routing traffic between
front-end servers and back-end servers (e.g., application service
servers). A back-end routing engine (e.g., running on a standalone
server system or integrated with the application service servers)
is configured to generate (e.g., periodically) traffic routing
tables for a group of front-end servers. The back-end routing
engine can collect (e.g., periodically) performance metrics from a
group of back-end servers (e.g., traffic destinations) across
different geographical regions and traffic data from the group of
front-end servers. The back-end routing engine can also receive a
set of performance metric thresholds from application service
owners corresponding to the destination servers. The traffic data
characterizes the amount of traffic from the front-end servers to
the back-end servers. The performance metrics are measurements
indicative of the capacity of the back-end servers to handle more
incoming request traffic.
[0010] Each performance metric threshold can correspond to a
grouping of servers under different geographical tiers (e.g., a
cluster tier, a data center tier, a geographical region tier,
etc.). In some embodiments, the back-end routing engine can check a
most recent performance metric measurement against a performance
metric threshold of a geographical tier's border to determine
whether to route traffic outside of the border. For example, a
front-end server in the server cluster would have to route traffic
to a back-end server within the server cluster if the performance
metric threshold for the "cluster tier" has not been exceeded by
the most recent performance measurement.
[0011] In some embodiments, the back-end routing engine estimates a
linear relationship between variable traffic and variable
performance metric for each back-end server based on collected
historical data points. The back-end routing engine can reduce the
weight of or expire the data points according to one or more
time-based decay algorithms, such as an exponential decay
algorithm. In embodiments where the weights of the data points are
reduced, older data points influence the estimated linear
relationship less and less as time passes. The back-end routing engine can
further estimate that linear relationship between traffic and
performance metric for each grouping of servers (e.g., based on
statistical summaries of data points of its member servers). Using
the linear derivative estimate, the back-end routing engine can
estimate how much traffic each geographical group of servers can
take before the performance metric exceeds an assigned threshold.
The back-end routing engine enables a geographical grouping of
servers to route traffic outside of its own boundary if the
threshold associated therewith is exceeded. For example, if a
threshold is exceeded for a group of servers, a new routing table
(e.g., published periodically) generated and published by the
back-end routing engine for front-end servers would reflect traffic
routes outside of the group boundary (e.g., geographical boundary,
system boundary, sub-network boundary, or any combination thereof),
even if the front-end servers are within the group boundary.
[0012] Referring now to the figures, FIG. 1 is a block diagram
illustrating an overview of a social networking system 100 in which
some embodiments may operate. The social networking system 100 can
include one or more client computing devices (e.g., a mobile device
105A, a desktop computer 105B, a computer server 105C, a laptop
105D, etc., collectively as the "client computing devices 105").
The client computing devices 105 may operate in a networked
environment using logical connections to one or more remote
computers (e.g., front-end servers 125) through an external network
110. Requests from the client computing devices 105 can reach a
traffic routing server 115 first. The traffic routing server 115,
in some embodiments, can be a customized load balancing server that
monitors loads on different server clusters that contain at least a
front-end server. For example, one or more front-end servers can be
part of server clusters 120 (e.g., a server cluster 120A, a server
cluster 120B, a server cluster 120C, etc., collectively as the
"server clusters 120"). In some embodiments, the traffic routing
server 115 resides in a "point of presence" (PoP) outside the data
centers. In some embodiments, instead of making traffic routing
decisions randomly or based solely on load constraints, the traffic
routing server 115 can route traffic based on social information of
the social networking system 100. For example, the traffic routing
server 115 routes traffic from similar users to the same cluster
while respecting load constraints to achieve improved cache hit
rates without putting excessive load on the front-end clusters.
[0013] As illustrated, the social networking system 100 includes
the server clusters 120. The server clusters 120 may be located in
the same data center, in a different data center of the same
geographical region, or in a different data center in another
geographic region. Each server cluster can include different
functional types (e.g., a front-end server type and/or a back-end
server type) of computer servers.
[0014] In one example, the server cluster 120A is a multifunctional
server cluster and includes one or more front-end servers 125A, one
or more back-end servers 123A, and a caching layer 130A. In one
example, the server cluster 120B is a back-end only server cluster,
and includes one or more back-end servers 123B. In one example, the
server cluster 120C is a front-end only cluster, and includes only
the front-end servers 125C and the caching layer 130C. In
embodiments where a server cluster is a front-end only cluster, the
server cluster is connected to one or more back-end server
clusters. Each back-end server cluster can include one or more
databases that store data in a persistent manner.
[0015] The server clusters 120 can communicate with each other, a
back-end polling server system 132, a back-end routing server
system 136, or any combination thereof, via an internal network
140. The back-end polling server system 132 can collect performance
metric measurements from the computer servers (e.g., back-end
servers or both back-end and front-end servers) of the server
clusters 120. The back-end polling server system 132 can provide
the collected metric measurements to the back-end routing server
system 136. The back-end routing server system 136 can periodically
generate and publish one or more routing tables to the front-end
servers in the server clusters 120 to help the front-end servers
choose which back-end server to send an application service
request to.
[0016] Front-end servers, such as the front-end servers 125A and
the front-end servers 125C, can implement application services to
directly serve incoming client device requests from the client
computing devices 105. For example, a front-end server can
implement a Web server (e.g., that provides a web service) and/or
an application programming interface for mobile applications (e.g.,
applications running on general-purpose operating systems of mobile
devices) associated with the social networking system 100. The
back-end servers (e.g., the back-end servers 123A and the back-end
servers 123B) can implement application services that serve the
front-end servers. The caching layer 130A and the caching layer 130C
can cache recently used data for the front-end servers and/or
back-end servers. In some embodiments, the back-end servers and the
front-end servers respectively have different cache layers.
[0017] The traffic routing server 115 is a front-end routing
service. The traffic routing server 115 directs one or more client
device requests to the front-end servers (e.g., the front-end
servers 125A or one of the front-end servers 125C) in the server
clusters 120. A front-end server can process a client device
request from one of the client computing devices 105. To complete
the client device request, the front-end server can generate
application service requests to one or more back-end servers (e.g.,
the back-end servers 123A and/or the back-end servers 123B).
[0018] It should be noted that the term "server" as used throughout
this application refers generally to a computer, an electronic
device, a program, or any combination thereof that processes and
responds to requests (e.g., from the client computing devices 105
or from other servers). Servers can provide data to the requesting
"clients." The term "client" as used herein refers generally to a
computer, a program, an electronic device, or any combination
thereof that is capable of processing and making requests and/or
obtaining and processing any responses from servers. Client
computing devices and server computing devices may each act as a
server or client to other server/client devices. In some
embodiments, the external network 110 is a wide area network (WAN)
and the internal network 140 is also a WAN. In other embodiments,
the internal network 140 is a local area network (LAN).
[0019] FIG. 2 is a data flow diagram illustrating a back-end
routing engine 200, in accordance with various embodiments. The
back-end routing engine 200 can be a software application. The
back-end routing engine 200 can implement a routing table
generation service on a computer system (e.g., the back-end routing
server system 136). The back-end routing engine 200 can receive a
data stream of metric measurements 202 and a data stream of traffic
data 206. The data stream can provide data periodically or whenever
the data becomes available. Both the traffic data 206 and the
metric measurements 202 can be timestamped. The traffic data 206
are associated with application service requests from one or more
front-end servers of a social networking system (e.g., the social
networking system 100). The back-end routing engine 200 can be
assigned to provide one or more routing tables 210 for the
front-end servers.
[0020] The metric measurements 202 can correspond to at least a
performance metric dimension for a set of one or more back-end
servers. For example, the performance metric dimension can be
processor usage, memory usage, network bandwidth usage, request
queue length, request latency/wait time, storage usage, or any
combination thereof. The back-end routing engine 200 can receive,
for the set of one or more back-end servers, a performance metric
threshold 214 corresponding to the performance metric dimension.
The performance metric threshold 214 can correspond to a
geographical tier in a server grouping hierarchy and to an
application service type.
[0021] Every grouping of servers corresponding to the geographical
tier and to the application service type can use the performance
metric threshold 214 to determine whether to route traffic outside
of the grouping border. The performance metric threshold 214 for a
group of back-end servers can be compared against a statistical
summary of the group's metric measurements (e.g., an average, a
median, a mode, a percentile metric level, or a maximum or minimum
value of the metric measurements of the group). In some embodiments, the set of
one or more back-end servers can send the back-end routing engine
200 the performance metric threshold 214 at any time. In some
embodiments, the set of one or more back-end servers can publish
the performance metric threshold 214 to a data storage, and the
back-end routing engine 200 can poll that data storage periodically.
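By way of illustration only, a minimal Python sketch of the group-level comparison just described; the function names and the percentile summary choice are assumptions for exposition, not part of the disclosure:

    import statistics

    def group_summary(measurements, mode="p95"):
        # Reduce per-server metric measurements to one group-level value.
        if mode == "mean":
            return statistics.mean(measurements)
        if mode == "median":
            return statistics.median(measurements)
        if mode == "p95":
            # Approximate 95th percentile: sort and index at 95% of length.
            ordered = sorted(measurements)
            return ordered[min(len(ordered) - 1, int(0.95 * len(ordered)))]
        raise ValueError("unknown summary mode: " + mode)

    def exceeds_tier_threshold(measurements, threshold, mode="p95"):
        # True when the summarized group metric crosses the tier threshold.
        return group_summary(measurements, mode) > threshold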
[0022] The back-end routing engine 200 can collect and pair the
traffic data and the metric measurements into sample data points
based on timestamps of the traffic data and metric measurements. A
unit of traffic data (e.g., an x coordinate) and a unit of metric
measurement (e.g., a y coordinate) with similar timestamps (e.g.,
within a threshold time window from each other) can be labeled as a
single sample point. Based on timestamp data of the collected data
points, the back-end routing engine 200 can expire or reduce the
weight of at least a subset of the collected data points according
to a decay function. For example, the decay function can be an
exponential decay function.
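A minimal sketch of this pairing and decay weighting, assuming traffic records shaped as (timestamp, QPS) tuples and metric records as (timestamp, value) tuples; the 30-second window and 10-minute half-life defaults are illustrative assumptions:

    def pair_samples(traffic, metrics, window=30.0):
        # Pair each traffic record with the metric measurement closest in
        # time, keeping the pair only if the timestamps fall within `window`.
        samples = []
        for t_ts, qps in sorted(traffic):
            closest = min(metrics, key=lambda m: abs(m[0] - t_ts), default=None)
            if closest is not None and abs(closest[0] - t_ts) <= window:
                samples.append((t_ts, qps, closest[1]))
        return samples

    def decay_weight(sample_ts, now, half_life=600.0):
        # Exponential decay: a sample loses half its weight per half-life.
        return 0.5 ** ((now - sample_ts) / half_life)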
[0023] FIG. 3 is a tree diagram illustrating a server group
hierarchy 300, in accordance with various embodiments. In various
embodiments, front-end servers and back-end servers of a social
networking system (e.g., the social networking system 100) are
grouped hierarchically (e.g., in separate groups or in mixed
groups). In some embodiments, communication within a server group,
on average, is faster than communication between a server in the
server group and another server outside of the server group. In
some embodiments, communication between a server in a server group
and another server in a different server group in the same tier of
the server group hierarchy 300, on average, is faster than
communication between a server in a server group and another server
in a different server group in a higher tier.
[0024] The server group hierarchy 300 can include a global tier
302, a continental tier 306, a regional tier 310, a datacenter tier
314, and a cluster tier 318. The global tier 302 can be the highest
tier and includes all the servers of the social networking system.
The continental tier 306 can be the second-highest tier, and can
include a continental group 326A and a continental group 326B of
computer servers (e.g., back-end servers). Next down in this
example is the regional tier 310, which can include a regional group
330A and a regional group 330B of computer servers. Next down again
is the datacenter tier 314, which can include a datacenter group
334A and a datacenter group 334B of computer servers. The last tier
in this example can be the cluster tier 318,
and can include a server cluster 340A and a server cluster
340B.
[0025] In some embodiments, a front-end server can be in the same
group as a back-end server. In some embodiments, a server group can
include multiple back-end application services. Each back-end
application service can identify one or more performance metric
thresholds 214 respectively corresponding to one or more
performance metric dimensions.
[0026] Communication within each border of a geographical tier in
the server group hierarchy 300 is cheaper (e.g., in terms of
latency, bottlenecks, cost of connection, or any combination
thereof) than across that border. For example, a front-end server
can communicate with a back-end server in the same datacenter at a
cheaper resource cost than to communicate with a back-end server in
another datacenter.
[0027] In various embodiments, the back-end routing engine 200
compares a reported performance metric against a set of performance
metric thresholds for one or more of the server group borders in
the server group hierarchy 300. Each performance metric threshold
can be specific to an application service type (e.g., a data cache
service, a messenger service, a media player service, a newsfeed
service, a location-based service, a content filter service, a
machine learning service, etc., or any combination thereof). In
some embodiments, server groups within the same geographical tier
use the same performance metric threshold. In some embodiments,
the back-end routing engine 200 receives the metric measurements
202 collected from servers in the server group hierarchy 300. In
some embodiments, only the metric measurements of the back-end
servers are collected. In some embodiments, metric measurements of
all servers are collected. In some embodiments, metric measurements
of the back-end servers corresponding to one or more application
service types are collected.
[0028] The back-end routing engine 200 can aggregate the metric
measurements for back-end servers within a boundary (e.g., an
average or a percentile of the aggregated whole). When the
aggregated metric measurements for the back-end servers within a
border reach a performance metric threshold indicated by a
back-end service, the back-end routing engine can start to offload
requests for the back-end service to outside of the border, while
still within the next highest border. The back-end routing engine
200 can specify a "blackhole" threshold which instructs the
back-end routing engine to blackhole traffic (e.g., drop additional
traffic) when the most recent metric measurement or measurements
cross the blackhole threshold. In various
embodiments, the performance metric thresholds increase for each
border in the server group hierarchy 300. The performance metric
thresholds and the aggregation of metric measurements can be
automatically or manually configured.
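An illustrative sketch of this escalation, with tier thresholds ordered innermost to outermost; the tuple return values and argument shapes are invented for exposition:

    def routing_scope(group_metric, tier_thresholds, blackhole_threshold):
        # Traffic stays inside the first (innermost) border whose
        # threshold the group's aggregated metric has not yet crossed.
        # tier_thresholds: list of (tier_name, threshold) pairs.
        for tier, threshold in tier_thresholds:
            if group_metric <= threshold:
                return ("route-within", tier)
        if group_metric > blackhole_threshold:
            return ("blackhole", None)  # drop additional traffic
        return ("route-within", tier_thresholds[-1][0])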
[0029] The back-end routing engine 200 can take as input a time
series of current traffic data (e.g., tuples of source,
destination, and queries per second (QPS)), and a time series of
metric measurements from the back-end servers. The back-end routing
engine 200 can apply a routing algorithm, which decides how traffic
should be routed. The back-end routing engine can publish a routing
table with the determined route to the front-end servers. The
back-end routing engine 200 can repeat the update and publication
of the routing table on a preset interval or variable interval. The
update interval can be configured to be long enough to account
for response lag within the social networking system. For example,
publishing to the front-end servers and aggregating traffic data
from the front-end servers both take time. Further, metric
measurements on the back-end servers may not react instantaneously
to new traffic load. In some embodiments, the variable interval is
manually configured. In some embodiments, the back-end routing
engine measures and monitors the response lag, and determines the
variable interval based on the measured response lag.
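One way to picture this overall cycle is the loop below; `engine` and its methods are hypothetical placeholders for the inputs and outputs described above, not an API from the disclosure:

    import time

    def routing_loop(engine, interval_s=60.0):
        # Periodic cycle: ingest time series, decide routes, publish table.
        while True:
            traffic = engine.read_traffic()  # (source, destination, QPS) tuples
            metrics = engine.read_metrics()  # per back-end metric measurements
            table = engine.compute_routes(traffic, metrics)
            engine.publish(table)            # push to the front-end servers
            time.sleep(interval_s)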
[0030] The back-end routing engine 200 can read the traffic data of
the front-end servers and the metric measurements of the back-end
servers. The back-end routing engine 200 can estimate a linear
relationship (e.g., a derivative) between variable traffic
measurements (e.g., QPSs) and variable metric measurements (e.g.,
in a performance metric dimension) for each destination (e.g., a
back-end server or a server group in a geographical tier). The
back-end routing engine 200 can compute the linear relationship by
computing the slope of a line between data points representative of
sequential traffic and metric measurements. For example, the X
coordinate of a data point can be a traffic measurement and the Y
coordinate of the data point can be a metric measurement. The
back-end routing engine can use a decay function (e.g., exponential
decay) to weight newer data points higher than older data
points.
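A sketch of such a decay-weighted slope estimate over the sample points paired earlier; the closed-form weighted least-squares fit and the half-life default are assumptions for illustration:

    def weighted_slope(samples, now, half_life=600.0):
        # Decay-weighted least-squares slope d(metric)/d(traffic) over
        # (timestamp, traffic_qps, metric_value) sample points.
        sw = sx = sy = sxx = sxy = 0.0
        for ts, x, y in samples:
            w = 0.5 ** ((now - ts) / half_life)  # newer samples weigh more
            sw += w
            sx += w * x
            sy += w * y
            sxx += w * x * x
            sxy += w * x * y
        denom = sw * sxx - sx * sx
        if denom == 0.0:
            return None  # traffic did not vary enough to fit a slope
        return (sw * sxy - sx * sy) / denom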
[0031] The back-end routing engine 200 can use the estimate of the
linear relationship, for each server group, to estimate how much
traffic each server group can take before the variable metric
measurements exceed the performance metric threshold. The back-end
routing engine can compute the "capacity" of the server group
relative to a performance metric threshold at a hierarchical level.
The capacity for a server group can increase with each hierarchical
level for which a performance metric threshold has been set.
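Under the linear model, the capacity estimate reduces to a single extrapolation; a hedged sketch, assuming a metric that grows with traffic (positive slope):

    def threshold_capacity(current_qps, current_metric, slope, threshold):
        # Traffic level at which the metric is expected to reach its
        # threshold, extrapolated along the estimated linear derivative.
        if slope is None or slope <= 0.0:
            return float("inf")  # metric not observed to grow with traffic
        return current_qps + (threshold - current_metric) / slope

For instance, at 10,000 QPS with a CPU metric of 60%, an estimated slope of 0.002% per QPS, and an 80% threshold, the estimate is 10,000 + (80 - 60) / 0.002 = 20,000 QPS.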
[0032] In some embodiments, the back-end routing engine 200 starts
off with all traffic considered undistributed. The back-end routing
engine can move up the hierarchical levels for which performance
metric thresholds have been set, starting with the lowest level.
For each level, the back-end routing engine 200 can scan
undistributed traffic generated by its clients (e.g., front-end
servers) within the border of that level, and the metric
measurements of the back-end servers within that border. If it is
not possible to distribute that traffic to the back-end servers and
keep the back-end servers below their capacities at that level, the
back-end routing engine 200 can distribute the traffic so that each
back-end server reaches its capacity and the rest of the traffic is
left undistributed. If possible, then the back-end routing engine
distributes the traffic using the derivative estimations such that
the spare capacity at that hierarchical level is proportional to
the inverse of the estimated derivative, which results in the
metric being equal for the server groups to which
traffic is distributed. If there is additional undistributed
traffic left after the last hierarchical level (e.g., one level
below the global level or the global level) for which there is a
performance metric threshold and/or if that hierarchical level is
the blackhole level, then the back-end routing engine blackholes
that traffic; otherwise, the back-end routing engine
distributes the traffic in proportion to the inverse of the
estimated derivative in order to equalize the metric
measurements.
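A compressed sketch of the equal-metric distribution at one hierarchy level; the group dictionary shape is an assumption, and per-group capacity caps are ignored here for brevity:

    def distribute_level(demand_qps, groups):
        # groups: name -> {"metric": current value, "slope": d-metric/d-QPS}.
        # Choose allocations x_i so each group's predicted metric
        # metric_i + slope_i * x_i lands on a common level m, where
        # sum_i (m - metric_i) / slope_i == demand_qps.
        inv = {n: 1.0 / g["slope"] for n, g in groups.items()}
        ratio = {n: g["metric"] / g["slope"] for n, g in groups.items()}
        m = (demand_qps + sum(ratio.values())) / sum(inv.values())
        # Clamping negative allocations is a simplification; a fuller
        # version would re-solve without the clamped groups.
        return {n: max(0.0, (m - g["metric"]) * inv[n])
                for n, g in groups.items()}

Distributing so that spare capacity is proportional to the inverse of the derivative, as above, is what drives the groups' metrics toward a common level.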
[0033] In some embodiments, the back-end routing engine 200 is
configured to limit the amount of traffic onloaded to or offloaded
from a back-end server during one interval (e.g., one cycle of
generating and publishing the routing tables 210). This is useful
to maintain stability, especially because the relationship between
request traffic and metric measurements may not be linear, making
the linear estimation sub-optimal.
[0034] In some embodiments, a server group may be manually drained
(e.g., to reduce traffic) by setting a maximum QPS (e.g., 100 QPS
or 0 QPS) to the server group. The back-end routing engine 200 can
obey the maximum QPS unless all server groups in the same tier are
being drained and the sum of their maximum QPSs is less than the
traffic generated by the front-end servers. The back-end routing
engine 200 can accomplish this by doing two rounds of the traffic
routing determination. In the first round, the back-end routing
engine can distribute traffic by obeying the maximum QPS. If
additional undistributed traffic remains, the back-end routing
engine can perform a second round of traffic route determination
(e.g., as in the method 400) while not obeying the maximum QPS to
distribute the remaining traffic.
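A sketch of the two-round behavior, with per-group spare capacity and manual caps expressed in QPS; proportional spreading stands in for the full derivative-based distribution here:

    def two_round_distribution(demand_qps, spare, max_qps):
        # Round one: honor the manual drain caps.
        capped = {n: min(s, max_qps.get(n, s)) for n, s in spare.items()}
        round_one_room = sum(capped.values())
        used = min(demand_qps, round_one_room)
        scale = used / round_one_room if round_one_room else 0.0
        alloc = {n: c * scale for n, c in capped.items()}
        # Round two: if traffic remains, ignore the caps and spread the
        # remainder in proportion to the uncapped spare capacity.
        leftover = demand_qps - used
        if leftover > 0.0:
            total = sum(spare.values()) or 1.0
            for n, s in spare.items():
                alloc[n] += leftover * s / total
        return alloc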
[0035] In some embodiments, the metric measurements 202 may be
configured to have an inverse relationship with the traffic data
206 (i.e., the metric measurements 202 decrease as the traffic data
206 increases). The performance metric dimension in this example
can be idle central processing unit (CPU) percentage. In this case,
the performance metric threshold can decrease for each border in
the server group hierarchy 300.
[0036] In some embodiments, if all servers in a server group fail
to respond to polls for metric measurements, the back-end
routing engine 200 can automatically drain that server group. In
some embodiments, the back-end routing engine 200 can utilize
multiple performance metric dimensions. In these embodiments, an
application service owner can send multiple performance metric
thresholds per hierarchy tier corresponding to the multiple
performance metric dimensions. At any level with multiple
performance metric thresholds set for multiple performance metric
dimensions, the metric measurements (e.g., normalized by the
magnitude of the threshold value) that result in the lowest
capacity for that hierarchy level/tier, along with its associated
derivative, can be used for distributing traffic to that server
group at that hierarchy level.
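With multiple dimensions, the binding constraint is simply the dimension yielding the lowest estimated capacity; a minimal sketch, with an assumed dictionary shape:

    def binding_capacity(per_dimension):
        # per_dimension: dimension name -> {"capacity": ..., "slope": ...}.
        # The dimension with the lowest capacity governs routing at this
        # tier; its associated derivative is used to distribute traffic.
        name = min(per_dimension, key=lambda d: per_dimension[d]["capacity"])
        return name, per_dimension[name]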
[0037] In various embodiments, the back-end routing engine 200
advantageously balances the often competing objectives of sending
requests to the least loaded server and sending requests to a
server that is nearby in the network. When set up with proper
performance metric thresholds, the back-end routing engine 200
enables the server group hierarchy 300 to offload requests to
servers that are slightly further away during periods of high local
load, which enables the back-end routing engine 200 to provision
fewer servers locally, thus saving precious computational
resources.
[0038] Functional components (e.g., devices, engines, modules, and
data repositories, etc.) associated with the social networking
system 100, the back-end routing engine 200, and/or the server
group hierarchy 300, can be implemented as a combination of
circuitry, firmware, software, or other executable instructions.
For example, the functional components can be implemented in the
form of special-purpose circuitry, in the form of one or more
appropriately programmed processors, a single board chip, a field
programmable gate array, a network-capable computing device, a
virtual machine, a cloud computing environment, or any combination
thereof. For example, the functional components described can be
implemented as instructions on a tangible storage memory capable of
being executed by a processor or other integrated circuit chip. The
tangible storage memory may be volatile or non-volatile memory. In
some embodiments, the volatile memory may be considered
"non-transitory" in the sense that it is not a transitory signal.
Memory space and storages described in the figures can be
implemented with the tangible storage memory as well, including
volatile or non-volatile memory.
[0039] Each of the functional components may operate individually
and independently of other functional components. Some or all of
the functional components may be executed on the same host device
or on separate devices. The separate devices can be coupled through
one or more communication channels (e.g., wireless or wired
channel) to coordinate their operations. Some or all of the
functional components may be combined as one component. A single
functional component may be divided into sub-components, each
sub-component performing a separate method step or method steps of
the single component.
[0040] In some embodiments, at least some of the functional
components share access to a memory space. For example, one
functional component may access data accessed by or transformed by
another functional component. The functional components may be
considered "coupled" to one another if they share a physical
connection or a virtual connection, directly or indirectly,
allowing data accessed or modified by one functional component to
be accessed in another functional component. In some embodiments,
at least some of the functional components can be upgraded or
modified remotely (e.g., by reconfiguring executable instructions
that implement a portion of the functional components). Other
arrays, systems and devices described above may include additional,
fewer, or different functional components for various
applications.
[0041] FIG. 4 is a flow chart illustrating a method 400 of
operating a back-end routing server system, in accordance with
various embodiments. The back-end routing server system can be a
single computing device (e.g., the computing device 500) or two or
more computing devices working in sync as a collective computer
system. In some embodiments, the back-end routing server system can
include a computing device for updating a routing table specific
for a back-end service. A back-end routing application/engine that
runs a back-end routing service can be implemented on the back-end
routing server system.
[0042] At step 402, the back-end routing server system can receive
timestamped traffic data associated with application service
requests from one or more front-end servers of a social networking
system (e.g., the social networking system 100). The timestamped
traffic data can include traffic data that characterizes an amount of
requests sent to a server group of one or more back-end servers
that corresponds to a geographical tier in a server group hierarchy
(e.g., the server group hierarchy 300) of application service
back-end servers in the social networking system.
[0043] In some embodiments, the back-end routing server system
receives the timestamped traffic data from the front-end servers
themselves. In some embodiments, a traffic data collection engine
running on a standalone computer server or a computer server system
can collect the timestamped traffic data from the front-end
servers. In these embodiments, the back-end routing server system
receives the timestamped traffic data from the traffic data
collection engine. The front-end servers are computer servers that
process user queries/requests to the social networking system.
[0044] At step 404, the back-end routing server system can receive
one or more timestamped metric measurements in a performance metric
dimension for the server group. In some embodiments, the back-end
routing server system can receive metric measurements in multiple
performance metric dimensions. In some embodiments, the one or more
timestamped metric measurements are collected periodically.
[0045] At step 406, the back-end routing server system can also
receive, for the geographical tier of the server group, a
performance metric threshold corresponding to the performance
metric dimension. In some embodiments, the back-end routing server
system can receive multiple performance metric thresholds for the
geographical tier corresponding to the multiple performance metric
dimensions. In some embodiments, the back-end routing server system
can receive multiple performance metric thresholds corresponding to
multiple geographical tiers in the server group hierarchy. In some
embodiments, a performance metric threshold can correspond to a
particular geographical tier and a particular performance metric
dimension. The server group can correspond to the geographical tier
in the server group hierarchy. In some embodiments, the back-end
routing server system can receive timestamped metric measurements
and performance metric thresholds from different server groups of
back-end servers corresponding to different geographical tiers in
the server group hierarchy.
[0046] The back-end servers are computer servers that correspond to
one or more application services. The application services can
process data for the front-end servers. In some embodiments, the
back-end routing server system is a standalone computer system
separate from the front-end servers and the back-end servers. In
some embodiments, the back-end routing server system is integrated
with or part of the back-end servers and/or the front-end
servers.
[0047] At step 408, the back-end routing server system can collect
and pair the traffic data and the metric measurements into sample
points based on timestamps of the traffic data and the metric
measurements. A unit traffic data and a unit metric measurement
with similar timestamps (e.g., within a threshold time window from
each other) can be labeled as a single sample point. At step 410,
based on timestamp data, the back-end routing server system can
expire or reduce the weight of at least a subset of the collected
data points according to a decay function. For example, the decay
function can be an exponential decay function.
[0048] At step 412, the back-end routing server system can estimate
a linear relationship (e.g., a linear derivative estimate) between
variable traffic data and variable performance metric in the
performance metric dimension based on collected sample points
respectively representing the received traffic data and the
received performance metrics at different time points. In some
embodiments, the linear relationship is based on the collected
sample points representing the request traffic data and the
corresponding performance metrics in a back-end server, a server
cluster that includes at least a back-end server, a data center
that includes at least a server cluster, or a region that includes
at least a datacenter. In some embodiments, the back-end routing
server system estimates the linear relationship based on
statistical summaries of the collected data points (e.g., within
one or more intervals of computing the routing table).
[0049] At step 414, the back-end routing server system can compute,
based on the linear derivative estimate (e.g., the linear
relationship), a threshold traffic capacity of the server group
before the variable performance metric is estimated to exceed the
performance metric threshold. At step 416, the back-end routing
server system can generate and publish, based on the threshold
traffic capacity, a routing table for access by a front-end server
to facilitate selection of a back-end server destination from
amongst the application service back-end servers. In some
embodiments, the routing table is generated/updated
periodically.
[0050] The threshold traffic capacity can determine whether a
selected back-end server is in the server group or outside of the
server group. For example, the threshold traffic capacity can
determine how frequently the front-end server selects a back-end
server outside of the server group as the back-end server
destination. For example, the back-end routing server system can
compute an estimated performance metric of the server group based
on the estimated linear derivative and a current traffic amount to
the server group. The routing table can then be configured to
instruct the front-end server to route traffic outside of the
server group responsive to the estimated performance metric
exceeding the performance metric threshold associated with the
geographical tier. The routing table can be configured to instruct
the front-end server to route traffic outside of the server group
and in another server group within the geographical tier responsive
to the estimated performance metric exceeding the performance
metric threshold associated with the geographical tier.
[0051] The application service back-end servers are potential
traffic destinations in the routing table. The routing table
enables a set of front-end servers to route traffic outside of the
server group responsive to the performance metric threshold
associated with the geographical tier being exceeded.
[0052] For example, the back-end routing server system is
configured to publish the routing table for the front-end servers
whose traffic data were used to compute the linear derivative
estimate. For example, if the performance metric threshold of the
geographical tier is exceeded by the current or estimated metric
measurements of the server group, the back-end routing engine publishes
a new routing table for front-end servers to use that reflects
traffic routes outside of the boundary of the server group, even if
the front-end servers are within the boundary of the server
group.
[0053] In some embodiments, the routing table is a lookup table
keyed by specific application service type and origin server
cluster (e.g., the server cluster of the front-end server that is
performing the look-up). The value of the lookup table can be the
identity or identities of one or more back-end servers. The routing
table can be configured to specify routing logic corresponding to
all of the geographical tiers of the server group hierarchy.
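A sketch of such a lookup table with a weighted front-end selection; the service names, cluster names, backend identities, and weights below are invented for illustration:

    import random

    # Keyed by (application service type, origin server cluster); values
    # are candidate back-end destinations with routing weights.
    routing_table = {
        ("newsfeed", "cluster-a"): [("backend-a1", 0.9), ("backend-b7", 0.1)],
        ("newsfeed", "cluster-b"): [("backend-b7", 1.0)],
    }

    def pick_backend(service, origin_cluster):
        # Choose a destination with probability equal to its weight.
        dests = routing_table[(service, origin_cluster)]
        r, acc = random.random(), 0.0
        for backend, weight in dests:
            acc += weight
            if r <= acc:
                return backend
        return dests[-1][0]  # guard against float rounding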
[0054] In some embodiments, the back-end routing server system can
receive a specification of a blackhole threshold for the
geographical tier or to all tiers of the server group hierarchy.
The back-end routing server system can configure the routing table
to route traffic away from a back-end server responsive to the
estimated metric measurement of the server group crossing the
blackhole threshold. In some embodiments, the back-end routing
server system can specify a traffic movement limit for the
geographical tier or to all tiers of the server group hierarchy.
The back-end routing server system can configure the routing table
such that the generated routing table moves traffic to or away from
the server group no more than the traffic movement limit as
compared to a last iteration of the routing table.
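The traffic movement limit amounts to clamping each group's change between successive tables; a minimal sketch:

    def clamp_movement(new_qps, previous_qps, limit):
        # Cap how much traffic may move to or away from a server group
        # relative to the last published routing-table iteration.
        delta = new_qps - previous_qps
        if delta > limit:
            delta = limit
        elif delta < -limit:
            delta = -limit
        return previous_qps + delta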
[0055] In some embodiments, the back-end routing server system can
specify a maximum traffic amount for the server group. When
generating the routing table, the back-end routing server system
can configure, in a first interval, the routing table to distribute
traffic without exceeding the maximum traffic amount for the server
group and configure, in a second interval following the first
interval, the routing table to distribute traffic without regard to
the maximum traffic amount.
[0056] In some embodiments, the back-end routing server system can
determine whether the server group stops reporting further
timestamped metric measurements. If so, the back-end routing server
system can update the routing table to route traffic away from the
server group.
[0057] In some embodiments, the back-end routing server system can
receive multiple performance metric thresholds for the geographical
tier and an application service type. The multiple performance
metric thresholds can correspond respectively to multiple
performance metric dimensions. The back-end routing server system
can compute multiple traffic capacities of the server group, each
representing the traffic that the server group can take before the
variable performance metric of the respective performance metric
dimension is estimated to exceed its performance metric threshold.
The back-end routing server system can then select the lowest of
the multiple traffic capacities as the threshold traffic capacity.
[0058] While processes or blocks are presented in a given order,
alternative embodiments may perform routines having steps, or
employ systems having blocks, in a different order, and some
processes or blocks may be deleted, moved, added, subdivided,
combined, and/or modified to provide alternative or
subcombinations. Each of these processes or blocks may be
implemented in a variety of different ways. In addition, while
processes or blocks are at times shown as being performed in
series, these processes or blocks may instead be performed in
parallel, or may be performed at different times. When a process or
step is "based on" a value or a computation, the process or step
should be interpreted as based at least on that value or that
computation.
[0059] FIG. 5 is a block diagram of an example of a computing
device 500, which may represent one or more computing devices or
servers
server described herein, in accordance with various embodiments.
The computing device 500 can implement one or more computing
devices that implement the social networking system 100 of FIG. 1,
the back-end routing engine 200 of FIG. 2, and/or the server group
hierarchy 300 of FIG. 3. The computing device 500 can execute at
least part of the method 400 of FIG. 4. The computing device 500
includes one or more processors 510 and memory 520 coupled to an
interconnect 530. The interconnect 530 is an abstraction that
represents any one or more separate physical buses, point-to-point
connections, or both connected by appropriate bridges, adapters, or
controllers. The interconnect 530, therefore, may include, for
example, a system bus, a Peripheral Component Interconnect (PCI)
bus or PCI-Express bus, a HyperTransport or industry standard
architecture (ISA) bus, a small computer system interface (SCSI)
bus, a universal serial bus (USB), IIC (I2C) bus, or an Institute
of Electrical and Electronics Engineers (IEEE) standard 1394 bus,
also called "Firewire".
[0060] The processor(s) 510 is/are the central processing unit
(CPU) of the computing device 500 and thus controls the overall
operation of the computing device 500. In certain embodiments, the
processor(s) 510 accomplishes this by executing software or
firmware stored in memory 520. The processor(s) 510 may be, or may
include, one or more programmable general-purpose or
special-purpose microprocessors, digital signal processors (DSPs),
programmable controllers, application specific integrated circuits
(ASICs), programmable logic devices (PLDs), trusted platform
modules (TPMs), or the like, or a combination of such devices.
[0061] The memory 520 is or includes the main memory of the
computing device 500. The memory 520 represents any form of random
access memory (RAM), read-only memory (ROM), flash memory, or the
like, or a combination of such devices. In use, the memory 520 may
contain code 570 containing instructions according to the back-end
routing techniques disclosed herein.
[0062] Also connected to the processor(s) 510 through the
interconnect 530 are a network adapter 540 and a storage adapter
550. The network adapter 540 provides the computing device 500 with
the ability to communicate with remote devices over a network and
may be, for example, an Ethernet adapter or Fibre Channel adapter.
The network adapter 540 may also provide the computing device 500
with the ability to communicate with other computers. The storage
adapter 550 enables the computing device 500 to access a persistent
storage, and may be, for example, a Fibre Channel adapter or SCSI
adapter.
[0063] The code 570 stored in memory 520 may be implemented as
software and/or firmware to program the processor(s) 510 to carry
out actions described above. In certain embodiments, such software
or firmware may be initially provided to the computing device 500
by downloading it from a remote system through the computing device
500 (e.g., via network adapter 540).
[0064] The techniques introduced herein can be implemented by, for
example, programmable circuitry (e.g., one or more microprocessors)
programmed with software and/or firmware, or entirely in
special-purpose hardwired circuitry, or in a combination of such
forms. Special-purpose hardwired circuitry may be in the form of,
for example, one or more application-specific integrated circuits
(ASICs), programmable logic devices (PLDs), field-programmable gate
arrays (FPGAs), etc.
[0065] Software or firmware for use in implementing the techniques
introduced here may be stored on a machine-readable storage medium
and may be executed by one or more general-purpose or
special-purpose programmable microprocessors. A "machine-readable
storage medium," as the term is used herein, includes any mechanism
that can store information in a form accessible by a machine (a
machine may be, for example, a computer, network device, cellular
phone, personal digital assistant (PDA), manufacturing tool, any
device with one or more processors, etc.). For example, a
machine-accessible storage medium includes
recordable/non-recordable media (e.g., read-only memory (ROM);
random access memory (RAM); magnetic disk storage media; and/or
optical storage media; flash memory devices), etc.
[0066] The term "logic," as used herein, can include, for example,
programmable circuitry programmed with specific software and/or
firmware, special-purpose hardwired circuitry, or a combination
thereof.
[0067] Some embodiments of the disclosure have other aspects,
elements, features, and steps in addition to or in place of what is
described above. These potential additions and replacements are
described throughout the rest of the specification. Reference in
this specification to "various embodiments" or "some embodiments"
means that a particular feature, structure, or characteristic
described in connection with the embodiment is included in at least
one embodiment of the disclosure. Alternative embodiments (e.g.,
referenced as "other embodiments") are not mutually exclusive of
other embodiments. Moreover, various features are described which
may be exhibited by some embodiments and not by others. Similarly,
various requirements are described which may be requirements for
some embodiments but not other embodiments. Reference in this
specification to where a result of an action is "based on" another
element or feature means that the result produced by the action can
change depending at least on the nature of the other element or
feature.
* * * * *