U.S. patent application number 16/267263 was filed with the patent office on 2019-02-04 and published on 2019-06-06 for point of presence based data uploading.
The applicant listed for this patent is Amazon Technologies, Inc. Invention is credited to David Alexander Dunlap, Katarzyna Anna Puchala, Anton Stephen Radlein.
Application Number | 20190173941 16/267263 |
Family ID | 65496071 |
Filed Date | 2019-02-04 |
United States Patent Application | 20190173941 |
Kind Code | A1 |
Puchala; Katarzyna Anna; et al. | June 6, 2019 |
POINT OF PRESENCE BASED DATA UPLOADING
Abstract
A system, method and computer-readable medium for data uploading
based on points of presence (POPs) are provided. In response to a
client's request for data uploading, the system provides routing
information for POPs that may facilitate data communications
between the client and a data storage service provider. The client
may fragment the upload data and transmit the data fragments via
data connections to POPs, which in turn may relay the received
fragments to the data storage service provider. Upon receipt of
necessary data fragments, the data storage service provider may
merge the data fragments to reconstruct a copy of the upload data
for storage.
Inventors: | Puchala; Katarzyna Anna (Kirkland, WA); Radlein; Anton Stephen (Seattle, WA); Dunlap; David Alexander (Seattle, WA) |
Applicant: |
Name | City | State | Country | Type |
Amazon Technologies, Inc. | Seattle | WA | US | |
Family ID: | 65496071 |
Appl. No.: | 16/267263 |
Filed: | February 4, 2019 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
14666205 | Mar 23, 2015 | 10225326 |
16267263 | | |
Current U.S. Class: | 1/1 |
Current CPC Class: | H04L 67/1097 20130101; H04L 12/5691 20130101; H04L 69/18 20130101; H04L 67/06 20130101; H04L 47/125 20130101 |
International Class: | H04L 29/08 20060101 H04L029/08; H04L 12/803 20060101 H04L012/803 |
Claims
1. A computer-implemented method comprising: under control of a
client computing device configured with specific computer
executable instructions: transmitting, to one or more computing
devices, a request to upload target data from the client computing
device to the one or more computing devices; receiving routing
information corresponding to a set of intermediate devices selected
by the one or more computing devices to facilitate uploading of the
target data from the client computing device to the one or more
computing devices; transmitting a first fragment of the target data
to a first intermediate device in the set of intermediate devices
and a second fragment of the target data to a second intermediate
device in the set of intermediate devices; and transmitting a third
fragment of the target data to a third intermediate device not in
the set of intermediate devices instead of to the second
intermediate device in response to a service interruption between
the client computing device and the second intermediate device.
2. The computer-implemented method of claim 1, further comprising
generating the first fragment in accordance with a data
fragmentation encoding.
3. The computer-implemented method of claim 2, wherein the data
fragmentation encoding comprises forward error correction code.
4. The computer-implemented method of claim 1, wherein transmitting
the first fragment further comprises transmitting the first
fragment via a first network path, wherein the first intermediate
device corresponds to a destination of the first network path.
5. The computer-implemented method of claim 4, wherein the first
intermediate device transmits the first fragment to the one or more
computing devices via a second network path, wherein the first
intermediate device corresponds to a source of the second network
path.
6. The computer-implemented method of claim 5, wherein the one or
more computing devices receives the second fragment from the second
intermediate device, and wherein the one or more computing devices
merges the first fragment and the second fragment to form at least
part of a copy of the target data.
7. The computer-implemented method of claim 1, wherein the request
to upload target data includes at least one of a size, type, or
priority associated with the target data.
8. The computer-implemented method of claim 1, wherein the set of
intermediate devices are selected based on performance information
associated with at least one of the intermediate devices in the set
of intermediate devices.
9. The computer-implemented method of claim 8, wherein the
performance information corresponds to at least one of latency,
geographic proximity, bandwidth, throughput, capacity, cost, load,
or availability.
10. A non-transitory computer readable storage medium storing
computer executable instructions that when executed by one or more
processors of a client computing device perform operations
comprising: transmitting, to a data storage system, a request to
upload target data; processing routing information received for a
set of intermediate devices; transmitting a first fragment of the
target data to a first intermediate device in a first subset of
intermediate devices of the set of intermediate devices, the first
subset of intermediate devices selected to facilitate the target
data upload; identifying a connectivity issue between the client
computing device and the first intermediate device; and
transmitting a second fragment to a second intermediate device in
the set of intermediate devices instead of to the first
intermediate device, wherein the second intermediate device is not
in the first subset of intermediate devices.
11. The non-transitory computer readable storage medium of claim
10, wherein the operations further comprise transmitting the first
fragment to the first intermediate device in accordance with a
first network protocol, and wherein the first intermediate device
transmits the first fragment to the data storage system in
accordance with a second network protocol.
12. The non-transitory computer readable storage medium of claim
11, wherein the data storage system reconstructs the target data
using at least the first fragment and the second fragment.
13. The non-transitory computer readable storage medium of claim
10, wherein the routing information comprises performance
information corresponding to the first intermediate device.
14. The non-transitory computer readable storage medium of claim
13, wherein the operations further comprise selecting the first
subset of intermediate devices based, at least in part, on the
performance information.
15. The non-transitory computer readable storage medium of claim
13, wherein the performance information corresponds to at least one
of latency, geographic proximity, bandwidth, throughput, capacity,
cost, load, or availability.
16. The non-transitory computer readable storage medium of claim
10, wherein the request to upload target data includes at least one
of a size, type, or priority associated with the target data.
17. A system comprising: a data store configured to at least store
computer-executable instructions; and a processor in communication
with the data store, the processor configured to execute the
computer-executable instructions to at least: transmit, to a data
storage system, a request to upload data; in response to the
request, process routing information received for a set of
intermediate devices; transmit a first portion of the data to a
first intermediate device in a first subset of intermediate devices
of the set of intermediate devices, the first subset of
intermediate devices selected to facilitate the data upload;
identify a connectivity issue between the system and the first
intermediate device; and transmit a second portion of the data to a
second intermediate device in the set of intermediate devices
instead of to the first intermediate device, wherein the second
intermediate device is not a member of the first subset of
intermediate devices.
18. The system of claim 17, wherein the first subset of
intermediate devices are selected by the processor to facilitate an
upload of the data.
19. The system of claim 17, wherein the processor is further
configured to execute the computer-executable instructions to at
least generate the first portion of the data and the second portion
of the data based on at least one of a horizontal, vertical,
sequential, or randomized scheme applied to the data.
20. The system of claim 17, wherein the data storage system
reconstructs the data using at least the first portion and the
second portion.
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application is a continuation of U.S. patent
application Ser. No. 14/666,205, entitled "POINT OF PRESENCE BASED
DATA UPLOADING" and filed on Mar. 23, 2015, which is hereby
incorporated by reference herein in its entirety.
BACKGROUND
[0002] Generally described, computing devices and communication
networks can be utilized to exchange information. In a common
application, a computing device can upload data to another
computing device via a communication network. For example, a user
at a personal computing device can utilize a data transfer protocol
to send digital media files, computer executable code, system
backup images, etc., to a server computing device via the Internet.
In such embodiments, the user computing device can be referred to
as a client computing device and the server computing device can be
referred to as a data storage service provider. In another common
application, a client computing device can request data from
another computing device via a communication network. For example,
a user at a client computing device can utilize a software browser
application to request a Web page or application from a server
computing device via the Internet. In such embodiments, the server
computing device can be referred to as a content provider.
[0003] Some data storage service providers are associated with
content providers, which may facilitate the delivery of requested
content, such as Web pages or resources, through the utilization of
a point of presence ("POP") service provider. A POP service
provider typically maintains a number of computing devices,
generally referred to as "points of presence" or "POPs" in a
communication network. The POPs can include data storage components
that maintain content from various content providers. In turn,
content providers can instruct, or otherwise suggest to, client
computing devices to request some, or all, of a content provider's
content from the POPs, allowing content providers to deliver
content closer to clients.
BRIEF DESCRIPTION OF THE DRAWINGS
[0004] The foregoing aspects and many of the attendant advantages
will become more readily appreciated as the same become better
understood by reference to the following detailed description, when
taken in conjunction with the accompanying drawings. Throughout the
drawings, reference numbers may be re-used to indicate
correspondence between referenced elements. The drawings are
provided to illustrate example embodiments described herein and are
not intended to limit the scope of the disclosure.
[0005] FIG. 1 is a block diagram illustrative of a data
communication environment including a number of client computing
devices, a routing service provider, a data storage service
provider, and a point of presence service provider;
[0006] FIG. 2 is a block diagram of the data communication
environment of FIG. 1 illustrating POP routing information being
provided in response to a request from a client computing
device;
[0007] FIG. 3 is a block diagram of the data communication
environment of FIG. 1 illustrating the generation, transmitting and
merging of fragments of upload data; and
[0008] FIG. 4 is a flowchart illustrative of a parallelized data
uploading routine implemented by a routing service provider and a
data storage service provider.
DETAILED DESCRIPTION
[0009] Generally described, the present disclosure is directed to
data communication between a client computing device and a data
storage service provider via one or more intermediate devices or
systems. Specifically, aspects of the disclosure will be described
with regard to uploading data from a client computing device to a
data storage service provider utilizing multiple points of presence
(POPs). Additionally, aspects of the disclosure will be described
with regard to fragmentation of upload data by the client computing
device, transmission of the data fragments via POPs, and merging
the data fragments by the data storage service provider.
[0010] In accordance with an illustrative embodiment, a data
storage service provider is communicatively connected with one or
more POPs. For example, the data storage service provider may
correspond to or otherwise be associated with a content provider,
which utilizes a POP service provider for delivering content to
client computing devices. Illustratively, a POP service provider
may correspond to a content delivery network (CDN) service
provider, which maintains multiple POP locations across a
communication network and assists the content provider in efficient
content delivery to clients. Alternatively, the data storage
service provider may include or be directly associated with
multiple POPs to facilitate communications with client computing
devices.
[0011] When the data storage service provider receives a request
from a client computing device to upload data, the data storage
service provider determines which POPs may facilitate the uploading. This
can be facilitated by a routing service provider associated with
the data storage service provider. Illustratively, the associated
routing service provider can make this determination based on POP
performance information, such as latency, geographic proximity,
bandwidth, throughput, capacity, cost, load or availability. In
some embodiments, the routing service provider maintains and
updates the POP performance information, based on characteristics
of past or ongoing communications with the POPs. In other
embodiments, the routing service provider obtains relevant POP
performance information from an associated POP service provider.
The routing service provider then provides routing information
regarding the determined POPs to the client computing device. For
example, the routing service provider may provide Internet Protocol
(IP) addresses corresponding to the POPs to the client computing
device. Alternatively or in addition, the routing service provider
may request the associated POP service provider to determine POPs
that may facilitate the requested data upload and to provide
routing information regarding the determined POPs.
[0012] Upon receipt of the routing information, the client
computing device may further evaluate the POPs included in the
routing information and decide which POPs to use for data
uploading. For example, the client computing device may test the
speed, robustness, stability, protocol compatibility or other
characteristics of communication with individual POPs. The client
computing device may fragment the data to be uploaded and establish
network connections with at least a subset of the POPs based on the
routing information and/or the POP evaluation results. The client
computing device then attempts to distribute and transmit the data
fragments to each of the subset of POPs, and may adjust respective
quantities of data fragments that are being assigned to different
POPs based on the performance of data transmissions thereto. In
some embodiments, the client computing device may decide to cease
data transmission to certain POPs due to inadequate performance. In
some embodiments, the client computing device may redirect
transmission of certain data fragments to POPs that were not
initially selected to facilitate the data upload.
[0013] The POPs that have received at least some of the data
fragments may forward or relay the data fragments using their
existing communication channels, such as network paths via a
backbone or overlay network, to the data storage service provider.
Upon receipt of the data fragments relayed from one or more POPs,
the data storage service provider may merge or otherwise process
the fragments to reconstruct a single copy of the upload data and
confirm completion of data upload with the client computing device.
In some embodiments, redundancies are built into the data
fragmentation (e.g., based on a forward error correction code) so
that the data storage service provider does not need to receive all
the data fragments and may reconstruct the upload data based on a
sufficiently large proportion of the data fragments.
[0014] Although various aspects of the disclosure will be described
with regard to illustrative examples and embodiments, one skilled
in the art will appreciate that the disclosed embodiments and
examples should not be construed as limiting. For example, although
aspects of the disclosure will be described with regard to specific
service providers such as a data storage service provider, a
routing service provider or a POP service provider, one skilled in
the relevant art will appreciate that aspects of the disclosure may
be implemented by a single service provider or various types of
service providers, or that a service provider implementing aspects
of the disclosure is not required to have the specific components
utilized in the illustrative examples.
[0015] FIG. 1 is a block diagram illustrative of a data
communication environment 100 for the management and processing of
data uploads. As illustrated in FIG. 1, the data communication
environment 100 includes a number of client computing devices 102
("clients") uploading data or otherwise communicating with a
routing service provider, a data storage service provider, a POP
service provider, or other service providers. In an illustrative
embodiment, the clients 102 can correspond to a wide variety of
computing devices including desktop computers, laptop computers,
tablets, personal digital assistants (PDAs), mobile phones,
electronic book readers, other wireless handheld devices, set-top
or other television boxes, media players, video game platforms,
kiosks, and/or the like.
[0016] In an illustrative embodiment, the clients 102 include
necessary hardware and software components for establishing
communications over a communication network 108. For example, the
client computing devices 102 may be equipped with networking
equipment and browser software applications that facilitate
communications via the network 108. In particular, the clients 102
may include or otherwise be associated with a data upload module
112, implemented in hardware or software. The data upload module
112 may transmit data upload requests, receive POP routing
information, generate fragments of upload data, establish
connections with POPs, transmit upload data fragments, and/or
implement other related functionalities as disclosed herein.
[0017] The network 108 can be a publicly accessible network of
linked networks, possibly operated by various distinct parties,
such as the Internet. In other embodiments, the network 108 may
include a private network, personal area network ("PAN"), LAN, WAN,
cable network, satellite network, any other medium of computer data
transfer, or some combination thereof.
[0018] The data communication environment 100 can also include a
routing service provider 103 in communication with the one or more
clients 102 via the communication network 108. The routing service
provider 103 illustrated in FIG. 1 corresponds to a logical
association of one or more computing devices associated with a
routing service provider, a data storage service provider and/or a
content provider. Specifically, the routing service provider 103
can include a POP routing service 110 corresponding to one or more
computing devices for obtaining and processing POP routing requests
for uploading data from the clients 102 to a data storage service
provider 104 and for providing POP routing information in response.
The POP routing service 110 may be associated with a data store
maintaining performance data, such as latency, geographic
proximity, bandwidth, throughput, capacity, cost, load or
availability, for individual POPs. The performance data may be
generated based on past or ongoing data communications between the
data storage service provider 104 and individual POPs.
Alternatively or in addition, the performance data can be
constantly updated by POPs or their associated service
provider.
[0019] The data communication environment 100 can further include a
data storage service provider 104 in communication with the one or
more clients 102 and the routing service provider 103 via the
communication network 108. The data storage service provider 104
illustrated in FIG. 1 corresponds to a logical association of one
or more computing devices associated with a data storage service
provider and/or a content provider. Specifically, the data storage
service provider 104 can include a data storage service 120 and
associated storage component corresponding to one or more computing
devices for obtaining and merging upload data fragments and for
storing reconstructed copies of upload data.
[0020] One skilled in the relevant art will appreciate that the
routing service provider 103 or data storage service provider 104
can be associated with various additional computing resources, such
as additional computing devices for administration of data and
resources, DNS nameservers, and the like. For example, although not
illustrated in FIG. 1, the routing service provider 103 or data
storage service provider 104 can be associated with one or more DNS
nameserver components that receive DNS queries associated with the
domain of the data storage service provider 104 and that would be
authoritative to resolve client computing device DNS queries
corresponding to a domain of the data storage service provider
(e.g., return one or more IP addresses responsive to the DNS
query).
[0021] With continued reference to FIG. 1, the data communication
environment 100 can further include a POP service provider 106 in
communication with the one or more clients 102, the routing service
provider 103 and the data storage service provider 104 via the
communication network 108. The POP service provider 106 illustrated
in FIG. 1 corresponds to a logical association of one or more
computing devices associated with a POP service provider.
Specifically, the POP service provider 106 can include a number of
point of presence ("POP") locations 116 that correspond to nodes on
the communication network 108. Each POP 116 may include a data
storage component made up of a number of computing devices for
caching or storing data for the data storage service provider 104,
an associated content provider, or other service providers. In some
embodiments, the POP service provider may include or be associated
with a data store for maintaining information regarding individual
POP performance with respect to different service providers and/or
clients.
[0022] Although the POPs 116 are illustrated in FIG. 1 as logically
associated with the POP service provider 106, the POPs 116 can be
geographically distributed throughout the communication network 108
in a manner to best serve various demographics of clients 102.
Additionally, one skilled in the relevant art will appreciate that
the POP service provider 106 can be associated with various
additional computing resources, such as DNS nameservers, computing
devices or components for rearranging, regrouping, or otherwise
manipulating data fragments, and the like.
[0023] With reference now to FIG. 2 and FIG. 3, the interaction
between various components of the data communication environment 100 of FIG. 1
will be illustrated. For purposes of the example, however, the
illustration has been simplified such that many of the components
utilized to facilitate communications are not shown. One skilled in
the relevant art will appreciate that such components can be
utilized and that additional interactions would accordingly occur
without departing from the spirit and scope of the present
disclosure.
[0024] FIG. 2 is a block diagram of the data communication
environment 100 of FIG. 1 illustrating POP routing information
being provided in response to a client computing device's POP
routing request for data upload. As illustrated in FIG. 2, at (1),
a client 102 transmits a request for POP routing information to the
POP routing service 110. In some embodiments, the request may
correspond to a form of DNS query. In other embodiments, the client
102 may utilize an application program interface ("API") to send
this request to the POP routing service 110. The request may
include information about the data to be uploaded, such as one or
more file identifiers, sizes, types, or priorities associated with
the data. The request may also include information about the
requesting client 102, such as geographic or network-related
location information, computational or networking resources, data
fragmentation preferences, data transfer capability or limitations,
etc.
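The request described above could be represented, for illustration only, as a small structured payload. The following is a minimal sketch: every field name, the `UploadRequest` class, and the JSON encoding are assumptions made for this example, since the disclosure lists only the kinds of information a request may carry and leaves the wire format open (e.g., a DNS query or an API call).

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class UploadRequest:
    # Hypothetical fields; the disclosure names only the categories of
    # information (identifiers, sizes, types, priorities, client details).
    file_ids: list
    total_size_bytes: int
    data_type: str
    priority: str
    client_region: str
    fragmentation_preference: str = "sequential"


def encode_request(req: UploadRequest) -> str:
    """Serialize the request for an API call (JSON is an assumed encoding)."""
    return json.dumps(asdict(req), sort_keys=True)


request = UploadRequest(
    file_ids=["backup-001"],
    total_size_bytes=5_000_000,
    data_type="system-backup-image",
    priority="normal",
    client_region="us-west",
)
payload = encode_request(request)
```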
[0025] At (2), the POP routing service 110 processes the data
upload request. The POP routing service 110 may identify
information about the upload data and requesting client, and
retrieve relevant POP performance data for determining POPs 116
that are potentially suitable to facilitate the data uploading.
Optionally at (3), the POP routing service 110 may request and
retrieve from the POP service provider 106 additional or specific
POP information that may assist the analysis of POP performance.
For example, the POP routing service 110 may provide a geographic
or network location corresponding to the requesting client 102 and
request characteristics and performance data of POPs 116 that may
handle high volumes of traffic from the geographic or network
location, have short latencies or large bandwidths for
communicating with clients close to the location, or are otherwise
associated with the location.
[0026] At (4), the POP routing service 110 determines a list of
POPs 116 that are potentially suitable to facilitate the data
uploading. The determination of potentially suitable POPs may
include an analysis of throughput rate from the requesting client's
geographic region to the data storage service provider, ability to
handle the specific type of upload file or fragments, current load
and spare capacity, combination of the same, or the like, as
examples. In some embodiments, the POP routing service 110 may
filter out certain POPs 116 using thresholds on one or more
attributes included in the POP performance data. In other
embodiments, the POP routing service 110 may compute a suitability
score for individual POPs 116 based on a combination of performance
attribute values, and select a specified number of POPs with top
scores.
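One possible way to combine the two approaches described above (threshold filtering followed by a suitability score and top-N selection) is sketched below. The `PopStats` fields, the threshold values, and the particular scoring formula are illustrative assumptions; the disclosure does not prescribe how performance attributes are weighted.

```python
from dataclasses import dataclass


@dataclass
class PopStats:
    # Hypothetical subset of the POP performance data.
    name: str
    latency_ms: float
    bandwidth_mbps: float
    load: float  # fraction of capacity in use, 0.0 to 1.0


def select_pops(pops, max_latency_ms=200.0, max_load=0.9, top_n=3):
    """Filter POPs by thresholds, then keep the top_n by suitability score."""
    eligible = [
        p for p in pops
        if p.latency_ms <= max_latency_ms and p.load <= max_load
    ]

    def score(p):
        # Assumed formula: reward spare bandwidth, penalize latency.
        return p.bandwidth_mbps * (1.0 - p.load) / (1.0 + p.latency_ms)

    return sorted(eligible, key=score, reverse=True)[:top_n]


candidates = [
    PopStats("pop-a", 20.0, 100.0, 0.5),
    PopStats("pop-b", 250.0, 500.0, 0.1),   # excluded: latency above threshold
    PopStats("pop-c", 40.0, 200.0, 0.2),
    PopStats("pop-d", 10.0, 50.0, 0.95),    # excluded: load above threshold
]
selected = [p.name for p in select_pops(candidates, top_n=2)]
```

In this example, pop-c ranks ahead of pop-a because its larger spare bandwidth outweighs its higher latency under the assumed formula.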
[0027] At (5), the POP routing service 110 sends POP routing
information for data uploading to the requesting client 102. For
example, the POP routing service 110 may send a list of candidate
POPs 116 with corresponding IP addresses or other network addresses
or identifiers to the requesting client 102 in response to its API
call for POP routing information. The POP routing information may
include a portion of POP performance data relevant to the requested
data upload. In some embodiments, the POP routing information may
include information for routing to the data storage service
provider 104 directly. For example, the POP routing service 110 may
have determined that one or more of the data storage service
provider's own servers are potentially suitable for receiving data
communications from the requesting client 102 directly.
Accordingly, POP routing information may list the one or more
servers of the data storage service provider 104 with corresponding
IP addresses or other network addresses or identifiers.
[0028] At (6), the POP routing service 110 provides information
regarding the upload data to the data storage service 120. For
example, the POP routing service 110 may provide one or more file
identifiers, sizes, types, priorities, or data fragmentation
preferences associated with the upload data so that the data
storage service 120 may perform appropriate actions (e.g., prepare
storage space, allocate computation or networking resources, etc.)
to facilitate the data upload. In some embodiments, the data
storage service 120 may receive such information directly from the
client 102 in another request.
[0029] FIG. 3 is a block diagram of the data communication
environment 100 of FIG. 1 illustrating the generation, transmitting
and merging of fragments of upload data. As illustrated in FIG. 3,
at (1), upon receipt of the routing information, the client 102
determines which POPs 116 to use for data uploading. The client 102
may further evaluate the POPs included in the routing information
and decide which POPs to use for data uploading. For example, the
client 102 may analyze performance characteristics associated with
the POPs and filter out POPs that do not satisfy certain threshold
standards. In some embodiments, the client 102 may actively test the
speed, robustness, stability, protocol compatibility or other
characteristics of communication with individual POPs. In some
embodiments, the client 102 may determine that a portion of the
upload data can be directly transmitted to the data storage service
provider 104, for example, due to an insufficient number of
available or suitable POPs.
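The client-side evaluation described above might look like the following minimal sketch. The `probe` callback, the round-trip-time threshold, the `min_pops` parameter, and the example addresses are all hypothetical; a real client could measure any of the characteristics mentioned (speed, robustness, stability, protocol compatibility).

```python
def choose_upload_targets(candidates, probe, max_rtt_ms=150.0,
                          min_pops=2, direct_endpoint=None):
    """Keep POPs whose probed round-trip time meets the threshold.

    probe(address) is an assumed callback that returns a round-trip time
    in milliseconds, or None if the POP is unreachable.
    """
    usable = []
    for address in candidates:
        rtt = probe(address)
        if rtt is not None and rtt <= max_rtt_ms:
            usable.append(address)
    # With too few suitable POPs, also send part of the data directly
    # to the data storage service provider.
    if len(usable) < min_pops and direct_endpoint is not None:
        usable.append(direct_endpoint)
    return usable


# Simulated probe results keyed by documentation-range IP addresses.
fake_rtts = {"198.51.100.1": 30.0, "198.51.100.2": None, "198.51.100.3": 400.0}
targets = choose_upload_targets(
    list(fake_rtts), fake_rtts.get, direct_endpoint="203.0.113.10"
)
```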
[0030] With continued reference to FIG. 3, at (2), the client 102
fragments the data to be uploaded. The data fragmentation can be
based on the number, capacity, latency, bandwidth, stability, or
other performance characteristics of the selected POPs. For
example, the upload data can be divided into relatively large
fragments if the network connections to a majority of selected POPs
are associated with small latencies and high bandwidth. Conversely,
if connections to a majority of selected POPs are unstable, smaller
fragments can be generated to facilitate error correction and data
resending. Of course, the data fragment size does not need to be
uniform. Larger or smaller fragments can be generated from the same
upload data to suit specific network connection conditions between
the client 102 and various selected POPs. In some embodiments, the
data fragmentation does not disrupt the completeness of individual
data files to be uploaded. In other words, each data fragment may
include one or more data files in their entirety, and a data file
will not be divided in any way among multiple data fragments.
[0031] The data fragmentation can be horizontal, vertical,
sequential, or randomized, based on any existing schemes or methods
to fragment data files. The data fragments can be generated
based on plaintext data division or can be encoded, for example, by
using any forward error correction (FEC) code such as an erasure code.
In either case, each fragment can be uniquely identified with a
respective identifier, and redundancies can be built into the
fragmentation so that a copy of the upload data can be
reconstructed from a subset of generated fragments.
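As a simple illustration of identified fragments with built-in redundancy, the sketch below splits data sequentially into k fragments and adds one XOR parity fragment, so that the data can be rebuilt when any single fragment is lost. Single-parity XOR is used here only as an easy-to-follow stand-in for an erasure code; it is an assumption of this example, as are the function names and the use of integer identifiers.

```python
def fragment_with_parity(data: bytes, k: int):
    """Split data into k equal-size fragments plus one XOR parity fragment.

    Returns a dict mapping fragment identifier -> bytes (identifier k is
    the parity fragment) and the original length, needed to strip padding.
    """
    size = -(-len(data) // k)  # ceiling division
    padded = data.ljust(size * k, b"\x00")
    fragments = {i: padded[i * size:(i + 1) * size] for i in range(k)}
    parity = bytearray(size)
    for frag in fragments.values():
        for j, byte in enumerate(frag):
            parity[j] ^= byte
    fragments[k] = bytes(parity)
    return fragments, len(data)


def reconstruct(received: dict, k: int, original_len: int) -> bytes:
    """Rebuild the data from the received fragments (at most one missing)."""
    missing = [i for i in range(k) if i not in received]
    if len(missing) > 1 or (missing and k not in received):
        raise ValueError("too many fragments missing")
    if missing:
        # XOR of the parity fragment and the surviving data fragments
        # yields the missing data fragment.
        rebuilt = bytearray(received[k])
        for i in range(k):
            if i != missing[0]:
                for j, byte in enumerate(received[i]):
                    rebuilt[j] ^= byte
        received = {**received, missing[0]: bytes(rebuilt)}
    return b"".join(received[i] for i in range(k))[:original_len]


data = b"point of presence upload"
frags, length = fragment_with_parity(data, 4)
del frags[2]  # simulate a fragment that never arrived
restored = reconstruct(frags, 4, length)
```

A production scheme would more likely use a code that tolerates multiple losses (for example, a Reed-Solomon code), at the cost of more redundancy.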
[0032] At (3), the client 102 establishes network connections with
each of the selected POPs using their associated routing
information and starts transmitting fragments of the upload data to
the POPs. For example, the client 102 may establish independent
network paths between the client 102 and each of the selected POPs
116 (i.e., the client 102 being the source and a respective POP 116
being the destination) and may begin transferring the data
fragments in accordance with data communication protocols, such as
File Transfer Protocol (FTP), Hypertext Transfer Protocol (HTTP) or
any other public or proprietary protocols. It should be noted that
the client 102 may use the same or different protocols to
communicate with different POPs 116.
[0033] For each of the communicatively connected POPs, the client
102 may dynamically assign a portion or subset of the upload data
fragments to transfer, based on an assessment of the respective
performance of the POPs. For example, POPs with sufficient spare
capacity and connected to the client 102 with low latency and high
bandwidth connections may be assigned a larger number of or larger
sized data fragments. The client 102 may keep monitoring the
performance of each POP over the course of data fragment
transmission and adjust quantities, sizes or types of data
fragments that are being assigned to different POPs.
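One way to picture the performance-based assignment is the sketch
below. It assumes each POP has already been reduced to a single
positive performance score (a hypothetical figure that might combine
latency, bandwidth and spare capacity) and hands each fragment to the
POP whose share is currently smallest relative to its score. The
function name assign_fragments and the scoring scheme are
illustrative assumptions only.

```python
def assign_fragments(fragment_ids, pop_scores):
    """Assign fragments so higher-scoring POPs receive proportionally more.

    pop_scores maps a POP name to a positive performance score.
    """
    assignment = {pop: [] for pop in pop_scores}
    for frag in fragment_ids:
        # Choose the POP that would be least loaded, relative to its score,
        # after taking this fragment.
        pop = min(
            pop_scores,
            key=lambda p: (len(assignment[p]) + 1) / pop_scores[p],
        )
        assignment[pop].append(frag)
    return assignment
```

Calling the function again with updated scores on the fragments not
yet sent would approximate the ongoing monitoring and adjustment
described above.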
[0034] In some embodiments, the client 102 may transfer certain
fragments of the upload data directly to one or more servers of the
data storage service provider 104, basically treating the data
storage service provider as a POP. In some embodiments, the same
fragment of upload data may be assigned for transmission to multiple
POPs, such as those with questionable stability, to enhance the
robustness of the upload process. In some embodiments, the client
102 may decide to cease data transmission to certain POPs or the
data storage service provider 104 due to inadequate performance,
such as service interruptions, connectivity delays or failures. In
some embodiments, the client computing device may redirect
transmission of certain data fragments to POPs that were not
initially selected to facilitate the data upload.
[0035] Once individual POPs 116 receive at least some fragments of
upload data from the client 102, at (4), the POPs 116 may forward
or relay the data fragments to the data storage service 120.
Illustratively, the POPs 116 each may utilize an existing
communication channel or establish a new communication channel with
the data storage service provider 104 (or its subcomponent such as
the data storage service 120) for data fragment transmission
between the POP 116 and the data storage service 120. For example,
the POP 116 may establish a network path between the POP 116 and
the data storage service provider 104 or its subcomponents (i.e.,
the POP 116 being the source and the data storage service provider
104 or its subcomponent being the destination) over a backbone or
overlay network. It should be noted that the POP 116 may
communicate with the data storage service provider 104 in
accordance with the same or different data communication
protocol(s) as utilized for the data communication between the
client 102 and the POP 116.
[0036] Further, each POP 116 may rearrange, regroup or otherwise
manipulate the data fragments that it has received, in order to
efficiently forward or relay them to the data storage service 120. It
should also be noted that some POPs 116 may not be able to
successfully relay all the received data fragments due to
connection issues between the POP and the data storage service 120.
In some embodiments, instead of relaying received data fragments to
the data storage service 120 directly, a POP 116 may establish
connections with other POPs and forward at least portions of the
received data fragments to the other POPs, which in turn may
forward them to the data storage service 120.
[0037] At (5), the data storage service 120 obtains the relayed
data fragments and merges them to reconstruct a copy of the upload
data. The data storage service 120 may determine that it has
received all necessary fragments to reconstruct the upload data,
based on a known size of the upload data, an analysis of the unique
identifiers associated with the data fragments, an "upload
completion" message sent by the client 102, a combination of the
same, or the like. As described above, in some embodiments,
redundancies are built into the data fragmentation and transmission
process (e.g., based on a forward error correction code or
duplicated transmission of data fragments) so that the data storage
service 120 does not need to receive all the data fragments and may
proceed with reconstruction of the upload data. In some
embodiments, the data storage service 120 may request information,
such as the data fragmentation encoding as applied, for merging the
data fragments, from the client 102. In other embodiments, the data
fragments are self-explanatory or otherwise provide guidance for
merging (e.g., sequentially linking the data fragments based on
their identifiers).
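The identifier-based completeness check and sequential merge can be
sketched as follows. The sketch assumes, purely for illustration,
that identifiers are consecutive integers starting at zero and that
the expected fragment count is known to the data storage service;
the class name FragmentCollector is hypothetical.

```python
class FragmentCollector:
    """Collects relayed fragments and merges them in identifier order."""

    def __init__(self, expected_count: int):
        self.expected_count = expected_count
        self.fragments = {}

    def add(self, fragment_id: int, payload: bytes) -> None:
        # Keep the first copy; duplicates relayed by other POPs are ignored.
        self.fragments.setdefault(fragment_id, payload)

    def is_complete(self) -> bool:
        # Complete once every identifier 0..expected_count-1 has arrived.
        return all(i in self.fragments for i in range(self.expected_count))

    def merge(self) -> bytes:
        if not self.is_complete():
            raise ValueError("cannot merge: fragments are still missing")
        # Sequentially link the fragments based on their identifiers.
        return b"".join(self.fragments[i] for i in range(self.expected_count))
```

Because fragments are keyed by identifier, the order in which
different POPs deliver them does not affect the merged result.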
[0038] At (6), the data storage service 120 successfully merges the
data fragments to reconstruct a copy of the upload data and stores
the copy in an associated data store or database, either locally or
network-based. The data storage service 120 then transmits a
message to the client 102 confirming successful upload of the
data.
[0039] FIG. 4 is a flowchart illustrative of a parallelized data
uploading routine implemented by a routing service provider 103 and
a data storage service provider 104. The routine starts at block
400. At block 402, the routing service provider 103 obtains a
request for POP routing information for purposes of data upload to
the data storage service provider 104. For example, the routing
service provider 103 may receive an API call from a client 102 for
POP routing information. As described above, the request may
include information about the data to be uploaded, such as one or
more file identifiers, sizes, types, or priorities associated with
the data. The request may also include information about the
requesting client 102, such as geographic or network-related
location information, computational or networking resources, data
fragmentation preferences, data transfer capability or limitations,
etc. In some embodiments, the request may be received from one of
the POPs 116 or the POP service provider 106, which forwarded the
data upload request it had received from the client 102.
[0040] At block 404, the routing service provider 103 determines
POPs potentially suitable for the data upload request. The routing
service provider 103 may identify information about the upload data
and requesting client, and retrieve relevant POP performance data
for determining POPs 116 that are potentially suitable to
facilitate the data uploading. In some embodiments, the routing
service provider 103 may request and retrieve from the POP service
provider 106 additional or specific POP information that may assist
in the routing service provider's analysis of POP performance.
In other embodiments, the routing service provider 103 may provide
information about the upload data or the requesting client to the
POP service provider 106 and request the POP service provider 106
to determine POPs 116 that may be appropriate for relaying
fragments of upload data between the client 102 and the data
storage service provider 104. As described above, a list of POPs
116 that may facilitate the data uploading can be determined
based on an analysis of POP characteristics and performance data.
For example, throughput rate from the requesting client's
geographic region to the data storage service provider, ability to
handle the specific type of upload file or fragments, current load
and spare capacity, a combination of the same, or the like, can be
included in the analysis.
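A minimal sketch of such an analysis is given below. It assumes
hypothetical per-POP statistics (throughput from the client's
region, spare capacity as a fraction, and whether the POP can handle
the upload type) and a simple product score; none of these names,
thresholds or weights come from the disclosure.

```python
from dataclasses import dataclass


@dataclass
class PopStats:
    name: str
    throughput_mbps: float   # from the client's region to the storage provider
    spare_capacity: float    # fraction of capacity currently unused, 0..1
    supports_type: bool      # can handle this type of upload file or fragment


def candidate_pops(pops, min_spare=0.2, limit=3):
    """Return up to `limit` eligible POPs, best score first."""
    eligible = [
        p for p in pops
        if p.supports_type and p.spare_capacity >= min_spare
    ]
    # Hypothetical score: throughput weighted by how much capacity is free.
    eligible.sort(key=lambda p: p.throughput_mbps * p.spare_capacity,
                  reverse=True)
    return eligible[:limit]
```

The resulting list corresponds to the candidate POPs whose routing
information would be returned to the requesting client.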
[0041] At block 406, the routing service provider 103 sends POP
routing information for data uploading to the requesting client
102. For example, the routing service provider 103 may send a list
of candidate POPs 116 with corresponding IP addresses or other
network addresses or identifiers to the requesting client 102 in
response to its API call for POP routing information. In some
embodiments, the POP routing information may be provided to the
client 102 by the POP service provider 106. In some embodiments,
the POP routing information may include a portion of POP
performance data relevant to the requested data upload. In some
embodiments, the POP routing information may include information
for routing to the data storage service provider 104 directly.
[0042] At block 408, the data storage service provider 104 obtains
at least some portion of fragments of the upload data from one or
more POPs 116. Illustratively, individual POPs 116 may forward or
relay fragments of upload data the POP has received from the client
102 to the data storage service provider 104. As described above,
the POPs 116 each may utilize an existing communication channel or
establish a new communication channel with the data storage service
provider 104, such as a network path between the POP 116 and the
data storage service provider 104 (i.e., the POP 116 being the
source and the data storage service provider 104 being the
destination) over a backbone or overlay network. It should be noted
that the POP 116 may communicate with the data storage service
provider 104 in accordance with the same or different data
communication protocol(s) as utilized for the data communication
between the client 102 and the POP 116. In some embodiments,
the data storage service provider 104 may receive some portion of
the upload data fragments from the client 102 directly via a
network path connecting the client 102 and the data storage service
provider 104.
[0043] At block 410, the data storage service provider 104
determines whether it has obtained sufficient data fragments to
reconstruct a complete copy of the upload data. Illustratively, the
data storage service provider 104 can make this determination based
on a known size of the upload data, an analysis of the unique
identifiers associated with the data fragments, an "upload
completion" message sent by the client 102, a combination of the
same, or the like. As described above, in some embodiments,
redundancies are built into the data fragmentation and transmission
process (e.g., based on a forward error correction code, such as
erasure code, or duplicated transmission of data fragments) so that
the data storage service provider 104 does not need to receive all
the data fragments in order to reconstruct a copy of the upload
data. If the data storage service provider 104 determines that it
has not yet obtained a sufficient number of data fragments, the
routine returns to block 408. Otherwise, the routine proceeds to
block 412.
[0044] At block 412, the data storage service provider 104 merges
or otherwise processes obtained data fragments to reconstruct a
complete copy of the upload data. In some embodiments, the data
storage service provider 104 may request additional information,
such as the applicable encoding, for merging or otherwise processing
the data fragments, from the client 102. In other embodiments, the
data fragments are self-explanatory or otherwise provide guidance
for merging (e.g., sequentially linking the data fragments based on
their identifiers). At block 414, the data storage service provider
104 completes reconstruction of a copy of the upload data and
transmits a confirmation message to the client 102. In some
embodiments, the confirmation may be relayed or forwarded to the
client 102 by a POP 116. The routine of FIG. 4 ends at block
416.
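The loop between blocks 408 and 410 can be sketched as below, under
the assumption that redundancy allows reconstruction once some
threshold number of distinct fragments is held. The function name
collect_until_sufficient, the (identifier, payload) tuples and the
threshold parameter are hypothetical; merging at block 412 is left
to the caller.

```python
def collect_until_sufficient(incoming, needed):
    """Obtain fragments until `needed` distinct identifiers are held.

    `incoming` yields (fragment_id, payload) pairs as POPs relay them.
    """
    held = {}
    for fragment_id, payload in incoming:       # block 408: obtain a fragment
        held.setdefault(fragment_id, payload)   # ignore duplicate deliveries
        if len(held) >= needed:                 # block 410: sufficient?
            return held                         # proceed to block 412
    raise RuntimeError("stream ended before enough fragments arrived")
```

The function stops consuming the stream as soon as the threshold is
met, so any fragments that arrive later are simply not needed.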
[0045] Depending on the embodiment, certain acts, events, or
functions of any of the methods described herein can be performed
in a different sequence, can be added, merged, or left out
altogether (e.g., not all described acts or events are necessary
for the practice of the algorithm). Moreover, in certain
embodiments, acts or events can be performed concurrently, e.g.,
through multi-threaded processing, interrupt processing, or
multiple processors or processor cores or on other parallel
architectures, rather than sequentially.
[0046] The various illustrative logical blocks, modules and method
elements described in connection with the embodiments disclosed
herein can be implemented as electronic hardware, computer software
or combinations of both. To clearly illustrate this
interchangeability of hardware and software, various illustrative
components, blocks, modules and steps have been described above
generally in terms of their functionality. Whether such
functionality is implemented as hardware or software depends upon
the particular application and design constraints imposed on the
overall system. The described functionality can be implemented in
varying ways for each particular application, but such
implementation decisions should not be interpreted as causing a
departure from the scope of the disclosure.
[0047] The various illustrative logical blocks and modules
described in connection with the embodiments disclosed herein can
be implemented or performed by a machine, such as a general purpose
processor, a digital signal processor (DSP), an application
specific integrated circuit (ASIC), a field programmable gate array
(FPGA) or other programmable logic device, discrete gate or
transistor logic, discrete hardware components, or any combination
thereof designed to perform the functions described herein. A
general purpose processor can be a microprocessor, but in the
alternative, the processor can be a controller, microcontroller, or
state machine, combinations of the same, or the like. A processor
can also be implemented as a combination of computing devices,
e.g., a combination of a DSP and a microprocessor, a plurality of
microprocessors, one or more microprocessors in conjunction with a
DSP core, or any other such configuration.
[0048] The elements of a method, process, or algorithm described in
connection with the embodiments disclosed herein can be embodied
directly in hardware, in a software module executed by a processor,
or in a combination of the two. A software module can reside in RAM
memory, flash memory, ROM memory, EPROM memory, EEPROM memory,
registers, hard disk, a removable disk, a CD-ROM or any other form
of computer-readable storage medium known in the art. A storage
medium can be coupled to the processor such that the processor can
read information from, and write information to, the storage
medium. In the alternative, the storage medium can be integral to
the processor. The processor and the storage medium can reside in
an ASIC. The ASIC can reside in a user terminal. In the
alternative, the processor and the storage medium can reside as
discrete components in a user terminal.
[0049] Conditional language used herein, such as, among others,
"can," "might," "may," "e.g." and the like, unless specifically
stated otherwise, or otherwise understood within the context as
used, is generally intended to convey that certain embodiments
include, while other embodiments do not include, certain features,
elements and/or states. Thus, such conditional language is not
generally intended to imply that features, elements and/or states
are in any way required for one or more embodiments or that one or
more embodiments necessarily include logic for deciding, with or
without author input or prompting, whether these features, elements
and/or states are included or are to be performed in any particular
embodiment. The terms "comprising," "including," "having,"
"involving" and the like are synonymous and are used inclusively,
in an open-ended fashion, and do not exclude additional elements,
features, acts, operations and so forth. Also, the term "or" is
used in its inclusive sense (and not in its exclusive sense) so
that when used, for example, to connect a list of elements, the
term "or" means one, some or all of the elements in the list.
[0050] Disjunctive language such as the phrase "at least one of X,
Y or Z," unless specifically stated otherwise, is otherwise
understood with the context as used in general to present that an
item, term, etc., may be either X, Y or Z, or any combination
thereof (e.g., X, Y and/or Z). Thus, such disjunctive language is
not generally intended to, and should not, imply that certain
embodiments require at least one of X, at least one of Y or at
least one of Z to each be present.
[0051] Unless otherwise explicitly stated, articles such as "a" or
"an" should generally be interpreted to include one or more
described items. Accordingly, phrases such as "a device configured
to" are intended to include one or more recited devices. Such one
or more recited devices can also be collectively configured to
carry out the stated recitations. For example, "a processor
configured to carry out recitations A, B and C" can include a first
processor configured to carry out recitation A working in
conjunction with a second processor configured to carry out
recitations B and C.
[0052] While the above detailed description has shown, described,
and pointed out novel features as applied to various embodiments,
it will be understood that various omissions, substitutions, and
changes in the form and details of the devices or algorithms
illustrated can be made without departing from the spirit of the
disclosure. As will be recognized, certain embodiments described
herein can be embodied within a form that does not provide all of
the features and benefits set forth herein, as some features can be
used or practiced separately from others. All changes which come
within the meaning and range of equivalency of the claims are to be
embraced within their scope.
* * * * *