U.S. patent application number 12/651,928 was published by the patent office on 2010-07-15 as publication number 20100179984 for "Return-Link Optimization for File-Sharing Traffic." The application is currently assigned to ViaSat, Inc. The invention is credited to William B. Sebastian.

United States Patent Application 20100179984
Kind Code: A1
Inventor: Sebastian; William B.
Published: July 15, 2010
RETURN-LINK OPTIMIZATION FOR FILE-SHARING TRAFFIC
Abstract
Methods, apparatuses, and systems for return-link optimization
are provided. Embodiments identify upload-after-download content
(e.g., file sharing content) upon download, and generate one or
more identifiers characterizing the content (e.g., a digest). The
identifiers are stored in a client-side server dictionary model
reflecting a presumption that the content is stored in a
server-side dictionary. When content is later uploaded, the server
dictionary model is used to identify when the upload content
matches previously downloaded content. When a match is detected,
the stored identifiers are used to generate a highly compressed
version of the upload content, which is then uploaded to the server
instead of uploading the full content data. In some embodiments,
similar techniques are used to optimize return link bandwidth usage
for upload-after-upload transactions.
Inventors: Sebastian; William B. (Quincy, MA)

Correspondence Address:
TOWNSEND AND TOWNSEND AND CREW LLP; VIASAT, INC. (CLIENT #017018)
TWO EMBARCADERO CENTER, 8TH FLOOR
SAN FRANCISCO, CA 94111, US

Assignee: ViaSat, Inc. (Carlsbad, CA)
Family ID: 42123182
Appl. No.: 12/651,928
Filed: January 4, 2010
Related U.S. Patent Documents

Application Number   Filing Date     Patent Number
61/170,359           Apr 17, 2009    --
61/144,363           Jan 13, 2009    --
Current U.S. Class: 709/203
Current CPC Class: H04L 69/22; H04L 12/1859; H04L 67/42; H04L 67/10; H04L 45/7453; H04L 12/1881; H04L 12/1886; H04L 47/70; H04B 7/185; H04L 65/4076; H04L 12/1863; H04L 69/04; H04L 65/60 (all 20130101)
Class at Publication: 709/203
International Class: G06F 15/16 (20060101)
Claims
1. A method for managing return-link resource usage in a
communications system, the method comprising: receiving a first
content data block at a client-side device from a server-side
device, the server-side device being communicatively coupled with a
server dictionary and the client-side device being communicatively
coupled with a server dictionary model, the server dictionary model
configured to store identifiers associated with data blocks stored
on the server dictionary, each identifier having a substantially
smaller file size than its associated data block; calculating a
first identifier from the first content data block; storing the
first identifier in the server dictionary model at the client-side
device; removing the first content data block from the client-side
device; subsequent to storing the first identifier, receiving a
second content data block at the client-side device for upload to
the server-side device; calculating a second identifier from the
second content data block; determining whether the second
identifier matches the first identifier stored in the server
dictionary model; and when the second identifier matches the first
identifier, using the first identifier or the second identifier to
compress the second content data block into compressed content.
2. The method of claim 1, further comprising: communicating the
compressed content to the server-side device.
3. The method of claim 2, wherein the compressed content is the
second identifier.
4. The method of claim 1, wherein the first content data block is
removed from the client-side device substantially upon calculating
the first identifier.
5. The method of claim 1, further comprising: determining whether
the first content data block comprises file sharing content,
wherein the first content data block is removed from the
client-side device only when the content data block comprises file
sharing content.
6. The method of claim 5, wherein determining whether the first
content data block comprises file sharing content comprises:
determining whether the first content data block is configured
according to a file sharing protocol.
7. The method of claim 5, wherein determining whether the first
content data block comprises file sharing content comprises:
determining a probability that a substantially identical content
data block will be received by the client-side device for upload to
the server-side device at a subsequent time.
8. The method of claim 1, further comprising: storing the first
content data block in a client data store configured such that the
client-side device has substantially no access to the content data
block when stored in the client data store.
9. The method of claim 1, wherein the first identifier and the
second identifier are calculated such that a probability of the
first identifier and the second identifier matching when the first
content data block and the second content data block are not
identical is effectively zero.
10. A method for managing return-link resource usage in a
communications system, the method comprising: receiving a first
content data block at a client-side device from a server-side
device, the server-side device being communicatively coupled with a
server dictionary and the client-side device being communicatively
coupled with a server dictionary model; calculating a first
identifier from the first content data block, such that the first
identifier has a substantially smaller file size than the first
content data block; storing the first identifier in the server
dictionary model at the client-side device; determining whether the
first content data block comprises file sharing content; removing
the first content data block from the client-side device when the
first content data block comprises file sharing content; subsequent
to storing the first identifier, receiving a second content data
block at the client-side device for upload to the server-side
device; calculating a second identifier from the second content
data block; determining whether the second identifier matches the
first identifier stored in the server dictionary model; and when
the second identifier matches the first identifier, using the first
identifier or the second identifier to compress the second content
data block into compressed content.
11. The method of claim 10, further comprising: communicating the
compressed content to the server-side device.
12. The method of claim 10, wherein the first content data block is
removed from the client-side device prior to receiving the second
content data block at the client-side device.
13. The method of claim 10, wherein determining whether the first
content data block comprises file sharing content comprises
determining whether the first content data block is configured
according to a file sharing protocol.
14. The method of claim 10, wherein determining whether the first
content data block comprises file sharing content comprises
determining a probability that a substantially identical content
data block will be received by the client-side device for upload to
the server-side device at a subsequent time.
15. The method of claim 10, wherein the first identifier and the
second identifier are calculated such that a probability of the
first identifier and the second identifier matching when the first
content data block and the second content data block are not
identical is effectively zero.
16. A system for managing return-link resource usage in a
communications system, the system comprising: a local dictionary
model configured to store identifiers associated with data blocks
stored on a remote dictionary, the remote dictionary located at a
remote node of the communications system; a download processor
module, configured to: receive a first content data block from a
remote device associated with the remote dictionary; store the
first content data block in a local store; calculate a first
identifier from the first content data block; store the first
identifier in the local dictionary model; and remove the first
content data block from the local store; and an upload processor
module, configured to: receive a second content data block for
upload to the remote device; calculate a second identifier from the
second content data block; determine whether the second identifier
matches the first identifier stored in the local dictionary model;
and when the second identifier matches the first identifier, use
the first identifier or the second identifier to compress the
second content data block into compressed content.
17. The system of claim 16, further comprising: a communications
module configured to communicate the compressed content to the
remote device.
18. The system of claim 16, wherein the download processor module
is configured to remove the first content data block from the local
store substantially upon calculating the first identifier.
19. The system of claim 16, further comprising: a file sharing
detector, communicatively coupled with the download processor
module, and configured to: determine whether the first content data
block comprises file sharing content, wherein the download
processor module is configured to remove the first content data
block from the local store only when the content data block
comprises file sharing content.
20. The system of claim 19, wherein the file sharing detector is
configured to determine whether the first content data block
comprises file sharing content by determining whether the first
content data block is configured according to a file sharing
protocol.
21. The system of claim 19, wherein the file sharing detector is
configured to determine whether the first content data block
comprises file sharing content by determining a probability that a
substantially identical content data block will be received for
upload to the remote device at a subsequent time.
22. The system of claim 16, further comprising: a client storage
module, configured to store the first content data block such that
the download processor module has substantially no access to the
first content data block when stored in the client storage
module.
23. The system of claim 16, wherein the local store is a
buffer.
24. The system of claim 16, wherein: the remote device is a server
and the remote dictionary is a server dictionary.
Description
CROSS-REFERENCES
[0001] This application claims the benefit of and is a
non-provisional of co-pending U.S. Provisional Application Ser. No.
61/144,363, filed on Jan. 13, 2009, titled "SATELLITE
MULTICASTING"; and co-pending U.S. Provisional Application Ser. No.
61/170,359, filed on Apr. 17, 2009, titled "DISTRIBUTED BASE
STATION SATELLITE TOPOLOGY," both of which are hereby expressly
incorporated by reference in their entirety for all purposes.
BACKGROUND
[0002] This disclosure relates in general to communications and,
more particularly but not by way of limitation, to optimization of
return links of a communications system.
[0003] In some satellite communications systems, a single user
plays a dual role of client and server (e.g., in a peer-to-peer
environment). For example, a user may desire to share previously
downloaded content with another user. Certain types of local
networking and/or shared caching techniques may be used to limit
redundancies and/or other inefficiencies associated with these
types of transactions. However, the techniques may rely at times on
users sharing a subnet, relatively symmetric client-server storage
capabilities, relatively symmetric upload-download capabilities of
the network links, or other types of network characteristics.
[0004] As such, it may be desirable to further mitigate
inefficiencies associated with these types of communications while
avoiding limitations of current approaches.
SUMMARY
[0005] Among other things, methods, systems, devices, and software
are provided for improving utilization of a communications system
(e.g., a satellite communications system) through techniques
referred to herein as return-link optimization. Embodiments operate
in a client-server context (or a more generalized sender-receiver
context). When content is downloaded by a client from a server, a
client optimizer intercepts the download and generates one or more
identifiers characterizing the content (e.g., a digest). The
identifiers are stored in a client-side server dictionary model
reflecting a presumption that the content is stored in a
server-side dictionary. In some embodiments, the actual data blocks
(e.g., byte sequences) making up the content are not stored at the
client side; only digests or other identifiers are stored.
[0006] In some embodiments, when content is uploaded by the client
at some later time, the server dictionary model is used to identify
when the upload content matches previously downloaded (or, in
some embodiments, previously uploaded) content. When a match is
detected, the identifiers stored in the server dictionary model are
used to generate a highly compressed version of the upload content,
which is then uploaded to the server instead of the full content
data. In this way, return-link bandwidth usage can be reduced for
these types of transactions.
[0007] In one set of embodiments, a system is provided for managing
return-link resource usage in a communications system. The system
includes a local dictionary model configured to store identifiers
associated with data blocks stored on a remote dictionary, where
the remote dictionary is located at a remote node of the
communications system. For example, the remote dictionary may be a
server dictionary in communication with a server optimizer. The
system further includes a download processor module, configured to:
receive a first content data block from a remote device associated
with the remote dictionary; store the first content data block in a
local store (e.g., a buffer); calculate a first identifier (e.g., a
digest) from the first content data block; store the first
identifier in the local dictionary model; and remove the first
content data block from the local store. The system further
includes an upload processor module, configured to: receive a
second content data block for upload to the remote device;
calculate a second identifier from the second content data block;
determine whether the second identifier matches the first
identifier stored in the local dictionary model; and when the
second identifier matches the first identifier, use the first
identifier or the second identifier to compress the second content
data block into compressed content.
[0008] Further areas of applicability of the present disclosure
will become apparent from the detailed description provided
hereinafter. It should be understood that the detailed description
and specific examples, while indicating various embodiments, are
intended for purposes of illustration only and are not intended to
necessarily limit the scope of the disclosure.
BRIEF DESCRIPTION OF THE DRAWINGS
[0009] The present disclosure is described in conjunction with the
appended figures:
[0010] FIG. 1 shows a simplified block diagram of one embodiment of
a communications system for use with various embodiments;
[0011] FIG. 2A shows a simplified block diagram of one embodiment
of a client-server communications system for use with various
embodiments;
[0012] FIG. 2B shows a simplified block diagram of an embodiment of
a communications system having multiple user systems for use with
various embodiments;
[0013] FIG. 3 shows a block diagram of an embodiment of a satellite
communications system having a server system in communication with
multiple user systems via a satellite over multiple spot beams,
according to various embodiments;
[0014] FIG. 4 shows a block diagram of an embodiment of a
communications system, illustrating client-server interactivity
through a client optimizer and a server optimizer, according to
various embodiments;
[0015] FIG. 5 shows a block diagram of an embodiment of a client
optimizer having additional storage capacity and mode selection,
according to various embodiments;
[0016] FIG. 6 shows an illustrative method for performing
return-link optimization, according to various embodiments; and
[0017] FIG. 7 shows an illustrative method for performing
return-link optimization for an upload-after-upload transaction,
according to various embodiments.
[0018] In the appended figures, similar components and/or features
may have the same reference label. Further, various components of
the same type may be distinguished by following the reference label
by a dash and a second label that distinguishes among the similar
components. If only the first reference label is used in the
specification, the description is applicable to any one of the
similar components having the same first reference label
irrespective of the second reference label.
DETAILED DESCRIPTION
[0019] The ensuing description provides preferred exemplary
embodiment(s) only, and is not intended to limit the scope,
applicability, or configuration of the disclosure. Rather, the
ensuing description of the preferred exemplary embodiment(s) will
provide those skilled in the art with an enabling description for
implementing a preferred exemplary embodiment. It is understood
that various changes may be made in the function and arrangement of
elements without departing from the spirit and scope as set forth
in the appended claims.
[0020] Referring first to FIG. 1, a simplified block diagram is
shown of one embodiment of a communications system 100 for use with
various embodiments. The communications system 100 facilitates
communications between a sender optimizer 120 on a sender side 110
and a receiver optimizer 140 on a receiver side 130. The sender
optimizer 120 and the receiver optimizer 140 are configured to
effectively provide an optimizer tunnel 105 between the sender side
110 and the receiver side 130 of the communications system 100,
including providing certain communications functionality.
[0021] Embodiments of the optimizers (e.g., the sender optimizer
120 and/or the receiver optimizer 140) can be implemented in a
number of ways without departing from the scope of the invention.
In some embodiments, the optimizers are implemented as proxy
components (e.g., a two-part proxy client/server topology), such
that the optimizer tunnel 105 is a proxy tunnel. For example, a
transparent intercept proxy can be used to intercept traffic in a
way that is substantially transparent to users at each side of the
proxy tunnel. In other embodiments, the optimizers are implemented
as in-line optimizers. For example, the optimizers are implemented
within respective user or provider terminals. Other configurations
are possible in other embodiments. For example, embodiments of the
receiver optimizer 140 are implemented in the Internet cloud (e.g.,
on commercial network leased server space), and embodiments of the
sender optimizer 120 are implemented within a user system (e.g., in
a user's personal computer, within a user's modem, in a physically
separate component at the customer premises, etc.).
[0022] Various embodiments of optimizers may include and/or have
access to different amounts of storage. Some embodiments are
configured to cache data, store dictionaries of byte sequences,
etc. For example, in the communications system 100, the receiver
optimizer 140 has access to enough storage to maintain a receiver
dictionary 144. Embodiments of the receiver dictionary 144 include
chunks of content data (e.g., implemented as delta dictionaries,
wide dictionaries, byte caches, and/or other types of dictionary
structures). For example, when content data is stored in the
dictionary, some or all of the blocks of data defining the content
are stored in the dictionary in an unordered, but indexed way. As
such, content may not be directly accessible from the dictionary;
rather, the set of indexes may be needed to recreate the content
from the set of unordered blocks.
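The unordered-but-indexed storage described in this paragraph can be illustrated with a minimal Python sketch. This is not taken from the patent; the class and method names, the digest function (SHA-256), and the block size are assumptions chosen purely for illustration:

```python
import hashlib

class ReceiverDictionary:
    """Toy model of a receiver dictionary: blocks are stored unordered,
    keyed by a content-derived index, so the original content is only
    recoverable through an ordered list of indexes."""

    def __init__(self):
        self._blocks = {}  # block index -> raw bytes (no ordering kept)

    def store(self, data: bytes, block_size: int = 4) -> list:
        """Split content into blocks, store each, return the ordered indexes."""
        indexes = []
        for i in range(0, len(data), block_size):
            block = data[i:i + block_size]
            idx = hashlib.sha256(block).hexdigest()
            self._blocks[idx] = block
            indexes.append(idx)
        return indexes

    def recreate(self, indexes: list) -> bytes:
        """Reassemble content from the ordered index list."""
        return b"".join(self._blocks[idx] for idx in indexes)

d = ReceiverDictionary()
idxs = d.store(b"hello world, hello again")
assert d.recreate(idxs) == b"hello world, hello again"
```

Because the block store itself is unordered, content is not directly accessible from the dictionary; only the index list can recreate it, matching the behavior described above.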
[0023] Other embodiments of optimizers have substantially limited
storage. For example, in the communications system 100, the sender
optimizer 120 has access only to a small amount of storage. The
storage capacity may be too limited to store a full dictionary, but
sufficient to store a model of the receiver dictionary 144,
illustrated as the receiver dictionary model 124. Embodiments of
the receiver dictionary model 124 store digests representing data
stored at the receiver dictionary 144. For example, as described
more fully below, embodiments of the sender optimizer 120 intercept
traffic, and use one or more techniques to generate digests of byte
sequences of the traffic. The digests are then stored in the
receiver dictionary model 124, and can be used to identify matching
byte sequences in the receiver dictionary 144.
[0024] As used herein, "digests" may generally include any type of
fingerprint, digest, signature, hash function, and/or other
functional coding of byte sequences generated so as to provide a
strong enough identifier to reliably represent substantially
identical matching blocks stored in a dictionary. For example, a
user on the sender side 110 of the communications system 100
downloads content from the receiver side 130 of the communications
system 100. In one embodiment, the content is intercepted by the
sender optimizer 120 and a digest is created and stored in the
receiver dictionary model 124. Storage of the digest in the
receiver dictionary model 124 indicates that a full copy of the
downloaded content is stored in the receiver dictionary 144 on the
receiver side 130 of the communications system 100 without storing
a copy of the data on the sender side 110 of the communications
system 100.
[0025] If the user at the sender side 110 later uploads the content
to the receiver side 130, embodiments of the sender optimizer 120
intercept the upload to see if the content was previously
downloaded from the receiver side 130 (i.e., the content is
presumed to be stored in the receiver dictionary 144 according to
the receiver dictionary model 124). If the content is determined to
be previously downloaded content, a highly compressed version of
the content may be uploaded to the receiver side 130. Notably, this
technique may allow significant reductions in return-link resource
usage for file sharing traffic and/or other upload-after-download
traffic, even where there is a very small amount of storage
capacity accessible by the sender optimizer 120 (e.g., enough to
store only a receiver dictionary model 124).
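The download-then-upload flow of paragraphs [0024] and [0025] can be sketched as follows. This is a hypothetical illustration, with SHA-256 standing in for whichever digest function an implementation would actually use; note that the client side keeps only 32-byte digests, never the content itself:

```python
import hashlib

server_dictionary_model = set()  # client side: digests only, no content

def on_download(content: bytes) -> None:
    """Download path: record a digest; the full content is NOT kept."""
    server_dictionary_model.add(hashlib.sha256(content).digest())

def on_upload(content: bytes):
    """Upload path: if the digest matches a previous download,
    upload only the digest instead of the full content data."""
    digest = hashlib.sha256(content).digest()
    if digest in server_dictionary_model:
        return ("compressed", digest)   # 32 bytes instead of the full block
    return ("raw", content)             # never seen before: send everything

on_download(b"shared video fragment" * 1000)
kind, payload = on_upload(b"shared video fragment" * 1000)
assert kind == "compressed" and len(payload) == 32
```

In this sketch a 21,000-byte block is replaced on the return link by a 32-byte digest, which is the kind of reduction the paragraph above describes.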
[0026] It will be appreciated that the limited storage capacity at
the sender optimizer 120 may be considered differently in different
embodiments. In one embodiment, the sender optimizer 120 is
implemented within a network device (e.g., a user modem) having
minimal storage capacity. In another embodiment, the sender
optimizer 120 is configured to operate in different operating
modes, where one or more operating modes is configured to use
minimal storage capacity. For example, the sender optimizer 120 may
operate either in a normal mode that stores dictionary entries for
certain types of traffic or in a file sharing mode (when file
sharing traffic is detected) that only stores digests without
storing the actual file sharing content.
[0027] Embodiments of the sender optimizer 120 implement certain
functionality described herein when file sharing or similar types
of content are detected (e.g., resulting in switching into a file
sharing mode, as described above). In some embodiments, the
detection involves determining that traffic intercepted during a
download is likely to be uploaded at some later time. The
determination may account for certain tags or protocols in the
metadata, which application is downloading the data, which ports
are carrying the traffic, etc. For example, file sharing data may
be assumed to have a high probability of upload after download,
while Internet-protocol television (IPTV) or
voice-over-Internet-protocol (VoIP) content may carry a low
probability of upload after download. As used herein, "file
sharing" connotes traffic and associated environments in which a
downloader of content becomes a provider (e.g., a server) of the
content. For example, peer-to-peer and other types of file sharing
applications may allow a downloader to become a server in the
context of particular traffic.
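A detection heuristic along the lines described above might weigh ports, content types, and protocol markers. The following sketch is purely illustrative: the specific port numbers, content-type strings, and the BitTorrent handshake marker are assumptions introduced here, not values from the patent:

```python
# Hypothetical heuristic values; not from the patent.
FILE_SHARING_PORTS = {6881, 6882, 6889}        # e.g. a common BitTorrent range
LOW_REUPLOAD_TYPES = {"video/iptv", "audio/voip"}

def likely_upload_after_download(port: int, content_type: str,
                                 protocol_marker: bytes) -> bool:
    """Estimate whether downloaded traffic will later be uploaded."""
    if content_type in LOW_REUPLOAD_TYPES:
        return False                            # streaming: rarely re-served
    if port in FILE_SHARING_PORTS:
        return True                             # known file-sharing port
    # BitTorrent handshakes begin with byte 19 then "BitTorrent protocol".
    return protocol_marker.startswith(b"\x13BitTorrent protocol")

assert likely_upload_after_download(6881, "application/octet-stream", b"")
assert not likely_upload_after_download(80, "video/iptv", b"")
```

A real implementation would presumably combine such signals into the upload probability the paragraph describes, rather than a hard boolean.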
[0028] It is worth noting that many file sharing applications
fragment files for communication. For example, some programs allow
clients to download a content file in parallel from multiple
sources (e.g., other peers on the network) by receiving fragments
of the file from each source. As discussed more fully below,
embodiments generate identifiers (e.g., digests) at the data block
level, rather than at the full-file level. In this way,
optimization opportunities may be identified even from file
fragments, and even when fragments are received asynchronously, out
of order, etc.
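Block-level identifiers make fragment arrival order irrelevant, since each block is digested independently of the others. A minimal sketch, where the block size and digest function are illustrative assumptions:

```python
import hashlib

BLOCK_SIZE = 16  # illustrative; real systems would use much larger blocks

def block_digests(fragment: bytes, offset: int) -> dict:
    """Digest a fragment block-by-block, keyed by absolute block number,
    so fragments arriving out of order still yield matchable entries."""
    assert offset % BLOCK_SIZE == 0  # fragments assumed block-aligned
    out = {}
    for i in range(0, len(fragment), BLOCK_SIZE):
        block_no = (offset + i) // BLOCK_SIZE
        out[block_no] = hashlib.sha256(fragment[i:i + BLOCK_SIZE]).hexdigest()
    return out

data = bytes(range(64))
# Fragments received out of order from two different peers:
model = {}
model.update(block_digests(data[32:64], offset=32))
model.update(block_digests(data[0:32], offset=0))
# The model ends up identical to digesting the whole file in order.
assert model == block_digests(data, offset=0)
```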
[0029] It is worth noting that the storage capacity of the sender
optimizer 120, as discussed above, may be distinct from other
storage capacity at the sender side 110 of the communications
system 100. For example, there may be a user machine 114 at one or
both sides of the communications system 100. The user machine 114
may broadly include any type of machine through which a user may
interact with content over the communications system 100. For
example, the user machine 114 may include consumer premises
equipment (CPE), such as computers, televisions, etc. Further, as
illustrated, the user machines 114 may have access to their own
respective machine storage 118. The machine storage 118 may include
hard-disk space, application storage, cache capacity, etc.
[0030] Notably, the optimizers at each side of the communications
system 100 may or may not have access to the respective machine
storage 118. For example, embodiments of the sender optimizer 120
may typically have little or no access to the machine storage 118.
In some embodiments, the sender optimizer 120 is an independent
(e.g., transparent) network component that does not have access to
the machine storage 118. In other embodiments, it is inefficient or
impractical for the sender optimizer 120 to access machine storage
118 for various optimization processes. For example, access to the
machine storage 118 may be too slow to provide desirable
optimization benefits. As such, embodiments of the sender optimizer
120 are described as having limited storage capacity (e.g., or
operating in a mode with limited storage capacity) even where other
storage capacity is available at the sender side 110 of the
communications system 100.
[0031] While the communications system 100 of FIG. 1 is illustrated
generically as a sender side 110 and receiver side 130, some
typical embodiments operate in a client-server context. FIG. 2A
shows a simplified block diagram of one embodiment of a
client-server communications system 200a for use with various
embodiments. The communications system 200a facilitates
communications between a user system 210 and a server system 320
via a client optimizer 220 and a server optimizer 230. The client
optimizer 220 and the server optimizer 230 are configured to
effectively provide an optimizer tunnel 205 between the user system
210 and the server system 320, including providing certain
communications functionality. Notably, client and server are used
herein to clarify particular sides of the communications system,
and are not intended to limit the respective roles, functions,
direction of communications, etc. For example, in a peer-to-peer
context, users may act as both clients and servers in file sharing
transactions.
[0032] In an illustrative file sharing transaction, the client
optimizer 220 and the server optimizer 230 implement functionality
of the sender optimizer 120 and the receiver optimizer 140 of FIG.
1, respectively. For example, a user downloads content from a
content server 250 over a network 240 through the user system 210.
Embodiments of the user system 210 may include any component or
components for providing a user with network interactivity. For
example, the user system 210 may include any type of computational
device, network interface device, communications device, or other
device for communicating data to and from the user. Typically, the
communications system 200a facilitates communications between
multiple user systems 210 and a variety of content servers 250 over
one or more networks 240 (only one of each is shown in FIG. 2A for
the sake of clarity). The content servers 250 are in communication
with the server optimizer 230 via one or more networks 240. The
network 240 may be any type of network 240 and can include, for
example, the Internet, an Internet protocol ("IP") network, an
intranet, a wide-area network ("WAN"), a local-area network
("LAN"), a virtual private network ("VPN"), the Public Switched
Telephone Network ("PSTN"), and/or any other type of network 240
supporting data communication between devices described herein, in
different embodiments. The network 240 may also include both wired
and wireless connections, including optical links.
[0033] As used herein, "content servers" is intended broadly to
include any source of content in which the users may be interested.
For example, a content server 250 may provide website content,
television content, file sharing, multimedia serving,
voice-over-Internet-protocol (VoIP) handling, and/or any other
useful content. It is worth noting that, in some embodiments, the
content servers 250 are in direct communication with the server
optimizer 230 (e.g., not through the network 240). For example, the
server optimizer 230 may be located in a gateway that includes a
content or application server. As such, discussions of embodiments
herein with respect to communications with content servers 250 over
the network 240 are intended only to be illustrative, and should
not be construed as limiting.
[0034] As described below, the server optimizer 230 may be part of
a server system 320 that includes components for server-side
communications (e.g., base stations, gateways, satellite modem
termination systems (SMTSs), digital subscriber line access
multiplexers (DSLAMs), etc., as described below with reference to
FIG. 3). The server optimizer 230 may act as a transparent and/or
intercepting proxy. For example, the client optimizer 220 is in
communication with the server optimizer 230 over a client-server
communication link 225, and the server optimizer 230 is in
communication with the content server 250 over a content network
link 235. The server optimizer 230 may act as a transparent
man-in-the-middle to intercept the data as it passes between the
client-server communication link 225 and the content network link
235. Further, embodiments of the server optimizer 230 maintain a
server dictionary 234 (e.g., like the receiver dictionary 144 of
FIG. 1) including byte sequences of some or all of the traffic
previously seen by the server optimizer 230.
[0035] For example, when the user system 210 downloads content from
the content server 250, the server optimizer 230 may intercept the
content and store blocks of content data in the server dictionary
234. The content may then be sent (e.g., over the client-server
communication link 225) to the user system 210 in response to the
user's request for the content. The client optimizer 220 intercepts
the traffic at the client side of the optimizer tunnel 205 and
generates a digest of the content, as described above. The digest
is stored in a server dictionary model 224. In some embodiments,
additional data (e.g., fingerprints) are generated to facilitate
efficient searches for the digests in the server dictionary model
224. For example, the digest may be a strong identifier that can
reliably represent an identical data block stored at the server
dictionary 234, and a weak identifier (e.g., a hash) may be
generated for quickly finding matching candidates among a large set
of digests.
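The pairing of strong and weak identifiers described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the class name and the choices of MD5 for the strong digest and Adler-32 for the weak hash are assumptions made for the example.

```python
import hashlib
import zlib

class ServerDictionaryModel:
    """Client-side model of the server dictionary: maps weak identifiers
    to candidate digests presumed to exist in the server dictionary 234.
    (Illustrative sketch; names and hash choices are hypothetical.)"""

    def __init__(self):
        # weak identifier -> set of strong identifiers (digests)
        self._index = {}

    def add_block(self, block):
        strong = hashlib.md5(block).hexdigest()  # strong identifier (digest)
        weak = zlib.adler32(block)               # weak identifier (fast hash)
        self._index.setdefault(weak, set()).add(strong)
        return strong

    def match(self, block):
        """Return the matching digest if this block was previously seen."""
        weak = zlib.adler32(block)
        candidates = self._index.get(weak)
        if not candidates:
            return None                          # cheap rejection via weak ID
        strong = hashlib.md5(block).hexdigest()
        return strong if strong in candidates else None
```

The weak hash allows most non-matching blocks to be rejected without computing the more expensive digest, which is only verified for candidates that share a weak value.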
[0036] In the event that the content is later uploaded to the
communications system 200a, the client optimizer 220 may intercept
the upload (e.g., the request may be directed or redirected to the
client optimizer 220) and look for a match in the server dictionary
model 224, indicating presumptive existence of the upload content
on the server dictionary 234. If a match is found, a highly
compressed version of the content may be communicated to the server
system 320 over the client-server communication link 225. For
example, the highly compressed version may use the matching digests
or other identifiers (e.g., block IDs) from the server dictionary
model 224 as indexes to recreate the content at the server side
from byte sequences stored in the server dictionary 234.
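The upload encoding just described could be sketched as below. The tuple-based wire format, the use of MD5 digests as block indexes, and the function name are hypothetical illustrations, not details from the application.

```python
import hashlib

def encode_upload(blocks, dictionary_model):
    """Replace blocks whose digests appear in the server dictionary model
    with short references; send unmatched blocks as raw bytes.
    `dictionary_model` is assumed to be a set of digests for this sketch."""
    encoded = []
    for block in blocks:
        digest = hashlib.md5(block).hexdigest()
        if digest in dictionary_model:
            encoded.append(("ref", digest))   # a few bytes on the return link
        else:
            encoded.append(("raw", block))    # dictionary miss: full data
    return encoded
```

Each `("ref", digest)` entry replaces a full data block with an index the server side can expand from its own dictionary.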
[0037] It is worth noting that the upload may not be ultimately
destined for the server system 320. For example, in a peer-to-peer
context, the upload may actually be from one user system 210 to
another user system 210. While the communications system 200a
illustrated in FIG. 2A shows only one optimizer tunnel 205 between
one server system 320 and one user system 210, embodiments
typically operate in the context of, and take advantage of,
optimization among multiple user systems 210. FIG. 2B shows a
simplified block diagram of an embodiment of a communications
system 200b having multiple user systems 210 for use with various
embodiments. The communications system 200b facilitates
communications between a server system 320 and multiple user
systems 210, via a respective server optimizer 230 and at least one
client optimizer 220.
[0038] As described above with reference to FIG. 2A, a first user
system 210a may desire to upload content after a previous download
of the content from the server system 320. Using the client
optimizer 220, the server optimizer 230, the server dictionary 234,
and the server dictionary model 224, return-link bandwidth may be
optimized for this scenario. Notably, the optimized return-link
bandwidth may refer to the return link between the first user
system 210a and the server system 320, regardless of the ultimate
destination of the upload content. For example, the return link may
be optimized even where the ultimate destination of the content is
the second user system 210n, such that the content is further
communicated from the server system 320 to other nodes of the
communications system 200b.
[0039] Further, it is worth noting that embodiments may optimize
the return-link bandwidth, regardless of whether the ultimate
destination terminal includes optimization functionality. For
example, some embodiments of the second user system 210n include a
second client optimizer 220n that is in communication with the
server optimizer 230 and maintains its own server dictionary model
224n. In other embodiments, however, the ultimate destination may be
any receiving node anywhere on the network, even one having no client
optimizer 220n and/or no server dictionary model 224n. For example,
the return-link optimization may be effectuated
between the first user system 210a and the server system 320 via
their respective client optimizer 220 and server optimizer 230,
even where the destination for the traffic is some node of the
network other than the server system 320.
[0040] FIGS. 1, 2A, and 2B illustrate various types of
communications systems for use with embodiments of the invention
using generic component designations. It will be appreciated that
these components may be implemented in various nodes of various
types and topologies of communications systems. For example, the
communications systems may include cable communications systems,
satellite communications systems, digital subscriber line (DSL)
communications systems, local area networks (LANs), wide area
networks (WANs), etc. Further, the links of the communications
systems may include wired and/or wireless links, Ethernet links,
coaxial cable links, fiber-optic links, etc. Some embodiments
include shared portions of the forward and/or reverse links between
nodes (e.g., a shared spot beam in a satellite communications
system), while other embodiments include unshared links between
nodes (e.g., in an Ethernet network).
[0041] In one illustrative example, FIG. 3 shows a block diagram of
an embodiment of a satellite communications system 300 having a
server system 320 in communication with multiple user systems 210
via a satellite 305 over multiple spot beams 335, according to
various embodiments. The server system 320 may include any server
components, including base stations 315, gateways 317, etc. A base
station 315 is sometimes referred to as a hub or ground station. In
certain embodiments, the base station 315 has functionality that is
the same or different from a gateway 317. For example, as
illustrated, a gateway 317 provides an interface between the
network 240 and the satellite 305 via a number of base stations
315. Various embodiments provide different types of interfaces
between the gateways 317 and base stations 315. For example, the
gateways 317 and base stations 315 may be in communication over
leased high-bandwidth lines (e.g., raw Ethernet), a virtual private
large-area network service (VPLS), an Internet protocol virtual
private network (IP VPN), or any other public or private, wired or
wireless network. Embodiments of the server system 320 are in
communication with one or more content servers 250 via one or more
networks 240.
[0042] As traffic traverses the satellite communications system 300
in multiple directions, the gateway 317 may be configured to
implement multi-directional communications functionality. For
example, the gateway 317 may send data to and receive data from the
base stations 315.
[0043] Similarly, the gateway 317 may be configured to receive data
and information directed to one or more user systems 210, and
format the data and information for delivery to the respective
destination device via the satellite 305; or receive signals from
the satellite 305 (e.g., from one or more user systems 210)
directed to a destination in the network 240, and process the
received signals for transmission through the network 240.
[0044] In various embodiments, one or more of the satellite links
are capable of communicating using one or more communication
schemes. In various embodiments, the communication schemes may be
the same or different for different links. The communication
schemes may include different types of coding and modulation
combinations. For example, various satellite links may communicate
using physical layer transmission modulation and coding techniques
using adaptive coding and modulation schemes, etc. The
communication schemes may also use one or more different types of
multiplexing schemes, including Multi-Frequency Time-Division
Multiple Access ("MF-TDMA"), Time-Division Multiple Access
("TDMA"), Frequency Division Multiple Access ("FDMA"), Orthogonal
Frequency Division Multiple Access ("OFDMA"), Code Division
Multiple Access ("CDMA"), or any number of other schemes.
[0045] The satellite 305 may operate in a multi-beam mode,
transmitting a number of spot beams 335, each directed at a
different region of the earth. Each spot beam 335 may be associated
with one of the user links, and used to communicate between the
satellite 305 and a large group (e.g., thousands) of user systems
210 (e.g., user terminals 330 within the user systems 210). The
signals transmitted from the satellite 305 may be received by one
or more user systems 210, via a respective user antenna 325. In
some embodiments, some or all of the user systems 210 include one
or more user terminals 330 and one or more CPE devices 360. User
terminals 330 may include modems, satellite modems, routers, or any
other useful components for handling the user-side communications.
Reference to "users" should be construed generally to include any
user (e.g., subscriber, consumer, customer, etc.) of services
provided over the satellite communications system 300 (e.g., by or
through the server system 320).
[0046] In a given spot beam 335, some or all of the users (e.g.,
user systems 210) serviced by the spot beam 335 may be capable of
receiving all the content traversing the spot beam 335 by virtue of
the fact that the satellite communications system 300 employs
wireless communications via various antennae (e.g., 310 and 325).
However, some of the content may not be intended for receipt by
certain customers. As such, the satellite communications system 300
may use various techniques to "direct" content to a user or group
of users. For example, the content may be tagged (e.g., using
packet header information according to a transmission protocol)
with a certain destination identifier (e.g., an IP address), use
different modcode points that can be reliably received only by
certain user terminals 330, send control information to user
systems 210 to direct the user systems 210 to ignore or accept
certain communications, etc. Each user system 210 may then be
adapted to handle the received data accordingly. For example,
content destined for a particular user system 210 may be passed on
to its respective CPE 360, while content not destined for the user
system 210 may be ignored. In some cases, the user system 210
caches information not destined for the associated CPE 360 for use
if the information is later found to be useful in avoiding traffic
over the satellite link, as described in more detail below.
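The per-terminal handling of shared spot beam traffic described above can be sketched as follows. The packet structure, return values, and caching policy shown are hypothetical assumptions for illustration; real systems filter at the transport layer.

```python
def handle_received(packet, my_address, cache):
    """Pass content destined for this user system 210 on to its CPE;
    opportunistically cache other traffic on the shared beam in case it
    later helps avoid a transfer over the satellite link."""
    if packet["dest"] == my_address:
        return "forward_to_cpe"
    # Not destined for this terminal: ignore it, but keep a copy.
    cache[packet["id"]] = packet["payload"]
    return "ignored"
```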
[0047] Embodiments of the server system 320 and/or the user system
210 include an accelerator module and/or other processing
components. In one embodiment, real-time types of data (e.g., User
Datagram Protocol ("UDP") data traffic, like Internet-protocol
television ("IPTV") programming) bypass the accelerator module,
while non-real-time types of data (e.g., Transmission Control
Protocol ("TCP") data traffic, like web video) are routed through
the accelerator module for processing. Embodiments of the
accelerator module provide various types of applications, WAN/LAN,
and/or other acceleration functionality.
[0048] In some embodiments, the accelerator module is adapted to
provide high payload compression. This allows faster transfer of
the data and enhances the effective capacity of the network. The
accelerator module can also implement protocol-specific methods to
reduce the number of round trips needed to complete a transaction,
such as by prefetching objects embedded in HTTP pages. In other
embodiments, functionality of the accelerator module is closely
integrated with the satellite link through other modules, including
the client optimizer 220 and/or the server optimizer 230.
[0049] As discussed above, the satellite communications system 300
may be configured to implement various optimization functions
through client-server interactions, implemented by the client
optimizer 220 and the server optimizer 230. The server optimizer
230 may be configured to maintain a server dictionary and the
client optimizer 220 may be configured to maintain a model of the
server dictionary. Embodiments of the client optimizers 220 and
server optimizer 230 may act to create a virtual tunnel between the
user systems 210 and the content servers 250 or the server system
320, as described with reference to FIGS. 2A and 2B. In a topology
like the satellite communications system 300 shown in FIG. 3, vast
amounts of traffic may traverse various portions of the satellite
communications system 300 at any given time. The optimizer
functionality may help relieve the satellite communications system
300 from traffic burdens relating to file sharing and similar
transactions (e.g., by optimizing return-link resources). This and
other functionality of the client optimizer 220 and the server
optimizer 230 are described more fully with reference to FIG.
4.
[0050] FIG. 4 shows a block diagram of an embodiment of a
communications system 400, illustrating client-server interactivity
through a client optimizer 220 and a server optimizer 230,
according to various embodiments. In some embodiments, the
communications system 400 is an embodiment of the communications
system 200a of FIG. 2A or the satellite communications system 300
of FIG. 3. As shown, the communications system 400 facilitates
communications between a user system 210 and one or more content
servers 250 via at least one client-server communication link 225.
For example, interactions between the client optimizer 220 and the
server optimizer 230 effectively create an optimizer tunnel 205
between the user system 210 and the content servers 250. In some
embodiments, the server system 320 is in communication with the
content servers 250 via one or more networks 240, like the
Internet.
[0051] In some embodiments, the user system 210 includes a client
graphical user interface (GUI) 410, a web browser 406, and a
redirector 408. The client GUI 410 may allow a user to configure
performance aspects of the user system 210 (e.g., or even aspects
of the greater communications system 400 in some cases). For
example, the user may adjust compression parameters and/or
algorithms, alter content filters (e.g., for blocking illicit
websites), or enable or disable various features used by the
communications system 400. In one embodiment, some of the features
may include network diagnostics, error reporting, as well as
controlling, for example, components of the client optimizer 220
and/or the server optimizer 230.
[0052] In one embodiment, the user selects a uniform resource
locator (URL) address through the client GUI 410 which directs the
web browser 406 (e.g., Internet Explorer.RTM., Firefox.RTM.,
Netscape Navigator.RTM., etc.) to a website (e.g., cnn.com,
google.com, yahoo.com, etc.). The web browser 406 may then issue a
request for the website and associated objects to the Internet. It
is worth noting that the web browser 406 is shown for illustrative
purposes only. While embodiments of the user system 210 may
typically include at least one web browser 406, user systems 210
may interact with content servers 250 in a number of different ways
without departing from the scope of the invention (e.g., through
downloader applications, file sharing applications, applets,
etc.).
[0053] The content request from the user system 210 (e.g., download
request from the web browser 406) may be intercepted by the
redirector 408. It is worth noting that embodiments of the
redirector 408 are implemented in various ways. For example,
embodiments of the redirector 408 are implemented within a user
modem as part of the modem's internal routing functionality. The
redirector 408 may send the request to the client optimizer 220. It
is worth noting that the client optimizer 220 is shown as separate
from the user machine 214 (e.g., in communication over a local bus,
on a separate computer system connected to the user system 210 via
a high speed/low latency link, like a branch office LAN subnet,
etc.). However, embodiments of the client optimizer 220 are
implemented as part of any component of the user system 210 in any
useful client-side location, including as part of a user terminal,
as part of a user modem, as part of a hub, as a separate hardware
component, as a software application on the user machine 214,
etc.
[0054] In some embodiments, the client optimizer 220 includes a
request manager 416. The request manager 416 may be configured to
perform a number of different processing functions, including Java
parsing and protocol processing. Embodiments of the request manager
416 may process hypertext transfer protocol (HTTP), file transfer
protocol (FTP), various media protocols, metadata, header
information, and/or other relevant information from the request
data (e.g., packets) to allow the client optimizer 220 to perform
its optimizer functions. For example, the request may be processed
by the request manager 416 as part of identifying opportunities for
optimizing return-link resources for previously downloaded
content.
[0055] The request manager 416 may forward the request to a request
encoder 418. Embodiments of the request encoder 418 encode the
request using one of many possible data compression or similar
types of algorithms. For example, strong identifiers and/or weak
identifiers may be generated using dictionary coding techniques,
including hashes, checksums, fingerprints, signatures, etc. As
described below, these identifiers may be used to identify digests
in a server dictionary model 224 indicating matching data blocks in
a server dictionary 234 in, or in communication with, the server
optimizer 230.
[0056] In some embodiments, the request manager 416 and/or the
request encoder 418 process the request content differently,
depending on the type of data included in the request. For example,
the content portion (e.g., byte-level data) of the data may be
processed according to metadata. Some types of schema-specific
coding are described in U.S. Provisional Patent Application No.
61/231,265, entitled "METHODS AND SYSTEMS FOR INTEGRATING DELTA
CODING WITH SCHEMA SPECIFIC CODING" (026841-002300US), filed on
Aug. 4, 2009, which is incorporated herein by reference in its
entirety for all purposes.
[0057] In some embodiments, the request may be forwarded to a
transport manager 428a. In one embodiment, the transport manager
428a implements Intelligent Compression Technologies' ("ICT")
transport protocol ("ITP"). Nonetheless, other protocols
may be used, such as the standard transmission control protocol
("TCP"). In one embodiment, ITP maintains a persistent connection
with the server system 320 via its server optimizer 230. The
persistent connection between the client optimizer 220 and the
server optimizer 230 may enable the communications system 400 to
eliminate or reduce inefficiencies and overhead costs associated
with creating a new connection for each request.
[0058] In one embodiment, the encoded request is forwarded from the
transport manager 428a in the client optimizer 220 to a transport
manager 428b in the server optimizer 230, and then to a request
decoder 436.
The request decoder 436 may use a decoder which is appropriate for
the encoding performed by the request encoder 418. The request
decoder 436 may then transmit the decoded request to a content
processor 442 configured to communicate the request to an
appropriate content source. For example, the content processor 442
may communicate with a content server 250 over a network 240. Of
course, other types of content sources are possible. For example,
some or all of the data blocks that make up the requested content
may be available in the server dictionary 234. As discussed above,
embodiments of the server dictionary 234 include indexed blocks of
content data (e.g., byte sequences).
[0059] In response to the request, response data may be received by
the content processor 442. For example, the response data may be
retrieved from an appropriate content server 250, from the server
dictionary 234, etc. The response data may include various types of
information, such as one or more attachments (e.g., media files,
text files, etc.), references to "in-line" objects needed to render
a web page, etc. Embodiments of the content processor 442 may be
configured to interpret the response data, which may, for example,
be received as HTML, XML, CSS, JavaScript, or other types of
data. In some embodiments, when response data is received, the
content processor 442 checks the server dictionary 234 to determine
whether the content is already stored by the server system 320. If
not, the content may be stored to the server dictionary 234.
[0060] In some embodiments, the response received at the content
processor 442 is parsed by a response parser 444 and/or encoded by
a response encoder 440. The response data may then be communicated
back to the user system 210 via the transport managers 428 and the
client-server communication link 225. After the response data is
received at the client optimizer 220 by its transport manager 428a,
the response data is forwarded to a response manager 424 for
client-side processing.
[0061] Embodiments of the response manager 424 generate a strong
identifier (e.g., a digest) of the response data for storage in the
server dictionary model 224. For example, certain embodiments
assume that response data is stored in the server dictionary 234
(i.e., the response data was stored either prior to or upon receipt
by the content processor 442 in the server optimizer 230). As such,
it may be assumed by embodiments of the client optimizer 220 that
the server dictionary model 224 is, in fact, a model of the server
dictionary 234 without requiring any explicit messages from the
server optimizer 230 to that effect. It is worth noting that, in
some embodiments, synchronization techniques are used to ensure
that the server dictionary model 224 remains an accurate model of
the server dictionary 234. For example, the server optimizer 230
may desire to remove a data block from its server dictionary 234.
The server optimizer 230 may notify the client optimizer 220 that
it is ready to remove the data block, wait for a notification back
from the client optimizer 220 confirming deletion of the data block
from the server dictionary model 224, and then remove the data
block from the server dictionary 234.
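The removal handshake described in this paragraph can be sketched as below. The class and method names are hypothetical; the point is the ordering: the server never deletes a block until the client confirms it has dropped the corresponding entry from its server dictionary model.

```python
class ClientOptimizer:
    """Holds the server dictionary model (here, a set of digests)."""

    def __init__(self, model):
        self.model = model

    def confirm_removal(self, digest):
        self.model.discard(digest)  # delete from the model first...
        return True                 # ...then confirm back to the server

class ServerOptimizer:
    """Holds the server dictionary (here, digest -> block bytes)."""

    def __init__(self, client, dictionary):
        self.client = client
        self.dictionary = dictionary

    def remove_block(self, digest):
        # Step 1: notify the client of the intent to remove the block.
        confirmed = self.client.confirm_removal(digest)
        # Step 2: remove only after the client confirms deletion, so the
        # model never references a block the dictionary no longer holds.
        if confirmed:
            self.dictionary.pop(digest, None)
```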
[0062] In certain embodiments, the response manager 424 may further
generate a weak identifier (e.g., a checksum, a hash, etc.). The
weak identifier may be used to quickly find strong identifier
entries in the server dictionary model 224, as described more fully
below. Once the server dictionary model 224 is updated, the
response manager 424 may forward the response data to the user
machine 214 (e.g., via its redirector 408).
[0063] At some later time, a user may desire to upload the same
content that was previously downloaded (referred to as
"upload-after-download"). For example, the upload-after-download
may occur as part of a file sharing transaction. It is worth noting
that the upload-after-download content may be either identical to
or different from the originally downloaded content; and where the
upload-after-download content is different, it may differ in
varying degrees. For example, a user may download a document,
modify the document, and re-upload the modified document. Depending
on the amount of modification, the upload-after-download content
(i.e., the re-uploaded, modified document) may be slightly or
significantly different (e.g., at the byte level) from the
downloaded version of the document.
[0064] When the user uploads the content in the
upload-after-download context, the upload request may be
intercepted by the redirector 408 and sent to the client optimizer
220. The request manager 416 in the client optimizer 220 parses the
upload request to find any object or other content data that should
be evaluated for optimization. The parsed data may then be encoded
by the request encoder 418 to generate one or more identifiers
associated with the content requested for upload.
[0065] In some embodiments, the request encoder 418 generates a
weak identifier (e.g., by applying a hashing function). The weak
identifier is then used to quickly find candidate matches for the
content among the digests stored in the server dictionary model
224. As noted above, when matches are found, embodiments of the
client optimizer 220 assume that the content (e.g., data blocks
needed to decompress a compressed version of the content) is
presently stored in the server dictionary 234. If matches are
found, the matching digests may be used to generate a highly
compressed version of the upload content. The highly compressed
version of the upload content may then be uploaded to the server
system 320 for decompression and/or further processing.
[0066] It is worth noting that strong and weak identifiers, as used
herein, may be generated in different ways according to different
functions. In some embodiments, received data blocks are of variable
size. The boundaries of the blocks are established, for example, by
a function that operates on N bytes. Each time the output of this
function has a particular value, a boundary is established. If the
value of the function does not match the particular value, the
block position may be advanced by one byte and a new function
output may be calculated. For computational efficiency, certain
embodiments of the function include a rolling checksum or other
algorithm that allows the function value to be adjusted as a new
byte is added and an old byte is removed from the set of N bytes
used to compute the function. This approach may allow the same
block boundaries to be established even when starting at different
points in a stream (e.g., a session stream).
[0067] The boundary points may delimit blocks of variable sizes,
and a strong identifier can then be calculated on each block
delimited in this way (e.g., using a Message-Digest algorithm 5
(MD5) technique, or other technique). When a boundary point is
reached, the strong identifier of the completed block can be
compared against the identifiers in the server dictionary model 224
to see if the new block matches data in the server dictionary 234.
Other techniques for delimiting blocks and identifying matches
with previous blocks are possible.
[0068] In one illustrative embodiment, a byte sequence is received
as a stream of data. For each N bytes, a rolling checksum is
calculated, for example, according to the equation:
$$\left( \sum_{i=0}^{N-1} f(x, i) \right) \bmod M$$
[0069] According to this equation, "i" is the position of a byte in
the sequence, so that i=0 for the first byte in the block and i=N-1
for the last byte in the block. Also according to the equation, "x"
is the value of the byte at position i, which may, for example, be
in the range 0-255. Further according to the equation, "f(x,i)" is
a function applied to each entry. For example, the function may use
x as an index into an array of prime values "P," which may be
multiplied by the local offset i, so that f(x,i)=P[x]*i. And
further, according to the equation, modulo arithmetic can be
applied to the total, so that the number of possible output values
is the modulo value. Adjusting the modulus may then adjust the
average size of the output blocks, as it sets the probability that
a match with the special value S (e.g., the "particular value"
discussed above) will occur at any point, where S is any value
between 0 and M-1. Each time the sum equals the special value S, a
boundary point is established in the incoming stream. Each pair of
boundary points may define a dictionary block, and a strong
identifier is calculated on each such block. The rolling checksum
function is applied to every block of N bytes in the incoming
stream.
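The boundary function of the preceding paragraphs might be implemented as follows. The parameter values (N, M, S) and the prime generator are assumptions for the example, since the application leaves them open; each window is also recomputed directly here for clarity, whereas the rolling update described above would avoid the per-window recomputation.

```python
def first_primes(count):
    """Tiny trial-division generator for the prime array P."""
    primes, candidate = [], 2
    while len(primes) < count:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes

P = first_primes(256)   # one prime P[x] per possible byte value x
N, M, S = 8, 64, 0      # hypothetical window size, modulus, special value

def checksum(window):
    """(sum over i of f(x, i)) mod M, with f(x, i) = P[x] * i."""
    return sum(P[x] * i for i, x in enumerate(window)) % M

def block_boundaries(data):
    """Positions where the windowed checksum equals the special value S,
    delimiting variable-size dictionary blocks in the incoming stream."""
    return [pos + N for pos in range(len(data) - N + 1)
            if checksum(data[pos:pos + N]) == S]
```

Because the boundary depends only on the window contents, the same block boundaries are established even when processing starts at different points in a stream, as the paragraph above notes.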
[0070] In one example, a user engages in file sharing by
downloading a one-Megabyte content file and then becoming a source
(e.g., a server) for that content file, uploading the file multiple
times. Without return-link optimization, the entire one Megabyte of
file data may be re-uploaded with each upload request. Using the
client optimizer 220, however, the return-link bandwidth usage may
be minimized. For example, the digests in the server dictionary
model 224 may provide 10,000-to-1 compression, such that the
one-Megabyte file can be compressed into only one-hundred bytes of
digest data. As such, even multiple upload requests may be
compressed into only hundreds or thousands of bytes of total
bandwidth usage on the return link.
[0071] Notably, as the optimization occurs on the return link from
the client optimizer 220 to the server optimizer 230, the
optimization may be unaffected by destinations for the upload
content beyond the server system 320. For example, the upload from
the user system 210 via the client optimizer 220 may be destined
for another user system 210 in communication with the server system
320. As discussed above, the optimization may be unaffected by a
presence or absence of a client optimizer 220 at the destination
user system 210.
[0072] Further, it is worth noting that the digest-based
optimization may provide optimization benefits (e.g., compression),
even where portions of a content file have changed. For example, in
a collaborative media editing environment, revisions of large media
files may be sent back and forth among a number of users. When each
revision upload is intercepted by a client optimizer 220, the
server dictionary model 224 may include digests for the unchanged
data blocks. As such, those unchanged blocks may still be sent in
highly compressed form, while the changes are sent in uncompressed
form (e.g., or, at least, not compressed according to the server
dictionary model 224). Upon receipt at the server optimizer 230,
the server dictionary 234 may then be used to decompress the
compressed blocks of the upload and/or updated with the
uncompressed revision data.
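The server-side decompression and dictionary update just described can be sketched as below. The `("ref"/"raw", value)` wire format is an assumption carried over for illustration, as is the use of MD5 digests as dictionary keys.

```python
import hashlib

def decode_upload(encoded, server_dictionary):
    """Rebuild upload content from a mix of compressed references and
    uncompressed revision data, folding any new blocks back into the
    server dictionary 234 (here, a dict of digest -> block bytes)."""
    content = bytearray()
    for kind, value in encoded:
        if kind == "ref":
            content += server_dictionary[value]   # expand from dictionary
        else:
            content += value                      # uncompressed revision data
            digest = hashlib.md5(value).hexdigest()
            server_dictionary[digest] = value     # learn the new block
    return bytes(content)
```

Unchanged blocks thus cost only a digest on the return link, while changed blocks both complete the reconstruction and update the dictionary for future uploads.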
[0073] The communications system 400 illustrated in FIG. 4 shows a
client optimizer 220 having storage only for a server dictionary
model 224. Embodiments of the client optimizer 220 shown in FIG. 4
may have no access (e.g., or no practical or efficient access) to
other storage capacity. For example, the client optimizer 220 may
not be authorized to access the machine storage 218 and/or may not
have additional capacity of its own (e.g., for storage of its own
dictionary or for a cache). However, in other embodiments, the
client optimizer 220 has additional capacity, which it may manage
according to whether return-link optimization is desired.
[0074] FIG. 5 shows a block diagram of an embodiment of a client
optimizer 220 having additional storage capacity and mode
selection, according to various embodiments. As in the client
optimizer 220 of FIG. 4, the client optimizer 220a of FIG. 5
includes a request manager 416, a request encoder 418, a response
manager 424, a transport manager 428, and a server dictionary model
224. However, the client optimizer 220a of FIG. 5 also includes a
client dictionary 524, a mode selector 520, and file sharing
detectors 510. Embodiments of the client optimizer 220a operate in
a "file sharing" operating mode when file sharing content (e.g., or
any content deemed a likely upload-after-download candidate) is
detected and in a "normal" mode for other types of traffic, as
described below.
[0075] When the user downloads content from the server system 320,
the content is received via the transport manager 428 by the client
optimizer 220. The received content is evaluated by the file
sharing detector 510 to determine whether the content includes file
sharing content. As discussed above, "file sharing" content is used
herein to describe any traffic having a probability of being
uploaded after download. This determination can be made in a number
of ways. For example, metadata may be evaluated to look for certain
file sharing protocols, certain types of content (e.g., file types)
may be deemed more likely to be re-uploaded, patterns of use may be
evaluated to find upload-after-download candidates, etc.
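The detection heuristics listed above can be sketched as follows. The specific ports, file extensions, and metadata fields are illustrative guesses only; the application leaves the detection criteria open.

```python
# Hypothetical heuristics for identifying upload-after-download candidates.
FILE_SHARING_PORTS = {6881, 6882, 6889}          # e.g., common BitTorrent ports
LIKELY_SHARED_TYPES = (".torrent", ".avi", ".mkv", ".iso")

def is_file_sharing(metadata):
    """Return True if the content is likely "file sharing" content,
    i.e., has a probability of being uploaded after download."""
    if metadata.get("protocol") == "bittorrent":   # known sharing protocol
        return True
    if metadata.get("port") in FILE_SHARING_PORTS:
        return True
    name = metadata.get("filename", "")
    return name.endswith(LIKELY_SHARED_TYPES)      # likely-shared file types
```

A production detector might also weigh observed patterns of use, per the paragraph above, rather than relying on static lists.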
[0076] The determination of the file sharing detector 510 may be
used to set the operating mode of the client optimizer 220a for
handling that content, and the response manager 424 may process the
content according to that operating mode. In some embodiments, if
the file sharing detector 510 determines that the content includes
file sharing content, the mode selector 520 may be set such that
the client optimizer 220a processes the content in "file sharing"
operating mode. For example, the file sharing content may be
processed as described above with reference to FIG. 4. The response
manager 424 may generate one or more identifiers (e.g., digests)
for storage in the server dictionary model 224, and may pass the
content to the user machine 214.
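The identifier generation of the response manager 424 might look like the following sketch, which records per-block digests in the client-side server dictionary model 224. The fixed block size and the use of SHA-256 are illustrative assumptions; the disclosure specifies only that digests are generated and stored.

```python
import hashlib

BLOCK_SIZE = 64 * 1024  # hypothetical fixed block size

def update_server_dictionary_model(model, content):
    """On download of a likely upload-after-download candidate, record
    per-block digests in the client-side model (224) of the server
    dictionary, presuming the server stores the same blocks."""
    digests = []
    for offset in range(0, len(content), BLOCK_SIZE):
        block = content[offset:offset + BLOCK_SIZE]
        digest = hashlib.sha256(block).hexdigest()
        digests.append(digest)
        # Record the digest; the content itself need not be kept.
        model[digest] = (offset, len(block))
    return digests
```

Note that only the digests are retained client-side; the model reflects a presumption about what the server dictionary holds rather than a copy of the data.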
[0077] If the file sharing detector 510 determines that there is no
file sharing content, the mode selector 520 may be set such that
the client optimizer 220a processes the content in "normal"
operating mode. According to the normal operating mode, the content
may be processed in a number of ways, including using the client
dictionary 524 for various types of optimization. In one
embodiment, the normal operating mode exploits deltacasting
opportunities, as described in U.S. patent application Ser. No.
12/651,909, entitled "DELTACASTING" (017018-019510US), filed on
Jan. 4, 2010, which is incorporated herein by reference in its
entirety for all purposes. In other embodiments, the normal
operating mode configures the client optimizer 220a to implement
functionality of delta coders, caches, and/or other types of
network components known in the art.
[0078] When the user uploads content from the user machine 214, the
upload request may be sent to the client optimizer 220. The request
manager 416 in the client optimizer 220 processes (e.g., parses)
the upload request to find any object or other content data that
should be evaluated for optimization. In some embodiments, the
parsed data is then encoded by the request encoder 418 to generate
one or more identifiers associated with the content requested for
upload. The identifiers may then be evaluated against one or both
of the server dictionary model 224 and the client dictionary 524 to
find and/or exploit matches.
[0079] In other embodiments, information obtained from processing
the upload request is used by the file sharing detector 510 to
determine whether the upload request includes file sharing traffic.
The operating mode may then be selected by the mode selector 520
and the upload request may be encoded by the request encoder 418
according to the determination of the file sharing detector 510.
For example, if the file sharing detector 510 detects file sharing
content, the mode selector 520 may select the "file sharing"
operating mode. In this mode, the request encoder 418 may generate
a weak identifier for use in finding matching digests in the server
dictionary model 224 without any reference to the client dictionary
524. As discussed above, any matches found in the server dictionary
model 224 may then be used to compress the upload request, for
example, for return-link optimization.
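The weak-identifier matching described above can be sketched as follows. Adler-32 stands in for the weak hashing function and SHA-256 for the strong digest; both choices, along with the block size and index layout, are illustrative assumptions rather than the disclosed implementation.

```python
import hashlib
import zlib

def weak_identifier(block):
    # Adler-32 as a cheap stand-in for the weak hashing function.
    return zlib.adler32(block)

def encode_upload(content, weak_index, block_size=64 * 1024):
    """weak_index maps weak ids to sets of strong digests that the
    client believes are in the server dictionary (server dictionary
    model 224)."""
    encoded = []
    for offset in range(0, len(content), block_size):
        block = content[offset:offset + block_size]
        strong = hashlib.sha256(block).hexdigest()
        if strong in weak_index.get(weak_identifier(block), ()):
            # Match found: upload a short digest reference instead of
            # the block data (highly compressed version).
            encoded.append(("ref", strong))
        else:
            # No match: fall back to uploading the literal block.
            encoded.append(("raw", block))
    return encoded
```

The weak identifier allows a fast first-pass lookup; the strong digest confirms the match before the block data is replaced with a reference.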
[0080] It will be appreciated that, while the above descriptions of
content transactions focus on requests and responses, these terms
are intended to be broadly construed, and embodiments of the
invention function within many other contexts. For example,
embodiments of the communication system 400 are used to provide
interactive Internet services (e.g., access to the world-wide web,
email communications, file serving and sharing, etc.), television
services (e.g., satellite broadcast television, Internet protocol
television (IPTV), on-demand programming, etc.), voice
communications (e.g., telephone services,
voice-over-Internet-protocol (VoIP) telephony, etc.), networking
services (e.g., mesh networking, VPN, VLAN, MPLS, VPLS, etc.), and
other communication services. As such, the "response" data
discussed above is intended only as an illustrative type of data
that may be received by the server optimizer 230 from a content
source (e.g., a content server 250). For example, the "response"
data may actually be pushed, multicast, or otherwise communicated
to the user without an explicit request from the user.
[0081] It will be further appreciated that the systems and
components described above are merely some exemplary embodiments,
and various methods of the invention can be performed by those and
other system embodiments. FIG. 6 shows an illustrative
method 600 for performing return-link optimization, according to
various embodiments. Particularly, the method 600 illustrates an
"upload-after-download" scenario (e.g., a re-upload to the
Internet, P2P file sharing of previously downloaded content, etc.).
In some embodiments, the method 600 is performed by one or more
components of a client optimizer 220, as described above with
reference to FIGS. 1-5.
[0082] For the sake of added clarity, the method 600 is shown with
reference to client-side activities 602 and server-side activities
604, and with reference to illustrative timing on a timeline 605.
Of course, certain client-side functions may be performed by
server-side components, certain server-side functions may be
performed by client-side components, and specific timing of process
blocks may be changed without affecting the method 600. Further, it
will be appreciated that the timeline 605 is not intended to show
any time scale (relative or absolute), and certain process blocks
may occur in series, in parallel, or otherwise, according to
various embodiments. For at least these reasons, it will be
appreciated that these elements of FIG. 6 are intended only for
clarity and are not intended to limit the scope of the method 600
in any way.
[0083] Some embodiments of the method 600 begin at a first time
610a (shown on timeline 605), when the client side 602 (e.g., a
user of a user machine) requests content for download in block 620.
At a second time 610b (e.g., after some delay due to latency of a
satellite communication link, etc.), the server side 604 receives
and processes the request at block 624. At block 628, the server
side 604 transmits the requested content to the requesting client
side 602 in response to the request. In some embodiments, the
server side 604 also determines whether the content represents an
optimization candidate at block 636c. For example, the server side
604 may evaluate the response data to determine whether it includes
file sharing content. If the traffic is deemed an optimization
candidate (e.g., or in all cases, for example, where a
determination is not made at block 636c), the server side 604
stores the response data in a local dictionary at block 630.
[0084] At a third time 610c, the client side 602 receives the
content at block 632. In some embodiments, the client side 602
determines whether the content represents an optimization candidate
at block 636a. Embodiments of the determination may be similar to
those made at the server side 604 in block 636c. For example, the
client side 602 may evaluate the response data to determine whether
it includes file sharing content. If the traffic is deemed an
optimization candidate, identifiers (e.g., digests) may be
generated at block 640 and used to update a server dictionary model
at the client side 602.
[0085] Sometime later, at a fourth time 610d, the client side 602
makes a request at block 644 that involves upload of the content
received in block 632. For example, the client side 602 desires to
re-upload the content to another location on the Internet, share
the content with another user via the communication system,
etc. In some embodiments, at a fifth time 610e, the upload request
is intercepted at block 636b and a determination is made (e.g., as
in block 636a) as to whether the upload request includes content
relating to an optimization candidate (e.g., file sharing
content).
[0086] If the upload request includes optimizable content,
according to the determination of block 636b, an identifier may be
generated at block 648. For example, a weak identifier may be
generated by applying a hashing function to the upload content
data. At block 652, the identifier is used to find any candidate
matches among the digests stored in the server dictionary model. If
matches are not found, the content data may be uploaded at block
656a. If matches are found, the matching digests may be used to
generate and upload a highly compressed version of the upload
content at block 656b.
[0087] At a sixth time 610f (e.g., again after some delay due to
latency), the server side 604 may receive and process the uploaded
content at block 660. At block 664, the server dictionary may be
updated with any blocks not already in the dictionary. For example,
if the content is uploaded at block 656a without digest-based
compression, or if some of the blocks of the content were uploaded
without digest-based compression due to changes in the file, the
server dictionary may be updated at block 664.
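The server-side processing of blocks 660 and 664 can be sketched as a decode-and-update step: digest references are resolved against the server dictionary, and any literal blocks are added to it. As before, SHA-256 and the dictionary layout are illustrative assumptions.

```python
import hashlib

def decode_and_update(encoded, server_dictionary):
    """Server side: rebuild the uploaded content from digest
    references and literal blocks (block 660), storing any blocks not
    already in the server dictionary (block 664)."""
    out = bytearray()
    for kind, payload in encoded:
        if kind == "ref":
            # payload is a digest; the referenced block must already
            # be in the server dictionary.
            out.extend(server_dictionary[payload])
        else:
            # payload is literal block data; store it so a future
            # upload of the same block can be compressed.
            digest = hashlib.sha256(payload).hexdigest()
            server_dictionary.setdefault(digest, payload)
            out.extend(payload)
    return bytes(out)
```

This keeps the server dictionary consistent with the client's model: every literal block the server receives becomes available for digest-based compression on subsequent uploads.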
[0088] In some embodiments, the upload-after-download scenario is
part of a peer-to-peer (P2P) file sharing process, or some other
process in which the upload is destined for a node of the
communications system other than the server side 604. In
embodiments of these transactions, the uploaded content may then be
communicated to a destination node at block 668. For example, the
content may be pushed to a user at the same or another client side
602.
[0089] It will be appreciated that various embodiments have been
described herein with reference to upload-after-download
transactions. However, similar functionality may be used to
optimize return-link bandwidth usage in the context of multiple
uploads of the same content. FIG. 7 shows an illustrative method
700 for performing return-link optimization for an
upload-after-upload transaction, according to various embodiments.
As with the method 600 of FIG. 6, the method 700 is shown with
reference to client-side activities 702 and server-side activities
704, and with reference to illustrative timing on a timeline
705.
[0090] Embodiments of the method 700 begin at a first time 710a
(shown on timeline 705), when the client side 702 (e.g., a user of
a user machine) requests upload of content in block 720. At a
second time 710b, the upload request is intercepted at block 724a
and a determination is made as to whether the upload request
includes content relating to an optimization candidate (e.g., file
sharing content). If so, an identifier (e.g., a digest) may be
generated at block 728 and added to the server dictionary model.
For example, it may be assumed that the data will be stored in the
server dictionary after it is received by the server side 704 as
part of the present upload request.
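The optimistic update of block 728 can be sketched as follows; the digest function and block size are again illustrative assumptions.

```python
import hashlib

def record_first_upload(content, server_model, block_size=64 * 1024):
    """Block 728: before the first upload completes, optimistically
    add the content's per-block digests to the server dictionary
    model, presuming the server will store the blocks on receipt."""
    for offset in range(0, len(content), block_size):
        block = content[offset:offset + block_size]
        server_model[hashlib.sha256(block).hexdigest()] = True
```

This optimism is safe: if the presumption fails and a later digest reference misses at the server, the client can simply fall back to uploading the literal block data, as in block 756a.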
[0091] The content may then be uploaded at block 732. It is assumed
in the illustrative method 700 that this is the first time the
content is being uploaded to the server side 704. At a third time
710c, the uploaded content is received and processed by the server
side 704 at block 736. In some embodiments, at block 740, the
server dictionary is updated to reflect the uploaded content.
[0092] Sometime later, at a fourth time 710d, the client side 702
makes a request at block 744 that involves a second upload of the
content previously uploaded in block 732. For example, the client
side 702 desires to re-upload the content to another location on
the Internet, share the content with another user via the
communication system, etc. In some embodiments, at a fifth time
710e, the upload request is intercepted at block 724b and a
determination is made (e.g., as in block 724a) as to whether the
upload request includes content relating to an optimization
candidate (e.g., file sharing content).
[0093] If the upload request includes optimizable content,
according to the determination of block 724b, an identifier may be
generated at block 748. For example, a weak identifier may be
generated by applying a hashing function to the upload content
data. At block 752, the identifier is used to find any candidate
matches among the digests stored in the server dictionary model. If
matches are not found, the content data may be uploaded at block
756a. If matches are found, the matching digests may be used to
generate and upload a highly compressed version of the upload
content at block 756b.
[0094] At a sixth time 710f, the server side 704 may receive and
process the uploaded content at block 760. At block 764, the
server dictionary may be updated with any blocks not already in the
dictionary. For example, if the content is uploaded at block 756a
without digest-based compression, or if some of the blocks of the
content were uploaded without digest-based compression due to
changes in the file, the server dictionary may be updated at block
764. In some embodiments, the uploaded content may then be
communicated to a destination node other than the server side 704
at block 768.
[0095] It is worth noting that blocks 744, 748, 752, 756a, 756b,
760, 764, and 768 of the method 700 of FIG. 7 may be implemented
substantially identically to blocks 644, 648, 652, 656a, 656b, 660,
664, and 668 of the method 600 of FIG. 6, respectively. For
example, once content has been uploaded to the server (e.g., either
after a download, as in FIG. 6, or not, as in FIG. 7), the data may
be used to reduce return-link bandwidth on future uploads of the
same content. As such, embodiments of systems and methods described
herein handle both upload-after-download and upload-after-upload
transactions.
[0096] The above description is intended to provide various
embodiments of the invention, but does not represent an exhaustive
list of all embodiments. For example, those of skill in the art
will appreciate that various modifications are available within the
scope of the invention. Further, while the disclosure includes
various sections and headings, the sections and headings are not
intended to limit the scope of any embodiment of the invention.
Rather, disclosure presented under one heading may inform
disclosure presented under a different heading. For example,
descriptions of embodiments of method steps for handling
overlapping content requests may be used to inform embodiments of
methods for handling anticipatory requests.
[0097] Specific details are given in the above description to
provide a thorough understanding of the embodiments. However, it is
understood that the embodiments may be practiced without these
specific details. For example, well-known processes, algorithms,
structures, and techniques may be shown without unnecessary detail
in order to avoid obscuring the embodiments. Implementation of the
techniques, blocks, steps, and means described above may be done in
various ways. For example, these techniques, blocks, steps, and
means may be implemented in hardware, software, or a combination
thereof. For a hardware implementation, the processing units may be
implemented within one or more application specific integrated
circuits (ASICs), digital signal processors (DSPs), digital signal
processing devices (DSPDs), programmable logic devices (PLDs),
field-programmable gate arrays (FPGAs), soft core processors, hard
core processors, controllers, micro-controllers, microprocessors,
other electronic units designed to perform the functions described
above, and/or a combination thereof. Software can be used instead
of or in addition to hardware to perform the techniques, blocks,
steps, and means.
[0098] Also, it is noted that the embodiments may be described as a
process which is depicted as a flowchart, a flow diagram, a data
flow diagram, a structure diagram, or a block diagram. Although a
flowchart may describe the operations as a sequential process, many
of the operations can be performed in parallel or concurrently. In
addition, the order of the operations may be re-arranged. A process
is terminated when its operations are completed, but could have
additional steps not included in the figure. A process may
correspond to a method, a function, a procedure, a subroutine, a
subprogram, etc. When a process corresponds to a function, its
termination corresponds to a return of the function to the calling
function or the main function.
[0099] Furthermore, embodiments may be implemented by hardware,
software, scripting languages, firmware, middleware, microcode,
hardware description languages, and/or any combination thereof.
When implemented in software, firmware, middleware, scripting
language, and/or microcode, the program code or code segments to
perform the necessary tasks may be stored in a machine readable
medium such as a storage medium. A code segment or
machine-executable instruction may represent a procedure, a
function, a subprogram, a program, a routine, a subroutine, a
module, a software package, a script, a class, or any combination
of instructions, data structures, and/or program statements. A code
segment may be coupled to another code segment or a hardware
circuit by passing and/or receiving information, data, arguments,
parameters, and/or memory contents. Information, arguments,
parameters, data, etc. may be passed, forwarded, or transmitted via
any suitable means including memory sharing, message passing, token
passing, network transmission, etc.
[0100] For a firmware and/or software implementation, the
methodologies may be implemented with modules (e.g., procedures,
functions, and so on) that perform the functions described herein.
Any machine-readable medium tangibly embodying instructions may be
used in implementing the methodologies described herein. For
example, software codes may be stored in a memory. Memory may be
implemented within the processor or external to the processor. As
used herein the term "memory" refers to any type of long term,
short term, volatile, nonvolatile, or other storage medium and is
not to be limited to any particular type of memory or number of
memories, or type of media upon which memory is stored.
[0101] Moreover, as disclosed herein, the term "storage medium" may
represent one or more memories for storing data, including read
only memory (ROM), random access memory (RAM), magnetic RAM, core
memory, magnetic disk storage mediums, optical storage mediums,
flash memory devices and/or other machine readable mediums for
storing information. Similarly, terms like "cache" are intended to
broadly include any type of storage, including temporary or
persistent storage, queues (e.g., FIFO, LIFO, etc.), buffers (e.g.,
circular, etc.), etc. The term "machine-readable medium" includes,
but is not limited to, portable or fixed storage devices, optical
storage devices, wireless channels, and/or various other storage
mediums capable of storing, containing, or carrying instruction(s)
and/or data.
[0102] Further, certain portions of embodiments (e.g., method
steps) are described as being implemented "as a function of" other
portions of embodiments. This and similar phraseology, as used
herein, is intended broadly to include any technique for determining
one element partially or completely according to another element. In
various embodiments, determinations "as a function of" a factor may
be made in any way, so long as the outcome of the determination is
at least partially dependent on the factor.
[0103] While the principles of the disclosure have been described
above in connection with specific apparatuses and methods, it is to
be clearly understood that this description is made only by way of
example and not as limitation on the scope of the disclosure.
* * * * *