Managing Connections In A Data Storage System Varadharajan; Prakash ; et al. [COMMVAULT SYSTEMS, INC.]


Patent Application Summary

U.S. patent application number 13/874358 was filed with the patent office on 2013-04-30 and published on 2013-09-19 for managing connections in a data storage system. This patent application is currently assigned to COMMVAULT SYSTEMS, INC. The applicant listed for this patent is COMMVAULT SYSTEMS, INC. Invention is credited to Henry Wallace Dornemann, Parag Gokhale, Prakash Varadharajan.

Publication Number	20130247154
Application Number	13/874358
Family ID	42738802
Filed Date	2013-04-30

United States Patent Application	20130247154
Kind Code	A1
Varadharajan; Prakash ; et al.	September 19, 2013

MANAGING CONNECTIONS IN A DATA STORAGE SYSTEM

Abstract

Described in detail herein are systems and methods for managing connections in a data storage system. For example, the systems and methods may be used to manage connections between two or more computing devices for purposes of performing storage operations on the data of one of the computing devices. The data storage system includes at least two computing devices. A first computing device includes an unauthorized connection data structure and a connection manager component. The connection manager component receives a connection request from a second computing device. If the second computing device is not identified on the unauthorized connection data structure, the connection manager component can request that an authentication manager authenticate the second computing device and/or determine whether the second computing device is properly authorized. If so, the connection manager component can allow the second computing device to connect to the first computing device.

Inventors:

Varadharajan; Prakash; (Manalapan, NJ) ; Dornemann; Henry Wallace; (Eatontown, NJ) ; Gokhale; Parag; (Marlboro, NJ)

Applicant:

Name	City	State	Country	Type
COMMVAULT SYSTEMS, INC.	Oceanport	NJ	US

Assignee:

COMMVAULT SYSTEMS, INC.
Oceanport
NJ

Family ID:

42738802

Appl. No.:

13/874358

Filed:

April 30, 2013

Related U.S. Patent Documents


Application Number	Filing Date	Patent Number
12643653	Dec 21, 2009	8434131
61162140	Mar 20, 2009

Current U.S. Class:	726/4
Current CPC Class:	H04L 63/08 20130101; H04L 63/029 20130101; H04L 63/101 20130101
Class at Publication:	726/4
International Class:	H04L 29/06 20060101 H04L029/06

Claims

1. A system for managing connections in a data storage system, wherein the data storage system includes at least one client computing device storing data, the system comprising: an authentication manager; a storage device; and at least one secondary storage computing device configured to receive a request from the client computing device to store the data on the storage device, wherein the secondary storage computing device includes: a blacklist that includes one or more entries, wherein the entries include an identifier of a computing device; a connection manager component configured to: receive, at a first time, from the client computing device a connection request, wherein the connection request includes an identifier identifying the client computing device; based upon the identifier of the client computing device or the combination of the identifier of the client computing device and the first time, determine from the blacklist whether the connection request from the client computing device should be refused; refuse the connection request from the client computing device if the connection request from the client computing device should be refused based upon the determination from the blacklist; if the connection request from the client computing device should not be refused based upon the determination from the blacklist, then determine whether the client computing device is authenticated or authorized to connect to the secondary storage computing device; if the client computing device is either not authenticated or not authorized to connect to the secondary storage computing device, refuse the connection request from the client computing device; and if the client computing device is authenticated, or if the client computing device is authorized to connect to the secondary storage computing device, allow the client computing device to connect to the secondary storage computing device, wherein the secondary storage computing device is located at a friendly side of a firewall, wherein the client computing device is not located at the friendly side of the firewall, and wherein the secondary storage computing device receives the request from the client computing device through the firewall.

2. The system of claim 1 wherein the entries of the blacklist further include a timestamp indicating a time at which the secondary storage computing device received a connection request from the identified computing device.

3. The system of claim 1 wherein the connection manager component is further configured to either: add an entry to the blacklist that includes the identifier of the client computing device and a timestamp indicating the first time; or modify a timestamp of an existing entry of the blacklist that includes the identifier of the client computing device to indicate the first time.

4. The system of claim 1 wherein the secondary storage computing device further includes: an interface at which the secondary storage computing device receives the connection request; and an interface blacklist, wherein the interface blacklist includes one or more entries, wherein at least one of the entries is configured to include an identifier of an interface, wherein the connection manager component is further configured to: determine an identifier of the interface; based upon the identifier of the interface, determine from the interface blacklist whether the connection request from the client computing device should be refused; and refuse the connection request from the client computing device if the connection request from the client computing device should be refused based upon the determination from the interface blacklist.

5. The system of claim 1 wherein the connection manager component is further configured to remove an existing entry that includes the identifier of the client computing device from the blacklist if the client computing device is authorized.

6. The system of claim 1 wherein the connection manager component is further configured to: receive an identifier of a computing device to which the connection manager component should refuse connection requests; and add the identifier of the computing device to the blacklist.

7. The system of claim 1 wherein the connection manager component is further configured to receive an indication to enable refusing connection requests from computing devices that are not authorized.

8. The system of claim 1 wherein the blacklist includes at least two entries, wherein a first entry includes an identifier of a computing device, and wherein a second entry includes an identifier of another computing device and a timestamp indicating a time at which the connection manager component received a connection request from the identified other computing device.

9. The system of claim 1 wherein the data storage system includes at least two different hierarchical tiers of data storage, wherein the client computing device is at a first hierarchical tier of data storage, and wherein the secondary storage computing device is at a second hierarchical tier of data storage.

10. The system of claim 1, wherein the authentication manager performs both authentication of the client computing device and determining whether the client computing device is authorized to access the secondary storage computing device.

11. A method of managing connections in a data storage system, wherein the data storage system includes at least two computing devices, the method comprising: receiving at a first time at a local computing device a connection request from a remote computing device, wherein the connection request includes an identifier that identifies the remote computing device, wherein the local computing device is located at a friendly side of a firewall, and wherein the connection request is received through the firewall; accessing an unauthorized connection data structure, wherein the unauthorized connection data structure includes one or more entries, wherein the entries include an identifier of a computing device; receiving an indication to enable refusing connection requests at the local computing device to computing devices that are not at the friendly side of the firewall; based upon the identifier of the remote computing device or the combination of the identifier of the remote computing device and the first time, determining from the unauthorized connection data structure whether the connection request from the remote computing device should be refused; if the connection request from the remote computing device should be refused based upon the determination from the unauthorized connection data structure, then refusing the connection request from the remote computing device.

12. The method of claim 11, further comprising, if the connection request from the remote computing device is refused, then either: adding an entry to the unauthorized connection data structure that includes the identifier of the remote computing device and a timestamp indicating the first time; or modifying a timestamp of an existing entry of the unauthorized connection data structure that includes the identifier of the remote computing device to indicate the first time.

13. The method of claim 11 wherein the local computing device receives the connection request at an interface, and further comprising: determining an identifier of the interface at which the connection request is received; accessing an interface blacklist data structure, wherein the interface blacklist data structure includes one or more entries, wherein at least one of the entries is configured to store an identifier of an interface; based upon the identifier of the interface, determining from the interface blacklist data structure whether the connection request from the remote computing device should be refused; and if the connection request from the remote computing device should be refused based upon the determination from the interface blacklist data structure, then refusing the connection request from the remote computing device.

14. The method of claim 11 further comprising: requesting that an authorization computing device determine whether the remote computing device is authorized to connect to the local computing device; and receiving an indication from the authorization computing device whether the remote computing device is authorized to connect to the local computing device, wherein, if the remote computing device is authorized to connect to the local computing device, then removing an existing entry from the unauthorized connection data structure that includes the identifier of the remote computing device.

15. The method of claim 11, further comprising: receiving an identifier of a computing device to which the local computing device should refuse connection requests; and adding the identifier of the computing device to the unauthorized connection data structure.

16. The method of claim 11, wherein the entries of the unauthorized connection data structure further include a timestamp indicating a time at which the secondary storage computing device received a connection request from the identified computing device.

17. The method of claim 11 wherein the unauthorized connection data structure includes at least two entries, wherein a first entry includes only an identifier of a computing device, and wherein a second entry includes an identifier of another computing device and a timestamp indicating a time at which the local computing device received a connection request from the identified other computing device.

18. The method of claim 11 wherein at least one of the entries included in the unauthorized connection data structure includes an identifier of a computing device that is not licensed in the data storage system.

19. A computer-readable medium including instructions for managing connections in a data storage system, wherein the data storage system includes at least two computing devices, comprising: receiving at a first time at a first computing device a connection request from a second computing device, wherein the connection request includes an identifier that identifies the second computing device; accessing an unauthorized connection data structure, wherein the unauthorized connection data structure includes zero or more entries, and, wherein the entries are configured to include an identifier of a computing device; receiving an indication to enable refusing connection requests at the local computing device to computing devices that are not on a friendly side of a firewall and that are not authenticated; based upon the combination of the identifier of the second computing device and the first time, determining from the unauthorized connection data structure whether the connection request from the second computing device should be refused; if the connection request from the second computing device should be refused based upon the determination from the unauthorized connection data structure, then refusing the connection request from the second computing device.

20. The computer-readable medium of claim 19 further comprising, if the connection request from the second computing device is refused, then either: adding an entry to the unauthorized connection data structure that includes the identifier of the second computing device and a timestamp indicating the first time; or modifying a timestamp of an existing entry of the unauthorized connection data structure that includes the identifier of the second computing device to indicate the first time.

Description

CROSS-REFERENCE TO RELATED APPLICATION

[0001] This application is a continuation of U.S. patent application Ser. No. 12/643,653 filed Dec. 21, 2009, now U.S. Pat. No. 8,434,131 (entitled MANAGING CONNECTIONS IN A DATA STORAGE SYSTEM, Attorney Docket No. 60692-8070.US01), which claims priority to U.S. patent application Ser. No. 61/162,140 filed Mar. 20, 2009 (entitled MANAGING CONNECTIONS IN A DATA STORAGE SYSTEM, Attorney Docket No. 60692-8070.US00), the entirety of each of which is incorporated by reference herein.

BACKGROUND

[0002] A data storage system implemented by an organization may include numerous entities (e.g., computing devices such as personal computers, server computers, mobile devices, etc., as well as storage devices such as magnetic storage devices, tape libraries, etc.). For example, the data storage system may include a set of first entities storing data that the organization wishes to protect (e.g., a set of computing devices external to the organization) and a second entity that performs data storage operations upon the data (e.g., a local computing device that copies the data of external computing devices to a storage device). The data storage system may also include a third entity that manages the data storage operations (e.g., a managing computing device that authenticates the external computing devices, determines if they are authorized to access the local computing device, and schedules copy operations).

[0003] In such a data storage system or in other data storage systems, the organization may wish, for various reasons, to exclude certain entities from performing data storage operations or from having data storage operations performed on their data. For example, the organization may no longer wish to protect the data of certain of the first entities, such as those of an external organization. One way the organization may implement this is by removing the authorization of these external computing devices to access resources in the data storage system and their ability to access such resources. However, if the organization is unable to perform either of these steps for one reason or another (e.g., the organization does not have effective control over them), the external computing devices in this example may not be effectively excluded from requesting the use of resources in the data storage system. Therefore, these external computing devices may continue to request that the organization's local computing device perform data storage operations upon their data. This may result in the local computing device being unable to perform data storage operations upon the data of external computing devices that are still authorized in the data storage system. Therefore, the inability to effectively exclude such external computing devices in this example from requesting the use of resources in the data storage system may result in a denial of data storage operation services to authorized entities.

[0004] An entity in a data storage system that has multiple Network Interface Controllers (NICs) may be required, for one or more reasons, to receive and/or respond to connection requests on all of the NICs. For example, an entity that is in a clustered configuration (e.g., a Microsoft Windows clustered configuration), and that performs data storage operations upon the data of other entities, may be required (e.g., by the Microsoft Windows clustering software) to receive connection requests on the NIC of each node in the cluster. However, it may be desirable to configure the entity to receive and/or respond to connection requests on a subset of all of its NICs. As another example, it may be desirable to configure an entity that has multiple NICs, each attached to a different network (e.g., one NIC attached to a private network such as a Local Area Network (LAN) or a Wide Area Network (WAN) and one NIC attached to a public network such as the Internet), to only receive and/or respond to connection requests received at one NIC (e.g., to only receive and/or respond to connection requests received at the NIC attached to the private network).

[0005] The need exists for systems and methods that overcome the above problems, as well as systems and methods that provide additional benefits. Overall, the examples herein of some prior or related systems and methods and their associated limitations are intended to be illustrative and not exclusive. Other limitations of existing or prior systems and methods will become apparent to those of skill in the art upon reading the following Detailed Description.

BRIEF DESCRIPTION OF THE DRAWINGS

[0006] FIG. 1 is a block diagram illustrating an example of a data storage system that may employ aspects of the invention.

[0007] FIGS. 2A-2C are block diagrams illustrating example computing networks that may employ aspects of the invention.

[0008] FIG. 3A is a block diagram illustrating a computing device configured in accordance with aspects of the invention.

[0009] FIG. 3B is a diagram illustrating a suitable data structure that may be utilized by aspects of the invention.

[0010] FIG. 4 is a flow diagram of a process for receiving connection requests.

[0011] FIGS. 5A and 5B are a flow diagram of a process for allowing or refusing connection requests.

DETAILED DESCRIPTION

[0012] The headings provided herein are for convenience only and do not necessarily affect the scope or meaning of the claimed invention.

Overview

[0013] Described in detail herein are systems and methods for managing connections in a data storage system (alternatively called a data storage network, a data storage environment, or a data storage enterprise). For example, the systems and methods may be used to manage connections for purposes of performing data storage operations in the data storage system. Data storage operations include, for example and without limitation, backup operations, restore operations, archival operations, copy operations, Continuous Data Replication (CDR) operations, recovery operations, migration operations, and Hierarchical Storage Management (HSM) operations. In some examples, a system for managing connections in a data storage system includes at least two computing devices. A first computing device includes an unauthorized connection data structure (e.g., a blacklist of unauthorized computing devices) and a connection manager component that uses the unauthorized connection data structure to manage connection requests from other computing devices.

[0014] The unauthorized connection data structure includes one or more entries (although it may begin with zero entries). The entries are configured to store information including an identifier of a computing device and, optionally, a timestamp indicating a time at which the first computing device received a connection request from the identified computing device. Entries in the unauthorized connection data structure may be permanent or semi-permanent entries, which include information identifying computing devices to which connection requests should generally always be refused. Or, entries may be dynamic entries, which include information identifying computing devices to which connection requests have been refused based upon the occurrence of an event or the satisfaction of a rule (e.g., an unsuccessful authentication and/or authorization attempt).

[0015] For example, the unauthorized connection data structure may include Internet Protocol (IP) addresses of several computers that are permanently or semi-permanently not authorized to perform data storage operations. These IP addresses correspond to permanent or semi-permanent entries. The connection manager component will generally always refuse connection requests from such computers. The unauthorized connection data structure may also include IP addresses of several computers that may also no longer be authorized to perform data storage operations, but whose IP addresses are not permanently or semi-permanently added to the unauthorized connection data structure. These IP addresses have corresponding timestamps from which the period of time for which the identified computers are prohibited from connecting can be inferred. These entries correspond to dynamic entries in the unauthorized connection data structure.
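The two kinds of entries described above can be illustrated with a minimal Python sketch. The names `BlacklistEntry`, `is_dynamic`, and `blacklist`, the use of a dictionary keyed by IP address, and the sample addresses are assumptions made for illustration only; the description does not prescribe any particular representation.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class BlacklistEntry:
    """One entry of the unauthorized connection data structure (hypothetical layout)."""
    identifier: str                    # e.g., an IP address
    timestamp: Optional[float] = None  # None marks a permanent or semi-permanent entry

    @property
    def is_dynamic(self) -> bool:
        # Dynamic entries carry the time of a connection request.
        return self.timestamp is not None


# The structure may begin with zero entries.
blacklist: Dict[str, BlacklistEntry] = {}

# Hypothetical entries: one permanent, one dynamic (timestamp in seconds).
blacklist["192.0.2.10"] = BlacklistEntry("192.0.2.10")
blacklist["192.0.2.20"] = BlacklistEntry("192.0.2.20", timestamp=1_000_000.0)
```

Keying the structure by identifier is one possible choice; it makes the lookup performed on each connection request a single dictionary access.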

[0016] When the first computing device's connection manager component receives a connection request from a second computing device, it determines an identifier identifying the second computing device (e.g., an IP address) from the connection request. The connection manager component uses either the identifier or the combination of the identifier and the time at which the connection request is received to determine from the unauthorized connection data structure whether the connection request from the second computing device should be refused. If so, the connection manager component refuses the connection request. If not, the connection manager component attempts to authenticate the second computing device and determine whether the second computing device is authorized to access resources of the first computing device.
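The blacklist determination described above can be sketched as follows. This is an illustrative Python sketch, not the claimed implementation: the function name `should_refuse`, the mapping from identifier to timestamp, and the one-hour value of `BLOCK_PERIOD_SECONDS` are all assumptions.

```python
from typing import Dict, Optional

# Assumed length of the prohibition period for dynamic entries, in seconds.
BLOCK_PERIOD_SECONDS = 3600.0


def should_refuse(
    blacklist: Dict[str, Optional[float]],
    identifier: str,
    request_time: float,
) -> bool:
    """Decide from the blacklist alone whether to refuse a connection request.

    blacklist maps an identifier to a timestamp; a timestamp of None is
    used here to mark a permanent or semi-permanent entry.
    """
    if identifier not in blacklist:
        # Not listed: the request proceeds to authentication/authorization.
        return False
    timestamp = blacklist[identifier]
    if timestamp is None:
        # Permanent entry: the identifier alone decides.
        return True
    # Dynamic entry: the identifier combined with the request time decides.
    return request_time - timestamp < BLOCK_PERIOD_SECONDS
```

For a permanent entry only the identifier is consulted, while for a dynamic entry the request time is compared against the stored timestamp, mirroring the "identifier or combination of identifier and time" wording.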

[0017] For example, if the second computing device's IP address is not on the unauthorized connection data structure, the connection manager component can request that a third computing device (e.g., an authentication computing device) authenticate the second computing device. The connection manager component can determine whether the second computing device is properly authorized, or the third computing device can perform the authorization process.

[0018] If the second computing device is authenticated and properly authorized, then the connection manager component allows the second computing device to connect to the first computing device. If not, the connection manager component refuses the connection request. In some cases, the connection manager component either adds an entry to the unauthorized connection data structure that includes the identifier of the second computing device and a timestamp indicating the connection request time, or modifies a timestamp of an existing entry that already includes the identifier of the second computing device to indicate the connection request time.

[0019] For example, if the second computing device does not have an entry in the unauthorized connection data structure identifying it (this may indicate that the second computing device has not tried to connect to the first computing device within a certain time period), the connection manager component will add an entry to the unauthorized connection data structure. If the second computing device already has an entry in the unauthorized connection data structure (this typically indicates that the second computing device has previously tried to connect to the first computing device within the certain time period), the connection manager component will update the existing entry to update the time of the latest connection request. Therefore, a second unsuccessful connection request within the certain time period extends the period of time for which the second computing device is prohibited from connecting to the first computing device.
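The add-or-update behavior described above, together with the preceding allow/refuse decision, can be sketched in Python as follows. This is one possible reading under stated assumptions: the function name `handle_connection_request`, the callback `is_authenticated_and_authorized` (standing in for the authentication manager), and the one-hour `BLOCK_PERIOD_SECONDS` are hypothetical, and removing an entry upon successful authorization follows the behavior recited in claim 5.

```python
from typing import Callable, Dict, Optional

BLOCK_PERIOD_SECONDS = 3600.0  # assumed prohibition period, in seconds


def handle_connection_request(
    blacklist: Dict[str, Optional[float]],
    identifier: str,
    request_time: float,
    is_authenticated_and_authorized: Callable[[str], bool],
) -> bool:
    """Return True to allow the connection, False to refuse it."""
    if identifier in blacklist:
        timestamp = blacklist[identifier]
        if timestamp is None:
            return False  # permanent or semi-permanent entry
        if request_time - timestamp < BLOCK_PERIOD_SECONDS:
            # Refused without authentication; updating the timestamp
            # extends the period during which requests are refused.
            blacklist[identifier] = request_time
            return False
    if is_authenticated_and_authorized(identifier):
        # Remove any stale dynamic entry for a now-authorized device.
        blacklist.pop(identifier, None)
        return True
    # Unsuccessful attempt: add a new entry or update the existing one.
    blacklist[identifier] = request_time
    return False
```

Note that a request arriving inside the prohibition period is refused before the authentication callback is ever invoked, which is the resource saving the description attributes to the unauthorized connection data structure.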

[0020] As an example of how these systems and methods can be implemented, consider an organization that contracts with a vendor to perform data storage operations on the contracting organization's computers (e.g., in a Software as a Service (SaaS) context). In some situations, it would be advantageous for the vendor to refuse, with minimal effort, data storage operation requests by the contracting organization's computers. For example, if the contracting organization is no longer paying for the vendor's services, the vendor typically would like to refuse data storage operation requests by its computers. The vendor would ideally like to refuse such requests with minimal use of the vendor's resources. The vendor can do so in one or both of two ways. First, the vendor can add identifiers of the prohibited computers to the unauthorized connection data structure used by the first computing device (permanent or semi-permanent entries). When the prohibited computers request connections to the first computing device, the first computing device will consult the unauthorized connection data structure, determine that they are identified on it and therefore prohibited, and refuse the connection requests.

[0021] Second, the vendor can remove the accounts of the prohibited computers from the authentication manager computer that manages data storage operations and/or remove the authorization of the prohibited computers to perform data storage operations in the data storage system. This will result in the prohibited computers not being authenticated and/or properly authorized when they request connections to the first computing device. When they do request connections, the first computing device can then add their identifiers to the unauthorized connection data structure along with a timestamp indicating the connection request time (dynamic entries). Any connection requests from the prohibited computers during a certain period of time from the connection request time would then be automatically refused.

[0022] Accordingly, the first computing device's act of identifying prohibited computing devices can be thought of as short-cutting the authentication and authorization processes that normally would have to occur whenever a computing device attempts to connect to the first computing device. These processes may consume resources of the first computing device and/or the authentication manager computing device that could be directed elsewhere, such as to performing data storage operations upon the data of authorized computing devices. Accordingly, the systems and methods described herein can provide significant benefits.

[0023] Reference to this particular example is made throughout this application. Those of skill in the art will understand, however, that aspects of the invention are not limited to this particular example, and that other situations are entirely possible. For example, the prohibited computing devices may have been licensed at one time to engage in data storage operations but are no longer licensed (e.g., a trial period of software installed on the prohibited computing devices has expired). As another example, the prohibited computing devices may be using a version of software that is no longer supported or maintained. As another example, the vendor may wish to limit the time periods during which the computing devices can request data storage operations (e.g., from 2 am to 4 am) or the scope of the data storage operations requested by the computing devices (e.g., only upon certain data, or only for certain data storage operations).
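As an illustration of the time-period limitation mentioned above (e.g., from 2 am to 4 am), the following Python sketch checks whether a request falls inside an allowed window. The function name `within_allowed_window`, its parameters, and the treatment of the end time as exclusive are assumptions; the description does not specify how such a window would be evaluated.

```python
from datetime import time


def within_allowed_window(
    request_time: time,
    start: time = time(2, 0),  # assumed default: 2 am
    end: time = time(4, 0),    # assumed default: 4 am (exclusive)
) -> bool:
    """Return True if request_time falls inside the allowed window."""
    if start <= end:
        return start <= request_time < end
    # Window wraps past midnight (e.g., 10 pm to 1 am).
    return request_time >= start or request_time < end
```

A connection manager component could, under this sketch, refuse requests for which the function returns False before consulting the authentication manager.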

[0024] In some examples, the first computing device also includes an interface blacklist data structure. The interface blacklist data structure includes zero or more entries. The entries are configured to store information including an identifier of an interface at which connection requests are received. The first computing device receives connection requests at one or more interfaces (e.g., NICs). The connection manager component determines one or more identifiers of the one or more interfaces at which a connection request from a second computing device is received. The connection manager component uses the one or more identifiers to determine from the interface blacklist data structure whether the connection request from the second computing device should be refused at the one or more interfaces. If so, the connection manager component refuses the connection request.
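The interface blacklist check described above can be sketched in Python as follows. The function name `refuse_at_interfaces`, the use of a set of interface identifiers, and the interface names `eth0` and `eth1` are hypothetical and chosen only for illustration.

```python
from typing import Iterable, Set


def refuse_at_interfaces(
    interface_blacklist: Set[str],
    receiving_interfaces: Iterable[str],
) -> bool:
    """Return True if the request arrived at any blacklisted interface."""
    return any(iface in interface_blacklist for iface in receiving_interfaces)


# Hypothetical configuration: "eth0" is attached to a public network and is
# blacklisted; "eth1" is attached to a private network and is not.
interface_blacklist = {"eth0"}
refused_public = refuse_at_interfaces(interface_blacklist, ["eth0"])
refused_private = refuse_at_interfaces(interface_blacklist, ["eth1"])
```

In this sketch an empty interface blacklist refuses nothing, consistent with the data structure being allowed to hold zero entries.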

[0025] Various examples of the invention will now be described. The following description provides specific details for a thorough understanding and enabling description of these examples. One skilled in the relevant art will understand, however, that the invention may be practiced without many of these details. Likewise, one skilled in the relevant art will also understand that the invention may include many other obvious features not described in detail herein. Additionally, some well-known structures or functions may not be shown or described in detail below, so as to avoid unnecessarily obscuring the relevant description.

[0026] The terminology used below is to be interpreted in its broadest reasonable manner, even though it is being used in conjunction with a detailed description of certain specific examples of the invention. Indeed, certain terms may even be emphasized below; however, any terminology intended to be interpreted in any restricted manner will be overtly and specifically defined as such in this Detailed Description section.

[0027] FIGS. 1, 2A-2C, and the discussion herein provide a brief, general description of a suitable specialized environment in which the invention can be implemented. Those skilled in the relevant art will appreciate that aspects of the invention can be practiced with other communications, data processing, or computer system configurations, including: Internet appliances, hand-held devices (including personal digital assistants (PDAs)), wearable computers, all manner of cellular or mobile phones, multi-processor systems, microprocessor-based or programmable consumer electronics, set-top boxes, network PCs, mini-computers, mainframe computers, and the like. The terms "computer," "server," "host," "host system," and the like are generally used interchangeably herein, and refer to any of the above devices and systems, as well as any data processor.

[0028] While aspects of the invention, such as certain functions, are described as being performed exclusively on a single device, the invention can also be practiced in distributed environments where functions or modules are shared among disparate processing devices, which are linked through a communications network, such as a LAN, a WAN, or the Internet. In a distributed computing environment, program modules may be located in both local and remote memory storage devices.

[0029] Aspects of the invention may be stored or distributed on tangible computer-readable media, including magnetically or optically readable computer discs, hard-wired or preprogrammed chips (e.g., EEPROM semiconductor chips), nanotechnology memory, biological memory, or other data storage media. Alternatively, computer implemented instructions, data structures, screen displays, and other data under aspects of the invention may be distributed over the Internet or over other networks (including wireless networks), on a propagated signal on a propagation medium (e.g., an electromagnetic wave(s), a sound wave, etc.) over a period of time, or they may be provided on any analog or digital network (packet switched, circuit switched, or other scheme).

[0030] Aspects of the invention will now be described in detail with respect to FIGS. 1 through 5. FIG. 1 is a block diagram illustrating an example of a data storage system that may employ aspects of the invention. Entities in the data storage system may be arranged in various configurations, with certain entities on a friendly side of a firewall and others on a hostile side of the firewall or on a friendly side of a second firewall (see FIGS. 2A-2C for example configurations). An entity configured in accordance with aspects of the invention may include various components (see FIG. 3A, illustrating some components) that perform the various functions described herein.

[0031] The entity may use various data structures to carry out the performance of these functions (see FIG. 3B, illustrating one such suitable data structure containing IP addresses that identify blacklisted computing devices). Functions performed by the entity include receiving connection requests (see FIG. 4, illustrating a process for receiving connection requests, such as connection requests for performing data storage operations). Functions performed by the entity also include allowing or refusing connection requests (see FIGS. 5A and 5B, illustrating a process for allowing or refusing connection requests, such as those received in the process of FIG. 4).

Suitable Data Storage System

[0032] FIG. 1 illustrates an example of one arrangement of resources in a computing network, comprising a data storage system 150. The resources in the data storage system 150 may employ the processes and techniques described herein. The system 150 includes a storage manager 105, one or more data agents 195, one or more secondary storage computing devices 165, one or more storage devices 115, one or more computing devices 130 (called clients 130), one or more data or information stores 160 and 162, a single instancing database 123, an index 111, a jobs agent 120, an interface agent 125, and a management agent 131. The system 150 may represent a modular storage system such as the CommVault QiNetix system, and also the CommVault GALAXY backup system, available from CommVault Systems, Inc. of Oceanport, N.J., aspects of which are further described in the commonly-assigned U.S. patent application Ser. No. 09/610,738, now U.S. Pat. No. 7,035,880, the entirety of which is incorporated by reference herein. The system 150 may also represent a modular storage system such as the CommVault Simpana system, also available from CommVault Systems, Inc.

[0033] The system 150 may generally include combinations of hardware and software components associated with performing storage operations on electronic data. Storage operations include copying, backing up, creating, storing, retrieving, and/or migrating primary storage data (e.g., data stores 160 and/or 162) and secondary storage data (which may include, for example, snapshot copies, backup copies, hierarchical storage management (HSM) copies, archive copies, and other types of copies of electronic data stored on storage devices 115). The system 150 may provide one or more integrated management consoles for users or system processes to interface with in order to perform certain storage operations on electronic data as further described herein. Such integrated management consoles may be displayed at a central control facility or several similar consoles distributed throughout multiple network locations to provide global or geographically specific network data storage information.

[0034] In one example, storage operations may be performed according to various storage preferences, for example, as expressed by a user preference, a storage policy, a schedule policy, and/or a retention policy. A "storage policy" is generally a data structure or other information source that includes a set of preferences and other storage criteria associated with performing a storage operation. The preferences and storage criteria may include, but are not limited to, a storage location, relationships between system components, network pathways to utilize in a storage operation, data characteristics, compression or encryption requirements, preferred system components to utilize in a storage operation, a single instancing or variable instancing policy to apply to the data, and/or other criteria relating to a storage operation. For example, a storage policy may indicate that certain data is to be stored in the storage device 115, retained for a specified period of time before being aged to another tier of secondary storage, copied to the storage device 115 using a specified number of data streams, etc.
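By way of illustration only, a storage policy of the kind described above could be represented as a simple record. The following is a minimal sketch and not the data structure used by any particular system; the class name, field names, default values, and validation rules are all hypothetical and were chosen only to mirror the preferences listed above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StoragePolicy:
    """Hypothetical container for the preferences a storage policy may hold."""
    storage_location: str                # e.g., an identifier for storage device 115
    retention_days: int                  # time retained before aging to another tier
    data_streams: int = 1                # number of data streams used for the copy
    compress: bool = False               # compression requirement
    encrypt: bool = False                # encryption requirement
    instancing: str = "none"             # "none", "single", or "variable"
    network_path: Optional[str] = None   # preferred network pathway, if any

    def __post_init__(self) -> None:
        # Reject values that could not describe a usable policy.
        if self.retention_days < 0:
            raise ValueError("retention_days must be non-negative")
        if self.data_streams < 1:
            raise ValueError("data_streams must be at least 1")
        if self.instancing not in ("none", "single", "variable"):
            raise ValueError("unknown instancing policy")


# Example: store to device 115, retain for 90 days, copy using four streams.
policy = StoragePolicy("storage-device-115", retention_days=90, data_streams=4)
```

The record is immutable in this sketch so that a policy cannot change while a storage operation that references it is in progress; that is a design assumption rather than a requirement stated herein.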

[0035] A "schedule policy" may specify a frequency with which to perform storage operations and a window of time within which to perform them. For example, a schedule policy may specify that a storage operation is to be performed every Saturday morning from 2:00 a.m. to 4:00 a.m. In some cases, the storage policy includes information generally specified by the schedule policy. (Put another way, the storage policy includes the schedule policy.) A "retention policy" may specify how long data is to be retained at specific tiers of storage or what criteria must be met before data may be pruned or moved from one tier of storage to another tier of storage. Storage policies, schedule policies and/or retention policies may be stored in a database of the storage manager 105, on archive media as metadata for use in restore operations or other storage operations, or in other locations or components of the system 150.
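As one possible illustration of the Saturday morning example above, a schedule policy could be modeled as a weekday together with a start time and an end time. This is a sketch under stated assumptions: the class and method names are hypothetical, the window is assumed to fall within a single day, and the end time is treated as exclusive.

```python
from dataclasses import dataclass
from datetime import datetime, time


@dataclass(frozen=True)
class SchedulePolicy:
    """Hypothetical weekly window within which storage operations may run."""
    weekday: int   # datetime.weekday() convention: Monday is 0, Saturday is 5
    start: time    # inclusive start of the window
    end: time      # exclusive end of the window

    def in_window(self, moment: datetime) -> bool:
        """Return True if the given moment falls inside the weekly window."""
        return (moment.weekday() == self.weekday
                and self.start <= moment.time() < self.end)


# Every Saturday morning from 2:00 a.m. to 4:00 a.m.
saturday_morning = SchedulePolicy(weekday=5, start=time(2, 0), end=time(4, 0))
```

A jobs agent or similar component could consult `in_window` before starting a job; how the frequency and window are actually enforced is not specified herein.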

[0036] The system 150 may comprise a storage operation cell that is one of multiple storage operation cells arranged in a hierarchy or other organization. Storage operation cells may be related to backup cells and provide some or all of the functionality of backup cells as described in the assignee's U.S. patent application Ser. No. 09/354,058, now U.S. Pat. No. 7,395,282, which is incorporated herein by reference in its entirety. However, storage operation cells may also perform additional types of storage operations and other types of storage management functions that are not generally offered by backup cells.

[0037] Storage operation cells may contain not only physical devices, but also may represent logical concepts, organizations, and hierarchies. For example, a first storage operation cell may be configured to perform a first type of storage operations such as HSM operations, which may include backup or other types of data migration, and may include a variety of physical components including a storage manager 105 (or management agent 131), a secondary storage computing device 165, a client 130, and other components as described herein. A second storage operation cell may contain the same or similar physical components; however, it may be configured to perform a second type of storage operations, such as storage resource management (SRM) operations, and may include monitoring a primary data copy or performing other known SRM operations.

[0038] Thus, as can be seen from the above, although the first and second storage operation cells are logically distinct entities configured to perform different management functions (i.e., HSM and SRM, respectively), each storage operation cell may contain the same or similar physical devices. Alternatively, different storage operation cells may contain some of the same physical devices and not others. For example, a storage operation cell configured to perform SRM tasks may contain a secondary storage computing device 165, client 130, or other network device connected to a primary storage volume, while a storage operation cell configured to perform HSM tasks may instead include a secondary storage computing device 165, client 130, or other network device connected to a secondary storage volume and not contain the elements or components associated with and including the primary storage volume. (The term "connected" as used herein does not necessarily require a physical connection; rather, it could refer to two devices that are operably coupled to each other, communicably coupled to each other, in communication with each other, or more generally, refer to the capability of two devices to communicate with each other.) These two storage operation cells, however, may each include a different storage manager 105 that coordinates storage operations via the same secondary storage computing devices 165 and storage devices 115. This "overlapping" configuration allows storage resources to be accessed by more than one storage manager 105, such that multiple paths exist to each storage device 115 facilitating failover, load balancing, and promoting robust data access via alternative routes.

[0039] Alternatively or additionally, the same storage manager 105 may control two or more storage operation cells (whether or not each storage operation cell has its own dedicated storage manager 105). Moreover, in certain embodiments, the extent or type of overlap may be user-defined (through a control console) or may be automatically configured to optimize data storage and/or retrieval.

[0040] Data agent 195 may be a software module or part of a software module that is generally responsible for performing storage operations on the data of the client 130 stored in data store 160/162 or other memory location. Each client 130 may have at least one data agent 195 and the system 150 can support multiple clients 130. Data agent 195 may be distributed between client 130 and storage manager 105 (and any other intermediate components), or it may be deployed from a remote location or its functions approximated by a remote process that performs some or all of the functions of data agent 195.

[0041] The overall system 150 may employ multiple data agents 195, each of which may perform storage operations on data associated with a different application. For example, different individual data agents 195 may be designed to handle Microsoft Exchange data, Lotus Notes data, Microsoft Windows 2000 file system data, Microsoft Active Directory Objects data, and other types of data known in the art. Other embodiments may employ one or more generic data agents 195 that can handle and process multiple data types rather than using the specialized data agents described above.

[0042] If a client 130 has two or more types of data, one data agent 195 may be required for each data type to perform storage operations on the data of the client 130. For example, to back up, migrate, and restore all the data on a Microsoft Exchange 2000 server, the client 130 may use one Microsoft Exchange 2000 Mailbox data agent 195 to back up the Exchange 2000 mailboxes, one Microsoft Exchange 2000 Database data agent 195 to back up the Exchange 2000 databases, one Microsoft Exchange 2000 Public Folder data agent 195 to back up the Exchange 2000 Public Folders, and one Microsoft Windows 2000 File System data agent 195 to back up the file system of the client 130. These data agents 195 would be treated as four separate data agents 195 by the system even though they reside on the same client 130.

[0043] Alternatively, the overall system 150 may use one or more generic data agents 195, each of which may be capable of handling two or more data types. For example, one generic data agent 195 may be used to back up, migrate and restore Microsoft Exchange 2000 Mailbox data and Microsoft Exchange 2000 Database data while another generic data agent 195 may handle Microsoft Exchange 2000 Public Folder data and Microsoft Windows 2000 File System data, etc.

[0044] Data agents 195 may be responsible for arranging or packing data to be copied or migrated into a certain format such as an archive file. Nonetheless, it will be understood that this represents only one example, and any suitable packing or containerization technique or transfer methodology may be used if desired. Such an archive file may include metadata, a list of files or data objects copied, and the files and data objects themselves. Moreover, any data moved by the data agents may be tracked within the system by updating indexes associated with appropriate storage managers 105 or secondary storage computing devices 165. As used herein, a file or a data object refers to any collection or grouping of bytes of data that can be viewed as one or more logical units.

[0045] Generally speaking, storage manager 105 may be a software module or other application that coordinates and controls storage operations performed by the system 150. Storage manager 105 may communicate with some or all elements of the system 150, including clients 130, data agents 195, secondary storage computing devices 165, and storage devices 115, to initiate and manage storage operations (e.g., backups, migrations, data recovery operations, etc.).

[0046] Storage manager 105 may include a jobs agent 120 that monitors the status of some or all storage operations previously performed, currently being performed, or scheduled to be performed by the system 150. (One or more storage operations are alternatively referred to herein as a "job" or "jobs.") Jobs agent 120 may be communicatively coupled to an interface agent 125 (e.g., a software module or application). Interface agent 125 may include information processing and display software, such as a graphical user interface ("GUI"), an application programming interface ("API"), or other interactive interface through which users and system processes can retrieve information about the status of storage operations. For example, in an arrangement of multiple storage operation cells, through interface agent 125, users may optionally issue instructions to various storage operation cells regarding performance of the storage operations as described and contemplated herein. For example, a user may modify a schedule concerning the number of pending snapshot copies or other types of copies scheduled as needed to suit particular needs or requirements. As another example, a user may employ the GUI to view the status of pending storage operations in some or all of the storage operation cells in a given network or to monitor the status of certain components in a particular storage operation cell (e.g., the amount of storage capacity left in a particular storage device 115).

[0047] Storage manager 105 may also include a management agent 131 that is typically implemented as a software module or application program. In general, management agent 131 provides an interface that allows various management agents 131 in other storage operation cells to communicate with one another. For example, assume a certain network configuration includes multiple storage operation cells hierarchically arranged or otherwise logically related in a WAN or LAN configuration. With this arrangement, each storage operation cell may be connected to the other through each respective interface agent 125. This allows each storage operation cell to send and receive certain pertinent information from other storage operation cells, including status information, routing information, information regarding capacity and utilization, etc. These communications paths may also be used to convey information and instructions regarding storage operations.

[0048] For example, a management agent 131 in a first storage operation cell may communicate with a management agent 131 in a second storage operation cell regarding the status of storage operations in the second storage operation cell. Another illustrative example includes the case where a management agent 131 in a first storage operation cell communicates with a management agent 131 in a second storage operation cell to control storage manager 105 (and other components) of the second storage operation cell via management agent 131 contained in storage manager 105.

[0049] Another illustrative example is the case where management agent 131 in a first storage operation cell communicates directly with and controls the components in a second storage operation cell and bypasses the storage manager 105 in the second storage operation cell. If desired, storage operation cells can also be organized hierarchically such that hierarchically superior cells control or pass information to hierarchically subordinate cells or vice versa.

[0050] Storage manager 105 may also maintain an index, a database, or other data structure 111. The data stored in database 111 may be used to indicate logical associations between components of the system, user preferences, management tasks, media containerization and data storage information or other useful data. For example, the storage manager 105 may use data from database 111 to track logical associations between secondary storage computing device 165 and storage devices 115 (or movement of data as containerized from primary to secondary storage).

[0051] Generally speaking, the secondary storage computing device 165, which may also be referred to as a media agent, may be implemented as a software module that conveys data, as directed by storage manager 105, between a client 130 and one or more storage devices 115 such as a tape library, a magnetic media storage device, an optical media storage device, or any other suitable storage device. In one embodiment, secondary storage computing device 165 may be communicatively coupled to and control a storage device 115. A secondary storage computing device 165 may be considered to be associated with a particular storage device 115 if that secondary storage computing device 165 is capable of routing and storing data to that particular storage device 115.

[0052] In operation, a secondary storage computing device 165 associated with a particular storage device 115 may instruct the storage device to use a robotic arm or other retrieval means to load or eject a certain storage media, and to subsequently archive, migrate, or restore data to or from that media. Secondary storage computing device 165 may communicate with a storage device 115 via a suitable communications path such as a SCSI or Fibre Channel communications link. In some embodiments, the storage device 115 may be communicatively coupled to the storage manager 105 via a SAN.

[0053] Each secondary storage computing device 165 may maintain an index, a database, or other data structure 161 that may store index data generated during storage operations for secondary storage (SS) as described herein, including creating a metabase (MB). For example, performing storage operations on Microsoft Exchange data may generate index data. Such index data provides a secondary storage computing device 165 or other external device with a fast and efficient mechanism for locating data stored or backed up. Thus, a secondary storage computing device index 161, or a database 111 of a storage manager 105, may store data associating a client 130 with a particular secondary storage computing device 165 or storage device 115, for example, as specified in a storage policy, while a database or other data structure in secondary storage computing device 165 may indicate where specifically the data of the client 130 is stored in storage device 115, what specific files were stored, and other information associated with storage of the data of the client 130. In some embodiments, such index data may be stored along with the data backed up in a storage device 115, with an additional copy of the index data written to an index cache in a secondary storage device. Thus, the data is readily available for use in storage operations and other activities without having to be first retrieved from the storage device 115.

[0054] Generally speaking, information stored in cache is typically recent information that reflects certain particulars about operations that have recently occurred. After a certain period of time, this information is sent to secondary storage and tracked. This information may need to be retrieved and uploaded back into a cache or other memory in a secondary computing device before data can be retrieved from storage device 115. In some embodiments, the cached information may include information regarding format or containerization of archives or other files stored on storage device 115.

[0055] One or more of the secondary storage computing devices 165 may also maintain one or more single instance databases 123. Single instancing (alternatively called data deduplication) generally refers to storing in secondary storage only a single instance of each data object (or data block) in a set of data (e.g., primary data). More details as to single instancing may be found in one or more of the following commonly-assigned U.S. patent applications: 1) U.S. patent application Ser. No. 11/269,512 (entitled SYSTEM AND METHOD TO SUPPORT SINGLE INSTANCE STORAGE OPERATIONS, Attorney Docket No. 60692-8023US00); 2) U.S. patent application Ser. No. 12/145,347 (entitled APPLICATION-AWARE AND REMOTE SINGLE INSTANCE DATA MANAGEMENT, Attorney Docket No. 60692-8056US00); 3) U.S. patent application Ser. No. 12/145,342 (entitled APPLICATION-AWARE AND REMOTE SINGLE INSTANCE DATA MANAGEMENT, Attorney Docket No. 60692-8057US00); 4) U.S. patent application Ser. No. 11/963,623 (entitled SYSTEM AND METHOD FOR STORING REDUNDANT INFORMATION, Attorney Docket No. 60692-8036US02); 5) U.S. patent application Ser. No. 11/950,376 (entitled SYSTEMS AND METHODS FOR CREATING COPIES OF DATA SUCH AS ARCHIVE COPIES, Attorney Docket No. 60692-8037US01); or 6) U.S. patent application Ser. No. 61/100,686 (entitled SYSTEMS AND METHODS FOR MANAGING SINGLE INSTANCING DATA, Attorney Docket No. 60692-8067US00), each of which is incorporated by reference herein in its entirety.
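For illustration, single instancing can be sketched as a store keyed by a digest of each data block, so that a block seen more than once is kept only once. The description above does not state how duplicate blocks are detected; the use of a SHA-256 digest here is an assumption, and the class and method names are hypothetical.

```python
import hashlib
from typing import Dict


class SingleInstanceStore:
    """Hypothetical store that keeps one instance of each distinct data block."""

    def __init__(self) -> None:
        self._blocks: Dict[str, bytes] = {}   # digest -> the single stored instance

    def put(self, block: bytes) -> str:
        """Store the block if it is new and return its digest as a reference."""
        digest = hashlib.sha256(block).hexdigest()
        self._blocks.setdefault(digest, block)   # no-op if already stored
        return digest

    def get(self, digest: str) -> bytes:
        """Retrieve the stored instance by its digest."""
        return self._blocks[digest]

    def __len__(self) -> int:
        """Number of distinct blocks actually stored."""
        return len(self._blocks)
```

In this sketch, callers keep the returned digest in place of the data, so two clients that submit identical blocks end up holding the same reference.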

[0056] In some examples, the secondary storage computing devices 165 maintain one or more variable instance databases. Variable instancing generally refers to storing in secondary storage one or more instances, but fewer than the total number of instances, of each data block (or data object) in a set of data (e.g., primary data). More details as to variable instancing may be found in the commonly-assigned U.S. patent application Ser. No. 61/164,803 (entitled STORING A VARIABLE NUMBER OF INSTANCES OF DATA OBJECTS, Attorney Docket No. 60692-8068US00).
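A variable instancing policy can be sketched along similar lines, with a cap on how many instances of each distinct data block are kept. The cap, the SHA-256 digest used to recognize identical blocks, and all names below are assumptions made for illustration; the referenced application, not this sketch, describes how the number of instances is actually chosen.

```python
import hashlib
from collections import defaultdict
from typing import DefaultDict, List


class VariableInstanceStore:
    """Hypothetical store that keeps at most max_instances copies of each block."""

    def __init__(self, max_instances: int = 2) -> None:
        if max_instances < 1:
            raise ValueError("max_instances must be at least 1")
        self._max = max_instances
        self._copies: DefaultDict[str, List[bytes]] = defaultdict(list)

    @staticmethod
    def _digest(block: bytes) -> str:
        return hashlib.sha256(block).hexdigest()

    def put(self, block: bytes) -> bool:
        """Store another instance unless the cap is reached; report whether stored."""
        copies = self._copies[self._digest(block)]
        if len(copies) >= self._max:
            return False
        copies.append(block)
        return True

    def instance_count(self, block: bytes) -> int:
        """Number of instances of this block currently stored."""
        return len(self._copies.get(self._digest(block), []))
```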

[0057] In some embodiments, certain components may reside and execute on the same computer. For example, in some embodiments, a client 130 such as a data agent 195, or a storage manager 105, coordinates and directs local archiving, migration, and retrieval application functions as further described in the previously-referenced U.S. patent application Ser. No. 09/610,738. This client 130 can function independently or together with other similar clients 130.

[0058] As shown in FIG. 1, each secondary storage computing device 165 has its own associated metabase 161. Each client 130 may also have its own associated metabase 170. However, in some embodiments, each "tier" of storage, such as primary storage, secondary storage, tertiary storage, etc., may have multiple metabases or a centralized metabase, as described herein. For example, rather than a separate metabase or index associated with each client 130 in FIG. 1, the metabases on this storage tier may be centralized. Similarly, second and other tiers of storage may have either centralized or distributed metabases. Moreover, mixed architecture systems may be used if desired, that may include a first tier centralized metabase system coupled to a second tier storage system having distributed metabases and vice versa, etc.

[0059] Moreover, in operation, a storage manager 105 or other management module may keep track of certain information that allows the storage manager 105 to select, designate, or otherwise identify metabases to be searched in response to certain queries as further described herein. Movement of data between primary and secondary storage may also involve movement of associated metadata and other tracking information as further described herein.

[0060] In some examples, primary data may be organized into one or more sub-clients. A sub-client is a portion of the data of one or more clients 130, and can contain either all of the data of the clients 130 or a designated subset thereof.

[0061] As depicted in FIG. 1, the data store 162 includes two sub-clients. For example, an administrator (or other user with the appropriate permissions; the term administrator is used herein for brevity) may find it preferable to separate email data from financial data using two different sub-clients having different storage preferences, retention criteria, etc.

Suitable Computing Networks

[0062] FIGS. 2A through 2C are block diagrams, each of which illustrates an example of an arrangement of resources in a computing network that may employ the systems and methods described herein. In FIG. 2A, a firewall 250 divides a computing network 200 into a friendly side 203 and a hostile side 201. The friendly side 203 may include, for example, a LAN of an organization, and the hostile side 201 may include, for example, a public network such as the Internet. An authentication/authorization manager 185, a secondary storage computing device 165, and a storage device 115 are located on the friendly side 203 of the computing network 200, and a client 130 is located on the hostile side 201 of the computing network 200.

[0063] In FIG. 2B, two firewalls 250 divide a computing network 250 into two friendly sides 203 and a hostile side 201. An authentication/authorization manager 185, a secondary storage computing device 165, and a storage device 115 are located on the first friendly side 203 and a client 130 is located on the second friendly side 203.

[0064] In FIG. 2C, a firewall 250 divides a computing network 210 into a friendly side 203 and a hostile side 201. An authentication/authorization manager 185 is on the friendly side 203 and a client 130, a secondary storage computing device 165, and a storage device 115 are located on the hostile side 201 of the computing network 210.

[0065] For example, the configurations illustrated in FIG. 2A or 2B could correspond to the situation described herein where an organization with multiple clients 130 has contracted with a vendor (with which the authentication/authorization manager 185, the secondary storage computing device 165, and the storage device 115 are associated) to perform data storage operations (such as backup operations) upon the data of the clients 130. In such a situation, the data would typically travel from the clients 130 through the firewall 250 (or the two firewalls 250 in FIG. 2B) to the secondary storage computing device 165, and ultimately to the storage device 115.

[0066] As can be seen from FIGS. 2A-2C, data storage operations may be performed entirely on a hostile side 201, entirely on a friendly side 203, or passing from a hostile side 201 through a firewall 250 to a friendly side 203, and in other fashions. Those of skill in the art will understand that resources may be arranged in computing networks other than those illustrated in FIGS. 2A-2C and therefore, that the aspects of the invention are not limited to being practiced in the computing networks described herein.

Computing Device

[0067] FIG. 3A is a block diagram illustrating a computing device 300 configured in accordance with aspects of the invention. The computing device 300 may be a specialized computing device that functions as described herein. Alternatively, the computing device 300 may be a general purpose computing device configured to function as described herein and that performs other functions. In some examples, the computing device 300 may be any of the computing devices described with reference to FIGS. 1 and 2A-2C (e.g., a storage manager 105, an authentication/authorization manager 185, a client 130, and/or a secondary storage computing device 165). The computing device 300 includes three blacklist data structures, one or more of which it uses to determine whether a connection request should be allowed or refused: an entity blacklist data structure 302, an unauthorized connection data structure 304, and an interface blacklist data structure 306.

[0068] The entity blacklist data structure 302 includes a list of identifiers that identify entities (e.g., computing devices). For example, an identifier may include an IP address of a computing device, a Media Access Control (MAC) address of a network interface card (NIC) of a computing device, a Universally Unique Identifier (UUID) of a computing device, a name of a computing device, etc. The entity blacklist data structure 302 may be static or relatively static, in that it changes infrequently and/or only upon actions of an administrator. The entities identified in the entity blacklist data structure 302 are those to which a connection request should be refused on a permanent or semi-permanent basis (at least until their identifiers are removed from the entity blacklist data structure 302).

[0069] For example, the administrator may wish to deny connections to certain clients 130, so that the clients 130 are unable to request that the associated computing device 300 perform data storage operations on their data. This may be desirable, for example, where the vendor wishes to deny the clients 130 of a contracting organization that is no longer paying for data storage operation services the ability to perform data storage operations or to have data storage operations performed on their data. In some examples, the entity blacklist data structure 302 is a text file on a file system of the computing device 300 that contains a listing of IP addresses, one IP address per line in the text file, of entities to which connection requests should be refused.
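As an illustration of the text file form of the entity blacklist data structure 302, the following sketch reads one IP address per line into a set. The function names are hypothetical, and the handling of blank lines (ignored) and malformed lines (rejected with an error) is an assumption, since the file format is not specified beyond one IP address per line.

```python
import ipaddress
from pathlib import Path
from typing import Iterable, Set, Union

Address = Union[ipaddress.IPv4Address, ipaddress.IPv6Address]


def parse_blacklist(lines: Iterable[str]) -> Set[Address]:
    """Parse one IP address per line into a set of address objects."""
    entries: Set[Address] = set()
    for raw in lines:
        text = raw.strip()
        if not text:
            continue  # assumption: blank lines are ignored
        # ip_address raises ValueError if the line is not a valid IP address.
        entries.add(ipaddress.ip_address(text))
    return entries


def load_blacklist(path: str) -> Set[Address]:
    """Read the blacklist text file at the given (hypothetical) path."""
    return parse_blacklist(Path(path).read_text().splitlines())
```

Parsing into address objects rather than keeping raw strings means that differently formatted spellings of the same address compare equal when a connection request is later checked.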

[0070] The unauthorized connection data structure 304 likewise includes a list of identifiers (e.g., of the same type as those in the entity blacklist data structure 302, or of a different type) that identify entities (e.g., computing devices) in the data storage system. The unauthorized connection data structure 304 may also include a timestamp indicating a time at which the identified entity last made a connection request to the computing device 300. The computing device 300 creates the unauthorized connection data structure 304 when it commences performing data storage operations. The computing device 300 then adds the entries from the entity blacklist data structure 302 (the identifiers of entities to which connection requests should be refused) to the unauthorized connection data structure 304. As described in more detail herein, the computing device 300 may also add entries to the unauthorized connection data structure 304 for entities whose connection requests the computing device 300 dynamically determines should be refused. The dynamically determined entries in the unauthorized connection data structure 304 correspond to entities to which a connection request should be refused until a timeout period has expired (e.g., a one hour timeout period). After the timeout period has expired, the dynamic entries may be removed (or they may be voided or otherwise rendered inconsequential).
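The behavior described above, in which static entries are copied from the entity blacklist and dynamic entries lapse after a timeout period, can be sketched as follows. This is a minimal illustration under stated assumptions: the class and method names are hypothetical, identifiers are plain strings, the timeout defaults to the one hour example, and a lookup does not refresh the stored timestamp (whether it should is not stated in this section). The clock is injectable so the expiry logic can be exercised without waiting.

```python
import time
from typing import Callable, Dict, Iterable, Set


class UnauthorizedConnections:
    """Hypothetical model of the unauthorized connection data structure 304."""

    def __init__(self, static_entries: Iterable[str],
                 timeout_seconds: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        # Static entries are copied from the entity blacklist at start-up.
        self._static: Set[str] = set(static_entries)
        # Dynamic entries map an identifier to the time it was recorded.
        self._dynamic: Dict[str, float] = {}
        self._timeout = timeout_seconds
        self._clock = clock

    def add_dynamic(self, identifier: str) -> None:
        """Record an entity whose requests should be refused until the timeout."""
        self._dynamic[identifier] = self._clock()

    def remove_dynamic(self, identifier: str) -> None:
        """Remove a dynamic entry early, e.g., upon the occurrence of an event."""
        self._dynamic.pop(identifier, None)

    def is_refused(self, identifier: str) -> bool:
        """Return True if a connection request from this entity should be refused."""
        if identifier in self._static:
            return True
        stamp = self._dynamic.get(identifier)
        if stamp is None:
            return False
        if self._clock() - stamp >= self._timeout:
            del self._dynamic[identifier]   # timeout expired: drop the entry
            return False
        return True
```

Expired entries are removed lazily on lookup in this sketch; a periodic sweep that voids expired entries would be an equally plausible design.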

[0071] In some examples, instead of using the unauthorized connection data structure 304 (which, because it identifies entities to which connection requests may be refused, is effectively a blacklist), the computing device 300 includes another data structure that the connection manager component 308 uses to determine whether a connection request should be allowed or refused. This other data structure may include identifiers of entities to which connection requests should be allowed (a whitelist of entities). When the connection manager component 308 receives a connection request from an entity, it analyzes the other data structure to determine if the entity identifier is included on it. Only if it is does the connection manager component 308 allow the connection request from the identified entity.

[0072] In some cases, the dynamic entries in the unauthorized connection data structure 304 may be removed prior to the expiration of the timeout period, such as upon the occurrence of an event. For example, the vendor could remove a dynamic entry from the unauthorized connection data structure 304 if the contracting organization has resumed paying for data storage operations services. The vendor could similarly remove any permanent or semi-permanent entries from the unauthorized connection data structure 304.

[0073] The interface blacklist data structure 306 likewise includes a list of identifiers (e.g., of the same type as those in the entity blacklist data structure 302, or of a different type) that identify interfaces (e.g., IP addresses) in the data storage system at which data storage operation connection requests are received. The interface blacklist data structure 306 may also be static or relatively static, in that it changes infrequently and/or only upon intervention of an administrator. The interfaces listed in the interface blacklist data structure 306 are those at which a connection request should be refused on a permanent or semi-permanent basis (at least until their identifiers are removed from the interface blacklist data structure 306).

[0074] For example, the computing device 300 may have two or more NICs, with at least one NIC configured to receive connections over a public network (e.g., the Internet) and at least one NIC configured to receive connections over a private network (e.g., a LAN). An administrator may wish to configure the computing device 300 to accept only connection requests that are received over the private network, and to ignore those received over the public network. The administrator can add the identifier of the public NIC interface (e.g., its IP address) to the interface blacklist data structure 306. This will cause the computing device 300 to refuse any connection requests received by the public NIC interface. In some examples, the interface blacklist data structure 306 is a text file on a file system of the computing device 300 that contains a listing of IP addresses, one IP address per line in the text file, of interfaces at which connection requests should be refused.
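As one non-limiting illustration, a text file of the kind described above (one IP address per line) might be read as sketched below in Python. The function names are hypothetical, and the choices to skip blank lines and to reject malformed addresses are assumptions of this sketch rather than requirements of the description.

```python
import ipaddress
from pathlib import Path
from typing import Set


def parse_interface_blacklist(text: str) -> Set[str]:
    """Parse a listing of interface IP addresses, one address per line."""
    blocked: Set[str] = set()
    for line in text.splitlines():
        candidate = line.strip()
        if not candidate:
            continue  # tolerate blank lines (an assumption of this sketch)
        # ip_address() raises ValueError if the line is not a valid address.
        blocked.add(str(ipaddress.ip_address(candidate)))
    return blocked


def load_interface_blacklist(path: Path) -> Set[str]:
    """Return the blacklisted interfaces, or an empty set if the file is absent."""
    if not path.is_file():
        return set()
    return parse_interface_blacklist(path.read_text())


def refuse_at_interface(local_address: str, blocked: Set[str]) -> bool:
    """Return True if a request received at local_address should be refused."""
    return str(ipaddress.ip_address(local_address)) in blocked
```

Here `local_address` stands for the address of the interface at which the request arrived (e.g., the public NIC), not the address of the requestor.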

[0075] In some examples, instead of using the interface blacklist data structure 306, the computing device 300 uses a whitelist data structure that positively identifies only the interfaces at which connection requests should be allowed.

[0076] The computing device 300 also includes a connection manager component 308, a logging component 310, and a log data structure 312. The connection manager component 308, among other things, determines whether various data structures are present and if so, loads them into memory, enables blocking of connections at interfaces and blocking of connections from entities, determines whether dynamic connection blocking is enabled, waits for connection requests, receives connection requests, accesses various data structures to determine whether connection requests should be refused, attempts to authenticate entities requesting connections and determine if they are properly authorized, allows connections to authenticated and authorized entities, refuses connections to unauthenticated or unauthorized entities, and/or updates various data structures when it refuses connection requests. The connection manager component 308 may be logically located at the application layer (the top protocol layer in both the seven-layer OSI model and the four-layer TCP/IP model). The logging component 310 stores records of entities to which connection requests have been refused in the log data structure 312.

Suitable Data Structure

[0077] FIG. 3B is a diagram illustrating a suitable unauthorized connection data structure 304. The unauthorized connection data structure 304 includes multiple rows (e.g., rows 334, 336, 338, and 340), each of which is divided into columns in which information about an entity is stored. Column 330 stores an IP address of an entity, and column 332 stores a timestamp indicating a time at which the entity last requested a connection to the computing device 300 associated with the unauthorized connection data structure 304. For example, row 334 contains information about an entity having an IP address of "192.168.0.100" (column 330), and for which the timestamp is <null> (column 332). The <null> timestamp indicates that the information in row 334 was loaded into the unauthorized connection data structure 304 from the entity blacklist data structure 302. Accordingly, an entry with a <null> timestamp indicates that the connection requests from the corresponding entity are to be always or nearly always blocked (on a permanent or semi-permanent basis). In some examples, instead of using a <null> timestamp for those entries loaded from the entity blacklist data structure 302, the entry in the timestamp column 332 is empty, or the timestamp is that of a time well into the future (e.g., 2999-12-31 23:59:59).

[0078] As another example, row 336 contains information about an entity having an IP address of "192.105.1.108" (column 330), and for which the timestamp is "2009-02-23 14:57:23." This timestamp indicates that the entity having this IP address last requested a connection to the associated computing device 300 at 14:57:23 on Feb. 23, 2009. The non-null timestamp indicates that the associated computing device 300 dynamically added the information in row 336. As another example, row 340 contains information about multiple entities for which the IP address equals "72.32.209.*" (column 330), with the "*" wildcard character indicating any number between 0 and 255. In some examples, the unauthorized connection data structure 304 allows for a further level of granularity by identifying ports, and only connections matching the combination of the IP address and the port are refused. In some examples, the unauthorized connection data structure 304 uses IPv6 IP addresses to identify entities. The unauthorized connection data structure 304 may include other columns storing other information, such as a column storing information about whether the entry was loaded from the entity blacklist data structure 302 or was dynamically added by the computing device 300.
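As one non-limiting illustration, matching a requestor's IPv4 address against an entry that may contain the "*" wildcard could be sketched in Python as follows. The function name and the octet-by-octet comparison are assumptions of this sketch; port matching and IPv6 addresses are not covered.

```python
def matches_entry(pattern: str, address: str) -> bool:
    """Return True if a dotted IPv4 address matches a blacklist entry.

    A "*" in the pattern stands for any number between 0 and 255.
    """
    pattern_octets = pattern.split(".")
    address_octets = address.split(".")
    if len(pattern_octets) != 4 or len(address_octets) != 4:
        return False
    for want, have in zip(pattern_octets, address_octets):
        # The address itself must consist of four numbers in 0..255.
        if not (have.isascii() and have.isdigit()) or int(have) > 255:
            return False
        if want == "*":
            continue
        if not (want.isascii() and want.isdigit()) or int(want) != int(have):
            return False
    return True
```

For example, `matches_entry("72.32.209.*", "72.32.209.17")` evaluates to `True`, while an address in "72.32.210.*" does not match.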

Process for Receiving Connection Requests

[0079] FIG. 4 is a flow diagram of a process 400 for receiving connection requests on a computing device 300. The process 400 begins at step 402 where the computing device 300 begins receiving connection requests. For example, the computing device 300 may be configured to start receiving connection requests automatically upon startup, or an administrator may manually configure the computing device 300 to start receiving connection requests. For example, a client 130 storing a set of data may attempt to connect to a secondary storage computing device 165 so that the secondary storage computing device 165 can perform a data storage operation upon the data stored on the client 130 or upon data in a storage device 115.

[0080] At step 404 the connection manager component 308 determines whether the interface blacklist data structure 306 is present (e.g., by determining whether the corresponding text file is present on the file system of the computing device 300). If so, the process 400 continues to step 406, where the connection manager component 308 loads the interface blacklist data structure 306 into memory, and to step 408, where the computing device enables blocking connections at interfaces. The process 400 then continues to step 410. If the interface blacklist data structure 306 is not present, the process 400 also continues to step 410, where the connection manager component 308 determines whether the entity blacklist data structure 302 is present (e.g., by determining whether the corresponding text file is present on the file system of the computing device 300). If so, the process 400 continues to step 412, where the connection manager component 308 adds the entries from the entity blacklist data structure 302 to the unauthorized connection data structure 304 and loads the unauthorized connection data structure 304 into memory. At step 414, the connection manager component 308 enables blocking connections from entities. The process 400 then continues to step 416.

[0081] If the entity blacklist data structure 302 is not present, the process 400 also continues at step 416, where the connection manager component 308 determines whether dynamic connection blocking is enabled. The connection manager component 308 may make this determination by accessing a data structure (e.g., a registry key in a system registry). The process 400 then continues to step 418, where the connection manager component 308 waits for connection requests (e.g., from other entities in the data storage system, such as clients 130). At step 420 the connection manager component 308 receives a connection request (e.g., from another entity in the data storage system). At step 422, the connection manager component 308 determines whether to allow or refuse the connection request, such as by undergoing the process described with reference to FIG. 5. At step 424 the connection manager component 308 receives an indication to stop receiving connection requests. This may occur when the administrator configures the computing device 300 to stop receiving connection requests or when the computing device 300 shuts down. At step 426 the connection manager component 308 unloads the interface blacklist data structure 306 and the unauthorized connection data structure 304 from memory. At step 428 the computing device 300 stops receiving connection requests. The process 400 then concludes.
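As one non-limiting illustration, the startup and shutdown portions of the process 400 (steps 404 through 416 and step 426) might be sketched in Python as follows. The `ManagerState` type and function names are hypothetical. For simplicity the sketch receives the contents of each blacklist file as a list of lines, with `None` standing for an absent file, and receives the dynamic blocking setting as a boolean rather than reading it from a registry.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List, Optional, Set


@dataclass
class ManagerState:
    """In-memory state of a hypothetical connection manager."""
    interface_blacklist: Set[str] = field(default_factory=set)
    unauthorized: Dict[str, Optional[datetime]] = field(default_factory=dict)
    block_interfaces: bool = False
    block_entities: bool = False
    dynamic_blocking: bool = False


def start_receiving(
    interface_lines: Optional[List[str]],
    entity_lines: Optional[List[str]],
    dynamic_blocking_enabled: bool,
) -> ManagerState:
    state = ManagerState()
    # Steps 404-408: load the interface blacklist if present, enable blocking.
    if interface_lines is not None:
        state.interface_blacklist = {
            line.strip() for line in interface_lines if line.strip()
        }
        state.block_interfaces = True
    # Steps 410-414: copy entity blacklist entries (no timestamp) if present.
    if entity_lines is not None:
        for line in entity_lines:
            if line.strip():
                state.unauthorized[line.strip()] = None
        state.block_entities = True
    # Step 416: record whether dynamic connection blocking is enabled.
    state.dynamic_blocking = dynamic_blocking_enabled
    return state


def stop_receiving(state: ManagerState) -> None:
    # Step 426: unload both data structures from memory.
    state.interface_blacklist.clear()
    state.unauthorized.clear()
```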

Process for Refusing or Allowing Connection Requests

[0082] FIG. 5 is a flow diagram of a process 500 implemented by the connection manager component 308 to determine whether to allow or refuse connection requests (e.g., requests to connect for purposes of performing data storage operations, or having data storage operations performed). These steps may be performed by the connection manager component 308 when it receives connection requests as described in FIG. 4 or at other times. The process 500 begins at step 502 where the connection manager component 308 determines whether blocking connections at interfaces is enabled. If so, the process 500 continues at step 504, where the connection manager component 308 determines the one or more interfaces at which the connection is received.

[0083] At step 506, the connection manager component 308 determines whether the connection was received at one or more interfaces identified on the interface blacklist data structure 306. If so, the process 500 continues at step 536, where the connection manager component 308 refuses the connection request, and the process 500 concludes. As an example of this aspect of the process 500, the computing device 300 may be a clustered computing device 300 having two or more nodes, each with its own NIC and corresponding interface. If an administrator wishes to block connections at one of the interfaces, the administrator can add the interface identifier (e.g., the IP address) to the interface blacklist data structure 306. Following that, the connection manager component 308 of the clustered computing device 300 will refuse any connection requests received at that interface.

[0084] If the connection was received at an interface that is not identified on the interface blacklist data structure 306, the process 500 continues at step 508, where the connection manager component 308 determines the identifier of the requestor of the connection. For example, if IP addresses are used as identifiers, the connection manager component 308 determines the IP address of the requestor. At step 510 the connection manager component 308 determines whether blocking connections from entities is enabled. If so, at step 512 the connection manager component 308 determines whether the requestor identifier is on the unauthorized connection data structure 304. The connection manager component 308 may do so by examining only entries on the unauthorized connection data structure 304 for which there is no timestamp (e.g., the timestamp is <null>), which indicate that connection requests from the corresponding entities are to be always or nearly always refused. If the connection manager component 308 determines that the requestor identifier is on the unauthorized connection data structure 304, the process 500 continues to step 536, where the connection manager component 308 refuses the connection request, and the process 500 concludes.

[0085] If the connection manager component 308 determines that the requestor identifier is not on the unauthorized connection data structure 304, the process 500 continues to step 514, where the connection manager component 308 determines whether dynamic connection blocking is enabled. If it is, the process continues to step 516 where the connection manager component 308 determines if the requestor identifier is on the unauthorized connection data structure 304 and if the time at which the connection request is made is within a particular time period following a time of an immediately prior connection request (e.g., within one hour of a time at which an immediately prior connection request was made). The connection manager component 308 may do so by examining only entries on the unauthorized connection data structure 304 for which there is a timestamp (e.g., the timestamp is not <null>), which indicate that the connection manager component 308 dynamically added these entries.

[0086] If the connection manager component 308 determines that the requestor identifier is either not on the unauthorized connection data structure 304 or the time at which the connection request is made is not within the particular time period, the process 500 continues to step 518, where the connection manager component 308 attempts to authenticate the requestor (e.g., to determine that the requestor is the entity that it purports to be) and/or determine whether the requestor is properly authorized (e.g., to determine whether the requestor is allowed to access resources of the computing device 300). The connection manager component 308 may attempt to do so in various ways. For example, the connection manager component 308 may request that the requestor provide it with an identifier (e.g., a name, a host name, an IP address, etc.) and a token (e.g., an encrypted password). The connection manager component 308 may then compare the provided identifier and token with a stored identifier and token in order to attempt to authenticate the requestor. If the connection manager component 308 successfully authenticates the requestor, then it may consult one or more data structures (e.g., Access Control Lists (ACLs) or other authorization data structures) to determine whether or not the requestor is properly authorized (e.g., to access resources of the computing device 300).

[0087] Additionally or alternatively, the connection manager component 308 may request that the requestor provide it with its identifier and its token, and also request that a third party (e.g., the storage manager 105, the authentication/authorization manager 185, or another computing device that performs authentication and/or authorization) also provide it with the token corresponding to the requestor's identifier. Upon receipt, the connection manager component 308 may then compare the two tokens to see if they match. If so, then the connection manager component 308 has authenticated the requestor and may then determine if the requestor is properly authorized, as described above. If they do not match, then the connection manager component 308 has not authenticated the requestor. Those of skill in the art will understand that various ways of authenticating a requestor and/or determining a requestor's authorization to access resources exist and that aspects of the invention are not limited to those described herein.
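As one non-limiting illustration, the token comparison described above might be sketched in Python as follows. The function name and the `third_party_tokens` mapping (standing in for the response of the third party) are hypothetical. The use of a constant-time comparison via the standard library's `hmac.compare_digest` is a choice made for this sketch and is not stated in the description.

```python
import hmac
from typing import Mapping


def authenticate(
    identifier: str,
    presented_token: bytes,
    third_party_tokens: Mapping[str, bytes],
) -> bool:
    """Return True if the requestor's token matches the third party's token."""
    expected = third_party_tokens.get(identifier)
    if expected is None:
        # The third party has no token for this identifier (e.g., the
        # account was removed), so the requestor cannot be authenticated.
        return False
    # Compare the two tokens without leaking timing information.
    return hmac.compare_digest(presented_token, expected)
```

A successful result here establishes only authentication; determining whether the requestor is properly authorized (e.g., by consulting an ACL) would be a separate step.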

[0088] At step 520, the connection manager component 308 determines whether it has authenticated and/or authorized the requestor. If so, the process 500 continues to step 522, where the connection manager component 308 allows the requestor to connect to the computing device 300. At step 524, the connection manager component 308, if necessary, removes the requestor identifier from the unauthorized connection data structure 304 (e.g., removes the entry corresponding to the requestor from the unauthorized connection data structure, or voids or otherwise renders inconsequential the entry). The process 500 then concludes.

[0089] Returning to step 514, if the connection manager component 308 determines that dynamic connection blocking is not enabled, the process 500 continues at step 526, where the connection manager component 308 attempts to authenticate the requestor and/or determine whether the requestor is properly authorized. This authentication and/or authorization attempt may be similar to that described with respect to step 518. At step 528, if the connection manager component 308 determines that it has authenticated the requestor and/or the requestor is properly authorized, the process 500 continues to step 530, where the connection manager component 308 allows the requestor to connect to the computing device 300. If not, the process continues at step 532, where the connection manager component 308 refuses the connection request. In either case, the process 500 then concludes.

[0090] Returning to step 516, if the connection manager component 308 determines that the requestor identifier is on the unauthorized connection data structure 304 and the time at which the connection request is made is within the particular time period, the connection manager component 308 thus determines that the requestor is not authorized to connect to the computing device 300. The process 500 then continues at step 534. Similarly, at step 520, if the connection manager component 308 did not authenticate the requestor, the process 500 continues at step 534. At this step, the connection manager component 308 performs one of two actions. If the requestor identifier is already in an entry on the unauthorized connection data structure 304 and there is a corresponding timestamp (e.g., the corresponding timestamp is not <null>), the connection manager component 308 updates the timestamp of the entry to the time of the connection request. This particular circumstance typically indicates that the immediately previous connection request made by the requestor was refused.

[0091] Alternatively, if the requestor identifier is not in an entry on the unauthorized connection data structure 304, the connection manager component 308 adds an entry to the unauthorized connection data structure 304 containing the requestor identifier and the time at which the connection request was made. The process then continues at step 536, where the connection manager component 308 refuses the connection request. The process 500 then concludes. In some examples, prior to the conclusion of the process 500, the logging component 310 stores a record of the allowance or refusal of the connection request in the log data structure 312.
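As one non-limiting illustration, the decision logic of the process 500 might be sketched in Python as follows. The `ManagerState` type, the `decide` function, the `authenticate` callback (standing in for steps 518 and 526), and the one hour `WINDOW` are assumptions made for this sketch. Identifiers are compared exactly, and logging is omitted.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Callable, Dict, Optional, Set

WINDOW = timedelta(hours=1)  # assumed; one hour is given only as an example


@dataclass
class ManagerState:
    """In-memory state of a hypothetical connection manager."""
    interface_blacklist: Set[str] = field(default_factory=set)
    unauthorized: Dict[str, Optional[datetime]] = field(default_factory=dict)
    block_interfaces: bool = False
    block_entities: bool = False
    dynamic_blocking: bool = False


def decide(
    requestor: str,
    interface: str,
    now: datetime,
    state: ManagerState,
    authenticate: Callable[[str], bool],
) -> bool:
    """Return True to allow the connection request, False to refuse it."""
    # Steps 502-506: refuse requests received at a blacklisted interface.
    if state.block_interfaces and interface in state.interface_blacklist:
        return False
    # Steps 510-512: refuse requestors with an entry lacking a timestamp.
    if (state.block_entities and requestor in state.unauthorized
            and state.unauthorized[requestor] is None):
        return False
    # Steps 514 and 526-532: without dynamic blocking, authenticate only.
    if not state.dynamic_blocking:
        return authenticate(requestor)
    # Step 516: was the requestor refused within the time period?
    stamp = state.unauthorized.get(requestor)
    recently_refused = stamp is not None and now - stamp <= WINDOW
    if not recently_refused and authenticate(requestor):
        # Steps 518-524: allow, removing any stale dynamic entry.
        if stamp is not None:
            del state.unauthorized[requestor]
        return True
    # Step 534: update or add the dynamic entry, leaving permanent ones alone.
    if stamp is not None or requestor not in state.unauthorized:
        state.unauthorized[requestor] = now
    # Step 536: refuse the connection request.
    return False
```

Note that within the time period the sketch refuses the request without calling `authenticate` at all, which is the resource saving that dynamic connection blocking is intended to provide.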

[0092] As described herein, the vendor may employ the computing device 300 to perform data storage operations on data of clients 130 of several contracting organizations. If the vendor wishes to preclude one of the contracting organizations from requesting connections, the vendor administrator can add the identifiers of the contracting organization's clients 130 to the entity blacklist data structure 302 (and thus to the unauthorized connection data structure 304), and the computing device 300 will refuse connection requests from the identified clients 130.

[0093] Alternatively, the vendor can remove the accounts of the contracting organization's clients 130 from the third party managing computer (e.g., the storage manager 105, the authentication/authorization manager 185, and/or another computing device that performs authentication and/or authorization) and/or remove the authorization of the prohibited clients 130 to perform data storage operations. This will result in the prohibited clients 130 not being authenticated and/or properly authorized when they request connections to the computing device 300. The computing device 300 can then add their identifiers to the unauthorized connection data structure 304 along with a timestamp indicating the connection request time. Any connection requests from the prohibited clients 130 during a certain period of time from the connection request time would then be automatically refused by the connection manager component 308 of the computing device 300.

[0094] One advantage of adding identifiers of prohibited clients 130 to the entity blacklist data structure 302 is that the computing device 300 will forego any attempt to authenticate and/or determine authorization of identified clients 130, which conserves resources of the vendor's data storage system (e.g., the resources of the authentication/authorization manager 185). Accordingly, the vendor can devote its limited resources to the other organizations that have contracted it to perform data storage operations.

CONCLUSION

[0095] From the foregoing, it will be appreciated that specific examples of data storage systems have been described herein for purposes of illustration, but that various modifications may be made without deviating from the spirit and scope of the invention. For example, although copy operations may have been described, the system may be used to perform many types of storage operations (e.g., backup operations, restore operations, archival operations, copy operations, CDR operations, recovery operations, migration operations, HSM operations, etc.). Accordingly, the invention is not limited except as by the appended claims.

[0096] Unless the context clearly requires otherwise, throughout the description and the claims, the words "comprise," "comprising," and the like are to be construed in an inclusive sense, as opposed to an exclusive or exhaustive sense; that is to say, in the sense of "including, but not limited to." The word "coupled," as generally used herein, refers to two or more elements that may be either directly connected, or connected by way of one or more intermediate elements. Additionally, the words "herein," "above," "below," and words of similar import, when used in this application, shall refer to this application as a whole and not to any particular portions of this application. Where the context permits, words in the above Detailed Description using the singular or plural number may also include the plural or singular number respectively. The word "or," in reference to a list of two or more items, covers all of the following interpretations of the word: any of the items in the list, all of the items in the list, and any combination of the items in the list.

[0097] The above detailed description of embodiments of the invention is not intended to be exhaustive or to limit the invention to the precise form disclosed above. While specific embodiments of, and examples for, the invention are described above for illustrative purposes, various equivalent modifications are possible within the scope of the invention, as those skilled in the relevant art will recognize. For example, while processes or blocks are presented in a given order, alternative embodiments may perform routines having steps, or employ systems having blocks, in a different order, and some processes or blocks may be deleted, moved, added, subdivided, combined, and/or modified. Each of these processes or blocks may be implemented in a variety of different ways. Also, while processes or blocks are at times shown as being performed in series, these processes or blocks may instead be performed in parallel, or may be performed at different times.

[0098] The teachings of the invention provided herein can be applied to other systems, not necessarily the system described above. The elements and acts of the various embodiments described above can be combined to provide further embodiments.

[0099] Any patents and applications and other references noted above, including any that may be listed in accompanying filing papers, are incorporated herein by reference in their entireties. Aspects of the invention can be modified, if necessary, to employ the systems, functions, and concepts of the various references described above to provide yet further implementations of the invention.

[0100] These and other changes can be made to the invention in light of the above Detailed Description. While the above description details certain embodiments of the invention and describes the best mode contemplated, no matter how detailed the above appears in text, the invention can be practiced in many ways. Details of the system may vary considerably in implementation, while still being encompassed by the invention disclosed herein. For example, while the computing networks described in FIGS. 2A-2C contemplate use of a firewall, the methods and systems described herein may be employed in computing networks that do not include a firewall. As noted above, particular terminology used when describing certain features or aspects of the invention should not be taken to imply that the terminology is being redefined herein to be restricted to any specific characteristics, features, or aspects of the invention with which that terminology is associated. In general, the terms used in the following claims should not be construed to limit the invention to the specific embodiments disclosed in the specification, unless the above Detailed Description section explicitly defines such terms. Accordingly, the actual scope of the invention encompasses not only the disclosed embodiments, but also all equivalent ways of practicing or implementing the invention under the claims.

[0101] While certain aspects of the invention are presented below in certain claim forms, the inventors contemplate the various aspects of the invention in any number of claim forms. For example, while only one aspect of the invention is recited as embodied in a computer-readable medium, other aspects may likewise be embodied in a computer-readable medium. As another example, while only one aspect of the invention is recited as a means-plus-function claim under 35 U.S.C. § 112, sixth paragraph, other aspects may likewise be embodied as a means-plus-function claim, or in other forms, such as being embodied in a computer-readable medium. (Any claims intended to be treated under 35 U.S.C. § 112, ¶ 6 will begin with the words "means for.") Accordingly, the inventors reserve the right to add additional claims after filing the application to pursue such additional claim forms for other aspects of the invention.

* * * * *