Chained, Scalable Storage Devices

Cohen; Earl T.

Patent Application Summary

U.S. patent application number 13/765,253 was filed with the patent office on 2013-02-12 for chained, scalable storage devices, and was published on 2013-06-20. This patent application is currently assigned to LSI CORPORATION. The applicant listed for this patent is LSI CORPORATION. Invention is credited to Earl T. Cohen.

Publication Number: 20130159622
Application Number: 13/765,253
Family ID: 45348918
Filed Date: 2013-02-12
Publication Date: 2013-06-20

United States Patent Application 20130159622
Kind Code A1
Cohen; Earl T. June 20, 2013

CHAINED, SCALABLE STORAGE DEVICES

Abstract

Described embodiments access data in a chained, scalable storage system. A primary agent of one or more storage devices receives a host request including a logical address from a host coupled to the primary agent. The primary agent determines, based on the logical address, a corresponding physical address in at least one of the storage devices and generates, based on the physical address, a sub-request for each determined physical address in the storage devices. The primary agent sends, via a storage device interface network operable independently of the host, the sub-requests to the storage devices. The storage device interface network is a peer-to-peer network coupling the storage devices to the primary agent. The primary agent receives sub-statuses in response to the sub-requests, and determines an overall status. The primary agent provides the overall status to the host such that the host is coupled to the storage devices without a switch.


Inventors: Cohen; Earl T.; (Oakland, CA)

Applicant: LSI CORPORATION; Milpitas, CA, US

Assignee: LSI CORPORATION, Milpitas, CA

Family ID: 45348918
Appl. No.: 13/765253
Filed: February 12, 2013

Related U.S. Patent Documents

Application Number    Filing Date
13/702,976            Dec 7, 2012
PCT/US11/40996        Jun 17, 2011
13/765,253
61/356,443            Jun 18, 2010

Current U.S. Class: 711/114 ; 711/154
Current CPC Class: G06F 3/0626 20130101; G06F 13/4022 20130101; G06F 3/067 20130101; G06F 2213/0026 20130101; G06F 3/0631 20130101; G06F 3/0659 20130101; G06F 3/0683 20130101; G06F 3/061 20130101; G06F 12/00 20130101; G06F 3/0664 20130101; G06F 3/0688 20130101; G06F 3/0604 20130101
Class at Publication: 711/114 ; 711/154
International Class: G06F 3/06 20060101 G06F003/06

Claims



1. A method of accessing data in a chained, scalable storage system, the method comprising: receiving, by a primary agent of one or more storage devices, a host request from a host device coupled to the primary agent via a host interface network, the request to access a logical address of the one or more storage devices; determining, by the primary agent based on the logical address, a corresponding physical address in at least one of the one or more storage devices; generating, by the primary agent based on the physical address, a sub-request corresponding to the host request and each of the determined corresponding physical addresses in at least one of the one or more storage devices; sending, by the primary agent via a storage device interface network operable independently of the host device, the sub-requests to the at least one storage device, the storage device interface network a peer-to-peer network coupling the storage devices to the primary agent; and receiving, by the primary agent from the at least one storage device, respective sub-statuses in response to the sub-requests, determining an overall status based on each respective sub-status, and providing the overall status to the host device, wherein the host device is coupled to the one or more storage devices without employing a network switch.

2. The method of claim 1, wherein the storage device interface network is not directly accessible to the host interface network.

3. The method of claim 2, further comprising: sending, by each of the storage devices, data communication via a respective separate data communication path with the host separate from the storage device interface network, whereby control traffic between the host device and the storage devices is solely between the host device and the primary agent, while data communication bandwidth scales with a number of the storage devices.

4. The method of claim 1, wherein, for the method, the host interface network and the storage device interface network comprise transmission media comprising at least one of: a backplane, one or more copper cables, one or more optical fibers, one or more coaxial cables, one or more twisted pair copper wires.

5. The method of claim 4, further comprising: selectively providing higher bandwidth storage device interface network connections to a subset of the one or more storage devices.

6. The method of claim 5, wherein the subset of the one or more storage devices comprises one or more of the storage devices located proximately to the host device.

7. The method of claim 4, wherein the host interface network comprises a Peripheral Component Interconnect Express (PCI-E) network.

8. The method of claim 7, wherein the host interface network comprises a PCI-E Gen4 network, and the storage device interconnect network comprises one or more of: a PCI-E Gen3 network, an Ethernet network, a Serial Attached Small Computer System Interface (SAS) network, and a Serial Advanced Technology Attachment (SATA) network.

9. The method of claim 1, wherein, for the method, the one or more storage devices comprise at least one of: a Solid State Disk (SSD), a Hard Disk Drive (HDD), a Magnetoresistive Random Access Memory (MRAM), a tape library, and a hybrid magnetic and solid state storage system.

10. The method of claim 1, further comprising: providing a bandwidth to the host interface network that is related to an aggregate deliverable bandwidth of the one or more storage devices.

11. The method of claim 10, wherein the storage device interface network comprises one or more physical links, each link having an independent bandwidth.

12. The method of claim 11, wherein each of the one or more physical links comprise (i) a relatively lower-bandwidth sideband coupling for transferring control data, and (ii) a relatively higher-bandwidth main band coupling for transferring user data.

13. The method of claim 10, wherein the providing comprises providing each of the storage devices with a separate physical link of the host interface network.

14. The method of claim 1, further comprising: employing the one or more storage devices in a Redundant Array of Independent Disks (RAID) system.

15. A chained, scalable storage system comprising: a plurality of storage devices, at least one of the storage devices a primary agent for one or more of the plurality of storage devices; a host device coupled via a host interface network to the at least one primary agent, wherein the at least one primary agent is configured to: receive a host request from the host device, the request to access a logical address of the one or more of the plurality of storage devices; determine, based on the logical address, a corresponding physical address in at least one of the one or more of the plurality of storage devices; generate, based on the physical address, a sub-request corresponding to the host request and each of the determined corresponding physical addresses in at least one of the one or more of the plurality of storage devices; send, via a storage device interface network operable independently of the host device, the sub-requests to the at least one storage device, the storage device interface network a peer-to-peer network coupling the storage devices to the primary agent; and receive, from the at least one storage device, respective sub-statuses in response to the sub-requests, determine an overall status based on each respective sub-status, and provide the overall status to the host device, wherein the host device is coupled to the one or more storage devices without employing a network switch.

16. The system of claim 15, wherein the storage device interface network is not directly accessible to the host interface network.

17. The system of claim 16, wherein control traffic between the host device and the storage devices is solely between the host device and the at least one primary agent, and data bandwidth scales with a number of the storage devices.

18. The system of claim 15, wherein the storage device interface network is configured to, at least one of: selectively provide higher bandwidth connections to a subset of the one or more storage devices; and provide a bandwidth to the host interface network that is related to an aggregate deliverable bandwidth of the one or more storage devices.

19. The system of claim 15, wherein: the host interface network comprises a Peripheral Component Interconnect Express (PCI-E) Gen4 network; the storage device interconnect network comprises one or more of: a PCI-E Gen3 network, an Ethernet network, a Serial Attached Small Computer System Interface (SAS) network, and a Serial Advanced Technology Attachment (SATA) network; and the one or more storage devices comprise at least one of: a Solid State Disk (SSD), a Hard Disk Drive (HDD), a Magnetoresistive Random Access Memory (MRAM), a tape library, a hybrid magnetic and solid state storage system; and a Redundant Array of Independent Disks (RAID).

20. A non-transitory machine-readable medium, having encoded thereon program code, wherein, when the program code is executed by a machine, the machine implements a method of accessing data in a chained, scalable storage system, the method comprising: receiving, by a primary agent of one or more storage devices, a host request from a host device coupled to the primary agent via a host interface network, the request to access a logical address of the one or more storage devices; determining, by the primary agent based on the logical address, a corresponding physical address in at least one of the one or more storage devices; generating, by the primary agent based on the physical address, a sub-request corresponding to the host request and each of the determined corresponding physical addresses in at least one of the one or more storage devices; sending, by the primary agent via a storage device interface network operable independently of the host device, the sub-requests to the at least one storage device, the storage device interface network a peer-to-peer network coupling the storage devices to the primary agent; and receiving, by the primary agent from the at least one storage device, respective sub-statuses in response to the sub-requests, determining an overall status based on each respective sub-status, and providing the overall status to the host device, wherein the host device is coupled to the one or more storage devices without employing a network switch.
Description



CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application is a continuation-in-part, and claims the benefit of the filing date, of U.S. patent application Ser. No. 13/702,976, filed Dec. 7, 2012, which claims the benefit of the filing date of U.S. provisional application No. 61/497,525 filed Jun. 16, 2011, International Patent Application no. PCT/US2011/040996 filed Jun. 17, 2011, and U.S. provisional application No. 61/356,443 filed Jun. 18, 2010, the teachings of all which are incorporated herein in their entireties by reference.

BACKGROUND

[0002] A Storage Area Network (SAN) is a system that provides access to consolidated, block-level storage, such as disk arrays and tape libraries, to one or more host devices coupled to the SAN. A SAN represents a plurality of storage devices as a single logical interface to the host devices, conceptually aggregating the storage implemented by each of the storage devices into a single logical storage space. A typical SAN might be scalable, meaning that the amount of storage space (e.g., the number of storage devices) can be changed as needed in different SAN systems. As noted, a SAN provides block-level access, meaning that the file system is typically managed by the host devices. A typical SAN might employ block-level protocols such as Fibre Channel (FC), Advanced Technology Attachment (ATA) over Ethernet (AoE), Internet Small Computer System Interface (iSCSI) or HyperSCSI. A SAN directly transfers data between storage devices and host devices.

[0003] A Network Attached Storage (NAS) is a system that provides file-level access to one or more host devices coupled to the NAS. Unlike a SAN, the NAS system provides a file system for its attached storage devices, essentially acting as a file server accessing one or more local block-level storage devices. A typical NAS might employ file-level protocols such as Network File System (NFS) or Server Message Block/Common Internet File System (SMB/CIFS). A SAN-NAS hybrid system is a system that provides hosts with both file-level access like a NAS device and block-level access like a SAN system from the same storage system.

[0004] In SAN, NAS and SAN-NAS hybrid systems, it is desirable to employ multiple storage devices such that the total system storage can be increased by grouping together a plurality of storage devices. Such grouping of storage devices typically requires a communication hierarchy with a switch such that the storage devices are available to the host, either individually or in aggregate.

SUMMARY

[0005] This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.

[0006] Described embodiments access data in a chained, scalable storage system. A primary agent of one or more storage devices receives a host request including a logical address from a host coupled to the primary agent. The primary agent determines, based on the logical address, a corresponding physical address in at least one of the storage devices and generates, based on the physical address, a sub-request for each determined physical address in the storage devices. The primary agent sends, via a storage device interface network operable independently of the host, the sub-requests to the storage devices. The storage device interface network is a peer-to-peer network coupling the storage devices to the primary agent. The primary agent receives sub-statuses in response to the sub-requests, and determines an overall status. The primary agent provides the overall status to the host such that the host is coupled to the storage devices without a switch.

BRIEF DESCRIPTION OF THE DRAWING FIGURES

[0007] Other aspects, features, and advantages of described embodiments will become more fully apparent from the following detailed description, the appended claims, and the accompanying drawings in which like reference numerals identify similar or identical elements.

[0008] FIG. 1 shows a block diagram of a scalable storage system in accordance with exemplary embodiments;

[0009] FIG. 2 shows a block diagram of a scalable storage system in accordance with exemplary embodiments;

[0010] FIG. 3 shows a block diagram of a scalable storage system in accordance with exemplary embodiments;

[0011] FIG. 4 shows a block diagram of a scalable storage system in accordance with exemplary embodiments; and

[0012] FIG. 5 shows a block diagram of a scalable storage system in accordance with exemplary embodiments.

DETAILED DESCRIPTION

[0013] Described embodiments access data in a chained, scalable storage system. A primary agent of one or more storage devices receives a host request including a logical address from a host coupled to the primary agent. The primary agent determines, based on the logical address, a corresponding physical address in at least one of the storage devices and generates, based on the physical address, a sub-request for each determined physical address in the storage devices. The primary agent sends, via a storage device interface network operable independently of the host, the sub-requests to the storage devices. The storage device interface network is a peer-to-peer network coupling the storage devices to the primary agent. The primary agent receives sub-statuses in response to the sub-requests, and determines an overall status. The primary agent provides the overall status to the host such that the host is coupled to the storage devices without a switch.

[0014] Table 1 defines a list of acronyms employed throughout this specification as an aid to understanding the described embodiments:

TABLE 1

AoE     Advanced Technology Attachment (ATA) over Ethernet
CD      Compact Disc
CIFS    Common Internet File System
DVD     Digital Versatile Disc
FC      Fibre Channel
HDD     Hard Disk Drive
HIF     Host InterFace
IC      Integrated Circuit
I/O     Input/Output
iSCSI   Internet SCSI
MRAM    Magnetoresistive Random Access Memory
NAS     Network Attached Storage
NFS     Network File System
PCI-E   Peripheral Component Interconnect Express
PHY     PHysical Layer
RAID    Redundant Array of Independent Disks
RF      Radio Frequency
SAN     Storage Area Network
SAS     Serial Attached SCSI
SATA    Serial Advanced Technology Attachment
SCSI    Small Computer System Interface
SMB     Server Message Block
SoC     System on Chip
SRIO    Serial Rapid Input/Output
SSD     Solid-State Disk
USB     Universal Serial Bus

[0015] In some SAN, NAS or SAN-NAS hybrid systems, one of the storage devices might act as a primary agent that accepts storage requests received from host devices over a host-interface (HIF) protocol. The primary agent processes the host requests and generates one or more sub-requests to secondary agents of each storage device over a peer-to-peer protocol. The secondary agents accept and process the sub-requests, and report sub-status information for each of the sub-requests to the primary agent and/or the host. The primary agent optionally accumulates the sub-statuses into an overall status of the host request. Peer-to-peer communication between the agents is optionally used to communicate redundancy information during host accesses and/or failure recoveries. Various failure recovery techniques might reallocate storage, reassign agents, and recover data via redundancy information.
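
To make the flow in paragraph [0015] concrete, the following Python sketch shows one hypothetical way a primary agent could translate a host request into per-device sub-requests, fan them out over the peer-to-peer storage network, and fold the returned sub-statuses into an overall status. All class, method, and status names are illustrative assumptions, not taken from the patent.

```python
# Illustrative sketch (not from the patent) of the request flow in paragraph
# [0015]: a primary agent fans a host request out over the peer-to-peer
# storage network and folds the returned sub-statuses into an overall status.

from dataclasses import dataclass

@dataclass
class SubRequest:
    device_id: int       # which storage device services this piece
    physical_addr: int   # device-local physical address
    length: int          # bytes to access

class PrimaryAgent:
    def __init__(self, devices):
        self.devices = devices  # maps device_id -> secondary-agent handle

    def handle_host_request(self, logical_addr, length):
        # 1. Translate the host's logical address into per-device sub-requests.
        sub_requests = self.translate(logical_addr, length)
        # 2. Send each sub-request over the storage device interface network
        #    and collect the per-device sub-statuses.
        sub_statuses = [self.devices[sr.device_id].execute(sr)
                        for sr in sub_requests]
        # 3. Accumulate an overall status; here, success only if every
        #    sub-request succeeded (one possible policy, not mandated).
        return "OK" if all(s == "OK" for s in sub_statuses) else "ERROR"

    def translate(self, logical_addr, length):
        # Placeholder mapping: a single sub-request to device 0.  A striping
        # example is sketched later in the text.
        return [SubRequest(device_id=0, physical_addr=logical_addr,
                           length=length)]
```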

[0016] FIG. 1 shows a block diagram of an exemplary scalable storage system, for example as described in related U.S. patent application Ser. No. 13/702,976, filed Dec. 7, 2012, which is incorporated herein by reference. As shown in FIG. 1, a scalable storage system includes at least one host device 100 coupled to pluggable storage module 190 via coupling 101. Coupling 101 might be implemented as a transmission medium, such as a backplane, one or more copper cables, one or more optical fibers, one or more coaxial cables, one or more twisted pair copper wires, and/or one or more radio frequency (RF) channels. For example, coupling 101 might be implemented as an FC, AoE, iSCSI, or HyperSCSI link (e.g., in a SAN system) or as an NFS or SMB/CIFS link (e.g., in a NAS system).

[0017] Pluggable storage module 190 includes at least one host/storage device interface (shown as 180). Although shown in FIG. 1 as being integrated with pluggable storage module 190, in some embodiments, host/storage device interface 180 might be integrated with each host device 100. In some embodiments, pluggable storage module 190 might be implemented as an add-in card. As shown in FIG. 1, pluggable storage module 190 includes host-visible storage 110, which includes one or more storage devices 110(1)-110(N). Host-visible storage 110 implements storage, part or all of which is configured to allow access by host devices 100 via host/storage device interface 180. Pluggable storage module 190 also includes host-invisible storage 120, which includes one or more storage devices 120(1)-120(M). Host-invisible storage 120 implements storage that is not directly reported and, thus, is "invisible" to host devices 100. However, the host-invisible storage is reported by, and is indirectly accessible to host devices 100 through, elements of host-visible storage 110, for example via a peer-to-peer protocol. For example, a primary agent of the storage elements reports the combined storage capacity of the primary agent and any secondary agents in communication with the primary agent, even though the secondary agents are not visible to host device 100. In some embodiments, storage devices 110 and 120 are physical storage devices, such as Solid State Disks (SSDs), Hard Disk Drives (HDDs), tape libraries, hybrid magnetic and solid state storage systems, or some combination thereof.
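
As a hedged illustration of the capacity reporting described above, the short sketch below (with assumed sizes and a hypothetical function name) shows a primary agent advertising the combined capacity of itself and its host-invisible secondary agents as a single device:

```python
# Hypothetical sketch of the FIG. 1 capacity report: the primary agent
# advertises the combined capacity of itself and any host-invisible
# secondary agents it manages, so the host sees one large device.

def reported_capacity(primary_capacity_bytes, secondary_capacities_bytes):
    """Capacity the host sees when it queries the primary agent."""
    return primary_capacity_bytes + sum(secondary_capacities_bytes)

# Example (assumed sizes): one 512 GB host-visible device fronting three
# 512 GB host-invisible devices appears to the host as a single ~2 TB device.
assert reported_capacity(512 * 10**9, [512 * 10**9] * 3) == 2048 * 10**9
```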

[0018] Together, combinations of couplings 101, 111 and 121 enable request, status, and data transfers between host devices 100 and host-visible storage 110 (and host-invisible storage 120 via host-visible storage 110). For example, one or more of the couplings enable transfers via a host-interface protocol, for example by one of host devices 100 operating as a master and one of the storage elements of host-visible storage 110 operating as a slave. Further, one or more of the couplings enable transfers via a peer-to-peer protocol, for example by one of the elements of host-visible storage 110 operating as a primary agent and one of the elements of host-invisible storage 120 or another one of the elements of host-visible storage 110 operating as a secondary agent. Couplings 111 and 121 might be implemented as custom-designed communication links, or might be implemented as links conforming to a standard communication protocol such as, for example, a Small Computer System Interface (SCSI) link, a Serial Attached SCSI (SAS) link, a Serial Advanced Technology Attachment (SATA) link, a Universal Serial Bus (USB), a Fibre Channel (FC) link, an Ethernet link (e.g., a 10GE link), an IEEE 802.11 link, an IEEE 802.15 link, an IEEE 802.16 link, a Peripheral Component Interconnect Express (PCI-E) link, a Serial Rapid I/O (SRIO) link, an InfiniBand link, or other similar interface link.

[0019] In some embodiments, host/storage device interface 180 might typically be implemented as one or more PCI-E or InfiniBand switches such that host device 100, coupling 101 and host/storage device interface 180 implement a unified switch. In further embodiments, the unified switch is operable as a transparent switch with respect to host-visible storage 110 and also simultaneously operable as a non-transparent switch with respect to host-invisible storage 120. As shown in FIG. 1, the PCI-E switch (e.g., host/storage device interface 180) is a separate element distinct from each of storage devices 110 and 120.

[0020] Thus, related U.S. patent application Ser. No. 13/702,976, filed Dec. 7, 2012, incorporated herein by reference, describes a scalable storage system including one or more PCI-E or InfiniBand switches (e.g., host/storage device interface 180). If the PCI-E switch is a non-transparent switch, details of the topology below the switch and specifics of the configuration of individual storage devices are hidden from the host device (e.g., on host initialization discovery of attached devices). Thus, employing the non-transparent switch, described embodiments could select one of the storage devices to act as a master device (e.g., a primary agent) to handle all host communication with all the storage devices, and select the rest of the storage devices to act as slave devices (e.g., as secondary agents) that are hidden from the host device, even though all of storage devices 110 and 120 might be duplicate devices. Further, the aggregate group of storage devices might appear as a single storage device to the host device.

[0021] Other described embodiments can provide scalable functionality without employing a separate PCI-E switch by employing "neighbor-to-neighbor" communication such that communications employ point-to-point links between each of the storage devices without a need for a higher level (e.g., a PCI-E hierarchy). By techniques such as routing or switching, all of the storage devices are able to communicate among each other even though all the connections are point-to-point between the storage devices.

[0022] FIG. 2 shows a block diagram of an exemplary storage device 110. Host device 100 is coupled to storage device 110 via coupling 101. Coupling 101 is in communication with PHY interface 202. As shown in FIG. 2, PHY interface 202 includes one or more upstream physical layer links or ports (PHYs) (shown as 101) and one or more downstream PHYs (shown as 218(1)-218(N)). As shown in FIG. 2, storage device 110 includes a mass storage device 216 that includes one or more of solid-state storage 210 (e.g., an SSD), magnetic storage 212 (e.g., an HDD or tape library) and optical storage 214 (e.g., a CD or DVD). Storage device 110 includes storage interface 206, which communicates with each individual storage device 210, 212 and 214. Logical/Physical translation module 204 translates between logical addresses for operations received from host device 100 and physical addresses on mass storage 216. Storage device 110 also includes sub-status module 222 and sub-request module 220, both of which are in communication with PHY interface 202.
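
The patent does not spell out how Logical/Physical translation module 204 maps addresses; the following sketch assumes a simple fixed-size striping scheme across a number of devices purely to illustrate the kind of translation involved (the stripe size and function name are assumptions):

```python
# Hypothetical logical-to-physical translation: fixed-size striping of the
# host's logical address space across a number of storage devices.

STRIPE_SIZE = 4096  # bytes per stripe unit (assumed)

def logical_to_physical(logical_addr, num_devices):
    """Map a host logical byte address to (device index, device-local address)."""
    stripe_index = logical_addr // STRIPE_SIZE
    offset = logical_addr % STRIPE_SIZE
    device = stripe_index % num_devices
    physical_addr = (stripe_index // num_devices) * STRIPE_SIZE + offset
    return device, physical_addr

# With 4 devices, logical address 12288 (the fourth stripe unit) lands at
# offset 0 of device 3.
assert logical_to_physical(12288, 4) == (3, 0)
```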

[0023] In described embodiments, the upstream PHYs (e.g., 101) are in communication with a host device (e.g., 100) via the PCI-E hierarchy, and downstream PHYs (e.g., 218) are in communication with other storage devices (e.g., multiple of 110). Exemplary embodiments might employ a fixed number of configurable PHYs, for example, 8 total configurable PHYs, where a given PHY might be configured as an upstream link or a downstream link. Having configurable PHYs allows for a trade-off between bandwidth delivered to host device 100 (e.g., upstream connectivity) and capacity of the scalable storage system (e.g., downstream connectivity). Other embodiments might employ a fixed number of upstream PHYs and a fixed number of downstream PHYs, for example, 2 upstream PHYs and 6 downstream PHYs.

[0024] In various embodiments, some or all of PHYs 101 and 218 of a storage device (e.g., 110) might be operable at the same speed (e.g., a same maximum speed) or might each be operable at different speeds. For example, some embodiments might allow each of PHYs 101 and 218 to independently support any one or more of: PCI-E Gen1, Gen2, Gen3 or Gen4, 10GE, InfiniBand, SAS, SATA, or a nonstandard protocol for communication with one or more storage devices. Each of PHYs 101 and 218 is coupled to one or more respective PHY interfaces integrated within each storage device 110. When, for example, PHY interface 202 is a PCI-E interface, the PCI-E interface is configurable to communicate as one or more of: a root complex; a forwarding point; and an endpoint. A forwarding point is similar to a root complex in that a forwarding point can send and receive traffic among one or more PCI-E interfaces. A root complex is additionally a root of a separate PCI-E hierarchy. Since a host device (e.g., 100) coupled to one or more storage devices (e.g., 110) is itself a root complex, if one or more of the storage devices coupled to the host is also a root complex, then a multi-root PCI-E hierarchy is created.
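
A minimal sketch of the per-PHY role configuration just described; the three role names come from the text, while the enum and helper function are hypothetical:

```python
# Sketch of the configurable PCI-E interface roles in paragraph [0024].

from enum import Enum, auto

class PhyRole(Enum):
    ROOT_COMPLEX = auto()      # sends/receives traffic and roots its own PCI-E hierarchy
    FORWARDING_POINT = auto()  # sends/receives traffic among PCI-E interfaces
    ENDPOINT = auto()          # terminates traffic; does not forward

def can_forward(role):
    # A root complex and a forwarding point can both pass traffic between
    # interfaces; an endpoint (e.g., devices 110.Z1-110.ZN of FIG. 5) cannot.
    return role in (PhyRole.ROOT_COMPLEX, PhyRole.FORWARDING_POINT)

assert can_forward(PhyRole.FORWARDING_POINT)
assert not can_forward(PhyRole.ENDPOINT)
```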

[0025] Multiple instances of storage device 110 might be connected in any number of different ways. FIGS. 3-5 show block diagrams of exemplary point-to-point connections of multiple storage devices in scalable storage systems in accordance with exemplary embodiments. As shown, in various embodiments the PHYs and the PHY controllers might be coupled via: a daisy chain (or optionally a loop) as shown in FIG. 3; a fixed, 1-to-1 interconnection to a host device (shown in FIG. 4); a full crossbar topology; a partial crossbar topology; a multiplexor network; a combination thereof; or any other technique for coupling multiple hardware devices. In some embodiments, the connection network among the storage devices is a switched network, while in others, the connection network among the storage devices is a routed network. Further, in some embodiments, at least some of storage devices 110 have a different configuration of PHYs, or one or more different types of PHYs (e.g., PCI-E, 10GE, InfiniBand, SAS, SATA, etc.).

[0026] As shown in FIGS. 3 and 4, storage devices 110(A)-110(N) of FIG. 3, and storage devices 110(1)-110(N) of FIG. 4 have internal PHY interfaces configured as forwarding points. FIG. 5 shows a hierarchical coupling where all of storage devices 110 have PHY interfaces that are configured as forwarding points, except storage devices 110.Z1 through 110.ZN, which have PHY interfaces configured as endpoints. Thus, in described embodiments, one or more of storage devices 110 (e.g., storage device 110(A) of FIG. 3, 110(1)-110(N) of FIG. 4, 110.A of FIG. 5) is coupled to host device 100, and all of the storage devices are coupled directly to host device 100 (e.g., as shown in FIG. 4), or are coupled indirectly to host device 100 via others of the storage devices, without employing, for example, a PCI-E switch.

[0027] At least one of storage devices 110 acts as a primary agent, and at least one or more of storage devices 110 act as secondary agents. In various embodiments, the one or more primary agents have a direct, more direct, shorter, and/or lower latency connection with host device 100 than the secondary agents. For example, as shown in FIG. 3, storage device 110(A) might act as the primary agent for storage devices 110(B)-110(N), since, for example, storage device 110(A) has a direct connection to host device 100, while storage devices 110(B)-110(N) are coupled to one another in a daisy chain. As shown in FIG. 4, all of storage devices 110(1)-110(N) are able to act as primary agents for themselves, as each storage device 110(1)-110(N) has a direct connection to host device 100. Each storage device having a direct connection to the host advantageously enables bandwidth to/from the host to scale linearly with a number of the storage devices. Further, having a subset of the storage devices, such as just one of the storage devices, act as a primary agent and the others as secondary agents enables scalable capacity without a need for the host to control a plurality of separate storage devices. As shown in FIG. 5, storage device 110.A might act as the primary agent for storage devices 110.B1-110.Bn, since, for example, storage device 110.A has a direct connection to host device 100, while storage device 110.B1 might act as a primary agent for storage devices (not shown) coupled via couplings 218(C1), and so on.
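
As a hypothetical illustration of the primary-agent selection rule above (the most direct, lowest-latency connection to the host), the sketch below picks the device with the fewest hops to the host; the hop counts and function name are assumptions:

```python
# Pick the primary agent as the device with the shortest path to the host.

def select_primary(hops_to_host):
    """hops_to_host: dict mapping device id -> hop count to host device 100.
    Returns the id of the device with the most direct connection."""
    return min(hops_to_host, key=hops_to_host.get)

# FIG. 3 style daisy chain: device A is directly attached (1 hop) and the
# others sit behind it, so A acts as the primary agent for the chain.
assert select_primary({"A": 1, "B": 2, "C": 3, "D": 4}) == "A"
```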

[0028] In described embodiments, all communication between primary agents and secondary agents is performed as neighbor-to-neighbor traffic that is not visible to host device 100 (and, thus, not visible to the PCI-E hierarchy of host device 100). For example, as shown in FIG. 3, all of the neighbor-to-neighbor traffic is performed on couplings 218(1)-218(N), and none of the neighbor-to-neighbor traffic is performed on connection 101 which couples storage devices 110 to host device 100. Similarly, as shown in FIG. 4, all of the neighbor-to-neighbor traffic is performed on couplings 218(1)-218(N), and none of the neighbor-to-neighbor traffic is performed on couplings 101(1)-101(N) coupling storage devices 110(1)-110(N) to host device 100. Similarly, as shown in FIG. 5, all of the neighbor-to-neighbor traffic is performed on couplings 218(B1)-218(Zn), and none of the neighbor-to-neighbor traffic is performed on coupling 101 coupling storage device 110.A to host device 100.

[0029] In described embodiments, the neighbor-to-neighbor traffic is control traffic, such as: the forwarding of commands received by a primary agent from host device 100 to a specific one of storage devices 110, and of responses (e.g., completions) from that specific one of storage devices 110 back to the primary agent; information derived from commands received from host device 100; maintenance traffic such as synchronization or heartbeats; RAID or other data redundancy control or data traffic (e.g., deltas for RAID); and other traffic. For example, when a write command updates a part of a RAID stripe on a particular one of storage devices 110, the particular storage device sends a RAID delta to one or more of the other storage devices (e.g., the one of the storage devices storing the RAID parity of the stripe) as neighbor-to-neighbor traffic.
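
To make the RAID delta example concrete, the sketch below assumes XOR parity (RAID-5 style), which the patent does not mandate: the device holding the updated data sends delta = old_data XOR new_data to the parity-holding device as neighbor-to-neighbor traffic, and that device applies new_parity = old_parity XOR delta.

```python
# Hypothetical XOR-parity delta update for the neighbor-to-neighbor RAID
# traffic in paragraph [0029].

def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

old_data   = bytes([0x0F, 0xF0])   # data block before the host write
new_data   = bytes([0xFF, 0x00])   # data block after the host write
old_parity = bytes([0xAA, 0x55])   # parity block on the parity device

delta = xor_bytes(old_data, new_data)      # sent neighbor-to-neighbor
new_parity = xor_bytes(old_parity, delta)  # applied by the parity device

# Sanity check: recomputing parity from scratch gives the same result.
assert new_parity == xor_bytes(xor_bytes(old_parity, old_data), new_data)
```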

[0030] Couplings 101 and 218, as shown in FIGS. 3-5, are optionally or selectively of different bandwidths and/or different protocols. For example, upstream connections (e.g., coupling 101) to host device 100 might typically be PCI-E Gen4, while downstream connections (e.g., couplings 218) among the various storage devices 110 might typically be PCI-E Gen3 or a different protocol, such as 10GE, InfiniBand, SAS, etc. Any of the couplings might have a different bandwidth or a different number of physical links from each other. In some embodiments, control traffic of any of couplings 101 and 218 might be transferred over relatively lower-bandwidth sideband couplings, while data traffic might be transferred over relatively higher-bandwidth main band couplings. Thus, in some embodiments, any of couplings 101 and 218 might be implemented as custom-designed communication links, or might be implemented as links conforming to a standard communication protocol such as, for example, SCSI, SAS, SATA, USB, FC, Ethernet (e.g., 10GE), IEEE 802.11, IEEE 802.15, IEEE 802.16, PCI-E, SRIO, InfiniBand, or other similar interface link.

[0031] In some embodiments, such as shown in FIG. 4, a bandwidth upstream to host device 100 is substantially equal to an aggregate deliverable bandwidth of the various storage devices 110. In some embodiments, such as shown in FIG. 5, storage devices 110 that are communicatively closer to host device 100 (e.g., storage device 110.A) are configured for a higher bandwidth than storage devices communicatively farther from host device 100 (e.g., storage device 110.Z1). In some embodiments, each of storage devices 110 might have different capacities, capabilities, or be implemented as different types of storage media, such as Solid State Disks (SSDs), Hard Disk Drives (HDDs), Magnetoresistive Random Access Memory (MRAM), tape libraries, hybrid magnetic and solid state storage systems, or some combination thereof.
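
As a simple, hedged illustration of matching upstream bandwidth to the aggregate deliverable bandwidth of the devices (the figures below are assumed, not from the patent):

```python
# Upstream host bandwidth sized to the aggregate deliverable bandwidth of
# the attached storage devices, per paragraph [0031].

def aggregate_bandwidth(device_bandwidths_gbps):
    return sum(device_bandwidths_gbps)

# Example (assumed numbers): four devices each able to deliver ~4 GB/s
# suggest ~16 GB/s of upstream host bandwidth, which is one reason the
# host-facing link might use a faster generation (e.g., PCI-E Gen4) than
# the inter-device links (e.g., PCI-E Gen3).
assert aggregate_bandwidth([4, 4, 4, 4]) == 16
```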

[0032] In some embodiments, a connection network among storage devices 110 uses a PCI-E protocol (or other standard protocol) but in nonstandard ways, such as by having a circular (loop) interconnection (e.g., as indicated by optional coupling 218(N) in FIGS. 3 and 4). In further embodiments, the connection network among storage devices 110 is enabled to use nonstandard bandwidths, signaling, commands or protocol extensions to advantageously improve performance. In general, the connection network among the storage devices 110 is enabled to provide inter-device communication in a manner efficient in one or more of bandwidth, latency, and power.

[0033] Thus, as described herein, described embodiments access data in a chained, scalable storage system. A primary agent of one or more storage devices receives a host request including a logical address from a host coupled to the primary agent. The primary agent determines, based on the logical address, a corresponding physical address in at least one of the storage devices and generates, based on the physical address, a sub-request for each determined physical address in the storage devices. The primary agent sends, via a storage device interface network operable independently of the host, the sub-requests to the storage devices. The storage device interface network is a peer-to-peer network coupling the storage devices to the primary agent. The primary agent receives sub-statuses in response to the sub-requests, and determines an overall status. The primary agent provides the overall status to the host such that the host is coupled to the storage devices without a switch.

[0034] Reference herein to "one embodiment" or "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment. The appearances of the phrase "in one embodiment" in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments necessarily mutually exclusive of other embodiments. The same applies to the term "implementation."

[0035] As used in this application, the word "exemplary" is used herein to mean serving as an example, instance, or illustration. Any aspect or design described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other aspects or designs. Rather, use of the word exemplary is intended to present concepts in a concrete fashion.

[0036] While the exemplary embodiments have been described with respect to processing blocks in a software program, including possible implementation as a digital signal processor, micro-controller, or general-purpose computer, described embodiments are not so limited. As would be apparent to one skilled in the art, various functions of software might also be implemented as processes of circuits. Such circuits might be employed in, for example, a single integrated circuit, a multi-chip module, a single card, or a multi-card circuit pack.

[0037] Described embodiments might also be embodied in the form of methods and apparatuses for practicing those methods. Described embodiments might also be embodied in the form of program code embodied in non-transitory tangible media, such as magnetic recording media, optical recording media, solid state memory, floppy diskettes, CD-ROMs, hard drives, or any other non-transitory machine-readable storage medium, wherein, when the program code is loaded into and executed by a machine, such as a computer, the machine becomes an apparatus for practicing described embodiments. Described embodiments might also be embodied in the form of program code, for example, whether stored in a non-transitory machine-readable storage medium, loaded into and/or executed by a machine, or transmitted over some transmission medium or carrier, such as over electrical wiring or cabling, through fiber optics, or via electromagnetic radiation, wherein, when the program code is loaded into and executed by a machine, such as a computer, the machine becomes an apparatus for practicing the described embodiments. When implemented on a general-purpose processor, the program code segments combine with the processor to provide a unique device that operates analogously to specific logic circuits. Described embodiments might also be embodied in the form of a bitstream or other sequence of signal values electrically or optically transmitted through a medium, magnetic-field variations stored in a magnetic recording medium, etc., generated using a method and/or an apparatus of the described embodiments.

[0038] It should be understood that the steps of the exemplary methods set forth herein are not necessarily required to be performed in the order described, and the order of the steps of such methods should be understood to be merely exemplary. Likewise, additional steps might be included in such methods, and certain steps might be omitted or combined, in methods consistent with various described embodiments.

[0039] As used herein in reference to an element and a standard, the term "compatible" means that the element communicates with other elements in a manner wholly or partially specified by the standard, and would be recognized by other elements as sufficiently capable of communicating with the other elements in the manner specified by the standard. The compatible element does not need to operate internally in a manner specified by the standard. Unless explicitly stated otherwise, each numerical value and range should be interpreted as being approximate, as if the word "about" or "approximately" preceded the value or range.

[0040] Also for purposes of this description, the terms "couple," "coupling," "coupled," "connect," "connecting," or "connected" refer to any manner known in the art or later developed in which energy is allowed to be transferred between two or more elements, and the interposition of one or more additional elements is contemplated, although not required. Conversely, the terms "directly coupled," "directly connected," etc., imply the absence of such additional elements. Signals and corresponding nodes or ports might be referred to by the same name and are interchangeable for purposes here.

[0041] It will be further understood that various changes in the details, materials, and arrangements of the parts that have been described and illustrated in order to explain the nature of the described embodiments might be made by those skilled in the art without departing from the scope expressed in the following claims.

* * * * *

