U.S. patent application number 13/628967 was filed with the patent office on 2012-09-27 and published on 2014-03-27 for association of data to a biological sequence.
This patent application is currently assigned to INTERNATIONAL BUSINESS MACHINES CORPORATION. The applicant listed for this patent is INTERNATIONAL BUSINESS MACHINES CORPORATION. Invention is credited to James R. Kozloski, Clifford A. Pickover, Jacinta M. Wubben, Ruhong Zhou.
Application Number | 20140089328 13/628967 |
Family ID | 50339940 |
Filed Date | 2012-09-27 |
United States Patent
Application |
20140089328 |
Kind Code |
A1 |
Kozloski; James R. ; et
al. |
March 27, 2014 |
ASSOCIATION OF DATA TO A BIOLOGICAL SEQUENCE
Abstract
A computer assembly includes a processor configured to access
data on a network and to perform a method. The method includes
identifying, in the network, one or more references having a
relevance level greater than a predetermined threshold. The one or
more references are associated to one or more probe sequences
corresponding to one or more biological sequences. The one or more
probe sequences are ranked based on one or more criteria
corresponding to a target biological sequence. The one or more
probe sequences are assigned with a level of affinity to one or
more segments of the target biological sequence based at least on
the ranking of each of the one or more probe sequences.
Inventors: |
Kozloski; James R.; (New
Fairfield, CT) ; Pickover; Clifford A.; (Yorktown
Heights, NY) ; Wubben; Jacinta M.; (Victoria, AU)
; Zhou; Ruhong; (Fort Lee, NY) |
|
Applicant: |
Name |
City |
State |
Country |
Type |
INTERNATIONAL BUSINESS MACHINES CORPORATION |
Armonk |
NY |
US |
|
|
Assignee: |
INTERNATIONAL BUSINESS MACHINES
CORPORATION
Armonk
NY
|
Family ID: |
50339940 |
Appl. No.: |
13/628967 |
Filed: |
September 27, 2012 |
Current U.S.
Class: |
707/749 ;
707/E17.014 |
Current CPC
Class: |
G06F 16/24578 20190101;
G16B 30/00 20190201 |
Class at
Publication: |
707/749 ;
707/E17.014 |
International
Class: |
G06F 17/30 20060101
G06F017/30 |
Claims
1. A computer assembly for associating data with a target
biological sequence, comprising: a processor configured to access
data on a network and to perform a method, the method comprising:
identifying, in the network, one or more references having a
relevance level greater than a predetermined threshold, said
references being at least one of a pointer and an address
indicating a location of data or providing information regarding
the data; associating each reference of the one or more references
to one or more probe sequences corresponding to one or more
biological sequences; ranking the one or more probe sequences based
on one or more criteria corresponding to a target biological
sequence; and assigning the one or more probe sequences with a
level of affinity to one or more segments of the target biological
sequence based at least on the ranking of each of the one or more
probe sequences.
2. The computer assembly of claim 1, wherein the references are
uniform resource locators (URLs).
3. The computer assembly of claim 1, wherein associating the one or
more probe sequences to the one or more references having a
relevance level greater than a predetermined threshold includes
analyzing the one or more references to detect the presence of one
or more of key words, phrases, symbols and sources of the one or
more references.
4. The computer assembly of claim 1, wherein associating each
reference of the one or more references to one or more probe
sequences includes associating each reference with a biological
sequence that is complementary to a biological sequence referenced
by the reference.
5. The computer assembly of claim 1, wherein ranking the one or
more probe sequences comprises determining a similarity between the
one or more probe sequences and the one or more segments of the
target biological sequence, and ranking the one or more probe
sequences further comprises at least one of determining an
importance of a source of each reference of each probe sequence,
determining a popularity of each reference of each probe sequence,
and determining a historical applicability of each reference of
each probe sequence to the one or more segments of the target
biological sequence.
6. The computer assembly of claim 5, wherein determining the
similarity between the one or more probe sequences and the one or
more segments of the target biological sequence includes
determining a match between the one or more probe sequences and a
complement of the one or more segments of the target biological
sequence.
7. The computer assembly of claim 6, wherein the method further
comprises associating the one or more probe sequences with at least
one of a document, an analysis tool, and biographical information
of a person.
8. The computer assembly of claim 7, wherein determining an
importance of a source of each reference includes at least one of
determining a number of citations of an author of the document,
determining an organization to which an author of the document
belongs, and determining a type of analysis performed by the
analysis tool, determining a popularity of each reference includes
at least one of determining a number of citations of the document
and a frequency of use of the analysis tool, and determining a
historical applicability of each reference includes at least one of
determining a frequency with which the document has been cited in
association with the segment of the target biological sequence and
determining a frequency with which the analysis tool has been used
to analyze the segment.
9. The computer assembly of claim 1, wherein the one or more probe
sequences includes at least two probe sequences corresponding to a
same segment of the target biological sequence, and assigning the
at least two probe sequences a level of affinity to the segment of
the target biological sequence includes competitively comparing the
at least two probe sequences such that a reference having a higher
ranking is assigned a higher level of affinity than a reference
having a lower ranking.
10. The computer assembly of claim 1, further comprising a display,
wherein the method further comprises displaying a graphical
representation of the segment of the target biological sequence on
the display, and displaying a graphical representation of the level
of affinity of each probe sequence to the segment of the target
biological sequence on the display by adjusting a physical distance
of a graphical representation of the probe sequence from the
graphical representation of the segment based on the level of
affinity of the probe sequence.
11. A system for simulating annealing to a biological sequence,
comprising: one or more network computers having stored therein
data; and a host computer having stored therein a biological
sequence, the host computer connected to the one or more network
computers via a communications network, the host computer
configured to identify data in the one or more network computers as
relevant data that is relevant to the biological sequence, to
perform one of identifying references to the data in the one or
more network computers and generating references to the data in the
one or more network computers, said references being at least one of
a pointer and an address indicating a location of the data,
associate the relevant data with a segment of the biological
sequence, and rank the relevant data based on predetermined
criteria applied to functions of the associated segment of the
biological sequence to determine a level of affinity of the
relevant data with the segment of the biological sequence.
12. The system of claim 11, wherein the host computer is configured
to search the network for uniform resource locators (URLs) pointing
to the data stored in the one or more network computers, to
associate the URLs with the data.
13. The system of claim 11, wherein the host computer is configured
to associate the relevant data with one or more probe sequences,
the one or more probe sequences corresponding to one or more
respective segments of the biological sequence.
14. The system of claim 13, wherein the host computer is configured
to competitively rank the one or more probe sequences corresponding
to a same segment of the biological sequence, such that a probe
sequence having a higher ranking has a higher level of affinity to
the segment of the biological sequence than a probe sequence having
a lower ranking.
15. The system of claim 13, wherein the host computer is configured
to rank the one or more probe sequences based on a correspondence
between the one or more probe sequences and the segment of the
biological sequence, and the host computer is configured to further
rank the one or more probe sequences based on at least one of an
importance of a source of data associated with the one or more
probe sequences, a popularity of the data associated with the one
or more probe sequences, and a historical applicability of the data
associated with the one or more probe sequences to the segment of
the biological sequence.
16. The system of claim 15, wherein the data includes an analysis
tool relevant to the segment of the biological sequence, the
importance of the source of data associated with the one or more
probe sequences is based on at least one of a source of the
analysis tool and a type of analysis performed by the analysis
tool, a popularity of data associated with the one or more probe
sequences is based on a frequency of use of the analysis tool, and
a historical applicability of data associated with the one or more
probe sequences is based on a frequency with which an analysis tool
has been used to analyze the segment of the biological
sequence.
17. The system of claim 15, wherein the data includes a document
relevant to the segment of the biological sequence, the importance
of the source of data associated with the one or more probe
sequences is based on at least one of a number of citations of an
author of the document and an organization with which the author of
the document is associated, a popularity of data associated with
the one or more probe sequences is based on at least one of a
number of citations to the document, and a historical applicability
of data associated with the one or more probe sequences is based on
a frequency with which the document has been cited in association
with the segment of the biological sequence.
18. The system of claim 11, further comprising a display, wherein
the host computer is configured to display a graphical
representation of the segment of the biological sequence on the
display and a graphical representation of the level of affinity of
relevant data to the segment of the biological sequence on the
display by adjusting a physical distance of a graphical
representation of the relevant data from the graphical
representation of the segment based on the level of affinity of the
reference.
19. The system of claim 11, wherein the host computer is further
configured to associate the relevant data with a first probe
sequence, simulate annealing of the first probe sequence with the
segment of the biological sequence based on the determined level of
affinity of the relevant data with the segment of the biological
sequence, and simulate annealing of a second probe sequence with
the first probe sequence based on a determined level of affinity of
the second probe sequence with the first probe sequence.
20. A computer program product for simulating annealing to a
biological sequence, comprising: a processor; and a non-transitory
computer readable medium having stored thereon code to perform a
method, comprising: identifying, by the processor, references to
data in a network as relevant references that are relevant to a
biological sequence, said references being at least one of a
pointer and an address indicating a location of data or providing
information regarding the data; associating, by the processor, the
relevant references with a segment of the biological sequence; and
ranking, by the processor, the relevant references based on
predetermined criteria to determine a level of affinity of the
relevant references with the segment of the biological
sequence.
21. The computer program product of claim 20, wherein the method
comprises associating the relevant references with one or more probe sequences,
the one or more probe sequences corresponding to one or more
respective segments of the biological sequence; and the host
computer is configured to competitively rank the one or more probe
sequences corresponding to a same segment of the biological
sequence, such that a probe sequence having a higher ranking has a
higher level of affinity to the segment of the biological sequence
than a probe sequence having a lower ranking.
22. The computer program product of claim 20, wherein ranking the
relevant references includes ranking the one or more probe
sequences based on a correspondence between the one or more probe
sequences and the segment of the biological sequence, and ranking
the one or more probe sequences further includes ranking the one or
more probe sequences based on at least one of an importance of a
source of data associated with the one or more probe sequences, a
popularity of the data associated with the one or more probe
sequences, and a historical applicability of the data associated
with the one or more probe sequences to the segment of the
biological sequence.
23. The computer program product of claim 20 wherein the data
includes at least one of a document, an analysis tool, and
biographical information of a person, determining an importance of
a source of each reference includes at least one of determining a
number of citations of an author of the document, determining an
organization to which an author of the document belongs, and
determining a type of analysis performed by the analysis tool,
determining a popularity of each reference includes at least one of
determining a number of citations of the document and a frequency
of use of the analysis tool, and determining a historical
applicability of each reference to the segment includes at least
one of determining a frequency with which the document has been
cited in association with the segment of the biological sequence
and determining a frequency with which the analysis tool has been
used to analyze the segment.
24. The computer program product of claim 20, wherein the
references correspond to a same segment of the biological sequence,
the method further comprising: determining a level of affinity of
the relevant references with the segment of the biological sequence
includes competitively comparing the relevant references such that
a reference having a higher ranking is assigned a higher level of
affinity than a reference having a lower ranking.
25. The computer program product of claim 20, the method further
comprising: displaying a graphical representation of the segment of
the biological sequence, and displaying a graphical representation
of the level of affinity of each reference to the segment of the
biological sequence on the display by adjusting a physical distance
of a graphical representation of the reference from the graphical
representation of the segment based on the level of affinity of the
reference.
Description
BACKGROUND
[0001] The present disclosure relates to a simulated binding of
data to a biological sequence, and in particular to identifying
data that is relevant to a biological sequence, ranking the data
according to its importance, and providing the data to a user
according to the ranking.
[0002] Analysis of biological data, including biological sequences,
may require large amounts of data stored on different computers to
perform the analysis. Biological data being researched may be
annotated by a program to refer to data, such as research
publications, related to the biological data. This allows a
researcher to see other data that is related to the present
research of the biological data. While research articles and other
publications are useful for analyzing biological data, other
resources are also useful, such as analysis tools and software
programs. In addition, over time, the amount of information
regarding biological data being researched grows. When biological
data is annotated to refer to related publications, the annotations
also increase, which may make it more difficult for a researcher to
identify the important information related to present research.
SUMMARY
[0003] Exemplary embodiments include a computer assembly for
associating data with a biological sequence. The computer assembly
includes a processor configured to access data on a network and to
perform a method. The method includes identifying, in the network,
one or more references having a relevance level greater than a
predetermined threshold. Each reference of the one or more
references is associated with a probe sequence corresponding to a
segment of a biological sequence. The method includes ranking one
or more probe sequences based on one or more criteria and assigning
the one or more probe sequences with a level of affinity to a
segment of a target biological sequence based at least on the
ranking of each probe sequence.
[0004] Embodiments further include a system for simulating
annealing to a biological sequence. The system includes one or more
network computers having stored therein data and a host computer.
The host computer has stored therein a biological sequence. The
host computer is connected to the one or more network computers via
a communications network. The host computer is configured to
identify data in the one or more network computers as relevant data
that is relevant to the biological sequence and to associate the
relevant data with a segment of the biological sequence. The host
computer is further configured to rank the relevant data based on
predetermined criteria to determine a level of affinity of the
relevant data with the segment of the biological sequence.
[0005] Embodiments further include a computer program product for
simulating annealing to a biological sequence. The computer program
product includes a processor and a non-transitory computer readable
medium having stored thereon code to perform a method. The method
includes identifying, by the processor, references to data in a
network as relevant references that are relevant to a biological
sequence and associating the relevant references with a segment of
the biological sequence. The method includes ranking the relevant
references based on predetermined criteria to determine a level of
affinity of the relevant references with the segment of the
biological sequence.
[0006] Additional features and advantages are realized by
implementation of embodiments of the present disclosure. Other
embodiments and aspects of the present disclosure are described in
detail herein and are considered a part of the claimed invention.
For a better understanding of the embodiments, including advantages
and other features, refer to the description and to the
drawings.
BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS
[0007] The subject matter which is regarded as embodiments of the
present disclosure is particularly pointed out and distinctly
claimed in the claims at the conclusion of the specification. The
foregoing and other features and advantages of the embodiments are
apparent from the following detailed description taken in
conjunction with the accompanying drawings in which:
[0008] FIG. 1 illustrates a network system according to embodiments
of the present disclosure;
[0009] FIG. 2 illustrates a simulated annealing module according to
embodiments of the disclosure;
[0010] FIG. 3 illustrates a user customization display according to
embodiments of the disclosure;
[0011] FIG. 4A illustrates an annealing display according to an
embodiment of the disclosure;
[0012] FIG. 4B illustrates an annealing display according to an
embodiment of the disclosure;
[0013] FIG. 5 illustrates an annealing display according to another
embodiment of the disclosure;
[0014] FIG. 6 illustrates a table according to embodiments of the
disclosure;
[0015] FIG. 7 illustrates a flowchart of a method according to
embodiments of the disclosure;
[0016] FIG. 8 illustrates a computer system according to
embodiments of the disclosure; and
[0017] FIG. 9 illustrates a computer program product according to
embodiments of the disclosure.
DETAILED DESCRIPTION
[0018] The large volume of data that may be annotated to a
biological sequence may make it difficult for a researcher to
identify important data. Embodiments of the present disclosure
relate to displaying a simulated annealing of data and references
to a biological sequence to allow researchers to quickly identify
important information.
[0019] FIG. 1 illustrates a network system 100 according to an
embodiment of the present disclosure. The system 100 includes a
host computer 110 including a simulated annealing module 111, a
biological sequence 112 (also referred to as a target biological
sequence 112) and data 113. The simulated annealing module 111 is
configured to analyze data and references to the data, to determine
which data is relevant data 114 to the biological sequence 112, and
to determine a level of affinity of the relevant data 114 to the
biological sequence 112 based on predetermined ranking criteria.
The host computer 110 may display the determined level of affinity
by displaying the biological sequence 112, displaying relevant data
114, symbols representing the relevant data 114, or symbols
representing references to the relevant data 114, and adjusting a
distance of the relevant data 114 (or corresponding symbols) from
the biological sequence 112 based on the level of affinity of the
relevant data 114 with respect to the biological sequence 112.
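The distance adjustment described above can be illustrated with a minimal sketch. The disclosure does not specify a formula, so the linear mapping, the normalization of affinity to the range 0 to 1, the pixel units and the name display_distance are all assumptions made only for illustration.

```python
def display_distance(affinity: float, min_px: float = 10.0, max_px: float = 200.0) -> float:
    """Map an affinity in [0, 1] to a distance from the displayed sequence.

    Assumption: higher affinity is drawn closer to the sequence and the
    mapping is linear. Neither the range nor the units come from the source.
    """
    clamped = max(0.0, min(1.0, affinity))  # keep out-of-range inputs on the scale
    return max_px - clamped * (max_px - min_px)
```

Under these assumptions, relevant data with the highest affinity is drawn at the minimum distance from the graphical representation of the sequence, and data with no affinity is drawn at the maximum distance.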
[0020] The host computer 110 may be connected to a network 120. The
network 120 may communicate with one or more network computers 130,
which in the present specification and claims refer to computers
connected to the network 120 to communicate via the network 120.
The network computers 130 may include data 131, such as documents
132, analysis tools 133 and biographical data 134. While only a few
types of data are illustrated for purposes of description, any type
of data may be stored by the network computers 130. The simulated
annealing module 111 may access the data 131 to determine which
data 131 is relevant to the biological sequence 112. The simulated
annealing module 111 may rank the relevant data 131 based on the
predetermined ranking criteria to determine a level of affinity of
the data 131 to the biological sequence 112.
[0021] The host computer 110 may also be connected to one or more
storage devices 140, and the storage devices may store one or more
references 141 pointing to data 142, and one or more biological
sequences 143 that may be target biological sequences, or
biological sequences against which data is compared to determine a
level of affinity. For example, the storage 140 may contain a
database of biological sequences 143 and a user of the host
computer 110 may upload a biological sequence 143 from the storage
140 to the host computer 110 to allow the simulated annealing
module 111 to perform an analysis of data, such as data 113, data
131 and data 142 with respect to the biological sequence 143, to
determine the level of affinity of the data to the biological
sequence 143.
[0022] In addition, the network 120 may access one or more of
storage 170 and network computers 180 via a server 150 connected to
the Internet 160. Alternatively, the host computer 110 may directly
connect to the Internet 160. The storage 170 and network computer
180 may include data, references and biological sequences
accessible by the host computer 110 to perform analysis.
[0023] In embodiments of the present disclosure, the biological
sequence 112 or 143 may include any type of biological sequence,
including deoxyribonucleic acid (DNA), ribonucleic acid (RNA), an
amino acid sequence of a protein, or any other biological sequence.
Data includes documents, files, stored biographical information of
a person or information of an organization, stored publications,
data regarding a number of queries of the simulated annealing
module 111 or other systems, data regarding previous analysis
performed on the biological sequence 112, analysis tools,
algorithms or programs, medical treatments associated with the
biological sequence 112, data regarding comments or reviews of
publications or tools, or any other data. References include any
pointer or address that indicates a location of data or provides
additional information regarding the data. Examples include uniform
resource locators (URLs), uniform resource names (URNs),
hyperlinks, javascript pointers to data, XML pointers to data or
any other type of reference to data.
[0024] An operation of the host computer 110 including the
simulated annealing module 111 will be described below with
reference to FIGS. 1 and 2. The simulated annealing module 111 may
include a biological sequence identifier 206 to identify a target
biological sequence 112. For example, a user accessing the host
computer 110 may display the biological sequence 112 or data
corresponding to the biological sequence 112 on a display device.
Alternatively, the simulated annealing module 111 may,
automatically or based on predetermined commands to identify a
predetermined biological sequence or a predetermined class or group
of biological sequences, search one or more of the host computer
110, storage 140 and 170, and network computers 130 and 180 to
identify biological sequences to be target biological sequences
112. In the present specification and claims, a "target biological
sequence" is defined as a biological sequence that is selected by a
user or program to be subject to ranking of related data, and in
some embodiments simulated annealing, as described in embodiments
of the disclosure.
[0025] The simulated annealing module 111 may include a reference
identifier 201, a reference generator 202, a relevance identifier
203 and a reference/data associator 204. The reference identifier
201 may search memory of a device, such as the host computer 110,
of connected storage devices 140, of devices 130 connected to a
network 120, or of devices 170 and 180 connected to the Internet
160 for references to data, such as URLs that refer to data at a
particular location. In addition, in circumstances in which data
does not correspond to a reference, or the reference is not in a
format usable by the simulated annealing module 111, the reference
generator 202 may search memory of a device, such as the host
computer 110, of connected storage devices 140, of devices 130
connected to a network 120, or of devices 170 and 180 connected to
the Internet 160 for data, such as documents, biographical data,
data related to analysis tools, and any other data. The reference
generator 202 may then generate a reference, such as a URL that
points to a location of the data.
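The division of work between the reference identifier 201 and the reference generator 202 may be sketched as follows. The DataItem type, the ensure_reference function and the use of a file URL are hypothetical choices for this sketch; the disclosure states only that a reference such as a URL may be generated when none exists or when the existing one is not usable.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import quote


@dataclass
class DataItem:
    path: str                        # hypothetical absolute location of the data
    reference: Optional[str] = None  # existing reference (e.g. a URL), if any


def ensure_reference(item: DataItem) -> str:
    """Return the item's reference, generating a file URL when none exists."""
    if item.reference:
        # Role of the reference identifier 201: a usable reference was found.
        return item.reference
    # Role of the reference generator 202: build a URL pointing at the data.
    item.reference = "file://" + quote(item.path)
    return item.reference
```

A data item that already carries a reference keeps it unchanged, while an item found without one receives a generated reference that points to its location.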
[0026] The relevance identifier 203 analyzes the data, such as the
data pointed to by the searched references or the data identified
by the reference generator 202, to determine whether the data meets
a threshold level of relevance. The threshold level of relevance
may be based on predetermined criteria, such as a similarity of the
data to a target biological sequence, a source of the data, such as
an organization supplying the data (e.g. university, company,
etc.), an author of the data, a publisher of the data, and a type
of operation performed by execution of the data (such as in the
case of an analysis tool for analyzing biological sequences). The
threshold level of relevance may also be based on a frequency with
which the data is accessed or referenced, a frequency with which
the data is accessed or referenced by predetermined classes, such
as researchers, scientists, professional organizations, etc., or a
frequency with which the data is associated with a target
biological sequence. In other words, the threshold level of
relevance may be related to a target sequence or may include
criteria unrelated to the target sequence. The threshold level of
relevance may be based on the content of the data, such as an
identity of a person or organization that is the subject of
biographical information data, or content of a document or file. In
addition, the threshold level of relevance may be based on usage of
the data, such as how often the data is accessed or referenced or
by whom the data is accessed or referenced.
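One possible reading of the threshold test performed by the relevance identifier 203 is sketched below. It combines one criterion related to the target sequence with one usage criterion, as the paragraph above permits. The Candidate type, the two normalized scores, the weights and the threshold value are assumptions for illustration and are not taken from the disclosure.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    reference: str
    sequence_similarity: float  # 0..1, similarity of the data to the target sequence
    access_frequency: float     # 0..1, normalized usage of the data


def relevance(c: Candidate, w_similarity: float = 0.7, w_usage: float = 0.3) -> float:
    """Combine a sequence-related and a usage-related criterion (weights are assumed)."""
    return w_similarity * c.sequence_similarity + w_usage * c.access_frequency


def filter_relevant(candidates, threshold=0.5):
    """Keep only candidates whose relevance exceeds the predetermined threshold."""
    return [c for c in candidates if relevance(c) > threshold]
```

Only the candidates returned by filter_relevant would then be passed on for association with a reference and a probe sequence.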
[0027] Based on a determination that the data meets a threshold
level of relevance, the reference/data associator 204 associates a
reference, either identified or generated, with the data. For
example, the reference and data, or information identifying the
data, may be stored in a reference table 205. The probe generator
207 may generate a probe or probe sequence and the probe/reference
associator 208 may associate the probe sequence with the reference,
such as by adding the probe sequence to the reference table 205. In
one embodiment, the probe represents a degree to which the
reference, or the data associated with the reference, corresponds
to a particular segment of a biological sequence to which the data
pertains. The segment of the biological sequence to which the data
pertains may be a portion less than the entire biological sequence,
but in some examples the segment could correspond to the entire
biological sequence. In one embodiment, the probe is identified by
a sequence that is complementary to a sequence of a segment of the
biological sequence to which the data pertains. For example, if a
segment of the biological sequence to which the data pertains has a
configuration of "GGGGAAAATT," the probe may correspond to a
complementary probe sequence, or "CCCCTTTTAA." Accordingly, the
host computer 110 may match references and data to portions of
biological sequences pertaining to each reference according to a
sequence indicated by a probe sequence. In other embodiments, the
probe identifies spatially, numerically or graphically a portion of
the biological sequence to which it pertains.
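The complementary probe sequence in the example above can be derived mechanically. The following sketch assumes a DNA alphabet and follows the example in the text, which complements each base in place without reversing the strand. The function name, the table layout and the example URL are hypothetical.

```python
_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def probe_sequence(segment: str) -> str:
    """Return the base-by-base complement of a DNA segment.

    Follows the example in the text, which complements each base in place
    (no reversal of the strand direction).
    """
    return segment.upper().translate(_DNA_COMPLEMENT)


# Hypothetical reference table row: reference, kind of data, probe sequence.
reference_table = [
    {"reference": "http://example.org/paper", "data": "document",
     "probe": probe_sequence("GGGGAAAATT")},
]
```

With the segment "GGGGAAAATT" from the example, probe_sequence yields "CCCCTTTTAA", which is the value stored with the reference in the table row.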
[0028] The rank calculator 209 may calculate a rank of each probe
sequence, or each reference, or of the data corresponding to each
reference and probe sequence. The ranking may be based on one or
more criteria, and the criteria may be weighted so that different
criteria affect the ranking more than other criteria. For example,
a user may set up a profile in the host computer 110 or the
simulated annealing module 111, and the user may indicate which
indicia the user would prefer to be given the most weight. In one
embodiment, the weighting criteria are analogous to biological
characteristics of a probe and biological sequence in a biological
annealing process.
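The weighting of criteria by a user profile may be sketched as a weighted average followed by a sort. The disclosure does not state how the rank calculator 209 combines criteria, so the averaging, the criterion names and the function names below are assumptions for illustration.

```python
def weighted_score(criteria: dict, weights: dict) -> float:
    """Weighted average of criterion scores; missing criteria count as 0."""
    total = sum(weights.values())
    if total == 0:
        raise ValueError("at least one weight must be non-zero")
    return sum(w * criteria.get(name, 0.0) for name, w in weights.items()) / total


def rank_probes(probes: dict, weights: dict) -> list:
    """Return probe identifiers ordered from highest to lowest score."""
    return sorted(probes, key=lambda p: weighted_score(probes[p], weights), reverse=True)
```

In this sketch, a user who gives the similarity criterion a larger weight than the other criteria may obtain a different ordering of the same probe sequences than a user who weights importance or popularity more heavily.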
[0029] One criterion, which may be analogous to a biological
complementarity requirement, is a determination of a similarity,
resemblance, or overlap of the data or the probe with a segment of
the target biological sequence. For example, a document may
explicitly describe the segment of the target biological sequence
or an analysis tool may have been used to analyze the segment of
the target biological sequence. Alternatively, a document may
describe a similar, but not identical, biological sequence, which
may result in a lower ranking.
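The complementarity criterion may be illustrated by counting the positions at which a probe base pairs with the corresponding base of the segment. This is a simplified sketch that assumes DNA sequences compared position by position without gaps or shifts; the disclosure does not prescribe a particular similarity measure, and the function name is hypothetical.

```python
def complementarity(probe: str, segment: str) -> float:
    """Fraction of aligned positions where the probe base pairs with the segment base.

    Assumption: both strings are DNA, the comparison covers only the
    overlapping prefix, and no gaps or shifts are considered.
    """
    pairs = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}
    n = min(len(probe), len(segment))
    if n == 0:
        return 0.0
    matches = sum((p, s) in pairs for p, s in zip(probe.upper(), segment.upper()))
    return matches / n
```

A probe derived from a document that describes a similar, but not identical, sequence would receive a score below 1.0 under this measure, which is consistent with the lower ranking described above.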
[0030] Another criterion, which may be analogous to a binding
affinity of a probe in a biological annealing, is a determination
of importance or prestige of a probe, data or reference. The
importance of the data may be determined based on information about
the data that does not necessarily relate to the target biological
sequence. For example, the importance of the data may be based
on one or more of a number of citations the authors of a document
have received, the identities of people or organizations that have
referenced a document, a university or organization where the data
was generated, such as where research was conducted, a type of
analysis performed in a document, a name of an author of a
document, or any other factors that may provide information
regarding the prestige or importance of data in a field. When the
data relates to an analysis tool, for example, the importance of
the data may be determined based on a type of analysis performed by
the tool, a university or organization where the analysis tool was
developed, a creator of the analysis tool, or any other factors
that may provide information regarding the prestige or importance
of the analysis tool.
[0031] Another criterion, which may be analogous to probe mobility
in a biological annealing, is a determination of a popularity of
data. The determination may take into account a frequency with
which the data is accessed or cited, such as a frequency with which
a document is cited in other publications or a frequency with which
an analysis tool is used in a field. In an embodiment in which the
data is biographical information, such as information about a
researcher, the determination may consider a frequency with which
the researcher is cited. In other words, while determining the
importance or prestige of the data relates to factors that are not
directly associated with the data (such as a prestige of the source
of the data), the popularity of the data may relate to the data
itself.
[0032] Another criterion, which may be analogous to occupancy
constraints in a biological annealing, is a determination of a
historical applicability of data to a target biological sequence,
to the probe or to other probe sequences. For example, the
determination may be based on a frequency with which an analysis
tool has been used to analyze a target segment of the biological
sequence, or a frequency with which a document has been cited in
reference to the target segment of the biological sequence.
[0033] Although a few examples of ranking criteria have been
provided, embodiments of the present disclosure encompass any
ranking criteria including prior uses of a tool, prior citations of
a document, a quality of citations to the data, affiliations of an
author of data, ease-of-use of a tool, cost to implement an
analysis tool, number of software or tool citations, a date of
citations or references to the data, votes or other determinations
of importance of data by crowd sourcing, content of user comments
about the data, etc. In addition, in one embodiment a stochastic
element may be introduced to the rankings to allow for optimization
away from local minima.
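The stochastic element mentioned above could take many forms. The
following Python sketch assumes, for illustration only, that zero-mean
Gaussian noise scaled by a "temperature" parameter is added to each
rank before sorting. The function name `perturbed_order`, the use of
Gaussian noise, and the temperature parameter are assumptions and are
not taken from the disclosure.

```python
import random
from typing import Dict, List


def perturbed_order(ranks: Dict[str, float], temperature: float,
                    rng: random.Random) -> List[str]:
    """Return probe names sorted by rank after adding random noise.

    A temperature of 0 leaves the ordering unchanged; larger values
    allow lower-ranked probes to occasionally move ahead.
    """
    noisy = {name: score + rng.gauss(0.0, temperature)
             for name, score in ranks.items()}
    return sorted(noisy, key=noisy.get, reverse=True)


ranks = {"probe_a": 0.9, "probe_b": 0.6, "probe_c": 0.3}
rng = random.Random(0)  # seeded so that runs are repeatable
cold = perturbed_order(ranks, 0.0, rng)
hot = perturbed_order(ranks, 0.5, rng)
```

Lowering the temperature over successive iterations would resemble a
conventional simulated annealing schedule, although no such schedule is
required by this sketch.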
[0034] In one embodiment, the probe sequence includes or has
attached to it the data associated with the ranking criteria. When
a target biological sequence is identified, it may be compared with
all of the available probe sequences, and the available probe
sequences may be ranked based on a similarity with segments of the
target biological sequence. Accordingly, in one embodiment, the
data and references identified in the system are not directly
compared with data associated with the target biological sequence.
Instead, a probe sequence having stored therein, attached thereto,
or which inherently includes the ranking criteria data is compared
to segments of the target biological sequence.
[0035] The probe sequences may be generated by systems or users
that generated the data and/or references, or the probe sequences
may be generated by a system or user that identifies a target
biological sequence. For example, in one embodiment, a system that
identifies a target biological sequence searches a network for
previously-generated probe sequences and performs the ranking
operation. In another embodiment, the system that identifies the
target biological sequence may search the network, identify
relevant data, and generate the probe sequences corresponding to
the relevant data. Then the data, references or generated probe
sequences may be compared to predetermined criteria and to the
target biological sequence. In yet another embodiment, a system may
combine both analysis of pre-generated probe sequences with the
generation of new probe sequences based on newly-identified data or
references in a network.
[0036] Once the data, references or probe sequences have been
ranked by the rank calculator 209, the annealing display generator
210 generates a graphical display of the ranking. The graphical
display may display an icon or other representation of the data,
reference, or probe and of the target biological sequence, or one
or more target segments of the biological sequence. The annealing
display generator may display the ranking by displaying icons
associated with data having a higher ranking as being located
closer to a corresponding segment of the target biological
sequence, while icons associated with data having a lower ranking
are located farther from the segment of the target biological
sequence.
[0037] In embodiments of the present disclosure, the data may be
analyzed by the simulated annealing module 111 to determine
relevant data and to rank the data by analyzing one or more of key
words in the data, frequency of keywords, groups of key words,
frequency of groups of keywords, metadata associated with the data,
or any other content of the data or content related to the
data.
[0038] According to embodiments of the present disclosure, data,
references and probe sequences may be competitively ranked to
ensure that data, references and probe sequences determined to be
of highest interest to a user are more closely associated with a
target biological sequence being analyzed by the user.
[0039] In one embodiment, in addition to annealing one or more
probe sequences to a target biological sequence, the simulated
annealing module 111 may anneal one or more probe sequences to one
or more other probe sequences. For example, if one probe sequence
represents an analysis tool, another probe sequence representing a
program or tool for improving the efficiency of the analysis tool
may be simulated as being annealed to the first probe sequence. In
another example, if a first probe sequence represents a software
application, one or more probe sequences representing journal
citations including formulas or analysis using the software
application may be simulated as being annealed to the first probe
sequence.
[0040] In embodiments of the present disclosure, a user may
determine settings, or may generate a profile, to adjust or alter
the ranking and display of data, references, and probe sequences.
FIG. 3 illustrates a display 300 or graphical user interface (GUI)
300 which may be displayed on an electronic display device, such as
a computer monitor, to allow a user to set preferred weights. The
display 300 includes rank criteria 301a, 301b, 301c and 301d. In
FIG. 3, the rank criteria include "similarity to target sequence"
301a, "importance of reference" 301b, "popularity of reference"
301c, and "historical applicability to target sequence" 301d.
However, embodiments of the present disclosure encompass any
criteria, including pre-set criteria or criteria generated by a
user.
[0041] FIG. 3 further illustrates sub-ranking icons 302a to 302d,
which may allow a user to further specify ranking preferences. For
example, under the sub-ranking icon 302b, a user may specify that
in determining an importance of a reference, the organization with
which an author or creator is affiliated is more important, and
receives a higher weight, than a total number of citations to the
author. A user may also set minimum standards, such as a minimum
number of citations to data or a reference that is required for the
data or reference to obtain any ranking.
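The sub-ranking preferences and minimum standards described above
might be sketched as follows. The function name `importance_score`,
the dictionary layout of a reference, the signal names, and the use of
`None` to indicate that a reference obtains no ranking are all
assumptions made for illustration.

```python
from typing import Dict, Optional


def importance_score(ref: dict, sub_weights: Dict[str, float],
                     min_citations: int = 0) -> Optional[float]:
    """Weighted importance of a reference, or None if it fails the minimum.

    `ref` is assumed to hold a citation count under "citations" and
    normalized (0..1) signals under "signals".
    """
    if ref["citations"] < min_citations:
        return None  # below the minimum standard: no ranking at all
    total = sum(sub_weights.values())
    if total == 0:
        return 0.0
    return sum(ref["signals"][name] * w for name, w in sub_weights.items()) / total


# Hypothetical preference: affiliation counts three times as much as
# the total number of citations to the author.
sub_weights = {"affiliation": 3.0, "author_citations": 1.0}
paper = {"citations": 12, "signals": {"affiliation": 0.8, "author_citations": 0.3}}

score = importance_score(paper, sub_weights, min_citations=5)
rejected = importance_score(paper, sub_weights, min_citations=50)
```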
[0042] The display 300 may further include fields 303a to 303d that
are able to be modified by a user to adjust the weighting desired
by a user. In one embodiment, the rank calculator 209 utilizes an
algorithm to combine a user's selected weight of one or more
criteria in combination with information contained within the
relevant data or references, or metadata associated with the data
or references, to calculate a final ranking of the data, reference,
or probe sequence.
[0043] FIGS. 4A and 4B illustrate displays 400a and 400b of icons
402, 403 and 404 representing data, references or probe sequences
annealing to target biological sequence 401 based on a ranking of
the data, references or probe sequences. Alternatively, in addition
to annealing to the target biological sequence 401, one or more of
the icons 402, 403 and 404 may be annealed to other icons among
402, 403 and 404, representing the annealing of a probe sequence to
another probe sequence that is annealed to the target biological
sequence 401. Referring to FIG. 4A, icons 402, 403 and 404 having
different visual characteristics, such as different cross-hatching,
different colors, different shapes or different graphic
representations may correspond to different types of data or
references. For example, an icon 402 having a first type of
cross-hatching may correspond to a document, an icon 403 having a
second type of cross-hatching may correspond to an application and
an icon 404 having a third type of cross-hatching may correspond to
biographical information about a person, such as a researcher.
Other types of data that may be represented by the icons 402, 403
and 404 include data about an organization, such as a company or
university, computer program information, project information about
a research project, information about web pages that may contain
relevant information, etc.
[0044] The icons are displayed to be vertically (in FIGS. 4A and
4B) aligned with a segment of the target biological sequence 401
according to the probe sequence associated with the data and
reference represented by the icons 402, 403 and 404. In addition,
the icons are displayed as being a distance away from a segment of
the target biological sequence 401 (in a horizontal direction in
FIGS. 4A and 4B) based on a ranking of the data or reference
associated with the icons. For example, the icons labeled 402 and
404 may be associated with the same or a very similar segment of
the biological sequence, as indicated by the close vertical
alignment of icons 402 and 404. However, icon 404 may be associated
with data having a higher ranking than icon 402, as illustrated by
icon 404 being located closer to the target biological sequence
than icon 402 in the horizontal direction.
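The placement rule described above can be sketched as a mapping from a
segment position and a rank to display coordinates. In the following
Python sketch, the class name `Icon`, the function name `layout`, the
`spacing` and `max_distance` parameters, and the linear relationship
between rank and distance are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Icon:
    label: str
    segment_start: int  # index of the associated segment in the target
    rank: float         # assumed 0..1, higher means more relevant


def layout(icons: List[Icon], spacing: float = 10.0,
           max_distance: float = 100.0) -> Dict[str, Tuple[float, float]]:
    """Map each icon to (position along the sequence, distance from it)."""
    positions = {}
    for icon in icons:
        rank = min(max(icon.rank, 0.0), 1.0)   # clamp to the assumed range
        along = icon.segment_start * spacing   # alignment with the segment
        away = (1.0 - rank) * max_distance     # higher rank, smaller distance
        positions[icon.label] = (along, away)
    return positions


icons = [Icon("402", 12, 0.5), Icon("404", 12, 0.9), Icon("403", 40, 0.7)]
positions = layout(icons)
```

In this sketch the icons labeled 402 and 404 share the same position
along the sequence, and the icon labeled 404 is placed closer to the
sequence because of its higher rank.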
[0045] In one embodiment, a user may retrieve the data represented
by the icons 402, 403 and 404, or may be provided with information
regarding where the data is located, by selecting the icons 402,
403 or 404 with a cursor, touch, or any other user interface. In
one embodiment, different ranking characteristics may change an
appearance of the icons 402, 403 and 404. For example, an icon
representing data that is referenced often may have a larger shape
than data that is referenced seldom. An icon representing data that
is available by clicking an icon may have a different outline than
data that is not. An icon representing a person may have an image
of the person. An icon representing a product, such as an analysis
tool or program, may have an icon or image associated with the tool
or program, such as a trademark.
[0046] In one embodiment, if a segment or adjacent segments of the
target biological sequence 401 include a relatively large number of
icons, the display 400a may generate a blob 405. When a user moves
a cursor over the blob 405 or performs any other action for
selecting the blob 405, the individual icons may be shown and
selected.
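A simplified sketch of the blob behavior follows. It assumes that
icons are grouped by a single segment index and that a blob is
generated when a group exceeds a hypothetical `max_icons` limit. The
merging of adjacent segments mentioned above is not modeled, and the
function name `collapse_into_blobs` is an assumption.

```python
from collections import defaultdict
from typing import Dict, List


def collapse_into_blobs(icon_segments: Dict[str, int],
                        max_icons: int = 3) -> List[dict]:
    """Replace crowded segments with one blob item that lists its members."""
    by_segment = defaultdict(list)
    for icon_id, segment in icon_segments.items():
        by_segment[segment].append(icon_id)

    items = []
    for segment in sorted(by_segment):
        members = by_segment[segment]
        if len(members) > max_icons:
            # Too many icons for this segment: show a single blob instead.
            items.append({"type": "blob", "segment": segment, "members": members})
        else:
            items.extend({"type": "icon", "segment": segment, "id": m}
                         for m in members)
    return items


icon_segments = {"a": 1, "b": 1, "c": 1, "d": 1, "e": 7}
items = collapse_into_blobs(icon_segments, max_icons=3)
```

Selecting the blob item could then expand its `members` list into
individual icons.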
[0047] FIG. 4B illustrates a display 400b of the same target
biological sequence 401 as in FIG. 4A, but the ranking preferences
are different corresponding, for example, to preferences selected
by a different user. Accordingly, the icons 402, 403 and 404 may be
arranged differently and may have different numbers than in FIG.
4A. In embodiments of the present disclosure, a user may modify
preferences of the information that the user considers important to
personalize information displayed to the user related to a target
biological sequence 401.
[0048] FIG. 5 illustrates a display 500 according to another
embodiment of the present disclosure. In FIG. 5, the target
biological sequence 501 is displayed by letters representing, for
example, nucleotides, and corresponding probe sequences 502 are
represented by complementary letters, or nucleotides that may bond
with the nucleotides of the target biological sequence 501. Icons
503, 504, 505 and 506 represent different types of data, such as
publications, analysis tools, biographical information, and web
page information. The icons 503, 504, 505 and 506 may be located in
a horizontal direction (in FIG. 5) based on a segment of the target
biological sequence 501 to which the data represented by the icons
503, 504, 505 and 506 is most closely related. The icons 503, 504,
505 and 506 may be separated from the target biological sequence
501 by a distance determined by the ranking of the data or
reference associated with each of the icons 503, 504, 505 and 506.
[0049] As illustrated in FIG. 5, the icons 503 to 506 may contain
information related to the data that the icon represents. For
example, the icons may contain numbers to indicate a number of
citations to the data by particular sources. While examples of
displays have been provided for purposes of description,
embodiments encompass any type of display in which a user may see
an importance of data relative to a target biological sequence
based on a distance of an icon representing the data from the
target biological sequence. In one embodiment, a probe sequence may
be further bonded to one or more additional probe sequences. For
example, a user may move a cursor over an icon or select the icon,
and one or more additional linked icons may appear, corresponding
to additional data that is related to the data represented by the
selected icon. In one embodiment, the probe sequence may be treated
as a target biological sequence, and data may be analyzed and
ranked with reference to the probe sequence in the same manner as
for the original biological sequence.
[0050] FIG. 6 illustrates an example of a table 600 according to an
embodiment of the present disclosure. The table 600 may correspond
to the reference table 205 of FIG. 2, for example. The table 600
associates a reference, such as a URL, URN, or other address, link
or locator, with relevant data and a probe sequence. Examples of
relevant data have been discussed previously, and in FIG. 6 the
probe sequence corresponds to a biological sequence that is
complementary to a segment of a target biological sequence. The
table 600 may further include icon information for displaying an
icon representing the data or reference, or any other information
to be associated with the relevant data and reference. While FIGS.
2 and 6 illustrate tables to associate data, references to the data
and probe sequences, embodiments of the present disclosure
encompass any data structures for associating data, such as arrays,
pointers, or any other types of data structures with which a person
or system could associate data with references to the data and with
probe sequences.
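As one example of such a data structure, the following Python sketch
represents each row of a table like the table 600 as a record. The
class name `ReferenceEntry`, its field names, the example URLs, and
the helper `entries_for_probe` are hypothetical and are provided only
to illustrate the association of a reference, relevant data, and a
probe sequence.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ReferenceEntry:
    reference: str               # URL, URN, or other locator
    data_summary: str            # description of the relevant data
    probe_sequence: str          # e.g. complementary to a target segment
    icon: Optional[str] = None   # optional icon information for display


# Placeholder rows; the locators below are not real resources.
table = [
    ReferenceEntry("https://example.org/paper-1", "journal article", "TGCA", "document"),
    ReferenceEntry("https://example.org/tool-1", "analysis tool", "TGCA", "application"),
    ReferenceEntry("https://example.org/bio-1", "researcher biography", "GGCC"),
]


def entries_for_probe(rows: List[ReferenceEntry], probe: str) -> List[ReferenceEntry]:
    """Return every entry associated with the given probe sequence."""
    return [row for row in rows if row.probe_sequence == probe]


matches = entries_for_probe(table, "TGCA")
```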
[0051] FIG. 7 illustrates a flow diagram of a method according to
an embodiment of the present disclosure. In block 701, a reference,
such as an address or pointer to data, may be found by searching
memory in a computer, searching storage devices, searching devices
connected to a host device connected to a network, such as the
Internet, etc. In addition, references to data found in one or more
devices may be generated when no previous reference is found, or
when a particular type or format of reference is desired.
[0052] In block 702, data is associated with the reference. For
example, an entry may be formed in a table or another data
structure may be formed to associate the reference with the data
to which the reference points. In block 703, it may be determined
whether the data is relevant. In other words, a threshold
determination of relevance of the data may be made. The threshold
level of relevance may be based on predetermined criteria, such as
a similarity of the data to a target biological sequence, a source
of the data, such as an organization supplying the data (e.g.
university, company, etc.), an author of the data, a publisher of
the data, and a type of operation performed by execution of the
data (such as in the case of an analysis tool for analyzing
biological sequences). The threshold level of relevance may also be
based on a frequency with which the data is accessed or referenced,
a frequency with which the data is accessed or referenced by
predetermined classes, such as researchers, scientists,
professional organizations, etc., or a frequency with which the
data is associated with a target biological sequence. In other
words, the threshold level of relevance may be related to a target
sequence or may include criteria unrelated to the target sequence.
The threshold level of relevance may be based on the content of the
data, such as an identity of a person or organization based on
stored biographical information, or content of a document or file.
In addition, the threshold level of relevance may be based on usage of
the data, such as how often the data is accessed or referenced or
by whom the data is accessed or referenced.
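The threshold determination of block 703 might be sketched as a simple
filter. The sketch below assumes that a numeric relevance level has
already been computed for each reference by some means not shown. The
function names, the numeric scale, and the reference identifiers are
hypothetical.

```python
from typing import Iterable, List, Tuple


def passes_threshold(relevance: float, threshold: float) -> bool:
    """True when the relevance level is greater than the threshold."""
    return relevance > threshold


def select_relevant(candidates: Iterable[Tuple[str, float]],
                    threshold: float) -> List[str]:
    """Keep only references whose relevance level exceeds the threshold."""
    return [ref for ref, relevance in candidates
            if passes_threshold(relevance, threshold)]


# Hypothetical (reference, relevance level) pairs.
candidates = [("ref-1", 0.82), ("ref-2", 0.40), ("ref-3", 0.55)]
kept = select_relevant(candidates, threshold=0.5)
```

References that are not kept would correspond to the branch in which
processing of the data ends.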
[0053] If it is determined in block 703 that the data does not meet
the threshold level of relevance, the process with respect to that
data ends in block 704. On the other hand, if the data is
determined to be sufficiently relevant, the data and reference may
be associated with a probe sequence in block 705. The probe
sequence may identify at least a portion of a target biological
sequence to which the data pertains. In one embodiment, the probe
sequence is represented by a complementary sequence to a
corresponding segment of the target biological sequence. For
example, if the biological sequence is a series of nucleotides, the
probe sequence may be a complementary series of nucleotides. In
another embodiment, the probe sequence is merely data identifying
the portion of the target biological sequence to which the data
most closely pertains. In embodiments of the present disclosure,
the data may pertain to one segment of the target biological
sequence or to more than one segment.
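For a target biological sequence that is a series of DNA nucleotides,
a complementary probe sequence could be derived as in the following
Python sketch. The sketch assumes the standard A-T and C-G base
pairing and a position-by-position complement without reversal. The
function names `complementary_probe` and `find_annealing_site` are
hypothetical.

```python
_COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complementary_probe(segment: str) -> str:
    """Return the position-by-position complement of a DNA segment."""
    return "".join(_COMPLEMENT[base] for base in segment.upper())


def find_annealing_site(target: str, probe: str) -> int:
    """Return the index in `target` of the segment the probe complements.

    Returns -1 when no exactly complementary segment exists.
    """
    # Complementing the probe recovers the target segment it pairs with.
    return target.upper().find(complementary_probe(probe))


probe = complementary_probe("ACGT")
site = find_annealing_site("GGACGTCC", probe)
```

A probe sequence that merely identifies a portion of the target, as in
the other embodiment described above, could instead store the start
index and length of the segment.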
[0054] In block 706, the data is ranked based on predetermined
criteria to determine an affinity or bond between the data,
reference, or probe sequence and a segment of the target biological
sequence. The ranking may be based on one or more criteria, and the
criteria may be weighted so that different criteria affect the
ranking more than other criteria. Examples of ranking criteria
include a similarity of the data to the target biological sequence,
a relevance of the content of the data to the target biological
sequence, an importance or prestige of the data or reference, a
popularity of the data or reference, and a historical applicability
of the data or reference to target biological sequences.
[0055] In one embodiment, a user may add criteria, remove criteria,
and adjust a weight of criteria used to rank relevant data,
references to the data and probe sequences associated with the
data. In embodiments of the present disclosure, different users may
generate different profiles or may otherwise indicate different
preferences for ranking information related to a target biological
sequence. In block 707, the relevant data may be bound to the
target biological sequence, or segments of the target biological
sequence, based on the ranking. In particular, the target
biological sequence may be displayed and icons may be displayed
with the target biological sequence representing the data,
references and probe sequences.
[0056] A graphical display may display an icon or other
representation of the data, reference, or probe and of the target
biological sequence, or one or more target segments of the
biological sequence. The ranking of the data, references or probe
sequences may be displayed by displaying icons associated with data
having a higher ranking as being located closer to a corresponding
segment of the target biological sequence, while icons representing
data having a lower ranking are located farther from the segment of
the target biological sequence.
[0057] FIG. 8 illustrates a block diagram of a computer system 800
according to another embodiment of the present disclosure. The
computer 800 may correspond to the host computer 110 of FIG. 1, for
example. The methods described herein can be implemented in
hardware, software (e.g., firmware), or a combination thereof. In
an exemplary embodiment, the methods described herein are
implemented in hardware as part of the microprocessor of a special
or general-purpose digital computer, such as a personal computer,
workstation, minicomputer, or mainframe computer. The system 800
therefore may include a general-purpose computer or mainframe 801
capable of testing a reliability of a base program by gradually
increasing a workload of the base program over time.
[0058] In an exemplary embodiment, in terms of hardware
architecture, as shown in FIG. 8, the computer 801 includes one or
more processors 805, memory 810 coupled to a memory controller 815,
and one or more input and/or output (I/O) devices 840, 845 (or
peripherals) that are communicatively coupled via a local
input/output controller 835. The input/output controller 835 can
be, for example, one or more buses or other wired or wireless
connections, as is known in the art. The input/output controller
835 may have additional elements, which are omitted for simplicity
in description, such as controllers, buffers (caches), drivers,
repeaters, and receivers, to enable communications. Further, the
local interface may include address, control, and/or data
connections to enable appropriate communications among the
aforementioned components. The input/output controller 835 may
include a plurality of sub-channels configured to access the output
devices 840 and 845. The sub-channels may include, for example,
fiber-optic communications ports.
[0059] The processor 805 is a hardware device for executing
software, particularly that stored in storage 820, such as cache
storage, or memory 810. The processor 805 can be any custom made or
commercially available processor, a central processing unit (CPU),
an auxiliary processor among several processors associated with the
computer 801, a semiconductor based microprocessor (in the form of
a microchip or chip set), a macroprocessor, or generally any device
for executing instructions.
[0060] The memory 810 can include any one or combination of
volatile memory elements (e.g., random access memory (RAM, such as
DRAM, SRAM, SDRAM, etc.)) and nonvolatile memory elements (e.g.,
ROM, erasable programmable read only memory (EPROM), electronically
erasable programmable read only memory (EEPROM), programmable read
only memory (PROM), tape, compact disc read only memory (CD-ROM),
disk, diskette, cartridge, cassette or the like, etc.). Moreover,
the memory 810 may incorporate electronic, magnetic, optical,
and/or other types of storage media. Note that the memory 810 can
have a distributed architecture, where various components are
situated remote from one another, but can be accessed by the
processor 805.
[0061] The instructions in memory 810 may include one or more
separate programs, each of which comprises an ordered listing of
executable instructions for implementing logical functions. In the
example of FIG. 8, the instructions in the memory 810 include a
suitable operating system (O/S) 811. The operating system 811
essentially controls the execution of other computer programs and
provides scheduling, input-output control, file and data
management, memory management, and communication control and
related services.
[0062] In an exemplary embodiment, a conventional keyboard 850 and
mouse 855 can be coupled to the input/output controller 835. Other
I/O devices 840, 845 may include input and output devices, for
example but not limited to a printer, a scanner, a microphone, and
the like. Finally, the I/O devices 840, 845 may
further include devices that communicate both inputs and outputs,
for instance but not limited to, a network interface card (NIC) or
modulator/demodulator (for accessing other files, devices, systems,
or a network), a radio frequency (RF) or other transceiver, a
telephonic interface, a bridge, a router, and the like. The system
800 can further include a display controller 825 coupled to a
display 830. In an exemplary embodiment, the system 800 can further
include a network interface 860 for coupling to a network 865. The
network 865 can be an IP-based network for communication between
the computer 801 and any external server, client and the like via a
broadband connection. The network 865 transmits and receives data
between the computer 801 and external systems. In an exemplary
embodiment, network 865 can be a managed IP network administered by
a service provider. The network 865 may be implemented in a
wireless fashion, e.g., using wireless protocols and technologies,
such as WiFi, WiMax, etc. The network 865 can also be a
packet-switched network such as a local area network, wide area
network, metropolitan area network, Internet network, or other
similar type of network environment. The network 865 may be a fixed
wireless network, a wireless local area network (LAN), a wireless
wide area network (WAN), a personal area network (PAN), a virtual
private network (VPN), intranet or other suitable network system
and includes equipment for receiving and transmitting signals.
[0063] When the computer 801 is in operation, the processor 805 is
configured to execute instructions stored within the memory 810, to
communicate data to and from the memory 810, and to generally
control operations of the computer 801 pursuant to the
instructions.
[0064] In an exemplary embodiment, the methods described herein can
be implemented with any or a combination of the following
technologies, which are each well known in the art: a discrete
logic circuit(s) having logic gates for implementing logic
functions upon data signals, an application specific integrated
circuit (ASIC) having appropriate combinational logic gates, a
programmable gate array(s) (PGA), a field programmable gate array
(FPGA), etc.
[0065] In embodiments of the present disclosure, the simulated
annealing module 111 may comprise program code stored in the memory
810 and executed by the processor 805. The data and references
pointing to the data may be stored in the computer 801 or may be
stored on other computers, servers, databases, or other network
devices connected to the computer 801 via a network. The simulated
annealing module 111 may further include hardware components, such
as processors, memory and logic chips or structures for
implementing the simulated annealing.
[0066] As described above, embodiments can be embodied in the form
of computer-implemented processes and apparatuses for practicing
those processes. An embodiment may include a computer program
product 900 as depicted in FIG. 9 on a computer readable/usable
medium 902 with computer program code logic 904 containing
instructions embodied in tangible media as an article of
manufacture. Exemplary articles of manufacture for computer
readable/usable medium 902 may include floppy diskettes, CD-ROMs,
hard drives, universal serial bus (USB) flash drives, or any other
computer-readable storage medium, wherein, when the computer
program code logic 904 is loaded into and executed by a computer,
the computer becomes an apparatus for practicing the embodiments.
Embodiments include computer program code logic 904, for example,
whether stored in a storage medium, loaded into and/or executed by
a computer, or transmitted over some transmission medium, such as
over electrical wiring or cabling, through fiber optics, or via
electromagnetic radiation, wherein, when the computer program code
logic 904 is loaded into and executed by a computer, the computer
becomes an apparatus for practicing the embodiments. When
implemented on a general-purpose microprocessor, the computer
program code logic 904 segments configure the microprocessor to
create specific logic circuits.
[0067] As will be appreciated by one skilled in the art, aspects of
the present disclosure may be embodied as a system, method or
computer program product. Accordingly, aspects of the present
disclosure may take the form of an entirely hardware embodiment, an
entirely software embodiment (including firmware, resident
software, micro-code, etc.) or an embodiment combining software and
hardware aspects that may all generally be referred to herein as a
"circuit," "module" or "system." Furthermore, aspects of the
present disclosure may take the form of a computer program product
embodied in one or more computer readable medium(s) having computer
readable program code embodied thereon.
[0068] Any combination of one or more computer readable medium(s)
may be utilized. The computer readable medium may be a computer
readable signal medium or a computer readable storage medium. A
computer readable storage medium may be, for example, but not
limited to, an electronic, magnetic, optical, electromagnetic,
infrared, or semiconductor system, apparatus, or device, or any
suitable combination of the foregoing. More specific examples (a
non-exhaustive list) of the computer readable storage medium would
include the following: an electrical connection having one or more
wires, a portable computer diskette, a hard disk, a random access
memory (RAM), a read-only memory (ROM), an erasable programmable
read-only memory (EPROM or Flash memory), an optical fiber, a
portable compact disc read-only memory (CD-ROM), an optical storage
device, a magnetic storage device, or any suitable combination of
the foregoing. In the context of this document, a computer readable
storage medium may be any tangible medium that can contain, or
store a program for use by or in connection with an instruction
execution system, apparatus, or device.
[0069] Program code embodied on a computer readable medium may be
transmitted using any appropriate medium, including but not limited
to wireless, wireline, optical fiber cable, RF, etc., or any
suitable combination of the foregoing.
[0070] Computer program code for carrying out operations for
aspects of the present disclosure may be written in any combination
of one or more programming languages, including an object oriented
programming language such as Java, Smalltalk, C++ or the like and
conventional procedural programming languages, such as the "C"
programming language or similar programming languages. The program
code may execute entirely on the user's computer, partly on the
user's computer, as a stand-alone software package, partly on the
user's computer and partly on a remote computer or entirely on the
remote computer or server. In the latter scenario, the remote
computer may be connected to the user's computer through any type
of network, including a local area network (LAN) or a wide area
network (WAN), or the connection may be made to an external
computer (for example, through the Internet using an Internet
Service Provider).
[0071] Aspects of the present disclosure are described above with
reference to flowchart illustrations and/or block diagrams of
methods, apparatus (systems) and computer program products
according to embodiments of the present disclosure. It will be
understood that each block of the flowchart illustrations and/or
block diagrams, and combinations of blocks in the flowchart
illustrations and/or block diagrams, can be implemented by computer
program instructions. These computer program instructions may be
provided to a processor of a general purpose computer, special
purpose computer, or other programmable data processing apparatus
to produce a machine, such that the instructions, which execute via
the processor of the computer or other programmable data processing
apparatus, create means for implementing the functions/acts
specified in the flowchart and/or block diagram block or
blocks.
[0072] These computer program instructions may also be stored in a
computer readable medium that can direct a computer, other
programmable data processing apparatus, or other devices to
function in a particular manner, such that the instructions stored
in the computer readable medium produce an article of manufacture
including instructions which implement the function/act specified
in the flowchart and/or block diagram block or blocks.
[0073] The computer program instructions may also be loaded onto a
computer, other programmable data processing apparatus, or other
devices to cause a series of operational steps to be performed on
the computer, other programmable apparatus or other devices to
produce a computer implemented process such that the instructions
which execute on the computer or other programmable apparatus
provide processes for implementing the functions/acts specified in
the flowchart and/or block diagram block or blocks.
[0074] The flowchart and block diagrams in the Figures illustrate
the architecture, functionality, and operation of possible
implementations of systems, methods and computer program products
according to various embodiments of the present disclosure. In this
regard, each block in the flowchart or block diagrams may represent
a module, segment, or portion of code, which comprises one or more
executable instructions for implementing the specified logical
function(s). It should also be noted that, in some alternative
implementations, the functions noted in the block may occur out of
the order noted in the figures. For example, two blocks shown in
succession may, in fact, be executed substantially concurrently, or
the blocks may sometimes be executed in the reverse order,
depending upon the functionality involved. It will also be noted
that each block of the block diagrams and/or flowchart
illustration, and combinations of blocks in the block diagrams
and/or flowchart illustration, can be implemented by special
purpose hardware-based systems that perform the specified functions
or acts, or combinations of special purpose hardware and computer
instructions.
[0075] The terminology used herein is for the purpose of describing
particular embodiments only and is not intended to be limiting of
the invention to the particular embodiments described. As used
herein, the singular forms "a", "an" and "the" are intended to
include the plural forms as well, unless the context clearly
indicates otherwise. It will be further understood that the terms
"comprises" and/or "comprising," when used in this specification,
specify the presence of stated features, integers, steps,
operations, elements, and/or components, but do not preclude the
presence or addition of one or more other features, integers, steps,
operations, elements, components, and/or groups thereof.
[0076] The corresponding structures, materials, acts, and
equivalents of all means or step plus function elements in the
claims below are intended to include any structure, material, or
act for performing the function in combination with other claimed
elements as specifically claimed. The description of the present
disclosure has been presented for purposes of illustration and
description, but is not intended to be exhaustive or limited to the
disclosed embodiments. Many modifications and variations will be
apparent to those of ordinary skill in the art without departing
from the scope and spirit of the embodiments of the present
disclosure.
[0077] While embodiments of the present disclosure have been
described above, it will be understood that those skilled in the
art, both now and in the future, may make various improvements and
enhancements which fall within the scope of the claims which
follow.
Sequence Listing
Number of sequences: 7
SEQ ID NO | Length | Type | Organism | Sequence |
1 | 29 | DNA | Homo sapiens | tgcggacctc ggagggcccc atccccacc |
2 | 30 | DNA | Homo sapiens | acgcctggag cctcccgggg tagggggtgg |
3 | 10 | DNA | Homo sapiens | tgccggaatc |
4 | 10 | DNA | Homo sapiens | gatgggcccc |
5 | 10 | DNA | Homo sapiens | gtcggggaac |
6 | 10 | DNA | Homo sapiens | ggggaaaatt |
7 | 10 | DNA | Homo sapiens | ccccttttaa |
* * * * *