U.S. patent application number 15/917419 was filed with the patent office on 2018-03-09 and published on 2018-07-12 for bioprocess method and system.
The applicant listed for this patent is SYNTHACE LIMITED. Invention is credited to Michael Ian SADOWSKI, Sean Michael WARD.
Publication Number | 20180196918 |
Application Number | 15/917419 |
Family ID | 50686811 |
Filed Date | 2018-03-09 |
Publication Date | 2018-07-12 |
United States Patent Application | 20180196918 |
Kind Code | A1 |
SADOWSKI; Michael Ian; et al. | July 12, 2018 |
BIOPROCESS METHOD AND SYSTEM
Abstract
Methods, systems and apparatus for performing a biological
process are provided, wherein the method comprises implementation
of at least one unit operation, and wherein the unit operation is
defined according to a standardized element structure, the element
structure comprising a plurality of functional section blocks, and
wherein the section blocks comprise at least one of the group
consisting of: imports; parameters; data; physical inputs;
requirements; setup; and execution steps.
Inventors: | SADOWSKI; Michael Ian (London, GB); WARD; Sean Michael (London, GB) |
Applicant: |
Name | City | State | Country | Type |
SYNTHACE LIMITED | London | | GB | |
Family ID: | 50686811 |
Appl. No.: | 15/917419 |
Filed: | March 9, 2018 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
15271592 | Sep 21, 2016 | 9977862 |
15917419 | | |
PCT/US2015/022280 | Mar 24, 2015 | |
15271592 | | |
Current U.S. Class: | 1/1 |
Current CPC Class: | G16B 50/00 20190201 |
International Class: | G06F 19/28 20110101; G06F019/28 |
Foreign Application Data
Date | Code | Application Number |
Mar 24, 2014 | GB | 1405246.8 |
Claims
1. A system for performing a biological process that includes a
step involving one or more of a reproducible and scalable
transformation of a material into a physical product or a
quantification of an amount of a molecule in the material, the
system comprising: (i) a server with at least one processing module
adapted to implement a method for performing the biological process
wherein the method comprises implementation of at least one unit
operation, and wherein the at least one unit operation is defined
according to a standardized element structure, the standardized
element structure comprising a plurality of functional section
blocks, and wherein the plurality of functional section blocks
comprises at least one of the group consisting of: imports;
parameters; data; physical inputs; requirements; setup; and
execution steps; (ii) a data storing mechanism which is accessible
by the at least one processing module for maintaining a record of
standardized elements, wherein each standardized element defines a
unit operation in a biological process; and (iii) a first interface
for accessing the method.
2. The system of claim 1, further comprising a second interface for
controlling an automated laboratory equipment to implement the at
least one unit operation of the biological process to perform the
reproducible and scalable transformation of the material into a
physical product.
3. The system of claim 1, wherein the at least one unit operation
is a self-contained process of the biological process and the
standardized element structure comprises a structure of a
combination of elements that are each reproducible and scalable in
a plurality of workflows including a workflow to perform the
biological process.
4. The system of claim 1, wherein the data storing mechanism
comprises a database.
5. The system of claim 1, wherein the method is provided through a
cloud service.
6. The system of claim 1, wherein the system comprises a website or
a mobile device or computer application to access the method.
7. The system of claim 1, wherein the system is incorporated as
part of a laboratory information management system (LIMS).
8. The system of claim 1, wherein the standardized element
structure further comprises at least one additional section block
selected from the group consisting of: physical outputs; analysis;
and validation steps.
9. The system of claim 1 wherein the standardized element structure
comprises at least section blocks defining: imports; parameters;
data; physical inputs; requirements; setup; and execution
steps.
10. The system of claim 1, wherein the biological process comprises
at least two unit operations, wherein each unit operation is
defined according to a standardized element structure.
11. The system of claim 10, wherein the at least two unit
operations are non-identical.
12. The system of claim 1, wherein the at least one unit operation
is selected from the group consisting of: a conversion; a reaction;
a purification; a construct assembly step; an assay or analysis
including any of a quantification of a product, a by-product or
reagent; a nucleotide or protein/peptide synthesis; a cell culture;
an incubation; a restriction; a ligation; a mutation; an
inoculation; a lysis; a transformation; an extraction; conditioning
of a product; and an amplification.
13. The system of claim 1, wherein the biological process comprises
a manufacturing process.
14. The system of claim 1, wherein the biological process comprises
an analytical process.
15. A system for designing an experiment, the system comprising:
(i) a server with at least one processing module adapted to
implement a method for designing the experiment, wherein the method
comprises determining a reproducible and scalable biological
process for conversion of a selected input to a desired output,
wherein the reproducible and scalable process comprises at least
one unit operation selected from a database that comprises at least
one unit operation, and wherein the at least one unit operation is
defined according to a standardized element structure, the
standardized element structure comprising a plurality of functional
section blocks, and wherein the plurality of functional section
blocks comprises at least one of the group consisting of: imports;
parameters; data; physical inputs; requirements; setup; and
execution steps; (ii) a data storing mechanism which is accessible
by the at least one processing module for maintaining a record of
standardized elements, wherein each standardized element defines a
unit operation in a biological process; and (iii) at least a first
interface for a user to choose the selected input and the desired
output for the experiment, wherein the selected input comprises
physical input and the desired output is selected from either or
both a physical output and information output.
16. The system of claim 15, further comprising a second interface
for controlling an automated laboratory equipment to implement the
at least one unit operation of the biological process to perform
the reproducible and scalable transformation of a material into a
physical product.
17. The system of claim 15, wherein the at least one unit operation
is a self-contained process of the biological process and the
standardized element structure is a structure of a combination of
elements that are each reproducible and scalable in a plurality of
workflows including a workflow to perform the biological
process.
18. The system of claim 15, wherein the data storing mechanism is a
database.
19. The system of claim 15, wherein the method is provided through
a cloud service.
20. The system of claim 15, wherein the system comprises a website
or a mobile device or computer application to access the
method.
21. The system of claim 15, wherein the system is incorporated as
part of a laboratory information management system (LIMS).
Description
[0001] This application is a continuation of U.S. patent
application Ser. No. 15/271,592, filed Sep. 21, 2016, which is a
continuation of PCT/US2015/022280, filed Mar. 24, 2015, which
claims priority of GB1405246.8, filed Mar. 24, 2014. The contents
of the above-identified applications are incorporated herein by
reference in their entirety.
FIELD OF THE INVENTION
[0002] Methods and systems for design and execution of experiments
are considered in this invention, in particular design and
implementation of bioprocess manufacturing via automated laboratory
systems.
BACKGROUND OF THE INVENTION
[0003] When assembling a biological synthetic process, multiple
alternatives typically exist for each of the operations and parts
in the process, such as the structure and identity of the genetic
constructs used, the particular protocol used to perform a step
such as a transformation, purification etc. The question of how to
design the most efficient process is therefore one of choosing a
set of parts and operations, in order to satisfy design criteria
such as maximising yield of the required output.
[0004] There are very large numbers of variables that influence the
overall yield of product in a biological synthetic process, such as
the host organism selected and the particular strain of host
species used, physical factors such as temperature, pH and oxygen
availability and timing of reactions, to name a few. Therefore, the
choice of suitable parts and operations that make up a multi-step
process has to be made in the context of a highly dimensional
design space. Often the combination of variables that work in the
context of one manufacturing facility cannot be easily transposed
to other facilities. This leads to considerable difficulties in
standardisation of bio-processing and represents a key challenge
for the future of synthetic biology. By way of example, a 2012
report in Nature recounted that scientists at biotech company Amgen
had only managed to reproduce around 11% of 53 published
cancer-related studies which they had attempted over the previous
years (Begley C. G. & Ellis L. M., Nature 483, 531-533 (29 Mar.
2012)). Similarly, the pharmaceutical company Bayer has indicated
that in their estimation only 20-25% of published data corresponded
to their own in-house findings (Prinz, F., Schlange, T. &
Asadullah, K. Nature Rev. Drug Discov. 10, 712 (2011)).
[0005] Conventionally, essential process or experimental design
decisions have to date been made arbitrarily based on what is usual
in the art, available or known to the experimenter or manufacturer
at the time of setting up the process or experimental pipeline.
Decisions in biological process design are often habitual or based
upon artisanal know-how passed down within laboratories or
industrial organisations. This is often complicated by time and
resource constraints, leading to trial-and-error development in
which a pipeline is adjusted by exchanging discrete parts and
operations or modifying parameters, in order to improve the
features of the starting pipeline. This results in design decisions
that are often suboptimal or require substantial resources to
identify reagents, operations and parameters that might be merely
satisfactory. Hence, there can be considerable institutional
resistance to change a process once it has been settled upon due to
the inherent uncertainty associated with the optimization strategy
as a whole.
[0006] Despite these problems many successful bioprocesses have
been developed and there is a recognised potential for bio-based
manufacturing to provide enormous benefits across many areas.
Hence, there exists a need in the art--particularly within
synthetic biology--to provide methods and systems that can
facilitate the design of experimental or production pipelines from
the level of the laboratory bench up to and including the
industrial-scale bioreactor. In particular there is a need to
provide methods and systems that can facilitate like-for-like
comparisons between processes as well as standardised approaches
for defining parts and protocols that may be used in experimental
design, bioprocessing and manufacturing. To achieve this, there
exists a need in the art to provide methods and systems that can
facilitate reliable design of experiments from the level of the lab
bench up to and including the industrial-scale bioreactor. These
and other uses, features and advantages of the invention should be
apparent to those skilled in the art from the teachings provided
herein.
SUMMARY OF THE INVENTION
[0007] The present inventors have overcome the problems associated
with the art by providing methods and systems for reproducible and
scalable bioprocess workflows via stacking of smart and reusable
elements.
[0008] Accordingly a first aspect of the invention provides a
method for performing a biological process wherein the method
comprises implementation of at least one unit operation, and
wherein the unit operation is defined according to a standardised
element structure, the element structure comprising a plurality of
functional section blocks, and wherein the section blocks comprise
at least one of the group consisting of: imports; parameters; data;
physical inputs; requirements; setup; and execution steps.
Suitably, the element structure further comprises at least one
additional section block selected from the group consisting of:
physical outputs, analysis and validation steps. Optionally, the
element structure comprises at least the section blocks defining:
imports; parameters; data; physical inputs; requirements; setup;
and execution steps.
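By way of illustration only, the standardised element structure described above could be represented as a record with one field per section block. The following Python sketch is not taken from the application: the class name `Element`, its field types, the helper `populated_sections` and the example values are all hypothetical, chosen only to mirror the section blocks named in this paragraph.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Section-block names follow the text; the grouping constants are illustrative.
CORE_SECTIONS = ("imports", "parameters", "data", "physical_inputs",
                 "requirements", "setup", "execution_steps")
ADDITIONAL_SECTIONS = ("physical_outputs", "analysis", "validation_steps")


@dataclass
class Element:
    """Hypothetical container describing one unit operation."""
    name: str
    imports: List[str] = field(default_factory=list)
    parameters: Dict[str, object] = field(default_factory=dict)
    data: Dict[str, str] = field(default_factory=dict)
    physical_inputs: Dict[str, str] = field(default_factory=dict)
    requirements: List[str] = field(default_factory=list)
    setup: List[str] = field(default_factory=list)
    execution_steps: List[str] = field(default_factory=list)
    physical_outputs: Dict[str, str] = field(default_factory=dict)
    analysis: List[str] = field(default_factory=list)
    validation_steps: List[str] = field(default_factory=list)

    def populated_sections(self) -> List[str]:
        """Return the names of the section blocks that hold content."""
        names = CORE_SECTIONS + ADDITIONAL_SECTIONS
        return [n for n in names if getattr(self, n)]


# Placeholder values, not a recommended protocol.
ligation = Element(
    name="ligation",
    parameters={"temperature_c": 16, "duration_min": 60},
    physical_inputs={"vector": "DNA", "insert": "DNA"},
    execution_steps=["mix vector, insert and ligase", "incubate"],
    physical_outputs={"ligated_construct": "DNA"},
)
print(ligation.populated_sections())
```

Keeping every section block as a named field, even when empty, is one way to give all elements the same shape so that they can be stored and compared uniformly.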
[0009] In a specific embodiment of the invention the biological
process comprises at least two unit operations, wherein each unit
operation is defined according to a standardised element structure.
Suitably a plurality of unit operations may be arranged in sequence
or in parallel to create a workflow. In a further embodiment of the
invention the at least two unit operations are non-identical.
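The arrangement of unit operations in sequence or in parallel can be sketched as function composition. The Python example below is a minimal illustration under stated assumptions: a unit operation is modelled as a function from a sample record to a new sample record, and the helpers `in_sequence` and `in_parallel` and the three example operations are hypothetical names that do not appear in the application.

```python
from typing import Callable, Dict

Sample = Dict[str, float]
UnitOperation = Callable[[Sample], Sample]


def in_sequence(*ops: UnitOperation) -> UnitOperation:
    """Compose operations so each receives the previous one's output."""
    def run(sample: Sample) -> Sample:
        for op in ops:
            sample = op(sample)
        return sample
    return run


def in_parallel(*ops: UnitOperation) -> UnitOperation:
    """Apply each operation to its own copy of the sample and merge results."""
    def run(sample: Sample) -> Sample:
        merged = dict(sample)
        for op in ops:
            merged.update(op(dict(sample)))
        return merged
    return run


# Hypothetical unit operations used only to exercise the helpers.
def dilute(sample: Sample) -> Sample:
    return {**sample, "volume_ml": sample["volume_ml"] * 2}


def assay_a(sample: Sample) -> Sample:
    return {**sample, "assay_a_done": 1.0}


def assay_b(sample: Sample) -> Sample:
    return {**sample, "assay_b_done": 1.0}


workflow = in_sequence(dilute, in_parallel(assay_a, assay_b))
result = workflow({"volume_ml": 1.0})
print(result)
```

Because both helpers return something of the same type as a single unit operation, a workflow built this way can itself be nested inside a larger workflow.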
[0010] In yet a further embodiment of the invention, the unit
operation is selected from the group consisting of: a conversion; a
reaction; a purification; a construct assembly step; an assay or
analysis such as a quantification of a product, a by-product or
reagent; a nucleotide or protein/peptide synthesis; a cell culture;
an incubation; a restriction; a ligation; a mutation; an
inoculation; a lysis; a transformation; an extraction; the
conditioning of a product (e.g. for storage); and an amplification
(e.g. with respect to a nucleic acid). Optionally, the biological
process is a manufacturing process and/or an analytical
process. Suitably the process may comprise at least two unit
operations, at least one of which is a process operation and at
least one of which is an analytical process operation.
[0011] A second aspect of the invention provides a computer
implemented method comprising any of the method steps described
herein.
[0012] A third aspect of the invention provides a system for
performing a biological process, comprising:
[0013] a server with processing modules adapted to implement the
methods as described herein;
[0014] a data storing means which is accessible by the processor
for maintaining a record of standardised elements, wherein each
standardised element defines a unit operation in a biological
process; and
[0015] an interface for accessing the method.
[0016] Suitably, the data storing means is a database and/or the
data is provided through a cloud service. Optionally, the system
comprises a website or a mobile device or computer application to
access the service. Typically, the system may be incorporated as
part of a laboratory information management system (LIMS).
[0017] A fourth aspect of the invention provides a computer
readable medium comprising a database, wherein the database
comprises a plurality of unit operations, each unit operation being
suitable for use within a biological process and wherein each unit
operation is defined according to a standardised element structure,
the element structure comprising a plurality of functional section
blocks, and wherein the section blocks comprise at least one of the
group consisting of: imports; parameters; data; physical inputs;
requirements; setup; and execution steps. Typically, the element
structure further comprises at least one additional section block
selected from the group consisting of: physical outputs, analysis
and validation steps. Suitably, the element structure comprises at
least the section blocks defining: imports; parameters; data;
physical inputs; requirements; setup; and execution steps.
[0018] A fifth aspect of the invention provides an apparatus
comprising the computer readable medium described herein. In a
specific embodiment, the apparatus comprises one or more memories
and one or more processors, and wherein the one or more memories
and the one or more processors are in electronic communication with
each other, the one or more memories tangibly encoding a set of
instructions for implementing the methods of the invention as
described.
[0019] A sixth aspect of the invention provides a computer
implemented method for designing an experiment comprising the steps
of:
[0020] (i) selecting an input and a desired output for the
experiment, wherein the input comprises physical input and the
output is selected from either or both of a physical output and an
information output; and
[0021] (ii) determining a process for conversion of the input to
the desired output, wherein the process comprises at least one unit
operation, and wherein the unit operation is selected from a
database that comprises a plurality of potential unit
operations;
[0022] wherein the unit operation is defined according to a
standardised element structure, the element structure comprising a
plurality of functional section blocks, and wherein the section
blocks comprise: imports; parameters; data; physical inputs;
requirements; setup; and execution steps.
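Steps (i) and (ii) above can be pictured as a search over a database of unit operations for a chain that converts the selected input into the desired output. The Python sketch below uses a breadth-first search, which is an assumption made for illustration and is not stated in the application. The dictionary `UNIT_OPERATIONS`, its material type names and the function `find_process` are likewise hypothetical.

```python
from collections import deque
from typing import Dict, List, Optional, Tuple

# Hypothetical database: operation name -> (input type, output type).
UNIT_OPERATIONS: Dict[str, Tuple[str, str]] = {
    "transformation": ("plasmid", "transformed_cells"),
    "cell_culture": ("transformed_cells", "culture"),
    "lysis": ("culture", "lysate"),
    "purification": ("lysate", "purified_protein"),
    "quantification": ("purified_protein", "concentration_data"),
}


def find_process(start: str, goal: str) -> Optional[List[str]]:
    """Breadth-first search for the shortest chain of unit operations."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        material, path = queue.popleft()
        if material == goal:
            return path
        for name, (needs, makes) in UNIT_OPERATIONS.items():
            if needs == material and makes not in seen:
                seen.add(makes)
                queue.append((makes, path + [name]))
    return None  # no chain of known operations reaches the goal


print(find_process("plasmid", "purified_protein"))
```

In this toy database the desired output may be a physical product (`purified_protein`) or an information output (`concentration_data`), reflecting the two kinds of output named in step (i).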
[0023] A seventh aspect of the invention provides an apparatus
comprising one or more memories and one or more processors, and
wherein the one or more memories and the one or more processors are
in electronic communication with each other, the one or more
memories tangibly encoding a set of instructions for implementing
the methods described herein.
DRAWINGS
[0024] The invention is further illustrated with reference to the
following drawings in which:
[0025] FIG. 1 shows a flow diagram according to one embodiment of
the present invention.
[0026] FIGS. 2 (a) and (b) show exemplary bioprocess workflows
according to embodiments of the present invention; each unit
operation is defined by an element shown as a box containing a
cog-shaped wheel symbol.
[0027] FIG. 3 shows the multi-section structure of an element
according to one embodiment of the present invention.
DETAILED DESCRIPTION OF THE INVENTION
[0028] All references cited herein are incorporated by reference in
their entirety. Unless otherwise defined, all technical and
scientific terms used herein have the same meaning as commonly
understood by one of ordinary skill in the art to which this
invention belongs.
[0029] Prior to setting forth the invention, a number of
definitions are provided that will assist in the understanding of
the invention.
[0030] As used herein, the term "comprising" means any of the
recited elements are necessarily included and other elements may
optionally be included as well. "Consisting essentially of" means
any recited elements are necessarily included, elements that would
materially affect the basic and novel characteristics of the listed
elements are excluded, and other elements may optionally be
included. "Consisting of" means that all elements other than those
listed are excluded. Embodiments defined by each of these terms are
within the scope of this invention.
[0031] The term "process" is defined as a specific sequence of
transformative events performed upon a starting material in order
to achieve a specified purpose or goal. The process may result in
the transformation of the starting material into a product--in
which case the process is a "production process". Alternatively,
the process may result in the determination of information about
the starting material--in which case the process may be diagnostic
or prognostic in nature. The overall process may be sub-divided
into individual process steps that are applied in sequence to
achieve the desired outcome. According to an embodiment of the
invention, the process is a "bio-process" that uses complete living
cells or their components (e.g., prokaryotic or eukaryotic cells,
enzymes, organelles such as chloroplasts) to obtain desired
products. The processes of the present invention are subject to
process variables that are referred to as factors. Hence, a process
comprises a set of steps that are applied on inputs (including at
least a physical input) in order to produce an output (including at
least a physical output such as a product, and possibly additional
data outputs). Inputs may comprise physical starting materials
selected from one or more of the group consisting of: reagents;
cells; cellular derivatives; cellular extracts; tissue extracts;
polypeptides; peptides; proteins; nucleic acids; small molecules;
oligosaccharides; polysaccharides; polymeric molecules; elements;
organic or inorganic salts; pharmaceutical compounds; pro-drugs;
and any other composition of matter suitable for use within a
biological process. An embodiment of the invention may include a
process which involves the introduction of one or more genes into a
microorganism, which in turn expresses one or more proteins encoded
by those genes or modifies the metabolic processes of the organism
by the expression of non-protein-coding genes or other alterations
to the genetic makeup of the host. The protein(s) itself may be
the desired product or, where it functions as part of a pathway, the
protein may contribute to the generation of a desired product.
[0032] The term "unit operation" is defined as any step or sub-step
in a process that can be identified as a self-contained process or
"unit" which contributes to a set of successive steps--or
units--that together serve to make up a complete process. Suitably,
a unit operation may be selected from one or more of: a conversion;
a reaction; a purification; a construct assembly step; an assay or
analysis such as a quantification of a product, a by-product or
reagent; a sequencing of nucleic acids; a physical mixing; a
centrifugation; a spreading or physical plating of a sample; the
selective sampling of a sub population of a sample, such as colony
picking; the three dimensional placement of a sample into a
structural matrix; a nucleotide or protein/peptide synthesis; a
fermentation; a cell culture; an incubation; a restriction; a
ligation; a mutation; a transformation; a specific computation
analysis, such as a linear regression, sequence alignment, or model
based prediction; a separation such as chromatography; a
filtration; a concentration; an evaporation; a desiccation; a wash;
an extraction; the conditioning of a product (e.g. for storage);
and an amplification (e.g. with respect to a nucleic acid). It will
be appreciated that the aforementioned does not represent an
exhaustive list of potential unit operations, which are typically
reliant upon the precise nature of the process that is to be
undertaken.
[0033] The term "parts" refers to any physical element utilised
within a process or unit operation. Suitably, a part may be a
reagent, product, or input to any unit operation, or any piece of
equipment or apparatus that is used in a process or unit operation.
Typical parts may be selected from one or more of: a variant of a
gene or polynucleotide; a genetic construct; a whole cell or cell
line; an enzyme; an antibody; a small molecule; a solution (such as
buffers, reagents, culture media, etc.); a solid support; a
container (such as reaction tanks, flasks, slides, beads or
physical interaction substrates, etc.); a peptide; a polypeptide; a
functional or non-functional nucleic acid; a nutrient compound; a
growth factor; a cytokine; an element; an ionic substance, such as
an organic or inorganic anion or cation; and a gas or vapour in the
environment of the process. It will be appreciated that the
aforementioned does not represent an exhaustive list of potential
parts, which are typically reliant upon the precise nature of the
process that is to be undertaken.
[0034] The term "product" is defined as any desirable physical
output of a process. Suitably, a product may include a eukaryotic
or prokaryotic organism, virus, nanostructure, a protein,
polypeptide, polynucleotide, polymer, tissue, complex or small
molecule that is produced as a result of the process. In some
processes the product is in fact an information object, such as a
digital genetic sequence, or a measurement of system properties
that is the result of a destructive or non-destructive assay. It
will be appreciated that the aforementioned does not represent an
exhaustive list of potential products, which are typically reliant
upon the precise nature of the process that is to be
undertaken.
[0035] The term "protocol" refers to a set of instructions for
performing a unit operation. Typically, the set of instructions may
be a non-exhaustive list of actions and associated parameters that
have to be performed, such that a series of variables are set by
the protocol while additional variables are left to the user.
Typical variables that are set by a protocol may include the
identity and/or concentration of inputs to the operation, the order
and/or timing of performing various steps in the protocol, the
value of physical parameters which have to be set for some or all
steps of the protocol (such as e.g. the temperature, pH, oxygen
concentration, mixing speed, etc.), features of the equipment used,
and factors such as selecting between alternative calculation
models or analysis techniques for computationally derived steps. It
will be appreciated that the aforementioned does not represent an
exhaustive list of potential elements of a protocol, which are
typically reliant upon the precise nature of the process that is to
be undertaken.
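The division between variables set by a protocol and variables left to the user can be sketched as a merge with validation. The Python example below is an illustrative assumption rather than the application's method: the function `resolve_protocol`, its error rules and the example variable names and values are hypothetical.

```python
from typing import Dict, Set


def resolve_protocol(fixed: Dict[str, object],
                     user_defined: Set[str],
                     user_values: Dict[str, object]) -> Dict[str, object]:
    """Combine protocol-fixed variables with the user's choices.

    Raises ValueError if the user tries to change a fixed variable,
    supplies a variable the protocol does not expose, or omits one.
    """
    supplied = set(user_values)
    if supplied & set(fixed):
        raise ValueError(f"fixed by protocol: {sorted(supplied & set(fixed))}")
    if supplied - user_defined:
        raise ValueError(f"unknown variables: {sorted(supplied - user_defined)}")
    if user_defined - supplied:
        raise ValueError(f"missing variables: {sorted(user_defined - supplied)}")
    return {**fixed, **user_values}


# Placeholder values, not a recommended protocol.
incubation_fixed = {"temperature_c": 37, "duration_min": 30}
incubation_open = {"mixing_speed_rpm"}

settings = resolve_protocol(incubation_fixed, incubation_open,
                            {"mixing_speed_rpm": 200})
print(settings)
```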
[0036] The term "factor" is used herein to denote any defined
feature of or within a process that can be modified without
changing the ultimate goal of the process. According to one
embodiment of the present invention there are two categories of
factors: genetic and process factors.
[0037] "Process factors" suitably relate to features of a process
which are not associated with the genetics of a construct or host.
Typical process factors may include features of the equipment (e.g.
dimensions of a reaction tank, impeller configurations, siting of
probes), environment (e.g. temperature, pH, oxygenation,
atmospheric pressure), protocol (e.g. timings of significant stages
and events such as inoculation and induction), reagents (growth
media composition, nutrient level, feedstock concentration, inducer
concentration), handling of cells (stock storage conditions, size
of inoculations between reactors), process design (number of
process steps, type of reaction vessel). It will be appreciated
that the aforementioned does not represent an exhaustive list of
potential process factors, which are typically reliant upon the
precise nature of the process that is to be undertaken.
[0038] "Genetic factors" suitably relate to qualitative and
quantitative features associated with any genetic material involved
in a process, such as features of the specific genetic
`construct` which is used to introduce new nucleic acid, including
DNA, into the host (e.g. identity or composition of vector),
features of the host microorganism (e.g. strain, genetic background
(including knockouts of undesirable genes and protein
overexpression), epigenetic factors), features of functional DNA
(e.g. promoter strength, ribosome binding site strength, plasmid
origin of replication, terminator, codon usage strategy, operator,
activator, gene variant). It will be appreciated that the
aforementioned does not represent an exhaustive list of potential
genetic factors, which are typically reliant upon the precise
nature of the process that is to be undertaken.
[0039] Factors, whether process or genetic factors, are deemed to
interact when the effects of changes to one factor are dependent on
the value of another factor. Typically, a given process step within
a process--such as a bioprocess--may comprise a plurality of
factors that can interact with each other. Hence, when one factor
is altered as a result of a change in a process parameter, or the
inherent characteristics associated with that factor are changed,
there can be a cascade of interactions that will modify the effects
of other factors within that process step in a causative manner.
Where a process comprises more than one process step, this cascade
of interactions may lead to additional interactions within factors
of neighbouring or even distant process steps. It follows,
therefore, that many processes can be considered to be
multi-factorial in nature.
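The notion that two factors interact when the effect of one depends on the value of the other can be made concrete with a two-level, two-factor example. The Python sketch below uses the conventional two-level factorial estimate of an interaction, namely half the difference between the effect of factor A at the high level of factor B and its effect at the low level of B. The function names and the response values are invented for illustration and are not data from the application.

```python
from typing import Dict, Tuple

Levels = Tuple[int, int]  # (level of factor A, level of factor B), each 0 or 1


def effect_of_a_at(y: Dict[Levels, float], b_level: int) -> float:
    """Change in response when A goes from low to high at a fixed level of B."""
    return y[(1, b_level)] - y[(0, b_level)]


def interaction_ab(y: Dict[Levels, float]) -> float:
    """Half the difference between the effect of A at high B and at low B."""
    return (effect_of_a_at(y, 1) - effect_of_a_at(y, 0)) / 2


# Invented responses (e.g. yield) for the four factor combinations.
additive = {(0, 0): 10.0, (1, 0): 14.0, (0, 1): 12.0, (1, 1): 16.0}
interacting = {(0, 0): 10.0, (1, 0): 14.0, (0, 1): 12.0, (1, 1): 24.0}

print(interaction_ab(additive))     # A has the same effect at both levels of B
print(interaction_ab(interacting))  # A has a larger effect when B is high
```

A value of zero indicates that the two factors act independently in this data set, whereas a non-zero value indicates that they cannot be optimised one at a time.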
[0040] The term "score" refers to any interpretable objective or
subjective measurement of the suitability of a part, unit operation
or protocol for a given purpose within a process. Suitably, a score
may be in the form of a user-defined rating (such as e.g. in a
range of a minimum to a maximum number of stars, points, etc.), a
grade, a proportion of positive evaluations, or a colour (such as a
traffic light ranking), or a Boolean indicator (such as a thumbs up
or thumbs down symbol). In some embodiments, a score may be in the
form of a quantifiable or measurable feature of a part or
operation, such as e.g. the quantity, purity, productivity,
efficacy of a product output; the quantity of a by-product or
contaminant present; the yield of a process; and the cost, energy
or time efficiency of a part or unit operation. It will be
appreciated that the aforementioned does not represent an
exhaustive list of potential scores, which are typically reliant
upon the precise nature of the process that is to be
undertaken.
[0041] The term "context" as used herein refers to the situational
information associated with a specified user. Context as applied to
a multidimensional rating or score provides a perspective to the
value ascribed by a score. It will be appreciated that virtually
every user will have a unique perspective when providing a rating
for any given unit operation. The context will depend, in part,
upon the parts available to the user, the success of those parts
(e.g. apparatus, infrastructure) in performing a unit operation,
the success of the unit operation within the process as a whole or
in combination with other unit operations (e.g. compatibility with
other unit operations) and any factor variables associated with the
user.
[0042] The term "element" as used herein comprises a standardised
description of a part, protocol and/or unit operation that can be
utilised within a biological process. In this way, an element
represents a reusable unit which can be combined with other
elements to form process workflows and pipelines. According to an
embodiment of the invention, the elements can robustly describe the
inputs and outputs of unit operations. This includes both the
information flow and the physical sample flow, with strong typing
ensuring compatibility with other unit operations. Typically an
element will relate to a single workflow within a given unit
operation with a defined set of physical and information inputs
being processed into a defined set of physical and information
outputs.
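The role of strong typing in ensuring compatibility between elements can be sketched as a check that every input of a downstream element is matched by an output of the upstream element, for both the physical sample flow and the information flow. The Python example below is a minimal illustration: the classes `Port` and `ElementSignature`, the function `incompatibilities` and the example type names are hypothetical and are not drawn from the application.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Port:
    name: str
    kind: str       # "physical" or "information"
    type_name: str  # e.g. "DNA", "protein", "table"


@dataclass
class ElementSignature:
    name: str
    inputs: List[Port]
    outputs: List[Port]


def incompatibilities(upstream: ElementSignature,
                      downstream: ElementSignature) -> List[str]:
    """List downstream inputs that no upstream output can satisfy."""
    offered = {(p.kind, p.type_name) for p in upstream.outputs}
    return [p.name for p in downstream.inputs
            if (p.kind, p.type_name) not in offered]


amplification = ElementSignature(
    name="amplification",
    inputs=[Port("template", "physical", "DNA")],
    outputs=[Port("amplicon", "physical", "DNA"),
             Port("cycle_log", "information", "table")],
)
dna_assay = ElementSignature(
    name="dna_assay",
    inputs=[Port("sample", "physical", "DNA")],
    outputs=[Port("quantity", "information", "number")],
)
protein_assay = ElementSignature(
    name="protein_assay",
    inputs=[Port("sample", "physical", "protein")],
    outputs=[Port("quantity", "information", "number")],
)

print(incompatibilities(amplification, dna_assay))
print(incompatibilities(amplification, protein_assay))
```

An empty list means the two elements can be connected; a non-empty list names the inputs that would be left unsatisfied, so a mismatch can be reported before any physical work is attempted.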
[0043] In a specific embodiment of the invention, the described
method can be implemented via one or more computer systems. In
another embodiment the invention provides a computer readable
medium containing program instructions for implementing the method
of the invention, wherein execution of the program instructions by
one or more processors of a computer system causes the one or more
processors to carry out the phases as described herein. Suitably,
the computer system includes at least: an input device, an output
device, a storage medium, and a microprocessor. Possible input
devices include a keyboard, a computer mouse, a touch screen, and
the like. Output devices include a computer monitor, a liquid-crystal
display (LCD), a light emitting diode (LED) computer monitor, a virtual
reality (VR) headset and the like. In addition, information can be output
to a user, a user interface device, a computer-readable storage
medium, or another local or networked computer. Storage media
include various types of memory such as a hard disk, RAM, flash
memory, and other magnetic, optical, physical, or electronic memory
devices. The microprocessor is any typical computer microprocessor
for performing calculations and directing other functions for
performing input, output, calculation, and display of data. Two or
more computer systems may be linked using wired or wireless means
and may communicate with one another or with other computer systems
directly and/or using a publicly-available networking system such
as the Internet. Networking of computers permits various aspects of
the invention to be carried out, stored in, and shared amongst one
or more computer systems locally and at remote sites. In one
embodiment of the invention, the computer processor may comprise an
artificial neural network (ANN). In a further embodiment of the
invention the method may be incorporated as part of a laboratory
information management system (LIMS) or a software suite that is
compatible with a LIMS.
[0044] The methods of the invention may be configured to interact
with and control automated laboratory equipment including liquid
handling and dispensing apparatus or more advanced laboratory
robotic systems. Where higher numbers of factors are considered
during the factor screening phase, in one embodiment of the
invention it is an option to automate performance of factor
screening experiments using a high-level programming language to
produce reproducible and scalable workflows to underpin the
screening, refining and optimisation phases of the method. Suitable
high-level programming languages may include C++, Java.TM., Visual
Basic, Ruby, Google.RTM. Go and PHP, as well as the biology
specific language Antha.TM. (www.antha-lang.org).
[0045] FIG. 1 is a flow diagram that shows a computer implemented
platform for the design of experiments or biological processes by a
user that utilises various interacting modules. In one embodiment
of the invention the user will access the platform via a user
interface (105) so as to access a workflow design tool (101). The
user interface (105) may be comprised within a laboratory
information management system (LIMS) package, via a dedicated
software application (an `app`), via a website or any other
suitable user interface. The workflow design tool (101) enables the
user to specify the type of experiment or biological process that
is under consideration, especially by specifying inputs (e.g.
starting materials) and the desired outputs (e.g. products). In
defining the objectives of the experiment or process within the
workflow design tool (101), the user is able to access the
experimental design module (101a), which provides a mechanism for
breaking down the experiment or process into one or more unit
operations.
[0046] Each unit operation will comprise one or more parts and one
or more protocols. Selection of the most appropriate components of
the one or more unit operations can be accomplished within the parts
module (101b) and the protocols module (101c). The parts module
(101b) and the protocols module (101c) respectively are able to
access a library of compatible standardised parts and protocols
comprised within a parts characterisation module (102) and a
protocol definition module (103). A fully assembled workflow
provides a process pipeline that comprises at least one unit
operation, more typically a plurality of unit operations such as
the ones shown in FIGS. 2 (a) and (b). The fully assembled workflow
can be tested for compatibility with the user's available
parts--including laboratory automation apparatus--so as to provide
a validation of the workflow within the specific context of the
user. Validation can be carried out via the analysis module (101d).
Optionally, unit operations may be subject to associated scoring
or rating criteria that allow for comparison of the user's unique
context with the suggested workflow. Hence, the workflow design
tool (101) provides capability to establish a design space in part
defined by the user's unique context and, in so doing, only permits
assembly of a workflow that is compatible with the user
contextualised design space.
[0047] One important aspect of the platform is that it permits
certain degrees of freedom for users to modify unit operations in
order to improve compatibility with available parts and associated
protocols. This advantageously enables a level of flexibility
within the design space as well as an evolution of unit operations
to accommodate slightly different user contexts. Once a validated
process pipeline is approved by the user the workflow can be
implemented either via fully automated laboratory systems or via a
manual implementation, or a combination of both. As the unit
operations within the pipeline are completed the laboratory
automation apparatus and/or the user are prompted to provide
feedback metrics on the successful performance of the unit
operation as well as the assembled pipeline as a whole. Feedback
metrics may include, for example, scores, ratings, data and
information on reaction conditions, yield of product, time taken
for completion of the protocol, purity of the product, amongst
others. The feedback metrics may be combined together with the
information regarding the process pipeline and communicated to a
standardisation engine (104).
[0048] The standardisation engine (104) provides a function of data
standardisation, including normalisation, reformatting and parsing
on the input information that includes the pipeline process
assembly and any accompanying modifications made by the user,
together with associated metrics and scores. Data standardisation
may comprise removal of extraneous or irrelevant information as
well as normalisation of data or values to common or standard form,
such as via reference to lookup tables. In so doing, the
standardisation engine (104) transforms the input data into a
common representation of values thereby providing a consistent
record. The standardisation engine (104) may comprise a database of
standardised unit operations, parts and protocols. Optionally the
standardisation engine (104) does not comprise a database itself
but communicates with a database within a separate module (not
shown), or within one or more databases comprised within the
workflow design tool (101). The standardisation engine provides
standardised descriptions of parts to the parts characterisation
module (102) and the protocol definition module (103) respectively.
Hence, the computer implemented platform provides an iterative
procedure for assembling unit operations from standardised parts
and procedures that are continually improved, adapted and modified
dependent upon the user's context. Suitably, the unit operations
are defined in a standardised element structure, described further
below. Where the platform is accessed by multiple users, such as in
the instance of a multi-user cloud or internet based platform,
users will benefit from the continual generation of novel and/or
improved parts, protocols and associated unit operations.
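By way of a minimal sketch, normalisation of values to a standard form via a lookup table might look as follows in Go. The table contents and the `Standardise` function are assumptions made for illustration; the actual standardisation engine (104) is not specified at this level of detail.

```go
package main

import (
	"fmt"
	"strings"
)

// unitLookup is a hypothetical lookup table that maps free-text unit
// spellings, as a user might enter them, to a single standard form.
var unitLookup = map[string]string{
	"ul":         "uL",
	"microlitre": "uL",
	"microliter": "uL",
	"ml":         "mL",
	"millilitre": "mL",
	"milliliter": "mL",
}

// Standardise removes extraneous whitespace, lower-cases the raw unit
// string and maps it to the standard form. The boolean is false when the
// unit is not present in the lookup table.
func Standardise(raw string) (string, bool) {
	key := strings.ToLower(strings.TrimSpace(raw))
	std, ok := unitLookup[key]
	return std, ok
}

func main() {
	for _, raw := range []string{" uL ", "Microlitre", "ML", "furlong"} {
		std, ok := Standardise(raw)
		fmt.Printf("%q -> %q (known: %v)\n", raw, std, ok)
	}
}
```

Values that are not found in the table are reported rather than guessed, so that an inconsistent record is never silently created.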
[0049] In accordance with one embodiment of the invention, the
workflow design tool (101) may select one or more unit operations
that are defined as elements. Hence, as in FIGS. 2 (a) and (b),
each unit operation in the finally assembled workflow consists of
an element.
[0050] A specific embodiment of the invention provides a method for
performing or designing a biological process--including one or more
experimental steps--wherein the method is comprised of at least one
unit operation, and wherein the unit operation is defined according
to a standardised element structure. The element according to this
embodiment is shown in FIG. 3 as having a section-based format that
defines information as well as the physical inputs and outputs of
the unit operation. The use of a structured, text-based format with
a domain specific vocabulary also permits the use of version
control systems to track how protocols evolve and change over time
and to identify which changes are responsible for particular
behaviour, also avoiding repetition of errors. In one embodiment of
the invention the elements are configured to run as microservices
communicating via a network using a flow-based approach.
[0051] The element typically comprises a section-based format
having at least the following functional section blocks: imports;
parameters; data; physical inputs; requirements; setup; and
execution steps. Optionally, the element may further comprise at
least one additional section that relates to physical outputs,
analysis and/or validation.
[0052] The Import section block suitably defines a name for the
element, and specifies what additional protocols, parts or unit
operations are needed to execute the element.
[0053] The Parameters section block suitably defines the
information inputs to the element. Data types can be any of the
built-in types from high level programming languages such as
Google.RTM. Go language or Antha.TM., including int, string, byte,
float, as well as specified metric units required in the protocol.
Default parameters may be included in this section block.
[0054] The Data section block suitably defines the information
outputs from the element. The Data section follows the same format
as the Parameters block, although typically no default values are
given.
[0055] The Inputs block suitably defines the physical inputs to the
protocol, along with their appropriate type. Physical inputs may
comprise starting materials or parts used in the unit
operation.
[0056] The Outputs section block is optional and may only be
present in a unit operation in which a physical product is
generated. Examples of protocols that output a physical sample may,
thus, include a new liquid solution containing DNA, enzymes, or
cells; a lyophilised preparation comprising biological material; or
a frozen sample comprising a biopolymer.
[0057] The Requirements block is typically executed by a protocol
before it begins work, to allow confirmation that the states of any
inputs are suitable for successful completion of the unit
operation.
[0058] The Setup section block is performed once the first time
that an element is executed. This can be used to perform any
configuration that is needed globally for the element, and is also
used to define any special setup that may be needed for groups of
concurrent tasks that might be executed at the same time. Any
variables that need to be accessed by the steps function globally
can be defined here as well.
[0059] The functional core of the element of the invention is
defined within the Steps section block. The Steps block describes
the actual steps taken to transform a set of input parameters and
samples into the output data and samples. The Steps are a kernel
function, meaning that no information is shared between the
concurrent samples being processed, and they define the workflow to
transform a single block of inputs and samples into a single set of
outputs.
[0060] The Analysis section block is optional and defines how the
results of the Steps block should be transformed into final values,
if appropriate.
[0061] The Validation section block is optional and allows the
definition of specific tests to verify the correct execution of an
element, along with reporting capabilities as well as the ability
to declare the execution a failure.
[0062] The placement order of the section blocks within the element
may be varied in alternative embodiments of the invention. In
addition, section blocks may be combined to give dual functionality
and additional section blocks may be added to expand functionality
beyond the element set out in FIG. 3.
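The order in which the executable section blocks described above could be invoked is sketched below in Go. This is an illustrative assumption only: the `Element` struct, its function-valued fields and the `Run` method are hypothetical, and the toy Steps function merely stands in for real laboratory work.

```go
package main

import (
	"errors"
	"fmt"
)

// Element is a hypothetical Go rendering of the executable section blocks.
type Element struct {
	Requirements func() error                // checked before any work begins
	Setup        func()                      // run once, on first execution
	Steps        func(sample string) float64 // kernel: one sample in, one result out
	Analysis     func(raw float64) float64   // optional
	Validation   func(final float64) error   // optional
	setupDone    bool
}

// Run executes the blocks in order for a single sample.
func (e *Element) Run(sample string) (float64, error) {
	if e.Requirements != nil {
		if err := e.Requirements(); err != nil {
			return 0, fmt.Errorf("requirements: %w", err)
		}
	}
	if !e.setupDone {
		if e.Setup != nil {
			e.Setup()
		}
		e.setupDone = true
	}
	result := e.Steps(sample)
	if e.Analysis != nil {
		result = e.Analysis(result)
	}
	if e.Validation != nil {
		if err := e.Validation(result); err != nil {
			return result, fmt.Errorf("validation: %w", err)
		}
	}
	return result, nil
}

func main() {
	setups := 0
	e := &Element{
		Setup:    func() { setups++ },
		Steps:    func(sample string) float64 { return float64(len(sample)) }, // stand-in measurement
		Analysis: func(raw float64) float64 { return raw * 2 },
		Validation: func(final float64) error {
			if final > 10 {
				return errors.New("result out of range")
			}
			return nil
		},
	}
	for _, s := range []string{"abc", "abcdefgh"} {
		v, err := e.Run(s)
		fmt.Println(s, v, err)
	}
	fmt.Println("setup ran", setups, "time(s)")
}
```

The optional blocks are represented as nil-able fields, which mirrors the statement that Analysis and Validation may be omitted.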
[0063] The invention is further illustrated by the following
non-limiting example.
Example
[0064] In this example, a unit operation of a biological process is
defined within the high level biology language Antha.TM.. The
element defines a Bradford assay, which is a molecular biology
assay used to quantify the amount of protein in a physical
sample.
[0065] Syntactically, Antha.TM. is an extension of the Go language
(www.golang.org), and shares a focus on describing concurrent
processes functionally. Any execution of a workflow is intended to
describe a large array of parallel processes, and is described from
the standpoint of the smallest appropriate unit of operation. In
the case of this Element, that is the set of actions to process a
single physical sample, even though this protocol will normally be
run on arrays of samples at the same time. A core purpose of the
Antha.TM. system is to establish a de facto standard language for
defining protocols and parts for use in biological experimentation.
Therefore, it is designed to mask some of the programming detail
from the user and focus on the biology.
[0066] Imports:
TABLE-US-00001
protocol bradford

import (
    "plate_reader"
    "github.com/sajari/regression"
    "standard_labware"
)
[0067] The Antha.TM. Element starts by defining a name for the
protocol, in this case bradford, and listing what additional
protocols or Go libraries are needed to execute the bradford
protocol. The Antha.TM. compiler is intelligent enough to identify
whether the imports are existing Go libraries, or other Antha.TM.
Elements, and can be transparently imported directly from source
code repositories such as Github (www.github.com).
[0068] Parameters:
TABLE-US-00002
// Input parameters for this protocol (data)
Parameters {
    var SampleVolume Volume = 15.(uL)
    var BradfordVolume Volume = 5.(uL)
    var Wavelength Wavelength = 595.(nm)
    var ControlCurvePoints uint32 = 7
    var ControlCurveDilutionFactor uint32 = 2
    var ReplicateCount uint32 = 1 // Note: 1 replicate means experiment is in duplicate, etc.
}
[0069] The Parameters block defines the information inputs to the
Bradford Element. Data types can be any of the built-in types from
the Go language, such as int, string, byte, float, as well as the
strongly typed scientific types introduced by the Antha.TM.
language, such as the metric units. Parameter declarations follow
the syntax of
[0070] var VariableName
VariableType=OptionalDefaultValue.(OptionalUnit)
For Example:
[0071] go
[0072] var ExampleVolume Volume=15.(uL)
[0073] means "Create a parameter named ExampleVolume, which only
accepts volume units, with a default value of 15 microlitres". By
convention, variables are named in UpperCamelCase (using an
upper-case letter to start each word within a single name). All
Parameters are visible to other Elements, so also by convention they
start with an upper-case letter.
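The effect of unit-typed parameters can be approximated in plain Go, as in the following sketch. The `Volume` and `Wavelength` types and their constructor functions are hypothetical stand-ins for the scientific types of Antha.TM., whose actual implementation is not described here.

```go
package main

import "fmt"

// Volume is a hypothetical strongly typed quantity, stored in microlitres.
type Volume struct{ ul float64 }

// Wavelength is a hypothetical strongly typed quantity, stored in nanometres.
type Wavelength struct{ nm float64 }

// Microlitres and Millilitres construct a Volume from a number and a unit.
func Microlitres(v float64) Volume { return Volume{ul: v} }
func Millilitres(v float64) Volume { return Volume{ul: v * 1000} }

// Nanometres constructs a Wavelength.
func Nanometres(v float64) Wavelength { return Wavelength{nm: v} }

// Add only accepts another Volume, so adding a Wavelength to a Volume is
// rejected by the compiler rather than discovered at run time.
func (a Volume) Add(b Volume) Volume { return Volume{ul: a.ul + b.ul} }

// InMicrolitres returns the numeric value in microlitres.
func (a Volume) InMicrolitres() float64 { return a.ul }

func main() {
	sampleVolume := Microlitres(15)  // mirrors: var SampleVolume Volume = 15.(uL)
	bradfordVolume := Microlitres(5) // mirrors: var BradfordVolume Volume = 5.(uL)
	wavelength := Nanometres(595)    // mirrors: var Wavelength Wavelength = 595.(nm)

	total := sampleVolume.Add(bradfordVolume)
	fmt.Println("total volume (uL):", total.InMicrolitres())
	fmt.Println("wavelength (nm):", wavelength.nm)
	// sampleVolume.Add(wavelength) // would not compile: Wavelength is not a Volume
}
```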
[0074] ReplicateCount is a special variable, which tells Antha.TM.
to run ReplicateCount additional copies of each sample. The
association of the results, and impact on workflow is automatically
handled by the system.
[0076] Data
TABLE-US-00003
// Data which is returned from this protocol, and data types
Data {
    var SampleAbsorbance Absorbance
    var ProteinConc Concentration
    var RSquared float32
    var control_absorbance [ControlCurvePoints+1]Absorbance
    var control_concentrations [ControlCurvePoints+1]float64
}
[0077] The Data block defines the information outputs from the
Bradford Element. Declaration follows the same format as the
Parameters block, although no default values are given. By
convention, results which may be consumed as outputs by other
Elements are named with an Upper-case first letter. Variables which
start with a lower-case first letter are intended for use only
within the protocol, and while the values will be logged, they are
not available to any other Antha.TM. Elements. Additionally, they
are shared across all executing copies of an Element, which
requires their use to be carefully considered to avoid concurrency
problems.
[0078] Inputs:
TABLE-US-00004
// Physical Inputs to this protocol with types
Inputs {
    var Sample WaterSolution
    var BradfordReagent WaterSolution
    var ControlProtein WaterSolution
    var DistilledWater WaterSolution
}
[0079] The Inputs block defines the physical inputs to the
protocol, along with their appropriate type. For example, in this
block, all the types are WaterSolutions, meaning they can be
operated on by a standard liquid handling robot, or manual pipette
operations. Additional attributes of the physical samples are used
by the Antha.TM. Execution system to plan the optimal way to
perform physical actions such as mixing on samples based on their
types.
[0080] Declaration syntax follows the form of the information
variables, with the exception that no default value is
declared.
[0081] Outputs:
TABLE-US-00005
// Physical outputs from this protocol with types
Outputs {
    // None
}
[0082] This protocol is a destructive protocol, meaning that all of
the intermediates and the final sample created as a result of this
assay need to be destroyed after performing the protocol. However,
many protocols also output a physical sample, such as a new liquid
solution containing DNA, enzymes, or cells. By default, any
physical sample which is not passed to an Output is scheduled for
destruction, with methods appropriate to the safety level of the
sample (such as having to autoclave genetic materials, etc).
[0083] Requirements:
TABLE-US-00006
Requirements {
    // None
}
[0084] The Requirements block is executed by a protocol before it
begins work, to allow the state of any inputs to be confirmed. For
example, a test like require(!Sample.Expired()) would explicitly
confirm that the input sample had not expired, according to the
information on the type of sample available to the Antha.TM. system,
by being left outside of a temperature controlled environment for
too long.
By default, Antha.TM. confirms items such as whether samples have
expired automatically, and this block is provided primarily as a
convenience for certain classes of more complex tests needed to
validate complex inputs such as DNA assembly protocols.
[0085] Setup:
TABLE-US-00007
Setup {
    control.Config(config.per_plate)
    var control_curve [ControlCurvePoints + 1]WaterSolution
    // i runs up to and including ControlCurvePoints so that the final
    // (blank) point is also prepared
    for i := 0; i <= ControlCurvePoints; i++ {
        go func(i) {
            if (i == ControlCurvePoints) {
                control_curve[i] = mix(DistilledWater(SampleVolume) + BradfordReagent(BradfordVolume))
            } else {
                control_curve[i] = serial_dilute(ControlProtein(SampleVolume), ControlCurvePoints, ControlCurveDilutionFactor, i)
            }
            control_absorbance[i] = plate_reader.read(control_curve[i], Wavelength)
        }
    }
}
[0086] The Setup block is performed once the first time that an
Element is executed. This can be used to perform any configuration
that is needed globally for the Element, and is also used to define
any special setup that may be needed for groups of concurrent tasks
that might be executed at the same time. Any variables that need to
be accessed by the Steps function globally can be defined here as
well, but need to be handled with care to avoid concurrency
problems.
[0087] In the context of this Bradford Element, the Control library
is used to enable the protocol to define a block of samples that
need to be performed in concert with any block of tasks. For
example, each 96 well plate of samples needs to have a set of
control samples added to it to enable the calculation of the amount
of protein in each sample. Creating these control samples is done
via a serial dilution of a known protein sample, using up to
ControlCurvePoints+1 samples in each block.
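The concentrations that such a serial dilution would produce can be sketched in Go. The sketch assumes that the first control point is the undiluted stock and that the final point is a protein-free blank; neither assumption is stated in the example above, and the `ControlCurve` function is a hypothetical name.

```go
package main

import "fmt"

// ControlCurve returns the concentrations of a serial dilution of a stock
// of known concentration, followed by one blank (zero protein) point.
// points and factor play the roles of ControlCurvePoints and
// ControlCurveDilutionFactor, giving points+1 values in total.
// Assumption: the first point is the undiluted stock.
func ControlCurve(stock float64, points, factor uint32) []float64 {
	curve := make([]float64, 0, points+1)
	c := stock
	for i := uint32(0); i < points; i++ {
		curve = append(curve, c)
		c /= float64(factor) // each point is diluted by the same factor
	}
	return append(curve, 0) // blank: distilled water plus reagent only
}

func main() {
	// Default parameters of the example: 7 points, dilution factor 2.
	for i, c := range ControlCurve(1.0, 7, 2) {
		fmt.Printf("control %d: %g (relative to stock)\n", i, c)
	}
}
```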
[0088] Steps:
TABLE-US-00008
Steps {
    var product = mix(Sample(SampleVolume) + BradfordReagent(BradfordVolume))
    SampleAbsorbance = PlateReader.ReadAbsorbance(product, Wavelength)
}
[0089] The Steps block defines the actual steps taken to transform
a set of input parameters and samples into the output data and
samples. The Steps are a kernel function, meaning that no
information is shared between the concurrent samples being
processed, and they define the workflow to transform a single block
of inputs and samples into a single set of outputs, even if the
Element is operating on an entire array at once (such as a
micro-titre plate of samples).
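The kernel-function idea can be illustrated with ordinary Go concurrency, as in the sketch below. The `steps` function is a made-up stand-in that computes a number from its input instead of performing a measurement, and `RunPlate` is a hypothetical name; the sketch shows only that each sample is processed without reference to any other.

```go
package main

import (
	"fmt"
	"sync"
)

// steps is a stand-in kernel: it sees one sample and returns one result,
// and touches no state shared with other samples. The formula is
// arbitrary and does not represent a real absorbance measurement.
func steps(sample float64) float64 {
	return 0.05 + 0.9*sample
}

// RunPlate applies the kernel to every sample concurrently. Each goroutine
// writes only to its own slot of the result slice, so no locking is needed.
func RunPlate(samples []float64) []float64 {
	results := make([]float64, len(samples))
	var wg sync.WaitGroup
	for i, s := range samples {
		wg.Add(1)
		go func(i int, s float64) {
			defer wg.Done()
			results[i] = steps(s)
		}(i, s)
	}
	wg.Wait()
	return results
}

func main() {
	fmt.Println(RunPlate([]float64{0, 0.5, 1}))
}
```

Because the kernel is written for a single sample, the same definition serves one tube or a full plate; only the caller changes.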
[0090] In this Bradford Element, a new sample is created, which is
the result of mixing SampleVolume amount of the physical input,
Sample, with BradfordVolume amount of BradfordReagent. Note: no
physical locations, layouts, or methods are
required, as the Antha.TM. Execution layer manages determining the
capabilities to perform library functions such as the mix function
depending on the equipment registered with the system. Where
automated methods of sample transport or liquid handling are not
available, it falls back to providing manual instructions.
[0091] The newly created sample, product, is then passed to another
Antha.TM. Element, which in this case represents a device driver
for a plate reader, to perform a measurement on the sample. Where
such processing needs to be batched (such as performing it a plate
at a time) the system automatically manages the scheduling of
samples to be collocated on a shared micro-titre plate.
[0092] Lastly, the results of the plate reader are stored as the
output data variable SampleAbsorbance.
[0093] Analysis:
TABLE-US-00009
Analysis {
    // need the control samples to be completed before doing the analysis
    control.WaitForCompletion()
    // Need to compute the linear curve y = m * x + c
    var r regression.Regression
    r.SetObservedName("Absorbance")
    r.SetVarName(0, "Concentration")
    r.AddDataPoint(regression.DataPoint{Observed: ControlCurvePoints+1, Variables: control_absorbance})
    r.AddDataPoint(regression.DataPoint{Observed: ControlCurvePoints+1, Variables: control_concentrations})
    r.RunLinearRegression()
    m := r.GetRegCoeff(0)
    c := r.GetRegCoeff(1)
    RSquared = r.Rsquared
    ProteinConc = (SampleAbsorbance - c) / m
}
[0094] The Analysis block defines how the results of the Steps can
be transformed into final values, if appropriate. Computing the
final protein concentration of a Bradford assay requires having the
data back from the control samples, performing a linear regression,
and then using those results to normalize the plate reader
results.
[0095] To start, control.WaitForCompletion() is a utility
method indicating that the Analysis needs to wait for the concurrent
control samples to be fully processed before analysis can continue.
The actual linear regression is then performed by using an existing
Go library for linear regression, which, like all Go code, can be
seamlessly included in Antha.TM..
[0096] Lastly, the final normalized result (the protein
concentration in the sample) is stored in the ProteinConc variable
where it can be accessed by downstream Elements.
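The calculation performed in the Analysis block can be sketched in self-contained Go. The sketch substitutes a hand-written ordinary least-squares fit for the regression library used in the example, and the `FitLine` and `ProteinConc` functions and the data values are hypothetical and purely illustrative.

```go
package main

import "fmt"

// FitLine returns the slope m, intercept c and R^2 of the least-squares
// line y = m*x + c through the points (x[i], y[i]). It assumes x and y
// have equal length, with at least two distinct x values and y values.
func FitLine(x, y []float64) (m, c, r2 float64) {
	n := float64(len(x))
	var sx, sy, sxx, sxy float64
	for i := range x {
		sx += x[i]
		sy += y[i]
		sxx += x[i] * x[i]
		sxy += x[i] * y[i]
	}
	m = (n*sxy - sx*sy) / (n*sxx - sx*sx)
	c = (sy - m*sx) / n

	mean := sy / n
	var ssRes, ssTot float64
	for i := range x {
		d := y[i] - (m*x[i] + c)
		ssRes += d * d
		ssTot += (y[i] - mean) * (y[i] - mean)
	}
	r2 = 1 - ssRes/ssTot
	return m, c, r2
}

// ProteinConc inverts the standard curve absorbance = m*concentration + c.
func ProteinConc(absorbance, m, c float64) float64 {
	return (absorbance - c) / m
}

func main() {
	// Illustrative control data only: concentrations and absorbances.
	controlConc := []float64{0, 0.25, 0.5, 1.0}
	controlAbs := []float64{0.06, 0.24, 0.46, 0.84}

	m, c, r2 := FitLine(controlConc, controlAbs)
	fmt.Printf("m=%.3f c=%.3f RSquared=%.4f\n", m, c, r2)
	fmt.Printf("ProteinConc for absorbance 0.65: %.3f\n", ProteinConc(0.65, m, c))
}
```

The inversion step corresponds to the final line of the Analysis block, in which the fitted slope and intercept are used to convert the sample absorbance into a concentration.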
[0097] Validation:
TABLE-US-00010
Validation {
    if SampleAbsorbance > 1 {
        panic("Sample likely needs further dilution")
    }
    if (RSquared < 0.9) {
        warn("Low r_squared on standard curve")
    }
    if (RSquared < 0.7) {
        panic("Bad r_squared on standard curve")
    }
    // TODO: add test of replicate variance
}
[0098] The Validation block allows the definition of specific tests
to verify the correct execution of an Element, along with reporting
capabilities (and the ability to declare the execution a failure).
For example, the Bradford assay can only handle a specific linear
range of concentrations, so if the amount of protein in the sample
is above or below that range, the assay will fail.
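The thresholds of the example Validation block can be expressed as a small Go function, sketched below. The `Verdict` type and the `Validate` function are hypothetical; in particular, the sketch returns a verdict instead of calling panic or warn, which is a design choice made here for testability and not a feature of the example.

```go
package main

import "fmt"

// Verdict is a hypothetical outcome of validating one execution.
type Verdict int

const (
	Pass Verdict = iota
	Warn
	Fail
)

// Validate applies the thresholds of the example Validation block:
// an absorbance above 1 fails, an R^2 below 0.7 fails, and an R^2
// below 0.9 produces a warning.
func Validate(sampleAbsorbance, rSquared float64) (Verdict, string) {
	switch {
	case sampleAbsorbance > 1:
		return Fail, "Sample likely needs further dilution"
	case rSquared < 0.7:
		return Fail, "Bad r_squared on standard curve"
	case rSquared < 0.9:
		return Warn, "Low r_squared on standard curve"
	}
	return Pass, ""
}

func main() {
	cases := [][2]float64{{0.5, 0.95}, {0.5, 0.8}, {0.5, 0.6}, {1.2, 0.99}}
	for _, tc := range cases {
		v, msg := Validate(tc[0], tc[1])
		fmt.Printf("absorbance=%.2f RSquared=%.2f -> verdict=%d %s\n", tc[0], tc[1], v, msg)
	}
}
```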
[0099] The solution in such a case is to rerun the assay with a
different dilution factor. However, as the Bradford Element is a
destructive assay, this may require the generation of more source
material, which may not be possible, preventing the Element alone
from handling such an error.
[0100] Validation checks can be grouped as destructive or
non-destructive. All the tests performed in this example are
non-destructive, as they simply analyse the data. However, in other
types of Elements, a validation test may require the consumption of
some of a sample, such as to run a mass spec trace, and as such
only random dipstick testing may be required rather than validating
every sample which is executed. Policies such as dipstick
validation testing can be configured in the Antha.TM. Execution
environment.
[0101] Unless otherwise indicated, the practice of the present
invention employs conventional techniques of chemistry, computer
science, statistics, molecular biology, microbiology, recombinant
DNA technology, and chemical methods, which are within the
capabilities of a person of ordinary skill in the art. Such
techniques are also explained in the literature, for example, T.
Cormen, C. Leiserson, R. Rivest, 2009, Introduction to Algorithms,
3rd Edition, The MIT Press, Cambridge, Mass.; L. Eriksson, E.
Johansson, N. Kettaneh-Wold, J. Trygg, C. Wikström, S. Wold, Multi-
and Megavariate Data Analysis, Part 1, 2.sup.nd Edition, 2006,
UMetrics, UMetrics AB, Sweden; M. R. Green, J. Sambrook, 2012,
Molecular Cloning: A Laboratory Manual, Fourth Edition, Books 1-3,
Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.;
Ausubel, F. M. et al. (1995 and periodic supplements; Current
Protocols in Molecular Biology, ch. 9, 13, and 16, John Wiley &
Sons, New York, N. Y.); B. Roe, J. Crabtree, and A. Kahn, 1996, DNA
Isolation and Sequencing: Essential Techniques, John Wiley &
Sons; J. M. Polak and James O'D. McGee, 1990, In Situ
Hybridisation: Principles and Practice, Oxford University Press; M.
J. Gait (Editor), 1984, Oligonucleotide Synthesis: A Practical
Approach, IRL Press; and D. M. J. Lilley and J. E. Dahlberg, 1992,
Methods of Enzymology: DNA Structure Part A: Synthesis and Physical
Analysis of DNA Methods in Enzymology, Academic Press. Each of
these general texts is herein incorporated by reference.
[0102] Although particular embodiments of the invention have been
disclosed herein in detail, this has been done by way of example
and for the purposes of illustration only. The aforementioned
embodiments are not intended to be limiting with respect to the
scope of the appended claims, which follow. It is contemplated by
the inventors that various substitutions, alterations, and
modifications may be made to the invention without departing from
the spirit and scope of the invention as defined by the claims.
* * * * *