U.S. patent application number 13/054086 was filed with the patent office on 2011-07-07 for system and method for determining drilling activity.
This patent application is currently assigned to Schlumberger Technology Corporation. Invention is credited to Harry Barrow, Bertrand Du Castel.
Application Number | 20110166789 13/054086 |
Family ID | 41570663 |
Filed Date | 2011-07-07 |
United States Patent
Application |
20110166789 |
Kind Code |
A1 |
Barrow; Harry; et al. |
July 7, 2011 |
SYSTEM AND METHOD FOR DETERMINING DRILLING ACTIVITY
Abstract
A method and system for interpreting oilfield process data,
including drilling rig data and/or the like, is described, the
method and system including use of a knowledge representation
containing a representation of uncertainty in the oilfield process
operations.
Inventors: |
Barrow; Harry;
(Cambridgeshire, GB) ; Du Castel; Bertrand;
(Austin, TX) |
Assignee: |
Schlumberger Technology
Corporation
Cambridge
MA
|
Family ID: |
41570663 |
Appl. No.: |
13/054086 |
Filed: |
July 23, 2009 |
PCT Filed: |
July 23, 2009 |
PCT NO: |
PCT/IB2009/006346 |
371 Date: |
March 22, 2011 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
61083125 | Jul 23, 2008 | |
61083074 | Jul 23, 2008 | |
Current U.S.
Class: |
702/6 |
Current CPC
Class: |
G06N 7/005 20130101;
G06N 5/022 20130101 |
Class at
Publication: |
702/6 |
International
Class: |
G06F 19/00 20110101;
G06F 17/18 20060101 |
Claims
1. A method of processing oilfield process status data in a data
channel, comprising: representing oilfield process knowledge in a
knowledge representation, the knowledge representation including
representations of uncertainty values in the oilfield process
knowledge; and generating an interpretation of the oilfield process
status data from the oilfield process knowledge representation.
2. The method of claim 1, wherein: the step of representing
oilfield process knowledge in a knowledge representation comprises
representing oilfield process knowledge in a knowledge
representation of activities and transitions between subactivities
of activities, including representing probabilities for transitions
between subactivities of activities; and the step of generating an
interpretation comprises computing activity probability values
corresponding to each of a set of activities from transitional
probabilities and data values in the data channel.
3. The method of claim 2, wherein the step of computing activity
probability values comprises: determining detailed activities
corresponding to each possible subactivity; computing
activity-to-data probability values for each ordered pair (a,d) of
detailed activity (a) and data channel value (d), each value being
the probability that, given the activity (a), the data channel would
indicate the data channel value (d); for each data sample (n) in the
data channel
computing an activity probability vector (n) for a set of
activities, wherein the probability value for each activity in the
set of activities indicates the probability that the drilling rig
carried out that activity at time n, wherein each of the
activity probability vectors (205) is computed by applying the
transition probabilities (203) to an activity vector (n-1) (201);
selecting an activity-to-data probability vector (207) from the
computed activity-to-data probability values, wherein the elements
of the activity-to-data probability vector correspond to the
activities in the set of activities and the data sample (n); and
computing an output probability vector (209) by applying the
activity-to-data probability vector (207) to the first probability
vector (205) and by applying Bayes' theorem to the result.
4. A method of interpreting oilfield process state data,
comprising: representing oilfield process knowledge in an activity
grammar containing activity states; transitional probabilities for
transitioning from one activity to another; configuration
variables; and leaf activity with assigned values for each
configuration variable; and interpreting input data using the
activity grammar to compute probabilities for each of a set of
activities defined as possible activities of an oilfield process
state.
5. A method of processing drilling rig data in a data channel,
comprising: representing drilling knowledge in a knowledge
representation including representing uncertainty values in the
drilling knowledge; and generating an interpretation of the
drilling rig data from the drilling knowledge representation
including the representation of uncertainty values.
6. The method of processing drilling rig data in the data channel
of claim 5, wherein: the representing comprises: representing
drilling knowledge in a knowledge representation of activities and
transitions between subactivities of activities, including
representing probabilities for transitions between subactivities of
activities; and the generating an interpretation comprises:
computing activity probability values corresponding to each of a
set of activities from the transitional probabilities and data
values in the data channel.
7. The method of interpreting drilling rig data in a data channel
of claim 6, wherein the step of computing activity probability
values comprises: determining detailed activities corresponding to
each possible subactivity; computing activity-to-data probability
values for each ordered pair (a,d) of detailed activity (a) and
data channel value (d) that given the activity (a), the data
channel would indicate the data channel value (d); for each data
sample (n) in the data channel computing an activity probability
vector (n) for a set of activities, wherein the probability value
for each activity in the set indicates the probability that the
drilling rig carried out that activity at time n, by: computing
a first probability vector (205) by applying the transition
probabilities (203) to the activity vector (n-1) (201); selecting
an activity-to-data probability vector (207) from the computed
activity-to-data probability values, wherein the elements of the
activity-to-data probability vector correspond to the activities in
the set of activities and the data sample (n); and computing an
output probability vector (209) by applying the activity-to-data
probability vector (207) to the first probability vector (205) and
by applying Bayes' theorem to the result.
8. A method of interpreting drilling rig data, comprising:
representing drilling knowledge in an activity grammar containing:
activity states; transitional probabilities for transitioning from
one activity to another; configuration variables; leaf activity
with assigned values for each configuration variable; and
interpreting input data using the activity grammar to compute
probabilities for each of a set of activities defined as possible
activities of a drilling rig.
9. The method of interpreting drilling rig data of claim 8, wherein
the representing drilling knowledge in an activity grammar further
comprises: for each activity that is not a leaf activity: a start
state and a finish state; at least one subactivity; at least one
transition from the start state to at least one subactivity; and at
least one transition from at least one subactivity to the finish
state.
10. A computer system for interpreting drilling data comprising:
sensors for collecting drilling data; a storage for storing data
and program instructions; a processor having a data input/output
mechanism and operable to input data on the input/output mechanism
and to output data onto the input/output mechanism, and to process
data according to instructions stored in the program storage;
wherein the storage contains: a representation of drilling
knowledge including representation of uncertainty values in the
drilling knowledge; and instructions that when executed cause the
processor to generate an interpretation of the drilling rig data
from the drilling knowledge representation including the
representation of uncertainty values.
11. The computer system for interpreting drilling data of claim 10,
wherein: the representation of drilling knowledge comprises a
knowledge representation of activities and transitions between
subactivities of activities, including representing probabilities
for transitions between subactivities of activities; and the
instructions to generate an interpretation comprise instructions
to cause the processor to: compute activity probability values
corresponding to each of a set of activities from the transitional
probabilities and data values in the data channel.
12. The computer system for interpreting drilling data according to
claim 11, wherein the instructions to cause the processor to
compute activity probability values comprise instructions to cause
the processor to: determine detailed activities corresponding to
each possible subactivity; compute activity-to-data probability
values for each ordered pair (a,d) of detailed activity (a) and
data channel value (d) that given the activity (a), the data
channel would indicate the data channel value (d); for each data
sample (n) in the data channel compute an activity probability
vector (n) for a set of activities, wherein the probability value
for each activity in the set indicates the probability that the
drilling rig carried out that activity at time n, by: computing
a first probability vector (205) by applying the transition
probabilities (203) to the activity vector (n-1) (201); selecting
an activity-to-data probability vector (207) from the computed
activity-to-data probability values, wherein the elements of the
activity-to-data probability vector correspond to the activities in
the set of activities and the data sample (n); computing an output
probability vector (209) by applying the activity-to-data
probability vector (207) to the first probability vector (205) and
normalizing the result.
13. A method for determining probabilities of data collected during
exploration or production of subterranean resources corresponding
to particular activities, comprising: receiving a sequence of data
values from a subterranean resource exploration operation; and
determining the probability that the data values correspond to each
of several activity states using a function derived from a
knowledge representation containing probabilities of transitioning
from one activity state to each other of the several activity
states, and a function providing a probability mapping of
particular data values to possible activity states.
14. The method for determining probabilities of data collected
during exploration or production of subterranean resources
corresponding to particular activities of claim 13, wherein the
function derived from a knowledge representation containing
probabilities of transitioning from one activity state to each
other of the several activity states is a transition probability
matrix in which each element is the probability of transitioning
from one activity state to another activity state.
15. The method for determining probabilities of data collected
during exploration or production of subterranean resources
corresponding to particular activities of claim 14 further
comprising: adjusting the function providing a probability mapping
of particular data values to possible activity states to account
for confusion in regard to whether particular data values
correspond to actual conditions.
16. The method for determining probabilities of data collected
during exploration or production of subterranean resources
corresponding to particular activities of claim 13 wherein the
knowledge representation containing probabilities of transitioning
from one activity state to each other of the several activity
states is a representation of a stochastic grammar having rules
describing possible transitions in a sequence of activities that
may correspond to the sequence of oilfield data values.
17. The method for determining probabilities of data collected
during exploration or production of subterranean resources
corresponding to particular activities of claim 13, wherein the
function providing a probability mapping of particular data values
to possible activity states is a data-to-activity state probability
matrix in which each element is the probability that a given data
value corresponds to a particular activity state.
18. The method for determining probabilities of data collected
during exploration or production of subterranean resources
corresponding to particular activities of claim 13 further
comprising: determining a first vector of probability values
corresponding to a particular data item in the sequence of oilfield
data values by applying the function derived from a knowledge
representation containing probabilities of transitioning from one
activity state to each other of the several activity states to a
preceding vector of probability values corresponding to a data item
preceding the particular data item to determine the probability of
each of the several activity states given the probability in the
preceding vector of probability values and the probabilities of
transitioning from one activity state to each other of the several
activity states; determining a second vector of probability values
corresponding to a particular data item in the sequence of oilfield
data values by applying the function providing a probability
mapping of particular data values to possible activity states to
the particular data item in the sequence of oilfield data values;
and determining a probability vector for the particular data item
by combining the first vector of probability values and the second
vector of probability values.
19. A method of evaluating alternative hypotheses in regard to
origin of data collected during exploration or production of
subterranean resources, comprising: receiving a sequence of data
values from a subterranean resource exploration operation; for each
of a plurality of hypotheses, determining the probability that the
data values correspond to each of several activity states using: a
first function providing probabilities for transitioning from each
of the several activity states to each other of the several
activity states wherein the first function is derived from a
knowledge representation containing probabilities of transitioning
from one activity state to each other of the several activity
states according to rules specified for each of the plurality of
hypotheses, and a second function providing a probability mapping
of particular data values to possible activity states wherein the
second function is derived from a knowledge representation
containing hypothesis-specific mapping of traits to activity states
and traits to data values according to rules specified for each of
the plurality of hypotheses; and rejecting any hypothesis for which
the determined probability that the data values correspond to each
of several activity states indicates that the rules specified for
the hypothesis provide a poor match of the data sequence.
20. The method of evaluating alternative hypotheses in regard to
origin of data collected during exploration or production of
subterranean resources according to claim 19, further comprising:
storing a plurality of stochastic grammars in a knowledge
representation for exploration of subterranean resources, wherein
each stochastic grammar reflects data-origin-particular activities
encountered in the exploration of subterranean resources and the
probabilities of transitioning between those activities; and
generating the first function from the stochastic grammar.
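The recursive update recited in claims 3, 7, 12 and 18 can be
sketched as follows. This is a minimal illustration under stated
assumptions, not the applicants' implementation: the two-activity
model, the matrix values, and the names `forward_step` and
`forward_filter` are hypothetical and chosen only for the example.
The per-sample normalizer is accumulated as a log-likelihood, which
is one plausible way to score how well a hypothesis matches a data
sequence in the sense of claim 19.

```python
import math

def forward_step(prev, trans_prob, data_state_prob, d):
    """One update: predict with transitions, weight by data, normalize.

    prev            -- activity probability vector at sample n-1
    trans_prob      -- trans_prob[i][j] = P(activity j at n | activity i at n-1)
    data_state_prob -- data_state_prob[d][j] = P(data value d | activity j)
    d               -- index of the data value observed at sample n
    """
    n = len(prev)
    # Predicted vector: apply transition probabilities to the previous vector.
    predicted = [sum(prev[i] * trans_prob[i][j] for i in range(n))
                 for j in range(n)]
    # Weight each activity by how well it explains the observed data value.
    weighted = [predicted[j] * data_state_prob[d][j] for j in range(n)]
    total = sum(weighted)
    if total == 0.0:
        raise ValueError("observation is impossible under this model")
    # Normalizing is the Bayes' theorem step; total is P(d | earlier data).
    return [w / total for w in weighted], total

def forward_filter(initial, trans_prob, data_state_prob, samples):
    """Run forward_step over a data sequence; return vectors and log-likelihood."""
    vectors, log_likelihood, current = [], 0.0, initial
    for d in samples:
        current, evidence = forward_step(current, trans_prob, data_state_prob, d)
        vectors.append(current)
        log_likelihood += math.log(evidence)
    return vectors, log_likelihood

# Hypothetical two-activity model: index 0 = "drilling", 1 = "in slips".
TRANS = [[0.9, 0.1],
         [0.2, 0.8]]
# Hypothetical data values: row 0 = "high hookload", row 1 = "low hookload".
DATA = [[0.8, 0.1],
        [0.2, 0.9]]

vectors, log_likelihood = forward_filter([0.5, 0.5], TRANS, DATA, [0, 0, 1])
```

Each entry of `vectors` is a normalized activity probability vector
for one data sample. Running the same data through models built from
different grammars and comparing the resulting `log_likelihood`
values is one way, assumed here, to decide which hypothesis matches
the data poorly.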
Description
[0001] Embodiments of the present invention relate to interpreting
data, including but not limited to interpreting data from oilfield
applications--which data may include but is not limited to drilling
data, production data, well data, completions data, drill string
data, wellbore data, logging data and/or the like--using a
knowledge representation that contains representation of
uncertainties.
BACKGROUND
[0002] The statements in this section merely provide background
information related to the present disclosure and may not
constitute prior art.
[0003] In oilfield applications, the drilling process can be
impeded by a wide variety of problems. Accurate measurements of
downhole conditions, rock properties and surface equipment allow
many drilling risks to be minimized and may also be used for
detecting when a problem has occurred. At present, most problem
detection is the result of human vigilance, but detection
probability is often degraded by fatigue, high workload or lack of
experience.
[0004] Merely by way of example, in oilfield applications, some
limited techniques have been used for detecting the occurrence of
one of two possible rig states using a single input channel. In one
example, a technique may be used to automatically detect if a drill
pipe for drilling a hydrocarbon well is either "in slips" or "not
in slips". This information may be used to gain accurate control of
depth estimates, for example in conjunction with activities such as
measurement-while-drilling (MWD) or mud logging. To tell whether
the drill pipe is "in slips," the known technique generally only
uses a single input channel of hookload data measured on the
surface. Another example of making a determination between two
possible rig states is a technique used to predict if the drill bit
is "on bottom" or "not on bottom." Again, this method makes use of
only a single input channel, namely block position, and is only
used to detect one of two "states" of the drilling rig.
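By way of illustration only, a single-channel, two-state detector of
the kind described above might be sketched as follows. The source
does not say how the known technique decides between the two states;
the fixed threshold, the sample values, and the function name
`in_slips_flags` are assumptions made for this example. The sketch
relies on the hookload being low while the slips carry the weight of
the drill string.

```python
def in_slips_flags(hookload, threshold):
    """Flag each sample as "in slips" when hookload is below the threshold.

    The threshold rule is an assumption for illustration; the source does
    not specify the decision rule used by the known technique.
    """
    return [value < threshold for value in hookload]

# Hypothetical hookload samples (arbitrary units) and threshold.
hookload = [180.0, 175.0, 40.0, 38.0, 182.0]
flags = in_slips_flags(hookload, threshold=60.0)
```

A detector of this kind yields a hard yes/no answer per sample from
one channel, which is the limitation the background section points
out.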
[0005] In the oilfield industry there is a need to automate
processes/applications and to monitor the automated processes and
applications. This automation and monitoring may require monitoring
of one or more streams of data and interpretation of the data.
BRIEF DESCRIPTION OF THE DRAWINGS
[0006] The present disclosure is described in conjunction with the
appended figures.
[0007] FIG. 1 shows a drilling system using automatic rig state
detection, according to one embodiment of the present
invention.
[0008] FIG. 2 is a schematic-type illustration of a processing
system for processing data to determine an oilfield application
state, according to one embodiment of the present invention.
[0009] FIG. 3(a) is a screen shot showing graphs of several data
channels collected during a drilling operation as may be processed
in accordance with an embodiment of the present invention.
[0010] FIG. 3(b) is a screen shot showing a zoom-in on the graphs
of several data channels collected during the drilling operation of
FIG. 3(a) over a short time interval.
[0011] FIG. 3(c) is a screen shot showing graphs of several data
channels collected during a drilling operation and interpretations
including probabilities of particular drilling activities occurring
based on the input data of FIG. 3(a) using a methodology in
accordance with an embodiment of the present invention.
[0012] FIG. 3(d) is a zoom-in on the screen shot of FIG. 3(c) for a
short time interval.
[0013] FIG. 4 is a high-level schematic illustration of a RIG data
interpretation software program that may be used in accordance with
an embodiment of the present invention to compute the
interpretations illustrated in FIGS. 3(c) and 3(d).
[0014] FIG. 5 is a schematic-type illustration depicting use of
drilling knowledge to interpret drilling data to produce an
interpretation, in accordance with an embodiment of the present
invention.
[0015] FIG. 6 is a screen dump of a portion of an ontology of
drilling, in accordance with an embodiment of the present
invention.
[0016] FIGS. 7 through 11 are schematic-type illustrations of
activities in a drilling activity grammar as may be used in the
interpretation methodology of FIG. 5 or the like, in accordance
with an embodiment of the present invention.
[0017] FIG. 12 is a block diagram illustrating the components of a
data interpretation program generated by a data interpretation
program generator, in accordance with an embodiment of the present
invention.
[0018] FIG. 13 is a block diagram illustrating a process for
interpreting a time sequence of input data, to compute probability
values for leaf states and for activities defined by an activity
grammar, in accordance with an embodiment of the present
invention.
[0019] FIG. 14 is a block diagram illustrating exemplary pseudo
code implementing the process illustrated and discussed in
conjunction with FIG. 13, in accordance with an embodiment of the
present invention.
[0020] FIG. 15 is a block diagram illustration of a code for
computing a current state probability vector, in accordance with an
embodiment of the present invention.
[0021] FIG. 16 is a block diagram illustrating a relationship
between the stochastic grammar illustrated in FIG. 5, the
interpretation program code generator described in FIG. 5, and a
TRANS-PROB matrix and a DATA-STATE-PROB matrix of the data
interpretation program, in accordance with an embodiment of the
present invention.
[0022] FIG. 17 is a grammar that is relied upon herein to provide
an illustrative example of a process for generating the TRANS-PROB
matrix, in accordance with an embodiment of the present
invention.
[0023] FIG. 18 is an example of a TRANS-PROB matrix corresponding
to the grammar of FIG. 17, in accordance with an embodiment of the
present invention.
[0024] FIG. 19 is an abstraction of a grammar showing four leaf
states without providing specific grammar rules for transitioning
between these four leaf states, in accordance with an embodiment of
the present invention. (This abstraction is provided merely by way
of example for illustrative purposes).
[0025] FIG. 20 is a continuation of FIG. 19 and provides a table
illustrating values for traits making up the configurations of the
leaf activities of FIG. 19.
[0026] FIG. 21 illustrates the compatible configurations for the
configurations defined in FIG. 20.
[0027] FIG. 22 provides a table of the values of the traits making
up the configurations corresponding to six data values used in the
example introduced in FIG. 19.
[0028] FIG. 23 illustrates the compatible configurations for the
configurations defined in FIG. 22.
[0029] FIG. 24 is a table of sequence of flags that reflect whether
a particular corresponding configuration of the state is compatible
with the configuration of the data value, in accordance with an
embodiment of the present invention.
[0030] FIG. 25 is a table illustrating an intermediate step in the
computation of the DATA-STATE-PROB matrix corresponding to the
example introduced in FIG. 19, in accordance with an embodiment of
the present invention.
[0031] FIG. 26 is an illustration of a resulting matrix following a
normalization operation, in accordance with an embodiment of the
present invention.
[0032] FIG. 27 provides an example of a TRANS-PROB matrix provided
merely by way of example for the purposes of illustrating the
operation of the interpretation program.
[0033] FIG. 28 is an illustration of results of operation of an
interpretation program 401 using the process described in
conjunction with FIGS. 13 through 15 and the TRANS-PROB matrix of
FIG. 27 and the DATA-STATE-PROB matrix of FIG. 26, in accordance
with an embodiment of the present invention.
[0034] FIG. 29 illustrates some confusion matrices corresponding to
the example given herein above, in accordance with an embodiment of
the present invention.
[0035] FIG. 30 is a flow-chart illustrating a process of applying
the data interpretation program introduced herein to evaluate
multiple hypotheses relevant to the origin of a data set, in
accordance with an embodiment of the present invention.
[0036] In the appended figures, similar components and/or features
may have the same reference label. Further, various components of
the same type may be distinguished by following the reference label
by a dash and a second label that distinguishes among the similar
components. If only the first reference label is used in the
specification, the description is applicable to any one of the
similar components having the same first reference label
irrespective of the second reference label.
DETAILED DESCRIPTION
[0037] In the following detailed description, reference is made to
the accompanying drawings that show, by way of illustration,
specific embodiments in which the invention may be practiced. These
embodiments are described in sufficient detail to enable those
skilled in the art to practice the invention. It is to be
understood that the various embodiments of the invention, although
different, are not necessarily mutually exclusive. For example, a
particular feature, structure, or characteristic described herein
in connection with one embodiment may be implemented within other
embodiments without departing from the scope of the invention. In
addition, it is to be understood that the location or arrangement
of individual elements within each disclosed embodiment may be
modified without departing from the spirit and scope of the
invention. The following detailed description is, therefore, not to
be taken in a limiting sense, and the scope of the present
invention is defined only by the appended claims, appropriately
interpreted, along with the full range of equivalents to which the
claims are entitled. In the drawings, like numerals refer to the
same or similar functionality throughout the several views.
[0038] It should also be noted that in the development of any such
actual embodiment, numerous decisions specific to circumstance must
be made to achieve the developer's specific goals, such as
compliance with system-related and business-related constraints,
which will vary from one implementation to another. Moreover, it
will be appreciated that such a development effort might be complex
and time-consuming but would nevertheless be a routine undertaking
for those of ordinary skill in the art having the benefit of this
disclosure.
[0039] In this disclosure, the term "storage medium" may represent
one or more devices for storing data, including read only memory
(ROM), random access memory (RAM), magnetic RAM, core memory,
magnetic disk storage mediums, optical storage mediums, flash
memory devices and/or other machine readable mediums for storing
information. The term "computer-readable medium" includes, but is
not limited to portable or fixed storage devices, optical storage
devices, wireless channels and various other mediums capable of
storing, containing or carrying instruction(s) and/or data.
[0040] Embodiments of the present invention provide a method of
describing oilfield operations in a knowledge representation that
contains a grammar for interpreting oilfield application data.
Merely by way of example, in some embodiments, methods of
describing drilling operations in a knowledge representation that
contains a grammar for interpreting drilling data are provided.
However, the methods herein disclosed may be used on other oilfield
applications, such as hydrocarbon production, well completions,
well logging, well interpretation, recovery operations, stimulation
or the like. The knowledge representation of embodiments of the
present invention may include representation of uncertainty.
[0041] For example, given a representation of oilfield operations,
such as drilling activities or the like, that have component
subactivities, the representation may include probabilities for
transitioning from one such subactivity to another. The method,
when applied, may provide for an efficient way of interpreting
input data to determine the probability that the input data is
indicative of certain activities occurring and is therefore a
valuable tool in analyzing an oilfield application, such as the
operations of a drilling rig or the like.
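A representation of an activity with subactivities and transition
probabilities could, for example, be held as a nested mapping and
flattened into a matrix. The sketch below is an assumption-laden
illustration: the activity name, the subactivity names, the
probability values, the use of self-transitions, and the helper names
`validate_transitions` and `transition_matrix` are all hypothetical.
It follows the structure recited in claim 9 (a start state, a finish
state, and transitions through at least one subactivity).

```python
# Hypothetical activity "make connection": start -> in_slips -> pull_up -> finish.
# Each state maps to its outgoing transitions and their probabilities.
ACTIVITY = {
    "start": {"in_slips": 1.0},
    "in_slips": {"in_slips": 0.7, "pull_up": 0.3},
    "pull_up": {"pull_up": 0.6, "finish": 0.4},
}

def validate_transitions(activity, tolerance=1e-9):
    """Check that each state's outgoing probabilities are valid and sum to 1."""
    for state, outgoing in activity.items():
        for target, p in outgoing.items():
            if p < 0.0 or p > 1.0:
                raise ValueError(f"invalid probability {p} for {state}->{target}")
        if abs(sum(outgoing.values()) - 1.0) > tolerance:
            raise ValueError(f"outgoing probabilities of {state} do not sum to 1")
    return True

def transition_matrix(activity):
    """Return (states, matrix) with matrix[i][j] = P(states[j] | states[i])."""
    # Collect every state that appears as a source or a target.
    states = sorted(set(activity) | {t for out in activity.values() for t in out})
    matrix = [[activity.get(s, {}).get(t, 0.0) for t in states] for s in states]
    return states, matrix

is_valid = validate_transitions(ACTIVITY)
states, matrix = transition_matrix(ACTIVITY)
```

In this sketch the finish state has no outgoing transitions, so its
matrix row is all zeros; a full grammar would connect it to whatever
activity follows.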
[0042] FIG. 1 shows a drilling system 10 using automatic rig state
detection, according to one embodiment of the present invention.
Drill string 58 is shown within borehole 46. Borehole 46 is located
in the earth 40 having a surface 42. Borehole 46 is being cut by
the action of drill bit 54. Drill bit 54 is disposed at the far end
of the bottomhole assembly 56 that is attached to and forms the
lower portion of drill string 58. Bottomhole assembly 56 contains a
number of devices including various subassemblies. According to an
embodiment of the invention measurement-while-drilling (MWD)
subassemblies are included in subassemblies 62. Examples of typical
MWD measurements include direction, inclination, survey data,
downhole pressure (inside the drill pipe, and outside or annular
pressure), resistivity, density, and porosity. Also included is a
subassembly 62 for measuring torque and weight on bit. The signals
from the subassemblies 62 are preferably processed in processor 66.
After processing, the information from processor 66 is communicated
to pulser assembly 64. Pulser assembly 64 converts the information
from processor 66 into pressure pulses in the drilling fluid. The
pressure pulses are generated in a particular pattern which
represents the data from subassemblies 62. The pressure pulses
travel upwards through the drilling fluid in the central opening in
the drill string and towards the surface system. The subassemblies
in the bottomhole assembly 56 can also include a turbine or motor
for providing power for rotating and steering drill bit 54. In
different embodiments, other telemetry systems, such as wired pipe,
fiber optic systems, acoustic systems, wireless communication
systems and/or the like may be used to transmit data to the surface
system.
[0043] The drilling rig 12 includes a derrick 68 and hoisting
system, a rotating system, and a mud circulation system. The
hoisting system, which suspends the drill string 58, includes draw
works 70, fast line 71, crown block 75, drilling line 79, traveling
block and hook 72, swivel 74, and deadline 77. The rotating system
includes kelly 76, rotary table 88, and engines (not shown). The
rotating system imparts a rotational force on the drill string 58
as is well known in the art. Although a system with a kelly and
rotary table is shown in FIG. 1, those of skill in the art will
recognize that the present invention is also applicable to top
drive drilling arrangements. Although the drilling system is shown
in FIG. 1 as being on land, those of skill in the art will
recognize that the present invention is equally applicable to
marine environments.
[0044] The mud circulation system pumps drilling fluid down the
central opening in the drill string. The drilling fluid is often
called mud, and it is typically a mixture of water or diesel fuel,
special clays, and other chemicals. The drilling mud is stored in
mud pit 78. The drilling mud is drawn in to mud pumps (not shown),
which pump the mud through standpipe 86 and into the kelly 76
through swivel 74 which contains a rotating seal.
[0045] The mud passes through drill string 58 and through drill bit
54. As the teeth of the drill bit grind and gouge the earth
formation into cuttings, the mud is ejected out of openings or
nozzles in the bit with great speed and pressure. These jets of mud
lift the cuttings off the bottom of the hole and away from the bit
54, and up towards the surface in the annular space between drill
string 58 and the wall of borehole 46.
[0046] At the surface the mud and cuttings leave the well through a
side outlet in blowout preventer 99 and through mud return line
(not shown). Blowout preventer 99 comprises a pressure control
device and a rotary seal. The mud return line feeds the mud into
separator (not shown) which separates the mud from the cuttings.
From the separator, the mud is returned to mud pit 78 for storage
and re-use.
[0047] Various sensors are placed on the drilling rig 12 to take
measurements of the drilling equipment. In particular, hookload is
measured by hookload sensor 94 mounted on deadline 77, block
position and the related block velocity are measured by block
sensor 95 which is part of the draw works 70. Surface torque is
measured by a sensor on the rotary table 88. Standpipe pressure is
measured by pressure sensor 92, located on standpipe 86. Additional
sensors may be used to detect whether the drill bit 54 is on
bottom. Signals from these measurements are communicated to a
central surface processor 96. In addition, mud pulses traveling up
the drillstring are detected by pressure sensor 92. Pressure sensor
92 comprises a transducer that converts the mud pressure into
electronic signals. The pressure sensor 92 is connected to surface
processor 96, which converts the pressure signal into digital form,
and stores and demodulates the digital signal into
useable MWD data. According to various embodiments described above,
surface processor 96 is programmed to automatically detect the most
likely rig state based on the various input channels described.
Processor 96 is also programmed to carry out the automated event
detection as described above. Processor 96 preferably transmits the
rig state and/or event detection information to user interface
system 97 which is designed to warn the drilling personnel of
undesirable events and/or suggest activity to the drilling
personnel to avoid undesirable events, as described above. In other
embodiments, interface system 97 may output a status of drilling
operations to a user, which may be a software application, a
processor and/or the like, and the user may manage the drilling
operations using the status.
[0048] Processor 96 may be further programmed, as described below,
to interpret the data collected by the various sensors so as to
provide an interpretation in terms of the activities that may have
occurred in producing the collected data. Such interpretation may
be used to understand the activities of a driller, to automate
particular tasks of a driller, and to provide training for
drillers.
[0049] FIG. 2 shows further detail of processor 96, according to
preferred embodiments of the invention. Processor 96 preferably
consists of one or more central processing units 350, main memory
352, communications or I/O modules 354, graphics devices 356, a
floating point accelerator 358, and mass storage such as tapes and
discs 360. It should be noted that while processor 96 is
illustrated as being part of the drill site apparatus, it may also
be located, for example, in an exploration company data center or
headquarters.
[0050] FIG. 3(a) is a screenshot of 11 data channels logged as part
of a drilling operation and one data channel that is an
interpretation of a subset of the 11 logged data channels. Channel
301 is a plot of the depth (DEPT) and horizontal depth (HDTH).
Channel 303 is a plot of block position (BPOS). Channel 305 is a
plot of block velocity (BVEL). Channel 307 is a plot of hook load
(HKLD). Channel 309 is a plot of standpipe pressure (SPPA). Channel
311 is a plot of mud flow rate in (FLWI). Channel 313 is a plot of
rotational speed (RPM). Channel 315 is a plot of surface torque
(STOR). Channel 317 is a plot of rate of penetration (ROP). Channel
319 is a plot of a binary value that indicates whether the bit is
on bottom (BONB), and channel 321 is a plot of a binary value
indicating whether the rig is "in slips" (SLIPSTAT).
[0051] FIG. 3(b) is a zoomed-in view of a small section along the
time index of the screenshot of FIG. 3(a), spreading out
the data to show greater detail.
[0052] As described in U.S. Pat. Nos. 6,868,920 and
7,128,167, which patents are commonly owned by the owner of the
present application and are incorporated herein in their entirety
by reference for all purposes, various sensor data, i.e., one or
more of the data channels shown in FIG. 3, may be communicated via
the communications modules 354 and may be interpreted to determine
a rig state. The rig states, in one embodiment, may include the
following states: DrillRot, DrillSlide, RihPumpRot, RihPump, Rih,
PoohPumpRot, PoohPump, Pooh, StaticPumpRot, StaticPump, Static, In
slips, and Unclassified, where Rih=Run in hole, Rot=Rotate, and
Pooh=Pull out of hole. These states correspond to a numerical value
in the RIG channel that is also logged and depicted as channel 323
in FIG. 3.
[0053] Table I is a listing of RIG channel values and corresponding
configurations:
TABLE-US-00001 TABLE I RIG Channel Values and Configurations
Integer                  TRAITS
Value    Rig State       Rotation  Pumping  Block  Bottom      Slips
0        Rotary Drill    On        On       Slow   On Bottom   Not Slips
1        Slide Drill     Off       On       Slow   On Bottom   Not Slips
2        In Slips        --        --       --     --          In Slips
3        Ream            On        On       Down   Off Bottom  Not Slips
4        Run In Pump     Off       On       Down   Off Bottom  Not Slips
5        Run In Rotate   On        Off      Down   Off Bottom  Not Slips
6        Run In          Off       Off      Down   Off Bottom  Not Slips
7        Back Ream       On        On       Up     Off Bottom  Not Slips
8        Pull Up Pump    Off       On       Up     Off Bottom  Not Slips
9        Pull Up Rotate  On        Off      Up     Off Bottom  Not Slips
10       Pull Up         Off       Off      Up     Off Bottom  Not Slips
11       Rotate Pump     On        On       Stop   Off Bottom  Not Slips
12       Pump            Off       On       Stop   Off Bottom  Not Slips
13       Rotate          On        Off      Stop   Off Bottom  Not Slips
14       Stationary      Off       Off      Stop   Off Bottom  Not Slips
15       Unclassified    --        --       --     --          --
16       Absent          --        --       --     --          --
17       Data Gap        --        --       --     --          --
[0054] In addition to the physical traits such as Rotation etc.,
the grammar of Appendix A defines traits for datagap, classified,
and absent. These traits reflect the presence or consistency of the
data. For example, a configuration that is not compatible with any
of the first 15 data values would be unclassified. Where data is
missing for one index value in the data, the data would be absent,
and if no data is recorded (including no index value), the data
would be a datagap. For Rig States 0 through 14, these traits all
have the values classified, not absent, and not datagap.
Conversely, rig states 15 through 17 correspond to the conditions
resulting in those particular states, e.g., for 15 unclassified,
the trait values are unclassified, not absent, and not datagap.
[0055] A configuration is a particular combination of traits. The
Rig channel is an assignment of a value corresponding to values
collected from sensors that indicate a combination of traits
corresponding to particular drilling conditions and operations. For
this example, the traits are Rotation, Pumping, Block, Bottom, and
Slips: Rotation signifies whether the drill string is rotating or
not; Pumping signifies whether drilling mud is being pumped; Block
indicates the movement of the block, i.e., up, down, slow, or no
movement; Bottom indicates whether the drill bit is on bottom; and
Slips reflects whether the drill string is in slips or not. Thus, a
configuration is a particular combination of these trait values. Of
course, given five variables, some of which take on several
different values, the universe of configurations is rather large.
However, some combinations of traits may not make sense. These
nonsensical combinations are relegated to the Unclassified
configuration. Drilling data may be collected on
particular time intervals. As such, in some embodiments of the
present invention, if for any given time index, data is recorded as
NIL, the Absent value is assigned to the RIG channel. Similarly, if
no data is recorded at all, the Data Gap value is assigned to the
RIG channel.
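By way of illustration only, the mapping from a trait configuration to a RIG channel value described above may be sketched as a table lookup. The sketch below transcribes Table I; the function name rig_value, the dictionary keys, and the lower-case trait spellings are hypothetical and do not come from Appendix A. It further assumes that a missing record is represented by None (Data Gap) and a NIL trait by a missing or None entry (Absent).

```python
# Illustrative sketch only: Table I as a lookup from traits to RIG value.
# Key order: (rotation, pumping, block, bottom). Names are hypothetical.
IN_SLIPS, UNCLASSIFIED, ABSENT, DATA_GAP = 2, 15, 16, 17

RIG_VALUES = {
    ("on",  "on",  "slow", "on"):  0,   # Rotary Drill
    ("off", "on",  "slow", "on"):  1,   # Slide Drill
    ("on",  "on",  "down", "off"): 3,   # Ream
    ("off", "on",  "down", "off"): 4,   # Run In Pump
    ("on",  "off", "down", "off"): 5,   # Run In Rotate
    ("off", "off", "down", "off"): 6,   # Run In
    ("on",  "on",  "up",   "off"): 7,   # Back Ream
    ("off", "on",  "up",   "off"): 8,   # Pull Up Pump
    ("on",  "off", "up",   "off"): 9,   # Pull Up Rotate
    ("off", "off", "up",   "off"): 10,  # Pull Up
    ("on",  "on",  "stop", "off"): 11,  # Rotate Pump
    ("off", "on",  "stop", "off"): 12,  # Pump
    ("on",  "off", "stop", "off"): 13,  # Rotate
    ("off", "off", "stop", "off"): 14,  # Stationary
}

def rig_value(sample):
    """Return the RIG channel value for one time index.

    sample is None when nothing was recorded at all (Data Gap), otherwise
    a dict with keys rotation, pumping, block, bottom, slips.
    """
    if sample is None:
        return DATA_GAP
    if sample.get("slips") == "in":
        return IN_SLIPS  # the other traits are "don't care" in Table I
    keys = ("rotation", "pumping", "block", "bottom", "slips")
    if any(sample.get(k) is None for k in keys):
        return ABSENT  # assumption: any NIL trait marks the sample Absent
    config = tuple(sample[k] for k in keys[:4])
    # Combinations that Table I does not list fall through to Unclassified.
    return RIG_VALUES.get(config, UNCLASSIFIED)
```

For instance, a sample with the block moving up while the bit is on bottom is not listed in Table I and is therefore mapped to the Unclassified value 15.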
[0056] While in one embodiment the invention may be used to
interpret activities that correspond to values of the RIG channel,
in other embodiments, other data values may be interpreted, either
as combinations of data channels forming configurations in a
similar manner to that discussed above for the RIG channel or for
single channel data sets.
[0057] FIG. 4 is a high-level schematic illustration of a RIG data
interpretation software program 401, advantageously stored on a
mass storage device 360 of the computer system 96 or on an
interpretation computer system not located at the rig site but, for
example, having a similar architecture as computer system 96, that
is operable to further interpret the log data collected during a
drilling operation. In one embodiment, described herein, the
further interpretation operates to further interpret the collected
data to determine the activities that occur or have occurred during
a drilling operation by analyzing a time sequence of the RIG
channel 323. In an alternative embodiment additional data channels
are used for determining the activities that occur or have occurred
during a drilling operation.
[0058] The drilling data interpretation program 401 may accept as
input a drilling knowledge base 403 and drilling data 405. The
drilling data 405 may be drilling log data, for example, as
depicted in FIGS. 3(a) and 3(b), a subset of the drilling log data,
or, for example, the RIG_STATE channel 323. The drilling
knowledgebase 403 is described in further detail below. As is
discussed herein below, in an alternative embodiment, the drilling
data interpretation program 401 may be constructed from the
drilling knowledge in the drilling knowledgebase 403. It may, in
accordance with an embodiment of the present invention, then be
reused for interpreting subsequent data sets without accessing the
knowledgebase 403.
[0059] The output of the drilling data interpretation program may
be some form of interpretation 407 of the drilling data 405, e.g.,
a report of the activities that are occurring or have occurred
during a drilling operation. The interpretation output 407 may be
an interpretation of the input data using the knowledge contained
in the knowledge base 403.
[0060] Embodiments of the present invention described herein may
be used on a variety of data channels and provide a variety of
interpretations. Herein, merely for purposes of example, the
interpretations that are made from the Rig Status channel 323
include four separate channels as illustrated in FIGS. 3(c) and
3(d). FIGS. 3(c) and 3(d) contain, in addition to a subset of the
channels illustrated in FIGS. 3(a) and 3(b), four interpretation
channel graphs containing curves for several interpretation
probability variables (in italics in the table below):
[0061] 1. (Channel graph 325) Type of Drilling Operation, i.e.,
whether the rig is used for drilling rotary (drill rotary),
drilling sliding (drill sliding), or neither
[0062] 2. (Channel graph 327) Actively making hole (Make Hole),
wiping the hole (Wipe Hole), or merely circulating mud by pumping
(Circulate)
[0063] 3. (Channel graph 329) Actively drilling (drilling), adding
stand (Add Stand)
[0064] 4. (Channel graph 331) Activity unknown (Unknown)
[0065] For each interpretation channel plot 325 through 331 there
are logs for each of the interpretation probability variables. For
example, considering graph 329, for most of the displayed section
of FIG. 3(c) the Drilling plot and the Add Stand plot behave in an
essentially binary manner, e.g., there is a 1.0 probability of
drilling at the same time as there is a 0.0 probability of adding
stand.
However, in the section near time-mark 23, the Add Stand plot
indicates a probability of approximately 0.2-0.3 and, conversely,
the Drilling plot indicates a probability of drilling of
approximately 0.7-0.8. In other words, the plotted curves in graphs
325 through 331 indicate the probability of a particular
activity.
[0066] Having described the input and the interpretation result,
the methodology of interpreting the input data is now described,
which methodology of interpretation may in some embodiments of the
present invention take uncertainty into account and may produce the
interpretation results.
[0067] FIG. 5 is a schematic illustration of one
embodiment of using drilling knowledge to interpret drilling data
405 to produce an interpretation 407. In the example of FIG. 5, the
drilling knowledgebase 403 is used by an interpretation program
code generator 507 to produce a data interpretation program
401.
[0068] The drilling knowledgebase 403 may be contained in a
hierarchical structure 501 known as an ontology. A sample ontology
is depicted and described in co-pending application to Bertrand du
Castel et al., entitled "SYSTEM AND METHOD FOR AUTOMATING
EXPLORATION OR PRODUCTION OF SUBTERRANEAN RESOURCES" filed
contemporaneously with this application, commonly owned by the
owner of the present application, and incorporated by reference
herein for all purposes.
[0069] The ontology 501 may be input into an
Ontology-to-Activity-Grammar program 503, the output of which is an
activity grammar 505. In an alternative embodiment, the drilling
knowledge is contained directly in an activity grammar 505. FIG. 6
is a screen dump of a portion of an ontology of drilling 501. A
corresponding text version of the stochastic grammar ontology may
be found in Appendix A--DRILLING STATES ONTOLOGY LISTING.
[0070] An Activity Grammar 505 contains, for example:
[0071] Activity descriptions for a number of activities, wherein
each activity is described as a start state, a finish state, and
one or more subactivities performed during each activity. There may
be any number of levels of subactivities, i.e., a subactivity may
be further composed of other subactivities.
[0072] Transitional probabilities defining the probability of
transitioning from one subactivity to another subactivity, from the
start state to a particular subactivity, and from a subactivity to
the finish state.
[0073] A number of leaf activities. A leaf activity is an activity
that does not include any subactivities.
[0074] Configuration variables that define the configuration of an
activity with respect to particular traits, which are values of
particular observed conditions, e.g., whether the bit is on bottom,
whether the mud circulating pump is on or not, whether the block is
moving, the direction it is moving, etc., wherein a configuration
is a combination of trait values.
[0075] Specification of a top level activity. For example, the
activity drill_well corresponds to a defined activity that is
composed of several subactivities; drill_well is defined in the
activity grammar at Lines A-1471 through A-1555. Without meta
information, drill_well would logically be the top level activity.
However, in the actual implementation, for implementation reasons,
the top level activity is a combination of the drill_well activity
and the meta activity defined at Lines A-1760 through A-1818.
[0076] Each of these elements of the stochastic grammar 505 is
described herein below.
Activity Descriptions
[0077] FIG. 7 is a schematic illustration of
the activity drill_well represented by stochastic finite state
machine 601. In a sense activities as defined in the activity
grammar are probabilistic finite state machines, i.e., finite state
machines in which transitions from one state to another have
assigned probabilities. In FIG. 7 the activity finite state machine
(AFSM) 601 corresponds to the activity grammar code (Appendix A,
Lines A-1471 through A-1555). The drill_well AFSM 601 has a start
state 603 and a finish state 605. From the start state 603, there
are transitions to each of three sub-activities, namely,
drill_a_section 611, trip_in 609, and trip_out 613, with transition
probabilities 0.4, 0.4, and 0.2. These transitions and transitional
probabilities are defined in the code at Lines A-1473 through
A-1479, A-1480 through A-1486, and A-1501 through A-1507.
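By way of illustration only, an activity viewed as a probabilistic finite state machine may be held as a table of transition rows. The sketch below encodes only the start-state row of drill_well stated above (0.4, 0.4, and 0.2); the remaining rows are defined in Appendix A and are deliberately not reproduced. The names DRILL_WELL, check_rows, and next_state are hypothetical.

```python
import random

# Illustrative sketch only. Only the start-state row is taken from the
# description; the other rows of drill_well reside in Appendix A.
DRILL_WELL = {
    "start": {"drill_a_section": 0.4, "trip_in": 0.4, "trip_out": 0.2},
}

def check_rows(fsm, tol=1e-9):
    """Verify that the outgoing probabilities of every state sum to 1."""
    for state, row in fsm.items():
        total = sum(row.values())
        if abs(total - 1.0) > tol:
            raise ValueError(f"{state}: probabilities sum to {total}")
    return True

def next_state(fsm, state, rng):
    """Draw a successor of `state` according to its transition row."""
    names = list(fsm[state])
    weights = [fsm[state][name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]
```

A call such as next_state(DRILL_WELL, "start", random.Random(0)) returns one of the three subactivities, each with the stated probability.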
[0078] Drilling a section is defined as a continuous drilling
operation that is terminated by an activity that does not fit
within the grammar definition for the drill_a_section activity 611,
see below. Therefore, at the conclusion of drill_a_section 611, or
a sequence of drill_a_section activities, the AFSM drill_well
transitions to the finish state 605.
[0079] Now consider the activity drill_a_section (Lines A-1557
through A-1607) illustrated in FIG. 8. The activity drill_a_section
has several component activities, also known as subactivities. The
subactivities of an activity are states in the finite state machine
corresponding to the activity. Thus, there is a one-to-one mapping
between activities and states. The drill_a_section activity 701 is
depicted graphically in FIG. 8. The drill_a_section 701 is composed
of the subactivities: pre_drill_stand 702, drill_a_stand 703,
pre_add_stand 704 and add_a_stand 705 (in addition to the start 707
and finish states 709).
[0080] FIG. 9 is a schematic illustration of the drill_a_stand
activity 703.
[0081] FIG. 10 is a schematic illustration of an activity with a
repeating subactivity. The trip_in activity 711 includes one
subactivity, the trip_in_stand subactivity 713, which may repeat up
to 100 times.
[0082] FIG. 11 is a schematic illustration of a leaf activity,
namely, the make_hole activity 715. The make_hole activity is
defined in Appendix A at lines A-1188 through A-1213. The make_hole
activity has no subactivities and the only transition is directly
from its start state 717 to its finish state 719. In the context of
the finite state machine representation of stochastic grammar, leaf
activities are referred to as leaf states.
[0083] The example of Appendix A defines the following leaf
activities:
TABLE-US-00002 TABLE II Leaf Activities from the Example of
Appendix A
lower_to_bottom    run_into_hole   lift_out_of_slips
circulate          wipe_up         wipe_down
lower_into_slips   make_hole       pull_out_of_hole
in_slips           connect_stand   unclassified
absent             datagap         unknown
Transitional Probabilities
[0084] As noted above, each activity, other than leaf activities or
bottom level activities, comprises one or more subactivities. The
activity has specified transitional probabilities and a start and
finish state. For example, the drill_well activity 601 defines
transitions from trip_in 609 to trip_out 613 and drill_a_section
611. In the example of drill_well, the transitional probabilities
from its start state 603 are 0.4 to drill_a_section and 0.4 to
trip_in. These probabilities represent the probabilities that a
well drilling operation commences with drilling a section or
tripping in, respectively. In some circumstances, drilling a well
may start with a tripping out operation, represented by a 0.2
probability transition from the start state to the trip_out
subactivity 613.
[0085] As illustrated in FIGS. 7 through 10, activities correspond
to finite state machines with transitions from the start state to
the finish state through a sequence of subactivities. The activity
grammar defines transitional probabilities for the transitions from
the start state to these various subactivities, to one another, and
to the finish state.
[0086] For example, Lines A-1473 through A-1542 define the
transitional probabilities of the activity drill_well,
corresponding to the transitional probabilities illustrated in FIG.
7 and discussed hereinabove.
Configuration Variables and Leaf Activities
[0087] The grammar has certain activities that do not have further
subactivities; these are leaf activities. Associated with each leaf
activity are values for certain traits. The traits may be defined
in superactivities of the leaf activities and inherited by the leaf
activities. A combination of trait values constitutes a
configuration that by definition has certain values when the leaf
activity is being performed. The configuration variables, in a
preferred embodiment, include pump, rotate (optionally), block,
bottom, and slips.
[0088] Pump has the values on and off, and indicates whether the
pump circulating drilling mud through the drillpipe is pumping (on)
or not (off).
[0089] Rotate defines whether the drillstring is rotating or
not.
[0090] Block indicates the movement of the block and has the values
up, down, and stop (i.e., no movement).
[0091] Bottom indicates whether the bit is on the bottom of the
borehole and has the values onbottom and offbottom.
[0092] Slips indicates whether the drillstring is inslips or
notinslips.
[0093] Each leaf state is defined by particular values for each of
the configuration variables. Configurations are particular
combinations of trait values. For example, Lines A-1084 through
A-1109 define that the activity circulate has the values pump=on,
rotate=on, block=stop, bottom=offbottom, and slips=notslips. In
other words, when the activity is circulate, by definition the pump
is pumping, the drill string is rotating, the block is not moving,
and the drillstring is off the bottom of the borehole and not in
slips.
[0094] In addition to the traits pump, rotate, block, bottom, and
slips, the ontology of Appendix A defines several traits that are not
directly associated with drilling operations, but rather with the
data collected. These include classified, datagap and absent.
Classified indicates that the trait combination recorded by the
observed data translates to a datavalue in the RIG channel, i.e.,
if the combination of pump, rotate, block, bottom, and slips does
not produce a RIG channel datavalue, the configuration is said to
not be Classified. Datagap is used to signify a sequence of datapoints
without recorded data. Absent indicates a missing data value.
[0095] Declaring configurations for the leaf activities specifies
connections to the observations that lead to a conclusion that the
drilling rig is operating according to that leaf activity. Thus,
the system defines some configuration variables, namely pump,
block, bottom, rotate and slips. These correspond to the data
channels and to the RIG STATE data channel. Furthermore,
these define important variables that are characterized into
discrete cases, e.g., the block is going down, pumping is off or
on, the drill string is either rotating or not, the bit is either
on bottom or not on bottom, and the drill string is either in or
not in slips. In an embodiment of the present invention,
qualitative variables may be used that couple to the actual data.
To decide whether the drilling process is pumping or not, in
aspects of the present invention, a threshold is defined above
which the system is deemed to be pumping.
[0096] This threshold may be determined/analyzed/interpreted
probabilistically. When a measurement is compared with a threshold,
a value far from the threshold gives high certainty about the
meaning of the data, e.g., a standpipe pressure well above the
determined pumping threshold means that the probability of pumping
in the system is high, whereas a standpipe pressure well below the
pumping threshold means that the probability of pumping in the
system is low. Pressure data around the threshold means that the
probability of pumping or not pumping is about fifty-fifty. As the
standpipe pressure rises, the probability of pumping goes from
zero, to fifty percent, to 100 percent.
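By way of illustration only, one curve with the behavior just described (near zero well below the threshold, fifty percent at the threshold, near 100 percent well above it) is the logistic function. The description does not state which curve is used; the logistic form, the function name pumping_probability, and the parameters threshold and width are assumptions made for this sketch.

```python
import math

def pumping_probability(pressure, threshold, width):
    """Illustrative sketch only: probability that the pump is on.

    Assumes a logistic curve centered on `threshold`; `width` sets how
    quickly certainty grows with distance from the threshold. Both are
    hypothetical tuning parameters in the same units as `pressure`.
    """
    z = (pressure - threshold) / width
    z = max(-60.0, min(60.0, z))  # clamp to keep math.exp well behaved
    return 1.0 / (1.0 + math.exp(-z))
```

With a hypothetical threshold of 500 and width of 50, a reading at the threshold yields 0.5, while readings far above or far below the threshold yield values close to 1 and 0, respectively.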
[0097] The specific configuration variable values for each leaf
state may be found in Appendix A, e.g., for make_hole, at A-1189
through A-1212, which defines that the configuration for make_hole
is slips=notslips, pump=on, block=slow, bottom=onbottom; rotate is
not specified.
Top-Level Activity
[0098] The grammar 505 defines a top level activity from which
certain operations of the generation of the data interpretation
program 401 may commence. For example, determination of
transitional probabilities from one leaf-state to another
leaf-state is performed by traversing the grammar. That
traversal begins at the top-level activity.
[0099] Returning now to FIG. 5. The ontology 501 contains a data
representation of uncertainty in drilling knowledge. Uncertainty in
drilling knowledge includes uncertainties in the manner in which
one would interpret a particular data condition. Consider, for
example, the knowledge that a driller is drilling a section of a
well and that in so doing the driller is drilling a stand of drill
pipe. There is an uncertainty as to whether the driller will, at the
conclusion of that operation, drill another stand or will have
finished drilling the section of the well. Experience may have
shown that 90% of the time after drilling a stand another stand is
added and 10% of the time the drilling of the section has finished.
The activity grammar 505 contains this type of probability
knowledge about the flow of drilling operations.
CODE GENERATOR 507
[0100] In one embodiment of the present
invention, the code generator 507 accepts as input the activity
grammar 505 (e.g., as listed in Appendix A) and produces the Data
Interpretation Program 401 that when executed may be used to
interpret the input data 405 and produce an interpretation 407 of
the data in terms of the activities of the grammar 505. A sample
code generator 507 written in the Java programming language is
listed in Appendix B. This sample code generator accepts as input
the grammar 505 that is represented in listing form in Appendix
A.
[0101] FIG. 12 is a block diagram illustrating the components of
the data interpretation program 401 generated by the code
generator 507. The data interpretation
program 401 consists of three major components: a transition
probability matrix (TRANS-PROB) 451 which is a matrix containing
the probabilities of transitioning from one activity (i.e., one
state) to another, a data-to-state probability matrix
(DATA-STATE-PROB) 453 which is a matrix containing the
probabilities that a given data value corresponds to each
particular state, and the program code 455 that applies the
TRANS-PROB matrix 451 and the DATA-STATE-PROB matrix 453 to compute
an interpretation of the input data in terms of probabilities that
the input data corresponds to particular activities defined in the
knowledge base 403.
[0102] The mechanism for building the TRANS-PROB matrix 451 and the
DATA-STATE-PROB matrix 453 is described herein below. Before
discussing how the code generator 507 builds these matrices we
describe the operation of the code 455 that applies these matrices
to interpret input data, e.g., a RIG states channel.
[0103] FIG. 13 is a block diagram illustrating the process of
interpreting a time sequence of input data, e.g., the RIG State
channel 323, to compute probability values for each leaf state and
for each activity defined by the activity grammar 505, in
accordance with an embodiment of the present invention. FIG. 13
illustrates an example of the operation of code 455. In the
illustrated embodiment, the input is the sequence to be
interpreted, e.g., the RIG states channel data, and the grammar
data structure, e.g., the grammar data structure 511.
[0104] Consider the interpretation of a data value Data at time T
200, and the probabilities of the various states at time T-1 201.
The input state probabilities vector P(S.sub.T-1) indicates, for
each leaf activity, the probability that it is the leaf activity
occurring at time T-1. Considering the example of Appendix A, there
are fifteen leaf activities defined. The vector P(S.sub.T-1)
therefore has 15 elements, each indicating the probability that one
of the leaf activities is occurring at T-1.
[0105] The vector P(S.sub.T-1) is multiplied 157 with the
TRANS-PROB matrix to determine the probability of each leaf state
at time T given the probabilities of transitioning to that leaf
state from each leaf state at time T-1, i.e.,
P(S.sub.T|S.sub.T-1). The construction of
the TRANS-PROB matrix is described herein below.
[0106] The matrix-to-vector multiplication 157 produces a prior
state probabilities vector P(S.sub.T) 205 in which each element
represents the probability that the corresponding leaf state would
occur given the state probability vector at T-1. As is discussed
herein below, the TRANS-PROB is derived from the transitional
probabilities in the grammar 505 and the grammar structure itself.
Thus, P(S.sub.T) 205 reflects only the transitional probabilities
resulting from the grammar without taking the input data Data 200
into account. In Bayesian inference, a prior probability
distribution, often called simply the prior, is a probability
distribution representing knowledge or belief about an unknown
quantity a priori, that is, before any data have been observed.
[0107] The prior probability vector P(S.sub.T) 205 is adjusted by
the probabilities that the data reflects each particular leaf
activity P(S.sub.T|Data) 207. That task is performed by extracting
211 the vector of probability values corresponding to the Data
value 200 in the DATA-STATE-PROB matrix 453. The DATA-STATE-PROB
matrix 453 contains the probability value of each leaf activity
given a particular data value. The computation of the
DATA-STATE-PROB matrix 453 is provided herein below.
[0108] The prior probability vector P(S.sub.T) 205 is adjusted by
the probabilities that the data reflects each particular leaf
activity P(S.sub.T|Data) 207 as follows: each element in the prior
probability vector P(S.sub.T) 205 is multiplied 161 by the
corresponding element in the data-to-state probability vector
P(S.sub.T|Data) 207, and the result is normalized 167, thereby
obtaining the posterior state probabilities at time T
P(S.sub.T|Data) 209. Thus, the posterior state probabilities at
time T P(S.sub.T|Data) 209 take into account both the stochastic
grammar 505 and the data values from the data channel.
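By way of illustration only, a single update from time T-1 to time T (multiplication 157, element-by-element multiplication 161, and normalization 167) may be sketched as follows. The function name update, its argument names, and the orientation trans[j][i] (row j is the state at T-1, column i the state at T) are assumptions made for this sketch and are not taken from Appendix B.

```python
def update(prev, trans, likelihood):
    """Illustrative sketch only: one update of the state probabilities.

    prev[j]       -- probability of state j at time T-1
    trans[j][i]   -- probability that state j is followed by state i
    likelihood[i] -- data-to-state probability of state i for the
                     data value observed at time T
    """
    n = len(prev)
    # Prior: vector-matrix product of prev with the transition matrix.
    prior = [sum(prev[j] * trans[j][i] for j in range(n)) for i in range(n)]
    # Element-by-element product with the data-to-state probabilities.
    unnormalized = [prior[i] * likelihood[i] for i in range(n)]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("data value is incompatible with every state")
    # Normalize so that the posterior probabilities sum to 1.
    return [u / total for u in unnormalized]
```

For a hypothetical two-state example with prev = [0.5, 0.5] and trans = [[0.9, 0.1], [0.2, 0.8]], the prior is [0.55, 0.45]; a data value that is equally likely under both states leaves the posterior at the prior.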
[0109] FIG. 14 is a block diagram illustrating exemplary pseudo
code implementing the process illustrated and discussed in
conjunction with FIG. 13. A first step may be to clean up the input
data, step 131. The data may be cleaned up to provide for missing
data, to remove spikes indicative of nonsense/non-probabilistically
relevant data and/or the like. In certain aspects, the
nonsense/non-probabilistically relevant data may be treated as
missing data. In an embodiment of the present invention, the
program may interpolate for missing data values.
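By way of illustration only, a clean-up of a numeric data channel along the lines just described may be sketched as follows. The spike criterion (a sample that differs from both of its neighbors by more than max_jump), the use of linear interpolation, and the names clean and max_jump are assumptions made for this sketch; the description does not specify a particular method.

```python
def clean(values, max_jump):
    """Illustrative sketch only: treat spikes as missing, then interpolate.

    values   -- list of numbers, with None marking a missing sample
    max_jump -- hypothetical limit; a sample differing from BOTH of its
                neighbors by more than this is treated as a spike
    """
    out = list(values)
    # Pass 1: mark isolated spikes as missing data.
    for i in range(1, len(values) - 1):
        a, b, c = values[i - 1], values[i], values[i + 1]
        if None not in (a, b, c) and abs(b - a) > max_jump and abs(b - c) > max_jump:
            out[i] = None
    # Pass 2: linearly interpolate gaps bounded by two known samples.
    known = [i for i, v in enumerate(out) if v is not None]
    for lo, hi in zip(known, known[1:]):
        for k in range(lo + 1, hi):
            frac = (k - lo) / (hi - lo)
            out[k] = out[lo] + frac * (out[hi] - out[lo])
    return out  # leading or trailing gaps are left as None
```

Because a genuine step change differs from only one of its neighbors, it is preserved under this criterion; only isolated excursions are removed.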
[0110] The pseudo code of FIG. 14 operates to take a current state
probability vector (computed at T-1) as an input for the processing
of each data value in the data sequence at time T and, from it
together with the data value, to compute a new current state
probability vector reflecting the data value at time T. For each
iteration, the current state vector from the previous iteration
(each iteration reflecting the processing of a data value in the
sequence of data values) is updated. Thus, a first step may be to
initialize the array holding the current state vector
(CURRENT-STATE-VECTOR), step 135. The initialization may be to give
each state the same probability, e.g., if there are 14 different
possible states, each state would be given the probability of
1/14=0.0714. An alternative approach is to use a statistical
distribution of states from historical data as the basis for the
initial probability distribution.
[0111] Next, the pseudo code includes a loop iterating over the
sequence of data samples to be processed, loop 137, to update the
CURRENT-STATE-VECTOR. First, the state probability vector
(TRANSITION-PROB-VECT) is computed, step 139. Step 139 is fleshed
out in greater detail in FIG. 15. As discussed in conjunction with
FIG. 13, step 157, step 139 is a vector-matrix multiplication
operation between the CURRENT-STATE-VECTOR and TRANS-PROB matrix.
For each state i in the CURRENT-STATE-VECTOR (i.e., at T-1), outer
loop 141, the sum of the probabilities that each possible state j
is followed by the state i is calculated using an inner loop 143
that iterates over the possible states j, looking up the
probability value that the state j is followed by the state i in
the TRANS-PROB matrix, step 145. The computed vector
TRANSITION-PROB-VECT is the "Prior" States Probabilities 205 of
FIG. 13.
[0112] Returning to FIG. 14, the processing of a data value in the
sequence being processed also includes the computation of the
probability vector having values that the Data value at time T
corresponds to each possible activity based on the data
value, i.e., the Data-to-State Probabilities vector 207 of FIG. 13,
step 181 corresponding to operation 211 of FIG. 13. This
computation may be a look up operation in the DATA-TO-STATE
probability matrix to determine for each possible state the
probability that the Data value corresponds to that state.
[0113] It should be noted that steps 139 and 181 are independent of
one another and may be computed in parallel or in any sequence.
[0114] The prior probabilities (TRANSITION-PROB-VECT) 205 are
combined with the Data-to-State Probabilities vector 207 by
multiplying each value in the prior probabilities vector by the
corresponding value in the Data-to-State Probabilities vector, step
183.
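Step 183 may be sketched as an element-by-element product. The sketch also rescales the result so that it sums to one, in line with the normalization (operation 167) mentioned later in connection with FIG. 28; the function name and the numbers are illustrative assumptions.

```python
def combine(prior, data_to_state):
    """Multiply the prior vector and the data-to-state vector element by element."""
    product = [p * d for p, d in zip(prior, data_to_state)]
    total = sum(product)
    if total == 0.0:
        # The data value is incompatible with every reachable state.
        return product
    return [x / total for x in product]

# Hypothetical two-state example.
posterior = combine([0.55, 0.45], [0.2, 0.6])
```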
[0115] In an embodiment of the present invention, having computed
the Leaf State v. Rig State probability matrix, step 153, the
interpretation/parse program loops over the sequence of data
samples in the input data 405, loop 155. For each
sample in the data channel, time-step by time-step, the
interpretation/parse program may be performed (this process is
illustrated in FIG. 14).
[0116] At the beginning of each sample, there is a probability of
being in each state from the previous sample (the initial condition
being either that the rig is in the unknown state, or that the
probability is equal for all states, step 154). In FIG. 14 these
state probabilities are contained in a state probability vector
201. In an embodiment of the present invention, the transitional
probability matrix 203 may be applied to all these state
probabilities, step 157. This is a matrix multiplication operation.
The application of the state transitional probabilities produces a
prior state probability vector 205 ("prior" in the sense that it is
computed solely from the previous state probability vector 201 and
the transitional probability matrix 203).
[0117] The details of the Interpretation Program Code Generator 507
are now described. As noted above the Interpretation Program 401
contains the TRANS-PROB matrix 451 and DATA-STATE-PROB matrix 453.
The Interpretation Program Code Generator 507 produces these two
matrices from the grammar 505 as is illustrated in FIG. 16.
[0118] The following pseudo code describes the process of creating
the TRANS-PROB matrix 451:
TABLE-US-00003 TABLE III Pseudo Code describing how to build the
TRANS-PROB matrix

Build TRANS_PROB matrix
{
  Determine_Leaf_Nodes (START)
    {Determine Leaf Nodes of the grammar starting with START (see below)}
  Build Trans-Prob Matrix from leaf_nodes
    {rows = {START, leaf_nodes}; columns = {leaf_nodes, FINISH}}
  Traverse_to_collect_probabilities by following paths from each
    leaf-node to each other leaf-node
}
[0119] The first step is to determine the leaf nodes. As discussed
herein above, the leaf nodes are those nodes that have no
subactivity states. The matrix may thus merely be traversed until a
node has no transition out. FIG. 17 is a very simple grammar used
as an example (this grammar is used herein below to
illustrate the operation of the system and method for interpreting
drilling data using a stochastic grammar). Beginning at the START
state, the grammar is traversed collecting the nodes that have no
path other than to finish. Thus, there is a path A-B-D and D to
Finish. Therefore one leaf node is the A-B-D node, representing the
transitions from Start to Finish via the nodes A, B, and D.
Similarly, further traversal of the grammar structure of FIG. 17
determines that nodes A-C and A-B-E are leaf nodes.
[0120] Next, the TRANS-PROB matrix is constructed to have a row and
column for each leaf state, an additional row for the START state,
and an additional column for the FINISH state. Such a TRANS-PROB
matrix 231 that corresponds to the grammar of FIG. 17 is
illustrated in FIG. 18.
[0121] Next, the TRANS-PROB matrix 231 is populated by traversing
the grammar following the transitions from START to leaf-states and
multiplying together the transition probabilities. In the example,
the path from START to A-C to FINISH has the transitions
Start.fwdarw.A with a probability 1.0, A.fwdarw.C with a
probability 0.6, and C.fwdarw.FINISH with a probability 1.0. Thus,
the START to A-C state-to-state transition probability is
1.0*0.6*1.0=0.6. Similarly, from START to A-B-D to FINISH has the
transition probabilities 1.0, 0.4, 0.3, and 1.0 for a
state-to-state transition probability of 0.12, and so on. Of note
is the transition back from node A-B-E onto itself with a 0.5
probability. In the traversal of the grammar to determine the
transitional probabilities from one node to another, if a
transition causes a visit to a node that has previously been
visited in the determination from that one node to that another
node, the traversal stops and the product of the transitional
probabilities encountered along the path is noted. In this
particular example, there is only the transition from A-B-E onto
itself with a transitional probability of 0.5. A complete
leaf-state-to-leaf-state traversal that multiplies all the
transitional probabilities in the path from each leaf-state that
can reach each other leaf-state results in the TRANS-PROB matrix,
e.g., for the grammar example of FIG. 17, into the matrix 231 of
FIG. 18.
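The multiplication of transition probabilities along a path can be sketched as follows. Only the edge probabilities quoted above for the grammar of FIG. 17 are used; the dictionary layout and function name are illustrative assumptions, and the remaining edges of FIG. 17 are deliberately omitted.

```python
from math import prod

# Edge probabilities quoted in the text for the grammar of FIG. 17.
EDGES = {
    ("START", "A"): 1.0,
    ("A", "C"): 0.6,
    ("C", "FINISH"): 1.0,
    ("A", "B"): 0.4,
    ("B", "D"): 0.3,
    ("D", "FINISH"): 1.0,
}

def path_probability(path, edges=EDGES):
    """Multiply the transition probabilities met along a path of grammar nodes."""
    return prod(edges[(a, b)] for a, b in zip(path, path[1:]))

via_c = path_probability(["START", "A", "C", "FINISH"])
via_d = path_probability(["START", "A", "B", "D", "FINISH"])
```

Here `via_c` corresponds to the START to A-C entry (0.6) and `via_d` to the START to A-B-D entry (0.12) discussed above.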
[0122] The process for building the interpretation program 401,
e.g., the interpretation program code generator 507, also computes
the DATA-STATE-PROB probability matrix 453. The following pseudo
code describes one possible process of creating the
DATA-STATE-PROB probability matrix 453:
TABLE-US-00004 TABLE IV Pseudo Code describing how to build the
DATA-STATE-PROB matrix without taking confusion into account

Build Data-to-state matrix NOT taking confusion into account
1  {
2
3    for each state
4    {
5      determine the configurations compatible with that state
         (state-compatible-configurations) and number of state
         compatible configuration (state-compatible-configurations-count);
         /* i.e., if a state is defined by config 11--, the compatible
            configurations are 1100, 1101, 1110, and 1111. Thus there
            are 4 compatible configurations */
6      state-per-configuration-probability :=
         1 / state-compatible-configurations-count;
7      for each state-compatible-configuration
8      {
9        for each data-value
10       {
11         if the data value is compatible with the
             state-compatible-configuration then
12         {
13           note the data value as compatible (e.g., set a bit
               corresponding to that datavalue and configuration);
14           increment count of data value-configuration pairing as
               compatible for this state (data-value-compatible-count);
15         } /* end if data value is compatible */
16       } /* end for each data value */
17       for each compatible-datavalue
18       {
19         DATA-STATE-PROB [compatible-datavalue,state] :=
             DATA-STATE-PROB [compatible-datavalue,state] +
             state-per-configuration-probability DIVIDED BY
             data-value-compatible-count;
20       } /* end for each compatible datavalue */
21     } /* end for each state-compatible-configuration */
22   } /* end for each state */
23   normalize each datavalue row;
25 }
[0123] The process iterates over the leaf-states defined in the
grammar. In the present example, the leaf states are A, B, C, and
D.
[0124] For each state, first there is a determination of which
states are compatible with particular data values based on common
traits, Loop Lines 3 through 22. FIGS. 19 and 20 illustrate the
operation of matching compatible states and traits. Consider a very
simple grammar 801 of FIG. 19. The grammar 801 has four leaf
states: A, B, C, and D. The transitions defined by the grammar 801
are not specified. However, let's stipulate that the grammar
defines four traits: TRAIT1, TRAIT2, TRAIT3, and TRAIT4. The values
for the configurations of these traits for the four states are
given in the TRAITS-TO-STATE table of FIG. 20. Note that similar
configurations are given for the leaf-states defined in the grammar
of Appendix A. For illustrative purposes, the traits in FIG. 19 are
binary. Thus, for each trait, a configuration corresponding to a
particular state may have the value undefined (-), 1, or 0. For
State A, the configuration is --1 0, etc. Thus, since the undefined
(-) values may take any value, the compatible configurations for
state A are 0010, 0110, 1010, and 1110. FIG. 21 illustrates the
compatible configurations for the states in the present
example.
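The expansion of a configuration containing undefined (-) values into its compatible configurations may be sketched as follows. The function name is an illustrative assumption, and the sketch is limited to binary traits as in the present example.

```python
from itertools import product

def compatible_configurations(config):
    """Expand a configuration such as '--10' into all fully defined bit strings."""
    # An undefined trait may be 0 or 1; a defined trait keeps its value.
    choices = ["01" if c == "-" else c for c in config]
    return ["".join(bits) for bits in product(*choices)]

state_a_configs = compatible_configurations("--10")
```

For state A this yields 0010, 0110, 1010, and 1110, and for the configuration 11-- used in the comment of Table IV it yields 1100, 1101, 1110, and 1111.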
[0125] Similarly, configurations, i.e., combination of trait values
are assigned to the various data values. For example, in the
example of Appendix A, the token Run_In (Appendix A, Lines A269
through A304), corresponding to the RIG channel value 6, has the
defined configuration classified=yes, absent=no, rotate=off,
block=down, bottom=offbottom, pump=off, slips=notslips, and
datagap=np. All other possible data values also have defined
configurations.
[0126] In the simplified example presented here, there are six data
values provided, 1 through 6. FIG. 22 provides a table of the
configurations corresponding to these six data values. For example,
data value 1 has the configuration 0 0--.
[0127] These configurations may also be expanded into compatible
configurations like the configurations corresponding to the various
leaf states. FIG. 23 is a table of the compatible configurations
corresponding to the defined configurations of FIG. 22.
[0128] The compatible configurations are referred to in the pseudo code of Table
IV as state-compatible-configurations and the count of such
configurations, as state-compatible-configurations-count.
[0129] Having determined the compatible configurations, the process
assigns the total probability for the state over those compatible
configurations by simply taking the inverse of the
state-compatible-configurations-count, Line 6.
[0130] The process iterates over all the
state-compatible-configurations for the state of the current outer
loop iteration, Loop starting Line 7 and ending Line 21 to
determine the data values (innermost nested loop: Lines 9 through
16) that have a configuration that matches the compatible
configurations. For any data value that is compatible with the
state configuration (If statement Line 11), the data value is noted
as compatible (Line 13) and a count of compatible data
value-to-state-configuration pairings is incremented (Line 14).
FIG. 24 is an illustration of the notation of compatible
configurations for each state. Note that the configurations for
each state are numbered sequentially and that each element in the
matrix of FIG. 24 is a sequence of flags that reflect whether the
particular corresponding configuration of the state is compatible
with the configuration of the data value. For example, because the
configuration of state A is --10 and has compatible configurations
0010, 0110, 1010, 1110 and data value 1 has the configuration 00--
with the compatible configurations 0000, 0001, 0010, 0011, the
first compatible configuration of state A is the only compatible
configuration that is compatible with the configurations of Data
value 1.
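The sequence of flags held in each element of the matrix of FIG. 24 may be sketched as a list of booleans, one per compatible configuration of the state. The function names are illustrative assumptions; the two configurations are those given above for state A and data value 1.

```python
from itertools import product

def expand(config):
    """Expand undefined (-) traits of a binary configuration into 0 and 1."""
    choices = ["01" if c == "-" else c for c in config]
    return ["".join(bits) for bits in product(*choices)]

def compatibility_flags(state_config, data_config):
    """One flag per compatible configuration of the state, in sequential order."""
    data_set = set(expand(data_config))
    return [cfg in data_set for cfg in expand(state_config)]

# State A is --10 and data value 1 is 00--.
flags = compatibility_flags("--10", "00--")
```

Only the first flag is set, because 0010 is the only configuration that state A and data value 1 share.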
[0131] After the conclusion of the loop over compatible
configurations, the process knows which datavalues are compatible
with the state (e.g., have been noted as compatible) and how many
such compatible states there are, data-value-compatible-count. That
information is used to populate the DATA-STATE-PROB matrix 453. For
each data value that is noted as compatible, the DATA-STATE-PROB
[datavalue, state] matrix element is set to number of compatible
configurations for that data value, state combination divided by
the total number of compatible configurations for the state, Lines
17-21. FIG. 25 is an illustration of the result of the loop of
Lines 17-21. Consider, for example, the column for State B. For
data value 2, there are 4 compatible configurations, for each of
data values 3 through 6 there are 2 compatible configurations.
Thus, for the entire column there are a total of 12 matches of
compatible configurations. Dividing the matching configurations for
each data value with the total number of matching configurations
results in the values 1/3, 1/6, 1/6, 1/6, and 1/6 for rows
corresponding to data values 2 through 6.
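The arithmetic for the State B column can be checked with exact fractions. This is a small sketch; the function name is an illustrative assumption and the match counts are those quoted above.

```python
from fractions import Fraction

def column_probabilities(match_counts):
    """Divide each data value's match count by the total for the state."""
    total = sum(match_counts.values())
    return {dv: Fraction(n, total) for dv, n in match_counts.items()}

# Matching compatible configurations per data value for State B.
state_b_counts = {2: 4, 3: 2, 4: 2, 5: 2, 6: 2}
state_b_column = column_probabilities(state_b_counts)
```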
[0132] Finally, the DATA-STATE-PROB matrix is normalized along the
rows, Line 23. FIG. 26 is an illustration of the resulting matrix
following the normalization operation.
[0133] The example grammar 801 of FIG. 19 does not provide specific
transition rules. For the purpose of illustrating the operation of
the data interpretation program, let's consider an example
TRANS-PROB matrix with arbitrarily selected probability numbers.
FIG. 27 provides such an example that will be used herein below for
the purposes of illustrating the operation of the interpretation
program 401.
[0134] FIG. 28 is an illustration of the results of the operation
of the interpretation program 401 using the process described in
conjunction with FIGS. 13 through 15 and the TRANS-PROB matrix of
FIG. 27 and the DATA-STATE-PROB matrix of FIG. 26. Row 803
represents a sequence of data values. Column 805 is an initial
value for the CURRENT-STATE-PROBABILITY VECTOR; or it may be viewed
as an interim vector in the processing of some larger sequence that
is immediately followed by the vector 803.
[0135] Table 205' contains the prior probabilities. Thus, the first
column contains the prior probabilities obtained from a vector-to-matrix
multiplication of the initial vector 805 and the TRANS-PROB matrix
of FIG. 27 as discussed in conjunction with elements 157 of FIG.
13, 139 of FIGS. 14 and 15. Table 207' contains the Data-to-State
vectors corresponding to the respective data values in the input
data sequence 803. For example, the first, third, and sixth data
values are the value 5. The value 5 has the data-to-state vector
[0, 0.3, 0.1, 0.3, 0.1, 0]. Thus, that vector is recorded in
columns 1, 3, and 6 of table 207'. Having the vectors 205 and 207,
corresponding to a data value, these are element-by-element
multiplied (operation 161) and normalized (operation 167) to
produce the vector corresponding to the same data value in the
output table 209'.
[0136] While the present example discussed herein above relies on a
very simplified grammar, the same techniques may be used for a more
complex grammar 505. Appendix A illustrates such a grammar.
Appendix B is an example Java program implementation of an
interpretation program code generator 507 operating on, for
example, the activity grammar 505 that has been extracted into the
representation of Appendix A.
[0137] It is entirely possible that a recorded data value is
inaccurate. Consider an unrelated example. Consider two drivers
following one another. The trailing driver wishes to use the turn
signal of the car in front to determine the actions of the first
driver. Usually the turn signal coming on is a good predictor of
the intent of the driver to turn. However, a missing turn signal
may only mean that the light is out. Even a blinking turn signal
may not indicate that the driver intends to turn. The blinking turn
signal could be indicative of a faulty circuit or that the driver
mistakenly engaged the turn signal (or that the person
in the trailing car is hallucinating). Thus, there is some
confusion about what the observed data really means.
[0138] The same phenomenon may occur in a drilling operation. For
example, a RIG state indicative of the rig being in slips usually
would mean that the rig is indeed in slips. However, it could also
mean that there was an error in recording the rig as being in
slips. Such errors may occur, for example, by sensors failing,
sensor calibration being off, or some anomalous condition that
caused a sensor to operate erratically.
[0139] An embodiment of the present invention accounts for such
uncertainties, also known as confusion, by recording the confusion
as to the meaning of a trait value in a confusion matrix mapping
recorded values to actual values according to the probability that
the recorded value accurately reflects the actual value. FIG. 29
illustrates some confusion matrices that could correspond to the
example given herein above. The first trait confusion matrix 901,
for example, corresponds to the Trait #1 and indicates that a
recorded value of ON is 0.98 indicative of the actual value being
ON and 0.02 indicative of the actual value being OFF. Similarly for
the other traits. Note that the third and fourth confusion matrices
903 and 905 indicate that there is no confusion as to these
values.
[0140] It is valuable to note that the confusion matrices are not
necessarily symmetrical. The example with the turn signal would
probably yield a similar asymmetry, i.e., it is more likely that
the turn signal being on means an imminent turn than that the turn
signal being off means that no turn will be made.
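A confusion matrix of this kind may be sketched as a nested mapping from recorded value to actual value. The ON row uses the values quoted above for matrix 901; the OFF row is hypothetical and chosen only to show an asymmetric matrix, and the function names are illustrative assumptions.

```python
TRAIT1_CONFUSION = {
    "ON": {"ON": 0.98, "OFF": 0.02},    # values quoted in the text for matrix 901
    "OFF": {"ON": 0.10, "OFF": 0.90},   # hypothetical row, for illustration only
}

def actual_value_probabilities(confusion, recorded):
    """Return the probabilities of each actual value given a recorded value."""
    row = confusion[recorded]
    if abs(sum(row.values()) - 1.0) > 1e-9:
        raise ValueError("each recorded-value row must sum to 1")
    return row

def is_symmetric(confusion):
    """True when confusing a with b is as likely as confusing b with a."""
    values = list(confusion)
    return all(abs(confusion[a][b] - confusion[b][a]) < 1e-12
               for a in values for b in values)
```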
[0141] The following pseudo code describes the process of creating
the DATA-STATE-PROB matrix 453 using the Confusion Matrices:
TABLE-US-00005 TABLE V Pseudo Code describing how to build the
DATA-STATE-PROB matrix using confusion matrices

Build Data-to-state matrix taking confusion into account
1  {
2
3    for each state
4    {
5      determine the configurations compatible with that state
         (state-compatible-configurations) and number of state
         compatible configuration (state-compatible-configurations-count);
         /* i.e., if a state is defined by config 11--, the compatible
            configurations are 1100, 1101, 1110, and 1111. Thus there
            are 4 compatible configurations */
6      state-per-configuration-probability :=
         1 / state-compatible-configurations-count;
7      for each state-compatible-configuration
8      {
9        for each data-value in the list of data-values
10       {
11         if the data value is compatible with the
             state-compatible-configuration then
12         {
13           note the data value as compatible (e.g., set a bit
               corresponding to that datavalue and configuration);
14           increment count of data value-configuration pairing as
               compatible for this state (data-value-compatible-count);
15         } /* end if data value is compatible */
16       }; /* end for each data value */
17       if the state-compatible-configuration does NOT have confusion
18         for each compatible-datavalue
19           DATA-STATE-PROB [compatible-datavalue,state] :=
               DATA-STATE-PROB [compatible-datavalue,state] +
               state-per-configuration-probability DIVIDED BY
               data-value-compatible-count
20       else /* the state-compatible-configuration does have confusion */
21         for each confusion-alternative-configuration
             /* e.g., if config is 1 0 and can be confused with 1 1 the
                iteration is over the set 1 0 and 1 1. */
22         {
21           Determine the alternative-config-probability for the
               confusion-alternative-configuration from the confusion
               matrix;
22           For each alternative-data-value IN the list of data-values
               that is compatible with the
               confusion-alternative-configuration
23             DATA-STATE-PROB [alternative-data-value,state] :=
                 DATA-STATE-PROB [alternative-data-value,state] +
                 alternative-config-probability *
                 state-per-configuration-probability DIVIDED BY
                 data-value-compatible-count;
24         } /* end for each confusion-alternative-configuration */
25       /* ENDIF */
26     } /* end for each state-compatible-configuration */
27   } /* end for each state */
28   normalize each datavalue row;
29 }
[0142] The above pseudo code will be described herein by way of
example. The pseudo code of Table V loops over each state (Loop
starting at Line 3). The pseudo-code of Table V operates much like
pseudo code of Table IV. For any configuration that is compatible
with the state and for which there is no confusion, the assignment
of probability is the same. However, if there is confusion in a
compatible configuration, the probability associated with that
configuration is allocated between the alternative configurations
that could reflect the recorded configuration and to the datavalues
that such alternative configurations are compatible with according
to the probabilities assigned in the confusion matrices.
[0143] Consider a very simple example. If a first state S has a
defined configuration as 1-, i.e., the first bit is 1 and the
second bit is undefined, there are two configurations that are
compatible with that configuration, 1 0 and 1 1. Now, suppose that
there are three alternative data values, A, B, and C. Let's define
1 0 to be compatible with A and B, and 1 1, with B and C. Let's
further define that the first compatible configuration, 1 0, has no
confusion, whereas 1 1 may be confused and has alternative
configurations 1 0 and 1 1. According to the confusion matrix, 1 1
has the probability 0.8 of being 1 1 and the probability 0.2 of
being 1 0.
[0144] Because there are two compatible configurations for state S,
each is allocated a probability of 0.5.
[0145] Consider now the first compatible configuration of the state
S, 1 0. Because it has no confusion, each of the data values compatible
with 1 0, namely A and B, is allocated 1/2 of the 0.5 probability
allocated to 1 0.
[0146] The resulting DATA-TO-STATE probabilities for state S after
this first configuration are as follows:
[0147] A: 0.25
[0148] B: 0.25
[0149] C: 0.0
[0150] Now consider the second compatible configuration of state S,
1 1. Because 1 1 has confusion, Line 20 of the pseudo code of Table
V, for each alternative (1 1 and 1 0), the probability of that
configuration is determined from the confusion matrix. The
confusion matrix has a row for each recorded value of a trait and a
column for each actual value. In the present example, the only
recorded value is 1 and the corresponding actual values may be
either 1 (with a probability 0.8) or 0 (with a probability 0.2).
Thus, the two alternative configurations are given the
probabilities 0.8 and 0.2, respectively, Line 22. For each
configuration that is an alternative to the confused configuration,
each compatible data value receives the probability assigned to the
alternative configuration multiplied by the portion of the
probability assigned to the state-compatible configuration that is
assigned to each data value, Line 23. Because there are two data
values compatible with the second state-compatible configuration,
each is allocated 0.25. This is then multiplied by the probability
of each alternative configuration and allocated to the data values
compatible with the alternative configurations as follows:
[0151] A: 0.2*0.25 (from being compatible with 1 0 which is 0.2
probability alternative of 1 1)
[0152] B: 0.8*0.25+0.2*0.25 (from being compatible with 1 1 which is
0.8 probability alternative of 1 1, and being compatible with 1 0
which is 0.2 probability alternative of 1 1)
[0153] C: 0.8*0.25 (from being compatible with 1 1 which is 0.8
probability alternative of 1 1)
[0154] Thus, the end-result allocation of data-to-state
probabilities for state S is:
[0155] A: 0.25+0.2*0.25
[0156] B: 0.25+0.8*0.25+0.2*0.25
[0157] C: 0.0+0.8*0.25
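The worked example for state S can be reproduced with a short sketch. It assumes, following the example above, that the share given to each data value is the per-configuration probability divided by the number of data values compatible with that configuration; the function name and the data structures are illustrative assumptions and not the Appendix B implementation.

```python
def allocate_state_s():
    """Allocate data-to-state probabilities for state S (configuration 1-)."""
    per_config = 1.0 / 2  # two compatible configurations: "10" and "11"
    compatible = {"10": ["A", "B"], "11": ["B", "C"]}
    # Recorded configuration -> {actual configuration: probability}.
    # "10" is absent because it has no confusion.
    confusion = {"11": {"11": 0.8, "10": 0.2}}
    probs = {"A": 0.0, "B": 0.0, "C": 0.0}
    for config, data_values in compatible.items():
        share = per_config / len(data_values)  # 0.25 in this example
        if config not in confusion:
            for dv in data_values:
                probs[dv] += share
        else:
            for alt, alt_prob in confusion[config].items():
                for dv in compatible[alt]:
                    probs[dv] += alt_prob * share
    return probs

state_s = allocate_state_s()
```

Evaluating the sums listed above gives approximately A = 0.30, B = 0.50, and C = 0.20, which together account for the whole probability of state S.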
[0158] The methodology for storing a stochastic grammar 505 in an
ontology for drilling 501 and using that in the manner described
for interpreting a data stream 405 may be extended. In an
embodiment, the above-described methodology is used to assess the
compatibility of a data set with a particular grammar and thereby
determine something about the data set. For example, each
operator company may have its own way of performing drilling
operations and may handle particular situations in particular ways.
Each company would then have a unique grammar. Similarly, different
geographic regions may have different grammars. A data set, for
which an analyst does not know the origin, be it by
operator-company or by geographic region, may be interpreted
against several alternative grammars to determine which grammar is
the best fit and therefore most likely to be the origin of the data
set. FIG. 30 is a flow-chart illustrating the process of
determining the origin of a data set.
[0159] A data set 251, e.g., a RIG state channel or another data
channel, is received as input. A plurality of hypotheses 255a
through 255d are started, step 253. Each hypothesis 255 may be a
data interpretation program 401 that implements a unique stochastic
grammar reflecting the operations of a particular drilling operator
or geological area. These hypothesis data interpretation programs
255 each iterate 257 over the data sequence 251 in the manner
described herein above in conjunction with, for example, the
interpretation program 401. On each iteration, the hypothesis
interpretation programs 255 determine a state probability vector
corresponding to an interpretation of the data set using the
grammar associated with that particular hypothesis.
[0160] Each hypothesis may test the state probability vector it
generates against some criteria to determine whether the hypothesis
is plausible, decision 261. Usually, if a data set reflects
activities that may be interpreted by a particular grammar the
state probability vector would strongly indicate that certain
activities are much more probable than the other activities.
Conversely, if all activities are roughly equally probable, there
is a very poor match between the grammar and the data set. Thus, if
the grammar seems ill-suited over several iterations, the hypothesis
is aborted, step 263, otherwise, the next point in the sequence is
processed, step 265. At the conclusion of the processing of the
data set through the various hypotheses, the interpretation results may
be reported, step 267, including reporting the best overall match
between the data set 251 and the grammars processed by the various
hypothesis interpretation programs 255.
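The plausibility test of decision 261 and the abort of step 263 might be sketched as follows. The specific criterion (largest probability at least twice the uniform value) and the number of tolerated flat iterations are hypothetical choices made for illustration; the description above leaves the criteria open. For brevity the sketch takes the state probability vectors as already computed rather than generating them from a grammar.

```python
def is_peaked(state_vector, ratio=2.0):
    """Hypothetical test: some state is clearly more probable than uniform."""
    uniform = 1.0 / len(state_vector)
    return max(state_vector) >= ratio * uniform

def run_hypothesis(state_vectors, patience=3):
    """Return False if the hypothesis is aborted, True if it survives."""
    flat_run = 0
    for vec in state_vectors:
        # Count consecutive iterations in which all states look alike.
        flat_run = 0 if is_peaked(vec) else flat_run + 1
        if flat_run >= patience:
            return False  # step 263: grammar seems ill-suited
    return True
```

A caller could run one such loop per grammar and report the grammars whose hypotheses survive, in the spirit of step 267.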
[0161] The particular embodiments disclosed above are illustrative
only, as the invention may be modified and practiced in different
but equivalent manners apparent to those skilled in the art having
the benefit of the teachings herein. Furthermore, no limitations
are intended to the details of construction or design herein shown,
other than as described in the claims below. It is therefore
evident that the particular embodiments disclosed above may be
altered or modified and all such variations are considered within
the scope and spirit of the invention. In particular, every range
of values (of the form, "from about A to about B," or,
equivalently, "from approximately A to B," or, equivalently, "from
approximately A-B") disclosed herein is to be understood as
referring to the power set (the set of all subsets) of the
respective range of values. Accordingly, the protection sought
herein is as set forth in the claims below.
* * * * *