U.S. patent application number 12/145281 was filed with the patent office on 2008-06-24 and published on 2009-02-19 for an insurance claim forecasting system.
Invention is credited to Gregory S. Binns, Mark Stuart Blumberg.
Application Number | 20090048877 12/145281 |
Family ID | 39530091 |
Filed Date | 2008-06-24 |
Publication Date | 2009-02-19 |
United States Patent Application | 20090048877 |
Kind Code | A1 |
Binns; Gregory S.; et al. | February 19, 2009 |
INSURANCE CLAIM FORECASTING SYSTEM
Abstract
A computer-implemented process is disclosed for developing a
person-level cost model for forecasting future costs attributable to
claims from members of a book of business, where person-level data are
available for a substantial portion of the members of the book of
business for an actual underwriting period, and the forecast of
interest is for a policy period. The process uses
development universe data comprising person-level enrollment data,
historical base period health care claims data and historical next
period claim amount data for a statistically meaningful number of
individuals. The process also provides at least one claim-based
risk factor for each historical base period claim based on the
claim code associated with the health care claim and provides at
least one enrollment-based risk factor based on the enrollment
data. The process also develops a cost forecasting model by
capturing the predictive ability of the main effects and
interactions of claim-based risk factors and enrollment-based risk
factors through the application of an interaction capturing
technique to the development universe data.
Inventors: | Binns; Gregory S. (Lake Forest, IL); Blumberg; Mark Stuart (Oakland, CA) |
Correspondence Address: | Husch Blackwell Sanders, LLP; Welsh & Katz, 120 S RIVERSIDE PLAZA, 22ND FLOOR, CHICAGO, IL 60606, US |
Family ID: | 39530091 |
Appl. No.: | 12/145281 |
Filed: | June 24, 2008 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
09861379 | May 18, 2001 | 7392201 |
12145281 | | |
60249060 | Nov 15, 2000 | |
60267131 | Feb 7, 2001 | |
Current U.S. Class: | 705/4 |
Current CPC Class: | G06Q 10/04 20130101; G06Q 40/08 20130101; G06Q 10/10 20130101 |
Class at Publication: | 705/4 |
International Class: | G06Q 40/00 20060101 G06Q040/00 |
Claims
1. A computer-implemented process of developing a person-level cost
model for forecasting future costs attributable to claims from
members of a book of business, where person-level data regarding
actual base period health care claims are available for a
substantial portion of the members of the book of business for an
actual underwriting period, and the forecast of interest (i.e.,
future claim amount) is for an actual policy period which can be,
but is not necessarily, contiguous with the actual underwriting
period, comprising the steps of: providing development universe
data comprising person-level enrollment data, historical base
period health care claims data and historical next period claim
amount data for a statistically meaningful number of individuals,
where the person-level data on a health care claim comprises at
least a claim code and a claim amount; providing at least one
claim-based risk factor for each historical base period claim based
on the claim code associated with the health care claim and
providing at least one enrollment-based risk factor based on the
enrollment data; and developing a cost forecasting model by
capturing the predictive ability of the main effects and
interactions of claim based risk factors and enrollment-based risk
factors, with the development universe data through the application
of an interaction capturing technique to the development universe
data.
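The process of claim 1 can be illustrated with a minimal Python sketch. It is not the applicants' implementation: the code-prefix mapping, the age band, the field names (`base_claim_codes`, `age`, `next_period_cost`), and the choice of a saturated cell-means model as the interaction capturing technique are all assumptions made for illustration. A cell-means model assigns one average next-period cost to every observed combination of claim-based and enrollment-based risk factors, so it captures main effects and interactions together.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical mapping from claim-code prefix to a claim-based risk factor.
CODE_PREFIX_TO_RISK = {"250": "diabetes", "428": "heart_failure"}


def claim_risk_factors(claim_codes):
    """Return the set of claim-based risk factors triggered by the claim codes."""
    factors = set()
    for code in claim_codes:
        for prefix, factor in CODE_PREFIX_TO_RISK.items():
            if code.startswith(prefix):
                factors.add(factor)
    return frozenset(factors)


def enrollment_risk_factor(age):
    """Hypothetical enrollment-based risk factor: a coarse age band."""
    return "age_50_plus" if age >= 50 else "age_under_50"


def fit_cell_means(development_universe):
    """Fit one mean next-period cost per combination of risk factors."""
    cells = defaultdict(list)
    for person in development_universe:
        key = (claim_risk_factors(person["base_claim_codes"]),
               enrollment_risk_factor(person["age"]))
        cells[key].append(person["next_period_cost"])
    overall = mean(p["next_period_cost"] for p in development_universe)
    return {"cells": {k: mean(v) for k, v in cells.items()},
            "fallback": overall}


def forecast(model, claim_codes, age):
    """Forecast a person's cost; unseen combinations use the overall mean."""
    key = (claim_risk_factors(claim_codes), enrollment_risk_factor(age))
    return model["cells"].get(key, model["fallback"])


universe = [
    {"base_claim_codes": ["250.00"], "age": 60, "next_period_cost": 9000.0},
    {"base_claim_codes": ["250.02"], "age": 62, "next_period_cost": 11000.0},
    {"base_claim_codes": ["250.00"], "age": 30, "next_period_cost": 4000.0},
    {"base_claim_codes": [], "age": 30, "next_period_cost": 1000.0},
]
model = fit_cell_means(universe)
```

In practice a regression tree or one of the other techniques listed in claim 2 would replace the cell-means table, since a saturated table becomes sparse once there are many risk factors.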
2. The computer-implemented process of claim 1, wherein the
interaction capturing technique is selected from the group
consisting of median regression tree techniques, least square
regression tree techniques, rule induction techniques, ordinary
least squares regression techniques, median regression techniques,
robust regression techniques, genetic algorithms, rule induction,
clustering techniques and neural network techniques.
3. The computer implemented process of claim 1 wherein the
person-level next period cost forecasts are adjusted by modifying
the extant cost forecast by the expected cost trend.
4. The computer implemented process of claim 1 wherein the data
from the claims used as predictors consist essentially of the
claim- and enrollment-based risk factors and the claim amount is a
standardized cost of services provided and the model is used to
allocate prospective payments to health care providers.
5. The computer implemented process of claim 1 wherein the data
used from the claims data consist essentially of the claim code and
selected mandatory procedures and the claim amount is a
standardized cost of services provided during the same time period
as the base period and the model is used to evaluate the efficiency
of health care providers.
6. The computer implemented process of claim 1, further comprising
a computer implemented process of forecasting future claim amounts
attributable to claims from members of a book of business for an
actual policy period, wherein the model development universe
comprises data from the members of a book of business to be
insured, further comprising: applying the cost-forecasting model to
the actual underwriting period person-level data of each of the
members of the book of business to generate a person-level actual
policy period cost forecast for each member of the book of
business; and producing a group-level forecast for the actual
underwriting period from the person-level forecasts of each member
of the group by totaling the person-level actual policy period cost
forecasts for the group for the policy period.
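The group-level step of claim 6 is a simple aggregation, sketched below in Python. The function name and the `(group_id, cost)` input layout are assumptions for illustration; the person-level forecasts are taken as already produced by some cost-forecasting model.

```python
from collections import defaultdict


def group_level_forecasts(person_forecasts):
    """Total person-level policy period forecasts by group.

    person_forecasts: iterable of (group_id, forecast_cost) pairs,
    one pair per member of the book of business.
    """
    totals = defaultdict(float)
    for group_id, cost in person_forecasts:
        totals[group_id] += cost
    return dict(totals)


example_totals = group_level_forecasts(
    [("A", 1200.0), ("A", 800.0), ("B", 500.0)]
)
```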
7. The computer implemented process of claim 6, comprising in
addition the step of: setting insurance reserves based on
group-level forecast for the actual policy period, wherein the
policy period is a reserving period for claims that have not
occurred or that have occurred but not been reported.
8. The computer implemented process of claim 6, wherein claim
amounts are a mix of fee for service payments and capitation
payments so that the base and underwriting periods risk factors are
appended to include dummy variables for the presence of capitation
payments by provider type and the cost estimate in the next and
policy periods is the fee for service cost that must be
supplemented with the expected capitation payments.
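A short Python sketch of the capitation handling in claim 8 follows. The provider type names, the `capitated_` prefix, and both function names are hypothetical; the sketch only shows the two ideas in the claim, namely one dummy variable per provider type and a fee-for-service forecast that is supplemented with expected capitation payments.

```python
def capitation_dummies(capitated_provider_types, all_provider_types):
    """Build one 0/1 dummy variable per provider type.

    A dummy is 1 when the person has capitation payments for that
    provider type, and 0 otherwise.
    """
    return {
        f"capitated_{ptype}": int(ptype in capitated_provider_types)
        for ptype in all_provider_types
    }


def total_expected_cost(ffs_forecast, expected_capitation_payments):
    """Add expected capitation payments to the fee-for-service forecast."""
    return ffs_forecast + sum(expected_capitation_payments.values())


dummies = capitation_dummies({"primary_care"}, ["primary_care", "specialist"])
total = total_expected_cost(3000.0, {"primary_care": 240.0})
```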
9. The computer implemented process of claim 6, wherein the cost
forecast is produced for first-dollar health insurance.
10. The computer implemented process of claim 6, wherein the cost
forecast is produced for specific plus aggregate stop loss health
insurance.
11. The computer implemented process of claim 10, wherein the cost
forecast produced is for aggregate-only stop loss health
insurance.
12. The computer implemented process of claim 10, wherein the cost
forecast produced is for specific stop loss health insurance.
13. The computer implemented process of claim 1, wherein each of
the diagnosis and CPT based risk factors is independent of the
sequence in time of the other diagnosis and CPT based risk
factors.
14. The computer implemented process of claim 1, wherein the
providing of risk factors for the health care claim data is
substantially free of human expert interaction.
15. The computer implemented process of claim 1, wherein capturing
the predictive ability of the main effects and interactions of
claim based risk factors and enrollment-based risk factors is
substantially free of human expert interaction.
16. The computer implemented process of claim 1, comprising in
addition the step of: setting medical insurance reserves through
application of the health care cost forecasting model, wherein the
next period is a reserving period for claim amounts that have not
occurred or that have occurred but not been reported.
17. The computer implemented process of claim 1 for forecasting
short term disability (STD) costs wherein a dependent measure for
generating the cost forecasting model is the number of STD days in
the policy period and is weighted by the expected cost per day for
the STD to produce the person-level forecast STD costs and summed
across the group to produce the group's forecast STD cost.
18. The computer implemented process of claim 1, for forecasting a
probability of long term disability (LTD) claims wherein a
dependent measure for generating the cost forecasting model is the
probability of a LTD claim in the policy period where the
probability is weighted by the net present value of the LTD claim
amount and comprises in addition producing person-level expected
LTD costs and summing person-level expected LTD costs across the
group to produce a group's expected LTD cost.
19. The computer implemented process of claim 1 for forecasting
group term life insurance costs wherein a dependent measure for
generating the forecasting model is the expected probability of
death weighted by the amount of life insurance to produce the
person-level expected term life insurance cost which is summed
across the group to produce the group's expected term life
insurance cost.
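Claims 17 through 19 share one pattern: a forecast dependent measure is weighted by a monetary amount to give a person-level expected cost, and the person-level costs are summed across the group. The Python sketch below shows that arithmetic only. The function names and example figures are assumptions, and the forecast days and probabilities are taken as outputs of a model that is not shown.

```python
def expected_std_cost(forecast_std_days, cost_per_day):
    """Claim 17: forecast STD days weighted by the expected cost per day."""
    return forecast_std_days * cost_per_day


def expected_ltd_cost(prob_ltd_claim, npv_ltd_claim):
    """Claim 18: LTD claim probability weighted by the claim's net present value."""
    return prob_ltd_claim * npv_ltd_claim


def expected_life_cost(prob_death, life_insurance_amount):
    """Claim 19: probability of death weighted by the amount of life insurance."""
    return prob_death * life_insurance_amount


def group_expected_cost(person_level_costs):
    """Sum person-level expected costs across the group."""
    return sum(person_level_costs)


# Hypothetical two-person group with forecast STD days of 2.5 and 0.5.
group_std = group_expected_cost(
    [expected_std_cost(2.5, 200.0), expected_std_cost(0.5, 200.0)]
)
```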
20. The computer implemented process of claim 1, wherein claim
amounts are a mix of fee for service payments and capitation
payments so that the base and underwriting periods risk factors are
appended to include dummy variables for the presence of capitation
payments by provider type.
21. A computer-implemented process of developing a hybrid
person-level health care claim cost forecasting model for
forecasting future medical costs attributable to health care claims
from members of a book of business, where person-level data are
available for a substantial portion of the members of the book of
business, comprising the steps of: providing development universe
data comprising person-level data for a statistically meaningful
number of individuals, the person-level data comprising continuous
variable data and categorical variable data; processing first the
continuous variable data for each individual with a continuous
processing technique that captures the predictive ability of main
effects and interactions of continuous variables to generate a
person-level continuous variable model; and processing the
categorical variable data for each individual including the output
from the continuous processing technique with a categorical
processing technique that captures the predictive ability of main
effects and interactions of categorical variables to generate a
person-level categorical variable model; wherein the person-level
continuous variable model and person-level categorical variable
model together comprise a hybrid person-level health care claim
amount forecasting model.
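The two-stage structure of claim 21 can be sketched in Python as follows. This is one possible reading, not the applicants' method: stage 1 is assumed to be an ordinary least squares line on a single continuous variable (`prior_cost`), and stage 2 is assumed to be a cell-means table over a categorical risk factor combined with a banded version of the stage 1 output. The field names and the `cut` threshold are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def fit_line(xs, ys):
    """Stage 1: ordinary least squares fit of y = a + b * x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx else 0.0
    return my - b * mx, b


def fit_hybrid(people, cut):
    """Fit the continuous model, then a categorical model that uses its output."""
    a, b = fit_line([p["prior_cost"] for p in people],
                    [p["next_cost"] for p in people])
    cells = defaultdict(list)
    for p in people:
        stage1 = a + b * p["prior_cost"]
        band = "high" if stage1 >= cut else "low"  # stage 1 output as a category
        cells[(band, p["risk_factor"])].append(p["next_cost"])
    return {"a": a, "b": b, "cut": cut,
            "cells": {k: mean(v) for k, v in cells.items()}}


def predict_hybrid(model, prior_cost, risk_factor):
    """Predict with the categorical model; fall back to stage 1 for unseen cells."""
    stage1 = model["a"] + model["b"] * prior_cost
    band = "high" if stage1 >= model["cut"] else "low"
    return model["cells"].get((band, risk_factor), stage1)


people = [
    {"prior_cost": 0.0, "next_cost": 1000.0, "risk_factor": "none"},
    {"prior_cost": 1000.0, "next_cost": 2000.0, "risk_factor": "none"},
    {"prior_cost": 2000.0, "next_cost": 3000.0, "risk_factor": "diabetes"},
    {"prior_cost": 3000.0, "next_cost": 4000.0, "risk_factor": "diabetes"},
]
hybrid = fit_hybrid(people, cut=2500.0)
```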
22. The computer-implemented process of claim 21, wherein the
continuous variable data comprises data selected from the group
consisting of age, length of prior enrollment, historical claim
amounts and transformations and trends in the person level claim
amounts.
23. The computer-implemented process of claim 21, wherein the
categorical variable data comprises data selected from the group
consisting of clinical risk factors, provider type and site of
care.
24. The computer-implemented process of claim 21, wherein the
continuous processing technique is selected from the group
consisting of regression techniques and neural network
techniques.
25. The computer-implemented process of claim 21, wherein the
categorical processing technique is selected from the group
consisting of median regression tree techniques, least square
regression tree techniques, rule induction techniques, and neural
network techniques.
26. The computer-implemented process of claim 21, wherein the
person-level data is available for a substantial portion of the
members of the book of business for an actual underwriting period,
and the claim amounts of interest for forecasting purposes are
during an actual policy period which can be, but is not necessarily,
contiguous with the actual underwriting period, and the development
universe data comprises person-level data for each individual for a
historical base period and a historical next period.
27. The computer-implemented process of claim 21, wherein the
hybrid person-level health care claim cost forecasting model is
used as an input into an interaction capturing technique that uses
all of the risk factors that were meaningful in the hybrid
person-level health care claim cost forecasting model to forecast
future medical claim amounts.
28. A computer-implemented process of developing a claim amount
forecasting model for use in forecasting the future claim amount
for members of a book of business, where person-level data are
available for a substantial portion of the members of the book of
business for an actual base period, and the claim amount of
interest for forecasting purposes is an actual next period which
can be, but is not necessarily, contiguous with the actual base
period, comprising the steps of: processing the base period data
having claims to generate a having-claims claim amount forecasting
model; and processing the base period data without claims to
generate a without-claims claim amount forecasting model, wherein
the having-claims cost forecasting model and the without-claims
forecasting model comprise a claim amount forecasting model.
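A minimal Python sketch of the split in claim 28 is given below. The two sub-models are deliberately simple stand-ins chosen for illustration: a single cost ratio for people with base period claims and a plain mean for people without. The field names `base_claims` and `next_cost` are assumptions.

```python
from statistics import mean


def fit_split_model(people):
    """Fit separate sub-models for people with and without base period claims."""
    with_claims = [p for p in people if p["base_claims"]]
    without_claims = [p for p in people if not p["base_claims"]]

    # Having-claims sub-model: ratio of next period cost to base period cost.
    base_total = sum(sum(p["base_claims"]) for p in with_claims)
    next_total = sum(p["next_cost"] for p in with_claims)
    ratio = next_total / base_total if base_total else 0.0

    # Without-claims sub-model: average next period cost.
    no_claim_mean = (mean(p["next_cost"] for p in without_claims)
                     if without_claims else 0.0)
    return {"ratio": ratio, "no_claim_mean": no_claim_mean}


def predict_split(model, base_claims):
    """Route a person to the sub-model that matches their base period data."""
    if base_claims:
        return model["ratio"] * sum(base_claims)
    return model["no_claim_mean"]


split_model = fit_split_model([
    {"base_claims": [100.0, 400.0], "next_cost": 1000.0},
    {"base_claims": [500.0], "next_cost": 1000.0},
    {"base_claims": [], "next_cost": 200.0},
    {"base_claims": [], "next_cost": 400.0},
])
```

Routing at prediction time on the same condition used to split the training data keeps each sub-model applied only to the population it was fitted on.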
29. A computer-implemented process of developing a health care
claim amount forecasting model for use in forecasting the future
medical claim amount for members of a book of business, where
person-level data are available for a substantial portion of the
members of the book of business for an actual base period, and the
claim amount of interest for forecasting purposes is an actual next
period which can be, but is not necessarily, contiguous with the
actual base period, comprising the steps of: providing development
universe data comprising person-level data for a statistically
meaningful plurality of individuals, wherein the person-level data
for an individual comprises health care claims data for the
individual and the data on a health care claim comprises at least a
claim amount and a claim code; Winsorizing the person-level data to
yield inlier data and outlier data; processing the inlier data to
generate an inlier cost forecasting model; and processing the
outlier data to generate an outlier cost forecasting model; wherein
the combination of the results of the inlier and outlier cost
forecasting models together produce a person-level claim amount
forecast model.
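The inlier and outlier split of claim 29 can be sketched in Python as follows. The sketch assumes one common reading of Winsorizing: each person's amount is capped at a threshold, the capped part is treated as inlier data, and the excess over the cap is treated as outlier data. The cap value, the field names, and the use of per-risk-factor means as the two sub-models are all assumptions for illustration.

```python
from collections import defaultdict
from statistics import mean


def winsorize_split(cost, cap):
    """Split a cost into the capped (inlier) part and the excess (outlier) part."""
    inlier = min(cost, cap)
    return inlier, cost - inlier


def fit_inlier_outlier(people, cap):
    """Fit separate mean-cost models to the inlier and outlier parts."""
    inlier_cells, outlier_cells = defaultdict(list), defaultdict(list)
    for p in people:
        inlier, outlier = winsorize_split(p["next_cost"], cap)
        inlier_cells[p["risk_factor"]].append(inlier)
        outlier_cells[p["risk_factor"]].append(outlier)
    inlier_model = {k: mean(v) for k, v in inlier_cells.items()}
    outlier_model = {k: mean(v) for k, v in outlier_cells.items()}
    return inlier_model, outlier_model


def predict_combined(models, risk_factor):
    """Combine the inlier and outlier forecasts into one person-level forecast."""
    inlier_model, outlier_model = models
    return inlier_model[risk_factor] + outlier_model[risk_factor]


models = fit_inlier_outlier(
    [
        {"risk_factor": "cancer", "next_cost": 120000.0},
        {"risk_factor": "cancer", "next_cost": 20000.0},
        {"risk_factor": "none", "next_cost": 1000.0},
        {"risk_factor": "none", "next_cost": 3000.0},
    ],
    cap=50000.0,
)
```

With plain means the two parts simply add back to the overall mean; the separation becomes useful when the inlier and outlier parts are fitted with different techniques or different predictors.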
30. The computer-implemented process of claim 29 further
comprising: Winsorizing the inlier data to yield inlier data having
claims and inlier data without claims; processing the inlier data
having claims to generate an inlier-having-claims claim amount
forecasting model; and processing the inlier data without claims to
generate an inlier-without-claims claim amount forecasting model,
wherein the inlier-having-claims cost forecasting model and the
inlier-without-claims forecasting model comprise an inlier claim
amount forecasting model.
31. A computer-implemented process of forecasting a claim amount
attributable to claims from members of a book of business during an
actual policy period, comprising the steps of: providing
person-level data, comprising enrollment data for members of a book
of business to be insured for an actual underwriting period that
can be, but is not necessarily, contiguous with the actual policy
period; providing a model development universe of person-level
data, comprising enrollment data from the historical base period
and historical next period health care claims data for a
statistically meaningful number of individuals; providing
enrollment-based risk factors for each historical base period and
providing next period claim amounts; developing a health care
cost-forecasting model for the enrollment data by capturing the
predictive ability of main effects and interactions of
enrollment-based risk factors through the application of an
interaction capturing technique to the model development universe;
applying the health care cost-forecasting model to the person-level
underwriting period enrollment data of each of the members of the
book of business to generate a person-level expected cost forecast
for the policy period for each member of the book of business; and
producing a group-level forecast for the expected cost of the
policy period from the person-level forecasts of each person of the
group by totaling the person-level expected cost forecasts for the
actual policy period.
32. A computer-implemented process of forecasting costs
attributable to claims from members of a book of business during an
actual policy period, comprising the steps of: providing
person-level data, comprising enrollment data and actual
underwriting period health care claims data, for members of a book
of business, where the person-level data on a health care claim
comprises at least a claim amount and a claim code and the actual
underwriting period can be, but is not necessarily, contiguous with
the actual policy period; providing a model development universe of
person-level data, comprising enrollment data, historical base
period health care claims data and historical next period claim
amount data for a statistically meaningful number of individuals,
where the person-level data on a base period health care claim
includes at least a claim amount and a claim code; providing
claim-based risk factors for each historical base period based on
the claim code associated with the health care claim and providing
at least one enrollment risk factor based on the enrollment data;
developing a cost-forecasting model by capturing the predictive
ability of main effects and interactions of risk factors through
the application of an interaction capturing technique to the model
development universe; applying the cost-forecasting model to the
person-level data of each of the individuals or members of a group
to generate a person-level actual policy period expected cost
forecast for each member of the group; and producing a group-level
forecast for the actual policy period from the person-level
forecasts of each individual or member of the group by totaling the
person-level cost forecasts for the actual policy period.
33. The computer implemented process of claim 32, comprising in
addition the step of: setting claim amount reserves based on the
individual or group-level forecast, wherein the next period is a
reserving period for claims that have not occurred or that have
occurred but not been reported.
34. The computer implemented process of claim 32 for forecasting
short term disability costs wherein the interaction capturing
technique uses a dependent measure from the next period and policy
period comprising the number of STD days in the policy period and
weights the dependent measure by the expected cost per day for the
STD to produce the person-level expected STD costs and summed
across the group to produce the group's expected STD cost.
35. The computer implemented process of claim 32, for forecasting a
probability of long term disability (LTD) claims wherein a
dependent measure for generating the cost forecasting model is the
probability of a LTD claim in the policy period where the
probability is weighted by the net present value of the LTD and
applying the cost forecasting model to the person-level data
produces person-level expected LTD costs wherein summing the
person-level expected LTD costs across the group to produce a
group's expected LTD cost for an actual policy period.
36. The computer implemented process of claim 32, wherein the cost
forecast is produced for first-dollar health insurance.
37. The computer implemented process of claim 32, wherein the cost
forecast is produced for specific plus aggregate stop loss health
insurance.
38. The computer implemented process of claim 32, wherein the cost
forecast produced is for aggregate-only stop loss health
insurance.
39. The computer implemented process of claim 32, wherein the cost
forecast produced is for specific stop loss health insurance.
40. The computer implemented process of claim 32 for forecasting
group term life insurance costs wherein a dependent measure for
generating the cost forecasting model is the expected probability
of death weighted by the amount of life insurance to produce the
person-level expected term life insurance cost which is summed
across the group to produce the group's expected term life
insurance cost.
41. The computer implemented process of claim 32, wherein claim
amounts are a mix of fee for service payments and capitation
payments so that the base and underwriting periods risk factors are
appended to include dummy variables for the presence of capitation
payments by provider type and the cost estimate in the next and
policy periods is the fee for service cost that must be
supplemented with the expected capitation payments.
42. The process of claim 32 further comprising developing a
group-level cost-forecasting model for groups in the book of
business by capturing the predictive ability of main effects and
interactions of group-level risk factors which include but are not
limited to groups' historical claim amounts, group-level sum of the
person-level forecasts, SIC code or industry type, characteristics
of the benefit plan design, geographic locale, and number of people
and length of time covered by the insurance through the application
of an interaction capturing technique to the model development
universe of groups.
43. The computer implemented process of claim 42, comprising in
addition the step of: setting medical insurance reserves based on
the group-level forecast, wherein the next period is a reserving
period for claims that have not occurred or that have occurred but
not been reported.
44. The computer implemented process of claim 42 for forecasting
short term disability costs wherein the interaction capturing
technique uses a group-level dependent measure of residual STD days
at the group level to calculate forecast STD costs by weighting by the
group's expected STD cost per day.
45. The computer implemented process of claim 42, wherein medical
claim amounts are a mix of fee for service payments and capitation
payments so that the base and underwriting periods group-level risk
factors are appended to include dummy variables for the presence of
capitation payments by provider type and the cost estimate in the
next and policy periods is the fee for service cost that must be
supplemented with the expected capitation payments.
46. The process of claim 32 comprising in addition the steps of:
providing a provider type cost trend forecast adjustment to be
utilized by at least one member of the group to be insured;
adjusting the person-level next period cost forecast for each
member using the health care provider type with the provider type
cost trend forecast adjustment.
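The provider type trend adjustment of claim 46 might look like the Python sketch below. The claim does not state how the adjustment is applied, so a multiplicative trend per provider type is assumed here, and the function name, provider type names, and trend values are hypothetical.

```python
def apply_provider_trend(forecast_by_provider_type, trend_by_provider_type):
    """Adjust a person's forecast using a cost trend for each provider type.

    forecast_by_provider_type: person-level forecast cost per provider type.
    trend_by_provider_type: expected fractional cost trend per provider type
    (0.10 means a 10 percent increase). Missing types are left unadjusted.
    """
    return sum(
        cost * (1.0 + trend_by_provider_type.get(ptype, 0.0))
        for ptype, cost in forecast_by_provider_type.items()
    )


adjusted = apply_provider_trend(
    {"hospital": 1000.0, "pharmacy": 500.0},
    {"hospital": 0.10, "pharmacy": 0.20},
)
```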
47. An automated system for forecasting future costs attributable
to claims from members of a book of business during an actual
policy period comprising: a central processing unit; an insured
person database, accessible by the processor, wherein the database
comprises person-level enrollment data and actual underwriting
period health care claims data, for members of a book of business
to be insured, where the person-level data on a health care claim
comprises at least a claim amount and a claim code; a model
development universe database, accessible by the processor, wherein
the second database comprises a model development universe of
person-level data, comprising enrollment data, historical base
period health care claims data and historical next period claim
amount data for a statistically meaningful number of individuals,
where the person-level data on the base period health care claim
includes at least a claim amount and a claim code; a risk factor
encoder, accessible by the processor, wherein the risk factor
encoder encodes claim-based risk factors for each historical base
period based on the claim code associated with the health care
claim and the risk factor encoder encodes at least one enrollment
risk factor based on the enrollment data; a model generator,
accessible by the processor, that generates a cost-forecasting
model by capturing the predictive capacity of the main effects and
the interaction of the risk factors assigned by the risk factor
encoder to forecast the historical next period of the model
development universe data using the historical base period data; a
person-level cost generator that applies the cost-forecasting model
to the person-level actual underwriting period health care claims
data of each of the members of the book of business to generate a
person-level actual policy period claim amount forecast for each
member of the book of business; and an actual policy period
group-level cost forecast generator that totals the person-level
actual next period forecasts for each member of the group to
generate an actual policy period group-level cost forecast.
48. The system of claim 47 wherein the model generator captures the
predictive ability of main effects and interactions of group-level
risk factors which include but are not limited to groups' historical
claim amounts, group-level sum of the person-level forecasts, SIC
code or industry type, characteristics of the benefit plan design,
geographic locale, and the number of people and length of time
covered by the insurance through the application of an interaction
capturing technique to the model development universe of
groups.
49. A computer-implemented process of forecasting costs
attributable to claims from members of a book of business during an
actual policy period, comprising the steps of: means for providing
person-level data, comprising enrollment data and actual
underwriting period health care claims data, for members of a book
of business, where the person-level data on a health care claim
comprises at least a claim amount and a claim code and the actual
underwriting period can be, but is not necessarily, contiguous with
the actual policy period; means for providing a model development
universe of person-level data, comprising enrollment data,
historical base period health care claims data and historical next
period claim amount data for a statistically meaningful number of
individuals, where the person-level data on a base period health
care claim includes at least a claim amount and a claim code; means
for providing claim-based risk factors for each historical base
period based on the claim code associated with the health care
claim and providing at least one enrollment risk factor based on
the enrollment data; means for developing a cost-forecasting model
by capturing the predictive ability of main effects and
interactions of risk factors through the application of an
interaction capturing technique to the model development universe;
means for applying the cost-forecasting model to the person-level
data of each of the individuals or members of a group to generate a
person-level actual policy period expected cost forecast for each
member of the group; and means for producing a group-level forecast
for the actual policy period from the person-level forecasts of
each individual or member of the group by totaling the person-level
cost forecasts for the actual policy period.
50. The system recited in claim 49 wherein the system further is
automated such that when actual underwriting period data is
provided the system automatically provides an actual policy period
claim amount forecast.
51. The system recited in claim 49 for use by a client having data
and an Internet client application, further comprising an Internet
server application such that when the client provides actual
underwriting period data to the Internet server application, the
Internet server application automatically provides an actual policy
period claim amount forecast.
52. A group insurance product comprising: an identification of the
types of benefits which are agreed to be provided by an insurer to
or on behalf of members of a group, which will be incurred by
members of said group during a future time period; and a stated
monetary insurance premium including a forecast of said benefits
made in accordance with the process of claim 32, estimated costs of
administering the insurance product, and optionally, an estimated
profit, whereby an insurer agrees to cover the identified benefits
in exchange for the payment of the stated monetary insurance
premium.
53. The group health insurance product of claim 52 for insuring
short term disability costs wherein the interaction capturing
technique uses a dependent measure from the next period and policy
period comprising the number of STD days in the policy period and
weights the dependent measure by the expected cost per day for the
STD to produce the person-level expected STD costs and summed
across the group to produce the group's expected STD cost.
54. The group health insurance product of claim 52 for insuring
long term disability (LTD) claims wherein a dependent measure for
generating the claim amount forecasting model is the probability of
a LTD claim in the policy period where the probability is weighted
by the net present value of the LTD and applying the cost
forecasting model to the person-level data produces person-level
expected LTD costs wherein summing the person-level expected LTD
costs across the group to produce a group's expected LTD cost for
an actual policy period.
55. The group health insurance product of claim 52, wherein the
cost forecast is produced for first-dollar health insurance.
56. The group health insurance product of claim 52, wherein the
cost forecast is produced for specific plus aggregate stop loss
health insurance.
57. The group health insurance product of claim 52, wherein the
cost forecast produced is for aggregate-only stop loss health
insurance.
58. The group health insurance product of claim 52, wherein the
cost forecast produced is for specific stop loss health
insurance.
59. The group health insurance product of claim 52 for insuring
group term life insurance costs wherein a dependent measure for
generating the cost forecasting model is the expected probability
of death weighted by the amount of life insurance to produce the
person-level expected term life insurance cost.
60. The group health insurance product of claim 52, comprising a
renewal product, wherein the model development universe comprises
data from the members of a group in the book of business to be
insured.
61. A method of reserving for the group health insurance product of
claim 48, comprising in addition the step of: setting insurance
reserves based on the renewal group-level forecast for the actual
underwriting period, wherein the next period is a reserving period
for claims that have not occurred or that have occurred but not
been reported.
62. A method of pricing group insurance including a cost of future
benefits according to the computer-implemented process of
forecasting future medical costs attributable to claims from
members of a group during an actual underwriting period of claim
32, comprising the additional steps of: providing an expected
amount of administrative costs allocable to providing health
insurance coverage to the group; providing a minimum acceptable
expected profit; totaling the group level cost forecast, expected
amount of administrative costs, and minimum acceptable expected
profit to yield a total minimum price, and providing a
plurality of expected probabilities of retention for the group
corresponding to a plurality of possible prices greater than or
equal to the total minimum price, each possible price also having
an expected profit that is the amount of the price over the group
level cost forecast plus the expected amount of administrative
costs; and calculating a plurality of possible maximum profits by
multiplying each of the plurality of possible profits by the
corresponding expected probability of retention, wherein the
largest possible maximum profit is used to price the group
insurance.
63. A method of pricing group insurance of claim 62 for insuring
short term disability costs wherein the interaction capturing
technique uses a dependent measure from the next period and policy
period comprising the number of STD days in the policy period and
weights the dependent measure by the expected cost per day for the
STD to produce the person-level expected STD costs and summed
across the group to produce the group's expected STD cost.
64. A method of pricing group insurance of claim 62 for insuring
long term disability (LTD) claims wherein a dependent measure for
generating the cost forecasting model is the probability of a LTD
claim in the policy period where the probability is weighted by the
net present value of the LTD and applying the cost forecasting
model to the person-level data produces person-level expected LTD
costs, wherein summing the person-level expected LTD costs across
the group produces a group's expected LTD cost for an actual
policy period.
65. A method of pricing group insurance of claim 62, wherein the
pricing is produced for first-dollar health insurance.
66. A method of pricing group insurance of claim 62, wherein the
pricing is produced for stop loss health insurance.
67. A method of pricing group insurance of claim 62, wherein the
pricing produced is for aggregate-only stop loss health
insurance.
68. A method of pricing group insurance of claim 62, wherein the
pricing produced is for specific stop loss health insurance.
69. A method of pricing group insurance of claim 62 for insuring
group term life insurance costs wherein a dependent measure for
generating the cost forecasting model is the expected probability
of death weighted by the amount of life insurance to produce the
person-level expected term life insurance cost.
70. A method of pricing group insurance of claim 62, comprising a
renewal product, wherein the model development universe comprises
data from the members of a group in the book of business to be
insured.
71. A method of underwriting an insurance product comprising the
steps of: providing an identification of the coverage of the
insurance product which identifies the conditions of payment under
the product during a policy period; providing person-level health
care claim information comprising enrollment data, and base period
and underwriting period claim data, the claim data comprising claim
codes having associated claim costs; capturing the predictive
ability of the person-level health care claim information through
the application of an interaction capturing technique; and
forecasting a predicted cost of the insurance product during the
policy period based on the identification of the coverage of the
insurance product and the captured predictive ability of the
person-level health care claim information; wherein each of
diagnosis and CPT based risk factor is independent of the sequence
in time of other diagnosis and CPT based risk factors.
72. The method of underwriting an insurance product of claim 71, for
insuring short term disability costs wherein the interaction
capturing technique uses a dependent measure from the next period
and policy period comprising the number of STD days in the policy
period and weights the dependent measure by the expected cost per
day for the STD to produce the person-level expected STD costs and
summed across the group to produce the group's expected STD
cost.
73. The method of underwriting an insurance product of claim 71, for
insuring long term disability (LTD) claims wherein a dependent
measure for generating the cost forecasting model is the
probability of a LTD claim in the policy period where the
probability is weighted by the net present value of the LTD and
applying the cost forecasting model to the person-level data
produces person-level expected LTD costs wherein summing the
person-level expected LTD costs across the group produces a
group's expected LTD cost for an actual policy period.
74. The method of underwriting an insurance product of claim 71,
wherein the cost forecast is produced for first-dollar health
insurance.
75. The method of underwriting an insurance product of claim 71,
wherein the cost forecast is produced for stop loss health insurance.
76. The method of underwriting an insurance product of claim 71
wherein the
cost forecast produced is for aggregate-only stop loss health
insurance.
77. The method of underwriting an insurance product of claim 71 wherein the
cost forecast produced is for specific stop loss health
insurance.
78. The method of underwriting an insurance product of claim 71 for insuring
group term life insurance costs wherein a dependent measure for
generating the cost forecasting model is the expected probability
of death weighted by the amount of life insurance to produce the
person-level expected term life insurance cost.
79. The method of underwriting an insurance product of claim 71 comprising
renewal underwriting, wherein the model development universe
comprises data from the members of a group in the book of business
to be insured.
80. The method of underwriting an insurance product of claim 71 comprising
in addition the step of: setting insurance reserves based on the
renewal group-level forecast for the actual underwriting period,
wherein the next period is a reserving period for claims that have
not occurred or that have occurred but not been reported.
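As an illustration only, the price selection recited in claim 62 can be sketched in Python as follows. The function name, the argument names and the idea of passing the retention probabilities as a callable are assumptions made for this sketch and are not taken from the program listing appendix.

```python
def select_price(cost_forecast, admin_cost, min_profit,
                 candidate_prices, retention_prob):
    """Return (price, retention-weighted profit) for the best candidate.

    retention_prob is a hypothetical callable mapping a price to the
    expected probability that the group is retained at that price.
    """
    # Total minimum price: forecast cost + administrative cost + minimum profit.
    min_price = cost_forecast + admin_cost + min_profit
    best = None
    for price in candidate_prices:
        if price < min_price:
            continue  # only prices at or above the total minimum price qualify
        # Expected profit at this price, before weighting by retention.
        profit = price - (cost_forecast + admin_cost)
        weighted = profit * retention_prob(price)
        if best is None or weighted > best[1]:
            best = (price, weighted)
    if best is None:
        raise ValueError("no candidate price reaches the total minimum price")
    return best
```

In this sketch a higher price raises the profit term but typically lowers the retention probability, so the selected price is the one where the product of the two is largest.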
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application is based on provisional applications
60/249,060, filed Nov. 15, 2000, and 60/267,131 filed Feb. 7, 2001,
which are incorporated by reference herein.
REFERENCE TO PROGRAM LISTINGS
[0002] A computer program listing appendix has been submitted on
compact disc for this disclosure. The material on that compact disc
is incorporated by reference herein. The compact disc was filed
with 2 copies, and contains the following file:
TABLE-US-00001
NAME OF FILE | DATE OF CREATION | SIZE IN BYTES
APPENDIX.TXT | May 14, 2001 | 281,991
The names above are the names of the files on the compact disc, the
dates are the dates the files were created on the compact disc, and
the size in bytes is the size of the file. Please note that there
is a glossary of terms included at the end of the Background
section.
BACKGROUND OF THE INVENTION
[0003] This invention pertains to health, disability and life
insurance systems, particularly including processing data (in the
business of health insurance) for estimating future costs or
liability and setting optimal pricing. For convenience, we call one
embodiment of our invention More Accurate Predictions for Health
Insurance Premiums or MAP4HIP.
[0004] Group health insurance is typically priced through a series
of steps. Historical claims costs are calculated by summing the
costs of insured individuals. Actuaries estimate what the general
cost inflation trend will be next period. If an insured group is
large enough to have credible experience (historical costs), the
inflation trend may be applied to the historical claims experience
to produce an estimate of the expected claims for next period. A
profit margin and administrative costs are added to the expected
group claims costs to produce the so-called "experience rate". An
underwriter reviews the group's experience and adjusts the cost and
profit margin-based price depending on special circumstances and
competitive pressure. The standard practice is to use group-level
data for estimating costs and setting prices except for very small
groups, individual policies or specific medical stop loss
insurance. Information on the insured's (i.e., individual's)
medical conditions is typically not used when group-level data are
used for underwriting and pricing the group's aggregate cost
forecast.
[0005] The current standard practice for estimating future health
care costs for groups of 50 or more employees plus their dependents
uses one of two methods or is a combination of those methods. If
the group is large enough to have credible, stable experience, the
historical costs are assumed to be the best estimate of next
period's costs after a cost trend factor for inflation has been
included. If the group is too small to have credible historical
costs, many groups are combined together and averaged so that a
stable demographic look-up table of historical average costs by age
group by gender by family size can be developed and used as a
weighting mechanism for estimating the expected future costs for
non-credible groups. Cost trend factors for inflation are then
applied. If a group's experience is only partially credible, a
blended average of its experience and a
demographic look-up table forecast is used. These standard
actuarial methods do not account for person-level trends in
historical costs nor medical information about the person.
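The conventional blending described above can be sketched as follows. This is a minimal Python illustration, not the actuarial practice of any particular insurer: the table keys (age band and sex only, omitting family size), the dollar amounts, and the credibility weight are all hypothetical.

```python
def demographic_forecast(members, table):
    """Sum the look-up table cell amount for each member of the group."""
    return sum(table[(m["age_band"], m["sex"])] for m in members)


def blended_forecast(experience_cost, members, table, credibility, trend):
    """Blend group experience with the demographic estimate, then trend it.

    credibility is a hypothetical weight in [0, 1]: 1.0 means fully
    credible experience, 0.0 means the look-up table is used alone.
    trend is the assumed cost inflation for the next period (0.10 = 10%).
    """
    if not 0.0 <= credibility <= 1.0:
        raise ValueError("credibility must lie between 0 and 1")
    manual = demographic_forecast(members, table)
    blended = credibility * experience_cost + (1.0 - credibility) * manual
    return blended * (1.0 + trend)


# Hypothetical two-cell table and two-member group.
example_table = {("18-39", "F"): 3000.0, ("40-64", "M"): 5000.0}
example_members = [
    {"age_band": "18-39", "sex": "F"},
    {"age_band": "40-64", "sex": "M"},
]
example = blended_forecast(10000.0, example_members, example_table, 0.5, 0.10)
```

Note that nothing in this calculation uses a member's own prior costs or diagnoses, which is the limitation the paragraph above points out.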
[0006] Small groups (i.e., 50 or fewer employees plus their
dependents) or individual medical policies may use medical
questionnaires from initial enrollment applications as input to an
underwriter for estimating next period's group-level costs. Manual
underwriting is expensive due to the labor intensity and is prone
to variability among underwriters as their experience varies.
[0007] Some state Medicaid HMO programs (e.g., Colorado and
Maryland) and federal Medicare HMO programs are using statistical
algorithms that make person-level cost forecasts based on diagnoses
from the computerized medical bills and demographic factors. These
"risk adjustment" methods do not use procedures or historical
person-level costs as the governments do not want incentives for
increased utilization of services and spending more money. The
governments' intent for HMO payments or managed care is to make
payments proportional to the insured population's need for care
based on their health conditions but not on prior care. However,
historical cost is the single best predictor of future medical cost
for credible groups. Not using it as part of the forecasting method
decreases the accuracy of the forecast.
[0008] Some medical insurance companies may be using such "risk
adjustment" algorithms used by Medicare, Medicaid and others
intended for managed care cost forecasting or payment allocation.
However, the prospective use of historical costs, types of services
and procedures as well as diagnoses and demographics, as well as
combinations of these variables, to produce more accurate cost
forecasts than "risk adjustment" algorithms using only diagnoses
and demographic factors, would be desirable.
[0009] There are person-level diagnosis and procedure models that
measure the efficiency of medical practices (i.e., costs of care
given the patient's conditions). These models are typically
concurrent or retrospective in nature and not prospective.
Symmetry's ETGs are a good example of this class of models. Such a
model lacks cost experience as a predictor, since cost is intended as
the dependent variable. It may also limit the use of demographic
variables.
Forecasting models would be desirable which are prospective and not
designed for concurrent or retrospective analysis. The methods of
the present invention can be applied to concurrent data to develop
models for efficiency analysis, as will be described.
[0010] Stop loss health (or medical) insurance is typically
purchased by self-insured employers that wish to limit their
medical expense exposure. The most common form of medical stop loss
insurance is known as "specific stop loss" insurance which is a
high deductible (usually $25,000 to $100,000) insurance policy per
insured person. Specific stop loss medical insurance is designed to
protect the employer or other payer from large catastrophic medical
expenses such as those incurred for liver transplants or care for
neonates with major repairable congenital anomalies. The standard
method for underwriting specific stop loss medical insurance uses a
demographic look-up table to estimate costs for individuals whose
medical expenses were under 50% of the deductible in the previous
year. If an insured's medical expenses were over a predetermined
amount, such as over 50% of the specific deductible, the insured's
medical records are reviewed manually by an underwriter, and next
year's costs are estimated by the underwriter or a doctor or nurse
using their experience and expert opinion. Manual medical
underwriting for specific stop loss has the same problems as manual
underwriting for small group medical insurance; it is expensive and
prone to underwriter variability.
[0011] Frequently, "aggregate stop loss medical insurance" coverage
is also purchased by the employer. Aggregate coverage (exclusive of
specific payments) means that the insurer will pay the employer's
or other payer's medical cost obligations for a covered group if
those costs exceed an agreed upon amount (i.e., an "attachment
point"). The attachment point is typically defined as 125% of the
group's expected cost in the insured period. The industry standard
for calculating the expected cost is substantially the same method
as used for fully insured plans. In other words, if the group is
large enough to have completely credible experience, the last
year's experience is modified by forecast inflation and increased
by 25% to produce the 125% attachment point. If the group's
experience is partially credible, then a weighted combination of
experience and demographic look-up table model is used with an
inflation forecast and increased 25% to calculate the 125%
attachment point. When the group is too small to have credible
experience, the demographic look-up table model is used as the
starting point; that estimate is then trended for inflation and
increased by 25% to calculate the 125% attachment point. Aggregate
only medical stop
loss insurance has been recently offered by one company
(Cairnstone) to credible groups, and we believe that it uses
group-level experience plus trended inflation to estimate future
costs. Price is usually determined by competitive pressure but the
inventors are not familiar with proprietary techniques used by the
insurers.
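The attachment point arithmetic described above is simple enough to sketch directly. The Python below is illustrative only; the function names and the default corridor of 1.25 (the typical 125% figure) are assumptions of the sketch, and actual_cost is taken to be the group's cost net of any specific stop loss payments.

```python
def attachment_point(expected_cost, corridor=1.25):
    """Expected cost for the insured period scaled by the corridor."""
    return expected_cost * corridor


def aggregate_payout(actual_cost, expected_cost, corridor=1.25):
    """Amount the aggregate stop loss insurer pays for the period.

    The insurer pays only the part of actual cost above the attachment
    point, and nothing when actual cost stays at or below it.
    """
    excess = actual_cost - attachment_point(expected_cost, corridor)
    return max(0.0, excess)
```

Because the attachment point is a fixed multiple of the expected cost, any error in the expected cost forecast carries straight through to the insurer's exposure.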
[0012] We are including a glossary of terms that are used in
describing the invention so that we are precise in our description.
Additionally, SAS computer code and CART modeling language will be
included to provide concrete examples of the implementation of the
process or products. The software Appendix found on the compact
disc filed with the present disclosure contains computer code
(minus copyrighted formats) of a simpler embodiment of the
invention. That code is in SAS and S Plus and the regression tree
used is RPART. Details are provided for the fully insured renewal
product. The aggregate only stop loss product uses the same steps
for cost estimation. The short term disability, long term
disability and life insurance products use the same techniques for
forecasting but the dependent variables are changed to reflect the
insurance type.
GLOSSARY OF TERMS
[0013] 1. Aggregate only stop loss health insurance--A health
insurance product for self funded employers that want to cap their
maximum liability. The aggregate only policy will pay off costs
above an agreed upon limit (i.e., the attachment point). Usually,
the attachment point is 125% of expected costs but it could be 110%
or some other amount. The expected costs are estimated using an
embodiment of this invention or using standard actuarial methods.
Aggregate only stop loss does not include specific stop loss.
However, specifics can be combined with aggregate stop loss. In
that case the specific payments are not included in the costs
counted against the aggregate attachment point.
[0014] 2 Base Period--A period of typically 12 consecutive months
prior to the lag period during which services were provided to some
enrollees and reflected by claims entered in a computer file. In
practice, it may be more or less than 12 months. Risk factors are
coded on data from the base period. These data are used to forecast
the next period costs. In other words, these data are used to
calculate the predictors for the development model and are not used
for underwriting actual health insurance policies.
[0015] 3 Book of Business--The insurance of a given type (e.g.,
small group, individual, large group) for all persons covered by an
insurer at a point in time or during a specified period. An insurer
may have multiple books of business.
[0016] 4 Bias Test--A comparison of observed to predicted values
from a model. The totals of both these values are equal to the
total population which served as the standard in the preparation of
the model. Bias tests determine whether or not there is any
meaningful systematic disparity between observed and predicted cost
when persons are sorted by predicted values, age or family
composition or other characteristics. Disparities are considered as
bias which better models eliminate or reduce. Another related
measure sorts by the actual rather than the predicted values and
is a measure of the accuracy of the forecasts.
[0017] 5 Candidate Predictor Variable--An array of variables
derived from the CI (client insurer) database and available to the
statistical software which selects those which are most predictive
of the dependent variable (e.g., by stepwise OLS, CART regression
trees).
[0018] 6 Claim amount: This is the total cost or payments made by
the insurer.
[0019] 7 Claim codes: These include ICD-9-CM diagnosis and
procedures, CPT codes, National Drug Codes and other standardized
coding system values such as SNOMED codes.
[0020] 8 Claim-based risk factors: These are risk factors derived
from the claim code, claim amount and transformations of the claim
amount, type and place of services, provider type, units of service
and other information contained on a health care claim. These risk
factors are present in either the base or underwriting period.
[0021] 9 Clinical risk factors: Risk factors derived from the claim
codes, type and place of service and provider type but not solely
from the claim amount.
[0022] 10 Client Insurer (CI)--The insurance entity for which the
invention is to be applied.
[0023] 11 Concurrent Cost Models--Used synonymously with
Retrospective Cost Models and defined elsewhere.
[0024] 12 Costs of health care--May be defined as either of the
following. Measured in dollars (usually per person per day in this
application).
[0025] a. Claims--total bills for care submitted to the insurer for
reimbursement
[0026] b. Payments--The amounts actually paid by the insurer.
Payments are always less than the claims due to deductibles,
benefits and non-covered services.
[0027] 13 Cost Inflation--Used synonymously with cost trend. The
secular trend in costs per person for health care due to changes in
practice patterns and price per service. Does not usually consider
changes in a population's health care needs which are usually
minimal in the short run. Differs from pure price inflation such as
that measured in the consumer price index (CPI).
[0028] 14 Credibility--The degree to which a group's claims
experience may confidently be used as the basis for future
rates.
[0029] 15. Demographic look-up table--This is a method used by
actuaries to estimate group-level costs when the group is too small
to have credible experience. Average costs are calculated across a
large pool of groups and averages are calculated by cell in a table
of age by sex by family composition or other similar demographics.
The appropriate cell amounts are applied to each person or employee
in a non credible group and summed to calculate its expected
cost.
[0030] 16 Dependent Measure--The dependent measure is the forecast
of the model through application of the interaction capturing
technique. A transformation may be applied to the dependent measure
to calculate the claim amount (e.g., multiplying a probability by
an average cost). For health insurance and medical stop loss
insurance the dependent measure is the future cost of health care
for the population which comprises the CI book of business at the
time the rates are to be quoted. For short-term disability the
dependent measure is disability days. For long term disability and
life insurance the dependent variable is the probability of the
event.
[0031] 17 Enrollment-based risk factors--These are risk factors
that are derived from the enrollment information only such as age,
sex, relationship to the enrollee, length of enrollment, geographic
locale and type of coverage and do not include claim information
or claim amount. The employee's salary, disability coverage terms
and term life insurance coverage terms may be included in the
enrollment file also.
[0032] 18 Experience model--This is a method used by actuaries for
estimating cost next year at the group-level. If the group is
deemed credible, the last year's cost (or experience) is considered
to be the best estimate of next year's cost. A cost trend is added
to account for medical inflation for next year's cost.
[0033] 19 Group--A group is a collection of one or more people that
are covered by one insurance policy. A traditional group is a
collection of employees and their dependents that work for an
employer at a location. A group can be an individual or a family by
purchasing an "individual" health insurance policy where the
remaining immediate family may also be covered by the policy.
[0034] 20 Health Insurance--Insurance for the array of benefits
covered by the health insurance policies of the client insurance
company or a self-insured company including hospital, surgical and
medical care plus drug benefits for some plans. Medical insurance
is used as a synonym.
[0035] 21 Hybrid Tree Analysis--The use of regression trees (or
other analytic method output) as input to other regression models
such as OLS, median and logistic regression or neural networks.
Additionally, a model's output (e.g., regression or neural network)
may be used as input into the regression or probability tree.
[0036] 22 Interaction Capturing Technique--A mathematical and
logical transformation of independent variables that predicts a
response or dependent variable. The interaction capturing technique
includes main effects, interaction effects and possibly time series
effects. Statistical techniques that are examples of interaction
capturing techniques include, but are not limited to, ANOVA,
regression methods (e.g., linear, logistic, shrinkage, robust,
ridge), regression trees, moving averages and autoregressive moving
averages, look-up tables, means, probability models, clustering
algorithms and many other methods. Data mining techniques that are
examples of interaction capturing techniques include, but are not
limited to, decision trees, rule induction, genetic algorithms,
neural networks, nearest neighbor and other data mining
methods.
[0037] 23 Lag Period--A period between the base period and the next
period or the underwriting and policy period which is required
because of delays in filing claims, preparing or revising model
weights, calculating premium rates and submitting them to insured
groups in a timely way.
[0038] 24 MAP 4 HIP--This is an acronym of More Accurate
Predictions for Health Insurance Premiums which in turn is a brief
title for our invention for its application to health
insurance.
[0039] 25 Next Period--Typically a 12 consecutive month period
subsequent to the base period and the lag period that contains the
data that comprise the dependent variable used in the development
model. Actual insurance policies are not written for this period
but are underwritten for the policy period.
[0040] 26 Policy Period--Typically a 12 consecutive month period
subsequent to the underwriting period and the lag period that
contains the data that comprise the actual cost borne by the
insurer. These costs are forecast using the application of the
development model to the data from the underwriting period with
appropriate adjustments made for assumptions about inflation.
[0041] 27 Prospective Cost Models--The candidate predictor
variables relate to a time period which precedes the dependent
variable.
[0042] 28 Retrospective Cost Models--The candidate predictor
variables relate to the same time period as the dependent
variable.
[0043] 29. Specific stop loss health insurance--A health insurance
coverage for self-funded employers or other payor that has a very
high deductible per person. Usually the deductible is at least
$10,000 and may be as high as $500,000 per person. Typically the
deductible is between $25,000 and $100,000 per person and is meant
to pay for catastrophic care.
[0044] 30 Standard population--The cases in the data set which are
used to select predictor variables and to weight them by their
relation to the dependent variable. For this invention, the cases
are an insured population.
[0045] 31. Subscriber unit--The family unit by which the health
insurance premium is charged. For example, the simplest are two
units: 1)
a single person and 2) two or more people. Single person, married
couple and three or more people is a common classification but more
detailed versions are also used. The subscriber is the
employee.
[0046] 32 Third Party Administration or TPA--A company that
processes the health insurance claims for a self funded employer.
The TPA may be part of an insurance company or not.
[0047] 33 Underwriting Period--A period of typically 12 consecutive
months prior to the lag period during which services were provided
to some enrollees and reflected by claims entered in a computer
file. In practice, it may be more or less than 12 months. Risk
factors are coded on data from the underwriting period. These data
are used to forecast the policy period costs. In other words, these
data are used to calculate the predictors for the model that is
used for underwriting actual health insurance policies.
[0048] 34 Winsorize--Data are Winsorized if the most extreme
observations on one or both ends of the ordered samples are
replaced by the nearest retained observation. Our cost
distributions have no low cost outliers and hence Winsorization is
applied only to the high end of the ordered sample.
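High-end Winsorization as defined in the last glossary entry can be sketched in Python as follows. The function name and the choice to specify the number of extreme observations directly (rather than a percentile) are assumptions made for this illustration.

```python
def winsorize_high(costs, n_extreme):
    """Replace the n_extreme largest costs with the nearest retained value.

    Only the high end is treated, matching the glossary entry: cost
    distributions are assumed to have no low cost outliers.
    """
    n = len(costs)
    if n_extreme < 0 or n_extreme >= n:
        raise ValueError("n_extreme must be >= 0 and less than len(costs)")
    if n_extreme == 0:
        return list(costs)
    ordered = sorted(costs)
    # Largest observation that is retained unchanged.
    cap = ordered[n - n_extreme - 1]
    # Original order is preserved; values above the cap are pulled down to it.
    return [min(c, cap) for c in costs]
```

Unlike trimming, this keeps every person in the sample; only the magnitude of the most extreme costs is changed.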
BRIEF SUMMARY OF THE INVENTION
[0049] One aspect of the invention contemplates a
computer-implemented process of developing a person-level cost
model for forecasting future costs attributable to claims from
members of a book of business, where person-level data regarding
actual base period health care claims are available for a
substantial portion of the members of the book of business for an
actual underwriting period, and the forecast of interest (i.e.,
future claim amount) is for an actual policy period which can be,
but is not necessarily contiguous with the actual underwriting
period, having the steps of:
[0050] providing development universe data comprising person-level
enrollment data, historical base period health care claims data and
historical next period claim amount data for a statistically
meaningful number of individuals, where the person-level data on a
health care claim comprises at least a claim code and a claim
amount;
[0051] providing at least one claim-based risk factor for each
historical base period claim based on the claim code associated
with the health care claim and providing at least one
enrollment-based risk factor based on the enrollment data; and
[0052] developing a cost forecasting model by capturing the
predictive ability of the main effects and interactions of claim
based risk factors and enrollment-based risk factors, with the
development universe data through the application of an interaction
capturing technique to the development universe data.
[0053] A further aspect of the invention contemplates a
computer-implemented process wherein the interaction capturing
technique is selected from the group consisting of median
regression tree techniques, least square regression tree
techniques, rule induction techniques, ordinary least squares
regression techniques, median regression techniques, robust
regression techniques, genetic algorithms,
clustering techniques and neural network techniques.
[0054] Yet another aspect of the invention is a computer
implemented process wherein the person-level next period cost
forecasts are adjusted by modifying the extant cost forecast by the
expected cost trend.
[0055] A yet further aspect of the invention is a computer
implemented process wherein the data from the claims used as
predictors consist essentially of the claim- and enrollment-based
risk factors and the claim amount is a standardized cost of
services provided and the model is used to allocate prospective
payments to health care providers.
[0056] A still yet further aspect of the invention is a computer
implemented process wherein the data used from the claims data
consist essentially of the claim code and selected mandatory
procedures and the claim amount is a standardized cost of services
provided during the same time period as the base period and the
model is used to evaluate the efficiency of health care
providers.
[0057] Another aspect of the invention is a computer implemented
process of forecasting future claim amounts attributable to claims
from members of a book of business for an actual policy period,
wherein the model development universe comprises data from the
members of a book of business to be insured, further
comprising:
[0058] applying the cost-forecasting model to the actual
underwriting period person-level data of each of the members of the
book of business to generate a person-level actual policy period
cost forecast for each member of the book of business; and
[0059] producing a group-level forecast for the actual underwriting
period from the person-level forecasts of each member of the group
by totaling the person-level actual policy period cost forecasts
for the group for the policy period.
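The roll-up from person-level to group-level forecasts described in the two steps above can be sketched as follows. This Python is a minimal illustration: the (group_id, forecast) pair layout and the optional trend argument, which stands in for the expected cost trend adjustment mentioned earlier in this summary, are assumptions of the sketch.

```python
from collections import defaultdict


def group_level_forecasts(person_forecasts, trend=0.0):
    """Total person-level policy period forecasts by group.

    person_forecasts is an iterable of (group_id, forecast_cost) pairs,
    one per member. trend is an assumed cost trend (0.08 = 8%) applied
    to every person-level forecast before totaling.
    """
    totals = defaultdict(float)
    for group_id, forecast_cost in person_forecasts:
        totals[group_id] += forecast_cost * (1.0 + trend)
    return dict(totals)
```

The person-level forecasts themselves would come from applying the cost forecasting model to each member's underwriting period data; here they are simply taken as given.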
[0060] Yet another aspect of the invention is a computer
implemented process comprising the step of: setting insurance
reserves based on group-level forecast for the actual policy
period, wherein the policy period is a reserving period for claims
that have not occurred or that have occurred but not been
reported.
[0061] Yet still another further aspect of the invention is a
computer implemented process, wherein claim amounts are a mix of
fee for service payments and capitation payments so that the base
and underwriting periods risk factors are appended to include dummy
variables for the presence of capitation payments by provider type
and the cost estimate in the next and policy periods is the fee for
service cost that must be supplemented with the expected capitation
payments.
[0062] Still another aspect of the invention is a
computer-implemented process of developing a hybrid person-level
health care claim cost forecasting model for forecasting future
medical costs attributable to health care claims from members of a
book of business, where person-level data are available for a
substantial portion of the members of the book of business,
comprising the steps of:
[0063] providing development universe data comprising person-level
data for a statistically meaningful number of individuals, the
person-level data comprising continuous variable data and
categorical variable data;
[0064] processing first the continuous variable data for each
individual with a continuous processing technique that captures the
predictive ability of main effects and interactions of continuous
variables to generate a person-level continuous variable model;
and
[0065] processing the categorical variable data for each individual
including the output from the continuous processing technique with
a categorical processing technique that captures the predictive
ability of main effects and interactions of categorical variables
to generate a person-level categorical variable model;
[0066] wherein the person-level continuous variable model and
person-level categorical variable model together comprise a hybrid
person-level health care claim amount forecasting model.
[0067] Yet another aspect of the invention is a
computer-implemented process of developing a claim amount
forecasting model for use in forecasting the future claim amount
for members of a book of business, where person-level data are
available for a substantial portion of the members of the book of
business for an actual base period, and the claim amount of
interest for forecasting purposes is an actual next period which
can be, but is not necessarily, contiguous with the actual base
period, comprising the steps of:
[0068] processing the base period data having claims to generate a
having-claims claim amount forecasting model; and
[0069] processing the base period data without claims to generate a
without-claims claim amount forecasting model,
[0070] wherein the having-claims cost forecasting model and the
without-claims forecasting model comprise a claim amount
forecasting model.
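By way of illustration only, the split into a having-claims model and
a without-claims model might be sketched as follows. The one-variable
least-squares line and the simple mean used here are stand-ins chosen
for brevity, not the modeling techniques of the invention, and the
function names and sample figures are hypothetical.

```python
from statistics import mean

def fit_line(xs, ys):
    """Least-squares slope and intercept for a single predictor."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return slope, my - slope * mx

def fit_two_part(records):
    """records: (base_period_claim_amount, next_period_claim_amount) per person."""
    having = [(b, n) for b, n in records if b > 0]
    without = [n for b, n in records if b == 0]
    # Having-claims model: next period amount as a line in base period amount.
    slope, intercept = fit_line([b for b, _ in having], [n for _, n in having])
    # Without-claims model: mean next period amount of people with no base claims.
    no_claims_mean = mean(without)

    def forecast(base_amount):
        if base_amount > 0:
            return intercept + slope * base_amount
        return no_claims_mean

    return forecast

# Hypothetical development data.
records = [(0, 100), (0, 300), (1000, 1200), (2000, 2200), (3000, 3200)]
forecast = fit_two_part(records)
```

Each person is scored by exactly one of the two component models, so
the pair together acts as a single claim amount forecasting model.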
[0071] Yet another aspect of the invention is a
computer-implemented process of developing a health care claim
amount forecasting model for use in forecasting the future medical
claim amount for members of a book of business, where person-level
data are available for a substantial portion of the members of the
book of business for an actual base period, and the claim amount of
interest for forecasting purposes is an actual next period which
can be, but is not necessarily, contiguous with the actual base
period, comprising the steps of:
[0072] providing development universe data comprising person-level
data for a statistically meaningful plurality of individuals,
wherein the person-level data for an individual comprises health
care claims data for the individual and the data on a health care
claim comprises at least a claim amount and a claim code;
[0073] Winsorizing the person-level data to yield inlier data and
outlier data;
[0074] processing the inlier data to generate an inlier cost
forecasting model; and
[0075] processing the outlier data to generate an outlier cost
forecasting model;
[0076] wherein the combination of the results of the inlier and
outlier cost forecasting models together produce a person-level
claim amount forecast model.
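A minimal sketch of one way to read the Winsorizing step is given
below. It assumes that each person's cost is capped at a threshold,
that the capped amount is the inlier data, and that the excess over
the cap is the outlier data. The cap value and the function name are
hypothetical.

```python
def winsorize_split(costs, cap):
    """Split each cost into an inlier part (capped at `cap`) and an
    outlier part (the excess over `cap`, zero for most people)."""
    inlier = [min(c, cap) for c in costs]
    outlier = [max(c - cap, 0) for c in costs]
    return inlier, outlier

# Hypothetical annual claim amounts and cap.
costs = [500, 2_000, 75_000, 250_000]
inlier, outlier = winsorize_split(costs, cap=50_000)

# The two parts always add back to the original amount, which is why
# separate inlier and outlier forecasts can be combined by addition.
recombined = [i + o for i, o in zip(inlier, outlier)]
```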
[0077] Another aspect of the invention is a computer-implemented
process comprising:
[0078] Winsorizing the inlier data to yield inlier data having
claims and inlier data without claims;
[0079] processing the inlier data having claims to generate an
inlier-having-claims claim amount forecasting model; and
[0080] processing the inlier data without claims to generate an
inlier-without-claims claim amount forecasting model,
[0081] wherein the inlier-having-claims cost forecasting model and
the inlier-without-claims forecasting model comprise an inlier
claim amount forecasting model.
[0082] A still further aspect of the invention is a
computer-implemented process of forecasting a claim amount
attributable to claims from members of a book of business during an
actual policy period, comprising the steps of:
[0083] providing person-level data, comprising enrollment data for
members of a book of business to be insured for an actual
underwriting period that can be, but is not necessarily, contiguous
with the actual policy period;
[0084] providing a model development universe of person-level data,
comprising enrollment data from the historical base period and
historical next period health care claims data for a statistically
meaningful number of individuals;
[0085] providing enrollment-based risk factors for each historical
base period and providing next period claim amounts;
[0086] developing a health care cost-forecasting model for the
enrollment data by capturing the predictive ability of main effects
and interactions of enrollment-based risk factors through the
application of an interaction capturing technique to the model
development universe;
[0087] applying the health care cost-forecasting model to the
person-level underwriting period enrollment data of each of the
members of the book of business to generate a person-level expected
cost forecast for the policy period for each member of the book of
business; and
[0088] producing a group-level forecast for the expected cost of
the policy period from the person-level forecasts of each person of
the group by totaling the person-level expected cost forecasts for
the actual policy period.
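The last two steps above (scoring each member and totaling) can be
sketched as follows, by way of example only. The table of expected
costs per age band and sex stands in for a fitted enrollment-based
model; its values, the age bands, and the function names are all
hypothetical.

```python
# Hypothetical expected annual cost for each (age band, sex) cell. A cell
# table of this kind carries both main effects and the age-by-sex
# interaction of the two enrollment-based risk factors.
EXPECTED_COST = {
    ("0-17", "F"): 1200.0, ("0-17", "M"): 1300.0,
    ("18-44", "F"): 3100.0, ("18-44", "M"): 1900.0,
    ("45-64", "F"): 5200.0, ("45-64", "M"): 5600.0,
}

def age_band(age):
    if age < 18:
        return "0-17"
    return "18-44" if age < 45 else "45-64"

def person_forecast(member):
    """Person-level expected cost from enrollment data alone."""
    return EXPECTED_COST[(age_band(member["age"]), member["sex"])]

def group_forecast(members):
    """Group-level forecast: the total of the person-level forecasts."""
    return sum(person_forecast(m) for m in members)

group = [{"age": 34, "sex": "F"}, {"age": 52, "sex": "M"}, {"age": 9, "sex": "M"}]
total = group_forecast(group)
```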
[0089] A still further aspect of the invention is a
computer-implemented process of forecasting costs attributable to
claims from members of a book of business during an actual policy
period, comprising the steps of:
[0090] providing person-level data, comprising enrollment data and
actual underwriting period health care claims data, for members of
a book of business, where the person-level data on a health care
claim comprises at least a claim amount and a claim code and the
actual underwriting period can be, but is not necessarily,
contiguous with the actual policy period;
[0091] providing a model development universe of person-level data,
comprising enrollment data, historical base period health care
claims data and historical next period claim amount data for a
statistically meaningful number of individuals, where the
person-level data on a base period health care claim includes at
least a claim amount and a claim code;
[0092] providing claim-based risk factors for each historical base
period based on the claim code associated with the health care
claim and providing at least one enrollment risk factor based on
the enrollment data;
[0093] developing a cost-forecasting model by capturing the
predictive ability of main effects and interactions of risk factors
through the application of an interaction capturing technique to
the model development universe;
[0094] applying the cost-forecasting model to the person-level data
of each of the individuals or members of a group to generate a
person-level actual policy period expected cost forecast for each
member of the group; and
[0095] producing a group-level forecast for the actual policy
period from the person-level forecasts of each individual or member
of the group by totaling the person-level cost forecasts for the
actual policy period.
[0096] Yet a further aspect of the invention is an automated system
for forecasting future costs attributable to claims from members of
a book of business during an actual policy period comprising:
[0097] a central processing unit;
[0098] an insured person database, accessible by the processor,
wherein the database comprises person-level enrollment data and
actual underwriting period health care claims data, for members of
a book of business to be insured, where the person-level data on a
health care claim comprises at least a claim amount and a claim
code;
[0099] a model development universe database, accessible by the
processor, wherein the second database comprises model development
universe of person-level data, comprising enrollment data,
historical base period health care claims data and historical next
period claim amount data for a statistically meaningful number of
individuals, where the person-level data on the base period health
care claim includes at least a claim amount and a claim code;
[0100] a risk factor encoder, accessible by the processor, wherein
the risk factor encoder encodes claim-based risk factors for each
historical base period based on the claim code associated with the
health care claim and the risk factor encoder encodes at least one
enrollment risk factor based on the enrollment data;
[0101] a model generator, accessible by the processor, that
generates a cost-forecasting model by capturing the predictive
capacity of the main effects and the interaction of the risk
factors assigned by the risk factor encoder to forecast the
historical next period of the model development universe data using
the historical base period data;
[0102] a person-level cost generator that applies the
cost-forecasting model to the person-level actual underwriting
period health care claims data of each of the members of the book
of business to generate a person-level actual policy period claim
amount forecast for each member of the book of business; and
[0103] an actual policy period group-level cost forecast generator
that totals the person-level actual next period forecasts for each
member of the group to generate an actual policy period group-level
cost forecast.
[0104] Still another aspect of the invention is a
computer-implemented process of forecasting costs attributable to
claims from members of a book of business during an actual policy
period, comprising the steps of:
[0105] means for providing person-level data, comprising enrollment
data and actual underwriting period health care claims data, for
members of a book of business, where the person-level data on a
health care claim comprises at least a claim amount and a claim
code and the actual underwriting period can be, but is not
necessarily, contiguous with the actual policy period;
[0106] means for providing a model development universe of
person-level data, comprising enrollment data, historical base
period health care claims data and historical next period claim
amount data for a statistically meaningful number of individuals,
where the person-level data on a base period health care claim
includes at least a claim amount and a claim code;
[0107] means for providing claim-based risk factors for each
historical base period based on the claim code associated with the
health care claim and providing at least one enrollment risk factor
based on the enrollment data;
[0108] means for developing a cost-forecasting model by capturing
the predictive ability of main effects and interactions of risk
factors through the application of an interaction capturing
technique to the model development universe;
[0109] means for applying the cost-forecasting model to the
person-level data of each of the individuals or members of a group
to generate a person-level actual policy period expected cost
forecast for each member of the group; and
[0110] means for producing a group-level forecast for the actual
policy period from the person-level forecasts of each individual or
member of the group by totaling the person-level cost forecasts for
the actual policy period.
[0111] A still further aspect of the invention is a group insurance
product comprising:
[0112] an identification of the types of benefits which are agreed
to be provided by an insurer to or on behalf of members of a group,
which will be incurred by members of said group during a future
time period; and
[0113] a stated monetary insurance premium including a forecast of
said benefits, estimated costs of administering the insurance
product, and optionally, an estimated profit,
[0114] whereby an insurer agrees to cover the identified benefits
in exchange for the payment of the stated monetary insurance
premium.
[0115] Yet another aspect of the invention is a method of pricing
group insurance including a cost of future benefits according to
the computer-implemented process of forecasting future medical
costs attributable to claims from members of a group during an
actual underwriting period, comprising the steps of:
[0116] providing an expected amount of administrative costs
allocable to providing health insurance coverage to the group;
[0117] providing a minimum acceptable expected profit;
[0118] totaling the group level cost forecast, expected amount of
administrative costs, and minimum acceptable expected profit to
yield a total minimum price; and
[0119] providing a plurality of expected probabilities of retention
for the group corresponding to a plurality of possible prices
greater than or equal to the total minimum price, each possible
price also having an expected profit that is the amount of the
price over the group level cost forecast plus the expected amount
of administrative costs; and
[0120] calculating a plurality of possible maximum profits by
multiplying each of the plurality of possible profits by the
corresponding expected probability of retention, wherein the
largest possible maximum profit is used to price the group
insurance.
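The pricing steps above can be sketched as follows. The sketch assumes
that the retention probabilities are supplied as a mapping from
candidate price to probability; that mapping, the function name, and
the dollar figures are hypothetical.

```python
def optimal_price(cost_forecast, admin_cost, min_profit, retention_by_price):
    """Return (price, retention-weighted profit) for the candidate price
    with the largest retention-weighted profit, or None if no candidate
    reaches the total minimum price."""
    min_price = cost_forecast + admin_cost + min_profit
    best = None
    for price, p_retain in retention_by_price.items():
        if price < min_price:
            continue  # below the total minimum price
        profit = price - (cost_forecast + admin_cost)
        weighted = profit * p_retain
        if best is None or weighted > best[1]:
            best = (price, weighted)
    return best

# Hypothetical group: the total minimum price is 1,000,000.
retention = {
    990_000: 0.97,
    1_000_000: 0.95,
    1_050_000: 0.85,
    1_100_000: 0.60,
    1_150_000: 0.40,
}
best = optimal_price(900_000, 80_000, 20_000, retention)
```

In this example a higher price raises the profit if the group renews
but lowers the chance of renewal, so the weighted profit peaks at an
intermediate price rather than at the highest one.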
[0121] Still another aspect of the invention is a method of
underwriting an insurance product comprising the steps of:
[0122] providing an identification of the coverage of the insurance
product which identifies the conditions of payment under the
product during a policy period;
[0123] providing person-level health care claim information
comprising enrollment data, and base period and underwriting period
claim data, the claim data comprising claim codes having associated
claim costs;
[0124] capturing the predictive ability of the person-level health
care claim information through the application of an interaction
capturing technique; and
[0125] forecasting a predicted cost of the insurance product during
the policy period based on the identification of the coverage of
the insurance product and the captured predictive ability of the
person-level health care claim information;
[0126] wherein each diagnosis and CPT based risk factor is
independent of the sequence in time of other diagnosis and CPT
based risk factors.
[0127] A further aspect of the invention is a method of
underwriting an insurance product for insuring short term
disability (STD) costs, wherein the interaction capturing technique
uses a dependent measure from the next period and policy period
comprising the number of STD days in the policy period and weights
the dependent measure by the expected cost per day for the STD to
produce the person-level expected STD costs, which are summed
across the group to produce the group's expected STD cost.
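As a minimal sketch of the weighting and summing just described,
assuming that a forecast number of STD days and an expected cost per
day are already available for each member (the field names and values
are hypothetical):

```python
def person_std_cost(member):
    """Forecast STD days weighted by the expected cost per STD day."""
    return member["expected_std_days"] * member["cost_per_day"]

def group_std_cost(members):
    """Group expected STD cost: the sum of person-level expected costs."""
    return sum(person_std_cost(m) for m in members)

members = [
    {"expected_std_days": 2.0, "cost_per_day": 150.0},
    {"expected_std_days": 0.5, "cost_per_day": 200.0},
    {"expected_std_days": 4.0, "cost_per_day": 100.0},
]
std_total = group_std_cost(members)
```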
[0128] A still further aspect of the invention is insuring long
term disability (LTD) claims wherein a dependent measure for
generating the cost forecasting model is the probability of a LTD
claim in the policy period where the probability is weighted by the
net present value of the LTD and applying the cost forecasting
model to the person-level data produces person-level expected LTD
costs, and summing the person-level expected LTD costs across
the group produces a group's expected LTD cost for an actual
policy period.
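By way of example only, the probability weighting might be sketched as
follows. The net present value is computed here for a level monthly
benefit paid for a fixed number of months at a fixed annual discount
rate; those simplifications, the function names, and the figures are
assumptions made for the sketch.

```python
def net_present_value(monthly_benefit, months, annual_rate):
    """Present value of a level monthly benefit paid at the end of each
    of `months` months, discounted at `annual_rate` per year."""
    monthly_rate = (1 + annual_rate) ** (1 / 12) - 1
    return sum(monthly_benefit / (1 + monthly_rate) ** t
               for t in range(1, months + 1))

def expected_ltd_cost(p_claim, monthly_benefit, months, annual_rate):
    """Probability of an LTD claim weighted by the claim's present value."""
    return p_claim * net_present_value(monthly_benefit, months, annual_rate)

def group_ltd_cost(members, annual_rate):
    """Sum of person-level expected LTD costs across the group."""
    return sum(
        expected_ltd_cost(m["p_claim"], m["monthly_benefit"], m["months"], annual_rate)
        for m in members
    )

members_ltd = [
    {"p_claim": 0.010, "monthly_benefit": 2000.0, "months": 120},
    {"p_claim": 0.002, "monthly_benefit": 3000.0, "months": 240},
]
ltd_total = group_ltd_cost(members_ltd, annual_rate=0.04)
```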
[0129] A still yet further aspect of the invention is a cost
forecast produced for first-dollar health insurance.
[0130] Another aspect of the invention is a cost forecast produced
for stop loss health insurance.
[0131] A still further aspect of the invention is a cost forecast
produced for aggregate-only stop loss health insurance.
[0132] Still another aspect of the invention is a cost forecast
produced for specific stop loss health insurance.
[0133] Yet another aspect of the invention is a cost
forecast for insuring group term life insurance costs wherein a
dependent measure for generating the cost forecasting model is the
expected probability of death weighted by the amount of life
insurance to produce the person-level expected term life insurance
cost.
[0134] In still another aspect of the invention, the model
development universe comprises data from the members of a group in
the book of business to be insured.
[0135] A still yet further aspect of the invention comprises the
step of: setting insurance reserves based on the renewal
group-level forecast for the actual underwriting period, wherein
the next period is a reserving period for claims that have not
occurred or that have occurred but not been reported.
BRIEF DESCRIPTION OF THE DRAWINGS
[0136] FIG. 1 is a flowchart of an embodiment of an overview of a
method for estimating future cost and optimizing pricing.
[0137] FIG. 2 is a flowchart of an embodiment of a method like that
of FIG. 1 which is particularly adapted for service bureau
processing.
[0138] FIG. 3 is a flowchart of an embodiment of a method like that
of FIG. 1 which is particularly adapted for use as a software
product, which may be functionally distributed locally or over the
Internet.
[0139] FIG. 4 is a more detailed flowchart of a process for data
processing of steps 102, 202 or 302 of FIGS. 1, 2 and 3.
[0140] FIG. 5 is a more detailed flowchart illustrating a process
for standardizing time periods, for use in the methods of FIGS.
1-3, and in particular steps 102, 202 and 302.
[0141] FIG. 6 is a flowchart illustrating data validation and
standardization procedures for steps 102, 202 and 302 of the
methods of FIGS. 1-3.
[0142] FIG. 7 is a flowchart illustrating the matching and merging
(integration) of data in the process steps 102, 202 or 302 of FIGS.
1-3.
[0143] FIG. 8 is a flowchart illustrating the aggregation and risk
factor coding for the steps 102, 202 or 302 of the processes of
FIGS. 1-3.
[0144] FIG. 9 is a flowchart of processing steps for developing
cost forecasting models based on "inlier" data in steps 106, 204,
210, 304 or 310 of the methods of FIGS. 1-3.
[0145] FIG. 10 is a detailed flowchart of process steps for
developing cost forecasting models based on "outlier" data of the
Winsorized data for the steps 106, 204, 210, 304 or 310, of the
methods of FIGS. 1-3.
[0146] FIG. 11 is a detailed flowchart for scoring, testing and
integrating the data, and adjusting for cost trends for use in
steps 106, 204, 210, 304 or 310 as well as 108, 208 and 306 of the
methods of FIGS. 1-3.
[0147] FIG. 12 is a detailed flowchart illustrating processing
steps for developing group-level models and making adjustments to
the summary of the person-level data of steps 106 and 108 of FIGS.
1, 204, 208 and 210 of FIG. 2, or 304, 306 and 310 of FIG. 3.
[0148] FIG. 13 is a detailed flowchart of an embodiment of a price
optimization procedure which may be used to carry out steps 110,
212, or 308 of FIGS. 1-3.
DETAILED DESCRIPTION OF THE INVENTION
[0149] The present invention is directed to insurance systems,
particularly including methods for processing health insurance data
to estimate future costs, and for optimizing pricing of health
insurance products, including both first-dollar and stop loss
insurance products. In various aspects, it involves processing
historical data, developing algorithms, applying those algorithms,
updating those algorithms and setting prices. However, the
insurance systems that can benefit from the methods and systems
disclosed herein also include, but are not limited to, health
insurance, disability insurance, both short term and long term, as
well as term life insurance systems.
[0150] This invention comprises a series of related products that
provide more accurate group-level claim amount forecasts (and
person-level forecasts for individual or family health insurance)
and more optimal group-level renewal prices for insurers at full
risk for the health insurance (e.g., indemnity, PPO, HMO, POS) or
aggregate only stop loss health insurance for self insured
employers. These forecasting models for renewal price setting are
not intended to be used for paying managed care providers but
alternate related models are developed for that purpose (see B in
Table 1 below). The products provide more accurate future cost
estimates by forecasting person-level costs using models that
include clinical information from historical health insurance
claims as well as person-level demographic and historical cost
data. In this regard, effective models may be based on data from
relatively large groups of at least 50,000 people, such as
typically covering an entire book of business for an insurer (or a
large subclass of the insurer's book of business such as all HMO
groups of the insurer) or in the case of a TPA, the TPA's entire
book of business. The most recent year of person-level medical
claim data for the individuals of a particular book of business for
which an accurate cost forecast is desired may be processed by this
model, to produce an accurate projected cost for policy pricing, as
will be described. Future cost trend estimates (inflation) are
adjusted for each individual's characteristics and applied to the
person-level estimates. Person-level cost forecasts are summarized
to the family-level or group-level and family or group-level
characteristics are used to adjust the summarized cost to produce
the adjusted family or group-level cost forecast. The price is
optimized using a system that estimates the probability of the
group accepting the insurance at the price offered, given the
group's historical insurance cost, claims history, and
local competitive market conditions. The probability is weighted by
a function of the expected future profit, which equals the
anticipated price less expected medical and administrative costs.
The method and models with slight adjustments can be applied to
self-insured employers' aggregate only, specific only or specific
plus aggregate medical stop loss data. The products also include
the use of the method applied to a client's book of business for
estimating future claim amounts for purposes of setting a reserve
by group and for cost forecasting and pricing for new groups or
individuals for fully insured health insurance. Another alternative
application would be the use of the method to develop and deliver
products that allow HMO's to prospectively allocate health care
payments to providers. Another product is the measurement of the
efficiency of health care providers. These methods can be applied
to medical claims linked to future short or long-term disability
payments or indicators of disability and used to rate the relative
risk of disability of groups or forecast their future costs by
using the group's medical claims, enrollment data and summarized
group-level or person-level disability payments. Another
application is to group term life insurance. The dependent measure
is the probability of death next period which is linked to medical
claims in the base period and the potential risk factors are the
same potential risk factors as used with the other models.
[0151] The modeling strategy employed for the cost forecasting
models contains several novel components. We have used a
combination of specialized data collection and cleaning, regression
trees and regression (ordinary least squares or OLS, logistic and
median) models tailored to a client's book of business, and the
application of these models to the client's book of business for
improved decision making. While there are many published examples
of OLS being used for purposes similar to this application, there
are few using trees. We are not aware of any reports using a
combination of regression trees and other regression models to
forecast health care costs. A model that uses the output of a tree
model as an input to other regression algorithms is known as a
"hybrid" tree model. (See D. Steinberg and N. Scott Cardell,
Improving Data Mining with New Hybrid Methods, Salford Systems, May
27, 1998, PowerPoint presentation at
http://www.Salford-systems.com). The authors give examples of
models with a binary (yes-no) dependent variable for which they
used the regression tree output as predictors in a regression
model. They demonstrated that this hybrid combination was superior
to either method used alone. When our dependent variable is cost we
used OLS regression with the output of regression trees and when
the dependent variable is a probability, we used logistic
regression. This allowed us to have continuous valued predictions
rather than the step-like predictions characteristic of trees and
contingency table forecasts. Our use of the terminal nodes of a
regression tree as predictors in an OLS or logistic regression
model provides an effective way to have both the main effects and
complex interactions of candidate predictors properly weighted in
our final model.
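A minimal sketch of the hybrid step, using only standard library
Python, is given below. It assumes that a regression tree has already
been fitted; the splits in leaf_of, the variable names, and the sample
data are hypothetical. Terminal node membership enters an OLS
regression as dummy variables alongside the main effect of prior cost,
and the regression is solved through the normal equations.

```python
def leaf_of(age, prior_cost):
    """Terminal node of a hypothetical, already-fitted regression tree."""
    if prior_cost < 1000:
        return 0
    return 1 if age < 50 else 2  # age matters only at higher prior cost

def design_row(age, prior_cost):
    leaf = leaf_of(age, prior_cost)
    # Intercept, prior cost in thousands (main effect), and dummies for
    # terminal nodes 1 and 2 (node 0 is the reference category).
    return [1.0, prior_cost / 1000.0, float(leaf == 1), float(leaf == 2)]

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit_ols(rows, y):
    """OLS coefficients from the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

def predict(beta, age, prior_cost):
    return sum(b * x for b, x in zip(beta, design_row(age, prior_cost)))

# Hypothetical development data: (age, base period cost, next period cost).
people = [(30, 200, 300), (60, 800, 600), (35, 2000, 1500),
          (40, 4000, 2500), (55, 3000, 3200), (62, 6000, 4700)]
beta = fit_ols([design_row(a, c) for a, c, _ in people], [y for _, _, y in people])
```

Within a terminal node the prediction still varies smoothly with prior
cost, which is the sense in which the hybrid yields continuous valued
predictions instead of one step per node.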
[0152] A typical group health insurance product in accordance with
the present invention (such as the various types of Blue Cross.TM. and
Blue Shield.TM. brand group health insurance policies, which are
incorporated herein by reference) comprises an identification of
the types of medical expenses which are agreed to be covered, paid
or reimbursed by an insurer to or on behalf of members of the group
(including their covered dependents) which are incurred by members
of the group during a future time period, typically one year, in
exchange for a stated monetary insurance premium which includes a
forecast of said medical expenses in accordance with the methods
described herein, estimated costs of administering the health
insurance product, and an estimated profit.
[0153] Table 1 summarizes the alternate uses of our method as
applied to health care enrollment and claims data linked with claim
amounts for first dollar and stop loss coverage, disability
coverage, reserves and term life coverage. These alternate model
developments produce products that are customized for specialized
applications. Row A-1 is the application of our invention which is
presented in most detail in this application. The methods used in
A-1 are clearly related to those in each of the other rows.
TABLE 1 Applications of the Invention's Modeling Methods
Application | Allowable Sources of Candidate Risk Predictors: Enrollment Data | Allowable Sources of Candidate Risk Predictors: Claims Data | Dependent Variable for Services Provided During Dep. Variable Ref. Time | Reference Times for Dependent & Predictor Variables | Model Type
A. Predict Future Costs of Health Insurance | | | | |
1. Renewal Groups or Individuals | All | All | Cost of Claims | Predictor Variable Precedes | Prospective
2. Stop Loss: Specific Only, Aggregate Only or Specific Plus Aggregate | All | All | Cost of Claims over Deductible, over Attachment Point or Both | Predictor Variable Precedes | Prospective
3. Required Reserves | All | All | Reserve Period IBNR | Predictor Variable Precedes | Prospective
4. New Groups or Individuals | All | None | Cost of Claims | Predictor Variable Precedes | Prospective
B. Allocate payments to health care providers | All | Diagnosis | Standardized Costs of services provided | Predictor Variable | Prospective
C. Measure "Efficiency" of care providers | All | Diagnosis & selected mandatory procedures | Standardized Costs of services provided* | Predictor Variable concurrent with Dependent Variable | Retrospective
D. Short Term Disability | All | All + STD Claims Payments | STD days, Cost or Index | Predictor Precedes | Prospective
E. Long Term Disability | All | All + STD + LTD | Probability LTD, Cost or Index | Predictor Precedes | Prospective
F. Group Term Life | All | All + Death | Probability Death, Cost | Predictor Precedes | Prospective
*Costs per service can be standardized by use of relative values
for CPT codes and DRG weights for hospital care or average actual
costs for each service
[0154] Optimal pricing for a fully insured group requires an
accurate forecast of the group's mean cost per person in the policy
period. Optimal pricing for an aggregate only medical stop loss
insurance for a self-insured employer also requires an accurate
forecast of that group's mean cost per person in the policy period.
Therefore, the exact same methodology can be used for the cost
forecast for fully insured groups or for a self-insured group's
aggregate only stop loss insurance if the same data are available.
The methods used to set prices differ, however, because a
self-insured employer pays for the majority of the medical expenses
itself and therefore pays a premium that is far smaller than the
premium for full health insurance, under which the insurer pays all
of the medical costs.
[0155] CapCost.TM. is an aggregate only medical stop loss product
that includes a system for making more accurate cost forecasts (for
groups with 51 to 3000 employees mainly). The attachment point for
CapCost.TM. can be the standard 125% of expected costs (called
CapCost 125.TM.) but we will offer an attachment point at 110% of
expected costs (called CapCost 110.TM.) and possibly other
attachment points. The terms of CapCost.TM. are similar to those of
traditional medical stop loss insurance, but there is cash flow
protection, medical costs are cumulated on an incurred basis rather
than a paid basis, and there is no specific stop loss coverage.
CapCost.TM. is useful for employers since many will receive prices
that are below the price of traditional specific plus aggregate
medical stop loss insurance while the maximum aggregate medical
liability for the group may be lower with CapCost.TM. than with
traditional specific plus aggregate medical stop loss insurance.
From the insurer's perspective, the expected medical claims it must
pay with CapCost.TM. are frequently below those of traditional
medical stop loss products since specific stop loss coverage is not
provided. Generally, CapCost.TM. is a better value for the employer
than traditional stop loss coverage when the employer is larger
than the average employer purchasing stop loss coverage or if the
group has experienced some unusually high annual medical expenses
due to a few high cost individuals that are unlikely to have high
costs recurring in the near future.
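The role of the attachment point can be sketched as follows, by way of
example only. The sketch assumes aggregate only coverage with no
policy maximum and ignores the cash flow protection and incurred-basis
accounting mentioned above; the function name and dollar figures are
hypothetical.

```python
def aggregate_stop_loss(expected_cost, actual_cost, attachment_factor=1.25):
    """Return (attachment point, amount paid by insurer, amount paid by
    employer) for aggregate only stop loss coverage."""
    attachment = expected_cost * attachment_factor
    insurer_pays = max(actual_cost - attachment, 0.0)
    employer_pays = actual_cost - insurer_pays
    return attachment, insurer_pays, employer_pays

# Hypothetical group with expected costs of 1,000,000 and actual
# incurred costs of 1,400,000, at 125% and at 110% attachment.
at_125 = aggregate_stop_loss(1_000_000, 1_400_000, 1.25)
at_110 = aggregate_stop_loss(1_000_000, 1_400_000, 1.10)
```

With the lower attachment factor the employer's maximum aggregate
liability is lower and the insurer's share of a bad year is larger,
which is why the accuracy of the expected cost matters to both sides.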
[0156] CapCost.TM. is novel in the way expected future medical
costs are estimated. Historical medical claims, enrollment, benefit
plan and employer files in electronic format are collected from the
Third Party Administrator (TPA) or insurance company that is
paying the employer's medical bills. The electronic files containing
the medical claims and enrollment data are collected for all people
with medical coverage rather than from only those that had large
claims. This invention's cost forecasting models are applied to the
insured people covered by the employer. The inflation trend and
optimized pricing are then applied to the cost estimates. The
CapCost.TM. product is a system for data collection, cost
estimates, and price optimization and is part of this invention.
Separate products are designed for pricing new or renewal coverage
for fully insured medical plans and for allocating reserves for
such medical plans. Each contains a system for data collection and
cost estimation. Price optimization is an additional part of this
invention for fully insured medical plan renewals and stop loss
coverage.
[0157] One of the important measures of the quality of a model is
the mean absolute residual (MAR). The MAR is the mean of the
absolute value of the difference between the actual and predicted
cost of a group. A lower MAR is desirable since the predicted cost
is closer to the actual cost. We compared the MAR for this
invention's predicted cost with the MAR calculated using an
experience model and the MAR calculated using a demographic look-up
table model. The results are presented as a percentage of the mean
of the groups' costs or the predicted divided by the actual times
100. The MAR was 11.6% for the invention's prediction, 14.2% for
the experience model, and 25.8% for the demographic model for the
116 actual groups in our database. The invention forecast was
substantially better than either of the two conventional forecast
methods.
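A minimal sketch of the MAR expressed as a percentage of the mean
group cost is given below. The function name and the sample values are
hypothetical and are not the figures from our database.

```python
from statistics import mean

def mar_percent(actual, predicted):
    """Mean absolute residual across groups, expressed as a percentage
    of the mean actual group cost."""
    mar = mean(abs(a - p) for a, p in zip(actual, predicted))
    return 100 * mar / mean(actual)

# Hypothetical actual and predicted costs for three groups.
actual_costs = [1000, 2000, 3000]
predicted_costs = [1100, 1800, 3300]
mar_pct = mar_percent(actual_costs, predicted_costs)
```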
[0158] We conducted a Monte Carlo simulation for groups with
various numbers of employees since our database is too small to
analyze by group size. We randomly selected 1500 enrollees and
their dependents and made 500 synthetic groups. The MAR as a
percentage of the groups' actual cost was about 7% for the
invention's forecast and just under 10% for the experience forecast.
A demographic forecast was not compared since groups with over 1500
employees and their dependents are deemed completely credible.
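The simulation might be sketched as follows. The sketch assumes that
each synthetic group is drawn without replacement from a pool of
people who each carry an actual and a predicted cost; the function
name, the randomly generated pool, and the stand-in forecast are
hypothetical, and the demonstration uses smaller sizes than the 500
groups described above so that it runs quickly.

```python
import random

def synthetic_group_mar(people, group_size, n_groups, seed=0):
    """people: list of (actual_cost, predicted_cost) per person.
    Builds n_groups random groups and returns the MAR of the group mean
    costs as a percentage of the mean actual group cost."""
    rng = random.Random(seed)
    abs_errors, actual_means = [], []
    for _ in range(n_groups):
        group = rng.sample(people, group_size)  # one synthetic group
        actual = sum(a for a, _ in group) / group_size
        predicted = sum(p for _, p in group) / group_size
        abs_errors.append(abs(actual - predicted))
        actual_means.append(actual)
    return 100 * (sum(abs_errors) / n_groups) / (sum(actual_means) / n_groups)

# Hypothetical pool: skewed actual costs and a crude stand-in forecast.
rng = random.Random(1)
pool = []
for _ in range(2000):
    actual = rng.lognormvariate(7.0, 1.5)
    pool.append((actual, 0.5 * actual + 1500.0))
mar_pct = synthetic_group_mar(pool, group_size=150, n_groups=50)
```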
[0159] A measure of model accuracy addresses whether and by how
much the model systematically over- or under-predicts the actual
costs for various characteristics of the insured population. In
order to compare this accuracy measure of two models, we sort the
(actual) cost of groups into deciles from the lowest 10% to the
highest 10%. We calculate the predicted (forecast) cost for the
groups in each decile (or finer gradation). The actual cost is
divided by the forecast cost to make an index. The index should be
close to 1.0 if the model is accurate. In our simulation tests (500
groups of 1500 employees), the invention's forecast is always
closer to 1.0 for every decile indicating that it is a superior
model to the experience model. The invention's ratio of predicted
to actual was about 0.91 for the lowest decile and about 1.32 for
the highest decile while the experience models ratios were about
0.85 and about 1.55, respectively. The other deciles were closer to
1.0 but the invention forecast was always closer to 1.0 than the
experience forecast.
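The decile comparison can be sketched as follows. This is a minimal illustration that assumes the index for each decile is total actual cost divided by total forecast cost within that decile; the text does not specify whether totals or per-group ratios are used, and the function name and sample data are hypothetical.

```python
def decile_index(actual, forecast, n_bins=10):
    """Sort groups by actual cost, split into n_bins, and return the
    actual/forecast index for each bin (lowest-cost bin first)."""
    pairs = sorted(zip(actual, forecast), key=lambda pair: pair[0])
    n = len(pairs)
    if n < n_bins:
        raise ValueError("need at least one group per bin")
    indexes = []
    for b in range(n_bins):
        chunk = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        total_actual = sum(a for a, _ in chunk)
        total_forecast = sum(f for _, f in chunk)
        indexes.append(total_actual / total_forecast)
    return indexes


# Hypothetical data: a forecast that is uniformly 25% too high.
actual_costs = [float(i) for i in range(1, 21)]
forecast_costs = [1.25 * a for a in actual_costs]
indexes = decile_index(actual_costs, forecast_costs)
```

Running the same function on two competing forecasts allows the models to be compared decile by decile, with values nearer 1.0 indicating less systematic bias.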
[0160] The invention includes a general process for developing
models for forecasting health care costs. The invention also
includes processes for products that incorporate the process and
provide information for improving specific business decisions made
by health insurers, including, but not limited to, aggregate only,
specific only and specific plus aggregate stop loss health
insurance products. The models may be developed for specific
insurers and their book of business, and may be different for each
insurer. A software listing of an embodiment of a program for
carrying out a forecasting process in accordance with the present
invention is present on the above-cited CD-ROMs. Illustrated in
FIG. 1 is a flowchart which represents an overview of an embodiment
of a method in accordance with the present invention as applied to
cost forecasting and pricing of renewals for health insurance for
fully insured groups.
[0161] In accordance with the method of FIG. 1, health data on
members of the book of business is collected, cleaned, integrated
and aggregated, as shown in step 102. If the data are missing or
miscoded, the cost forecasts may be inaccurate also. Most of the
programming cost and analysis involves these phases of the process.
The client's data may typically be in many different computer
systems or databases, and the data may need to be combined to build
person-level files that are complete for a specified time
period.
[0162] A twelve month "base period" is typically used as the period
from which we collect this data to describe each person's history
of claims, diagnoses and other factors. The base period could be a
longer period or shorter period and will depend on how long the
groups have been enrolled and the time for which adequate computer
or other records are kept. The base period may have different time
periods for people and groups that do not have the same enrollment
renewal dates.
[0163] There is typically a period between the "base period" (or
underwriting period) and the "next period" (or policy period)
during which medical claims data are not available, since they were
incurred but not reported or they are between the time of the price
quote for the policy period's renewal and the renewal date. We call
this the "lag period". The examples here use a lag period of three
months but that could be a longer or shorter time period depending
on the needs and constraints of the available data, the insurer or
others.
[0164] The "next period" is typically the period of twelve months
of insurance coverage immediately following the lag period. The
claim amount forecast period is the next or policy period that is
priced for the group. The "next period" is the relevant time period
for the dependent variable in the cost forecast models.
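The relationship among the base period, lag period and next period can be sketched as date arithmetic. The following is an illustration only; it assumes that the next period begins on the first day of a month and uses the typical 12/3/12 month lengths from the examples here. The function names are hypothetical.

```python
from datetime import date, timedelta


def add_months(d, months):
    """Shift a first-of-month date by a whole number of months."""
    index = d.year * 12 + (d.month - 1) + months
    return date(index // 12, index % 12 + 1, 1)


def period_windows(next_start, base_months=12, lag_months=3, next_months=12):
    """Return inclusive (start, end) dates for the base, lag and next periods.

    next_start is the first day of the next (policy) period.
    """
    one_day = timedelta(days=1)
    lag_start = add_months(next_start, -lag_months)
    base_start = add_months(lag_start, -base_months)
    next_end = add_months(next_start, next_months) - one_day
    return {
        "base": (base_start, lag_start - one_day),
        "lag": (lag_start, next_start - one_day),
        "next": (next_start, next_end),
    }


# Example: a group whose next period starts on January 1, 2008.
windows = period_windows(date(2008, 1, 1))
```

Longer or shorter base and lag periods are accommodated by changing the keyword arguments.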
[0165] If the insurer for which future health costs are to be
forecast (e.g., a business entity which desires to provide health
insurance) is a new client, (e.g., has not had models previously
built on their book of business) then a new cost forecasting model
may need to be developed for them, for example, as shown in step
104 of FIG. 1. An alternative is to use existing forecasting models
and recalibrate those models to the new or updated data. Our
methods include a systematic process to develop new models or
recalibrate old models. A new model is developed when the old
database upon which the old model was developed is not
representative of the new database. This might occur if the new
database is substantially different in size, covers a different
geographic region, contains different types of insurees (e.g.,
predominantly elderly in Medicare; pregnancy and children are
characteristic of Medicaid) or different types of payments (e.g.,
capitation payments plus fee for service payments).
[0166] The selection of the population to be modeled is of key
importance since the predictor variables and their weights will
reflect not only the specific needs of the population, but also the
practice patterns of those providing care and the prices charged
for its health care services. The ideal population to use as a
standard is the CI's book of business for which the forecasts are
needed, provided it is of sufficient size. We have found that an
insured population (i.e., book of business) as small as 50,000
persons can produce robust cost forecasts.
[0167] Use of another, smaller or less representative population as
a standard can cause problems in the selection of risk factors,
because there is no reason to believe that needs per person (even
after adjustment for demographic factors), practice patterns of
providers, or prices per service will be similar enough in the
index population and in what amounts to a convenience sample, no
matter how large the latter may be. These three cost component
factors are known to vary by geographic locale, by socioeconomic
status of the insured, by the characteristics of the providers, and
by the features of their health insurance.
[0168] As shown in step 106, if it is determined that a new cost
forecasting model should be developed, there is a specified process
for developing the model. The method for developing the new cost
forecasting model is part of our product and it can be applied to
any medical insurance database that includes the necessary
information.
[0169] To develop a new cost forecasting model for a specific
customer, we need data from groups that were in its historical
"base period" and "next period". Claims data from the "lag period"
are not necessary since they need not be used in the model, but they
are generally collected. The cost forecasting model is calibrated on
the historical data to model the dynamics of medical care, practice
patterns, and pricing in the geographic markets and provider
networks used by the customer. The groups of insured people used as
a standard in our models must be enrolled for at least the last day
of the "base period", for the entire lag period and the next
period. Multiple sets of base period, lag period and next period
can be used to increase the amount of data used to create the cost
forecasting model. More data produces more robust models, but the
data must be adjusted for secular cost trends when there are multiple
calendar years for the "base period".
[0170] Scoring the data for pricing insurance for the policy period
involves applying the forecasting model to the data for the
underwriting period that will be used to forecast cost for the
policy period--the renewal year that needs pricing, as shown in
processing block 108. Generally, the most recent nine months of the
previous next period will be in the new underwriting period offset
by the three month lag period. This helps in processing the data
needed for predicting future costs. The first step in the scoring
108 is applying the data steps to the new underwriting period that
have not been previously applied (e.g., coding of risk factors).
Second, the cost forecasting model is applied to the person-level
data. External health care inflation forecasts from the CI or
consulting organization are then used to adjust the prior year's
trend inherent in the person-level forecasts. The person-level
inflation adjusted cost forecasts are then aggregated to the
group-level. Third, group-level adjustments to the forecasts are
applied for benefit plan design, SIC code, and other factors
influencing group costs.
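The scoring sequence above (person-level forecast, inflation adjustment, aggregation to the group, group-level adjustment) can be sketched as follows. This sketch assumes that the inflation adjustment and the group-level adjustments are simple multiplicative factors; the function name, factor values and group identifiers are hypothetical.

```python
from collections import defaultdict


def score_groups(person_forecasts, inflation_factor, group_factors):
    """Aggregate inflation-adjusted person-level forecasts to group level.

    person_forecasts: iterable of (group_id, forecast_cost), one per person.
    inflation_factor: multiplicative trend adjustment (e.g., 1.10).
    group_factors: group_id -> multiplicative group-level adjustment
                   (benefit plan design, SIC code, etc.); defaults to 1.0.
    """
    totals = defaultdict(float)
    for group_id, cost in person_forecasts:
        totals[group_id] += cost * inflation_factor
    return {g: total * group_factors.get(g, 1.0) for g, total in totals.items()}


# Hypothetical inputs (illustrative only).
people = [("G1", 1000.0), ("G1", 3000.0), ("G2", 2000.0)]
group_costs = score_groups(people, 1.10, {"G1": 0.5})
```

In practice the person-level forecasts would come from the cost forecasting model applied to the coded risk factors for the underwriting period.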
[0171] Having forecast the group's future medical expenses, over
the selected (e.g., 1 year) period, the price to be charged for the
medical insurance for the group for that period may be determined,
as shown in block 110 of FIG. 1. The insurer generally desires to
obtain a fair, or even maximum profit, without causing the group to
leave for another insurer. The competitiveness of the market,
historical prices, and historical costs are all factors that will
influence the likelihood of the group being retained at any given
price. The policy premium, the price to be charged to the customer
for the medical insurance coverage for the specific group,
comprises the forecast medical cost, the insurer's overhead and
other business expenses, and a projected profit. The client's
underwriter(s) are asked to provide explicit probabilities of
retaining a group at various price increases. These probabilities
are multiplied by the expected profit if the group is retained,
resulting in the expected profit for that group at each price
increase. The information is presented to the underwriter with the
premium price that optimizes profit highlighted and recommended.
These recommended prices may be more or less than prior prices, but
will typically more accurately reflect the future medical costs of
the specific group.
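The expected profit calculation can be sketched as follows. This is an illustration under stated assumptions: profit if the group is retained is taken to be premium less forecast medical cost less overhead, and the retention probabilities are whatever the underwriter supplies. All names and figures are hypothetical.

```python
def recommend_price(forecast_cost, overhead, current_premium, retention_by_increase):
    """Pick the price increase with the highest expected profit.

    retention_by_increase: fractional price increase -> underwriter's
    probability of retaining the group at that increase.
    Returns (best_increase, expected_profit_by_increase).
    """
    expected = {}
    for increase, p_retain in retention_by_increase.items():
        premium = current_premium * (1.0 + increase)
        profit_if_retained = premium - forecast_cost - overhead
        expected[increase] = p_retain * profit_if_retained
    best = max(expected, key=expected.get)
    return best, expected


# Hypothetical figures (illustrative only).
best, table = recommend_price(
    forecast_cost=900_000.0,
    overhead=100_000.0,
    current_premium=1_000_000.0,
    retention_by_increase={0.00: 0.95, 0.05: 0.90, 0.10: 0.70, 0.20: 0.30},
)
```

With these hypothetical inputs the 10% increase has the highest expected profit, since the larger margin at 20% is outweighed by the lower probability of retaining the group.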
[0172] FIGS. 2 and 3 similarly provide an overview of the
information flows for two different embodiments. The embodiment of
FIG. 2 involves substantially only the transfer of data. The
embodiment of FIG. 3 involves installing software at the client or
an Internet connection with the client's software.
[0173] Shown in FIG. 2 is a "service bureau" embodiment in which
all of the data preparation, cost forecasting, model development,
scoring the data, and pricing for specific individual groups is
carried out at a service bureau location. As shown in block 202,
medical history and claims data for members of the group are sent
to the service bureau location, and a cost forecast or per group
price or both are sent back to the client (see 212). An alternative
is for software to be installed in the client's (insurance
company's or third party administrator's) operations with model
updates being periodically provided to the client.
[0174] This historical data (typically provided by an insurance
company or TPA) is used to develop a model that is calibrated to
the book of business (see the sample data requested of the client,
and/or for specific policy types of insurance companies). A base
period, lag period, and next period are required as a minimum. The
data are fully validated prior to the model development.
[0175] As shown in block 204, cost forecasting models are developed
which include person-level inlier models based on the Winsorized
data (see FIG. 9) and outlier cost components (see FIG. 10),
inflation adjustments (see FIG. 11), group-level attribute models
(see FIG. 12), and pricing models (see FIG. 13).
[0176] As shown in block 206, once those models are developed and
preferably fully tested, we are ready to work with the most recent
data available to score the data as shown in block 208 and
establish cost forecasts and set prices for upcoming medical
insurance coverage. The most recent data are sent to us for
validation, scoring, future cost estimation, cost trend adjustments
and pricing (blocks 206 and 208). The data submission is done
approximately on a monthly or quarterly basis. There is a trade-off
between getting the most recent claims data available for pricing
and the effort required to validate the data submitted at a higher
frequency and shorter intervals.
[0177] The data are stored and combined with the previous data
submission until three to six months of new data are available, as
shown in block 210. The new data are combined with the most recent
data from the previous data submission so that the most recent 12
months of data are available and are used as the updated next
period for recalibration of the models to be used for scoring other
groups. In other words, the old models are refit with the new data
and updated cost trends are included also. Every one to two years
the models may be revised with updated predictor variables and
weights. Redoing the models will help capture changes in practice
patterns and relative pricing.
[0178] As shown in block 212, the summarized cost forecast and
pricing information are sent to the client for use by underwriters
or in an automated quotation system. The insurance company or other
underwriter client may also use its own pricing algorithm using the
cost forecast produced by the method of FIG. 2.
[0179] As indicated, FIG. 3 similarly illustrates an overview of an
embodiment of the present invention which may be directly utilized
by a health insurer or medical underwriter.
[0180] As shown in blocks 305, 306, and 308, the various parts of
operational software and work flows of the client database may be
adapted to automatically extract data, validate it, score the data
with the forecasting models, and price the groups. The medical
history, cost and other data elements used, and timing of the data
extracts are normalized or standardized for utilization in the
method and automating the recalibration of the models as shown in
block 310. An alternative to installing the software on the
client's computers is to perform that task using the Internet (as
an Internet Service Provider or ISP) to extract the data and return
cost forecast and group prices to the client.
[0181] As shown in block 306, processing software modules for
carrying out the present method may be installed on client
computers, to utilize the standardized data for the software.
[0182] As shown in block 308, after determining the medical cost
forecast for a specific group, the prices are offered to that group
for renewed medical insurance, whether it be first-dollar,
stop-loss or other coverage. This can be done using a human
underwriter or as part of an automated quotation system.
[0183] The software will capture the updated data and combine it,
as shown in block 310. Those data will be used to recalibrate the
models after about three to six months of data accumulation. The
updating may be performed offline, or may include automatic
database updating and model recalibration. Completely new models
may be developed about every one to two years offline.
[0184] Having described an overview of several embodiments as
illustrated in FIGS. 1-3, various processing steps of the
illustrated methods will now be described in more detail.
[0185] 402 The first step in the data portion of the process is the
data request. We do not need to have data in a predetermined layout
or format. Some variables may not be available for a given CI, TPA
or other data provider. This process is flexible so that it can be
modified to work around alternative formats and data sets used to
formulate the candidate predictor variables. However, the dollar
value of claims made in the base period and claims paid (or
disability or life indicator ratios) in the next period are
essential. Enough time for run out of claims is necessary so that
incurred but not reported (IBNR) claims are included in the data.
The following is an example of a data format, which may be used as
a request for health and medical cost data to be used in the
forecasting of medical costs:
EXAMPLE DATA REQUEST
[0186] In a preferred embodiment, this data may preferably be in
the form of five different data files that are linked by an
encrypted identifier. The identifier should include unique
characters for the company, family, and person. The data files
should include group-level information, person-level information,
detailed medical claims information (e.g., hospital, physician,
durable medical equipment, home health, etc.), detailed pharmacy
claims and capitation information, if germane.
[0187] Preferably, data are provided for a relatively large number
of people, e.g., 500,000, covering 27 consecutive months (12 month
base, 3 month lag, and 12 month test periods).
[0188] Descriptions of preferred data are as follows. Some of these
variables may not be readily available, especially some of the
group-level variables, and accordingly would not be used in the
model building and medical cost forecasting. Other data which may
define useful variables may also be included.
1. Group-level data (for any group covered during the test
period)
[0189] a. Company identifier
[0190] b. Group location (zip code or state and county codes)
[0191] c. Benefit plan description (format and content TBD)
[0192] d. SIC code or other industry classification
[0193] e. Original group effective date
[0194] f. Employer and Employee premium contribution %
[0195] g. Total number of covered employees on date last renewed or
date lapsed
[0196] h. Next scheduled renewal date
[0197] i. % employee participation
[0198] j. Capitation payments by provider type by geographic
locale
2. Enrollment data (person-level for each person covered above)
[0199] a. Company identifier
[0200] b. Person identifier
[0201] c. Age and birth date
[0202] d. Sex
[0203] e. Relationship to employee
[0204] f. Status of employee (e.g., COBRA, pensioner)
[0205] g. Employee type (e.g., hourly)
[0206] h. Zip code of residence
[0207] i. Date of enrollment
[0208] j. Date of termination during study period, if any
[0209] k. Presence of other health insurance (e.g., spouse
coverage, Medicare)
[0210] l. Salary or wage
[0211] m. Amount of term life coverage
[0212] n. Amount and terms of disability coverage
3. Medical claims (claim-level)
[0213] a. Person/company identifier
[0214] b. Service line-level information:
[0215] i. Billed charges, covered charges, payments, amounts applied
to deductibles, coinsurance, co-pays, and out-of-network penalties,
amounts of COB, pre-existing, capitation payments and other cutbacks
[0216] ii. Dates-incurred, entered, and paid
[0217] iii. Array of ICD-9 diagnoses (5+) for each claim
[0218] iv. CPT code for each claim
[0219] v. Provider type (e.g., physical therapist, clinical
psychologist, cardiologist)
[0220] vi. For confinement in any sort of inpatient facility,
include partial bills, DRG for inpatient hospital, admission and
discharge dates, partial/final bill indicator
[0221] vii. Service type/location (e.g., ER, surgicenter, home)
[0222] viii. Amount of subrogation
[0223] ix. Type of payment (e.g., fee for service or capitation)
4. Pharmacy data (claim-level)
[0224] a. Person/company identifier
[0225] b. National Drug Code or other classification
[0226] c. Date of prescription
[0227] d. Number of units, dose of units, and number of units/day
(if available)
[0228] e. Billed charges, discounted charges, and payments
5. Capitation payments, if germane
[0229] a. Geographic locale or market
[0230] b. Provider type
[0231] c. Amount and dates
[0232] d. Method for payment (e.g., per member per month)
[0233] The models can be built without pharmacy data if that is not
covered by the insurance. Enrollment and medical claims data are
required. Many of the group-level variables are desirable, but
optional. The data format would specify the dates for the beginning
of the base period and the end of the next period or new base
period to be used for the cost forecast for pricing. Because the
data may originate from a variety of different databases and
sources, control totals (e.g., number of records, sums of fields)
are also included, to assure that the data is excerpted and
formatted properly. The customer or TPA may provide a layout or
format for the data, because a specific format is not required. The
layout or other documentation should, however, describe all of the
legitimate values for the variables and the meaning of those values
(e.g., provider type=3=physician).
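A control total check of the kind described above can be sketched as follows. The record layout, field names and tolerance are hypothetical; the sketch simply compares the number of records and the sums of selected fields against the totals supplied with the data.

```python
def check_control_totals(records, expected_count, expected_sums, tolerance=0.005):
    """Compare record count and field sums against supplied control totals.

    records: list of dicts; expected_sums: field name -> expected total.
    Returns a list of human-readable discrepancies (empty if all match).
    """
    problems = []
    if len(records) != expected_count:
        problems.append(f"record count {len(records)} != {expected_count}")
    for field, expected in expected_sums.items():
        actual = sum(record[field] for record in records)
        if abs(actual - expected) > tolerance:
            problems.append(f"sum of {field} {actual:.2f} != {expected:.2f}")
    return problems


# Hypothetical claim extract with one payment field.
claims = [{"paid": 120.50}, {"paid": 79.50}]
ok = check_control_totals(claims, 2, {"paid": 200.00})
bad = check_control_totals(claims, 3, {"paid": 250.00})
```

An empty result indicates that the extract agrees with the control totals; any discrepancy would be resolved before further processing.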
[0234] As shown in block 404 of FIG. 4, the customer or TPA sends a
layout and a sample database, so that tests can be run prior to
extracting all of the data. Valid ranges of variables are checked
as shown in block 406. Control totals are matched, and encrypted
IDs may be tested. The data need not be aggregated and tested since
it is a small subset of the data universe, but the conformity of
the sample data to the layout is checked.
[0235] If the database is accurate, the entire universe of data is
processed, as shown in block 408.
[0236] If the database and layout do not correspond or there are
data values outside of the range of legitimate values, the data
extraction program or layout are fixed and another sample data set
or layout is tested.
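The conformity check of sample data against the layout can be sketched as follows. The layout representation used here (a set of legitimate codes or an inclusive numeric range per field) and all field names and values are assumptions of this sketch.

```python
def validate_sample(records, layout):
    """Return (row_index, field, value) for every value outside the layout.

    layout maps a field name to either a set of legitimate codes or a
    (low, high) tuple giving an inclusive numeric range.
    """
    errors = []
    for i, record in enumerate(records):
        for field, rule in layout.items():
            value = record.get(field)
            if isinstance(rule, tuple):
                low, high = rule
                valid = value is not None and low <= value <= high
            else:
                valid = value in rule
            if not valid:
                errors.append((i, field, value))
    return errors


# Hypothetical layout and sample records (illustrative only).
layout = {"provider_type": {1, 2, 3}, "age": (0, 120), "sex": {"M", "F"}}
sample = [
    {"provider_type": 3, "age": 45, "sex": "F"},
    {"provider_type": 9, "age": 130, "sex": "M"},
]
errors = validate_sample(sample, layout)
```

If the list of errors is not empty, the extraction program or the layout would be corrected and another sample tested, as described above.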
[0237] The dates for the model development overall, and the base
period for actual cost forecasting and pricing are established and
defined, and the respective dates for each respective group have
been set prior to the data request. Now the dates for each group
must be determined for its inclusion in the universe of the model
development.
[0238] As shown in FIG. 5, the process perhaps is easiest to
understand by working it backwards. A list is developed for the
renewal dates for the first year of coverage that would have
prospective prices set using this method, as shown in block
502.
The following Table 2 lists an example of time sequencing for
developing models and implementing cost predictor models.
TABLE 2 Time Sequences for Preparing and Implementing Cost Prediction Models
Number of Consecutive Calendar Months | A: Model Development | B.sup.a: Model Implementation for Predicting Costs and Setting Prospective Prices |
12 | Base Period Data | Underwriting Period Data |
3.sup.b | Lag in Data | Forecast Cost, Incorporate Inflation Forecast and Set Premium |
12 | Next period | |
3.sup.b | Model Weight (re)calibration | |
12 | | Policy Period |
.sup.a Column B pertains to groups which have the same renewal date (e.g., January 1)
.sup.b Periods greater than 3 months may be required for these phases depending on the client's needs
[0239] The groups need to get a price in advance of the coverage
date for new customers, or the renewal date for existing customers,
to accept or reject it prior to the renewal coverage. Additionally,
time for receiving data from the client or a TPA and analyzing it
must be added to the lag period. We have used a three month lag
period, as shown in processing block 504, but it could be longer
or shorter depending on database and business needs.
[0240] As shown in block 506, the beginning of the lag period is
the last date that bills can be paid for the base period of the
model development period. Otherwise, the cost forecasting model
would include information that would not be available in the
future. The lag period information (claims paid or made) need not
be used to provide an accurate cost forecast for a future time
period for a particular group. The amount of claims incurred during
the next period is the dependent variable for the model of the illustrated
embodiment. An estimate of claims incurred but not reported may be
added on if there is insufficient time for a proper run-out period
(i.e., if only one base period and next period are used for model
development). The lag period precedes the next period and the base
period is typically the year preceding the beginning of the lag
period in the universe of model development.
[0241] Table 2 illustrates one example of timing for the processing
of block 508. Column A represents the model development period and
Column B represents timing for the application of cost forecasting
and prospective pricing. The model development time period precedes
the actual pricing period but there is overlap since the next
period of the model development period is used as part of the
underwriting period for the application of cost forecasting and the
pricing model. The timeline will be modified when longer lag
periods are required. Column B pertains to groups with the same
renewal date. Alternate flowcharts may be used to represent each
renewal date.
[0242] Illustrated in FIG. 6 is a flowchart illustrating data
validation and standardization procedures for steps 102, 202 and
302 of the methods of FIGS. 1-3.
[0243] Preliminary data validation checks are performed, and initial
data preparation serves as a second set of data checks, as shown in
block 602. Utilizing a file structure that allows standards to be
compared to the data prior to the data aggregation facilitates these
checks.
[0244] As shown for processing by block 604, medical claims include
diagnoses that are typically coded in ICD-9-CM codes, procedures
that are coded in CPT codes, prescriptions that are coded using NDC
codes, hospitalizations coded using DRGs, ICD-9-CM and other codes,
that may appear on claims. Tables are developed that contain the
values for all of these codes. These tables are standards for
comparison with the customer's data and the values in the data must
correspond to valid values for these coding systems.
[0245] As shown in block 606, tables are made for each client,
because the place of service, type of provider, dates, and other
fields on the claims and enrollment data will frequently have
values that are idiosyncratic to a particular database or
customer.
[0246] The values should preferably be put in a table format that
will allow checking and standardizing the data for accuracy, as
shown in block 606.
[0247] As shown in 608, the time periods at the group-level (see
TABLE 2) may be used to screen whether claims and insureds should be
in the universe. A table is used for comparison. Prior experience
permits the development of norms that can be used to check the data
for reasonableness. Examples include the charge and payment per
claim, the number of claims per person, and other norms. These
values are put into a table for comparison, and processing in block
610.
[0248] Preparation (see block 612) of the raw data involves the
same data process steps used in FIG. 4, utilizing specified read
programs.
[0249] The data (see block 614) are provided in the agreed upon
medium, the data are read and control totals are checked, see block
616. If errors are noted, the cause is determined and
corrected.
[0250] The raw data are reformatted, see 618, into a SAS database
in the illustrated embodiment. Other database software (e.g., SPSS,
Oracle, etc.) could be used which are also capable of handling
large scale databases.
[0251] In subsequent process steps as shown in FIG. 6, the fields
are reformatted (see 620) so that the values correspond to the
standard tables, the group-level time period (see TABLE 2) tables
are used to extract, see 622, the universe of relevant claims and
insured people, and claims for people that are not in the model
development universe are put into a separate file (see 624). Data
following the model development universe time period may fit into
the underwriting period data that will be used for the application
of cost forecasting and pricing.
[0252] The claims and enrollment data from the model development
universe are compared, (see 626) to the standards. A decision is
made, see 628, whether the data are in compliance with the
standards.
[0253] Data that do not match the standards are put, see 630, into
a separate file. The cause of the mismatches is evaluated, and the
data is deleted or corrected where appropriate. Records may need to
be sent back to the customer for replacement or fixing. If there is
a large number of mismatches, they must be fixed prior to
aggregation.
[0254] The records that match the standards need to be matched and
merged, see 632, into person-level summaries. Incomplete data
should not be aggregated as it will be misleading.
[0255] FIG. 7 is a flowchart illustrating the matching and merging
(integration) of data in the process steps 102, 202 or 302 of FIGS.
1-3.
[0256] In order to match and merge the enrollment and claims data,
there needs to be a unique group, family within group, and enrollee
or dependent within family identifier, as indicated in the
processing of block 702. The social security number or other
identifier is encrypted so that actual people cannot be identified
and group numbers are used instead of company names. Street
addresses are not used so the people cannot be personally
identified. However, records need to be linked for accurate models
and pricing. One linking system that is effective uses the group ID
as a prefix, encrypted social security number of the enrollee as
the family ID, and enrollee or dependent number as the person ID.
Birth dates and sex are useful as checks on the ID.
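One way to construct such a linking identifier is sketched below. Note that this sketch substitutes a keyed one-way hash (HMAC-SHA-256 from the Python standard library) for the encryption mentioned above; that substitution, the 16-character truncation, the separator and the function name are all assumptions made for illustration.

```python
import hashlib
import hmac


def make_person_id(group_id, enrollee_ssn, person_number, secret_key):
    """Build a linkable but de-identified ID: group / family / person.

    The family component is a keyed digest of the enrollee's social
    security number, so the number itself never appears in the
    analytic data. person_number is 0 for the enrollee and 1, 2, ...
    for dependents.
    """
    digest = hmac.new(
        secret_key.encode("utf-8"),
        enrollee_ssn.encode("utf-8"),
        hashlib.sha256,
    ).hexdigest()
    family_id = digest[:16]  # truncated for readability (assumption)
    return f"{group_id}-{family_id}-{person_number:02d}"


# Hypothetical values (illustrative only).
enrollee = make_person_id("G001", "123-45-6789", 0, "demo-key")
spouse = make_person_id("G001", "123-45-6789", 1, "demo-key")
```

Members of the same family share the group and family components and differ only in the person component, which allows claims and enrollment records to be linked without exposing the underlying identifier.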
[0257] As shown in processing blocks 704, 706, the claims data are
prepared separately, and a look-up table is generated that lists
the group, family, person ID for all claims with the respective
birth date and sex.
[0258] In accordance with processing blocks 708, 710, the
enrollment data are used to develop a separate enrollment look-up
table which contains the same information as the claims look-up
table. There will be more in the enrollment table since each person
in the group does not necessarily have a claim but should be in the
enrollment file.
[0259] The processing for the respective blocks of FIG. 7 is
described as follows:
[0260] 712 The tables are merged and compared. The claims table
should be a subset of the enrollment table. Claim IDs that do not
match enrollment IDs indicate an error. These claims are put into a
separate file and manually analyzed.
[0261] 714 The claims records that match enrollment records are
merged together into one long variable length record.
[0262] 716 The person-level merged file contains the enrollment
information and claim information, but the record is not
aggregated.
[0263] 718 A flag is assigned to people that have claims and
enrollment information since these records will require
aggregation.
[0264] 720 A flag is assigned to people that do not have any claims
since their record does not require aggregation.
[0265] 722 Additional data validation checks occur such as the
number of insureds per group and the percentage of people within
each group that have no claims.
[0266] 724 If there are aberrations in the data, there is a manual
review. If that does not fix the problem, the errors are reviewed
with the customer.
[0267] 726 The data are valid and ready to transform into the
analytic database.
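The matching, merging and flagging of blocks 712 through 722 can be sketched as follows. The data structures (a dictionary of enrollment records keyed by person ID and a list of claim dictionaries) and all identifiers are hypothetical.

```python
def match_and_merge(enrollment, claims):
    """Merge claims onto enrollment records by person ID.

    enrollment: person_id -> dict of enrollment fields.
    claims: list of dicts, each with a "person_id" key.
    Returns (merged, orphan_claims); each merged record carries a
    "has_claims" flag marking whether aggregation is needed.
    """
    merged = {
        pid: {"enrollment": info, "claims": [], "has_claims": False}
        for pid, info in enrollment.items()
    }
    orphan_claims = []
    for claim in claims:
        person = merged.get(claim["person_id"])
        if person is None:
            orphan_claims.append(claim)  # no enrollment match: manual review
        else:
            person["claims"].append(claim)
            person["has_claims"] = True
    return merged, orphan_claims


# Hypothetical records (illustrative only).
enrollment = {"G1-F1-00": {"sex": "F"}, "G1-F1-01": {"sex": "M"}}
claims = [
    {"person_id": "G1-F1-00", "paid": 50.0},
    {"person_id": "G9-F9-00", "paid": 10.0},
]
merged, orphans = match_and_merge(enrollment, claims)
share_without_claims = sum(
    not person["has_claims"] for person in merged.values()
) / len(merged)
```

The share of people without claims is one of the validation statistics that may be compared against norms for each group.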
[0268] FIG. 8 is a flowchart illustrating the aggregation and risk
factor coding for the steps 102, 202 or 302 of the processes of
FIGS. 1-3. The respective processing blocks of FIG. 8 are described
as follows:
[0269] 802 The claims data are sorted by person ID by incurred date
of the claim.
[0270] 803 This sort allows for a final screening on the
chronological eligibility. A person in the group typically needs to
have at least one day of eligibility in the base period and next
period and continuous eligibility between those dates. Otherwise,
they are dropped from the modeling database. If a person loses
eligibility prior to next period, he or she is dropped from the
entire analytic database. If the person enrolls in the lag period,
that person is kept in a separate analytic database. This last
category of people will have their next period payments compared to
those of similar demographics. People who are enrolled in the base
period and disenroll during the next period are put
into a separate file in the analytic database. Their next period
payments will be compared to people with the same characteristics
that did not leave in the next period. People in other time
sequences may be dropped from the analytic database.
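The chronological eligibility screening can be sketched as follows. This is one reading of the rules above, not a definitive statement of them: the order in which the rules are tested, the requirement that the modeling universe be enrolled through the end of the next period, and the category labels are assumptions of this sketch.

```python
from datetime import date


def classify_person(enrolled, terminated, base, lag, nxt):
    """Assign a person to a universe based on enrollment dates.

    enrolled / terminated: first and last day of coverage (terminated
    is None if coverage is ongoing). base, lag, nxt: inclusive
    (start, end) date tuples for the three periods.
    """
    end = terminated or date.max
    if end < nxt[0]:
        return "dropped"  # lost eligibility prior to the next period
    if lag[0] <= enrolled <= lag[1]:
        return "lag_enrollee_file"  # kept in a separate analytic database
    if enrolled <= base[1]:
        if end < nxt[1]:
            return "next_period_leaver_file"  # left during the next period
        return "model_universe"
    return "dropped"  # other time sequences


# Hypothetical period boundaries (illustrative only).
base = (date(2006, 10, 1), date(2007, 9, 30))
lag = (date(2007, 10, 1), date(2007, 12, 31))
nxt = (date(2008, 1, 1), date(2008, 12, 31))
status = classify_person(date(2005, 1, 1), None, base, lag, nxt)
```

People assigned to the two separate files are not used to fit the model, but their next period payments remain available for the comparisons described above.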
[0271] 804 A new record is produced for each person. It includes
the enrollment data and information extracted from the claims, when
available. The risk factors use ICD-9-CM codes, CPT codes, place of
service, provider type, demographic data, and other variables (see
risk factor listing in Appendix G). As the records for a person are
read, the ICD-9-CM diagnosis codes, CPT codes and other variables
that are used to define the risk factors are extracted from the
claim records. The new record is a vector of variables that are
initialized to zero and then incremented by one when that variable
is read in the claims. These variables are coded from claims from
the base period only. Payments and charges are summed for the base
period, lag period, and next period. It is important to compare the
expected cost from the forecasting model with the actual cost next
period of those that were not in the modeling universe. If there
are large discrepancies, the model may need adjustment.
[0272] 806 The risk factors are then coded by processing the
information on each person's aggregated record (See Appendix G).
Risk factors were developed using a combination of expert medical
opinion, statistical analyses, and knowledge of the medical
insurance market. Diagnoses are divided into diseases and
conditions and by inherent risk. Procedures are divided by body
system, type of test, type of procedure, and type and site of care.
Other risk factors are designed based on the relationship to the
enrollee, family composition and demographics. There is a trade-off
between a very specific risk factor that has very few but very
homogeneous people in it and broad risk factors that have
heterogeneous people in them. Correlations with the next period's
payments and regression models are two ways to determine if a risk
factor is worthwhile empirically. The base period charges and
payments plus the shape of relative amounts of those payments by
month, day, or other amount of time are some of the strongest risk
factors (See TABLE 4). The amount of time enrolled in the base
period is another risk factor. The key is developing robust risk
factors that are not too heterogeneous. A priori logic plus trial
and error are useful approaches. Our candidate risk factor codes
are listed in Appendix G. TABLE 5 illustrates two family
composition risk factors. A detailed listing of risk factors is
contained in Appendix G: Risk Factors.
TABLE-US-00004 TABLE 4 Risk Factors for person level experience
Risk factor   Description
Hibymos1      The maximum cost per day for any month for the base period
Hibymos2      The 2nd highest cost per day for any month for the base period
Hibych2a      (1, 0) 1 = The second highest month cost per day is adjacent to the highest month
Hibych2b      (1, 0) 1 = The second highest month cost per day is not adjacent to the highest month
Hi1dvby       The index of highest cost per day divided by average cost per day per month
Hi2dvby       The index of 2nd highest cost per day divided by average cost per day per month
Tenmoch       Average from the sum of all months in the base period excluding the 2 highest months per day
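The TABLE 4 risk factors could be derived from monthly base period totals roughly as follows. This is a hedged Python sketch, not the code of the CD-ROM Appendix. It assumes that "average cost per day per month" is the mean of the monthly cost-per-day values, that Tenmoch is the mean of the monthly values left after removing the two highest months, and that both adjacency flags are zero when the second highest month has no cost. The function name and input layout are hypothetical.

```python
def experience_risk_factors(monthly_cost, monthly_days):
    """Compute TABLE 4 style risk factors from per-month cost and enrolled days.

    Requires at least two months of data.
    """
    # Cost per day for each month; months with no enrolled days count as zero.
    per_day = [c / d if d > 0 else 0.0 for c, d in zip(monthly_cost, monthly_days)]
    # Month indexes ordered from highest to lowest cost per day.
    order = sorted(range(len(per_day)), key=lambda i: per_day[i], reverse=True)
    first, second = order[0], order[1]
    avg = sum(per_day) / len(per_day)
    rest = [per_day[i] for i in order[2:]]
    has_second = per_day[second] > 0           # assumption: flags need a second peak
    adjacent = abs(first - second) == 1
    return {
        "hibymos1": per_day[first],
        "hibymos2": per_day[second],
        "hibych2a": 1 if has_second and adjacent else 0,
        "hibych2b": 1 if has_second and not adjacent else 0,
        "hi1dvby": per_day[first] / avg if avg > 0 else 0.0,
        "hi2dvby": per_day[second] / avg if avg > 0 else 0.0,
        "tenmoch": sum(rest) / len(rest) if rest else 0.0,
    }
```

A person with costs concentrated in two neighboring months gets a high Hi1dvby and Hibych2a = 1, which is the kind of "shape" information the text describes.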
TABLE-US-00005 TABLE 5 Risk Factors - Family Composition
Ensxkd   Combines the use of Employee Relationship and Gender of Enrollee.
         Employee Relationship:
           `1` = `A Enrollee`
           `2` = `B Spouse`
           `3` = `C Son`
           `4` = `D Daughter`
           `5` = `E Stepson`
           `6` = `F Stepdaughter`
           `7` = `G Other Male`
           `8` = `H Other Female`
           `9` = `I Surv Spouse`
         Gender of Enrollee:
           `M` = `male`
           `F` = `female`
         Values for ensxkd:
           1 Enrollee, Male
           2 Enrollee, Female
           3 Spouse, Male
           4 Spouse, Female
           5 Son, Daughter, Stepson or Stepdaughter
           6 Other Female or Surviving Spouse
kid1_3   Count of the number of children in a family: 0 = no children,
         1, 2, or 3 or more children
[0273] Some insurance plans are paid on the basis of a combination
of fee for service (FFS) payments and capitation payments. The
previous discussion has assumed a FFS payment system. If the
combination or hybrid payment system is used, then adjustments for
capitation payments must be made at the person and group levels. We
recommend developing risk factors as dummy variables when there are
capitation payments for particular provider types (e.g., primary
care, ob/gyn). This is especially important when the capitation
coverage is not consistent across groups or geographic regions.
[0274] 808 Validation checks can now be made on person-level data.
Frequency counts for dichotomous or categorical variables are
prepared and compared among groups, geographic area, time period,
as well as against norms. Missing value percentages are calculated
by group, time period and geographic area for each risk factor. The
mean number of claims per day and mean dollars per claim (this can
be Winsorized) are calculated by group, time period and geographic
region. Large discrepancies in the number of claims or average claim size are
reviewed and analyzed to uncover data errors. The ratio of charges
to payments is calculated by group, time period, and geographic
region and compared with norms.
[0275] 810 Aberrant results are evaluated to determine if there is
an error. If data cannot be corrected or replaced, those people are
dropped from the model universe.
[0276] 812 The model universe is left and ready for final
preparation for analysis.
[0277] FIG. 9 is a flowchart of processing steps for developing
cost forecasting models based on "inlier" data in steps 106, 204,
210, 304 or 310 of the methods of FIGS. 1-3. Processing blocks of
FIG. 9 are described as follows:
[0278] 901 A clean analytic database is required as the modeling
universe. Otherwise, spurious results will lead to idiosyncratic,
non-reliable models or, at best, weakly predictive models.
[0279] 902 The modeling universe database is separated into
Winsorized data (i.e., inliers) and the outlier data. There is an
"inlier" model with the dependent variable Winsorized and an
"outlier" model that uses the difference between the actual claims
next period and their Winsorized values. The independent variables
are similar for the inliers and outliers. It has been found that
models are more accurate when average payments per day are used as
the dependent variable and average charges per day (and components
of them, such as the lowest ten months' average charge per day) as
predictor variables. Cost per day adjusts for persons not
enrolled for a complete year.
[0280] The Winsorization point is typically selected as the threshold
marking the top 5% of payments per day (i.e., the 95th percentile). If
that value is $55 per day, then the inlier
model uses a value of $55 per day as the dependent variable for
people with greater than or equal to $55 per day in payments.
People with under $55 per day in payments do not have their
dependent variable changed.
[0281] The database for the outlier models flags people with next
period payments greater than or equal to the Winsorization value
(e.g., $55 per day). If they are at or over the Winsorization
amount, the flag equals one and zero otherwise. Also, the actual
payments per day next period less the Winsorization amount is
calculated. If it is negative, the outlier payment is set to
zero.
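The split described in blocks 902 and the two paragraphs above can be illustrated as follows. This is a minimal Python sketch under stated assumptions: the Winsorization point is taken by a nearest-rank rule (other percentile conventions would serve equally well), and the function names are hypothetical.

```python
def winsorization_point(payments_per_day, top_share=0.05):
    """Smallest payment per day among the top `top_share` of people (nearest rank)."""
    ranked = sorted(payments_per_day)
    n_top = max(1, int(round(len(ranked) * top_share)))
    return ranked[-n_top]

def split_inlier_outlier(payment_per_day, point):
    """Return (inlier value, outlier flag, outlier amount) for one person."""
    inlier = min(payment_per_day, point)                 # Winsorized dependent variable
    flag = 1 if payment_per_day >= point else 0          # at or over the point
    outlier_amount = max(payment_per_day - point, 0.0)   # negative values set to zero
    return inlier, flag, outlier_amount
```

With the $55 per day example, a person at $80 per day contributes $55 to the inlier model and $25 to the outlier model, so the two parts always sum to the actual payment.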
[0282] 903 The Winsorized modeling universe database is separated
into two separate components: those individuals with claims in the
base period and those individuals without claims in the base
period. Those without claims have only demographic risk factors
whereas those people with claims have a payment history and
clinical information as additional risk factors. Those without
claims are on average lower in risk than those with claims.
[0283] 904 The no claims database includes demographic variables,
such as age and the family relationship to the enrollee plus risk
factors from the enrollment file.
[0284] 906 People with claims in the base period also have the
enrollment file risk factors plus those risk factors derived from
the claims file.
[0285] An example of a program segment to run OLS regression model
on inlier with claims data is as follows:
TABLE-US-00006
*** `5th root of winsorized cost is DEP measure`;
***OLS MODEL;
proc reg data=`DATA WITH CLAIMS` outest=`OLS 1st MODEL FOR LAD CART`;
  exp9olsd : model w5_6850= ensagen sq5chg1 sq5chg2a sq5chg2b sq5oth agesq
    h5bchg1 h5bchg2a h5bchg2b ten5moch zeroa zerob zerooth enrldayb
    hibymos1 hibymos2 hi1dvby hi2dvby
    / selection=stepwise selection=backward details;
run;
proc score data=`DATA WITH CLAIMS` score=`OLS 1st MODEL FOR LAD CART`
  out=`DATA WITH CLAIMS` type=PARMS predict;
  var ensagen sq5chg1 sq5chg2a sq5chg2b sq5oth agesq
    h5bchg1 h5bchg2a h5bchg2b ten5moch zeroa zerob zerooth enrldayb
    hibymos1 hibymos2 hi1dvby hi2dvby;
run;
***CHECK RESULTS;
proc means data=`DATA WITH CLAIMS`;
  class modeled;
  var w5_6850 exp9olsd;
proc corr data=`DATA WITH CLAIMS`;
  var w5_6850 exp9olsd;
  where modeled eq `YES`;
[0286] 908 The initial person-level model for people with claims
uses the continuous independent variables only. Examples include
the age, number of days enrolled in the base period, charges in the
peak spending month, and average charge per day in the lowest ten
months. The dependent variable is the Winsorized payment per day
(or a transformation of it such as the fifth root) in the next
period. An ordinary least squares (OLS) model has been used. Other
forms of regression models (e.g., median or robust) or neural
networks could be used. The example given in the software in the
CD-ROM Appendix does not include this step, but the program above
does provide an example. This step can be important when there are
several numerical candidate predictor variables.
[0287] 910 The expected payments per day from the previous step are
used as an input to the next model along with the categorical
variables (e.g., sex, site of care, diagnosis, etc.). We have found
that a regression tree is a very effective method for capturing the
interactions between the clinical variables and the amount charged
in the base period. The CART software with the median regression
tree option has produced the best results to date. Other forms of
data mining (e.g., rule induction, clustering, genetic
algorithms, neural networks) could also be used. The key is to
capture the interactions between base period charges and both
clinical and demographic risk factors. An example of a Program to
run CART median regression tree using expectations created from OLS
regression (see 910) and other risk factors is found in Appendix
A.
[0288] 912 A CART median regression tree or other data mining
technique is used to model the "no claims" Winsorized database. The
first model (i.e., the one for continuous variables used in 908) is
omitted since none of the continuous variables derived from claims
are available for this universe other than age or length of
enrollment. This model uses the same statistical techniques as 910
but its independent variables are limited to those that can be
derived from the enrollment file. The output from the regression
tree (terminal nodes) identifies groupings of people that have
homogeneous next period payments.
[0289] 914 The regression tree's terminal nodes group people with
similar median payments next period. A set of dummy variables is
developed that identify people in each terminal node. These dummy
variables, the variables that were used to form the dummy
variables, and the significant variables from 908 are entered into
a final prediction model. We have used OLS, but other techniques,
such as median or robust regression, neural networks or other
modeling methods could be used instead. The result of those models
is an expected payment per person per day in the next period. This
only includes the Winsorized portion of the payments for people
with claims in the base period. An example of a program to run OLS
regression using terminal nodes from regression tree and other
important risk factors from the tree (see 910 and Appendix A) is
found in Appendix B.
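The dummy variable coding of block 914 can be sketched as below. This Python fragment is illustrative only and is not the Appendix B program. It assumes each person has already been tagged with a terminal node ID, and it omits one reference node so that the dummies are not collinear with a model intercept, which is our assumption about how the final regression is specified. The function and variable names are hypothetical.

```python
def node_dummies(node_ids, reference=None):
    """Build 0/1 indicator columns for terminal nodes, omitting a reference node.

    Returns (column names, one row of indicators per person).
    """
    nodes = sorted(set(node_ids))
    if reference is None:
        reference = nodes[0]               # assumption: lowest node ID is the baseline
    kept = [n for n in nodes if n != reference]
    names = [f"node_{n}" for n in kept]
    rows = [[1 if nid == n else 0 for n in kept] for nid in node_ids]
    return names, rows
```

The resulting columns would then be entered into the final prediction model together with the variables that defined the nodes and the significant variables from 908.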
[0290] 916 The same technique as 914 is applied to the model output
from 912. The result of this model is the expected payments per day
for next period for people that do not have claims in the base
period.
[0291] 918 Model testing can be done at this point or after each
step in the modeling process (i.e., after 908, 910, and 914 for
models for people with claims). It is probably most efficient to test
after the final step. There are five criteria that are used in
model evaluation in the illustrated embodiment: the mean absolute
residual, r.sup.2, accuracy measure (previously defined), bias, and
cross validation. Mean absolute residual, accuracy measure
(previously defined) and r.sup.2 are related to the accuracy of the
forecast. Bias refers to systematic over or under prediction when
cases are sorted by their expected value. Regression models can be
biased but regression trees are not biased. Cross validation refers
to the accuracy of the models when they are applied to different
sets of data. The tree software tests for cross validation.
Hold-out samples can be used for testing the entire hybrid models.
An example of a Program to run bias test, mean absolute residual,
and r.sup.2 analyses (examples of model testing) is found in
Appendix C.
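Three of the tests named in block 918 can be written compactly. The Python sketch below is not the Appendix C program: it covers the mean absolute residual, r.sup.2 and a simple bias check, and leaves out the accuracy measure and cross validation. The bias check sorts cases by expected value, splits them into equal-sized bands, and reports actual minus expected means per band; the number of bands is an assumption.

```python
def mean_absolute_residual(actual, expected):
    return sum(abs(a - e) for a, e in zip(actual, expected)) / len(actual)

def r_squared(actual, expected):
    mean_a = sum(actual) / len(actual)
    sse = sum((a - e) ** 2 for a, e in zip(actual, expected))
    sst = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - sse / sst

def bias_by_band(actual, expected, bands=10):
    """Mean(actual) - mean(expected) within bands of cases sorted by expected value."""
    pairs = sorted(zip(expected, actual))
    n = len(pairs)
    out = []
    for b in range(bands):
        chunk = pairs[b * n // bands:(b + 1) * n // bands]
        if chunk:
            exp_mean = sum(e for e, _ in chunk) / len(chunk)
            act_mean = sum(a for _, a in chunk) / len(chunk)
            out.append(act_mean - exp_mean)
    return out
```

Band differences that are consistently positive at one end and negative at the other indicate systematic under or over prediction of the kind the text calls bias.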
[0292] 920 The same tests of the quality of the models are applied
to the models developed on people without claims in the base
period. The model tests are probably most efficiently applied after
the final model is developed (i.e., 920). These models will have
far less predictive accuracy than the models covering people with
base period claims since there are fewer risk factors and the
variability in next period's payments is not very predictable.
[0293] FIG. 10 is a detailed flowchart of process steps for
developing cost forecasting models based on "outlier" data of the
Winsorized data for the steps 106, 204, 210, 304 or 310, of the
methods of FIGS. 1-3. The illustrated processing blocks of FIG. 10
are described as follows:
[0294] 1002 The outlier database has next period's payments of zero
for everybody whose payments were below the Winsorization point and
the amount above the Winsorization point for everybody else. The
outliers can have very high cost per day so the variability is very
large. Therefore, we have chosen to model the outlier portion
separately. This two step approach leads to more accurate and
stable results since the extreme outliers are almost impossible to
predict accurately.
[0295] 1004 People with base period claims are modeled separately
as they have risk factors not available with people without base
period claims (e.g., diagnosis and amount charged).
[0296] 1006 People with no base period claims are modeled
separately since they only have risk factors available from the
enrollment file.
[0297] 1008 The same continuous risk factors available for 908 are
used to model the probability of these people having payments above
the Winsorization point. The dependent variable is 1 if the total
amount of next period's payment is above the Winsorization point or
zero otherwise. A logistic regression is used to estimate the
probability of each person's next period's payments exceeding the
Winsorization point. Other types of regression models (median or
robust), neural networks, or other predictive modeling can be used
instead of logistic regressions.
[0298] A program to run logistic regression probability model on
outliers with claims follows.
TABLE-US-00007
**HILO is the 1=Outlier, 0=Inlier;
proc logistic data=`DATA WITH CLAIMS` outest=`LOGISTIC WEIGHTS`;
  exphilo : model HILO=ensagen sq5chg1 sq5chg2a sq5chg2b sq5oth agesq
    h5bchg1 h5bchg2a h5bchg2b ten5moch zeroa zerob zerooth enrldayb
    hibymos1 hibymos2 hi1dvby hi2dvby;
proc score data=`DATA WITH CLAIMS` score=`LOGISTIC WEIGHTS`
  out=`DATA WITH CLAIMS` type=PARMS predict;
  var ensagen sq5chg1 sq5chg2a sq5chg2b sq5oth agesq
    h5bchg1 h5bchg2a h5bchg2b ten5moch zeroa zerob zerooth enrldayb
    hibymos1 hibymos2 hi1dvby hi2dvby;
run;
data `DATA WITH CLAIMS`;
  set `DATA WITH CLAIMS`;
  exphilo=exhbilo*`mean of outliers`;
[0299] 1010 The model is tested for accuracy using the criteria
described in 918. Note that the probability of each person being an
outlier is being modeled rather than classifying each person as an
outlier or not an outlier. All of the techniques from processing
block 918 of FIG. 9 are applicable.
[0300] 1012 A regression tree is used to refine the estimated
probability of being an outlier. The dependent variable is the same
as 1008. We recommend a least square regression tree but other
types of predictive models could be used that capture interactions
(e.g., neural network, rule induction or genetic algorithms). The
expected value from the logistic regression plus all of the
categorical risk factors from the claims data and enrollment file
are used as candidate independent variables (See 910). The output
are terminal nodes of a least squares regression tree that have
homogeneous probabilities of being an outlier. The probability of
each person is determined by their terminal nodes. Note that this
is not a classification tree.
[0301] A program to run CART least squares probability tree on
outlier with claims data using expectations from the logistic regression
(see 1008) and other risk factors is found in Appendix D.
[0302] 1014 The same methods are applied to the people with no
claims data (See 1012). The output are groupings of people with
homogeneous probabilities of being an outlier.
[0303] 1016 and 1018 The models are tested for accuracy, bias and
cross validation as the models were tested in 918.
[0304] 1017 and 1019 The terminal nodes and risk factors defining
those terminal nodes are used as input into another logistic
regression or other forecasting technique (see 914 and 916). The
examples in Appendix E are for 1017 since it includes data from
claims.
[0305] 1020 and 1022 For each terminal node, the median payment
above the Winsorization point next period is calculated. When the
medians are not significantly different, the terminal nodes (mean
above the Winsorization point) are combined for additional
stability. Note that the probabilities are not combined. The means
are calculated arithmetically for the people in the combined
terminal nodes and for those kept in separate nodes due to their
distinctive median dollar costs. The means are then multiplied by
the respective probabilities for each person giving the expected
outlier payments for each person. The probability from the logistic
regression (see 1017 and 1019) is used rather than from the
regression tree. People are "tagged" with their respective terminal
nodes (see 1012 and 1014) so that the correct mean is multiplied by
the probability.
[0306] 1024 The inlier Winsorized cost forecast and the expected
cost of the outlier portion are summed to give the total expected
cost for next period.
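The arithmetic of blocks 1020 through 1024 can be sketched as follows. This Python fragment is an illustration under assumptions: each record carries a terminal node tag and the payment per day above the Winsorization point, a person counts toward a node mean only when that excess is positive, and `group_of_node` (a hypothetical name) maps nodes that were combined to a shared label.

```python
from collections import defaultdict

def mean_excess_by_group(records, group_of_node):
    """Mean payment above the Winsorization point among outliers, per node group.

    records: iterable of (terminal_node, excess_per_day) pairs.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for node, excess in records:
        if excess > 0:                             # only outliers enter the mean
            group = group_of_node.get(node, node)  # uncombined nodes stand alone
            totals[group] += excess
            counts[group] += 1
    return {g: totals[g] / counts[g] for g in totals}

def expected_total_per_day(inlier_forecast, outlier_prob, node, group_means, group_of_node):
    """Inlier forecast plus probability times the mean excess for the person's node."""
    group = group_of_node.get(node, node)
    expected_outlier = outlier_prob * group_means.get(group, 0.0)
    return inlier_forecast + expected_outlier
```

Note that only the means are pooled across combined nodes; each person keeps his or her own probability, as the text requires.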
[0307] The process of scoring the data refers to applying the model
to a set of data. The data need not be the same data on which the
model was developed. However, it is best if the weights are derived
from that client's book of business. The data need to have the same
risk factors coded on it that were included in the models of the
probability of being an outlier and those used for the expected
inlier payment calculations. Also, the models must be applied to
the universe of people that were defined using the same criteria
that were used to define the model universe. The model gives a set
of weights applied to individual risk factors or combinations of
risk factors yielding the expected payments or probability. Most
statistical packages or data mining software have automated methods
for scoring data once the risk factors are properly coded.
[0308] Illustrated in FIG. 11 is a detailed flowchart for scoring,
testing and integrating the data, and adjusting for cost trends for
use in steps 106, 204, 210, 304 or 310 as well as 108, 208 and 306
of the methods of FIGS. 1-3. The description is written as steps in
developing the model so the data are referred to as the base and
next periods. The application of the model to the actual
underwriting data is essentially the same and it produces the
policy period expected cost. The respective processing blocks of
FIG. 11 are described as follows:
[0309] 1102 The probability of a person being an outlier (i.e.,
with policy period payments greater than the Winsorization point)
is calculated for all people without claims. Their probabilities
will be lower than those with base period claims.
[0310] 1104 The mean for each terminal node or group of terminal
nodes (block 1022 of FIG. 10) is multiplied by the associated
probability. This calculates the amount over the Winsorization
point that each person is expected to cost in the next period. This
gives the expected outlier dollars per day for each person. The
mean expected dollars per day for each person is well below the
Winsorization point.
[0311] 1106 and 1108 The exact same process is applied to the
outlier probability model and mean policy period payments for
people that have base period claims. The expected value is
calculated by multiplying the probability by the mean.
An example of a Program to score the outlier with claims data (see
1017) is as follows:
TABLE-US-00008
proc score data=`data from cart` score=`logistic output`
  out=`data with claims` type=PARMS predict;
  var ensagen agesq exp9olsd exp9sqd sq5oth ten5moch dxresp othdiges hi2dvby
    dxdigest dxcircul tnde5ls1 tnde5ls3-tnde5ls5 ensxkd1a ensxkd2b
    ensxkd3c ensxkd4d ensxkd6f;
run;
data `DATA WITH CLAIMS`;
  set `DATA WITH CLAIMS`;
  expprob=hilols*`mean of outliers`;
[0312] 1110 and 1112 The expected next period inlier (less than or
equal to the Winsorization point) payments are added to the
expected next period outlier payments to produce the total expected
payments in the next period for people with no claims (from 920)
and for people with claims in the base period (from 918). The
following program is an example of scoring inlier data with
claims.
TABLE-US-00009 Program to run scoring of inlier with claims data
(output from OLS regression see 914)
***score ALL data;
PROC score data=`DATA WITH CLAIMS` score=`OLS regression scores`
  out=`DATA WITH CLAIMS` type=PARMS predict;
  var ensagen agesq exp9olsd exp9sqd sq5oth ten5moch dxresp othdiges hi2dvby
    dxdigest dxcircul td5lad2-td5lad13 ensxkd1a ensxkd2b ensxkd3c
    ensxkd4d ensxkd6f;
run;
title2 `REPORT TO REVIEW SCORED DATA With model universe`;
PROC means data=`DATA WITH CLAIMS`;
  var wins6850 expolsls exp5rLAD exp5rtLs ensagen agesq exp9olsd exp9sqd
    sq5oth ten5moch dxresp othdiges hi2dvby dxdigest dxcircul
    td5lad2-td5lad13 ensxkd1a ensxkd2b ensxkd3c ensxkd4d ensxkd6f;
  where exp9olsd ge 1.15;
[0313] 1114 This database includes everybody that was included in
the modeling universe (i.e., the standard population). However,
there are people that were enrolled next period but not included in
the modeling universe.
[0314] 1116 When everybody included in the modeling database is
combined, the sum of the expected payments per day next period
should equal the actual payments. Additional model testing is
performed at this point. The same methods (see 918 and 920) that
were used to test the models developed on subsets of the modeling
universe are reapplied now. This summary testing is even more
important than testing the components of the complete model.
[0315] 1118 There are three categories of persons for whom
insurers will be at risk during the next period but who are
excluded from the modeling database (i.e., the standard
population).
[0316] 1. Persons enrolling during the lag period
[0317] 2. Persons enrolling during the next period
[0318] 3. Persons terminating during next period
[0319] a. in 1 or 2 above
[0320] b. other categories
[0321] For those in categories 1 or 2, no base period claims data
are available when the rates must be developed and offered.
Consequently no model predictions can be made for them. However, we
know their actual payment costs during next period. The following
tabulations will show if any adjustment in expected next period
costs is needed for them.
[0322] Compare the next period actual costs per person per day for
those in categories 1 and 2 with both the expected next period cost
per person per day and the actual next period cost per person per
day for those in the following categories (note that these are
detailed examples of subscriber units that could be used for
pricing also):
[0323] Subscriber only
[0324] Subscriber and spouse
[0325] Subscriber spouse and 1 dependent
[0326] Subscriber spouse and 2+ dependents
[0327] Subscriber and 1 dependent, no spouse
[0328] Subscriber and 2+ dependents, no spouse
Because outlier next period costs may distort these findings, the
following quantiles of costs per person per day should also be
compared to reduce the effects of outliers.
[0329] Median
[0330] 75th percentile
[0331] 90th percentile
If there are no significant differences between the excluded and
included categories of persons, no adjustment is needed. For those
categories for which there are significant differences, the
adjustment factor will be (excluded category mean next period
cost/day) divided by (included category mean next period
cost/day).
[0332] The number of persons in category 1 can be determined for
those who actually enrolled in the lag period while the number in
category 2 can be estimated from underwriting period data. The
final adjustment factor will be the product of the per person
adjustment factor (as above) and the proportion of all next period
person days estimated to be comprised by those in category 1. The
proportion of next period person days comprised by those in the
model will have an adjustment factor of 1.00.
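One reading of the adjustment factor arithmetic above is a person-day-weighted blend, sketched here in Python. The blending step is our interpretation: the text gives the per person factor and states that the modeled population carries a factor of 1.00, and we assume the contributions are summed across populations whose person-day shares total one. The function names are hypothetical.

```python
def per_person_adjustment(excluded_mean_cost_per_day, included_mean_cost_per_day):
    """(Excluded category mean next period cost/day) / (included category mean)."""
    return excluded_mean_cost_per_day / included_mean_cost_per_day

def blended_adjustment(factor_and_share):
    """Sum of factor * share of next period person days over all populations.

    factor_and_share: list of (adjustment factor, person-day share) pairs;
    the modeled population enters with a factor of 1.00.
    """
    return sum(factor * share for factor, share in factor_and_share)
```

For instance, if lag period joiners cost 20% more per day than comparable modeled members and make up 10% of next period person days, the blended factor under this reading is 0.10 x 1.20 + 0.90 x 1.00 = 1.02.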
[0333] The use of these adjustment factors can be further refined
by applying them separately for sets of insured groups which have
similar adjustment factors, instead of applying one adjustment
factor to all groups.
[0334] Additional adjustment for those in category 3a above is not
required since these persons' experience will be included in the
adjustment for those in categories 1 and 2. Those persons in
category 3b will be included in the population used as the standard
for our overall risk models. We can thus score them by
their base period attributes and estimate their next period
expected costs from our models. These can then be compared to the
actual next period costs per person per day, in total and by the
subscriber family categories listed above.
[0335] After checking for the influence of outliers, any subsets
with actual values differing significantly from expected values can
be the basis of adjustment. The proportions of person days in
category 3b can be estimated from the available data.
[0336] As noted above, separate adjustments can be made to expected
next period costs for groups which have similar adjustment
component factors.
[0337] 1. actual to expected costs
[0338] 2. proportion of next period person days attributable to
those in category 3b
There may well be an interaction between these two
factors.
[0339] 1120 The database of all people covered next period is
compiled next. A flag is set to one if the person has an expected
payment next period that was calculated from the risk adjustment
models. Only the new joiners in the lag period or next period
cannot have an expectation calculated from the risk adjustment
model.
[0340] 1122 When this product is used for an application of
prospective pricing for insurance coverage, the future cost of
health care needs to be included. The risk adjustment models
include the historical cost trend since it was present in the data.
In other words, no additional adjustment was required for the
modeling since the model uses the base period to forecast next
period's payments so the cost trend inherent in the data is built
into the model. Note that with a 3 month lag period, this is a 15
month cost trend. If the future annual cost trend is expected to be
identical to the cost trend between the base period and the next
period, then no further adjustment is needed since it is already
incorporated in the data and model. If the future cost trend is
different from the cost trend implicit in the data used for model
development, the ratio of the future cost trend divided by the
model period cost trend should be used as an adjustment.
[0341] All health insurance companies use an estimate of the future
medical cost trend to increase future expected claim costs to what
they expect them to be in the policy period. The simplest
group-level cost forecast for a credible group is last year's cost
multiplied by cost trend producing the "experience" forecast. The
CI will provide a cost trend forecast for use in this invention.
The development model has an implicit cost trend built into it
since it was present in the model development data. Therefore, the
development model must be detrended and then the CI's cost trend
forecast can be applied to the person-level cost forecast when the
model is applied to the underwriting period data. In order to
detrend the development model, we calculate the cost for a
standardized population for the book of business in the base and
next periods. The standardized population assumes a specific mix of
demographics in the CI's book of business for the base and next
periods. A particular embodiment would calculate the proportion of
cost in each of the following categories: male employee; female
employee; male spouse; female spouse and other dependent
cross-classified by 5-10 age categories (e.g., <5, 5-17, 18-24,
25-34, 35-44, 45-54, 55-64, 65-74, 75+). This particular
classification would produce up to 40 demographic cells. Other
classifications could be used. Too many cells will cause a loss of
robustness in the estimates. The mean cost per person per cell in
the next period divided by the associated mean cost in the same
demographic cell in the base period calculates the cost trend per
cell during the model development period. One method to standardize
the population in order to produce a single cost trend for the
entire book of business is to weight each cell by the proportion of
cost it accounts for in the base period. The weighted average of
the cells' cost trend is a summary cost trend for the book of
business for that standard population for the time period between
the base and next periods. If those periods are contiguous and one
year each, the annual development model cost trend has been
calculated. Otherwise, an adjustment must be made for the time
periods to calculate an annual trend. If the base, lag and next
period are each one year, the square root of the cost trend will
calculate the annual cost trend since the trend compounds. If the
lag period is three months and the base and next period are one
year, the fifth root of the cost trend is the three month cost
trend. The three month cost trend is taken to the fourth power to
calculate the annual cost trend. To apply the CI's single number
cost trend (which will be an annual trend), the reciprocal of the
annual development model cost trend is multiplied by the CI's
annual cost trend to calculate the cost trend that should be
applied to the underwriting period data after application of the
development model. This method works for first dollar medical
insurance, aggregate only medical stop loss and reserving for those
insurance products.
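The detrending and retrending arithmetic above can be checked with a short calculation. The Python sketch below assumes cell-level mean costs per person are already available, weights each demographic cell by its share of base period cost, and measures elapsed time between period midpoints (which reproduces the square root and fifth root cases given in the text). All function names are hypothetical.

```python
def weighted_book_trend(base_mean, next_mean, base_total):
    """Cost-weighted average of per-cell trends (next mean / base mean).

    base_mean, next_mean: dict cell -> mean cost per person.
    base_total: dict cell -> total base period cost, used for the weights.
    """
    total = sum(base_total.values())
    return sum((next_mean[c] / base_mean[c]) * (base_total[c] / total)
               for c in base_mean)

def midpoint_gap_months(base_months, lag_months, next_months):
    """Months from the middle of the base period to the middle of the next period."""
    return base_months / 2 + lag_months + next_months / 2

def annualize(trend, gap_months):
    """Convert a trend observed over gap_months into a compound annual trend."""
    return trend ** (12.0 / gap_months)

def trend_adjustment(ci_annual_trend, dev_trend, gap_months):
    """CI annual trend times the reciprocal of the annual development model trend."""
    return ci_annual_trend / annualize(dev_trend, gap_months)
```

With one year base and next periods and a three month lag, the gap is 15 months, so `annualize` takes the fifth root and raises it to the fourth power, matching the description above.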
[0342] The development model next period data need to be detrended
and then retrended with the CI's cost trend forecast prior to
calibrating the development model for specific stop loss coverage
or aggregate stop loss in combination with specific stop loss
coverage. Once those adjustments are made, additional cost trend
adjustments do not need to be made before applying the specific or
aggregate in combination with specific stop loss models to the
underwriting period data to forecast the policy period costs.
[0343] Alternatively, the CI may have cost trend calculated
separately by geographic locale or by provider type (e.g., drugs,
physician, inpatient hospital). If the CI's cost trend is specific
to each geographic locale, the same method of demographic cell
adjustments can be employed as previously described but a separate
table is calculated for each geographic locale. The CI's locale
specific cost trend is applied to the cost trend estimated for the
model development period using the standardized population
adjustments for each locale. Each locale's detrending and
retrending is applied to the underwriting data for that locale to
calculate the policy period cost for that locale.
[0344] If the CI's cost trend forecast is by provider type, we need
to estimate the development model trend by provider type so that
the policy period forecast will be appropriately detrended and
retrended. This can be done by cross-classifying the demographic
cells by provider type costs for the base and next periods and
calculating the provider type trend for each demographic cell
separately by provider type. The provider type cost trends by
demographic cell are combined, for each provider type separately,
by weighting each cell's trend by that cell's proportion of total
base year cost. This calculates
a provider type cost trend for the base to next period for the
entire book of business. The CI's forecast cost trend by provider
type is multiplied by the reciprocal of the model development cost
trend for the same provider type. This adjusted cost trend by
provider type is multiplied by the cost forecast for each terminal
node by the associated cost by provider type in the policy period
and then summed across provider type by person to calculate the
policy period forecast cost per person. The associated cost in the
policy period by provider type is calculated by multiplying the
proportion of cost by provider type in the next period by terminal
node by the total forecast cost for the policy period for that
terminal node.
[0345] 1124 The person-level inflation adjusted forecasts are
summed by group and actual is compared to forecast. The group-level
models make adjustments when the actual is different from
forecast.
[0346] The underwriting period data are scored using the model
developed on the base and next periods. Risk factors need to be
calculated for the underwriting period data in order to apply the
model. The summed scored data, with appropriate cost trend
assumptions, produce the expected policy period costs or actual
expected cost for the policy period using the person-level
models.
[0347] FIG. 12 is a detailed flowchart illustrating processing
steps for developing group-level models and making adjustments to
the summary of the person-level data of steps 106 and 108 of FIG.
1, 204, 208 and 210 of FIG. 2, or 304, 306 and 310 of FIG. 3. The
steps are similar to the person-level modeling steps. First the
development model is calculated using the base and next period
data. The model is then applied to the underwriting period data
(i.e., scoring the data) to forecast the policy period costs. With
the group-level model there is the model development using the base
and next period and then the risk factor coding and scoring of the
underwriting period to produce the estimated policy period costs
for pricing the policy. The processing block descriptions for FIG.
12 are:
[0348] 1202 There are likely to be characteristics of insured
groups which can influence the group's costs of care over and above
that based on the characteristics of the persons in the insured
groups. For this reason we develop a model to identify such
intergroup differences and a way of applying the model's results to
adjust each group's expected payments from the models based on
individuals. First, the person-level expected payments are summed
by group.
[0349] 1204 The group-level development models have the following
characteristics:
[0350] Unit of observation--the "group"
[0351] Dependent variable--Next period residual dollars per person
per day in the group (i.e., group total next period actual payments
less Group total next period forecast payments divided by the
number of people in the group divided by 365 days)
[0352] Candidate predictor variables are coded and include the
following
[0353] Benefit attributes
[0354] alternative insurance plan
[0355] deductible
[0356] co payment
[0357] exclusions
[0358] dependent coverage
[0359] Benefit plan type: indemnity, PPO, POS, lock-in HMO
[0360] Payment type: fee for service or capitation
[0361] Demographic cells: proportion in age range by relationship
by sex
[0362] COB in Base period
[0363] Capitation payments by provider type
[0364] Number of subscribers
[0365] Average family size and proportion in each family
composition class
[0366] SIC code
[0367] Geographic locale
[0368] Actual mean payments in base (underwriting) period per
person per day
[0369] Expected mean payments in next (policy) period per person
per day
[0370] Percent of enrollees joining during base period or leaving
during base period
[0371] Payment carve outs for capitation--if specific types of
services are paid by capitation (e.g., primary care, obgyn), then
risk factors need to be developed that will allow the group-level
model to
reduce the payments since the services are covered by the
capitation payments. Dummy risk factors for the presence or absence
of capitated payments by provider type will need to be included
when all services are not covered by fee for service payments.
[0372] 1206 A least square regression tree including selected
interaction terms as predictors (other data mining techniques that
develop and test numerous interactions such as neural networks,
rule induction, genetic algorithms, clustering techniques or other
methods could be used instead of regression trees) is developed on
the group-level data. This second level of modeling makes
adjustments for information not included at the person-level.
[0373] 1208 An ordinary least squares model (other types of
regressions, neural networks, or other types of predictive models
could be used instead of the OLS regression) is applied to the
predictor variables that were important in the model preceding this
step. The candidate predictor variables include the terminal nodes
as dummy variables and the main effects used to define the terminal
nodes.
[0374] 1210 The predicted values from the model in 1208 are the
average per person per day error (i.e., residual) in the estimate
of next period's payments for everybody in the group. This residual
is added to each person's next period expected payments from the
person-level models (subtracted if it is a negative value). The
model is developed on historical data that have no need for a cost
trend adjustment except to be annualized since the cost trend is in
the data. When the models will be used for setting prices for the
policy period, the inflation adjusted person-level next period
payment estimates are used as input and the groups are scored using
the group-level models. Risk factors are coded for the group using
the underwriting period data and the groups are scored with the
group-level model to produce the policy period expected group-level
costs.
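The group-level dependent variable of step 1204 and the adjustment of step 1210 can be sketched as below. This is only an illustration: the function names, the dictionary layout and the sample dollar amounts are assumptions, and the predicted residual is passed in as a number rather than produced by the regression tree and OLS models of steps 1206 and 1208.

```python
def group_residual_per_person_day(actual_total, forecast_total, n_people, days=365):
    """Next period residual dollars per person per day for one group."""
    if n_people <= 0 or days <= 0:
        raise ValueError("n_people and days must be positive")
    return (actual_total - forecast_total) / n_people / days


def adjust_person_forecasts(daily_forecasts, predicted_residual):
    """Add the group's predicted per person per day residual to each person's forecast.

    daily_forecasts maps a person identifier to expected dollars per day;
    a negative residual lowers every forecast in the group.
    """
    return {pid: cost + predicted_residual for pid, cost in daily_forecasts.items()}


# Hypothetical group: 300 people, $1,204,500 actual versus $1,095,000 forecast.
residual = group_residual_per_person_day(1_204_500, 1_095_000, 300)
adjusted = adjust_person_forecasts({"p1": 9.0, "p2": 11.5}, -0.5)
```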
[0375] Alternatively, the MAP4HIP method can be used to forecast
person-level cost for individual (or family) renewal health
insurance. The same methods apply but there is no "group" other
than the family. The cost for the individual family members are
summed to produce the family-level forecast. A family-level model
can be used for final cost adjustments. The family-level risk
factors are family composition, benefit plan, geographic locale and
other factors germane to the family rather than an employment
"group".
[0376] FIG. 13 is a detailed flowchart of an embodiment of a price
optimization procedure which may be used to carry out steps 110,
212, or 308 of FIGS. 1-3. The processing block procedures of FIG.
13 are:
[0377] 1302--The group cost estimate is the final output from the
cost estimation system (i.e., expected medical costs in the policy
period). It is at the group-level and includes the inflation trend
estimate.
[0378] 1304--The CI provides three sets of inputs that are used in
the price optimization. The first set of inputs is the CI's
expected probability of retaining the group if the group's price is
increased by a specified amount. Rate increases will not be
negative, generally, unless there is medical price deflation. Many
probability estimates are gathered at small price-increase steps
around the client's target profit, and fewer, more widely spaced
estimates are gathered further from the targeted profit margin. The
client needs
to consider the group's historical costs, inflation, local
competitive pricing, and other factors that influence the group's
likelihood of accepting the various price increases. Another
necessary input from the client is the administrative costs
allocable to that group. This cost may be expressed as a percentage
of the expected medical costs or in dollars per year. The final
input required is a minimum expected profit or profit margin that
is acceptable.
[0379] The following Table 3 is an example of price forecasting
using probability of retention and other related input data for
steps 1304, 1306, 1308 and 1310:
TABLE-US-00010 TABLE 3 Price Forecast Example
Price     Probability   Ratio Admin  Next Year  Next Year   Expected
Increase  of retention  to Cost      Price      Total Cost  Profit
0.00      0.95          0.25         1500       1375        118.75
0.02      0.92          0.25         1530       1375        142.60
0.04      0.90          0.25         1560       1375        166.50
0.06      0.85          0.25         1590       1375        182.75
0.08      0.80          0.25         1620       1375        196.00
0.10      0.73          0.25         1650       1375        200.75
0.12      0.68          0.25         1680       1375        207.40
0.14      0.63          0.25         1710       1375        211.05
0.16      0.58          0.25         1740       1375        211.70
0.18      0.53          0.25         1770       1375        209.35
0.20      0.45          0.25         1800       1375        191.25
0.25      0.35          0.25         1875       1375        175.00
0.30      0.25          0.25         1950       1375        143.75
0.35      0.15          0.25         2025       1375         97.50
0.40      0.05          0.25         2100       1375         36.25
0.45      0.01          0.25         2175       1375          8.00
0.50      0.00          0.25         2250       1375          0.00
[0380] The optimal price is $1740 per person or a 16% increase.
Costs are expected to be $1375/person and there is a 58% chance of
retaining the group. This yields $211.70 expected profit per
person.
[0381] 1306--The expected profit (or profit margin) is calculated
by the following formula: expected profit=(probability of accepting
price offered).times.[((1+proportion price increase).times.(price
in previous period))-(expected policy year medical
costs)-(administrative costs)].
This is the expected profit (margin is calculated by dividing by
the group's price) and it is calculated for each rate increase and
probability of retention or acceptance. The maximum expected profit
is the largest amount (or the closest to zero if they are negative)
calculated in the preceding step. The largest expected profit is
compared to the client's minimum acceptable expected profit.
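The expected profit formula of step 1306 can be sketched in a few lines using the retention curve of Table 3. The function names are hypothetical, and the split of the $1375 total cost into $1100 of medical cost and $275 of administrative cost is an assumption inferred from the 0.25 administrative ratio in the table.

```python
# (proportion price increase, probability of retention) pairs from Table 3.
RETENTION_CURVE = [
    (0.00, 0.95), (0.02, 0.92), (0.04, 0.90), (0.06, 0.85), (0.08, 0.80),
    (0.10, 0.73), (0.12, 0.68), (0.14, 0.63), (0.16, 0.58), (0.18, 0.53),
    (0.20, 0.45), (0.25, 0.35), (0.30, 0.25), (0.35, 0.15), (0.40, 0.05),
    (0.45, 0.01), (0.50, 0.00),
]


def expected_profit(increase, p_accept, previous_price, medical_cost, admin_cost):
    """Probability of acceptance times (offered price - medical cost - admin cost)."""
    offered_price = (1.0 + increase) * previous_price
    return p_accept * (offered_price - medical_cost - admin_cost)


def best_offer(curve, previous_price, medical_cost, admin_cost):
    """Return (increase, probability, expected profit) for the profit-maximizing row."""
    rows = (
        (inc, p, expected_profit(inc, p, previous_price, medical_cost, admin_cost))
        for inc, p in curve
    )
    return max(rows, key=lambda row: row[2])


# Assumed inputs: $1500 previous price, $1100 medical cost, $275 administrative cost.
increase, p_accept, profit = best_offer(RETENTION_CURVE, 1500.0, 1100.0, 275.0)
```

With these inputs the selected row is the 16% increase with a 58% retention probability, matching the optimum reported for Table 3. The returned profit would then be compared to the client's minimum acceptable expected profit.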
[0382] 1308--If the expected profit is below the minimally
acceptable, then the expected profit calculations are printed out
and the underwriter may run additional analyses to test the
sensitivity of the assumptions. Also, the price at which the
expected profit equals the minimally acceptable profit is printed
out. If the underwriter wants to modify the probabilities in the
retention curve, those are changed and 1304 is repeated.
[0383] 1310 If the maximum expected profit is greater than the
minimum acceptable profit, then the profit-optimizing price, its
percentage increase, expected costs and profits are printed out for
the underwriter along with the same output for non-optimal prices.
The underwriter would offer the price that maximizes expected
profit.
[0384] Another consideration when pricing the product is the
variability of the forecast cost for the policy year. Greater
variability should carry an additional risk premium. Therefore, the
standard error of the group's expected medical cost is calculated
and printed also. SAS or S Plus regressions will calculate the
variability of the mean or the standard error of the estimate of
the policy year cost by combining the standard errors of the
person-level forecasts. The price that provides a 90% (or some
other high probability) chance of break-even is calculated using
the standard error and printed. An underwriter can use the
break-even with a high probability price and the relative standard
error in negotiating price. If there is a large relative standard
error (e.g., standard error of group/average standard error), the
underwriter would be less inclined to discount the price in a
competitive market since the likelihood of a loss is increased.
Code for a program to run a pricing example is found in Appendix
F.
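The program of Appendix F is not reproduced here. As a rough sketch of the high-probability break-even calculation, the following assumes that the forecast error of the group's cost is approximately normally distributed, which is an assumption of this example and is not stated above. The function names and sample figures are hypothetical.

```python
from statistics import NormalDist


def break_even_price(expected_cost, standard_error, probability=0.90):
    """Price at which cost is at or below price with the given probability.

    Assumes normally distributed forecast error (an assumption of this sketch).
    """
    if not 0.0 < probability < 1.0:
        raise ValueError("probability must be strictly between 0 and 1")
    z = NormalDist().inv_cdf(probability)  # one-sided standard normal quantile
    return expected_cost + z * standard_error


def relative_standard_error(group_se, average_se):
    """Standard error of the group divided by the average standard error."""
    return group_se / average_se


# Hypothetical group: $1375 expected cost per person with a $100 standard error.
price_90 = break_even_price(1375.0, 100.0, 0.90)
```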
[0385] 1312--If the underwriter does not want to modify the
retention curve, the underwriter offers the group the price that
produces the minimally acceptable profit for the client even if the
group is expected to reject the offer.
[0386] 1314 The final step in pricing involves translating the
average price per person per day into a monthly price per
subscriber unit (e.g., single person, enrollee with spouse,
enrollee with two or more additional dependents--other subscriber
unit constellations are also possible). Costs are traditionally
presented in cost per member per month or pmpm. However, subscriber
units are used for pricing and it is important that costs are
rationally allocated to the subscriber units. The price is
multiplied by 365/12 to calculate the monthly price (or rescaled
for another time period). One alternative for pricing the
subscriber units is to calculate the mean cost forecast per
subscriber unit for the group and then inflate each mean subscriber
cost by the average profit margin for the group (i.e., recommended
optimal price/expected cost). The mean cost forecast per subscriber
unit is calculated by summing the forecast cost per person for each
person that is a member of that type of subscriber unit in the
underwriting period and then dividing that sum by the number of
subscribers of that type (not people) in the underwriting period.
This gives the group's mean daily cost per subscriber for each
different type of subscriber unit. Another pricing alternative is
to set the price for the subscriber units that are considered to be
very price sensitive just below the market price. The remaining
subscriber units must then be priced so that the overall expected
profit is maintained. That can be calculated by estimating the
expected profit for the market priced subscriber units and
subtracting it from the total expected profit for the group. The
other subscriber units must account for the remaining profit
requirement. Their price can be set so that the profit margin
equals the remaining profit requirement by solving the following
equation for price per subscriber unit: (total expected
profit-market priced subscriber profit)=remaining profit=(number
remaining subscriber units).times.((price/remaining subscriber
unit)-(mean expected cost/remaining subscriber unit)). Solving the
equation provides an average price/remaining subscriber unit or
(price/remaining subscriber unit)=((remaining profit)/(number
remaining subscriber units))+(mean expected cost/remaining
subscriber unit). If there are two or more remaining subscriber
units, the price can be pro rated based on the average forecast
cost/remaining subscriber unit. This approach can be used for
pricing stop loss medical insurance also. Alternative allocations of
profits to subscriber groups are possible. Those of ordinary skill
will appreciate that the relation of expected cost to the terms of
the medical insurance will vary among insurance types. For example,
first dollar products will have higher expected costs than
stop-loss products.
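The monthly rescaling and the remaining subscriber unit equation of step 1314 can be illustrated as follows. The function names and the sample amounts are hypothetical; the arithmetic follows the equation given above.

```python
def monthly_price(daily_price):
    """Rescale a per day price to a monthly price (365/12 days per month)."""
    return daily_price * 365 / 12


def remaining_unit_price(total_expected_profit, market_priced_profit,
                         n_remaining_units, mean_cost_per_remaining_unit):
    """Average price per remaining subscriber unit that preserves total expected profit."""
    if n_remaining_units <= 0:
        raise ValueError("there must be at least one remaining subscriber unit")
    remaining_profit = total_expected_profit - market_priced_profit
    return remaining_profit / n_remaining_units + mean_cost_per_remaining_unit


# Hypothetical group: $10,000 total expected profit, $2,500 of it earned on the
# market priced units, 50 remaining units with a $400 mean expected cost each.
price = remaining_unit_price(10_000.0, 2_500.0, 50, 400.0)
```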
[0387] Estimating costs that need to be considered for reserves for
first dollar health insurance and for stop loss coverage are
alternative uses for the cost forecasting process. Rather than
predicting payments that will occur over the entire policy period,
reserving requires predicting costs that will occur in the upcoming
financial reporting period (e.g., fiscal year or quarter). The same
cost forecasting process using data collection and validation, risk
factors, data mining and statistical techniques at the person and
group-levels, testing and reporting can be applied to produce cost
estimates to be used in setting reserves. The dependent variable
needs to be changed so that the reserving model is calibrated to
the appropriate time period.
[0388] The model for reserving forecasts costs that have been
incurred but not reported (IBNR), and this may include some costs of
claims that have not occurred yet but fall within the financial
reporting period. Typically, the reserving period will run through
the end of the current fiscal quarter or year. Inflation needs to
be accounted for, and although the time period is far shorter than
for the renewal cost forecast product, the same techniques apply
over the shortened time period.
[0389] A development period model is calibrated using the risk
factors from the claims and enrollment data in a base period to
forecast total incurred claims for the financial reporting period.
The underwriting period for reserving can be the previous 12 months
of claims (if available) preceding the reserving date or some other
time period such as this policy period to the reserving date. The
base period for the developmental model must have approximately the
same number of days as the underwriting period so the forecast will
not be biased. The policy period for IBNR claims begins at the
first date of the financial reporting period and ends at the last
day of the reporting period. The next period for the model
development cost for IBNR or claims that have not occurred yet must
be of the same length as the actual reserving period during the
policy period for correct model calibration. This is a standard
person-level model for MAP4HIP, possibly with a shorter next period
(e.g., a quarter). The total forecast claims are summed to provide
a total claim amount forecast. This is used as an independent
variable and is supplemented by additional independent variables
that include the reported claims, historical completion rates by
time into the reserving period, claims backlogs and seasonality.
The total of the IBNR claims from the reserving period is the
dependent variable. Note that this model is at the book of business
level. A quarter will yield only one data point for the book of
business. If there are too few quarters for developing a stable
model, an alternative approach is recommended.
[0390] The alternative approach defines reserves as the difference
between the total claim forecast for the reserving period and the
incurred and reported claims during that period. In other words,
the sum of the incurred and reported claims is subtracted from the
total forecast claims and this equals the reserve forecast.
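The alternative reserving approach reduces to a single subtraction, sketched below with a hypothetical function name and hypothetical amounts.

```python
def reserve_forecast(total_forecast_claims, reported_claims):
    """Reserve = total claim forecast for the period minus incurred and reported claims."""
    return total_forecast_claims - sum(reported_claims)


# Hypothetical reserving period: $1,000,000 forecast, two reported claim batches.
reserve = reserve_forecast(1_000_000.0, [600_000.0, 250_000.0])
```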
[0391] The reserving product can be delivered as a service bureau
product or as software, either stand alone or an ISP model, using
the same data flows as used with the cost forecasting models for
fully insured or stop loss coverage. The pricing module is not
relevant for reserving.
[0392] The fully insured medical product uses claims information as
a critical component of the cost forecasting model. Claims are
available if the group is renewing first dollar health insurance
but not for a new group. Enrollment data may be available for new
groups (possibly only for employees) or individual health
insurance. The same process can be applied to new groups or
individual (or family but called by convention individual) policies
by using the method for the people with no claims and only
enrollment data. The base period enrollment data must contain the
same potential risk factors as are available for the new groups.
Note that there is only one model since there are no claims data so
people cannot be separated into claims and no claims people in the
base or underwriting periods. The cost forecasting model should be
developed on the client's current book of business. The dependent
variable is next period's payments. The independent variables are
the same as the risk factors used in the no claims model (i.e.,
detailed enrollment data only). The modeling universe includes
everybody rather than only those with no claims. Sometimes claims
data are available for high cost cases in the new group and also
may include the demographics and diagnoses associated with those
high cost cases. This information can be included as person-level
risk factors but the same information will need to be included as
potential person-level risk factors in the base period for the
development model. A group-level model can be applied to the
summarized group-level data as with renewal business. Frequently
the total cost for the new group last year is available and may be
used as a risk factor for the group-level model. The total group
cost would then need to be included in the base period as a
potential risk factor also.
[0393] The fully insured new business cost forecasting and pricing
product can be delivered as a service bureau product or as
software, either stand alone or an ISP model, using the same data
flows as used with the cost forecasting models for fully insured or
stop loss coverage.
[0394] Aggregate only medical stop loss insurance, such as CapCost,
can have different data sources than fully insured insurance (where
the data is held and owned by the insurance company), as a TPA pays
the claims and holds the data for the self-insured employer. It is
our intent to get the data for all of the TPA's groups so that our
client, the stop loss insurer, can bid on all of the groups
serviced by the TPA. Therefore, any renewal business for the TPA
can use the full cost forecasting models. New business for the TPA
will not have claims data available. The enrollment data only new
business model cost forecasting technique is applicable for new
business for the TPA. The enrollment data are needed for the new
group. Future refinements will include combining the historical
payments, summarized by month or quarter, with the enrollment
information since person-level claims will not be available.
[0395] In order to understand the performance of CapCost versus the
traditional specific plus aggregate stop loss insurance, we had to
create synthetic groups since our database only contained 116
actual groups of very different sizes. Monte Carlo random samples
were developed for synthetic groups of 50, 100, 250, 500, 750,
1000, and 1500 employees plus their dependents. A group of 50
employees is smaller than the smallest employer in the target
market and 1500 employees is toward the upper end of the target
market for stop loss health insurance. Five hundred random groups
were selected with replacement. All family members of the employees
were included in the group. The claims payments were calculated for
traditional $50,000 specific with 125% aggregate exclusive of
specific and for CapCost 110.TM.. CapCost 110.TM. is aggregate only
at 110% of the attachment point. TruRisk models were applied to
forecast next year's claim payments. CapCost 110.TM. medical claims
payments for groups of 50 employees are about 80% of the claims paid
out for traditional $50,000 specific plus 125% aggregate stop
loss. Once there are 250 or more employees the CapCost 110.TM.
claims pay out is less than 50% of the traditional stop loss
coverage. Similar results were seen for $25,000 specific and
$75,000 specific both plus 125% aggregate coverage. The pay out for
CapCost 110.TM. is much lower for $25,000 specific plus 125%
aggregate and closer to the $75,000 specific plus 125% aggregate.
The mean and standard deviation are presented in TABLE 6 for three
different size groups. 125% aggregate is included with each of the
specific coverages. The mean claims paid out are less with CapCost
110.TM. and the standard deviation is smaller than with traditional
stop loss coverage. The main factor causing this is the far lower
frequency of claims with CapCost 110.TM. (18-26% of groups) as
compared to traditional specific plus aggregate coverage (87-98% of
groups). When a claim was made with CapCost 110.TM. coverage, it
was greater and the standard deviation was also greater than for
claims with traditional stop loss coverage.
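The construction of synthetic groups can be sketched as below. This is an illustration only: the text does not specify the exact sampling scheme, so drawing employees with replacement within each group, the data layout and the function name are all assumptions.

```python
import random


def synthetic_groups(families, group_size, n_groups=500, seed=0):
    """Build synthetic groups by sampling employees with replacement.

    families maps an employee identifier to the list of covered members
    (the employee plus dependents); every family member of a drawn employee
    is included in the group.
    """
    rng = random.Random(seed)
    employee_ids = sorted(families)
    groups = []
    for _ in range(n_groups):
        drawn = rng.choices(employee_ids, k=group_size)  # with replacement
        groups.append([(emp, families[emp]) for emp in drawn])
    return groups


# Hypothetical database of 20 employees, each with one dependent.
families = {f"e{i}": [f"e{i}", f"e{i}-dep"] for i in range(20)}
groups = synthetic_groups(families, group_size=50, n_groups=5, seed=1)
```

Claim payments under each coverage would then be computed per synthetic group and summarized across groups.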
[0396] The claims paid out for CapCost 110.TM. and traditional stop
loss are highly correlated:
R=0.95 for 250 employees with $25,000 specific and 125% aggregate
R=0.91 for 500 employees with $50,000 specific and 125% aggregate
R=0.87 for 750 employees with $75,000 specific and 125% aggregate
The risks or claims paid out are correlated but lower for CapCost
110.TM. since the claim frequency is far lower with that
coverage.
[0397] An aggregate only policy can be underwritten using the
group-level experience for credible groups. However, it is very
important to accurately estimate the group's costs for next year
since that determines the 110% attachment point. Therefore, the
MAP4HIP cost forecasting method is recommended as the preferred
embodiment since the predicted mean cost is more accurate than the
predicted mean cost derived using the standard approach with
group-level experience as predictor. The same steps are taken in
developing the models for CapCost as are used with the general
MAP4HIP process. The only difference is the variety of TPAs as
multiple data sources versus one CI with fully insured medical.
Person-level and group-level models are developed for cost per
person per day. The risk factors, statistical methods and dependent
variables are the same. The attachment point needs to be set to the
appropriate amount (e.g., a 110% attachment point is calculated by
multiplying the cost trend adjusted forecast cost by 1.1).
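The attachment point calculation can be sketched as follows. The payout function is an assumption of this example: it treats the aggregate only claim as the group's actual cost in excess of the attachment point, which is the usual meaning of aggregate coverage but is not spelled out above. All names and amounts are hypothetical.

```python
def attachment_point(trended_forecast_cost, attachment_ratio=1.10):
    """Attachment point as a multiple of the cost trend adjusted forecast cost."""
    return trended_forecast_cost * attachment_ratio


def aggregate_only_claim(actual_cost, trended_forecast_cost, attachment_ratio=1.10):
    """Assumed payout: actual group cost in excess of the attachment point, else zero."""
    excess = actual_cost - attachment_point(trended_forecast_cost, attachment_ratio)
    return max(0.0, excess)


# Hypothetical group with a $1,000 trended forecast cost per person.
point = attachment_point(1_000.0)
```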
[0398] The aggregate only cost forecasting product can be delivered
as a service bureau product or as software, either stand alone or
an ISP model, using the same data flows as used with the cost
forecasting models for fully insured coverage.
TABLE-US-00011 TABLE 6
                  250 employees     500 employees     750 employees
                  CapCost  $25,000  CapCost  $50,000  CapCost  $75,000
                           spec              spec              spec
total 500 groups
  $/employee      229      582      104      278      87       212
  std. dev.       681      789      301      385      229      303
group claims > 0
  % groups > 0    26.40%   98.20%   18.20%   89.20%   20.80%   87.00%
  # groups > 0    132      491      91       446      104      435
  $/employee      867      592      569      311      419      243
  std. dev.       1099     791      483      394      336      312
  minimum         2.08     0.3      8.04     6.34     5.32     2.63
  maximum         6479     7066     1823     1921     2027     2026
[0399] The MAP4HIP method can be used for cost forecasting for
specific stop loss coverage. Specific stop loss pays for claims
above a specified threshold (i.e., the deductible). Those claims
costs can be forecast using the same techniques that MAP4HIP uses
for forecasting outlier amounts. First, the forecast inflation or
cost trend adjustment for the policy period must be applied to the
model development data. This is a different order of steps from the
standard MAP4HIP sequence but it is necessary due to the specific
deductible. For example, if there is a $50,000 deductible and a 10%
cost trend then a $50,000 claim in the next period would yield a $0
specific claim. If that claim occurred in the policy period after
10% inflation it would produce a $5,000 specific claim
($50,000.times.1.1=$55,000 subtracting the $50,000 deductible
yields a $5,000 specific claim). Inflation during the lag period
must be added also and inflation built into the development model
must be divided out to provide accurate future cost estimates for
modeling specific claims. After the inflation adjustment for the
next period data, costs are then recalculated so that they are zero
if the person's claims are below the deductible in the next year
(similar to Winsorization). If costs total above the deductible,
then the specific cost is set to the amount above the deductible.
Probability models are developed for claims and no claims people in
the base period.
The probabilities are weighted by the average cost in the terminal
node (above the deductible) to produce the expected cost. The
person-level forecasts are summed to make the group-level forecast.
Group-level models with the same risk factors as MAP4HIP are
developed using the residual of the actual specific payments per
person per day minus the forecast specific costs. After development
period models are complete, they can be applied to data from an
underwriting period to develop cost forecasts for a policy
period.
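The order of operations for specific stop loss (trend first, then deductible) can be illustrated with the $50,000 example above. The function names are hypothetical, and the trend factor is assumed to be a single combined factor supplied by the caller after lag period inflation has been added and development model inflation divided out.

```python
def specific_claim(next_period_cost, trend_factor, deductible):
    """Specific claim: trended person-level cost in excess of the deductible, else zero."""
    trended_cost = next_period_cost * trend_factor
    return max(0.0, trended_cost - deductible)


def expected_specific_cost(probability_of_claim, mean_excess_in_node):
    """Probability of a specific claim weighted by the terminal node's mean excess cost."""
    return probability_of_claim * mean_excess_in_node


# The example from the text: a $50,000 claim, a $50,000 deductible, a 10% trend.
untrended = specific_claim(50_000.0, 1.00, 50_000.0)
trended = specific_claim(50_000.0, 1.10, 50_000.0)
```

Without the trend the claim produces no specific cost; with the 10% trend it produces approximately $5,000, which is why the trend must be applied before the model is calibrated.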
[0400] Aggregate stop loss is frequently added to specific
coverage. The aggregate coverage with specific coverage is paid
exclusive of specific claims and specific claims are not used in
defining the attachment point. Therefore, aggregate stop loss (with
specific coverage also) claim amount can be modeled using the
inlier methods in the MAP4HIP method. The Winsorization point is
the specific deductible. As with specific, the cost trend forecast
for the policy period must be applied to the next period data prior
to the inlier calculations. Only inliers are modeled since the
specific costs will be borne by the specific coverage. Both the
specific and aggregate with specific should be modeled and priced
separately. Note that this is different from aggregate only stop
loss coverage since all costs contribute to the attachment point
and aggregate claim amount for aggregate only stop loss
coverage.
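For aggregate coverage written together with specific coverage, the inlier amount is the trended cost capped at the specific deductible. A minimal sketch, with a hypothetical function name and a caller-supplied trend factor:

```python
def inlier_cost(next_period_cost, trend_factor, specific_deductible):
    """Trended person-level cost Winsorized at the specific deductible."""
    return min(next_period_cost * trend_factor, specific_deductible)


# Hypothetical persons under a $50,000 specific deductible.
capped = inlier_cost(60_000.0, 1.00, 50_000.0)
uncapped = inlier_cost(20_000.0, 1.10, 50_000.0)
```

Under this sketch the inlier amount and the excess above the deductible add up to the trended cost, so modeling the two pieces separately does not count any dollar twice.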
[0401] The specific cost forecasting and specific plus aggregate
cost forecasting products can be delivered as a service bureau
product or as software, either stand alone or an ISP model, using
the same data flows as used with the cost forecasting models for
fully insured coverage.
[0402] Group short term disability insurance (STD) is insurance
that pays a portion of an employee's wages (typically 50-100%), a
flat amount or the lesser of the portion or the flat amount when an
employee is disabled due to a non-work related accident, sickness
or pregnancy. The duration of the salary replacement is typically
13, 26 or 52 weeks. The MAP4HIP method can be applied to forecast
STD payments with a few modifications. The potential risk factors
are the same as the risk factors used with medical insurance and
described in section 806 with the additional risk factors of number
of STD days and payments in the base and underwriting periods and
job classification when these data are available. Otherwise, the
exact same potential risk factors as used with MAP4HIP can be
linked to the STD days next year and modeled using the MAP4HIP
modeling techniques and processes. The dependent variable in the
model development database is the number of STD days in the next
period. In other words, the medical claims and STD days in the base
period are linked in the database to STD days in the next period
for the same person and a STD day forecasting model for the next
period is developed. The interaction capturing techniques and other
modeling methods are the same as for medical claims but it is
unlikely that the data need to be Winsorized and outliers modeled
separately since STD is capped at a short period. The development
model is applied to score the actual underwriting period data to
calculate the expected number of STD days during the policy period,
from which the forecast claim amount is calculated. The expected
number of STD
days needs to be weighted by the expected cost per STD day. This
can be calculated by averaging the STD cost per day in the
underwriting period and increasing it by wage inflation and
multiplying it by the expected number of STD days. Alternatively
and preferably, each person's salary or flat rate benefit is linked
to the database and the forecast STD days are multiplied by the STD
per day benefit amount (i.e., portion of salary covered by STD) and
increased by the salary inflation history. The STD cost per person
is summed to produce the group's expected cost. Confidence bounds
can be calculated for the number of expected STD days to provide a
range of high to low cost for the group. A group-level model is
built using the same group characteristics as with MAP4HIP and
possibly supplemented with characteristics of the benefit plan. The
group-level dependent variable is residual STD days per person
weighted by the mean cost per person per day to calculate the
forecast claim amount.
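The preferred person-level STD cost calculation can be sketched as follows. The function name, the tuple layout and the sample figures are hypothetical, and a single wage inflation rate is assumed for the whole group.

```python
def std_group_cost(people, wage_inflation):
    """Sum of forecast STD days x per day benefit, increased by wage inflation.

    people is an iterable of (forecast_std_days, daily_benefit) pairs.
    """
    return sum(days * benefit * (1.0 + wage_inflation) for days, benefit in people)


# Hypothetical group of two people with 3% wage inflation.
group_cost = std_group_cost([(2.0, 100.0), (0.5, 200.0)], wage_inflation=0.03)
```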
[0403] The STD cost forecasting product can be delivered as a
service bureau product or as software, either stand alone or an ISP
model, using the same data flows (with STD days and salary
information added) as used with the cost forecasting models for
fully insured coverage.
[0404] Long term disability insurance (LTD) is wage replacement
insurance for disabilities that run longer than STD coverage and
may continue until the insured is 65 years old. Group LTD coverage
is for a policy period that is typically one year. The insurer does
not bear the cost of continuing disability liability from previous
periods unless it was the insurer for that period also. The insurer
will bear the cost for new long term disabilities that occur during
the policy period and will continue to be responsible for that cost
until the coverage expires (e.g., the beneficiary dies or turns 65
years old) or the beneficiary can go back to work. The probability
of a LTD claim occurring during the policy period (i.e., the
dependent measure) can be modeled and forecast using linked medical
and LTD claims at the person-level. The base period risk factors
are the same as the STD model, including medical claims, and STD
claims with the addition of LTD claims linked, recoded and used as
supplemental risk factors when available. The forecasting model can
be built using only medical claims and enrollment information.
Logistic regression, regression tree or hybrid tree with terminal
nodes feeding into a logistic regression (the hybrid tree being the
preferred embodiment) are the statistical techniques for modeling
the incidence rate of LTD claims during the next period (typically
one year). Other interaction capturing techniques can be used to
predict the incidence rate but must be appropriate for modeling a
variable that is bounded by 0 and 1. The development model is
applied to underwriting period data to calculate the expected
probability of a LTD claim during the policy period. The
probabilities need to be weighted by the expected net present value
of the disability to estimate the total cost of the disability
(i.e., the claim amount). The net present value of the disability
cost is obtained from actuarial tables. The expected costs are
summed across the group members to produce the expected group cost.
The net present value needs to be derived from other databases and
should be conditionalized on the cause of the disability since the
cost will vary depending on the cause. The cause of the disability
can be estimated by the clinical conditions defining the terminal
node of the person. A more accurate total cost of the disability
will be calculated if the weights are conditionalized on the cause
of the disability.
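The weighting step above can be sketched as follows, assuming each person carries a forecast LTD probability and a likely cause of disability taken from his or her terminal node. The function name `expected_ltd_cost`, the cause labels, and the net present values are hypothetical placeholders; in practice the values would come from actuarial tables or other databases, as the text states.

```python
def expected_ltd_cost(members, npv_by_cause):
    """Sum probability-weighted net present values across group members.

    members: iterable of (ltd_probability, likely_cause) pairs.
    npv_by_cause: mapping from cause label to net present value of a claim.
    """
    return sum(prob * npv_by_cause[cause] for prob, cause in members)


# Hypothetical figures for illustration only.
npv_by_cause = {"musculoskeletal": 80_000.0, "cancer": 150_000.0}
group = [
    (0.010, "musculoskeletal"),
    (0.020, "cancer"),
    (0.005, "musculoskeletal"),
]
group_cost = expected_ltd_cost(group, npv_by_cause)
print(round(group_cost, 2))
```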
[0405] If a good estimate of the net present value of the future
cost or length of the disability is not available for the various
terminal nodes, then an index can be calculated. This index is the
expected number of new disabilities for the group during the policy
period divided by the "average" number of disabilities calculated
using standard actuarial techniques for new business for LTD. A
confidence interval can be calculated for the expected number of
disabilities using the expected probability of disability per
person and computing the upper and lower bounds for the group by
using a Lexian distribution that calculates the exact
probabilities. A binomial distribution can be used but the
confidence interval will not be exact since it assumes that
everybody has the same average probability within the group.
Groups that have a confidence interval that does not cover the
"average" calculated from standard actuarial techniques are
significantly higher or lower in risk and should be priced
differently than the average group. Alternatively and preferably,
the group's distance from the mean expected number of LTD cases,
measured in standard deviations, can be calculated using one of the
distributions above. The number of standard deviations from the
mean is a scale that can be used for pricing. The end points of the
scale can be anchored by market prices for the lowest and highest
risk groups or by actual historical LTD experience,
conditionalized on group size.
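The index, the confidence interval, and the standard deviation scale described above can be sketched in Python as follows. The sketch assumes that the "Lexian distribution" refers to the exact distribution of the number of events among independent people who each have their own probability (elsewhere called the Poisson binomial distribution); that reading, the function names, the choice of a central interval, and the example probabilities are all assumptions made for illustration.

```python
import math


def count_distribution(probs):
    """Exact PMF of the event count for independent, person-specific probabilities."""
    pmf = [1.0]  # with zero people, the count is 0 with certainty
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)  # this person has no event
            nxt[k + 1] += mass * p      # this person has an event
        pmf = nxt
    return pmf


def central_interval(pmf, level=0.95):
    """Counts that bracket the central `level` share of probability mass."""
    tail = (1.0 - level) / 2.0
    cum, lower = 0.0, 0
    for k, mass in enumerate(pmf):
        cum += mass
        if cum > tail:
            lower = k
            break
    cum, upper = 0.0, len(pmf) - 1
    for k in range(len(pmf) - 1, -1, -1):
        cum += pmf[k]
        if cum > tail:
            upper = k
            break
    return lower, upper


def risk_summary(probs, actuarial_average):
    """Index and distance (in standard deviations) from the actuarial average."""
    expected = sum(probs)
    sd = math.sqrt(sum(p * (1.0 - p) for p in probs))
    return {
        "index": expected / actuarial_average,
        "z": (expected - actuarial_average) / sd,
    }


# Hypothetical group: 50 low-risk and 10 higher-risk people.
group_probs = [0.01] * 50 + [0.05] * 10
pmf = count_distribution(group_probs)
low, high = central_interval(pmf, level=0.95)
summary = risk_summary(group_probs, actuarial_average=0.6)
print(low, high, summary)
```

A binomial approximation would replace `count_distribution` with a single shared probability, which is why the text notes that its interval is not exact.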
[0406] The LTD cost forecasting product can be delivered as a
service bureau product or as software, either stand alone or an ISP
model, using the same data flows (with the addition of STD and LTD
claims and salary information) as used with the cost forecasting
models for fully insured coverage.
[0407] Group term life insurance is very similar to group
disability: it is for a policy period (usually one year), and the
coverage and rates are typically not guaranteed beyond that period.
Unlike LTD, the death benefit is a one-time payment for a known
amount (the amount is usually a multiple of salary up to a limit)
so there is no uncertainty over the size of the benefit. Therefore,
knowing the expected number of deaths (weighted by the amount of
the life insurance) will provide an accurate estimate of the cost
of that group. Alternatively, a relative risk index can be
calculated in the same manner as with LTD. The numerator is the
expected number of deaths (possibly weighted by the death benefit)
and the denominator is the "average" number of deaths (possibly
weighted by the death benefit) where the average is calculated
using the age by sex distribution and standard life tables
calculated by actuaries. The significance of the index can be
calculated using the Lexian (preferably) or binomial distributions
for the person-level probabilities and testing if the average is
covered by the confidence bounds for the group. Groups whose
confidence bounds do not cover the average should have higher
or lower rates than average. Groups with large confidence intervals
should be charged more than groups with small confidence intervals,
all other factors being equal.
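A minimal sketch of the benefit-weighted relative risk index described above follows. It assumes each person has a model-based death probability, a death benefit, and an age band and sex used to look up a life-table rate. The function name, the record layout, and every rate and benefit shown are hypothetical values chosen for illustration, not actuarial figures.

```python
def benefit_weighted_index(members, table_rate):
    """Ratio of model-expected to life-table-expected death benefit payouts.

    members: list of dicts with keys 'p', 'benefit', 'age_band', 'sex'.
    table_rate: mapping (age_band, sex) -> life-table death probability.
    """
    expected = sum(m["p"] * m["benefit"] for m in members)
    average = sum(
        table_rate[(m["age_band"], m["sex"])] * m["benefit"] for m in members
    )
    return expected / average


# Hypothetical life-table rates and group members.
table_rate = {("45-54", "M"): 0.004, ("25-34", "F"): 0.001}
members = [
    {"p": 0.008, "benefit": 100_000.0, "age_band": "45-54", "sex": "M"},
    {"p": 0.001, "benefit": 50_000.0, "age_band": "25-34", "sex": "F"},
]
index = benefit_weighted_index(members, table_rate)
print(round(index, 3))
```

An index above 1 indicates a group expected to cost more than its age by sex distribution alone would suggest; whether the difference is significant would still be tested against the group's confidence bounds.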
[0408] The same approach for developing the person-level
probability models is used for life insurance as is used for LTD.
Medical claims from a base period are linked with deaths occurring
in the next period for a very large block of business. The risk
factors are the same as or developed using a similar technique as
used with the medical cost forecasting models. The dependent
variable is the probability of death. The same interaction
capturing techniques used for the LTD probability model are used
for the life insurance model (i.e., the preferred embodiment is the
hybrid probability tree). The development model is applied to
medical claims during an underwriting period and death forecasts
are calculated for the policy period. The probability of death is
weighted by the death benefit to calculate the forecast claim
amount per person. The claim amounts are summed across people in
the group. A group-level model can be developed that uses the sum
of the probabilities (i.e., the number of expected deaths), actual
number of deaths in the base period and the number and amount of
STD and LTD claims to supplement the risk factors used in a
standard MAP4HIP group-level model, when available. Otherwise, the
same medical claims and enrollment information used with MAP4HIP
will suffice. The dependent measure is the forecast number of
deaths and is weighted by the expected death benefit per person to
calculate the forecast claim amount.
[0409] The group term life insurance death rate and claim amount
forecasting products can be delivered as a service bureau product
or as software, either stand alone or an ISP model, using the same
data flows (preferably supplemented with the addition of death and
salary information) as used with the cost forecasting models for
fully insured medical coverage.
[0410] While the present invention has been described with respect
to specific embodiments, it will be appreciated that various
alternatives and modifications will be apparent based on the
present disclosure, and are intended to be within the spirit and
scope of the following claims.
TABLE-US-00012 APPENDIX G Data Elements & Descriptions For
Software Of CD-ROM Appendix
Columns, in order: Field Names, Descriptions, Legal Values
abdpain Abdominal pain or dxvar = `7890` 1, 0 abheart Abnormal
heart sounds or `7850` <= dxvar <= `7853` 1, 0 acne Acne or
`706` <= dxvar <= `7061` 1, 0 actinseb Actinic and seborrheic
keratosis or `702` <= dxvar <= `70219` 1, 0 acubronc Acute
bronchitis and bronchiolitis-dx = 466 1, 0 acuphary Acute
pharyngitis-dxvar = `462` 1, 0 acusinu Acute sinusitis-dxvar =
:`461` 1, 0 acutonsl Acute tonsillitis-dxvar = `463` 1, 0 add
Attention deficit disorder-dxvar = : `3140` 1, 0 agebrk35 age 35+ 1
(35+), 0 (35 under) agegp 0-0.9 then agegp = `a`; 1-4.9 then agegp
= `b`; agegroups values = a-k 5.0-17.9 then agegp = `c`; 18-24.9
then agegp = `d`; 25-34.9 then agegp = `e`; 35-44.9 then agegp =
`f`; 45-54.9 then agegp = `g`; 55-64.9 then agegp = `h`; 65-74.9
then agegp = `i`; 75-84.9 then agegp = `j`; ge 85 then agegp = `k`;
agesq Age Squared ahypothy Acquired hypothyroidism-dxvar =: `244`
1, 0 aidstest AIDS-cpt testing codes if cpts{i} in 0, 1, 2 . . .
number of (`86687`, `86701`, `86702`, `86703`, `86688`, `86689`)
then tests aidstest = sum (aidstest, 1); alcohdep Alcohol
dependence syndrome-dxvar = :`303` 1, 0 alerhin Allergic rhinitis
or dxvar = :`477` 1, 0 amt generic-test purposes 1, 0 anemia
Anemia-`280` <= dxvar <= `2859` 1, 0 anginap Angina pectoris
or dxvar = :`413` 1, 0 antitemp temporary to assist in coding
prenatal cpts{i} in (`59425`, `59426`) 1, 0 anxiety Anxiety
states-dxvar = :`3000` 1, 0 artery Dis of the arteries, arterioles,
and capillaries-`440` <= dxvar <= `4489` 1, 0 arthero
Coronary atherosclerosis or dxvar = :`4140` 1, 0 artipost
Artificial opening status and oth postsurgical states or `V44`
<= dxvar <= 1, 0 `V4589` assault Assault or `E960` <=
dxvar <= `E969` 1, 0 asthma Asthma-dxvar = : `493` 1, 0 attsurgd
Attention to surgical dressing and sutures or dxvar = `V583` 1, 0
bargain BARGAIN STATUS- ? H, S basecat a thru v-see
basecata-basecatv a . . . v basecata .0001 <= chgd <= .33999
1, 0 basecatb .34 <= chgd <= .48999 1, 0 basecatc .49 <=
chgd <= .70999 1, 0 basecatd .71 <= chgd <= 1.03999 1, 0
basecate 1.04 <= chgd <= 1.4999 1, 0 basecatf 1.5 <= chgd
<= 1.99999 1, 0 basecatg 2 <= chgd <= 2.59999 1, 0
basecath 2.6 <= chgd <= 3.44999 1, 0 basecati 3.45 <= chgd
<= 4.54999 1, 0 basecatj 4.55 <= chgd <= 5.9999 1, 0
basecatk 6 <= chgd <= 7.89999 1, 0 basecatl 7.9 <= chgd
<= 10.44999 1, 0 basecatm 10.45 <= chgd <= 13.7999 1, 0
basecatn 13.8 <= chgd <= 18.19999 1, 0 basecato 18.2 <=
chgd <= 23.99999 1, 0 basecatp 24 <= chgd <= 35.99999 1, 0
basecatq 36 <= chgd <= 53.99999 1, 0 basecatr 54 <= chgd
<= 80.99999 1, 0 basecats 81 <= chgd <= 121.49999 1, 0
basecatt 121.5000 <= chgd <= 181.99999 1, 0 basecatu 182
<= chgd <= 272.99999 1, 0 basecatv chgd ge 273 1, 0 baseclmn
# of claims in base pd baseclms presence of base claims - yes/no 1,
0 basemos # OF MONTHS in base period 1-12 baseyr Year associated
with base 2 digit yr - 95, 96, 97, 98 bdate birth date VALID DATE
benfitcd BENEFIT CODE 3 = medical claim bn_oth oth benign neoplasm-
(`210` <= dxvar <= `2159`) or (`217` <= dxvar <=
`2299`) 1, 0 bn_skin Benign neoplasm of skin-dxvar=:`216` 1, 0
bychggp .0001 <= chgd <= .33999 bychggp = 1; .34 <= chgd
<= .48999 bychggp = 2; 1 thru 22 .49 <= chgd <= .70999
bychggp = 3; .71 <= chgd <= 1.03999 bychggp = 4; 1.04 <=
chgd <= 1.4999 bychggp = 5; 1.5 <= chgd <= 1.99999 bychggp
= 6; 2 <= chgd <= 2.59999 bychggp = 7; 2.6 <= chgd <=
3.44999 bychggp = 8; 3.45 <= chgd <= 4.54999 bychggp = 9;
4.55 <= chgd <= 5.9999 bychggp = 10; 6 <= chgd <=
7.89999 bychggp = 11; 7.9 <= chgd <= 10.44999 bychggp = 12;
10.45 <= chgd <= 13.7999 bychggp = 13; 13.8 <= chgd <=
18.19999 bychggp = 14; 18.2 <= chgd <= 23.99999 bychggp =
15; 24 <= chgd <= 35.99999 bychggp = 16; 36 <= chgd <=
53.99999 bychggp = 17; 54 <= chgd <= 80.99999 bychggp =
18; 81 <= chgd <= 121.49999 bychggp = 19; 121.5000 <=
chgd <= 181.99999 bychggp = 20; 182 <= chgd <= 272.99999
bychggp = 21; chgd ge 273 bychggp = 22; calckidy Calculus of kidney
and ureter 1, 0 cancldte enrl cancel date candidia
Candidiasis-dxvar =: `112` 1, 0 carddysr Cardiac dysrhythmias-dxvar
= : `427` 1, 0 carpltun Carpal tunnel syndrome-dxvar = `3540` 1, 0
CAT4BASE see cat4base1 - 4-code as 1 thru 4 1, 2, 3, 4 cat4base1
group 1 of 4 set-base yr claim .0001 <= chgd <= 1.49999 1, 0
cat4base2 1.5 <= chgd <= 5.99999 1, 0 cat4base3 6.0 <=
chgd <= 23.999999 1, 0 cat4base4 chgd ge 24 1, 0 cataract
Cataract-dxvar = : `366` 1, 0 CATBMOS groupings of months in base
year (with or without chgs) for an ordered A thru F candidate
predictor variable WITH BASE CLAIMS 1 <= basemos <= 3 >
catbmos = `A: 1-3`; basemos = (4, 5) > catbmos = `B: 4-5`;
basemos = (6, 7) > catbmos = `C: 6-7`; basemos = (8, 9) >
catbmos = `D: 8-9`; basemos = (10, 11) > catbmos = `E: 10-11`;
basemos = (12) then catbmos = `F: 12`; WITHOUT BASE CLAIMS basemos
= 1 > catbmos = `A: 1`; basemos = (2, 3, 4, 5) > catbmos =
`B: 2-5`; basemos = (6, 7, 8) > catbmos = `C: 6-8`; basemos =
(9) > catbmos = `D: 9`; basemos = (10, 11) > catbmos = `E:
10-11`; basemos = (12) > catbmos = `F: 12`; celluabs Cellulitis
and abscess 1, 0 cerebrov Cerebrovascular disease-`430` <= dxvar
<= `4389` 1, 0 charge CHARGE CHEMO chemotherapy combines
(lspxchem, mcptchem) 1, 0 chestpn Chest pain or dxvar = :`7865` 1,
0 chf Congestive heart failure-dxvar = `4280` 1, 0 chg Charge in
Base Year chgage Base yr chg * Age CHGC2NZ base pd 2n chg category
chgcata Base charge category A 1, 0 chgd Base Pd Chg per Enrolled
Day chgdiff Pred Pd Chg-Base Pd Chg chgl log10 base pd charge chgp
Charge in Prediction Year CHGPC2NZ Nxy yr pd 2n chg category
chgpcata Next Year charge category A 1, 0 chgpd Pred Pd Chg per
Enrolled Day chgpl log10 pred pd charge chgps Spec Pred Pd chg
chgpw Pred Pd Chg Winsor $400k, if chgp > 400000 then chgpw =
400000 + (.5 * (chgp - 400000)); chgpwc Pred Charges w/ Claims
chgpwd Pred Pd Chg per Enrolled Day Winsor $400k chgpwl log10 pred
pd winsorized charge chgs if chg >50000 then chgs = chg - 50000;
else chgs = 0; chgsq Base yr chg squared chgt1 Charge in Base Year
Trimester 1 chgt2 Charge in Base Year Trimester 2 chgt3 Charge in
Base Year Trimester 3 chgw Base Pd Chg Winsor $400k chgwc Base
Charges w/ Claims chgwd96 Next Pd Chg per Enrolled Day Winsor $96pd
DOLLAR SPECIFIC TO SOURCE chgwl log10 base pd winsorized charge
chlamyd Unspecified viral and chlamydial infections-dxvar in
(`0799`, `07998`, `07988` 1, 0 chronbro Chronic and unspecified
bronchiolitis or `490` <= dxvar <= `491` 1, 0 chrsinu Chronic
sinusitis or dxvar = : `473` 1, 0 ckasrcl Check Age/Sex/Relation
cell, occurs when all sex/relationship variables 1, 0 f/m###en, kd
or cl are exhausted claimcde CLAIM CODE REQUIRE SOURCE INPUT
claimno CLAIM NBR REQUIRE SOURCE INPUT clmim1 # Claims October 1995
clmim10 # Claims July 1996 clmim16 # Claims January 1997 clmim22 #
Claims July 1997 clmim28 # Claims January 1998 clmim34 # Claims
July 1998 clmim39 # Claims December 1998 clmim4 # Claims January
1996 cmpms Comps of surg and med care, not elsewhere
classified-`996` <= dxvar <= 1, 0 `9999` cobgnpd Month #
Company Enrollment Starts 1-39 coendpd Month # Company Enrollment
Ends 1-39 coins COINSURANCE amount commdsr Potential health hazards
related to communicable Dis 1, 0 compcde enrl company REQUIRE
SOURCE INPUT compcode COMPANY CODE REQUIRE SOURCE INPUT complic
combines (cmpms, dxcomp, gadvmed) 1, 0 compname COMPANY NAME
REQUIRE SOURCE INPUT conderm Contact dermatitis and oth eczema 1, 0
conjunct Conjunctivitis-`3720` <= dxvar <= `3729` 1, 0
constip Constipation or dxvar = `5640` 1, 0 contus Contusions with
intact skin surfaces or `920` <= dxvar <= `9249` 1, 0 convuls
Convulsions or dxvar = : `7803` 1, 0 corncal Corns, callosities,
and oth hypertrophic and atrophic skin or `700` <= dxvar <=
1, 0 `7019` cough Cough or dxvar = `7862` 1, 0 cpt CPT CODE cutobjs
Cutting or piercing instruments or objects or dxvar = `E920` 1, 0
cycle Pedal cycle, nontraffic and oth or dxvar IN 1, 0 (`E8003`,
`E8013`, `E8023`, `E8043`, `E8053`, `E8063`, `E8073`, `E8206`,
`E8216`, `E8226`, `E8236`, `E8246`, `E8256`, `E8261`, `E8269`)
cystbldd Cystitis and oth dsrs of the bladder or `595` <= dxvar
<= `5969` 1, 0 cysturin combines(cystbldd, othurin) 1, 0 datechk
CHECK DATE REQUIRE SOURCE INPUT datefrom FROM DATE REQUIRE SOURCE
INPUT dateproc PROCESS DATE REQUIRE SOURCE INPUT daterpt REPORTED
DATE REQUIRE SOURCE INPUT datethru THRU DATE REQUIRE SOURCE INPUT
deduct DEDUCTIBLE REQUIRE SOURCE INPUT deltemp cpts{i} in (`59100`,
`59830`, `59430`) or `59120` <= cpts{i} <= `59160` or `59812`
<= 1, 0 cpts{i} <= `59821` or `59840` <= cpts{i} <=
`59857` or `59400` <= cpts{i} <= `59414` or `59510` <=
cpts{i} <= `59525` or dxs starting with (`V22`, `V23`) depnbr
DEP NBR 01 = enrollee depress Major depressive disorder- (`2962`
<= dxvar <= `2963`) 1, 0 dermtosi Dermatophytosis-dxvar =:
`110` 1, 0 diab combines (diabmell, dxdiabet) 1, 0 diabmell
Diabetes mellitus-dxvar = :`250` 1, 0 dial combines (lspxdial,
mcptdial) 1, 0 discdsr Intervertebral disc dsrs or dxvar = :`722`
1, 0 disstat DISCHARGE STATUS diverint Diverticula of intestine or
dxvar = :`562` 1, 0 dizzi Dizziness and giddiness or dxvar = `7804`
1, 0 dob date of birth VALID DATE dobpatn PATIENT BIRTH DATE VALID
DATE docspec DOCTOR SPECIALITY ABBR REQUIRE SOURCE INPUT doctype
DOCTOR TYPE REQUIRE SOURCE INPUT
drg DRG specify version drgaltst drug, alcohol, methadone usage
tsts (cpt) if (`80100` <= cpts{i} <= `80103`) or number of
tests (cpts{i} eq `82055`) or (`80150` <= cpts{i} <= `80299`)
then drgaltst = sum(drgaltst, 1); drugdep Drug dependence and
nondependent use of drugs-`304` <= dxvar <= `3059` 1, 0
dsranal Anal and rectal Dis or `569` <= dxvar <= `56949` 1, 0
dsrbone dsrs of bone and cartilage or `730` <= dxvar <=
`73399` 1, 0 dsrbrst dsrs of breast-`610` <= dxvar <= `6119`
1, 0 dsrear dsrs of external ear-dxvar = : `380` 1, 0 dsreyeld dsrs
of eyelids-`373` <= dxvar <= `3749` 1, 0 dsrgallb dsrs of the
gallbladder and biliary tract-`574` <= dxvar <= `5769` 1, 0
dsrlipid dsrs of lipid metabolism-dxvar = : `272` 1, 0 dsrmens dsrs
of menstruation and abnormal bleeding-dxvar = : `626` 1, 0 dsrrefra
dsrs of refraction and accommodation-dxvar = : `367` 1, 0 dx1/proc1
ICD-9-CM CODE specify version dx2/proc2 ICD-9-CM CODE 2 specify
version dx3/proc3 ICD-9-CM CODE 3 specify version dx4 ICD-9-CM CODE
4 specify version dx5 ICD-9-CM CODE 5 specify version dx6-40
ICD-9-CM Diagnosis (after aggregate) specify version dxabort DX
Abortion-"630" <= substr (dxvar, 1, 3) <= "639" 1, 0 dxblood DX
Blood-"280" <= substr (dxvar, 1, 3) <= "289" 1, 0 dxcircul DX
Circul System-"390" <= substr (dxvar, 1, 3) <= "459" 1, 0
dxcomp DX Complications of Care-"996" <= substr (dxvar, 1, 3)
<= "999" 1, 0 dxcondtn DX Condn Influence Health Status-"V40"
<= substr (dxvar,1,3) <= "V49" 1, 0 dxcongen DX Congenital
Anomaly-"740" <= substr (dxvar,1,3) <= "759" 1, 0 dxdiabet DX
Diabetes-"250" = substr (dxvar,1,3) 1, 0 dxdigest DX Digestive
System-"520" <= substr (dxvar,1,3) <= "579" 1, 0 dxdonor "V59" =
substr (dxvar,1,3) 1, 0 dxecode DX E-Code-"E01" <= substr
(dxvar,1,3) <= "E99" 1, 0 dxendocr DX Endocrine, Nutrition,
Metabolic-"240" <= substr (dxvar,1,3) <= "249" or "251"
<= substr (dxvar,1,3) <= "279" 1, 0 dxgu DX GU System-"580" <=
substr (dxvar,1,3) <= "629" 1, 0 dxinfec DX Infec &
Parasite-"001" <= substr (dxvar,1,3) <= "139" 1, 0 dxinjury
DX Injury-"800" <= substr (dxvar,1,3) <= "959" or "980"
<= substr (dxvar,1,3) <= "959" 1, 0 dxlvebrn DX Liveborn-"V30"
<= substr (dxvar,1,3) <= "V39" 1, 0 dxmental DX Mental-"290"
<= substr (dxvar,1,3) <= "319" 1, 0 dxmgest DX Multiple
Gestation-"651" = substr (dxvar,1,3) 1, 0 dxmskel DX Musculoskel
& connect tiss-"710" <= substr (dxvar,1,3) <= "739" 1, 0
dxneoben DX Neoplasm Benign-"210" <= substr (dxvar,1,3) <=
"229" 1, 0 dxneomal DX Neoplasm Malig-"140" <= substr (dxvar,1,3)
<= "209" 1, 0 dxnerves DX Nervous System-"320" <= substr
(dxvar,1,3) <= "359" 1, 0 dxob DX Preg, Childbirth, Puerp-"630"
<= substr (dxvar,1,3) <= "677" 1, 0 dxperhis DX Personal
History-dxvar: V10-V19 1, 0 dxperntl DX Perinatal-"760" <= substr
(dxvar,1,3) <= "779" 1, 0 dxpoison DX Poisoning 1, 0 dxpreg DX
Pregnancy-"640" <= substr (dxvar,1,3) <= "649" or "652"
<= substr (dxvar,1,3) <= "667" 1, 0 dxpregv DX Pregnancy
V-Code-"V20" <= substr (dxvar,1,3) <= "V29" 1, 0 dxresp DX Resp
System-"460" <= substr (dxvar,1,3) <= "519" 1, 0 dxsense "360"
<= substr (dxvar,1,3) <= "389" dxskin DX Skin &
Subcut-"680" <= substr (dxvar,1,3) <= "709" 1, 0 dxspecpx DX
Spec Procs & Aftercare-"V50" <= substr (dxvar,1,3) <= "V58"
1, 0 dxsymptm DX Symptoms, Signs, & Ill Defined-"780" <=
substr (dxvar,1,3) <= "799" 1, 0 dxvaccin DX Disease Contact or
Vaccine 1, 0 dxvgnldl DX Normal Delivery-"650" = substr (dxvar,1,3)
1, 0 dysp_pul combines (dyspnea, othopd) 1, 0 dyspnea Dyspnea and
respiratory abnormalities-dxvar= :`7860` 1, 0 effdte enrl eff date
VALID DATE encoconr Encounter for contraceptive management-dxvar=
:`V25` 1, 0 ENRLADDR1 address 1 CONFIDENTIAL enRLADDR2 Address 2
CONFIDENTIAL enrlarea AREA CODE CONFIDENTIAL enrlcity city
CONFIDENTIAL enrlm1 Enrolled October 1995 1, 0 enrlm10 Enrolled
July 1996 1, 0 enrlm16 Enrolled January 1997 1, 0 enrlm22 Enrolled
July 1997 1, 0 enrlm28 Enrolled January 1998 1, 0 enrlm34 Enrolled
July 1998 1, 0 enrlm39 Enrolled December 1998 1, 0 enrlm4 Enrolled
January 1996 1, 0 enrlphne phone number CONFIDENTIAL enrlst state
REQUIRE SOURCE INPUT enrollee Person is Enrollee 1, 0 enrrelfm
enrollee relationship enrrells ensagenc Age at end of year 0 Code,
.<age < 1 then ensagenc = "<1"; SEE DESCRIPTION 1 <=
age < 5 then ensagenc = "01-05"; 5 <= age < 18 then
ensagenc = "05-18"; 18 <= age < 25 then ensagenc = "18-25";
25 <= age < 45 then ensagenc = "25-45"; 45 <= age < 65
then ensagenc = "45-65"; 65 <= age < 80 then ensagenc =
"65-80"; 80 <= age then ensagenc = "80+"; ensxkd enrrells = 1
& ensex = M > ensxkd = A; enrrells = 1 & ensex = F >
ensxkd = B; A thru F enrrells = 2 & ensex = M > ensxkd = C;
enrrells = 2 & ensex = F > ensxkd = D; enrrells = (3, 4, 5,
6) > ensxkd = E; else ensxkd = F; entrost combines(artipost,
lspxentr, lspxgast) 1, 0 epistax Epistaxis dxvar = 7847 1, 0 esopha
Esophagitis dxvar = 5301 1, 0 esshyp Essential hypertension-dxvar=
:`401` 1, 0 excamt1 EXCLUSION AMT 1 excamt2 EXCLUSION AMT 2 excamt3
EXCLUSION AMT 3 excamt4 EXCLUSION AMT 4 exccatg1 EXCLUSION CATG
1-CATEGORY DEF - 1-coverage inelig, 2-medical necessity, 3-n/a,
4-deductibles, 5-coins, 6-cob, 7-medicare, 8-contract max,
9-duplicate, 10-n/a, 11-non-cov, 12-copay, 13-flexplan, 14-n/a,
15-exceeds sched, 16-alt proc, 17-panel contract fee, 18-n/a 1-18
exccatg2 EXCLUSION CATG 2 - see description of catg 1 see exccatg1
desc exccatg3 EXCLUSION CATG 3 - see description of catg 1 see
exccatg1 desc exccatg4 EXCLUSION CATG 4 - see description of catg 1
see exccatg1 desc exchg2a 2nd Highest month chg ADJacent to 1st`
exchg2b 2nd Highest month chg NOT ADJacent to 1st` exclh1 Base Year
Highest Monthly Pymt Per Day` exclh1ch Base Year Highest Monthly
chg Per Day` exclh2a Baseyr `2nd Highest Monthly Pymt ADJacent to
1st` exclh2b Baseyr `2nd Highest Monthly Pymt NOT ADJacent to 1st`
eyemix combines(cataract, lensrepl, retinldt, scpteye) 1, 0 f0105kd
Female 01-05 Child 1, 0 f0518kd Female 05-18 Child 1, 0 f1825en
Female 18-25 Enrollee 1, 0 f1825sp Female 18-25 Spouse 1, 0 f1865kd
Female 18-65 Child 1, 0 f2545en Female 25-45 Enrollee 1, 0 f2545sp
Female 25-45 Spouse 1, 0 f4565en Female 45-65 Enrollee 1, 0 f4565sp
Female 45-65 Spouse 1, 0 f4580ss Female 45-80 Widow 1, 0 f6580en
Female 65-80 Enrollee 1, 0 f6580sp Female 65-80 Spouse 1, 0 f80pen
Female 80+ Enrollee 1, 0 f80psp Female 80+ Spouse, Widow 1, 0 fall
Falls 1, 0 fam1p1c Family is 1 Par 1 Child 1, 0 fam1p2cp Family is
1 Parent 2+ Children 1, 0 fam2p1cp Family is 2 Parents 1+ Children
1, 0 famcoup Family is Couple 1, 0 famdau Daughter in Family 1, 0
famempo Family Employee Only 1, 0 famenr Enrollee in Family 1, 0
famlst trimn(famlst)||enrrells; 1, 0 famnkid # of Kids per Enrollee
1, 0 famofem Oth Female in Family 1, 0 famomal Oth Male in Family
1, 0 famsdau Step Daughter in Family 1, 0 famsize # Covered Lives
Per Enrollee COUNT famson Son in Family 1, 0 famspse Spouse in
Family 1, 0 famsson Step Son in Family 1, 0 famsurv Surviving
Spouse in Family; 1, 0 firearm Firearm missile 1, 0 firestem Fire,
flames, hot sub, object, caustic, corrosive, steam 1, 0 flt1kd
Female < 1 Child 1, 0 followup Follow-up examination dxvar =:
V67 1, 0 frachand Fracture of hand and fingers dxvar =: (814-8171)
1, 0 fracllim Fracture of lower limb dxvar =: (820-8291) 1, 0
fracoth oth fractures dxvar = 800-81259 or 818-8191 1, 0 fracrad
Fracture of radius and ulna dxvar =: 813 1, 0 fracskul Intracranial
injury, excluding those with skull fracture-`850` <= dxvar <=
`8541` 1, 0 gadvmed Adverse effects of medical treatment dxvar =
E870-E879, E930-E9499 1, 0 gasthemm Gastrointestinal hemorrhage
dxvar =:578 1, 0 gastri Gastritis and duodenitis dxvar = 535 1, 0
gblood Dis of the blood and blood-forming organ-`280` <= dxvar
<= `2899`-group code 1, 0 of anemia, othblood gcircul Dis of the
circulatory system-`390` <= dxvar <= `4599`-group code of
anginap, 1, 0 arthero, othische, carddysr, chf, othheart, esshyp,
cerebrov, artery, hermorrh, othcirc gconanom Congenital anomalies
dxvar = 740-7599 1, 0 gdigest Dis of the digestive system dxvar =
520-5799 1, 0 gendo Endocrine, nutril and metab Dis, and immunity
dsrs-`240` <= dxvar <= `2799`- 1, 0 group code code-ahypothy,
othhyr, diabmell, dsrlipid, obesity, othendo genmedex General
medical examination dxvar =: V70 1, 0 ggenito Dis of the
genitourinary system - GROUP OF 1, 0 OTHURIN, CALCIDY, CYSTBLDD,
HYPROS, INFLFEML, OTHNOLE, DSRBRST, NINFFEM DSRMENS gibluc
combines(gasthemm, stomulcr) 1, 0 gihs Suppl classif of factors
influ hlth stat & contact w hlth se dxvar = V01-V829 1, 0
ginfect Infectious and parasitic Dis-`001` <= dxvar <= `1398`
group code of 1, 0 strep, hivinfec, virlwart, chlamyd, dermtosi,
candidia & ohtinfs ginjpoi Injury and poisoning group code
fracrad frachand fracllim fracoth sprnwrst 1, 0 sprnkne sprnankl
sprnneck sprnobk sprnostr fracskul owndhd owndhnd othopnwd suprcorn
othspin contus oinjury poison unspex cmpms ginjudet Injuries of
undetermined intent no sub dxvar = E980-E989 1, 0 gintinj
Intentional injuries - group code assault, selfinfl, voilenc, 1, 0
glacoma Glaucoma-dxvar =: `365` 1, 0 gmentl Mental dsrs-`290` <=
dxvar <= `319` - group code of schizo, depress, othpsycy, 1, 0
anxiety, neurotic, alcohdep, drugdep, stress, othdepr, add &
othmentl gmuscu Dis of the musculoskeletal system and connective
tissue dxvar = 710-7399 1, 0 gneoplsm Neoplasm-`140` <= dxvar
<= `2399` - group code of mn_coln, mn_skin, mn_brst, 1, 0
mn_pros, mn_lymp, mn_oth, & secondary neo's bn_skin, bn_oth,
neounsp gnervous Dis of the nervous system and sense organs-`320`
<= dxvar <= `3899` group 1, 0 code of migraine, othcentr,
carpltun, othnerv, retinldt, glacoma, cataract, dsrrefra, conjunct,
dsreyeld, otheye, dsrear, otitismd, othear gperi Certain cond
originating in the perinatal period NO SUB CATS-760-7799 1, 0
gpregn Comps of pregnancy, childbirth, and the puerperium NO SUB
CATS 1, 0 dxvar = 630-677 gpsorias Psoriasis and similar dsrs group
code of oinfskin, corncal, actinseb, acne, 1, 0 sepacyst, urticari,
osksub gresp Dis of the respiratory system-`460` <= dxvar <=
`5199` - group code of 1, 0 acusinu, acuphary, acutonsl, acubronc,
othacres, chrsinu, alerhin, chronbro, asthmas, othopd, othresp
gskin Dis of the skin and subcutaneous tissue group code -
celluabs, oiskin, 1, 0 conderm gsymsig Symptoms, Signs, and
Ill-defined cond group code - syncope, convuls, dizzi, 1, 0 pyrexi,
suminteg, headach epistax, abheart, dyspnea, cough, chestpn,
sympurin, abdpain, othssil gunint Unintentional injuries group code
- fall, mototraf, struck, overext, cutobjs, 1, 0 natenvr, poisdrg,
firestem, machinr, cycle, mototra, othtran, firearm, othclas,
mechunsp gynexam Gynecological examination-dxvar = `V723` 1, 0 hchc
hcpcs CODES 1, 0 headach Headache-dxvar = `7840` 1, 0
hemat combines (anemia, dxblood, acutonsl, gblood, othblood) 1, 0
hermorrh Hemorrhoids-dxvar =: `455` 1, 0 herniabd Hernia of
abdominal cavity-`550` <= dxvar <= `5539` 1, 0 Hi1dvby The
index of Highest cost per day divided by Average cost per day per
month Hi2dvby The index of 2.sup.nd Highest cost per day divide by
Average cost per day per month Hibych2a (1, 0) 1 = The second
highest month cost per day is adjacent to the first month Hibymos1
The maximum cost per day for any month cost for the base year
Hibych2b (1, 0) 1 = The second highest month cost per day is not
adjacent to the first month Hibymos2 The 2.sup.nd Highest cost per
day for any month for the base year hilo classify high cost nxt yr
cases based on charges 0-low <96, 1-High ge 96 hilopay classify
high cost nxt yr cases based on payments 0-low <68.5, 1-High ge
68.5 hivinfec HIV infection-dx starting w/042 1, 0 hspatri1 Hosp
Admit in Trimes 1 COUNT hspatri2 Hosp Admit in Trimes 2 COUNT
hspatri3 Hosp Admit in Trimes 3 COUNT hsptlos Total Hospital LOS
DAYS hsptlosc Total Hospital LOS Category hyprpros Hyperplasia of
prostate-dxvar = `600` 1, 0 icu_etc combines(lspxvein, lspxvent,
mcptccth, mcptintr, pcptcrit, pulart) 1, 0 infertil Any mention of
infertility male or female (cpt) or dxvar in: (`628`, `606`) 1, 0
inflfeml Inflammatory dsrs of female pelvic organs-`614` <=
dxvar <= `6169` 1, 0 irratcol Irritable colon-dxvar = `5641` 1, 0
itemno ITEM NBR jntdsrs Derangements and oth and unspecified joint
dsrs-`717` <= dxvar <= `7199` 1, 0 kid1_3 Count of the Number
of Children in a family. 0 = no children, 1, 2 or 3 or more 0-3
children lensrepl Lens replaced by pseudophakos-dxvar = `V431` 1, 0
locatnme LOCATION NAME CONFIDENTIAL locatno LOCATION CONFIDENTIAL
logi combines (constip, diverint, othdiges) 1, 0 lspxampu Life PX
Amputation cpts{i} in 1, 0 (`23900`, `23920`, `24900`, `25900`,
`25927`, `27295`, `27590`, `27591`, `27592`, `27596`, 1, 0 `27598`,
`27880`, `27881`, `27882`, `27886`, `27888`, `27889`, `28880`,
`28805`) lspxchem Life PX Chemotherapy-cpts{i} in 1, 0
(`96400`, `96408`, `96410`, `96412`, `96414`, `96420`, `96422`,
`96423`, `96425`, `96445`, `96450`, `96520`) lspxdial Life PX
Dialysis cpts in `90935`, `90937`, `90945`, `90947` 1, 0 lspxentr
Life PX Enterostomy-cpts{i} in 1, 0 (`44300`, `44310`, `44312`,
`44314`, `44316`, `44320`, `44322`, `44340`, `44345`, `44346`)
lspxgast Life PX Gastrostomy cpts in `3750`, `43760`, `43830`,
`43832` 1, 0 lspxorgn Life PX Major Organ Transplants-cpts{i} in
(`33935`, `33945`, `47135`, `40260`) 1, 0 lspxradt Life PX
Radiation Therapy-if cpts{i} 1, 0 in `77261 `77263`, `77280`,
`77285`, `77290`, `77295`, `77299`, `77300`, `77305`, `77310`,
`77315`, `77321`, `77326`, `77327-8, `77331`, `77336`, `77370`,
`77399`, `77401- 4`, `77406-9`, `77411-4`, `77416-20`, `77425`,
`77430- 2`, `77470`, `77499`, `77000`, `77605`, `77610`, `77615`,
`77620`, `77750`, `77761`- 3`, `77776-8`, `77781-4`, `77789`,
`77790`, `77799` lspxtrch Life PX Tracheostomy-cpts{i} in (`31600`,
`31603`, `31610`) 1, 0 lspxvein Life PX Venous Access Port-cpts{i}
in (`36495`, `36496`) 1, 0 lspxvent Life PX
Intubation/Ventilation-cpts in (`31500`, `94650`, `94651`, `94656`,
`94657`) 1, 0 lumbago Lumbago dxvar = `7242` 1, 0 m0105kd Male 01-05
Child 1, 0 m0518kd Male 05-18 Child 1, 0 m1825en Male 18-25
Enrollee 1, 0 m1845sp Male 18-45 Spouse 1, 0 m1865kd Male 18-65
Child 1, 0 m2545en Male 25-45 Enrollee 1, 0 m4565en Male 45-65
Enrollee 1, 0 m4565sp Male 45-65 Spouse 1, 0 m6580en Male 65-80
Enrollee 1, 0 m6580sp Male 65-80 Spouse 1, 0 m80pen Male 80+
Enrollee 1, 0 m80psp Male 80+ Spouse, 65+ Widower 1, 0 machinr
Machinery-dxvar =: `E919` 1, 0 male 1, 0 mcptallr Med CPT Allergy,
"95004" <= cpts{i} <= "95199" 1, 0 mcptcard Med CPT Cardiogr,
"93000" <= cpts{i} <= "93350" 1, 0 mcptcarv Med CPT
CardVascThor, "92950" <= cpts{i} <= "92996" 1, 0 mcptccth Med
CPT CardCath, "93501" <=cpts{i} <= "93572" 1, 0 mcptchem Med
CPT Chemother, "96400" <= cpts{i} <= "96549" 1, 0 mcptcns
Med CPT CNS, "96100" <= cpts{i} <= "96117" 1, 0 mcptderm Med
CPT Dermatology, "96900" <= cpts{i} <= "96999" 1, 0 mcptdial
Med CPT Dialysis, "90918" <= cpts{i} <= "90999" 1, 0 mcptent
Med CPT ENT, "92502" <= cpts{i} <= "92599" 1, 0 mcptintr Med
CPT IntraCard, "93600" <= cpts{i} <= "93660" 1, 0 mcptneur
Med CPT Neurology, "95805" <= cpts{i} <= "95975" 1, 0
mcptopth Med CPT Opthalm, "92002" <= cpts{i} <= "92499" 1, 0
mcptoste Med CPT OsteoPath, "98926" <= cpts{i} <= "98929" 1,
0 mcptphys Med CPT PhysTher, "97010" <= cpts{i} <= "97999" 1,
0 mcptpsy Med CPT Psych, "90801" <= cpts{i} <= "90899" 1, 0
mcptpulm Med CPT Pulmon, "94010" <= cpts{i} <= "94799" 1, 0
mcptvasc Med CPT VascStudy, "93875" <= cpts{i} <= "93980" 1,
0 mechunsp Mechanism unspecified 1, 0 menopa Menopausal and
postmenopausal dsrs dxvar=:627 1, 0 migraine Migraine-dxvar = :
`346` 1, 0 misc_hrt combines (arthero, carddysr, mcptccth,
mcptintr, othische) 1, 0 mlt1kd Male < 1 Child 1, 0 mn_brst
Malignant neoplasm of breast-(`174` <= dxvar <= `1759`) or
(dxvar = `19881`) 1, 0 mn_coln Malignant neoplasm of colon and
rectum-(`153` <= dxvar <= `1548`) or 1, 0 (dxvar = `1975`)
mn_lymp Malignant neoplasm of lymphatic and hematopoietic
tissue-(dxvar in (`1765`, `1969`)) or (`200` <= dxvar <=
`20891`) 1, 0 mn_oth oth malignant neoplasm-(`140` <= dxvar <=
`1529`) or (`155`-`1719`) or(`1761`-`1764`) 1, 0 or (`1766`-`1849`)
or (`186`-`1958`) or (`197`-`1974`) or (`1976`-`1981`) or
(`1983`-`1987`) or (`19882`-`1991`) or (`230`-`2349`) or dxvar =
`1988` mn_pros Malignant neoplasm of prostate-dxvar = `185` 1, 0
mn_skin Malignant neoplasm of skin-(`172` <= dxvar <= `1739`)
or 1, 0 (dxvar in (`1760`, `1982`)) MOSA thru F dummies for CATBMOS
1, 0 motontra Motor vehicle, nontraffic- 1, 0 dx(`E8200`, `E8210`,
`E8220`, `E8230`, `E8240`, `E8250`, `E8205`, `E8215`, `E8225`,
`E8235`, `E8245`, `E8255`, `E8207`, `E8217`, `E8227`, `E8237`,
`E8247`, `E8257`, `E8209`, `E8219`, `E8229`, `E8239`, `E8249`,
`E8259`) mototraf Motor vehicle, traffic-`E810` <= dxvar <=
`E8199` 1, 0 mrh1drg 1st Most Recent Hosp DRG DRG mrh1los 1st Most
Recent Hosp LOS DAYS mrh1mdc 1st Most Recent Hosp MDC MDC mrh1ms
1st Most Recent Hosp Medsurg Medical surgical indicator mrh2drg 2nd
Most Recent Hosp DRG DRG mrh2los 2nd Most Recent Hosp LOS DAYS
mrh2mdc 2nd Most Recent Hosp MDC MDC mrh2ms 2nd Most Recent Hosp
Medsurg Medical surgical indicator mrh3drg 3rd Most Recent Hosp DRG
DRG mrh3los 3rd Most Recent Hosp LOS DAYS mrh3mdc 3rd Most Recent
Hosp MDC MDC mrh3ms 3rd Most Recent Hosp Medsurg Medical surgical
indicator mxchgtri replaces chgt1-t3 and uses index 1, 2, 3 mylagi
Myalgia and myositis, unspecified-dxvar = `7291` 1, 0 namefir FIRST
NAME confidential namelast LAST NAME confidential namemidl MIDDLE
INITIAL confidential natenvr Natural and environmental
factors-(`E900` <= dxvar <= `E9099`) or (`E9280` <= 1, 0
dxvar <= `E9282`) ncpt9xc # of 9xxxx cpts in year category count
ncpt9xxx # of 9xxxx cpts in year count neounsp Neop of uncertain
behavior and unspec nature-`235` <= dxvar <= `2399` 1, 0
nervsys combines (gneoplsm, othcentr) 1, 0 netwkcd NETWORK CODE
confidential netwknme NETWORK NAME confidential neurotic Neurotic
depression-dxvar = `3004` 1, 0 newchg10 Mean of BaseChg months
minus 2 highest months' newpay10 Mean of BasePay months minus 2
highest months' nhosps # of hosp visits count nhospsc # of hosp
visits Category ninenter Noninfectious enteritis and colitis `555`
<= dxvar <= `5589` 1, 0 ninffem Noninflammatory dsrs of
female genital organs dxvar = 622-6249 1, 0 nobasepy `basechg
without payment` 1, 0 noclaims No Claims in Base or Study Period 1,
0 normpreg Normal pregnancy 1, 0 numagegp 0 <= ensagen <= 0.9
numagegp = 1; 1 <= ensagen <= 4.9 numagegp = 2; 5.0
<= ensagen <= 17.9 numagegp = 3; 18 <= ensagen <= 24.9
numagegp = 4; 25 <= ensagen <= 34.9 numagegp = 5; 35 <=
ensagen <= 44.9 numagegp = 6; 45 <= ensagen <= 54.9
numagegp = 7; 55 <= ensagen <= 64.9 numagegp = 8; 65 <=
ensagen <= 74.9 numagegp = 9; 75 <= ensagen <= 84.9
numagegp = 10; ensagen ge 85 numagegp = 11; 1 thru 11 obesity Obesity-dxvar =
: `2780` 1, 0 obseval Observation and evaluation for suspected cond
not found dxvar = : v71 1, 0 oinfskn other inflammatory condition
of skin and subcutaneous tissue dxvar = 690-6918, 1, 0 693-6959,
697-6989 oinjury oth injuries 1, 0 oiskin oth infection of the skin
and subcutaneous tissue 1, 0 omusccn oth Dis of the musculoskeletal
system and connective tissue dxvar-734-7399 1, 0 osksub oth dsrs of
the skin and subcutaneous tissue-dxvar: 7028, 709, 703-7059, 1, 0
7063-7079 ostealld Osteoarthrosis and allied dsrs-dxvar: 715 1, 0
othacres oth acute respiratory infections-(dxvar = `460`) or (`464`
<= dxvar <= 1, 0 `4659`) otharth oth arthropathies and
related dsrs-dxvar 710-7138, 7141-7149, :716 1, 0 othblood oth Dis
of the blood and blood-forming organs-`286` <= dxvar <=
`2899` 1, 0 othcentr oth dsrs of the central nervous system-(`320`
<= dxvar <= `326`) or (`330` <= 1, 0 dxvar <= `3379`)
or (`340` <= dxvar <=`3459`) or (`347` <= dxvar <=
`3499`) othcirc oth Dis of the circulatory system-(dxvar IN (`390`,
`3929`, `403`, `405`, `417`)) 1, 0 or (`451`-`4549`) or
(`456`-`4599`) othclas oth and not elsewhere classified-dxvar
E925-E9269, E988, E9290-E929, E925-E9269, 1, 0 E9288, E9290-E929
othdepr Depressive reaction, not elsewhere classified-dxvar = `311`
1, 0 othdiges oth Dis of the digestive system-DXVAR 526-5300,
5302-5309, 536-5439, 5642-5649, 1, 0 567-5689, 5695-5739, :(560,
577, 579) othdorso oth dorsopathies-DXVAR 720-72191, 723-7241,
7243-7249 1, 0 othear oth Dis of the ear and mastoid process-`383`
<= dxvar <= `3899` 1, 0 othendo oth endocrine, nutrit and
metabolic Dis, and immunity dsrs- 1, 0 (`251` <= dxvar <=
`2719`) or (`273` <= dxvar <= `2779`) or (`2781`<= dxvar
<= `27903`) otheye oth dsrs of the eye and adnexa-(dxvar =
:`360`) or (`363`-`3649`)or (`368`-`3699`) 1, 0 or (`370`-`3719`)
or (`3724`-`3729`) or (`375`-`3799`) otheye (dxvar = :`360`) or
(`363` <= dxvar <= `3649`) or (`368` <= dxvar <= 1, 0
`3699`) or (`370` <= dxvar <= `3719`) or (`3724` <= dxvar
<= `3729`) or (`375` <= dxvar <= `3799`) othfeml oth dsrs
of the female genital tract-DXVAR 617-6199, 621, 625, 628, 629 1, 0
othhealt oth factors influencing hlth stat and contact with hlth
serv-DXVAR V200-201, 1, 0 :V21, V290-V430, V432-V389, V46-V669,
V68-V699, V720-V722, V724-V829 othheart oth heart disease-(`391`
<= dxvar <= `3920`) or (`393`-`39899`) or (dxvar IN 1, 0
:(`402`, `404`)) or (`415`-`4169`)or (`420`-`4269`) or
(`4281`-`4299`) othinfs oth infectious and parasitic disease-(`001`
<= dxvar <= `0339`) or (`0341`-`0419`) 1, 0 or (`045`-`0780`)
or (`0782`-`07981`) or (-`07999`) or (`080`-`1049`) or
(dxvar= :`111`) or (`114`-`1398`) othische oth ischemic heart
disease-DXVAR 410-412, 4141-4149 1, 0 othmale oth dsrs of male
genital organs-DXVAR 601-6089 1, 0 othmentl oth mental dsrs-(`312`
<= dxvar <= `3139`) or (`3141`-`319`) or (`3001`-`3003`) 1, 0
or (`3005`-`3009`) or (`301`-`3026`) or (`306`-`3079`) or (dxvar =:
`310`) othnerv oth dsrs of the nervous system-(`350` <= dxvar
<= `3539`) or (`3541` <= 1, 0 dxvar <= `3599`) othopd oth
COPD and allied cond-DXVAR 492, 494-496 1, 0 othopnwd oth open
wound-DXVAR 874-8812, 884-8977 1, 0 othpsych oth psychoses-(`290`
<= dxvar <= `2949`) or (`2960` <= dxvar <= `2961`) 1, 0
or (`2964` <= dxvar <= `2999`) othrepro oth encounter related
to reproduction-V23-V242, V26-V289 1, 0 othresp oth Dis of the
respiratory system-470-4722, 474-4761, 478, 4780, 4781, :487, 1, 0
500-5199 othrhexb oth rheumatism, excluding back-DXVAR 725,
7271-7279, :728, :7290, 7292-7299 1, 0 othspin oth superficial
injury-DXVAR 910-9180, 9182-9199 1, 0 othssil oth symptoms, signs,
and ill-defined cond-DXVAR 7800-7801, 1, 0 :(7805, 781, 783, 7861),
7807-7809, 7841-78469, 7848-7849, 7854-7859, 7863-7864, 7866-78799,
7891-7999 oththyr oth dsrs of the thyroid gland-(`240` <= dxvar
<= `243`) or 1, 0 (`245` <= dxvar <= `2469`) othtran oth
transportation dxvar FOR E800X-E807X WHEN X EQ 0, 2, 8 OR 9 1, 0
othtype Other genetic typing tsts for transplants cpts{i} in 1, 0
(`86805`, `86806`, `86807`, `86808`, `86821`, `86822`, `86849`)
othurin oth Dis of the urinary system-580-5899, 590-591, 593-5949,
597-5989, 5991-5999 1, 0 otitismd Otitis media and Eustachian tube
dsrs-`381` <= dxvar <= `3829` 1, 0 ounspex oth and
unspecified effects of external causes DXVAR 990-99589 1, 0 overext
Overexertion and strenuous movements DXVAR E927 1, 0 owndhd Open
wound of head-DXVAR 870-8739 1, 0 owndhnd Open wound of hand and
fingers DXVAR 882-8832 1, 0 pay Payment in Base Year payment
PAYMENT AMT payp Payment in Prediction Year pbynoby `prebase, base
1, 0 pbyothr `prebase, base >0` 1, 0 pcptborn CPT Place
Newborn-"99431" <= cpts{i} <= "99490" 1, 0 pcptcons CPT Place
Consult-"99241" <= cpts{i} <= "99275" 1, 0 pcptcrit CPT Place
Critical Care-"99291" <= cpts{i} <= "99292" 1, 0 pcpter CPT
Place ER-"99281" <= cpts{i} <= "99288" 1, 0 pcpthome CPT Place
Home-"99341" <= cpts{i} <= "99353" 1, 0 pcpthosp CPT Place
Hosp-"99217" <= cpts{i} <= "99238" 1, 0 pcptnicu CPT Place Neon
ICU-"99295" <= cpts{i} <= "99298" 1, 0 pcptnurs CPT Place Nurs
Facil-"99301" <= cpts{i} <= "99313" 1, 0 pcptoff CPT Place
Office-"99201" <= cpts{i} <= "99215" 1, 0 pcptoltf CPT Place
Oth LTCF-"99321" <= cpts{i} <= "99333" 1, 0 pcptpmed CPT Place
Prev Med-"99381" <= cpts{i} <= "99429" 1, 0 penvasc combines
(artery, mcptvasc, othcirc) 1, 0 periph Peripheral enthesopathies
and allied dsrs-DXVAR: 726 1, 0 pershyst Potential health hazards
related to personal and family hist-DXVAR V10-V198 1, 0 pharclms
Pharmacy Claims count planname PLAN NAME confidential planno
SERIAL-PLAN NBR confidential pmtchg base & (pmt/basechg ge .2)
as 1 1, 0 pneumon Pneumonia-DXVAR 480-486 1, 0 poisdrg Poisoning
drugs, med subst, biolog, oth solid, liqd, gases, vapor 1, 0 poison
Poisonings-DXVAR 960-9899 1, 0 postpart Postpartum care and
examination-DXVAR: V24 1, 0 prenatal Undelivered Pregnancy-Prenatal
care 1, 0 prgage age lt 35 (1), 0 1, 0 provlocn PROVIDER LOCATION
confidential provname PROVIDER NAME confidential provnetw PROVIDER
NETWORK confidential provno PROVIDER NBR confidential provst
PROVIDER STATE confidential provtype PROVIDER TYPE confidential
pulart pulmonary artery cath placement cpts{i} eq `93503` 1, 0
pyrexi Pyrexia of unknown origin: 7806 1, 0 rad combines (lspxradt,
radther, radnuc) 1, 0 radnuc Nuclear Medicine cpts{i} starting with
(`78`, `79`) 1, 0 radther Any radiation therapy cpts{i} starting
with `77` 1, 0 relation RELATIONSHIP 1-9 `1` = `A Enrollee` `2` =
`B Spouse` `3` = `C Son` `4` = `D Daughter` `5` = `E Stepson` `6` =
`F Stepdaughter` `7` = `G Other Male` `8` = `H Other Female` `9` =
`I Surv Spouse` retinldt Retinal detachment and oth retinal
dsrs-`361` <= dxvar <= `3629` 1, 0 rheuarth Rheumatoid
arthritis-DXVAR 7140 1, 0 routchk Routine infant or child health
checks-DXVAR V202 1, 0 schizo Schizophrenic dsrs-dxvar = `295` 1, 0
scptaudi Surg CPT Auditory, "69" = substr(cpts{i}, 1, 2) 1, 0
scptbaby Surg CPT Matern, "59" = substr(cpts{i}, 1, 2) 1, 0
scptcard Surg CPT Card Vasc, "33" <= substr(cpts{i}, 1, 2) <=
"37" 1, 0 scptdgst Surg CPT Digest, "40" <= substr(cpts{i}, 1,
2) <= "49" 1, 0 scptdiap Surg CPT MED & Diaphr, "39" =
substr(cpts{i}, 1, 2) 1, 0 scptendo Surg CPT Endocrine, "60" =
substr(cpts{i}, 1, 2) 1, 0 scpteye Surg CPT Eye, "65" <=
substr(cpts{i}, 1, 2) <= "68" 1, 0 scptfem Surg CPT
Lap/Perit/Hyst Female Genital, "56" <= substr(cpts{i}, 1, 2)
<= "58" 1, 0 scpthern Surg CPT Hernia & Lymph, "38" =
substr(cpts{i}, 1, 2) 1, 0 scptmale Surg CPT Male Genital, "54"
<= substr(cpts{i}, 1, 2) <= "55" 1, 0 scptmskl Surg CPT
Muscular-Skeleton, "20" <= substr(cpts{i}, 1, 2) <= "29" 1, 0
scptnrve Surg CPT Nerve, "61" <= substr(cpts{i}, 1, 2) <=
"64" 1, 0 scptresp Surg CPT Respiratory, "30" <= substr(cpts{i},
1, 2) <= "32" 1, 0 scptskin Surg CPT Integument, "10" <=
substr(cpts{i}, 1, 2) <= "19" 1, 0 scpturin Surg CPT Urinary,
"50" <= substr(cpts{i}, 1, 2) <= "53" 1, 0 selfinfl
Self-inflicted-dxvar e950-e959 1, 0 selmalig combines(mn_coln,
mn_lymp, mn_oth, mn_pros) 1, 0 sepacyst Sebaceous cyst-dxvar 7062
1, 0 servamb Serv Locn Ambulance-servlocn = "11" 1, 0 servasrg Serv
Locn Ambul Surg-servlocn = "16" 1, 0 servecc Serv Locn EM Care
Ctr-servlocn = "09" 1, 0 servehsp Serv Locn Emerg Hosp-servlocn =
"07" 1, 0 servhmhl Serv Locn Home Hlth servlocn = "12" 1, 0
servhome Serv Locn Home-servlocn = "04" 1, 0 serviane Serv Locn
Inpat Anes-servlocn = "15" 1, 0 servih Serv Locn Inpat
Hosp-servlocn = "01" 1, 0 servilab Serv Locn Indep Lab-servlocn =
"08" 1, 0 servlocn SERVICE LOCATION 00-16 servnurs Serv Locn Nurs
Home-servlocn = "05" 1, 0 servoane Serv Locn Outpat Anes-servlocn =
"14" 1, 0 servoff Serv Locn Office-servlocn = "03" 1, 0 servoh Serv
Locn Outpat Hosp-servlocn = "02" 1, 0 servothl Serv Locn Other
locn-servlocn = "10" 1, 0 servphar Serv Locn Pharmacy-servlocn =
"13" 1, 0 servsnf Serv Locn SNF-servlocn = "06" 1, 0 servtype SERV
TYPE list provided by source sex sex 1, 2, 9 sexpatn PATIENT SEX 1,
2, 9 sprnankl Sprains and strains of ankle dxvar-: 8450 1, 0
sprnkne Sprains and strains of knee and leg dxvar-: 844 1, 0
sprnneck Sprains and strains of neck-dxvar 8470 1, 0 sprnobk oth
sprains and strains of back dxvar: 846, 8471-8479 1, 0 sprnostr oth
sprains and strains nos-840-8419, :(843, 8451, 848, :842) 1, 0
sprnwrst Sprains and strains of wrist and hand 1, 0 sqech1 `square
high chg` 1, 0 sqech2a `Square adj chg` 1, 0 sqech2b `Square not
ADJacent chg` 1, 0 sqexc2a `Square adj pay` 1, 0 sqexc2b `Square not
ADJacent pay` 1, 0 sqexch1 `square high pay` 1, 0 sqnewchg `Square
10mos Bchg` 1, 0 sqnewpy `Square 10mos Bpay` 1, 0 ssn Enrollee SS
number 1, 0 statact Enrollment Type Active, status = `00` 1, 0
statcobr Enrollment Type Cobra, status = 15, 16, 17, 18, 19 1, 0
statlife Enrollment Type Life Only 1, 0 statltd Enrollment Type
LTD, status = 50 1, 0 statmult Enrollment Type Multiple - if 1, 0
sum(statact,statss,statpens,statltd,statcobr,statlife)>1 then
statmult = 1; statpens Enrollment Type Pensioner-status = 10 1, 0
statss Enrollment Type Surv Spouse-status = 01 1, 0 status STATUS,
`00 Active`, `01 Surv Spouse`, `10 Pensioner`, `12 LTD`, `15
Cobra`, `16 Cobra`, `17 Cobra`, `18 Cobra`, `19 Cobra`, `50 Life
Only` stomulcr Ulcer of stomach and small intestine-531-5349 1, 0
strep Streptococcal sore throat-dx 0340 1, 0 stress Acute reaction
to stress and adjustment reaction-`308` <= dxvar <= `3099` 1,
0 struck Striking against or struck accidentally by objects or
person 1, 0 suprcorn Superficial injury of cornea-dxvar 9181
surgpath surgical path levels 4, 5, 6 cpt in (`88305`, `88307`,
`88309`) 1, 0 syminteg Symptoms involving skin and oth integumentary
tissue dxvar: 782 1, 0 sympurin Symptoms involving urinary
system dxvar: 788 1, 0 syncope Syncope and collapse dxvar 7802 1, 0
synovit Synovitis and tenosynovitis dxvar: 7270 1, 0 teeth Dis of
the teeth and supporting structures-dxvar 520-5259 1, 0 temppace
Temporary pacer placement cpts{i} in (`33210`, `33211`) 1, 0
tenmoch Average from the sum of all months in the base year
excluding the 2 highest months per day transfus Transfusion Medical
`86850` <= cpts{i} <= `86999` 1, 0 trantype Transplant donor
and genetic typing `86812` <= cpts{i} <= `86817` 1, 0 units
UNITS COUNT urticari Urticaria dxvar: 708 1, 0 uti_unsp Urinary
tract infection, site not specified dxvar 5990 1, 0 violenc oth
causes of violence dxvar E970-E978, E990-E999 1, 0 virlwart Viral
warts-dxvar =: `0781` 1, 0 wbasechg `basechg present` 1, 0 wbasepy
`basepmt present` 1, 0 zipenrl ENROLLEE ZIP CODE 5 digit zipprov
PROVIDER ZIP CODE 5 digit
Confidential information may be encrypted to protect the identity
and privacy of individuals.
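Several of the variable definitions in the table above are simple derivations from age, CPT codes, or monthly amounts. The following is a minimal Python sketch of four of them, offered as an illustration and not as the applicants' implementation: the age grouping `numagegp`, a whole-code CPT range flag of the kind used for `mcptcard`, a two-character CPT prefix flag of the kind used for `scptcard`, and the mean of monthly amounts with the two highest months removed, as described for `newchg10`. The function names, the half-open treatment of age bands (so that an age between 4.9 and 5.0 falls in group 2), the assumption that CPT codes are five-character strings, and the sample member record are all assumptions introduced here.

```python
from bisect import bisect_right

# Lower bounds, in years, of numagegp groups 2 through 11 as listed in the table.
# Assumption: bands are half-open, so an age such as 4.95 falls in group 2.
AGE_LOWER_BOUNDS = [1, 5, 18, 25, 35, 45, 55, 65, 75, 85]


def numagegp(ensagen):
    """Return the age group (1..11) for an age in years."""
    if ensagen < 0:
        raise ValueError("age must be non-negative")
    return bisect_right(AGE_LOWER_BOUNDS, ensagen) + 1


def cpt_range_flag(cpts, low, high):
    """Return 1 if any CPT code lies in [low, high], else 0.

    Codes are compared as strings, which assumes five-character codes.
    """
    return int(any(low <= code <= high for code in cpts))


def cpt_prefix_flag(cpts, low, high):
    """Return 1 if the first two characters of any CPT code lie in [low, high]."""
    return int(any(low <= code[:2] <= high for code in cpts))


def mean_excluding_two_highest(monthly_amounts):
    """Mean of the monthly amounts after dropping the two highest months."""
    if len(monthly_amounts) <= 2:
        raise ValueError("need more than two months")
    kept = sorted(monthly_amounts)[:-2]
    return sum(kept) / len(kept)


# Hypothetical member record, invented for illustration only.
member_cpts = ["93000", "33210", "99213"]
flags = {
    "mcptcard": cpt_range_flag(member_cpts, "93000", "93350"),
    "scptcard": cpt_prefix_flag(member_cpts, "33", "37"),
    "scpteye": cpt_prefix_flag(member_cpts, "65", "68"),
}
```

Under these assumptions, a member with the three sample codes would be flagged for `mcptcard` and `scptcard` but not for `scpteye`. Keeping each rule as a small function with the bounds passed as arguments lets the same helper serve every range-based entry in the table.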
* * * * *
References