U.S. patent application number 14/156919 was filed with the patent office on 2014-01-16 and published on 2014-06-26 for a computer implemented method for performing cloud computing on data being stored pseudonymously in a database.
This patent application is currently assigned to COMPUGROUP MEDICAL AG. The applicant listed for this patent is COMPUGROUP MEDICAL AG. Invention is credited to Frank Gotthardt, Jan Lehnhardt, Adrian Spalka.
Publication Number | 20140181512 |
Application Number | 14/156919 |
Family ID | 44246648 |
Publication Date | 2014-06-26 |
United States Patent Application | 20140181512 |
Kind Code | A1 |
Spalka; Adrian; et al. | June 26, 2014 |
COMPUTER IMPLEMENTED METHOD FOR PERFORMING CLOUD COMPUTING ON DATA
BEING STORED PSEUDONYMOUSLY IN A DATABASE
Abstract
The invention relates to a computer implemented method for
performing cloud computing on data of a first user employing cloud
components, the cloud components comprising a first database and a
data processing component, wherein an asymmetric cryptographic key
pair is associated with the first user, said asymmetric
cryptographic key pair comprising a public key and a private key,
the data being stored pseudonymously non-encrypted in the first
database with the data being assigned to an identifier, wherein the
identifier comprises the public key, the method comprising
retrieving the data from the first database by the data processing
component, wherein retrieving the data from the first database
comprises receiving the identifier and retrieving the data assigned
to the identifier from the first database, wherein the method
further comprises processing the retrieved data by the data
processing component and providing a result of the analysis.
Inventors: | Spalka; Adrian (Koblenz, DE); Lehnhardt; Jan (Koblenz, DE); Gotthardt; Frank (Eitelborn, DE) |
Applicant: |
Name | City | State | Country | Type |
COMPUGROUP MEDICAL AG | Koblenz | | DE | |
Assignee: | COMPUGROUP MEDICAL AG, Koblenz, DE |
Family ID: | 44246648 |
Appl. No.: | 14/156919 |
Filed: | January 16, 2014 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
12968530 | Dec 15, 2010 | 8661247 |
14156919 | | |
Current U.S. Class: | 713/165; 713/162 |
Current CPC Class: | H04L 9/0869 20130101; H04L 2209/42 20130101; H04L 63/0421 20130101; H04L 9/0863 20130101; G06F 21/602 20130101; G06F 21/6254 20130101; H04L 9/30 20130101; G06F 21/64 20130101; H04L 9/0825 20130101 |
Class at Publication: | 713/165; 713/162 |
International Class: | H04L 9/08 20060101 H04L009/08 |
Foreign Application Data
Date | Code | Application Number |
Dec 18, 2009 | EP | EP09179974.2 |
Mar 11, 2010 | EP | EP10156171.0 |
Jun 29, 2010 | EP | EP10167641.9 |
Aug 18, 2010 | EP | EP10173163.6 |
Aug 18, 2010 | EP | EP10173175.0 |
Aug 18, 2010 | EP | EP10173198.2 |
Dec 13, 2010 | EP | EP10194681.2 |
Claims
1.-11. (canceled)
12. A computer implemented method for receiving a result from a
second user of a data processing component of a network cloud by a
first user, wherein an asymmetric cryptographic key pair is
associated with the first user, said key pair comprising a public
key and a private key, the method comprising receiving the result
by said first user with the recipient address at which the result
is received comprising the public key.
13. A computer implemented method for storing data of a first user
in a first or a second database of a network cloud, the first user
having a private key, the method comprising: calculating a set of
public keys, wherein the private key and each public key of the set
of public keys form an asymmetric cryptographic key pair, storing
the data encrypted with one of the public keys and assigned to the
first user in the second database or storing the data
pseudonymously in the first database with the data being assigned
to an identifier, wherein the identifier comprises one of the
public keys.
14. The method of claim 13 further comprising directly receiving
the private key or generating the private key, wherein generating
the private key comprises receiving an input value and applying a
cryptographic one-way function to the input value for generation of
the private key, wherein the cryptographic one-way function is an
injective function.
15.-18. (canceled)
19. A cloud network for performing cloud computing on data of a
first user, the cloud comprising cloud components, the cloud
components comprising a first database and a data processing
component, wherein an asymmetric cryptographic key pair is
associated with the first user, said asymmetric cryptographic key
pair comprising a public key and a private key, the data being
stored pseudonymously non-encrypted in the first database with the
data being assigned to an identifier, wherein the identifier
comprises the public key, wherein the data processing component is
adapted for retrieving the data from the first database, wherein
retrieving the data from the first database comprises receiving the
identifier and retrieving the data assigned to the identifier from
the first database, wherein the data processing component is
further adapted for processing the retrieved data and providing a
result of the analysis.
20. A computer system for receiving a result from a second user of
a network cloud by a first user, wherein an asymmetric
cryptographic key pair is associated with the first user, said key
pair comprising a public key and a private key, the system
comprising means for receiving the result by said first user with
the recipient address at which the result is received comprising
the public key.
21. A computer system for storing data of a first user in a first
or a second database of a network cloud, the first user having a
private key, the system comprising: processor means for calculating
a set of public keys, wherein the private key and each public key
of the set of public keys form an asymmetric cryptographic key
pair, means for storing the data encrypted with one of the public
keys and assigned to the first user in the second database or means
for storing the data pseudonymously in the first database with the
data being assigned to an identifier, wherein the identifier
comprises one of the public keys.
Description
RELATED APPLICATIONS
[0001] This application claims the priority of:
[0002] 1. European Application Number: EP10 194 681.2, filed Dec. 13, 2010;
[0003] 2. European Application Number: EP10 173 198.2, filed Aug. 18, 2010;
[0004] 3. European Application Number: EP10 173 175.0, filed Aug. 18, 2010;
[0005] 4. European Application Number: EP10 173 163.6, filed Aug. 18, 2010;
[0006] 5. European Application Number: EP10 167 641.9, filed Jun. 29, 2010;
[0007] 6. European Application Number: EP10 156 171.0, filed Mar. 11, 2010; and
[0008] 7. European Application Number: EP09 179 974.2, filed Dec. 18, 2009.
FIELD OF THE INVENTION
[0009] The present invention relates to the field of computer
implemented identifier generators.
BACKGROUND AND RELATED ART
[0010] Cloud computing is well known in the art and describes
distributed computing in a large network, like the internet,
wherein shared resources, software, and information are provided to
computers and other devices on demand. One popular example of
cloud computing is the use of web-based applications that can be
accessed and used on demand through, for example, web interfaces of
personal computers.
[0011] A problem that arises in cloud computing is that data on
which cloud computing has to be performed has to be provided to
respective cloud components like hardware or software components
distributed within the cloud. However, where the data comprises
personal information of individual cloud users or any other kind of
sensitive information, the information should be stored in such a
way that the data cannot be analyzed to draw conclusions about the
personal identity of the users.
SUMMARY
[0012] The invention provides a computer implemented method, a
computer program product and a computing device in the independent
claims. Embodiments are given in the dependent claims.
[0013] The invention provides a computer implemented method for
performing cloud computing on data of a first user employing cloud
components, the cloud components comprising a first database and a
data processing component, wherein an asymmetric cryptographic key
pair is associated with the first user, said asymmetric
cryptographic key pair comprising a public key and a private key,
the data being stored pseudonymously and non-encrypted in the first
database with the data being assigned to an identifier, wherein the
identifier comprises the public key, the method comprising
retrieving the data from the first database by the data processing
component, wherein retrieving the data from the first database
comprises receiving the identifier and retrieving the data assigned
to the identifier from the first database, wherein the method
further comprises processing the retrieved data by the data
processing component and providing a result of the analysis.
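As a purely illustrative sketch of the retrieval and processing flow of paragraph [0013], the following Python fragment models the first database as an in-memory mapping from identifier to cleartext records. The class `PseudonymousStore`, the function `process`, the field name `systolic` and the shortened identifier string are hypothetical names chosen for this example; they do not appear in the application.

```python
from dataclasses import dataclass, field


@dataclass
class PseudonymousStore:
    """Hypothetical stand-in for the 'first database': cleartext records
    keyed only by an identifier that comprises a public key."""
    records: dict = field(default_factory=dict)

    def deposit(self, identifier: str, record: dict) -> None:
        # Records are stored non-encrypted; no name or address is attached.
        self.records.setdefault(identifier, []).append(record)

    def retrieve(self, identifier: str) -> list:
        # Return a copy of all records assigned to this identifier.
        return list(self.records.get(identifier, []))


def process(store: PseudonymousStore, identifier: str) -> dict:
    """Hypothetical data processing component: receives the identifier,
    retrieves the assigned data and returns a result of the analysis."""
    readings = [r["systolic"] for r in store.retrieve(identifier)]
    return {"identifier": identifier,
            "mean_systolic": sum(readings) / len(readings)}
```

The processing component only ever sees the identifier and the records assigned to it, which is the property the paragraph relies on.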
[0014] The term `identifier` as used herein may be a reference used
for identifying or locating data in the database. For example, in
some embodiments an identifier may be a pseudonym. The pseudonym
allows identification of the assignment of various records. In
other embodiments the identifier may identify a record or records
within the database. Records may be individual data files or they
may be a collection of data files or tuples in a database relation.
An identifier may be an access key like a primary key for a
relation in a database. An identifier may also be a unique key for
a relation in a relational database.
[0015] Embodiments of the invention have the advantage that even
though personal data of a user is stored in a database in an
unencrypted manner, analyzing the data of the user only provides a
result of the analysis from which no conclusions can be drawn
about the user's identity. The only identifier which
assigns the data to the user comprises a public key of the first
user, which appears as a random value and thus does not allow any
conclusions to be drawn about the first user's identity. Thus, cloud
computing can be performed while ensuring a high degree of user
anonymity in the cloud.
[0016] In accordance with an embodiment of the invention, the cloud
components further comprise a second database, wherein initially
the data is only stored encrypted with the aforementioned first
user's public key in the second database, wherein the method
further comprises retrieving the data from the second database and
storing the unencrypted data pseudonymously in the first database
assigned to the identifier, wherein after said storage of the data
in the first database the data is retrieved from the first database
by the data processing component, wherein the data processing is
only performed on said data retrieved from the first database.
[0017] This has the advantage that the actual data processing can be
performed on non-encrypted data by a clear text processing site,
while permanent data storage is performed in an encrypted manner.
Thus, the data only needs to be stored temporarily in non-encrypted
form in the first database for the actual data processing
procedure. Consequently, this permits the use of clear text
processing sites with personal data: a user may decide for himself
whether he trusts a certain clear text processing site and, if so,
decrypts his data, stores his decrypted data pseudonymously in the
first database and provides the clear text processing site with the
respective identifier. Nevertheless, absolute anonymity in data
processing is ensured since the identifier comprising the user's
public key does not allow any conclusions to be drawn about the
personal identity of the first user.
[0018] In accordance with a further embodiment of the invention,
the method further comprises storing a result of the data
processing step in the first database assigned to the identifier.
Then, the user may retrieve the result of the data processing from
the first database and store said retrieved result encrypted with
his public keys in the second database. Finally, it may be preferred
that the result of the data processing is deleted from the first
database. Also, it may be preferred that the data being stored
pseudonymously and non-encrypted in the first database with the data
being assigned to an identifier is deleted after a final data
processing step on said data.
[0019] This has the advantage that the first database can be kept
as a temporary storage space while all relevant user data and
respective data analysis on said user data is stored together in
the second database in a highly secure manner.
[0020] Preferably, the first user has a private key, from which a
whole set of public keys is calculated, wherein the private key and
each public key of the set of public keys form an asymmetric
cryptographic key pair. Then, the first user's data is stored
pseudonymously in the first database with the data being assigned
to an identifier, wherein the identifier comprises one of the
public keys. With an increasing number of public keys generated
from a single private key and the data being stored pseudonymously
in the first database with the data being assigned to an
identifier, each identifier comprising a different one of said
user's public keys, personal anonymity is further enhanced. Thus, a
user may be associated with `multiple identities` using a single
credential, namely the private key.
[0021] In an embodiment, the public keys may be used as identifiers
in the first database such that the identifiers are database
identifiers. In this case, the method preferably comprises
depositing data into the database using the identifiers. For
example, the data may comprise medical datasets of a patient, the
patient being the owner of the private key, wherein the medical
datasets may be stored in the database sorted by medical topics. In
this case, an individual public key of the set of public keys of
the patient may be associated with a certain medical topic. Thus,
the medical datasets are not stored in the database using one
common identifier for all medical topics, but the datasets are
stored in a distributed manner in the database using a set of
identifiers with a different identifier for each medical topic.
Nevertheless, by means of his one private key the patient is able
to individually generate the set of public keys for access to the
datasets stored in the database.
[0022] It has to be noted that various computer implemented schemes
for providing an identifier for a database exist with the
identifier being for instance a pseudonym. A pseudonym is typically
used for protecting the informational privacy of a user. Such
computer implemented schemes for providing a pseudonym typically
enable the disclosure of identities of anonymous users if an
authority requests it, if certain conditions are fulfilled. For
example, Benjumea et al, Internet Research, Volume 16, No. 2, 2006
pages 120-139 devise a cryptographic protocol for anonymously
accessing services offered on the web whereby such anonymous
accesses can be disclosed or traced under certain conditions.
[0023] However, even in case absolute anonymity is guaranteed using
a certain pseudonym, with an increasing amount of data the risk
increases that by means of data correlation techniques applied on
the data stored with respect to said pseudonym, conclusions can be
drawn from said data about the owner of the pseudonym. Further, a
large number of different features stored with respect to a person
increases the probability that the person's identity can be
revealed, for example by means of the combination of the ZIP code,
age, profession, marital status and height. Thus, with an
increasing amount of data stored with respect to a pseudonym, the
risk of breaking the user's anonymity is also increasing.
[0024] By means of the above mentioned method of using a private
key from which a whole set of public keys is calculated,
correlation attacks are more likely to fail since correlations can
only be detected for a given single identifier.
[0025] The term `database` as used herein is a collection of
logically-related data or files containing data that provide data
for at least one use or function. Databases are essentially
organized data that may be provided or used by an application.
Examples of a database include, but are not limited to: a
relational database, a file containing data, a folder containing
individual data files, and a collection of computer files
containing data.
[0026] In accordance with an embodiment of the invention, the
identifier corresponds to a public key of the first user.
[0027] In accordance with an embodiment of the invention, the
identifier is a pseudonym of the first user and/or the identifier
is an access key to the data in the database.
[0028] In accordance with an embodiment of the invention, analyzing
the retrieved data is performed by an inference engine. Herein, the
term `inference engine` is understood as any device or computer
program that derives answers from the database. Inference engines
are considered to be a special case of reasoning engines, which can
use more general methods of reasoning. In an embodiment, analyzing
the data can be performed by a decision support system, e.g. in the
medical field for evaluating a user's individual medical data and
processing the data by rules. The result of the evaluation and
processing by rules may be hints and recommendations to the
physician regarding the user's health condition and further
treatment.
[0029] In accordance with an embodiment of the invention, the
method further comprises retrieving a digital signature of the data
stored in the first database along with the data and verifying the
digital signature using the public key of the first user. Thus, the
public key of the first user has two purposes. First, it is used as
an identifier for the user's data in the database. Second, it is
used in order to verify for example the integrity of the user's
data by applying the user's public key to the digital signature of
the data. This significantly simplifies the data analysis process
since no further actions have to be taken in order to ensure the
data integrity. Only the first user is able to sign his data using
his private key--which is only known to the first user.
Consequently, counterfeiting of the user's data is not possible and
also errors which occurred during data storage in the database can
easily be detected.
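The dual use of the public key in paragraph [0029], as identifier and as signature verification key, can be sketched as follows. The application does not name a signature scheme; this example assumes ECDSA over the secp256k1 curve with SHA-256, written in plain Python for readability. The function names are hypothetical, and the arithmetic is neither constant-time nor hardened, so it is a teaching sketch rather than production code.

```python
import hashlib
import secrets

# secp256k1 domain parameters (curve y^2 = x^3 + 7 over F_P); an assumed curve.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def add(p1, p2):
    """Add two curve points; None represents the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if p1 == p2:
        lam = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)


def mul(k, point):
    """Double-and-add scalar multiplication k * point."""
    result = None
    while k:
        if k & 1:
            result = add(result, point)
        point = add(point, point)
        k >>= 1
    return result


def digest(message: bytes) -> int:
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(private_key: int, message: bytes):
    """Performed by the first user before depositing the data."""
    z = digest(message)
    while True:
        k = secrets.randbelow(N - 1) + 1          # fresh random nonce
        r = mul(k, G)[0] % N
        s = pow(k, -1, N) * (z + r * private_key) % N
        if r and s:
            return (r, s)


def verify(public_key, message: bytes, signature) -> bool:
    """Performed by the data processing component after retrieval."""
    r, s = signature
    if not (0 < r < N and 0 < s < N):
        return False
    w = pow(s, -1, N)
    point = add(mul(digest(message) * w % N, G), mul(r * w % N, public_key))
    return point is not None and point[0] % N == r
```

A record that was altered in storage fails `verify` under the same public key that served as its identifier, so no separate key lookup is needed.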
[0030] In accordance with an embodiment of the invention, providing
the result of the analysis comprises sending the result to said
first user with the recipient address to which the result is sent
comprising the public key.
[0031] This has the advantage that `blind messaging` can be
performed: the recipient's identity is not revealed when the
result is sent to the first user at a recipient address
comprising the first user's public key.
Further, even though the data is stored in the first database
associated with the public key of the first user, it will not be
possible to identify the respective `real person` that is
associated with said public key of the first user even when having
full access to said database.
[0032] With the continuously increasing amount of personal
data stored in databases about individual persons, public
awareness of data privacy protection is also increasing.
This results in a conflict: on the one hand, personal data has
to remain available for data analysis systems like the above
mentioned clear text processing sites, and the data analysis
system must be able to provide an analysis result to the owner of
the data. On the other hand, data privacy protection has to be
ensured. Sending the result of the analysis to the first user
at a recipient address comprising the public key resolves this
conflict in an elegant but safe manner.
[0033] It has to be noted here, that generally the `result of the
analysis` is understood as either the direct outcome of the
analysis performed on the first user's data, or as an indirect
outcome of the analysis performed on the first user's data. In a
practical embodiment, in case the analysis is performed with
respect to a determination if based on the user's data the user
qualifies for participation in a certain disease management program
(DMP), the result of the analysis may either be `qualified for DMP
xyz` which is considered as a direct outcome of the analysis, or
the result of the analysis may just be `please consult a medical
doctor`.
[0034] Similarly, in case the data is medical data, the result of
the analysis may comprise a laboratory value like `liver function
test results in a value of 1234` or it may comprise advice such as
`stop drinking alcohol`. However, the invention is not limited to medical
data but may comprise any kind of user data like personal
qualifications, personal documents, information about a user's
daily requirements regarding purchased convenience goods,
information about water and electricity consumption etc.
[0035] In accordance with an embodiment of the invention, the
recipient address corresponds to the public key of the first user.
However, the invention is not limited to this specification. For
example, a domain name may be added to the public key of the first
user in order to make this kind of messaging compatible with
already existing internet messaging systems: the invention thus
permits a result to be directed either directly to the address
`public_recipient_key` or to, for example, the
email address `public_recipient_key@securedomain.com`.
[0036] In case of SMS messaging in mobile telecommunication
networks, the message comprising the result may, for example, be
directed to a central provider telephone number, wherein the body
of the SMS message contains the public key of the first user
followed by the message text, for example `public_recipient_key
message`.
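The addressing variants of paragraphs [0035] and [0036] can be sketched in a few lines of Python. The helper names are hypothetical, and the choice of unpadded lowercase Base32 as the text encoding of the key is an assumption of this example, not something the application specifies. It is chosen here because a 33-byte compressed elliptic curve key then yields 53 characters, which stays within the 64-octet limit that RFC 5321 places on the local part of an email address, whereas a hexadecimal encoding (66 characters) would not.

```python
import base64
from typing import Optional, Tuple


def encode_key(public_key: bytes) -> str:
    """Text form of the public key (assumed encoding: unpadded Base32)."""
    return base64.b32encode(public_key).decode("ascii").rstrip("=").lower()


def decode_key(text: str) -> bytes:
    """Inverse of encode_key: restore the Base32 padding and decode."""
    padded = text.upper() + "=" * (-len(text) % 8)
    return base64.b32decode(padded)


def recipient_address(public_key: bytes, domain: Optional[str] = None) -> str:
    """Either the bare key or key@domain for existing mail systems."""
    local = encode_key(public_key)
    return local if domain is None else f"{local}@{domain}"


def sms_body(public_key: bytes, message: str) -> str:
    """Body of an SMS sent to a central provider number: key, space, text."""
    return f"{encode_key(public_key)} {message}"


def parse_sms_body(body: str) -> Tuple[bytes, str]:
    """Provider side: split the body at the first space into key and text."""
    key_text, _, message = body.partition(" ")
    return decode_key(key_text), message
```

Because the Base32 alphabet contains no spaces, splitting the SMS body at the first space recovers the key unambiguously.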
[0037] The skilled person will understand that there are further
possibilities to realize the basic idea according to the invention
in various messaging environments.
[0038] In accordance with an embodiment of the invention, said
result is sent encrypted with the public key to the first user.
Thus, again the public key of the first user has a double purpose:
the first purpose is the usage as anonymous recipient address and
the second purpose is the usage as encryption key. Since only the
recipient possesses the private key, only he will be able to
decrypt the message. Thus, in a highly convenient manner, secure
messaging can be performed in an anonymous manner, wherein only one
type of information is required to be known by the sender: the
public key.
[0039] In accordance with an embodiment of the invention, said
result is sent from a second user, wherein a sender asymmetric
cryptographic key pair is associated with the second user, said key
pair comprising a public sender key and a private sender key, the
method further comprising generating a signature of the result
using the private sender key and sending the signature to said
first user. This further enables the first user to verify the
authenticity of the result in a very convenient manner. Preferably,
the public sender key is available in a respective database such
that it is possible to also verify that the sender of the message
is an ordinary member of the group of participants which is allowed
to send results to the recipient. For example, the second user may
be the provider of the data processing component.
[0040] Referring back to the above mentioned example of providing
advice to the first user based on the outcome of the analysis of
his data, the use of digital signatures ensures that the first user
is protected from fake information or fake advice from third parties
which could confuse or even mislead the first user.
[0041] In accordance with an embodiment of the invention, the
message is a synchronous or asynchronous conferencing message. For
example, synchronous conferencing may comprise any kind of data
conferencing, instant messaging, Internet Relay Chat (IRC),
videoconferencing, voice chat, or VoIP (voice over IP).
Asynchronous conferencing may comprise email, Usenet, SMS or
MMS.
[0042] In accordance with an embodiment of the invention, the
message is an email message with the message being sent to the
first user by email, wherein the email address comprises the public
key of the first user. For example, in this case the public key of
the first user is comprised in the header of the email as the
recipient address to which the message is sent, wherein the message
is comprised in the body of the email. Variations like having the
public key of the first user being comprised in the body with the
message being sent to a central email service with central email
address are also possible.
[0043] In another aspect, the invention relates to a computer
implemented method for receiving a result from a second user of a
data processing component of a network cloud by a first user,
wherein an asymmetric cryptographic key pair is associated with the
first user, said key pair comprising a public key and a private
key, the method comprising receiving the result by said first user
with the recipient address at which the result is received
comprising the public key.
[0044] In another aspect, the invention relates to a computer
implemented method for storing data of a first user or a second
user in a database of a network cloud, the first user having a
private key, the method comprising calculating a set of public
keys, wherein the private key and each public key of the set of
public keys form an asymmetric cryptographic key pair, storing the
data encrypted with one of the public keys and assigned to the
first user in the second database or storing the data
pseudonymously in the first database with the data being assigned
to an identifier, wherein the identifier comprises one of the
public keys.
[0045] In accordance with a further embodiment of the invention,
the method further comprises generating a digital signature for the
data using the private key, wherein the digital signature is stored
into the database along with the data. This embodiment is
particularly advantageous because the digital signature for the
data allows authentication of the data. In this way the authorship
of the data can be verified.
[0046] In accordance with a further embodiment of the invention,
the method further comprises directly receiving the private key or
generating the private key, wherein generating the private key
comprises receiving an input value and applying a cryptographic
one-way function to the input value for generation of the private
key, wherein the cryptographic one-way function is an injective
function.
[0047] In accordance with an embodiment of the invention, the
method further comprises the step of depositing data into the first
database using the identifier. This embodiment is advantageous
because the identifier may be used to control access to the
database. Alternatively, the identifier could be used as a pseudonym
against which data deposited into the database is referenced.
This provides anonymity for a user. Thus, some embodiments of the
present invention are particularly advantageous as an extremely
high degree of protection of the informational privacy of users is
provided. This is because an assignment of the user's identity to
the user's pseudonym does not need to be stored and because no third
party is required for establishing a binding between the pseudonym
and the user's identity. Some embodiments of the present invention
make it possible to generate a user's pseudonym in response to the
user's entry of a user-selected secret, whereby the pseudonym is
derived from the user-selected secret. As the user-selected secret
is known only by the user and not stored on any computer system,
there is no feasible way that a third party could break the
informational privacy of the user, even if the computer system were
confiscated, such as by a government authority.
[0048] This makes it possible to store sensitive user data, such as
medical data, in an unencrypted form in a publicly accessible database. The
user's pseudonym can be used as a database identifier, e.g. a
primary key or candidate key value that uniquely identifies tuples
in a database relation, for read and write access to data objects
stored in the database.
[0049] In accordance with a further embodiment of the invention,
the public key of the first user or the set of public keys is
calculated from the private key using elliptic curve cryptography,
wherein said calculation is performed by a variation of the domain
parameters used for performing the elliptic curve cryptography. For
the sake of simplicity, only one of the domain parameters
is varied here. For example, in a first step of
calculating a first public key, the private key, a first base
point and a set of further domain parameters may be used. The first
public key is calculated using asymmetric cryptography which is
implemented using elliptic curve cryptography. Then, within the
domain parameters, the first base point is replaced by a second base
point that is not easily inferable from the first base point,
wherein the other domain parameters are kept unmodified. Finally, a
second public key is calculated by elliptic curve cryptography
using the private key, the second base point and the set of
unmodified further domain parameters.
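The base point variation of paragraph [0049] can be illustrated with a minimal Python sketch. Several things here are assumptions of the example rather than statements of the application: the curve (secp256k1), the way alternative base points are obtained (hashing a label and incrementing a counter until a valid x-coordinate is found), the labels themselves, and all function names. The arithmetic is plain textbook code, not constant-time, and is meant only to show that one private key combined with several base points yields several distinct public keys.

```python
import hashlib

# secp256k1: y^2 = x^3 + 7 over F_P. The group has prime order, so every
# point other than infinity generates the whole group.
P = 2**256 - 2**32 - 977
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def on_curve(point) -> bool:
    x, y = point
    return (y * y - x * x * x - 7) % P == 0


def add(p1, p2):
    """Add two curve points; None represents the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if p1 == p2:
        lam = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)


def mul(k, point):
    """Double-and-add scalar multiplication k * point."""
    result = None
    while k:
        if k & 1:
            result = add(result, point)
        point = add(point, point)
        k >>= 1
    return result


def base_point(label: str):
    """Hypothetical derivation of an alternative base point from a label:
    hash to a candidate x and retry until x^3 + 7 is a square mod P."""
    counter = 0
    while True:
        h = hashlib.sha256(f"{label}:{counter}".encode()).digest()
        x = int.from_bytes(h, "big") % P
        rhs = (pow(x, 3, P) + 7) % P
        y = pow(rhs, (P + 1) // 4, P)   # square root, valid as P % 4 == 3
        if y * y % P == rhs:
            return (x, y)
        counter += 1


def public_keys(private_key: int, labels):
    """One public key per label; only the base point varies."""
    return {label: mul(private_key, base_point(label)) for label in labels}
```

For example, `public_keys(d, ["cardiology", "dermatology"])` returns two curve points that could serve as separate identifiers for the same private key `d`, while all other domain parameters stay fixed.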
[0050] However, the invention is not limited to a variation of base
points for calculating the set of public keys--any of the domain
parameters may be varied for this purpose. Nevertheless, a base
point variation is preferred since this provides a computationally
efficient way to compute multiple identifiers for a given user in a
secure way. Furthermore, it is by far more complicated to vary one
or more of the other domain parameters because doing this would
result in a different elliptic curve that would have to fulfill
many conditions in order to be considered valid.
[0051] This embodiment is advantageous because a single private key
has been used to generate a set of public keys to be used as
identifiers. This is particularly advantageous because the public
keys cannot be inferred from each other, provided that their
respective base points cannot be inferred from each other either,
yet only a single input value is
needed for all of them. In other words, in case of a base point
variation, knowledge of one of the public keys does not allow an
attacker to determine any other public key. The used public keys
are therefore not correlatable. However, all of the public keys are
determined by a single input value or private key. It has to be
noted that preferably the base points are meant to be public.
Nevertheless, an embodiment of the invention where the base points
are at the user's discretion may also be possible.
[0052] In accordance with a further embodiment of the invention,
the method further comprises either directly receiving the private
key or generating the private key, wherein generating the private
key comprises receiving an input value and applying a cryptographic
one-way function to the input value for generation of the private
key, wherein the cryptographic one-way function is an injective
function.
[0053] This embodiment has the advantage that a user may either
directly use a private key for the generation of the identifiers,
or alternatively he may use a certain input value from which the
private key may be calculated. The input value may be a
user-selected secret.
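A minimal sketch of deriving the private key from an input value might look as follows. It assumes SHA-256 as the one-way function and the group order of secp256k1 as the upper bound for the key; both are choices of this example, and the function name is hypothetical. Note that a hash function is collision-resistant rather than strictly injective, so it only approximates the injectivity the paragraph asks for; a deliberately slow password hashing function would be a common substitute when the input is a human-chosen password.

```python
import hashlib

# Order of the secp256k1 group (an assumed curve); valid private keys
# are integers in the range 1 .. N-1.
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141


def private_key_from_secret(secret: str) -> int:
    """Map a user-selected secret to a private key with a one-way function."""
    digest = hashlib.sha256(secret.encode("utf-8")).digest()
    # Reduce into 1 .. N-1 so the result is always a usable scalar.
    return int.from_bytes(digest, "big") % (N - 1) + 1
```

The same secret always yields the same key, so the user does not need to store the private key anywhere.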
[0054] The term `user-selected secret` is understood herein as any
secret data that is selected by or related to a user, such as a
user-selected secret password or a secret key, such as a symmetric
cryptographic key. Further, the term `user-selected secret` does
also encompass a combination of biometric data obtained from the
user and a user-selected password or secret key, such as a
biometric hash value of the password or secret key.
[0055] In accordance with a further embodiment of the invention,
the method further comprises receiving the user-selected secret as
the input value, storing the user-selected secret in a memory,
computing the private key by applying an embedding and/or
randomizing function onto the secret, storing the private key in
the memory, computing the set of public keys using the private key
and erasing the secret and the private key from the memory.
[0056] The term `memory` as used herein encompasses any volatile or
non-volatile electronic memory component or a plurality of
electronic memory components, such as a random access memory.
Examples of computer memory include, but are not limited to: RAM
memory, registers, and register files of a processor.
[0057] The term `embedding function` or `embedding component` as
used herein encompasses any injective function that maps the
elements of an n-dimensional space onto elements of an
m-dimensional space, where n>m. For the purpose of this
invention, we focus on embedding functions where m=1. In accordance
with embodiments of this invention n is equal to 2 and m is equal
to 1 for combining two elements onto a single element. In one
embodiment, a user-selected secret and a public parameter are
mapped by the embedding function to the 1-dimensional space to
provide a combination of the user selected secret and the public
parameter, e.g. a single number that embeds the user selected
secret and the public parameter. This single number constitutes the
embedded secret. In another embodiment, a first hash value of the
user selected secret and a random number are mapped by the
embedding function to the 1-dimensional space to provide the
embedded secret.
[0058] A `randomizing function` or `randomizing component` as
understood herein encompasses any injective function that provides
an output of data values that are located within a predefined
interval and wherein the distribution of the data values within the
predefined interval is a substantially uniform distribution.
[0059] The term `embedding and randomizing function` as used herein
encompasses any function that implements both an embedding function
and a randomizing function.
[0060] Even though any known method for generation of asymmetric
cryptographic keys may be employed in order to carry out the
invention, the embodiment employing the user-selected secret for
generating the public key and the private key(s) is particularly
advantageous as an extremely high degree of protection of the
informational privacy of users is provided. This makes it possible
to store sensitive user data, such as medical data, even in an
unencrypted form in a publicly accessible database. A user's public
key can be
used as the database identifier, e.g. a primary key or candidate
key value that uniquely identifies tuples in a database relation,
for access to data objects stored in the database.
[0061] The usage of an embedding and/or randomizing function is
advantageous because the input value may be clear text or an easily
guessed value. By using an embedding and/or randomizing function a
pseudonym which is more difficult to decrypt may be
constructed.
[0062] In accordance with an embodiment of the invention, at least
one public parameter is used for applying the embedding and
randomization function. A public parameter may be the name of the
user, an email address of the user or another identifier of the
user that is publicly known or accessible. A combination of the
user-selected secret and the public parameter is generated by the
embedding component of the embedding and randomization function
that is applied on the user-selected secret and the public
parameter.
[0063] The combination can be generated such as by concatenating
the user-selected secret and the public parameter or by performing
a bitwise XOR operation on the user-selected secret and the public
parameter. This is particularly advantageous as two users may by
chance select the same secret and still obtain different
identifiers as the combinations of the user-selected secrets with
the user-specific public parameters differ.
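The two combination variants named above can be sketched as follows.
This is a minimal illustration that assumes byte-string inputs; the
function names, the 4-byte length prefix and the zero-padding are
assumptions made for the example and are not taken from the
embodiment.

```python
def combine_concat(secret: bytes, public_param: bytes) -> bytes:
    """Concatenate the secret and the public parameter.

    Assumption: a 4-byte length prefix marks where the secret ends, so
    that distinct (secret, parameter) pairs never yield the same bytes.
    """
    return len(secret).to_bytes(4, "big") + secret + public_param


def combine_xor(secret: bytes, public_param: bytes) -> bytes:
    """Bitwise XOR of both inputs, zero-padded to equal length (assumption)."""
    size = max(len(secret), len(public_param))
    a = secret.ljust(size, b"\x00")
    b = public_param.ljust(size, b"\x00")
    return bytes(x ^ y for x, y in zip(a, b))


# Two users who happen to pick the same secret still get different
# combinations because their public parameters differ.
same_secret = b"correct horse"
combo_alice = combine_concat(same_secret, b"alice@example.org")
combo_bob = combine_concat(same_secret, b"bob@example.org")
```

Note that a plain XOR can map two different (secret, parameter) pairs
to the same value, so the concatenation variant is the one that keeps
the combination unambiguous in this sketch.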
[0064] In accordance with an embodiment of the invention, the
embedding component of the embedding and randomizing function
comprises a binary Cantor pairing function. The user-selected
secret and the public parameter are embedded by applying the binary
Cantor pairing function on them.
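As a hedged illustration, the Cantor pairing function can be written
as below, reading "binary" as "taking two arguments". Interpreting the
secret and the public parameter as big-endian integers is an
assumption of this sketch, as are all function names.

```python
from math import isqrt


def cantor_pair(a: int, b: int) -> int:
    """Map a pair of non-negative integers to one non-negative integer."""
    return (a + b) * (a + b + 1) // 2 + b


def cantor_unpair(z: int) -> tuple[int, int]:
    """Invert cantor_pair, which shows that the mapping is injective."""
    w = (isqrt(8 * z + 1) - 1) // 2
    b = z - w * (w + 1) // 2
    return w - b, b


def embed(secret: bytes, public_param: bytes) -> int:
    """Pair both inputs after reading them as big-endian integers.

    Caveat: leading zero bytes are lost by the integer conversion.
    """
    return cantor_pair(int.from_bytes(secret, "big"),
                       int.from_bytes(public_param, "big"))
```

Because the pairing can be inverted, two different (secret, parameter)
pairs cannot collapse onto the same embedded value.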
[0065] In accordance with an embodiment of the invention, the
randomizing component of the embedding and randomizing function
uses a symmetric cryptographic algorithm like the Advanced
Encryption Standard (AES) or the Data Encryption Standard (DES) by
means of a symmetric key. This can be performed by encrypting the
output of the embedding component of the embedding and randomizing
function, e.g. the binary Cantor pairing function, using AES or
DES.
[0066] In accordance with an embodiment of the invention, the
symmetric key that is used for randomization by means of a
symmetric cryptographic algorithm is user-specific. If the
symmetric key is user-specific, the use of a public parameter can
be skipped, as well as embedding the user-selected secret and the
public parameter; the randomizing function can be applied then
solely on the user-selected secret. By applying a symmetric
cryptographic algorithm onto the user-selected secret using a
user-specific symmetric key embedding can be skipped and
randomization of the user-selected secret is accomplished. If the
symmetric key is not user-specific, the use of the public parameter
and embedding the user-selected secret and the public parameter are
necessary.
[0067] In accordance with an embodiment of the invention, the
embedding and randomizing function is implemented by performing the
steps of applying a first one-way function on the user-selected
secret to provide a first value, providing a random number,
embedding the random number and the first value to provide a
combination, and applying a second one-way function on the
combination to provide a second value, wherein the second value
constitutes the private key. This embodiment is particularly
advantageous as it provides a computationally efficient method of
implementing an embedding and randomization function.
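The sequence of steps in this embodiment can be sketched as follows.
The choice of SHA-256 and SHA3-256 as the two one-way functions, the
use of concatenation as the embedding step and the function name are
assumptions made for illustration only.

```python
import hashlib
import secrets


def derive_candidate(secret: bytes, random_number: bytes) -> int:
    """First hash, embed with the random number, then second hash."""
    # First one-way function applied to the user-selected secret.
    first_value = hashlib.sha256(secret).digest()
    # Embedding: the first value has a fixed length of 32 bytes, so
    # plain concatenation is unambiguous here.
    combination = first_value + random_number
    # Second one-way function applied to the combination.
    second_value = hashlib.sha3_256(combination).digest()
    return int.from_bytes(second_value, "big")


random_number = secrets.token_bytes(32)
candidate = derive_candidate(b"user-selected secret", random_number)
```

The same secret and the same random number always reproduce the same
value, which is what allows the key to be recomputed later instead of
being stored.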
[0068] In accordance with an embodiment of the invention, it is
determined whether the output of the embedding and randomizing
function fulfils a given criterion. For example, it is checked
whether the output of the embedding and randomization function is
within the interval between 2 and n-1, where n is the order of the
elliptic curve. If the output of the embedding and randomizing
function does not fulfill the criterion another random number is
generated and the embedding and randomization function is applied
again to provide another output which is again checked against the
criterion. This process is performed repeatedly until the embedding
and randomizing function provides an output that fulfils the
criterion. The output is then regarded as the private key that is
used to calculate the public key, by multiplying the private key
with the first base point.
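The retry loop described above might look like the following sketch.
EXAMPLE_N is a placeholder for the order n and is not the order of
any real curve; example_candidate is a hypothetical stand-in for the
embedding and randomizing function.

```python
import hashlib
import secrets
from typing import Callable

# Placeholder for the order n (assumption, not a real curve order).
EXAMPLE_N = 2 ** 255


def find_private_key(candidate_fn: Callable[[bytes], int], n: int,
                     max_tries: int = 1000) -> tuple[int, bytes]:
    """Draw random numbers until candidate_fn(r) lies in [2, n - 1]."""
    for _ in range(max_tries):
        random_number = secrets.token_bytes(32)
        candidate = candidate_fn(random_number)
        if 2 <= candidate <= n - 1:
            return candidate, random_number
    raise RuntimeError("no candidate met the criterion")


def example_candidate(random_number: bytes) -> int:
    """Stand-in for the embedding and randomizing function (assumption)."""
    digest = hashlib.sha256(b"user-selected secret" + random_number).digest()
    return int.from_bytes(digest, "big")


private_key, used_random = find_private_key(example_candidate, EXAMPLE_N)
```

The function also returns the random number that succeeded, since
that number is needed to reproduce the same private key later.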
[0069] In another aspect, the invention relates to a computer
program product comprising computer executable instructions to
perform any of the method steps described above.
[0070] In another aspect, the invention relates to a cloud network
for performing cloud computing on data of a first user, the cloud
comprising cloud components, the cloud components comprising a
first database and a data processing component, wherein an
asymmetric cryptographic key pair is associated with the first
user, said asymmetric cryptographic key pair comprising a public
key and a private key, the data being stored pseudonymously and
non-encrypted in the first database with the data being assigned to
an identifier, wherein the identifier comprises the public key,
wherein the data processing component is adapted for retrieving the
data from the first database, wherein retrieving the data from the
first database comprises receiving the identifier and retrieving
the data assigned to the identifier from the first database,
wherein the data processing component is further adapted for
processing the retrieved data and providing a result of the
analysis.
[0071] In another aspect, the invention relates to a computer
system for receiving a result from a second user of a network cloud
by a first user, wherein an asymmetric cryptographic key pair is
associated with the first user, said key pair comprising a public
key and a private key, the system comprising means for receiving
the result by said first user with the recipient address at which
the result is received comprising the public key.
[0072] The term `computer system` as used herein encompasses any
device comprising a processor. The term `processor` as used herein
encompasses any electronic component which is able to execute a
program or machine executable instructions. References to the
computing device comprising "a processor" or a "microcontroller"
should be interpreted as possibly containing more than one
processor. The term `computer system` should also be interpreted to
possibly refer to a collection or network of computing devices each
comprising a processor. Many programs have their instructions
performed by multiple processors that may be within the same
computing device or which may be even distributed across multiple
computing devices.
[0073] In accordance with an embodiment of the invention, the
system either comprises means for directly receiving the private
key or means for receiving an input value, wherein the processor
means are further operable for generating the private key, wherein
generating the private key comprises applying a cryptographic
one-way function to the input value for generation of the private
key, wherein the cryptographic one-way function is an injective
function.
[0074] In accordance with an embodiment of the invention, the input
value is a user-selected secret, the system further comprising a
memory for storing the user-selected secret and a private key and a
processor operable for executing instructions stored in the memory,
wherein the memory contains instructions for performing the steps
of: [0075] storing the user-selected secret in the memory; [0076]
computing the private key by applying an embedding and/or
randomizing function onto the secret; [0077] storing the private
key in the memory; [0078] computing the set of public keys using
the private key; and [0079] erasing the secret and the private key
from the memory.
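The store, compute and erase sequence listed above can be sketched as
below. This is a best-effort illustration only: a Python runtime may
keep additional copies of immutable objects, so the sketch does not
guarantee erasure. The helper names are hypothetical, and a hash is
used as a stand-in for the public-key computation.

```python
import hashlib
from contextlib import contextmanager


@contextmanager
def wiped(buffer: bytearray):
    """Yield a mutable buffer and overwrite it with zeros on exit."""
    try:
        yield buffer
    finally:
        for i in range(len(buffer)):
            buffer[i] = 0


def compute_identifier(secret_input: bytes) -> str:
    """Store the secret, derive a key and an identifier, erase both."""
    with wiped(bytearray(secret_input)) as secret:
        with wiped(bytearray(hashlib.sha256(secret).digest())) as private_key:
            # Stand-in for the public-key computation (assumption): a
            # hash of the private key, NOT an elliptic-curve operation.
            h = hashlib.sha256(b"id")
            h.update(private_key)
            return h.hexdigest()
```

Only the identifier leaves the function; both working buffers are
zeroed even if an exception occurs in between.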
[0080] In another aspect, the invention relates to a computer
system for storing data of a first user in a first or a second
database of a network cloud, the first user having a private key,
the system comprising: [0081] processor means for calculating a set
of public keys, wherein the private key and each public key of the
set of public keys form an asymmetric cryptographic key pair,
[0082] means for storing the data encrypted with one of the public
keys and assigned to the first user in the second database or
[0083] means for storing the data pseudonymously in the first
database with the data being assigned to an identifier, wherein the
identifier comprises one of the public keys.
BRIEF DESCRIPTION OF THE DRAWINGS
[0084] In the following, embodiments of the invention are explained
in greater detail, by way of example only, making reference to the
drawings in which:
[0085] FIG. 1 is a block diagram of a first embodiment of a
computer system of the invention,
[0086] FIG. 2 is a flowchart being illustrative of an embodiment of
a method of the invention,
[0087] FIG. 3 is a block diagram of a further embodiment of a
computer system of the invention,
[0088] FIG. 4 is a flowchart being illustrative of a further
embodiment of a method of the invention,
[0089] FIG. 5 is a flowchart being illustrative of a further
embodiment of a method of the invention,
[0090] FIG. 6 is a flowchart being illustrative of a further
embodiment of a method of the invention.
DETAILED DESCRIPTION
[0091] Throughout the following detailed description like elements
of the various embodiments are designated by identical reference
numerals.
[0092] FIG. 1 shows a cloud network for performing cloud computing
on data objects. The cloud network comprises cloud components which
comprise a first pseudonymous database 138, a second database 156
and an analytic system 144. Further, the cloud components comprise
a computer system 100 of a user.
[0093] Without loss of generality, the following assumes a scenario
in which user data comprises medical data, wherein the analytic
system 144 is part of a computer system of a medical doctor. As a
starting point, medical data objects 120 may have to be stored in
the database 156 in an encrypted manner.
[0094] For this purpose, the computer system 100 has a user
interface 102 for a user's entry of a user-selected secret 112 that
is designated as s.sub.T in the following. For example, a keyboard
104 may be coupled to the computer system 100 for entry of s.sub.T.
Instead of a keyboard 104 a touch panel or another input device can
be coupled to the computer system 100 for entry of s.sub.T. In
addition, a sensor 106 can be coupled to the computer system 100
such as for capturing biometric data from a biometric feature of
the user. For example, the sensor 106 may be implemented as a
fingerprint sensor in order to provide biometric fingerprint data
to the computer system 100.
[0095] A public parameter, such as the user's name or email
address, can also be entered into the computer system 100 via the
keyboard 104 or otherwise. For example, a personal set V.sub.T,i
containing at least one user-specific public parameter, such as the
user's name or email address, is entered into the computer system
100 by the user T.sub.i.
[0096] The computer system 100 has a memory 108, such as a random
access memory, and at least one processor 110. The memory 108
serves for temporary storage of the user-selected secret s.sub.T
112, a combination 114 of s.sub.T 112 and V.sub.T,i, a private key
116, a public key 118 that constitutes an identifier for a database
and/or pseudonym of the user T.sub.i, and a data object 120, such
as a medical data object containing medical data related to the
user T.sub.i. Further, the memory 108 serves for computer program
instructions 122 to be loaded for execution by the processor
110.
[0097] The computer program instructions 122 provide an embedding
and randomizing function 126, a key generator 128 and may also
provide a database access function 130 when executed by the
processor 110. Further, the instructions 122 may provide data
encryption and decryption capabilities to the computer 100.
[0098] The embedding and randomizing function 126 may be provided
as a single program module or it may be implemented by a separate
embedding function 132 and a separate randomizing function 134. For
example, the embedding function 132 or an embedding component of
the embedding and randomization function 126 provides the
combination 114 by concatenating s.sub.T and the user's name or by
performing a bitwise XOR operation on s.sub.T and the user's
name.
[0099] In one implementation, the embedding and randomizing
function 126 implements symmetric encryption provided by a
symmetric cryptographic algorithm, e.g. AES, using a user-specific
symmetric key for encryption of the user-selected secret 112. This
provides randomizing of s.sub.T 112, while embedding can be
skipped.
[0100] In another implementation, the embedding function 132 is
implemented by a binary Cantor pairing function for embedding
s.sub.T 112 and V.sub.T,i, and the randomizing function 134 is
implemented by AES encryption using a symmetric key that is the
same for the entire set of users T.
[0101] In still another embodiment the embedding and randomizing
function 126 is implemented by an embedding function, two different
hash functions and a random number generator (cf. the embodiment of
FIGS. 3 and 4).
[0102] The purpose of the embedding and randomizing function 126 is
to calculate the private key 116, as will be described in detail
with respect to FIG. 4.
[0103] The key generator 128 serves to compute public key 118 using
elliptic curve cryptography (ECC). The base point given by the
domain parameters of the elliptic curve is multiplied by the
private key 116 which provides the public key 118. By varying the
base point and leaving the other domain parameters of the elliptic
curve unchanged multiple identifiers and/or pseudonyms comprising
respective public keys can be computed for the user T.sub.i on the
basis of the same secret s.sub.T. Thus, this results in a set of
public keys.
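The base point variation can be illustrated on a deliberately tiny
curve. The curve y^2 = x^3 + 2x + 2 over F_17 (19 points) is a
textbook-sized example chosen for this sketch and offers no security;
the private key value and all function names are likewise
assumptions.

```python
# Toy domain parameters: y^2 = x^3 + A*x + B over F_P (NOT secure).
P, A, B = 17, 2, 2


def ec_add(p1, p2):
    """Add two affine points; None stands for the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if p1 == p2:
        lam = (3 * x1 * x1 + A) * pow((2 * y1) % P, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)


def scalar_mult(k, point):
    """Multiply a point by the integer k using double-and-add."""
    result, addend = None, point
    while k > 0:
        if k & 1:
            result = ec_add(result, addend)
        addend = ec_add(addend, addend)
        k >>= 1
    return result


def on_curve(point):
    x, y = point
    return (y * y - (x ** 3 + A * x + B)) % P == 0


private_key = 7                          # hypothetical private key
base_points = [(5, 1), (6, 3), (10, 6)]  # only the base point varies
public_keys = [scalar_mult(private_key, g) for g in base_points]
```

A single private key thus yields one public key per base point while
the remaining domain parameters stay fixed.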
[0104] The computer system 100 may have a network interface 136 for
coupling the computer system 100 to the database 156 via a
communication network 140, such as the Internet. The database
access function 130 provides write and read access to the data
object 120 stored in the database 156
encrypted with the public key 118, wherein the encrypted data
object is denoted by reference numeral 158.
[0105] The analytic system 144 comprises a component 146 for
analyzing data objects of the users T, such as by data mining or
data clustering. In one application the data objects contain
medical data of the various users. By analyzing the various data
objects using techniques such as data mining and/or data clustering
techniques, medical knowledge can be obtained. For example, data
clustering may reveal that certain user attributes contained in the
medical data increase the risk for certain diseases.
[0106] As mentioned above, it is assumed that the analytic system
144 is part of a computer system of a medical doctor. In detail,
the analytic system 144 is a clear text processing site which can
only analyze clear text, i.e. non-encrypted data.
[0107] In case the encrypted data 158 comprises medical records of
a certain user, on a visit with a medical doctor the user may wish
to provide these medical records to the doctor in order to enable
a detailed review of the course of disease of said user. For this
reason, the user may use his computer system 100 and read the
encrypted data 158 from the database 156 using the database access
function 130. Then, using the program 122 he may decrypt the
encrypted data 158 which results in the data object 120. Data
decryption is performed employing the user's private key 116 which
may be generated on demand by the computer 100 when receiving the
user selected secret 112 and the public parameter, as described
above.
[0108] In a subsequent step, the `clear text` data, i.e. the data
object 120 will be stored in the database 138. The network
interface 136 also serves for coupling the computer system 100 to
the database 138 via the communication network 140, wherein the
database access function 130 further provides write and read access
to the data object 120 to be stored in the database 138 using the
public key 118 as a database access key, e.g. a primary key,
candidate key or foreign key value that uniquely
identifies tuples in a database relation. Alternatively, the public
key 118 may be used as a pseudonym, wherein the data object 120 is
to be stored associated with the pseudonym in the database 138.
[0109] As mentioned above, it is preferred not to store multiple
data objects 120 in the database 138 using only one identifier,
since data correlation analysis performed on said data objects 120
may yield information that makes it possible to identify the user,
i.e. the owner of said data objects. Instead, the data objects are
stored in
the database 138 in a distributed manner with different data
objects 120 being accessible with different public keys 118,
wherein the private key 116 and each public key of the set of
public keys form an asymmetric cryptographic key pair.
[0110] After having stored the data object 120 in the database 138
using the public key 118 as the database access key, the user may
provide the public key 118 to the analytic system 144. In turn,
using the public key 118, the system 144 is able to access the
database 138, retrieve the data object 120 stored with respect to
said public key 118 and perform an analysis on the data object 120
using its component 146 for analyzing the data objects.
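The storage and retrieval just described can be sketched with an
in-memory SQLite table that uses the public key as primary key. The
table layout, the second key value, the sample data and the threshold
are assumptions made for illustration.

```python
import json
import sqlite3

# Stands in for the pseudonymous database 138.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE records (public_key TEXT PRIMARY KEY, data TEXT NOT NULL)"
)


def store(public_key_hex: str, data_object: dict) -> None:
    """Write an unencrypted data object under a public key."""
    db.execute("INSERT INTO records VALUES (?, ?)",
               (public_key_hex, json.dumps(data_object)))


def retrieve(public_key_hex: str):
    """Return the data object stored under the public key, or None."""
    row = db.execute("SELECT data FROM records WHERE public_key = ?",
                     (public_key_hex,)).fetchone()
    return json.loads(row[0]) if row else None


# One user, two data objects, two different public keys.
store("FF06763D11A64", {"systolic": 150, "diastolic": 95})
store("0A91C27E5B3D8", {"allergy": "penicillin"})

# The analytic system only uses the key it was handed.
record = retrieve("FF06763D11A64")
elevated = record["systolic"] >= 140  # arbitrary illustrative threshold
```

Nothing in the table links the two rows to each other or to a named
person; the only handle on a record is the public key itself.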
[0111] In an embodiment, the analytic system 144 may be a decision
support system (DSS) or generally an inference engine. In one
application, the data objects stored in the database 138 contain
medical data of the various users. By analyzing the various data
objects using techniques such as data mining and/or data clustering
techniques, medical knowledge can be obtained. For example, data
clustering may reveal that certain user attributes contained in the
medical data increase the risk for certain diseases.
[0112] It should be noted here that the computer system 100 may
generally be part of a computer system of a medical doctor. During
a visit to the doctor, the patient, i.e. the user, may
enter his or her required parameters like the secret 112 in order
to permit the computer system to perform the steps of data
retrieval and pseudonymous data storage as described above. After
these steps are completed, the doctor may send a command to the
analytic system 144 for starting an analysis on the patient's data.
The analytic system may also be part of the computing system 100,
or it may be an external system implemented in hardware or software
in the network cloud.
[0113] The outcome of the analysis performed by the system 144 may
be provided by the analytic system 144 to the users T, i.e. the
owners of the records stored in the database 138. Sending the
result of the individual analysis of the records of each user to
the respective users may be performed by any kind of messaging, as
described above. Preferably, the recipient address to which a
respective analysis result is sent comprises the public key of the
respective user. Reception of the result may be performed using the
system 100 via its interface 136 and a dedicated reception
component implemented by hardware or software.
[0114] In an alternative preferred embodiment, the outcome of the
analysis is stored again in the database 138, associated with the
user public key 118 as the database access key. Thereupon, the
outcome of the analysis may be read by the computer system 100,
encrypted using the public key 118 and stored in the database 156.
Finally, the data objects and analysis results stored with respect
to the public key 118 in the database 138 may be deleted from said
database 138, since they are not required any more.
[0115] It will be understood that the above described steps
constitute only one possible embodiment of the invention.
Variations are possible. For example, in case a multitude of public
keys is generated from a single private key, the public keys used
for temporary data storage in the database 138 may be keys of the
medical doctor owning the system 100. Thus, the system 100 may
serve two purposes, namely data encryption and decryption using the
patient's (user's) asymmetric cryptographic key pairs, as well as
temporary data storage in the database 138 and initiation of the
data mining process via the analytic system 144 using the medical
doctor's asymmetric cryptographic key pairs. Of course, this
requires that the patient (user) entrusts the medical doctor with
his medical data, which is nevertheless usually the case.
[0116] Referring back to the generation of the private key 116 and
the public key(s) 118, for generating a pseudonym p.sub.T,i for a
user T.sub.i based on the secret s.sub.T 112 and domain parameters
D.sub.i containing a base point for the elliptic curve cryptography
the following steps are executed by the computer system 100 in
operation:
[0117] The user T.sub.i enters his or her user-selected secret
s.sub.T 112 such as via the keyboard 104. In addition, the user may
enter at least one public parameter V.sub.T,i such as his name or
email address via the keyboard 104 or otherwise. Such a public
parameter V.sub.T,i may also be permanently stored in the computer
system 100.
[0118] The secret s.sub.T 112 is temporarily stored in the memory
108. Upon entry of the secret s.sub.T 112 the embedding function
132 or the embedding component of the embedding and randomizing
function 126 generates the combination 114 of the secret s.sub.T
112 and the public parameter V.sub.T,i. The resultant combination
114 is temporarily stored in the memory 108.
[0119] Next, the randomizing function 134 or the randomizing
component of the embedding and randomizing function 126 is invoked
in order to calculate the private key 116 on the basis of the
combination 114. The resultant private key 116 is temporarily
stored in memory 108. In the next step, the key generator 128 is
started for computing the public key 118 by multiplying the base
point contained in the domain parameters D.sub.i of the elliptic
curve being used by the private key 116.
[0120] The public key 118, which is the identifier, i.e. in one
embodiment the pseudonym p.sub.T,i, is stored in memory 108. The
secret s.sub.T 112, the combination 114 as well as the private key
116 as well as any intermediate result obtained by execution of the
embedding and randomizing function 126 and the key generator 128
are then erased from the memory 108 and/or the processor 110. As a
consequence, there is no technical means to reconstruct the
assignment of the resultant pseudonym to the user T.sub.i as only
the user knows the secret s.sub.T 112 that has led to the
generation of his or her pseudonym p.sub.T,i. A data object 120
containing sensitive data of the user T.sub.i, such as medical
data, can then be stored by execution of the database access
function 130 in the pseudonymous database 138 using the pseudonym
p.sub.T,i 118 as a database access key, e.g. a primary key or
candidate key value that uniquely identifies tuples in a database
relation.
[0121] The user-selected secret s.sub.T 112 may be obtained by
combining a user-selected password or secret key with biometric
data of the user T.sub.i that is captured by the sensor 106. For
example, a hash value of the user-selected password or secret key
is calculated by execution of respective program instructions by
the processor 110. In this instance the hash value provides the
user-selected secret s.sub.T 112 on which the following
calculations are based.
[0122] A plurality of users from the public set of enrolled
participants T may use the computer system 100 to generate
respective pseudonyms p.sub.T,i and to store data objects
containing sensitive data, such as medical information, in the
database 138 as it has been described above in detail for one of
the users T.sub.i by way of example.
[0123] For reading the data object of one of the users T.sub.i from
the database 138, the user has to enter the secret s.sub.T 112.
Alternatively, the user has to enter the user-selected password or
secret key via the keyboard 104 and an acquisition of the biometric
data is performed using the sensor for computation of a hash value
that constitutes s.sub.T 112. As a further alternative, the secret
key is read by the computer system from an integrated circuit chip
card of the user. On the basis of s.sub.T 112 the pseudonym can be
computed by the computer system 100.
[0124] The above mentioned steps may be repeated several times for
the generation of the set of identifiers from a single secret
s.sub.T 112 or a single private key 116, wherein preferably only
the base point is varied.
[0125] FIG. 2 shows a corresponding flowchart.
[0126] In step 200 the user T.sub.i enters his or her user-selected
secret s.sub.T and public parameter V.sub.T,i. In step 202 s.sub.T
and V.sub.T,i are combined to provide the first combination 114 by
the embedding function (cf. embedding function 132 of FIG. 1).
Next, the randomizing function (cf. randomizing function 134 of
FIG. 1) is applied on s.sub.T and V.sub.T,i in step 204 which
provides a private key. As an alternative, an embedding and
randomizing function 126 is applied on s.sub.T and V.sub.T,i which
provides the private key.
[0127] In step 206 a public key is computed using the private key
obtained in step 204 and the public key is used in step 208 as a
pseudonym of the user T.sub.i. For example the pseudonym may be
used as a database identifier, e.g. a primary key or candidate key
value that uniquely identifies tuples in a database relation for
storing a data object for the user T.sub.i in a database with
pseudonymous data (cf. database 138 of FIG. 1).
[0128] When carrying out step 206, the public key is calculated
from the private key using elliptic curve cryptography, wherein
said calculation is performed by a variation of the domain
parameters used for performing the elliptic curve cryptography. For
example, a base point variation may be performed for this
purpose.
[0129] Even though the above description always refers to using
the public key as pseudonym and using the pseudonym as a database
access key, the invention is not limited to this embodiment. For
example, the public keys generated using the steps above may only
be a part of respective database access keys or pseudonyms, i.e.
they may be comprised in the access keys or the pseudonyms. An
example may be that the public key is given by `FF06763D11A64`,
wherein the identifier used for accessing data in a database named
`xyz` may be given by `xyz-FF06763D11A64`.
[0130] FIG. 3 shows a further embodiment of computer system 100. In
the embodiment considered here the embedding and randomizing
function 126 comprises an embedding function 132, a random number
generator 148, a first hash function 150 and a second hash function
152. In the embodiment considered here the computation of the
private key 116 based on s.sub.T 112 may be performed as
follows:
[0131] The first hash function 150 is applied on the user-selected
secret s.sub.T 112. This provides a first hash value. Next, a
random number is provided by the random number generator 148. The
random number and the first hash value are combined by the
embedding function 132 to provide the combination 114, i.e. the
embedded secret s.sub.T 112.
[0132] The combination of the first hash value and the random
number can be obtained by concatenating the first hash value and
the random number or by performing a bitwise XOR operation on the
first hash value and the random number by the embedding function
132. The result is a combination on which the second hash function
152 is applied to provide a second hash value. The second hash
value is the private key 116 on which the calculation of the public
key 118 is based.
[0133] Dependent on the implementation it may be necessary to
determine whether the second hash value fulfils one or more
predefined conditions. Only if such conditions are fulfilled by the
second hash value it is possible to use the second hash value as
the private key 116 for the following computations. If the second
hash value does not fulfill one or more of the predefined
conditions, a new random number is provided by the random number
generator 148, on the basis of which a new second hash value is
computed, which is again checked against the one or more predefined
conditions (cf. the embodiment of FIG. 4).
[0134] The random number on the basis of which the private key 116
and thereafter the public key 118 have been computed is stored in a
database 154 that is coupled to the computer system 100 via the
network 140. The random number may be stored in the database 154
using the public parameter V.sub.T,i as the database identifier for
retrieving the random number for reconstructing the pseudonym at a
later point of time.
[0135] By means of the system 100, a set of identifiers comprising
different public keys 118 is generated using the single secret 112
or directly the single private key 116, wherein for example in case
of elliptic curve cryptography only the base point of a set of
domain parameters is varied for this purpose. Individual base
points 190 used for generation of the individual public keys 118
may also be stored in the memory 108 of the computing system 100.
Alternatively, the base points may be stored in the database 154 or
any other database external to the system 100.
[0136] Generated identifiers may be used for accessing the database
138 using the module 130, which was described with respect to FIG.
1.
[0137] The user T.sub.i may use the public key provided by the
computer system 100 for sending a message comprising data to an
address comprising the public key or to an address which consists
of the public key. For example, the user may send a message to the
database 138. The message may comprise the decrypted data object
120 of the user, wherein upon reception of the message by the
database 138, the database 138 may store the data object assigned
to the user's public key in a pseudonymous manner.
[0138] It has to be noted that knowledge of the user's public key
118 permits the user T.sub.i to send information to his own
messaging account, and likewise permits any other institution,
device or person who knows the public key 118 to send information
to the user T.sub.i. As discussed with respect to FIG. 1, the
analytic system 144 may analyze content of the database 138,
wherein data of a certain user T.sub.i is stored assigned to an
identifier, wherein the identifier comprises the public key of
this user. Analysis of the content of the database 138 may result
in the public key of the certain user T.sub.i, as well as in an
analysis result. Thereupon, the analytic system 144 may send this
result as a message or in a message using a recipient address
comprising the determined public key of the certain user T.sub.i.
The message will then be received either directly by the user via
the computing system 100, or via a message provider.
[0139] In the general case when a user T.sub.1 wants to send a
message to user T.sub.2, this requires that user T.sub.1 is able to
obtain the messaging address of T.sub.2. For this purpose, he may
access a PKI (public key infrastructure) from which the address may
be obtained.
[0140] According to an embodiment, access to the PKI may be
performed by the user T.sub.1 by using a pseudonym of the user
T.sub.2. It has to be noted that this pseudonym is not to be
confused with the pseudonym comprising the public user key. Here,
the pseudonym may be any identifier which is associated in a
database of the PKI with the user's messaging address. Thus, the
user T.sub.2 may provide his pseudonym to user T.sub.1, who may
then access the PKI for retrieval of the respective messaging
address of user T.sub.2.
[0141] Generally, for reconstructing the public key the user has to
enter his or her user-selected secret s.sub.T 112 into the computer
system on the basis of which the first hash value is generated by
the hash function 150, and the combination 114 is generated by the
embedding function 132 or the embedding component of the embedding
and randomizing function 126 using the first hash value and the
random number retrieved from the database 154.
[0142] Depending on the implementation, the user may also need to
enter the user's public parameter V.sub.T,i. A database access is
performed using the user's public parameter V.sub.T,i as a database
identifier, e.g. a primary key or candidate key value that uniquely
identifies tuples in a database relation, in order to retrieve the
random number stored in the database 154.
[0143] In other words, the reconstruction of the private key 116 is
performed by applying the embedding function 132 on the first hash
value obtained from the user-selected secret s.sub.T 112 and the
retrieved random number which yields the combination 114. The first
hash value is combined with the random number retrieved from the
database 154 by the embedding function 132 to provide the
combination onto which the second hash function 152 is applied
which returns the private key 116, out of which the public key 118,
i.e. the identifier, can be computed. After the user T.sub.i has
recovered his or her identifier a database access for reading
and/or writing from or to the database 138 may be performed or the
user may log into an online banking system for performing online
banking transactions using his identifier as a TAN.
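The reconstruction of the private key 116 from the user-selected secret and the stored random number may be sketched as follows. This is only an illustration under assumptions: a Python dictionary stands in for the database 154, SHA-256 stands in for both hash functions, concatenation is used as the embedding function 132, the check of predefined conditions is omitted for brevity, and all function names are hypothetical.

```python
import hashlib
import secrets

# Hypothetical stand-in for database 154, keyed by the public parameter V_T,i.
random_store = {}


def combine(secret: str, random_number: bytes) -> int:
    """Hash the secret, embed it with the random number, and hash again."""
    first_hash = hashlib.sha256(secret.encode("utf-8")).digest()
    second_hash = hashlib.sha256(first_hash + random_number).digest()
    return int.from_bytes(second_hash, "big")


def enroll(secret: str, public_parameter: str) -> int:
    """Create a private key and store the random number under V_T,i."""
    random_number = secrets.token_bytes(32)
    random_store[public_parameter] = random_number
    return combine(secret, random_number)


def reconstruct(secret: str, public_parameter: str) -> int:
    """Recompute the private key from the secret and the stored random number."""
    return combine(secret, random_store[public_parameter])
```

Only the random number is stored; without the user-selected secret the stored value alone does not yield the private key.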
[0144] FIG. 4 shows a respective flowchart for generating a
pseudonym p.sub.T,i for user T.sub.i. In step 300 the user enters
the user-selected secret s.sub.T. In step 304 a first hash function
is applied on the user-selected secret s.sub.T which provides a
first hash value. In step 306 a random number is generated and in
step 308 an embedding function is applied on the first hash value
and the random number to provide a combination 114 of the first
hash value and the random number. In other words, the first hash
value and the random number are mapped to a 1-dimensional space,
e.g. a single number, by the embedding function. The combination
114 can be obtained by concatenating the random number and the
first hash value or by performing a bitwise XOR operation on the
first hash value and the random number.
[0145] In step 310 a second hash function is applied on the
combination which provides a second hash value. The second hash
value is a candidate for the private key. Depending on the
implementation the second hash value may only be usable as a
private key if it fulfils one or more predefined conditions. For
example, if ECC is used, it is checked whether the second hash
value is within the interval between 2 and n-1, where n is the
order of the elliptic curve.
[0146] Fulfillment of such predefined conditions is checked in step
312. If the condition is not fulfilled, the algorithm returns to
step 306. If the condition is fulfilled, then the second hash value
qualifies to be used as a private key in step 314 to compute a
respective public key providing an asymmetric cryptographic
key-pair consisting of the private key and the public key. In step
316 the public key computed in step 314 is used as an identifier
such as for accessing a pseudonymous database or other
purposes.
[0147] In case elliptic curve cryptography is used in step 314 for
generating the public key, in step 318 a single domain parameter is
varied, preferably the base point, wherein all other domain
parameters are left unmodified. However, more than one domain
parameter may also be modified in step 318. Afterwards, using the
modified domain parameter(s), steps 314 and 316 are repeated, which
results in a further public key which can be used as an identifier.
[0148] The method with steps 318, 314 and 316 may be repeated as
often as necessary in order to generate a desired set of
identifiers.
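The repetition of steps 318, 314 and 316 may be illustrated with a deliberately tiny curve. The sketch below assumes the textbook toy curve y^2 = x^3 + 2x + 2 over F_17, which has 19 points and offers no security; the curve, the base points and the function names are assumptions made purely for illustration.

```python
P_MOD, A = 17, 2  # toy curve y^2 = x^3 + 2x + 2 over F_17 (illustration only)


def add(p, q):
    """Add two curve points; None represents the point at infinity."""
    if p is None:
        return q
    if q is None:
        return p
    (x1, y1), (x2, y2) = p, q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None
    if p == q:
        slope = (3 * x1 * x1 + A) * pow(2 * y1, -1, P_MOD) % P_MOD
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD) % P_MOD
    x3 = (slope * slope - x1 - x2) % P_MOD
    return x3, (slope * (x1 - x3) - y1) % P_MOD


def multiply(k, point):
    """Compute the point multiplication [k]point by double-and-add."""
    result = None
    while k:
        if k & 1:
            result = add(result, point)
        point = add(point, point)
        k >>= 1
    return result


def identifiers(private_key, base_points):
    """One public key per base point, all derived from the same private key."""
    return [multiply(private_key, b) for b in base_points]


# Same private key 7, three different base points on the same curve.
print(identifiers(7, [(5, 1), (6, 3), (10, 6)]))
```

Each base point yields a different public key, so a single private key gives a set of distinct identifiers.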
[0149] FIG. 5 shows a block diagram which illustrates an embodiment
of the method according to the invention. In step 500 an input
value is accessed. The input value may be stored in a computer
memory or computer storage device or the input value may be
generated. For example, the input value could be generated from a
user-selected secret. In step 502 an asymmetric cryptographic key
pair is calculated. The input value could be used to generate both
the public and private key, or the input value could also possibly
be the private key. In step 504 the public key of the cryptographic
key pair is outputted as the identifier.
[0150] In step 506, a domain parameter or a set of domain
parameters is varied in accordance with a predefined scheme. Then,
steps 502 to 504 are repeated using the same input value which
results in a further identifier. Again, this is followed by step
506 and the cyclic performance of steps 502 to 504.
[0151] FIG. 6 shows a further embodiment of the method according to
the invention as a block diagram. In step 600 an input value is
accessed. In step 602 an asymmetric cryptographic key pair is
calculated. In step 604 the public key of the cryptographic key
pair is outputted as the identifier. In step 606 a digital
signature for data which is to be deposited into a database is
generated using the private key of the cryptographic key pair. In
step 608 data is deposited along with the digital signature and
possibly the information which of the (variations of the) domain
parameter sets has been used to create the digital signature into a
database using the identifier. The identifier may be used to grant
access to the database or as a permission to write data into the
database or it may also serve as a database access key for the data
being deposited into the database. In step 610 the authenticity of
the data is verified using the identifier, although this step
may alternatively be performed at a later point in time. The
identifier is the complementary public key to the private key
considering the (variation of the) domain parameter set used to
create the digital signature. The private key was used to generate
the digital signature for the data and the public key can be used
to verify the digital signature.
[0152] Again, steps 602 to 608 and optionally step 610 may be
repeated for generation of different identifiers using a single
private key, i.e. a single input value. Different datasets may be
signed using the single private key, wherein the different datasets
and digital signatures are then deposited into the database
possibly along with the information which of the (variations of
the) domain parameter sets has been used to create the digital
signature using the respective identifiers. I.e., the datasets are
deposited in a distributed manner in the database.
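Signing with the private key, depositing under the identifier and verifying with the identifier may be sketched as follows. The description does not name a signature scheme; ECDSA is assumed here. The toy curve y^2 = x^3 + 2x + 2 over F_17 with base point (5, 1) of order 19 is far too small for real use, and the dictionary standing in for the database as well as all function names are hypothetical.

```python
import hashlib
import secrets

P_MOD, A = 17, 2            # toy curve y^2 = x^3 + 2x + 2 over F_17 (assumption)
BASE, ORDER = (5, 1), 19    # base point and its prime order on that curve
database = {}               # hypothetical stand-in for the pseudonymous database


def add(p, q):
    """Add two curve points; None represents the point at infinity."""
    if p is None:
        return q
    if q is None:
        return p
    (x1, y1), (x2, y2) = p, q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None
    if p == q:
        slope = (3 * x1 * x1 + A) * pow(2 * y1, -1, P_MOD) % P_MOD
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD) % P_MOD
    x3 = (slope * slope - x1 - x2) % P_MOD
    return x3, (slope * (x1 - x3) - y1) % P_MOD


def multiply(k, point):
    """Compute [k]point by double-and-add."""
    result = None
    while k:
        if k & 1:
            result = add(result, point)
        point = add(point, point)
        k >>= 1
    return result


def digest(data: bytes) -> int:
    """Hash the data and reduce it modulo the order of the base point."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % ORDER


def sign(private_key: int, data: bytes):
    """Create an ECDSA signature (r, s) with the private key (step 606)."""
    while True:
        k = secrets.randbelow(ORDER - 1) + 1
        r = multiply(k, BASE)[0] % ORDER
        s = pow(k, -1, ORDER) * (digest(data) + r * private_key) % ORDER
        if r and s:
            return r, s


def verify(public_key, data: bytes, signature) -> bool:
    """Check the signature using the identifier, i.e. the public key (step 610)."""
    r, s = signature
    if not (1 <= r < ORDER and 1 <= s < ORDER):
        return False
    w = pow(s, -1, ORDER)
    point = add(multiply(digest(data) * w % ORDER, BASE),
                multiply(r * w % ORDER, public_key))
    return point is not None and point[0] % ORDER == r


def deposit(private_key: int, data: bytes):
    """Derive the identifier, sign the data and store both (steps 604 to 608)."""
    identifier = multiply(private_key, BASE)
    database[identifier] = (data, sign(private_key, data))
    return identifier
```

A caller would deposit a record with deposit(7, b"example record") and later check it with verify(identifier, data, signature) on the stored entry. With a group of only 19 elements a forged signature succeeds by chance quite often, which is why the sketch demonstrates the data flow only.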
MATHEMATICAL APPENDIX
1. Embedding Functions
[0153] There exist n-ary scalar functions
$$d : \mathbb{N} \times \dots \times \mathbb{N} \to \mathbb{N}$$
which are injective, and even bijective, where N is the set of
natural numbers. The function d( ) uniquely embeds an n-dimensional
space, i.e. n-tuples (k.sub.1, . . . , k.sub.n), into scalars, i.e.
natural numbers k.
2. The Binary Cantor Pairing Function
[0154] The binary Cantor pairing function .pi. is an embodiment of
embedding function 132. The binary Cantor pairing function is
defined as follows:
$$\pi : \mathbb{N} \times \mathbb{N} \to \mathbb{N}, \qquad \pi(m, n) = \frac{1}{2}(m + n)(m + n + 1) + n$$
which assigns to each fraction m/n the unique natural number
.pi.(m, n)--thus demonstrating that there are no more fractions
than integers. Hence, if we map both s.sub.T and V.sub.T,i to
natural numbers and use the fact that all identities are distinct
then .pi.(s.sub.T, V.sub.T,i) yields a unique value for each
identity, even if there are equal personal secrets. To be more
precise, since this function does distinguish between e.g. 1/2, 2/4
etc, it assigns to each fraction an infinite number of unique
natural numbers.
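The pairing function and its inverse may be sketched as follows; the function names are hypothetical and the inverse is added here only to illustrate that the mapping is bijective.

```python
from math import isqrt


def cantor_pair(m: int, n: int) -> int:
    """pi(m, n) = (m + n)(m + n + 1) / 2 + n for natural numbers m, n."""
    return (m + n) * (m + n + 1) // 2 + n


def cantor_unpair(z: int):
    """Invert cantor_pair and return the pair (m, n)."""
    w = (isqrt(8 * z + 1) - 1) // 2   # index of the diagonal containing z
    n = z - w * (w + 1) // 2
    return w - n, n


print(cantor_pair(1, 2), cantor_pair(2, 4))  # 8 25
print(cantor_unpair(8))                      # (1, 2)
```

The pairs (1, 2) and (2, 4) represent the same fraction but are mapped to the different numbers 8 and 25, in line with the remark above.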
3. Elliptic Curve Cryptography (ECC)
[0155] Let:
[0156] p be a prime number, p>3, and F.sub.p the corresponding
finite field
[0157] a and b integers
[0158] Then the set E of points (x, y) such that
$$E = \{(x, y) \in \mathbb{F}_p \times \mathbb{F}_p \mid y^2 = x^3 + ax + b\} \qquad \text{(F1)}$$
defines an elliptic curve in F.sub.p. (For reasons of simplicity,
we skip the details on E being non-singular and, as well, we do not
consider the formulae of elliptic curves over finite fields with
p=2 and p=3. The subsequent statements apply to these curves,
too.)
[0159] The number m of points on E is its order.
[0160] Let P,Q.epsilon.E be two points on E. Then the addition of
points
P+Q=R and R.epsilon.E (F2)
can be defined in such a way that E forms an Abelian group, viz. it
satisfies the rules of ordinary addition of integers. By
writing
P+P=[2]P
[0161] We define the k-times addition of P as [k]P, the point
multiplication.
[0162] Now EC-DLP, the elliptic curve discrete logarithm
problem, states that if
Q=[k]P (F3)
then, with suitably chosen a, b, p and P, which are known to the
public, and with the point Q likewise known to the public, it is
computationally infeasible to determine the integer k.
[0163] The order n of a point P is the order of the subgroup
generated by P, i.e. the number of elements in the set
{P,[2]P, . . . ,[n]P} (F4)
[0164] With all this in mind we define an elliptic curve
cryptographic (ECC) system as follows.
[0165] Let:
[0166] E be an elliptic curve of order m
[0167] B.epsilon.E a point of E of order n, the base point
Then
[0168] D={a,b,p,B,n,co(B)} (F5)
with
$$\mathrm{co}(B) = \frac{m}{n}$$
defines a set of domain ECC-parameters. Let now g be an integer
and
Q=[g]B (F6)
[0169] Then (g, Q) is an ECC-key-pair with g being the private key
and Q the public key.
[0170] Since we rely on the findings of Technical Guideline
TR-03111, Version 1.11, issued by the Bundesamt für Sicherheit in
der Informationstechnik (BSI), one of the best accredited sources
for cryptographically strong elliptic curves, we can assume that
m=n, i.e. co(B)=1, and hence reduce (F5) to
D={a,b,p,B,n} (F7)
[0171] Now we can define our one-way function. Let D be a set of
domain parameters concordant with (F7). Then
$$f : [2, n-1] \to E, \qquad k \mapsto [k]B \qquad \text{(F8)}$$
i.e. the point multiplication (F6), is an injective one-way
function.
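As a small worked example of the point multiplication, consider the toy curve y^2 = x^3 + 2x + 2 over F_17 with base point B = (5, 1). The curve and the base point are assumptions chosen for illustration and are not accredited domain parameters. The curve has m = 19 points; since 19 is prime, B has order n = 19 and co(B) = 1, so f is defined on [2, 18]. The doubling uses the standard tangent formula, which the text above does not spell out.

```latex
% Toy curve (assumption): E: y^2 = x^3 + 2x + 2 over F_17, B = (5, 1), a = 2
\begin{align*}
\lambda &= \frac{3x_B^2 + a}{2y_B} = \frac{77}{2} \equiv 9 \cdot 9 \equiv 13 \pmod{17} \\
x_{[2]B} &= \lambda^2 - 2x_B = 169 - 10 \equiv 6 \pmod{17} \\
y_{[2]B} &= \lambda\,(x_B - x_{[2]B}) - y_B = 13 \cdot (-1) - 1 \equiv 3 \pmod{17}
\end{align*}
% Hence f(2) = [2]B = (6, 3), and (g, Q) = (2, (6, 3)) is a key pair in the sense of (F6).
```

On a curve of this size k can be recovered from Q by trying all 19 multiples of B; the one-way property holds only for suitably large, accredited parameters.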
4. Implementing Key Generator Based on ECC
[0172] The key generator 128 (cf. FIGS. 1 and 3) can be implemented
using ECC.
DEFINITIONS
[0173] There are public sets of ECC-domain parameters D.sub.1,
D.sub.2, . . . concordant with (F7)
D.sub.i={a.sub.i,b.sub.i,p.sub.i,B.sub.i,n.sub.i} (F9)
[0174] There are public functions: an embedding function d( ), a
randomizing function r( ) and our one-way function f( ) defined by
(F8).
[0175] There is a public set of enrolled participants (users)
T={T.sub.1,T.sub.2, . . . } (F10)
[0176] Note that a T.sub.i does not necessarily possess any
personally identifying details, i.e. we assume that T resembles the
list of participants in an anonymous Internet-community, in which
each participant can select his name at his discretion as long as
it is unique.
[0177] Each participant T.epsilon.T chooses at his complete
discretion his personal secret s.sub.T. In particular, since this
secret is never revealed to anybody else (it is the participant's
responsibility to ensure this), it is not subject to any mandatory
conditions, such as uniqueness.
[0178] Our pseudonym derivation function is
h( )=f(r(d( ))) (F11)
[0179] with the following properties:
[0180] Given a T.epsilon.T with his s.sub.T, a D.sub.i and T,
D.sub.i.epsilon.V.sub.T,i;
r(d(s.sub.T,V.sub.T,i))=g.sub.T,i (F12)
[0181] where g.sub.T,i is a unique and strong, i.e. sufficiently
random, private ECC-key for D.sub.i.
[0182] The pseudonym p.sub.T,i corresponding to T, s.sub.T and
D.sub.i is
p.sub.T,i=f(g.sub.T,i,D.sub.i)=[g.sub.T,i]B.sub.i=(x.sub.T,i,y.sub.T,i) (F13)
[0183] There is a public set of pseudonyms
P={p.sub.1,p.sub.2, . . . } (F14)
such that P comprises one or more pseudonyms for each participant
in T computed according to (F11). This wording implies that there
is no recorded correspondence between a participant in T and his
pseudonyms in P, i.e. each p.sub.T,i is inserted in an anonymous
way as p.sub.k into P.
[0184] Remarks:
[0185] The use of multiple domain parameters enables us to endow a
single participant who has a single personal secret with multiple
pseudonyms. This in turn enables a participant to be a member of
multiple pseudonymous groups such that data of these groups cannot,
e.g. for personal or legal reasons, be correlated. Therefore,
attempts to exploit combined pseudonymous profiles for unintended,
possibly malicious purposes, are of no avail.
[0186] The distinction between two sets of domain parameters
D.sub.i and D.sub.j can be minor. In accordance with our principle
to use only accredited domain parameters, e.g. those listed in BSI
TR-03111, we can set
D.sub.i={a,b,p,B,n} (F15)
[0187] By swapping B for a statistically independent B.sub.2, i.e.
by choosing a different base point, we can set
D.sub.j={a,b,p,B.sub.2,n} (F16)
[0188] Since D.sub.i and D.sub.j refer to the same elliptic curve,
we can have only one function (F12) and introduce the crucial
distinction with (F13). This vastly simplifies concrete
implementations: we select a suitable curve and vary the base
points only.
[0189] While the invention has been illustrated and described in
detail in the drawings and foregoing description, such illustration
and description are to be considered illustrative or exemplary and
not restrictive; the invention is not limited to the disclosed
embodiments.
[0190] Other variations to the disclosed embodiments can be
understood and effected by those skilled in the art in practicing
the claimed invention, from a study of the drawings, the
disclosure, and the appended claims. In the claims, the word
"comprising" does not exclude other elements or steps, and the
indefinite article "a" or "an" does not exclude a plurality. A
single processor or other unit may fulfill the functions of several
items recited in the claims. The mere fact that certain measures
are recited in mutually different dependent claims does not
indicate that a combination of these measures cannot be used to
advantage. A computer program may be stored/distributed on a
suitable medium, such as an optical storage medium or a solid-state
medium supplied together with or as part of other hardware, but may
also be distributed in other forms, such as via the Internet or
other wired or wireless telecommunication systems. Any reference
signs in the claims should not be construed as limiting the
scope.
LIST OF REFERENCE NUMERALS
[0191] 100 Computer system
[0192] 102 User interface
[0193] 104 Keyboard
[0194] 106 Sensor
[0195] 108 Memory
[0196] 110 Processor
[0197] 112 User-selected secret
[0198] 114 Combination
[0199] 116 Private Key
[0200] 118 Public Key
[0201] 120 Data Object
[0202] 122 Computer program instructions
[0203] 126 Embedding and randomizing function
[0204] 128 Key Generator
[0205] 130 Database access function
[0206] 132 Embedding function
[0207] 134 Randomizing function
[0208] 136 Network interface
[0209] 138 Database
[0210] 140 Network
[0211] 144 Analytic system
[0212] 146 Data Analysis Component
[0213] 148 Random number generator
[0214] 150 Hash function
[0215] 152 Hash function
[0216] 154 Database
[0217] 156 Database
[0218] 158 Data Object
[0219] 190 Set of base points