Let Me Make Note of My Random Thoughts Before I Forget

Memo on the discussion at the ORMS TC F2F on Nov. 10, 2008

By Nat Sakimura

Mathematical Expression of Reputation calculation

As of Nov. 10, 2008, the ORMS TC version of the definition of “Reputation” is as follows.

Reputation

Reputation is a collective evaluation of the behavior of an entity based on factual and/or subjective data about it, and is used as one of the factors for establishing trust on that entity for a specific purpose.

This could be further abstracted to be something like:

Reputation

Reputation is a subjective evaluation of the assertion about a subject being true based on factual and/or subjective data about it, and is used as one of the factors for establishing trust on that subject for a specific purpose.

Note that in the above version, the word “entity” was avoided and replaced with “subject”, since there may be a reputation on an abstract idea or the like, which is not an “entity”. Also, “the behavior” has been replaced with “the assertion” because the latter may include the former and is broader.

To capture this notion of Reputation in a more formal form, the attendees at the F2F tried to come up with a mathematical representation of the above statement.

The first step was to define “Reputation Calculation.”

Reputation Calculation

A reputation calculation made by a Reputation Authority “i” on an assertion “A” about a subject “s” on criteria “c” being true is a subjective mapping Ri(A(s,c)) of a set of input data “x” within an input information set “X” to a set “y” within a reputation representation set “Y”, i.e.,

Ri(A(s,c)):X ->Y

where

s:= subject identifier

c:= criteria identifier

A(s,c):= An assertion about c on s being true.

X:={x | set of all input data, including y in Y}

Y:={y | set of all possible reputation calculation results}

Ri(A(s,c)):= Reputor i's mapping of input data (multiple) to Y on the assertion A(s,c).
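
As a rough illustration of the mapping Ri(A(s,c)): X -> Y, the following Python sketch treats a reputation calculation as a function from a collection of input data to a result. The names Assertion and make_calculation, the use of +1/-1 ratings as input data, and the share-of-positive-ratings rule are all assumptions made for this example only; the memo does not prescribe any particular calculation.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Assertion:
    """A(s,c): an assertion about criteria c on subject s being true."""
    subject: str   # s, e.g. "=nat"
    criteria: str  # c, e.g. "+goodSeller"


# R_i(A(s,c)): a mapping from input data (drawn from X) to a result in Y.
ReputationCalculation = Callable[[Iterable[int]], float]


def make_calculation(assertion: Assertion) -> ReputationCalculation:
    """Return a hypothetical reputor's mapping for the given assertion."""

    def calculate(inputs: Iterable[int]) -> float:
        # Hypothetical rule: share of positive ratings among all ratings.
        ratings = list(inputs)
        if not ratings:
            raise ValueError(f"no input data for {assertion}")
        return sum(1 for r in ratings if r > 0) / len(ratings)

    return calculate


r_i = make_calculation(Assertion(subject="=nat", criteria="+goodSeller"))
print(r_i([1, 1, -1, 1]))  # 3 of 4 ratings are positive, so this prints 0.75
```

A different Reputation Authority would supply a different function for the same assertion, which is what makes the mapping subjective.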

Reputation is then defined using the above notion.

Reputation

Reputation is the result of the reputation calculation, i.e., y in Y, that can be used as one of the factors for establishing trust on that subject for a specific purpose.

At a conceptual level, it can be any mapping of a set in the Domain to another set in the Range. In other words, it could be conceived as a projection of a set in a Domain hyperplane to the Range hyperplane. However, interpreting the end result for an arbitrary Range would be extremely difficult. Thus, the F2F participants propose the notion of “Reputation Score” as defined below, and define Y accordingly.

Reputation Score

A Reputation Score, z in Z, is a subjective probability assigned by the Reputation Engine for the assertion being true.

Thus, the Reputation Score will be in the range [0,1] or, in percentage terms, [0,100], i.e., Z:={z | real number between 0 and 100}. This has two useful properties: it is intuitive for the consumer, and for two subjects m and n, if m is more likely to fit the assertion than n, then zm > zn, where zm and zn are the reputation scores of m and n respectively.

Then, we can create a representation of Reputation, y, as the set of information around z as described in the next section.

Reputation Result Portable Format Requirement

The reputation score described above is of little use by itself as a portable reputation unless it is accompanied by statistical properties that let the reputation consumer make inferences from it. There are a number of desired properties that should be transmitted with the score.

From the discussion, the resulting XML seems to require at least the following:

1. The reputation result XML needs to have an identifier of the party being scored.
It may include PII (e.g., Social Security Number), so it may be wise to mandate that this be a hash(identifier, salt)? (See Protocol Considerations.)

2. The same for the party doing the scoring.

3. The criteria for which this reputation score was made.

4. Input Data Range

5. For the reputation to be aggregatable, the score has to follow a distribution whose aggregate also has a known distribution (such as the normal distribution).

6. The information about the distribution, including the type of distribution, the mean, and the standard deviation, must be published together with the score.

7. Display score should be intuitive for an average person.

8. The date the score was made.

9. A signature by the score maker so that tampering can be detected.

The above requirements with some others are captured in the table below.

It seems that, to be a meaningful Portable Reputation Data representation, i.e., the representation of y to be transmitted out of the local reputation scope, the file should contain at least the following.

Item / Type / Example / Remarks
ReputationID / XRI/URI / @myRS/=nat/+goodSeller/20081110 / Unique ID of this file.
SubjectID / XRI/URI / =nat / Identifier of the Subject being scored
ReputationServiceID / XRI/URI / @myRS / Identifier of the Reputation Authority/Engine
AssertionID / XRI/URI / @myRS/+goodSeller / Identifier of the assertion template which would form an assertion (criteria) when SubjectID is combined with it.
RequesterID / XRI/URI / =john / Identifier of the requesting party. This is included to make the source detectable in the case of the leakage.
DisplayScore / float / 95.1 / Value of the cumulative distribution function at Score, in percentage form
Score / float / 56.8 / Raw Score
ConfidenceInterval / float, float / 92.8, 96.8 / 95% confidence interval (5% significance level) of DisplayScore
Distribution / XRI/URI / +distribution/normal / Identifier representing the distribution.
Mean / float / 50 / Mean of the distribution
StandardDeviation / float / 4.1 / Standard deviation of the distribution
SampleSize / integer / 100 / Sample size
SubjectPublicKey / PEM / -----BEGIN CERTIFICATE----- MIIDaDCCAlCgAwIBAgIBHTANBgkqhkiG ... -----END CERTIFICATE----- / PEM format version of the Public Key of the Subject. The reputation consumer can use this later, when making a transaction with the Subject, to validate that the party he is talking to is really the one that this reputation file refers to.
ReputorPublicKey / PEM / -----BEGIN CERTIFICATE----- MIIDaDCCAlCgAwIBAgIBHTANBgkqhkiG ... -----END CERTIFICATE----- / PEM format version of the Public Key of the Reputation Authority.
Date / XMLDATE / 2008-11-11T14:34:00Z / XML Date of the calculation
InputDataInfo / XRI/URI / @myRS/+goodSeller/=nat/+inputData
StartDate / XMLDATE / 2008-01-01T00:00:00Z / Start Date of the input data.
EndDate / XMLDATE / 2008-11-11T14:30:00Z / End Date of the input data.
Signature / string / af8afsld92dfjdsla…blah…blah… / Signature by the Reputation Authority over the file
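
The example values in the table are consistent with DisplayScore being the normal cumulative distribution function evaluated at Score, given Mean and StandardDeviation. The following Python sketch reproduces that relationship. It assumes the +distribution/normal case only, and the function name display_score is made up for this illustration.

```python
import math


def display_score(score: float, mean: float, std_dev: float) -> float:
    """Normal CDF evaluated at the raw score, expressed as a percentage."""
    if std_dev <= 0:
        raise ValueError("std_dev must be positive")
    z = (score - mean) / std_dev  # standardized raw score
    return 100.0 * 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Score = 56.8, Mean = 50, StandardDeviation = 4.1, as in the table.
print(round(display_score(56.8, 50.0, 4.1), 1))  # 95.1
```

Because the distribution identifier, mean, and standard deviation travel with the file, a consumer can recompute the display score from the raw score, or standardize raw scores before combining them.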

Protocol Considerations

1. The reputation consumer should be able to obtain the reputation file by specifying the assertion including the subject identifier.

2. Since the reputation data itself is often sensitive data including PII, it should have security considerations such as:

- SubjectID should be represented so that it cannot be traced back to the Subject, e.g., sha256(SubjectID, salt). This implies that the protocol should be a request-response protocol, since otherwise the receiver cannot map the file to the Subject.

- To make the source detectable in the case of a leakage, the file should contain the requester ID.

- To make the request unforgeable, the request should contain the signature of the requesting party.

- To prevent eavesdropping and MITM attacks, the response should be encrypted using a content encryption key (session key), which in turn is encrypted with the requesting party’s public key.

- Considering that the mere fact that an entity is requesting a reputation representation of the subject may be a privacy risk, the request probably should be encrypted in the same manner as the response, with the reputation authority’s public key.
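
The first consideration above, representing SubjectID as sha256(SubjectID, salt), can be sketched in Python as follows. How the salt and the identifier are combined is not specified in the discussion, so prepending the salt bytes to the UTF-8 encoded identifier is an assumption here, as are the function names pseudonymize and matches_subject. The sketch also shows why a request-response exchange is needed: only a party that already knows both the SubjectID and the salt can map the file back to the Subject.

```python
import hashlib
import hmac
import secrets


def pseudonymize(subject_id: str, salt: bytes) -> str:
    """Return sha256(salt || SubjectID) as a hex string (assumed layout)."""
    return hashlib.sha256(salt + subject_id.encode("utf-8")).hexdigest()


def matches_subject(hashed_id: str, subject_id: str, salt: bytes) -> bool:
    """Requester-side check that a response refers to the requested Subject."""
    return hmac.compare_digest(hashed_id, pseudonymize(subject_id, salt))


salt = secrets.token_bytes(16)       # random salt for this request
hashed = pseudonymize("=nat", salt)  # value that would appear as SubjectID
print(matches_subject(hashed, "=nat", salt))   # True
print(matches_subject(hashed, "=john", salt))  # False
```

Signing the request and encrypting the response with a session key would need an asymmetric cryptography library and are left out of this sketch.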