[cap-talk] Wiki with belief functions (was: Wiki with Reputation Tracking)

Sandro Magi smagi at naasking.homeip.net
Mon Dec 12 08:54:18 EST 2005


David Wagner wrote:
> Toby Murray <toby.murray at dsto.defence.gov.au> writes:
> 
>>I immediately took issue with this, thinking "You don't need to identify 
>>individual contributors. They can still be anonymous -- in the sense 
>>that you can't match them to an identity in the 'real world'. You just 
>>need to attach a reputation to each contributor, with unidentified 
>>contributors ("guests") having a reputation of zero. You then need to 
>>mark the edits with the reputation of the editor. People who view the 
>>page can then mark the quality of individual edits.

I've been thinking about this idea for a while, although the system I 
came up with was more complicated than a "reputation" system.

As I see it, the problem with Wikipedia is that no article is 100% 
accurate, yet the system *implicitly* encodes a degree of belief of 100% 
in every article and every user. My idea was to make this degree of 
belief an explicit part of the system.

I initially started with Bayesian Confirmation Theory, and also 
researched other belief functions 
(http://ippserv.rug.ac.be/documentation/belief/belief.html). The system 
must encode the degree of belief in a particular article, the degree of 
belief that an edit might be accurate, and also the degree of belief 
that the user is benevolent/being truthful in his edits. The first two 
might seem redundant given the last, but they're not entirely.

My rough outline is that an edit must surpass a certain "belief 
threshold" before it is publicly visible or "believed" by the system. 
Each edit carries a certain weight which is a function of the user's 
"reputation" (for lack of a better term); a benevolent user's reputation 
may by itself surpass the threshold, so that the edit is immediately 
visible. Anonymous or new users have a low reputation, so their edits 
must be voted on by other users, who add the weight of their own 
reputation. The degree of belief in an article is a function of the 
degrees of belief in the users who wrote it.
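
A minimal sketch of that outline in Python, to make the moving parts 
concrete. The names (Edit, endorse, BELIEF_THRESHOLD), the threshold 
value, and the choice to simply add weights together are all assumptions 
for illustration; a real belief function would likely combine evidence 
less naively.

```python
from dataclasses import dataclass, field
from typing import Dict

# Hypothetical value; nothing above fixes a particular threshold.
BELIEF_THRESHOLD = 1.0


@dataclass
class Edit:
    author: str
    author_reputation: float
    # Reputation lent by other users, keyed by voter so nobody counts twice.
    endorsements: Dict[str, float] = field(default_factory=dict)

    def endorse(self, voter: str, voter_reputation: float) -> None:
        # Authors cannot lend belief to their own edits.
        if voter != self.author:
            self.endorsements[voter] = voter_reputation

    def belief(self) -> float:
        # Assumption: weights combine by plain addition.
        return self.author_reputation + sum(self.endorsements.values())

    def is_visible(self) -> bool:
        return self.belief() >= BELIEF_THRESHOLD
```

With these numbers, Edit("alice", 1.5) is visible immediately, while an 
edit by a zero-reputation guest stays hidden until enough other users 
have endorsed it to reach the threshold.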

I've considered making an edit immediately visible depending on the 
extent of the change, i.e. spelling corrections become visible 
immediately, but more extensive edits await approval of some kind. The 
degree of belief that an edit is accurate is thus a function of the 
user's reputation and the extent of the change.
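
One way this function could look, as a rough sketch. It assumes 
reputation is scaled to [0, 1], measures the extent of a change with 
difflib from the Python standard library, and uses a formula I picked 
only because it has the right shape: a trivial change is believed almost 
regardless of who made it, and a complete rewrite is believed exactly as 
much as its author. The function names are hypothetical.

```python
import difflib


def change_extent(old: str, new: str) -> float:
    """Dissimilarity in [0, 1]: 0.0 for identical text, 1.0 for no overlap."""
    return 1.0 - difflib.SequenceMatcher(None, old, new).ratio()


def edit_belief(reputation: float, extent: float) -> float:
    """Belief that an edit is accurate; both arguments assumed in [0, 1].

    Assumed formula: the doubt about the author (1 - reputation) is
    scaled by how much of the text the edit actually touched.
    """
    return 1.0 - extent * (1.0 - reputation)


# A one-word spelling fix by a zero-reputation guest.
extent = change_extent("Teh quick brown fox", "The quick brown fox")
belief = edit_belief(0.0, extent)
```

Here the extent is small, so the belief comes out close to 1 even though 
the guest has no reputation at all.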

Each successful edit increases the system's degree of belief in the 
user, i.e. their reputation, and each revert decreases it. Penalizing 
every revert seems overly harsh, given that users make edits with the 
best knowledge they have at the time (which may soon be obsolete), so 
I've considered the following ideas:

1. a time period after which a revert wouldn't count against one's 
reputation (still seems harsh, though less so)

2. separate malicious reverts ("he's intentionally spreading FUD") from 
benevolent reverts ("he couldn't have known better")
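
The two ideas could be combined roughly as follows. This is only a 
sketch: the constants, the 30-day grace period, and the decision that a 
malicious revert is penalized regardless of age are my assumptions here, 
and reputation is again assumed to lie in [0, 1].

```python
from datetime import datetime, timedelta

# All constants are hypothetical and chosen only for illustration.
GRACE_PERIOD = timedelta(days=30)
REWARD = 0.05
BENEVOLENT_PENALTY = 0.02
MALICIOUS_PENALTY = 0.5


def clamp(value: float) -> float:
    """Keep a reputation inside [0, 1]."""
    return max(0.0, min(1.0, value))


def after_accepted_edit(reputation: float) -> float:
    return clamp(reputation + REWARD)


def after_revert(reputation: float, edited_at: datetime,
                 reverted_at: datetime, malicious: bool) -> float:
    if malicious:
        # Idea 2: intentional misinformation is penalized heavily,
        # no matter how long it took to be noticed (an assumption).
        return clamp(reputation - MALICIOUS_PENALTY)
    if reverted_at - edited_at > GRACE_PERIOD:
        # Idea 1: the edit survived long enough that it probably
        # became obsolete rather than having been wrong.
        return reputation
    return clamp(reputation - BENEVOLENT_PENALTY)
```

Who decides whether a revert is malicious is left open; in this sketch 
it is simply a flag supplied by whoever performs the revert.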

I'm still undecided on whether the system's belief in a user should 
retroactively affect their edits. A user may start out as benevolent, 
but turn malicious, so that argues "no". Then again, your attack below 
would be less effective if it were retroactive, so that suggests "yes".
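
To make the trade-off concrete, here is a small sketch in which each 
edit records the author's reputation at the time it was made, so that 
both policies can be compared. Taking the article's belief to be the 
mean of its edits' weights is purely an assumption, as are the names 
EditRecord and article_belief.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class EditRecord:
    author: str
    reputation_at_edit: float  # snapshot taken when the edit was made


def article_belief(edits: List[EditRecord],
                   current_reputation: Dict[str, float],
                   retroactive: bool) -> float:
    """Mean weight of an article's edits (assumed combination rule)."""
    if not edits:
        return 0.0
    if retroactive:
        # Today's opinion of each author applies to all their past edits.
        weights = [current_reputation.get(e.author, 0.0) for e in edits]
    else:
        # Each edit keeps the weight it had when it was accepted.
        weights = [e.reputation_at_edit for e in edits]
    return sum(weights) / len(weights)


edits = [EditRecord("mallory", 0.9), EditRecord("alice", 0.8)]
now = {"mallory": 0.0, "alice": 0.8}  # mallory has since turned malicious
frozen = article_belief(edits, now, retroactive=False)
revised = article_belief(edits, now, retroactive=True)
```

Under the retroactive policy the article's belief drops once mallory's 
reputation collapses, including for edits made while mallory was still 
behaving well; under the non-retroactive policy it does not move.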

As you can see, it's a work in progress. :-)

Thanks for pointing out those reputation systems. I think they'll make 
for some interesting research.

> (Let's say you allow any user with zero reputation to modify the Wikipedia
> entry, and it is removed immediately if anyone complains, but otherwise
> it stays up until someone complains.  No problem, an attacker will just
> create a new guest identity every time their initial comment is removed.
> Let's say you allow a zero-reputation guest to add an entry that will
> stay up for 24 hours.  No problem, an attacker just creates a new guest
> identity every 24 hours.)

Hence, low-reputation edits shouldn't be immediately visible, but placed 
in a queue for other users to "lend their belief". Also, I think it's 
important to distinguish between malicious users and bored, anonymous 
vandals, and to consider how likely each is.

> This doesn't even get into all the difficulties where an attacker creates
> one permanent identity, arranges for it to accumulate some reputation,
> and then creates lots more throwaway identities and uses the permanent
> reputable identity to attest to the trustworthiness of the throwaways.

Accumulating that reputation would involve a great deal of work, and the 
eventual reverts would make this reputation worthless. I think most 
Wikipedians are concerned with anonymous vandals; there are 
significantly fewer malicious people.

Sandro

