The International Correspondence Chess Federation
Ratings Commissioner
Gerhard Binder
Sauerlandstrasse 8
D-70794 Filderstadt
Germany
Gerhard Binder: Report of the Ratings Commissioner to the Congress 2015 in Cardiff
Date: 12.07.2015
Report for the Congress at Cardiff 2015
Dear friends,
Since last year’s Congress, four rating lists have been published on time, including special ratings for Chess960. Soon I will work on list 2015/4, which will be published on 15 September and will be valid from 1 October. This is business as usual, but I want to point out some important topics.
Do we have a new case of rating manipulation?
I detected a new case of rating manipulation during the last year. I do not like to call it an attempt to cheat, because no rule is violated, but it is an attempt to abuse the current rules.
The new method:
- finishing 211 games between June 2014 and May 2015 over 4 periods
- lowering one's own rating by finishing bad positions during the first two periods (92 games, -190 points)
- finishing 119 games in the 3rd and 4th periods, mostly with draws and based on the artificially low own rating (+358 points)
- using the “gap” between two rating periods to avoid the “max-games-limit in rule 16” and to increase the advantage
Evidently the method worked already in 2013
Since the last improvement of the rating rules in 2011, the player’s last published rating is used to calculate the rating difference in a game. This was meant to prevent the problem caused by Tritt, who started many games with a low starting rating and finished them some periods later. So far this rule change worked well.
An Australian player recently found a new hole in the calculation procedure. In March 2015 he finished 50 games (42 draws, 7 wins). For these games his own rating is not the high value of 2015/2, which is only valid for games finished on 1 April or later. Instead, following the rules, the low value of 2015/1 has to be used for the evaluation. The effect became obvious to other players and to myself with his forecast for 2015/3. It showed 2612.
This value is completely unrealistic and inappropriate. I analysed his performance in recent periods:
2015/1: score 20/49 against 2314 gives a performance of 2249
2015/2: score 36/57 against 2386 gives a performance of 2479
2015/3: score 31.5/54 against 2386 gives a performance of 2445
overall: score 364.5/637 against 2268 gives a performance of 2320
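The performance figures above can be approximated with a short calculation. This is only a sketch under an assumption: it uses Elo's normal model with a class interval of 200 points, not the exact formula of item 5, which is not quoted in this report. The names `SIGMA` and `performance` are chosen for illustration. Under that assumption the results land within a few points of the reported values.

```python
from math import sqrt
from statistics import NormalDist

# Assumption: Elo's normal model with a class interval of 200 points,
# so the rating difference between two players has sigma = 200 * sqrt(2).
SIGMA = 200 * sqrt(2)


def performance(points: float, games: int, avg_opponent: float) -> float:
    """Rating at which the expected score equals the achieved score."""
    fraction = points / games
    if not 0 < fraction < 1:
        raise ValueError("performance is undefined for 0% or 100% scores")
    # Invert the normal probability function to get the rating difference.
    return avg_opponent + SIGMA * NormalDist().inv_cdf(fraction)


# Scores and average opponent ratings as listed in the report.
periods = {
    "2015/1": (20, 49, 2314),
    "2015/2": (36, 57, 2386),
    "2015/3": (31.5, 54, 2386),
    "overall": (364.5, 637, 2268),
}

for label, (points, games, avg) in periods.items():
    print(label, round(performance(points, games, avg)))
```

Small deviations from the reported numbers are to be expected, because the rules work with a printed table rather than with the continuous function used here.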
Something had to be done to avoid the value of 2612 in rating list 2015/3. On the server I tested using 2485 instead of 2254 as his rating for the 50 games finished in March. This led to a new rating of 2460. After the change became visible in the forecast, I immediately received complaints from the player and the Australian delegate. Of course the change would have been against the rules, and I cancelled it to avoid an appeal.
A second option would have been to overwrite the calculated value with the performance of the recent period (2445). Such corrections were made in the past by Nol (Widmann) and by myself (Tritt), so it would be a kind of “common law”. Currently this approach is covered by rule 16, but only if the number of games in a period is > 800/k. This applied in none of the four periods.
I started a long and friendly discussion with the player, and he was willing to forgo the GM level. We found a compromise: his current rating was lowered to 2552 by excluding the damaging effect of the development factor k.
I will carefully observe his future behaviour. Today it seems that he plays a “normal” number of games, and with his now high rating it will be difficult for him to increase it further by playing draws.
Independent of the short-term action, it is necessary to change the rules.
Proposals for changing the rules
Rule 16 was written at a time when we had one-year periods. Now, with quarterly periods, we need a lower value to avoid such differences between a calculated result and the performance in the evaluated period.
The maximum of 800/k was proposed by the mathematician Nol with reference to Prof. Elo. I think it is not reasonable to make it dependent on the development factor k. In some past cases only 40 games led to this exception, in other cases it took 80 games.
I looked into the calculations of the six most recent lists:
- 3 to 8 players were concerned by 800/k
- 2 to 4 players had more than 60 games
- 3 to 10 players had more than 50 games
- 8 to 22 players had more than 40 games
In my opinion a value of 50 would be reasonable.
16 For each player whose rating was based on at least 30 games at the beginning of the period the new rating is calculated using the formula of item 6, except for those players who finished more than 50 games in the current period. For those players a new rating is calculated, based on the formula 5 only for their games in that period. If the result of item 6 is obviously inappropriate the Rating Commissioner may replace it with the value of item 5. Such an exception has to be justified to the concerned player and his national delegate.
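The effect of the threshold in rule 16 can be sketched as a simple selection between the two calculated values. This is an illustration only: the function name `rule_16_rating` and its parameters are hypothetical, the formulas of items 5 and 6 are not implemented (their results are passed in), and the Commissioner's discretion for obviously inappropriate results is left out. The example takes the forecast of 2612 as the item 6 result and the period performance of 2445 as the item 5 result; k = 10 and the 500 games at the start of the period are assumptions.

```python
def rule_16_rating(item6: float, item5: float, games_in_period: int,
                   games_at_start: int, threshold: float) -> float:
    """Choose between the item 6 and item 5 results for an established player."""
    if games_at_start < 30:
        raise ValueError("rule 16 covers only ratings based on at least 30 games")
    # More games than the threshold: use the performance of the period (item 5).
    return item5 if games_in_period > threshold else item6


k = 10                 # assumed development factor, not stated in the report
games_at_start = 500   # hypothetical, only needs to be at least 30

# Period 2015/3 from the report: 54 games, forecast 2612, performance 2445.
current = rule_16_rating(2612, 2445, 54, games_at_start, threshold=800 / k)
proposed = rule_16_rating(2612, 2445, 54, games_at_start, threshold=50)

print(current)   # 2612, because 54 games do not exceed 800/k = 80
print(proposed)  # 2445, because 54 games exceed the fixed limit of 50
```

With k = 20 the current limit would be 40 games and with k = 10 it would be 80 games, which matches the two cases mentioned above.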
Another small problem which has arisen in recent years is the rounding when the calculation results in a new rating of xxxx.5. Players expect such a value to be rounded up to xxxx+1, as it was when the ratings were calculated with my local software. The server software often rounds down to xxxx. I asked our programmer Martin Bennedik to check and correct this. He insisted that the rounding must be defined in the working rules before he can make a change. This affects rules 8 and 17. Both extensions (in red) are taken from the FIDE handbook.
8 The expected game result We is the percentage expectancy, obtained from item 4, based on the difference between the player’s rating and the opponent’s rating as defined in Tournament Rule 9.4. If this difference is > 350, it is snipped to this value for the evaluation. Values from item 4 are used precisely as shown, no extrapolations are made to establish a third significant figure.
17 The new rating for the next ICCF rating list is rounded to the nearest whole number. The fraction 0.5 is rounded upward.
The change of rule 8 will also affect our norm tables. For their creation and current values an extrapolation with a third significant figure was used, as I found in an analysis following the remarks of Mariusz at last year’s Congress.
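The rounding proposed for rule 17 can be sketched in a few lines. The helper name `round_half_up` is chosen for illustration. The comparison with Python's built-in `round` shows one well-known way in which a value of xxxx.5 can end up rounded down; it is not a statement about how the ICCF server is implemented.

```python
from math import floor


def round_half_up(value: float) -> int:
    """Round to the nearest whole number; the fraction 0.5 is rounded upward."""
    # Valid for positive values such as ratings.
    return floor(value + 0.5)


# Python's built-in round() sends exact halves to the nearest even number.
print(round_half_up(2444.5), round(2444.5))  # 2445 2444
print(round_half_up(2445.5), round(2445.5))  # 2446 2446
```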
General review
Since 1987 we have adjusted our rating rules several times to take account of problems specific to CC, to prevent abuse, to improve the presentation and so on. But the basic principles created by Elo and taken from FIDE were never touched: The rating scale is an arbitrary one with a class interval set at 200 points. The table of item 4 shows the conversion of difference in rating into scoring probability and is based on the normal probability function of statistical probability theory.
The probabilities in item 4 represent the situation of OTB chess in the 1960s and may still represent it for OTB nowadays. But CC has changed dramatically since then due to the use of computers and the tremendously growing number of draws. A look at the results of high-level tournaments today shows that it is ridiculous to expect a score of 76% between two players who are one class apart, or 64% for a rating difference of 100.
This situation calls for a general review of the statistical base of our system, including the development factor and the norm tables. During the next months I intend to use the one million results collected on the server and my local database to examine the real distribution depending on rating difference, level and time. But that is not an easy task, and maybe we will need professional statistical advice as proposed by the Service Committee.
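The two percentages quoted above follow directly from the normal model. The sketch below is an illustration under stated assumptions: it evaluates the continuous normal probability function with a class interval of 200 points instead of reading the printed table of item 4, and it applies the 350-point limit of rule 8 symmetrically. The names `SIGMA` and `expected_score` are chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist

# Assumption: class interval of 200 points per player, so the rating
# difference between two players has sigma = 200 * sqrt(2).
SIGMA = 200 * sqrt(2)


def expected_score(diff: float, cap: float = 350) -> float:
    """Expected score of the player who is rated `diff` points higher."""
    # Rule 8 limits the difference to 350; symmetric use is assumed here.
    diff = max(-cap, min(cap, diff))
    return NormalDist(0, SIGMA).cdf(diff)


for diff in (100, 200):
    print(diff, round(expected_score(diff), 2))
# 100 0.64
# 200 0.76
```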
Other ongoing tasks
- Forecast including running games with assumed results
- Improvement of the withdrawal wizard and the Robo-TD
- Cleaning the players’ database
- Importing more old tournaments from Eloquery (zonal, national, friendly matches)
Looking forward to seeing you in Cardiff!
Amici sumus
Gerhard Binder
ICCF Ratings Commissioner