Exploring the python-chess module

Chess, Analysis, Chess engine
Replicating Lichess' accuracy metrics as a pretext to explore the python-chess module.

Introduction

In my previous blog, Fitting an Elo model to Titled Tuesday games, I used Python to fit an Elo model to Titled Tuesday blitz games.

To read and parse my PGN files I used python-chess.

In this blog I will try to replicate Lichess' accuracy metrics as a pretext to explore the python-chess module.

Ryan Wingate inspired me to write this blog. Maybe what I have learned will also be useful to someone. Another interesting blog that explores chess engine analysis using python-chess can be found here.

Before continuing, a word of caution: although, as a statistical researcher, I program in Python every day, I am not a developer. This means that while I typically, but not always, manage to get things done, my programming is not elegant, sophisticated or efficient.

My imports for this project are:

import pandas as pd
import numpy as np
import scipy.stats as sps
import matplotlib.pyplot as plt
import chess
import chess.pgn
import chess.engine

Lichess accuracy metrics

White expected score

For a given position, Lichess converts Stockfish centipawns (a centipawn is 1/100 of a pawn) into what it calls winning chances using the following equation:

Win% = 50 + 50 * (2 / (1 + exp(-0.00368208 * centipawns)) - 1)

Which can be simplified to:

Win% = 100 / (1 + exp(-0.00368208 * centipawns))
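The simplification can be checked numerically. The following is a minimal sketch using only the standard library; the function names are mine and are not taken from Lichess' code.

```python
import math

K = 0.00368208  # constant from the equations above


def win_percent_original(centipawns):
    # First form of the equation
    return 50 + 50 * (2 / (1 + math.exp(-K * centipawns)) - 1)


def win_percent_simplified(centipawns):
    # Simplified form of the equation
    return 100 / (1 + math.exp(-K * centipawns))


# Both forms should agree for any evaluation, positive or negative
for cp in (-500, -100, 0, 27, 100, 500):
    original = win_percent_original(cp)
    simplified = win_percent_simplified(cp)
    assert math.isclose(original, simplified)
    print(cp, round(simplified, 2))
```

An evaluation of 0 centipawns gives exactly 50, as expected for an equal position.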

More precisely, this equation gives White's expected score out of 100 games.

My implementation of White's expected score is:

def lichess_white_expected_score(cp,p=-0.00368208):
    return 1.0/(1.0+np.exp(p*cp))

Note that I use 1 instead of 100, so my equation returns a number between 0 and 1, rather than between 0 and 100.

To evaluate, for instance, the move 1.d4, a board with the move 1.d4 already played can be set up using its FEN (Forsyth–Edwards Notation) as follows:

board = chess.Board('rnbqkbnr/pppppppp/8/8/3P4/8/PPP1PPPP/RNBQKBNR b KQkq - 0 1')

To show this board:

display(board)

Alternatively, a board in the starting position can be created and "a move can be made":

board = chess.Board()
move = chess.Move.from_uci("d2d4")
board.push(move)
display(board)

The difference is that, this way, the move is shown on the board.

The UCI (Universal Chess Interface), SAN (Standard Algebraic Notation) and FEN notations for the current move/board can be obtained using:

move.uci() # gets the current move in UCI notation 
board.san(move) # gets the current move in SAN notation
board.fen() # gets current board's FEN

I can now evaluate this position, at a required depth, using my build of the Stockfish 16 engine:

engine = chess.engine.SimpleEngine.popen_uci('stockfish-ubuntu-x86-64-avx2')
info = engine.analyse(board, limit=chess.engine.Limit(depth=36))

Time taken to evaluate 1.d4 at depth of 36: 0 days 00:00:42.480189

Alternatively a time limit for the analysis can be set:

info = engine.analyse(board, limit=chess.engine.Limit(time=2.0)) # runs engine's analysis for 2 seconds

It is also possible to get the best moves:

info = engine.analyse(board, chess.engine.Limit(time=2.0), multipv=3) # gets the 3 best moves

In this case info will be a list with 3 elements: info[0], info[1] and info[2].

info will contain useful information, for example:

info['depth'] # gets the depth of the analysis
info['score'] # gets the score as a PovScore (relative to the side to move)
info['score'].white().score() # gets the score in centipawns from White's perspective
info['score'].white().is_mate() # True if the score is a mate score
info['score'].white().mate() # number of moves until mate (zero if mate is on the board, None if the score is not a mate)
info['score'].wdl().white() # Win/Draw/Loss distribution out of 1000 games, from White's perspective

You can get Black's perspective by replacing .white() with .black().

A couple of examples:

  • 1.d4 centipawns, from White's perspective, at depth=36: 27
  • 1.d4 Win/Draw/Loss distribution, from White's perspective, at depth=36: Wdl(wins=21, draws=978, losses=1)

Consequently, White's 1.d4 expected scores are:

  • Lichess model (given my Stockfish 16 evaluation at depth 36): 1.0/(1.0+exp(-0.00368208 * 27)) = 0.5248
  • Lichess master database: 0.33+0.44/2 = 0.55 (see next figure)
  • Stockfish 16 model at depth 36: (21+978/2)/1000 = 0.51
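The three expected scores above can be reproduced with a few lines of plain Python. This is a minimal sketch: the helper names `expected_score_from_cp` and `expected_score_from_wdl` are hypothetical, and the inputs are the numbers quoted above (27 centipawns, the Wdl triple, and the 33% win and 44% draw shares from the master database).

```python
import math


def expected_score_from_cp(cp, k=0.00368208):
    # Lichess' logistic model, scaled to the interval [0, 1]
    return 1.0 / (1.0 + math.exp(-k * cp))


def expected_score_from_wdl(wins, draws, losses):
    # A win counts as 1 point, a draw as half a point
    total = wins + draws + losses
    return (wins + draws / 2) / total


lichess_model = expected_score_from_cp(27)
stockfish_wdl = expected_score_from_wdl(21, 978, 1)
master_db = 0.33 + 0.44 / 2  # win share + half of the draw share

print(round(lichess_model, 4), round(stockfish_wdl, 2), round(master_db, 2))
```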

The previous figure also shows that Lichess' Stockfish 16 evaluates White's advantage after 1.d4 as 0.1 pawns, or 10 centipawns, at a depth of 36.

Therefore:

  • Chess engine evaluations and reality don't necessarily agree. People don't play like engines.
  • My version of Stockfish 16 does not automatically give the same evaluations as Lichess' Stockfish 16.
  • Running my engine twice does not necessarily yield the same result.

Engines don't necessarily agree with each other (or with themselves). It is important to have this in mind.

The following figure reproduces Lichess' White's expected score as a function of centipawns.

Move accuracy

Lichess calculates move accuracy as follows:

Accuracy% = 103.1668 * exp(-0.04354 * (winPercentBefore - winPercentAfter)) - 3.1669

My implementation is:

def lichess_move_accuracy(score_100_games_diff, par=(103.1668, -0.04354, -3.1669)):
    # score_100_games_diff: drop in expected score (0-100 scale) caused by the move
    accuracy = par[0] * np.exp(par[1] * score_100_games_diff) + par[2]
    return accuracy

As an example, let's imagine that after 1.d4 Black plays 1...a6 (which is the least popular response to 1.d4 in Lichess's master database).

The board is:

The evaluations are:

Move    Centipawns  White score
1.d4    27          0.524834
1...a6  57          0.552278

Accuracy of the move 1...a6 = 103.1668 * exp(-0.04354 * (55.23 - 52.48)) - 3.1669 = 88.38%
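The 1...a6 example can be reproduced end to end from the two centipawn values. The sketch below uses only the standard library, so it uses `math.exp` where my own functions use numpy; the names `white_score_100` and `move_accuracy` are hypothetical.

```python
import math


def white_score_100(cp, k=0.00368208):
    # White's expected score out of 100 games
    return 100.0 / (1.0 + math.exp(-k * cp))


def move_accuracy(score_diff):
    # score_diff: expected score lost by the player who moved (0-100 scale)
    return 103.1668 * math.exp(-0.04354 * score_diff) - 3.1669


before = white_score_100(27)  # after 1.d4
after = white_score_100(57)   # after 1...a6

# Black moved, so Black's loss is the increase in White's score
loss_for_black = after - before
accuracy = move_accuracy(loss_for_black)

print(round(before, 2), round(after, 2), round(accuracy, 2))
```

Using the unrounded scores rather than 55.23 and 52.48 changes the result only in the third decimal place.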

The following graph reproduces Lichess' accuracy as a function of the score difference over 100 games.

Reading and parsing a PGN file

I use python-chess to read and parse PGN files in 3 different ways:

  • reading multiple games from one file (by the way, reading large PGN files using this module is extremely slow):
pgn_headers = [
    'Event',
    'Site',
    'Date',
    'Round',
    'White',
    'Black',
    'Result',
    'ResultDecimal',
    'WhiteTitle',
    'BlackTitle',
    'WhiteElo',
    'BlackElo',
    'ECO',
    'Opening',
    'Variation',
    'WhiteFideId',
    'BlackFideId',
    'EventDate',
    'Annotator',
    'PlyCount',
    'TimeControl',
    'Time',
    'Termination',
    'Mode',
    'FEN',
    'SetUp',
    'Moves',
]

games = pd.DataFrame(columns=pgn_headers)
game_n = 0

with open(pgn_file_name) as f:

    while True:
        game = chess.pgn.read_game(f)
        # If there are no more games, exit the loop
        if game is None:
            break

        games.loc[game_n] = pd.Series(game.headers)
        game_n = game_n + 1
  • reading a PGN file with a single game:
with open(pgn_file_name) as f:
    game = chess.pgn.read_game(f)
  • creating a game from a string of moves:
import io

moves = ('1. e4 e5 2. Nf3 Nh6 3. Nxe5 d6 4. Nf3 Qf6 5. d4 Ng4 6. Bg5 Qg6 7. Bd3 Be7 8. Bxe7 Kxe7 '
         '9. e5 f5 10. exf6+ Qxf6 11. O-O Bf5 12. Re1+ Kf8 13. Bxf5 Qxf5 14. Qe2 Nc6 15. h3 Re8 16. Qxe8# ')
game = chess.pgn.read_game(io.StringIO(moves))

These days I play on Lichess mostly anonymously, but I still like to analyse my games, so I use this last method for my anonymous games.

To copy the moves on Lichess, I right click the game's last move on the analysis board, choose "copy variation PGN" and then paste the result into my Python code, as you can see above.
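On the slowness of reading large PGN files mentioned above: when only the headers are needed, the movetext does not have to be parsed at all. The following is a rough sketch that bypasses python-chess entirely and scans the tag pairs with a regular expression. The function `scan_headers` and the sample data are hypothetical, and the regular expression does not handle every corner of the PGN standard (for example, escaped quotes are kept as they are). python-chess also offers chess.pgn.read_headers for the same purpose, which I have not benchmarked here.

```python
import io
import re

# A PGN tag pair looks like: [White "Some Player"]
TAG_PAIR = re.compile(r'^\[(\w+)\s+"(.*)"\]\s*$')


def scan_headers(handle):
    """Yield one dict of tag pairs per game, ignoring the movetext."""
    headers = {}
    in_tags = False
    for line in handle:
        match = TAG_PAIR.match(line)
        if match:
            # A tag line after movetext marks the start of a new game
            if not in_tags and headers:
                yield headers
                headers = {}
            in_tags = True
            headers[match.group(1)] = match.group(2)
        else:
            in_tags = False
    if headers:
        yield headers


# Made-up two-game sample, used only to exercise the function
sample = io.StringIO(
    '[Event "Game 1"]\n[White "A"]\n[Black "B"]\n[Result "1-0"]\n\n'
    '1. e4 e5 2. Nf3 Nc6 1-0\n\n'
    '[Event "Game 2"]\n[White "C"]\n[Black "D"]\n[Result "1/2-1/2"]\n\n'
    '1. d4 d5 1/2-1/2\n'
)
games = list(scan_headers(sample))
print(games)
```

The resulting list of dicts can be passed directly to pd.DataFrame to obtain one row per game.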

Lichess' average accuracy and advantage chart

For this section I will use one of my games as an example:

https://lichess.org/7O3mPKYY

Looking at the game from White's perspective, logic demands that when White moves, either White keeps its score (100% accuracy) or White's score gets worse (less than 100% accuracy). When Black moves, either White's score is unchanged (100% accuracy by Black) or White's score improves (less than 100% accuracy for Black).

Of course, engines are not perfect. The engine's score for White can therefore improve after White's move or decrease after Black's move. In both cases I set White's score differential to zero, and I call the result "White's adjusted expected score differential".
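The adjustment described above amounts to clipping the differential on one side, depending on who moved. A minimal sketch follows, with a hypothetical function name and scores on the 0 to 100 scale; the sample values are taken from the table of my game further down.

```python
def adjusted_diff(turn, white_score_before, white_score_after):
    """White's adjusted expected score differential for one move.

    turn is 'White' or 'Black', meaning the side that just moved.
    """
    diff = white_score_after - white_score_before
    if turn == 'White':
        # White's own move cannot improve White's score: clip gains to zero
        return min(diff, 0.0)
    # Black's move cannot lower White's score: clip drops to zero
    return max(diff, 0.0)


print(adjusted_diff('White', 53.03, 51.84))  # White loses ground: kept
print(adjusted_diff('White', 65.41, 66.73))  # apparent gain for White: zero
print(adjusted_diff('Black', 51.84, 65.41))  # Black loses ground: kept
print(adjusted_diff('Black', 66.73, 66.40))  # apparent gain for Black: zero
```

The absolute value of the adjusted differential is what then goes into the move accuracy formula.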

Finally, what is the centipawn equivalent of, for instance, a mate in 4? This question matters because calculating accuracy with Lichess' function requires the score, and the score requires the centipawns.

Theoretically the score of mate in 4 is 1, meaning that whichever color has a mate in 4 will win the game. In practice chess players frequently miss these opportunities (we all have been there, right?), consequently the score of a mate in 4 is not 1.

Python-chess converts a mate in m moves to centipawns by subtracting m from an arbitrarily large number:

chess.engine.Mate(m).score(mate_score=1800) # converts a mate to centipawns: mate_score - m

As you can see above, I chose 1800, equivalent to 2 queens, as this arbitrarily large number.

Given my choice, for instance mate in 11 is converted to 1800-11 = 1789 centipawns.
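To see what this choice implies for the expected score, the rule can be written out in plain Python. This is a sketch of the rule as described above and not the library's code; the function names are hypothetical, and treating a negative m as "White is being mated" is my own sign convention for the example.

```python
import math


def mate_to_centipawns(m, mate_score=1800):
    # m > 0: White mates in m moves; m < 0: White is mated in -m moves
    if m > 0:
        return mate_score - m
    return -mate_score - m


def white_expected_score(cp, k=0.00368208):
    return 1.0 / (1.0 + math.exp(-k * cp))


for m in (1, 4, 11, -4):
    cp = mate_to_centipawns(m)
    print(m, cp, round(white_expected_score(cp), 4))
```

With mate_score=1800 a forced mate maps to an expected score very close to, but never exactly, 1 (or 0 when being mated).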

The following table shows:

  • the moves in UCI and SAN notations,
  • White's advantage in centipawns,
  • White's expected score,
  • White's expected score differential,
  • White's adjusted expected score differential and
  • the depth at which the engine calculated White's advantage in centipawns in 2 seconds of analysis.
Turn   SAN    UCI   CP    Score  Diff   Adj_diff  Acc    Depth
White  e4     e2e4  30    52.76  0.46   0         100    25
Black  e5     e7e5  33    53.03  0.28   0.28      98.77  26
White  Nf3    g1f3  20    51.84  -1.19  -1.19     94.77  27
Black  Nh6    g8h6  173   65.41  13.57  13.57     53.98  24
White  Nxe5   f3e5  189   66.73  1.32   0         100    25
Black  d6     d7d6  185   66.4   -0.33  0         100    25
White  Nf3    e5f3  195   67.22  0.82   0         100    25
Black  Qf6    d8f6  291   74.49  7.27   7.27      72     25
White  d4     d2d4  307   75.59  1.1    0         100    24
Black  Ng4    h6g4  385   80.5   4.9    4.9       80.16  25
White  Bg5    c1g5  296   74.84  -5.66  -5.66     77.47  23
Black  Qg6    f6g6  313   76     1.16   1.16      94.92  24
White  Bd3    f1d3  242   70.91  -5.09  -5.09     79.51  24
Black  Be7    f8e7  397   81.18  10.27  10.27     62.8   24
White  Bxe7   g5e7  408   81.79  0.61   0         100    24
Black  Kxe7   e8e7  410   81.9   0.11   0.11      99.51  25
White  e5     e4e5  351   78.46  -3.45  -3.45     85.63  24
Black  f5     f7f5  406   81.68  3.23   3.23      86.48  25
White  exf6+  e5f6  309   75.73  -5.95  -5.95     76.44  24
Black  Qxf6   g6f6  319   76.4   0.67   0.67      97.03  25
White  O-O    e1g1  320   76.46  0.07   0         100    26
Black  Bf5    c8f5  405   81.63  5.16   5.16      79.23  25
White  Re1+   f1e1  386   80.55  -1.07  -1.07     95.29  24
Black  Kf8    e7f8  516   86.99  6.43   6.43      74.79  25
White  Bxf5   d3f5  379   80.15  -6.84  -6.84     73.42  24
Black  Qxf5   f6f5  399   81.29  1.15   1.15      94.98  26
White  Qe2    d1e2  381   80.26  -1.03  -1.03     95.48  23
Black  Nc6    b8c6  371   79.67  -0.59  0         100    25
White  h3     h2h3  364   79.25  -0.42  -0.42     98.13  24
Black  Re8    a8e8  1799  99.87  20.61  20.61     38.88  245
White  Qxe8#  e2e8  1800  100    0.13   0         100    0

Average accuracy

To calculate White's and Black's average accuracy, as far as I understand, Lichess uses a harmonic mean, which can be easily calculated using the function hmean in the module scipy.stats.
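As an illustration, the harmonic mean of the Acc column of the table above can be computed per player. This sketch uses statistics.harmonic_mean from the standard library in place of scipy.stats.hmean, and the accuracies are typed in from the rounded table values, so the last decimals may differ slightly from a calculation on unrounded numbers.

```python
from statistics import harmonic_mean

# Acc column of the table above, split by the side that moved
white_acc = [100, 94.77, 100, 100, 100, 77.47, 79.51, 100,
             85.63, 76.44, 100, 95.29, 73.42, 95.48, 98.13, 100]
black_acc = [98.77, 53.98, 100, 72, 80.16, 94.92, 62.8, 99.51,
             86.48, 97.03, 79.23, 74.79, 94.98, 100, 38.88]

white_avg = harmonic_mean(white_acc)
black_avg = harmonic_mean(black_acc)

print(round(white_avg, 2), round(black_avg, 2))
```

The harmonic mean is pulled down by low values more strongly than the arithmetic mean, so a single blunder such as 15...Re8 weighs heavily on Black's average.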

The following table shows the average accuracy for White and Black as calculated by me and by Lichess.

         White  Black
Me       91     76
Lichess  91     77

The next table compares my and Lichess' average accuracy for a few random games.

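Game                 Source   White  Black
Carlsen vs Esipenko  Me       97     92
Carlsen vs Esipenko  Lichess  96     90
Firouzja vs Nepo     Me       88     95
Firouzja vs Nepo     Lichess  87     95
Game ve1epqzT        Me       81     86
Game ve1epqzT        Lichess  83     87
Game lO13IlTf        Me       80     60
Game lO13IlTf        Lichess  78     63
Game sLv7lVYF        Me       81     90
Game sLv7lVYF        Lichess  77     87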

As can be seen, the accuracy numbers are not identical but they are close: the largest gap in these games is 4 points. That is certainly close enough for my purpose.

Advantage chart

To replicate Lichess' advantage plot, White's expected score needs to be converted from the [0, 1] interval to [-1, 1] by calculating 2*lichess_white_expected_score-1.
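The rescaling can be sketched as follows, using only the standard library. The function names are hypothetical, and the expected score function is re-implemented here with math.exp so that the example is self-contained.

```python
import math


def white_expected_score(cp, k=0.00368208):
    return 1.0 / (1.0 + math.exp(-k * cp))


def advantage(cp):
    # Maps White's expected score from [0, 1] to [-1, 1]
    return 2.0 * white_expected_score(cp) - 1.0


# Centipawn values from a few moves of the game above
for cp in (30, 173, 516, 1799):
    print(cp, round(advantage(cp), 3))
```

The result is 0 for an equal position and symmetric around it, so an advantage of +100 centipawns for White and one of -100 centipawns plot at the same distance from the centre line.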

The next figure reproduces Lichess' advantage chart for my game and also shows the actual Lichess graph for comparison.

Final thoughts

Just for fun, it is possible to make two engines play each other:

engine1 = chess.engine.SimpleEngine.popen_uci('your engine 1')
engine2 = chess.engine.SimpleEngine.popen_uci('your engine 2')

board = chess.Board()

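while not board.is_game_over():
    # engine1 plays White and engine2 plays Black; checking the game state
    # before every single move avoids asking for a move after checkmate
    engine = engine1 if board.turn == chess.WHITE else engine2
    result = engine.play(board, chess.engine.Limit(time=1.0))
    board.push(result.move)
    display(board)

# shut both engine processes down once the game is over
engine1.quit()
engine2.quit()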

The moves played can be found in:

board.move_stack

I hope that this blog will help you start using python-chess, if you are interested in this sort of thing, of course.