Exploring the python-chess module

JoaoTx, 21 Jan 2024, 745 views, English (US)

Replicating Lichess' accuracy metrics as a pretext to explore the python-chess module.

Introduction

In my previous blog, Fitting an Elo model to Titled Tuesday games, I used Python to fit an Elo model to Titled Tuesday blitz games.

To read and parse my PGN files I used python-chess.

In this blog I will try to replicate Lichess' accuracy metrics as a pretext to explore the python-chess module.

I decided to write this blog inspired by Ryan Wingate. Maybe what I have learned will also be useful to someone. Another interesting blog that explores chess engine analysis using python-chess can be found here.

Before continuing, a word of caution: although as a statistical researcher I program in Python every day, I am not a developer. This means that while I typically, but not always, manage to get things done, my programming is not elegant, sophisticated or efficient.

My imports for this project are:

import pandas as pd
import numpy as np
import scipy.stats as sps
import matplotlib.pyplot as plt
import chess
import chess.pgn
import chess.engine

Lichess accuracy metrics

White expected score

For a given position, Lichess converts Stockfish centipawns (for those who don't know, a centipawn is 1/100 of a pawn) into what it calls winning chances using the following equation:

Win% = 50 + 50 * (2 / (1 + exp(-0.00368208 * centipawns)) - 1)

Which can be simplified to:

Win% = 100 / (1 + exp(-0.00368208 * centipawns))

More precisely this equation measures White's expected score after 100 games are played.

My White's expected score implementation is:

def lichess_white_expected_score(cp,p=-0.00368208):
    return 1.0/(1.0+np.exp(p*cp))

Note that I use 1 instead of 100, so my equation returns a number between 0 and 1, rather than between 0 and 100.
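As a quick sanity check, the two forms of the Win% equation above can be compared numerically. This is only a minimal sketch using the standard library; the function names are mine and are not part of Lichess or python-chess.

```python
import math

K = 0.00368208  # coefficient from the Lichess equation above

def win_percent_long(cp):
    # original form: 50 + 50 * (2 / (1 + exp(-K * cp)) - 1)
    return 50 + 50 * (2 / (1 + math.exp(-K * cp)) - 1)

def win_percent_short(cp):
    # simplified form: 100 / (1 + exp(-K * cp))
    return 100 / (1 + math.exp(-K * cp))

# print both forms side by side for a few centipawn values
for cp in (-300, 0, 27, 300):
    print(cp, round(win_percent_long(cp), 4), round(win_percent_short(cp), 4))
```

Both columns should agree, and dividing the 27 centipawn value by 100 gives the 0 to 1 scale that my function returns.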

To evaluate, for instance, the move 1.d4, a board with the move 1.d4 already played can be set up using its FEN (Forsyth–Edwards Notation) as follows:

board = chess.Board('rnbqkbnr/pppppppp/8/8/3P4/8/PPP1PPPP/RNBQKBNR b KQkq - 0 1')

To show this board:

display(board)

Alternatively, a board in the starting position can be created and "a move can be made":

board = chess.Board()
move = chess.Move.from_uci("d2d4")
board.push(move)
display(board)

The difference is that in this way the move is shown on the board.

Getting the UCI (Universal Chess Interface), SAN (Standard Algebraic Notation) and FEN notations for the current move/board can be done using:

move.uci() # gets the move in UCI notation
board.san(move) # gets the move in SAN notation; call it before board.push(move), while the move is still legal
board.fen() # gets the current board's FEN

I can now evaluate this position, at a required depth, using my version of the Stockfish 16 engine:

engine = chess.engine.SimpleEngine.popen_uci('stockfish-ubuntu-x86-64-avx2')
info = engine.analyse(board, limit=chess.engine.Limit(depth=36))

Time taken to evaluate 1.d4 at depth of 36: 0 days 00:00:42.480189

Alternatively, a time limit for the analysis can be set:

info = engine.analyse(board, limit=chess.engine.Limit(time=2.0)) # runs engine's analysis for 2 seconds

It is also possible to get the best moves:

info = engine.analyse(board, chess.engine.Limit(time=2.0), multipv=3)# gets the 3 best moves

In this case info will be a list with 3 elements: info[0], info[1] and info[2].

Without multipv, info is a dictionary that contains useful information, for example:

info['depth'] # gets the depth of the analysis
info['score'] # gets the score from the perspective of the side to move
info['score'].white().score() # gets the score in centipawns from White's perspective (None if the score is a mate)
info['score'].white().is_mate() # True if the score is a mate
info['score'].white().mate() # number of moves until mate (zero if mate is on the board)
info['score'].wdl().white() # Win/Draw/Loss distribution out of 1000 games from White's perspective

You can get Black's perspective by replacing .white() with .black().

A couple of examples:

1.d4 centipawns, from White perspective, at depth=36: 27
1.d4 Win/Draw/loss distribution, from White perspective, at depth=36: Wdl(wins=21, draws=978, losses=1)

Consequently, White's 1.d4 expected scores are:

Lichess model (given my Stockfish 16 evaluation at depth 36): 1.0/(1.0+exp(-0.00368208 * 27)) = 0.5248
Lichess master database: 0.33+0.44/2 = 0.55 (see next figure)
Stockfish 16 model at depth 36: (21+978/2)/1000 = 0.51
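The last two expected scores above come from the same rule: a win counts 1, a draw counts 1/2 and a loss counts 0. A minimal sketch of that arithmetic follows; the function name is mine, and the 23% loss share for the master database is my own inference from 100 - 33 - 44, since it is not quoted in the text.

```python
def expected_score(wins, draws, losses):
    # a win counts 1, a draw counts 1/2, a loss counts 0
    total = wins + draws + losses
    return (wins + draws / 2) / total

# Stockfish 16 WDL for 1.d4 quoted above (out of 1000 games)
stockfish_score = expected_score(21, 978, 1)

# Lichess master database shares in percent; the loss share is assumed
masters_score = expected_score(33, 44, 23)

print(round(stockfish_score, 2), round(masters_score, 2))
```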

The previous figure also shows that Lichess' Stockfish 16 gives White an advantage of 0.1 pawns, or 10 centipawns, for the move 1.d4 at a depth of 36.

Therefore:

Chess engines evaluations and reality don't necessarily agree. People don't play like engines.
My version of Stockfish 16 does not automatically give the same evaluations as Lichess' Stockfish 16.
Running my engine twice does not necessarily yield the same result.

Engines don't necessarily agree with each other (or with themselves). It is important to have this in mind.

The following figure reproduces Lichess' White's expected score as a function of centipawns.

Move accuracy

Lichess calculates move accuracy as follows:

Accuracy% = 103.1668 * exp(-0.04354 * (winPercentBefore - winPercentAfter)) - 3.1669

My implementation is:

def lichess_move_accuracy(score_100_games_diff, par=(103.1668, -0.04354, -3.1669)):
    # score_100_games_diff: win% lost by the player who moved (winPercentBefore - winPercentAfter)
    accuracy = par[0] * np.exp(par[1] * score_100_games_diff) + par[2]
    return accuracy

As an example, let's imagine that after 1.d4 Black plays 1...a6 (which is the least popular response to 1.d4 in Lichess's master database).

The board is:

The evaluations are:

Move	Centipawns	White score
1.d4	27	0.524834
1...a6	57	0.552278

Accuracy of the move 1...a6 = 103.1668 * exp(-0.04354 * (55.23 - 52.48)) - 3.1669 = 88.38%

The following graph reproduces Lichess' accuracy as a function of the score difference over 100 games.
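The 1...a6 calculation can be reproduced end to end from the two centipawn values in the table. This is a minimal standard-library sketch; the function names are mine, and the constants are the ones quoted from Lichess above.

```python
import math

def white_win_percent(cp, k=0.00368208):
    # White's expected score over 100 games
    return 100 / (1 + math.exp(-k * cp))

def move_accuracy(win_percent_loss):
    # Lichess' accuracy formula applied to the win% lost by the mover
    return 103.1668 * math.exp(-0.04354 * win_percent_loss) - 3.1669

before = white_win_percent(27)  # position after 1.d4
after = white_win_percent(57)   # position after 1...a6

# Black made the move, so Black's loss equals White's gain
loss_for_black = after - before
accuracy = move_accuracy(loss_for_black)

print(round(before, 2), round(after, 2), round(accuracy, 2))
```

A loss of zero gives an accuracy of about 100, which is why a move that keeps the evaluation unchanged is treated as perfect.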

Reading and parsing a PGN file

I use python-chess to read and parse PGN files in 3 different ways:

reading multiple games from one file (by the way, reading large PGN files using this module is extremely slow):

pgn_headers = [
    'Event',
    'Site',
    'Date',
    'Round',
    'White',
    'Black',
    'Result',
    'ResultDecimal',
    'WhiteTitle',
    'BlackTitle',
    'WhiteElo',
    'BlackElo',
    'ECO',
    'Opening',
    'Variation',
    'WhiteFideId',
    'BlackFideId',
    'EventDate',
    'Annotator',
    'PlyCount',
    'TimeControl',
    'Time',
    'Termination',
    'Mode',
    'FEN',
    'SetUp',
    'Moves',
]

games = pd.DataFrame(columns=pgn_headers)
game_n = 0

with open(pgn_file_name) as f:

    while True:
        game = chess.pgn.read_game(f)
        # If there are no more games, exit the loop
        if game is None:
            break

        games.loc[game_n] = pd.Series(game.headers)
        game_n = game_n + 1
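When only the headers are needed, python-chess also offers chess.pgn.read_headers, which skips the moves instead of parsing them and may help with large files. The sketch below is only an illustration under that assumption: the two tiny games and the player names are made up, and an in-memory string stands in for a real PGN file.

```python
import io
import chess.pgn

# two made-up games, standing in for a real PGN file
pgn_text = """[Event "Example"]
[White "Player A"]
[Black "Player B"]
[Result "1-0"]

1. e4 e5 1-0

[Event "Example"]
[White "Player C"]
[Black "Player D"]
[Result "0-1"]

1. d4 d5 0-1
"""

rows = []
handle = io.StringIO(pgn_text)
while True:
    headers = chess.pgn.read_headers(handle)  # reads the tags, skips the moves
    if headers is None:  # no more games
        break
    rows.append(dict(headers))

print(len(rows), rows[0]["White"], rows[1]["Result"])
```

Collecting plain dictionaries in a list and building the DataFrame once at the end, with pd.DataFrame(rows), also avoids growing the DataFrame one row at a time.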

reading a PGN file with a single game:

with open(pgn_file_name) as f:
    game = chess.pgn.read_game(f)

creating a game from a string of moves:

import io

# the moves are split over two string literals; note the trailing space after Kxe7
moves = ('1. e4 e5 2. Nf3 Nh6 3. Nxe5 d6 4. Nf3 Qf6 5. d4 Ng4 6. Bg5 Qg6 7. Bd3 Be7 8. Bxe7 Kxe7 '
         '9. e5 f5 10. exf6+ Qxf6 11. O-O Bf5 12. Re1+ Kf8 13. Bxf5 Qxf5 14. Qe2 Nc6 15. h3 Re8 16. Qxe8# ')
game = chess.pgn.read_game(io.StringIO(moves))
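Once a game has been parsed, its moves can be walked one by one to collect the SAN and UCI notations. The following is a minimal sketch of that loop using the same moves; the rows list and its layout are my own choice for illustration.

```python
import io
import chess
import chess.pgn

moves = ('1. e4 e5 2. Nf3 Nh6 3. Nxe5 d6 4. Nf3 Qf6 5. d4 Ng4 6. Bg5 Qg6 '
         '7. Bd3 Be7 8. Bxe7 Kxe7 9. e5 f5 10. exf6+ Qxf6 11. O-O Bf5 '
         '12. Re1+ Kf8 13. Bxf5 Qxf5 14. Qe2 Nc6 15. h3 Re8 16. Qxe8# ')
game = chess.pgn.read_game(io.StringIO(moves))

board = game.board()  # starting position of the game
rows = []
for move in game.mainline_moves():
    turn = "White" if board.turn == chess.WHITE else "Black"
    # SAN depends on the position, so ask for it before pushing the move
    rows.append((turn, board.san(move), move.uci()))
    board.push(move)

print(rows[:3])
print(board.is_checkmate())
```

Inside the same loop an engine could be asked to evaluate the board after each push.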

These days I play on Lichess mostly anonymously but I still like to analyse my games. Hence I use this method for my anonymous games.

To copy the moves, on Lichess I right click the game's last move on the analysis board, choose "copy variation PGN" and then paste it into my Python code, as you can see above.

Lichess' average accuracy and advantage chart

For this section I will use one of my games as an example:

https://lichess.org/7O3mPKYY

Looking at the game from White's perspective, logic demands that when White moves, either White keeps its score (100% accuracy) or White's score gets worse (less than 100% accuracy). When it is Black's turn, either White's score is unchanged (100% accuracy by Black) or White's score improves (less than 100% accuracy for Black).

Of course, engines are not perfect. Therefore the score the engine gives White can improve after White's move or decrease after Black's move. In both cases I set White's score differential to zero, and I call the result "White's adjusted expected score differential".

Finally, how many centipawns is, for instance, a mate in 4 worth? This question is important because to calculate accuracy using Lichess' function it is necessary to know the score, and to know the score it is necessary to know the centipawns.
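The adjustment described above amounts to clamping the differential according to who moved. A minimal sketch, assuming the differential is always White's score after the move minus White's score before it; the function name is mine.

```python
def adjusted_diff(diff, white_moved):
    # diff: White's expected score after the move minus before the move
    if white_moved:
        # White's own move cannot improve White's score, so gains become 0
        return min(diff, 0.0)
    # Black's move cannot lower White's score, so drops become 0
    return max(diff, 0.0)

# a few differentials of the kind an engine can produce
print(adjusted_diff(0.46, white_moved=True))    # gain after White's move
print(adjusted_diff(-1.19, white_moved=True))   # loss after White's move
print(adjusted_diff(-0.33, white_moved=False))  # drop after Black's move
print(adjusted_diff(13.57, white_moved=False))  # gain after Black's move
```

The absolute value of the adjusted differential is then the win% lost by the player who moved, which is what the accuracy formula needs.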

Theoretically the score of mate in 4 is 1, meaning that whichever color has a mate in 4 will win the game. In practice chess players frequently miss these opportunities (we all have been there, right?), consequently the score of a mate in 4 is not 1.

Python-chess converts a mate in m moves to centipawns by subtracting m from an arbitrarily large number:

chess.engine.Mate(m).score(mate_score=1800) # converts a mate in m to centipawns = mate_score - m

As you can see above, I chose 1800, equivalent to 2 queens, as this arbitrarily large number.

Given my choice, for instance mate in 11 is converted to 1800-11 = 1789 centipawns.
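To make the conversion concrete, here is a plain-Python sketch of the same arithmetic, followed by the expected score it leads to. The helper names are mine. The positive branch follows the rule stated above; the negative branch (being mated in -m moves) is my understanding of how python-chess mirrors it and should be read as an assumption.

```python
import math

def mate_to_cp(m, mate_score=1800):
    # m > 0: the side being scored mates in m moves
    if m > 0:
        return mate_score - m
    # m < 0: the side being scored gets mated in -m moves (assumed mirror rule)
    if m < 0:
        return -mate_score - m
    raise ValueError("this sketch only covers non-zero m")

def white_win_percent(cp, k=0.00368208):
    return 100 / (1 + math.exp(-k * cp))

print(mate_to_cp(11))                               # mate in 11
print(round(white_win_percent(mate_to_cp(1)), 2))   # White mates in 1
print(round(white_win_percent(mate_to_cp(-1)), 2))  # White gets mated in 1
```

With 1800 as the large number, a mate in 1 for White already maps to an expected score very close to 100, so the exact choice matters little as long as it is well above normal evaluations.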

The following table shows:

the moves in UCI and SAN notations,
White's advantage in centipawns,
White's expected score,
White's expected score differential,
White's adjusted expected score differential and
the depth at which the engine calculated White's advantage in centipawns in 2 seconds of analysis.

Turn	SAN	UCI	CP	Score	Diff	Adj_diff	Acc	Depth
White	e4	e2e4	30	52.76	0.46	0	100	25
Black	e5	e7e5	33	53.03	0.28	0.28	98.77	26
White	Nf3	g1f3	20	51.84	-1.19	-1.19	94.77	27
Black	Nh6	g8h6	173	65.41	13.57	13.57	53.98	24
White	Nxe5	f3e5	189	66.73	1.32	0	100	25
Black	d6	d7d6	185	66.4	-0.33	0	100	25
White	Nf3	e5f3	195	67.22	0.82	0	100	25
Black	Qf6	d8f6	291	74.49	7.27	7.27	72	25
White	d4	d2d4	307	75.59	1.1	0	100	24
Black	Ng4	h6g4	385	80.5	4.9	4.9	80.16	25
White	Bg5	c1g5	296	74.84	-5.66	-5.66	77.47	23
Black	Qg6	f6g6	313	76	1.16	1.16	94.92	24
White	Bd3	f1d3	242	70.91	-5.09	-5.09	79.51	24
Black	Be7	f8e7	397	81.18	10.27	10.27	62.8	24
White	Bxe7	g5e7	408	81.79	0.61	0	100	24
Black	Kxe7	e8e7	410	81.9	0.11	0.11	99.51	25
White	e5	e4e5	351	78.46	-3.45	-3.45	85.63	24
Black	f5	f7f5	406	81.68	3.23	3.23	86.48	25
White	exf6+	e5f6	309	75.73	-5.95	-5.95	76.44	24
Black	Qxf6	g6f6	319	76.4	0.67	0.67	97.03	25
White	O-O	e1g1	320	76.46	0.07	0	100	26
Black	Bf5	c8f5	405	81.63	5.16	5.16	79.23	25
White	Re1+	f1e1	386	80.55	-1.07	-1.07	95.29	24
Black	Kf8	e7f8	516	86.99	6.43	6.43	74.79	25
White	Bxf5	d3f5	379	80.15	-6.84	-6.84	73.42	24
Black	Qxf5	f6f5	399	81.29	1.15	1.15	94.98	26
White	Qe2	d1e2	381	80.26	-1.03	-1.03	95.48	23
Black	Nc6	b8c6	371	79.67	-0.59	0	100	25
White	h3	h2h3	364	79.25	-0.42	-0.42	98.13	24
Black	Re8	a8e8	1799	99.87	20.61	20.61	38.88	245
White	Qxe8#	e2e8	1800	100	0.13	0	100	0

Average accuracy

To calculate White's and Black's average accuracy, as far as I understand, Lichess uses a harmonic mean, which can be easily calculated using the function hmean in the module scipy.stats.

The following table shows the average accuracy for White and Black as calculated by me and by Lichess.

	White	Black
Me	91	76
Lichess	91	77
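As an illustration, the harmonic mean of White's move accuracies from the move table can be computed as follows. This sketch uses harmonic_mean from Python's standard statistics module as a stand-in for scipy.stats.hmean, and the list is simply White's Acc column copied by hand.

```python
from statistics import harmonic_mean

# White's move accuracies, copied from the Acc column of the move table
white_accuracy = [100, 94.77, 100, 100, 100, 77.47, 79.51, 100,
                  85.63, 76.44, 100, 95.29, 73.42, 95.48, 98.13, 100]

white_average = harmonic_mean(white_accuracy)
print(round(white_average))
```

The harmonic mean is pulled down by poor moves more strongly than the arithmetic mean, so a few inaccurate moves weigh heavily on the average.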

The next table compares my and Lichess' average accuracy for a few random games.

		White	Black
Carlsen vs Esipenko	Me	97	92
	Lichess	96	90
Firouzja vs Nepo	Me	88	95
	Lichess	87	95
Game ve1epqzT	Me	81	86
	Lichess	83	87
Game lO13IlTf	Me	80	60
	Lichess	78	63
Game sLv7lVYF	Me	81	90
	Lichess	77	87

As can be seen, the accuracy numbers are not the same but they are close enough, certainly close enough for my purpose.

Advantage chart

To replicate Lichess' advantage plot, White's expected score needs to be converted from the [0, 1] interval to [-1, 1] by calculating 2*lichess_white_expected_score-1.
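A minimal sketch of this rescaling, using only the standard library; the function advantage is a name I introduce here for illustration.

```python
import math

def white_expected_score(cp, k=0.00368208):
    # White's expected score on the [0, 1] scale
    return 1.0 / (1.0 + math.exp(-k * cp))

def advantage(cp):
    # rescale from [0, 1] to [-1, 1]; 0 means an equal position
    return 2 * white_expected_score(cp) - 1

for cp in (-1800, -100, 0, 30, 1800):
    print(cp, round(advantage(cp), 3))
```

The result is symmetric around zero, so equal and opposite centipawn values give equal and opposite advantages.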

The next figure reproduces Lichess' advantage chart for my game and also shows the actual Lichess graph for comparison.

Final thoughts

Just for fun, it is possible to make two engines play each other:

engine1 = chess.engine.SimpleEngine.popen_uci('your engine 1')
engine2 = chess.engine.SimpleEngine.popen_uci('your engine 2')

board = chess.Board()

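# engine1 plays White and engine2 plays Black; the game-over check runs before every move
while not board.is_game_over():
    engine = engine1 if board.turn == chess.WHITE else engine2
    result = engine.play(board, chess.engine.Limit(time=1.0))
    board.push(result.move)
    display(board)

# shut down both engine processes
engine1.quit()
engine2.quit()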

The moves played can be found in:

board.move_stack
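The move stack holds Move objects, so a readable move list needs one more step. A minimal sketch using Board.variation_san, with three moves pushed by hand instead of an engine game so that it runs without an engine installed:

```python
import chess

board = chess.Board()
for san in ("e4", "e5", "Nf3"):
    board.push_san(san)

# variation_san must be called on the position the moves start from,
# here a fresh board in the starting position
line = chess.Board().variation_san(board.move_stack)
print(line)
```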

I hope that this blog will help you start using python-chess, if you are interested in this sort of thing, of course.
