
Enable download of time-control specific databases (e.g. Blitz, Rapid, Classical databases)

Hello everyone,

First of all, I think that Lichess is a real blessing, both for the chess world and for a big data fanatic like me. I love the Lichess APIs as well as the monthly game databases.

Now, with the growth of Lichess, the monthly game databases have become large. And by large, I mean LARGE. The compressed .zst files are 30 GB each, which, once decompressed, easily amount to more than 500 GB. Per month. Unpacking them is a challenge in its own right, and opening the PGNs with any tool is virtually impossible. One needs to split them into "digestible" sizes for SCID, for example, to handle them. And when talking about big data, we of course talk about Python and its wonderful libraries, which one can use to calculate regression curves, for instance. But streaming over text files larger than 500 GB would put huge stress on a normal computer.

I wonder if it would be possible to provide a separate database for each time control, and maybe one for game variants. Bullet games surely make up the vast majority of all games, despite being the ones a serious chess player would probably least want to look at. Also, for data analysis one probably wouldn't need all >500 GB of games; a smaller subset would suffice for some data fun.
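To illustrate what I have in mind, here is a minimal sketch of how such a split could be done locally in one streaming pass, so that only one game is held in memory at a time. It assumes that every game starts with an `[Event "..."]` header naming the speed category (e.g. "Rated Blitz game"), which is what I see in the exports. The function names `iter_games` and `filter_by_event` and the sample data are made up for this example.

```python
from typing import Iterable, Iterator, List


def iter_games(lines: Iterable[str]) -> Iterator[List[str]]:
    """Yield one game at a time as a list of lines (streaming, constant memory)."""
    game: List[str] = []
    for line in lines:
        # Assumption: the Event tag is the first header of every game.
        if line.startswith("[Event ") and game:
            yield game
            game = []
        # Skip blank lines before the first game; keep everything else.
        if line.strip() or game:
            game.append(line.rstrip("\n"))
    if game:
        yield game


def filter_by_event(lines: Iterable[str], keyword: str) -> Iterator[List[str]]:
    """Yield only the games whose Event header contains `keyword` (case-insensitive)."""
    for game in iter_games(lines):
        header = game[0]
        if header.startswith("[Event ") and keyword.lower() in header.lower():
            yield game


# Tiny made-up sample standing in for a decompressed monthly dump.
SAMPLE = """[Event "Rated Bullet game"]
[TimeControl "60+0"]

1. e4 e5 1-0

[Event "Rated Blitz game"]
[TimeControl "300+0"]

1. d4 d5 0-1
"""

blitz_games = list(filter_by_event(SAMPLE.splitlines(), "Blitz"))
for game in blitz_games:
    print("\n".join(game))
```

In practice the `lines` argument would be a text stream over the decompressed data rather than a string, for instance a file object wrapped around a zstd stream reader from the third-party `zstandard` package. I have not measured how long a full month takes this way, and it still means downloading the whole file first.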

Would that be possible in some way?

Thanks and keep up the great work!
