Can ChatGPT really play chess? This question motivated me to run a chess match between ChatGPT and my hybrid AI model, an expert chess bot. The first game was against GPT 3.5, and it exposed several limitations of OpenAI's LLM: the match was hard to play to the end because ChatGPT showed a poor grasp of the rules of chess, made many illegal moves, and produced incorrect analysis.
This analysis matters for understanding the limitations of LLMs, in particular their long-term reasoning and analytical power. By knowing a model’s behavior well, we can find ways to work around its flaws and build on its strengths. As AI engineers, we should keep setting up experiments to analyze how models really behave, and plan to adapt and improve them in our projects. Large language model technology is still very recent and needs to be explored and studied further to ensure it is well understood and well used.
After beating ChatGPT 3.5, we now face a more difficult opponent: its successor, the more powerful ChatGPT 4.0. This new challenge raised several questions:
- Has GPT 4.0 really improved on GPT 3.5 in complex analysis?
- Will we be able to play an entire match this time?
- Will GPT 4.0 make the same mistakes as GPT 3.5?
- Will GPT 4.0 be able to beat my expert AI model?