Thank you very much for your answers fellows. I think I am going to start with a less ambitious computer program, the one relate with how the cards are dealt and announced to the players. Thanks again!
Learn the basics of PyTorch and use a model from the huggingface transformers package.
Collect a small dataset and see if you can get a model to correctly identify the scenarios you give it.
Iterate from there.
You could just mainly use the OpenAI APIs. You want it to recognize and encode the board into text at the beginning by sending an image to the chat endpoint. You could use their Whisper API or Deepgram API for speech to text. You would probably need to send every speech utterance that has been transcribed to gpt-3.5. it would then output an updated encoded and marked board each time and a flag indicating a win or not. If it sets the flag then you can play an mp3 declaring the win or something.
As long as you can find the documentation for OpenAI and stick with it, you will be able to do it. You should be able to get into the docs with a Google search.