MuGo: A minimalist Go engine modeled after AlphaGo

This is a pure Python implementation of a neural-network based Go AI, using TensorFlow.

Currently, the AI consists solely of a policy network, trained using supervised learning. I have implemented Monte Carlo Tree Search, but the simulations are too slow, due to being written in Python. I am hoping to bypass this issue entirely by replacing the simulations with a value network, which requires only a single NN evaluation. (After all, random simulations are but a crude approximation to a value function, so if you have a good enough value function, you won't need a playout…)
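The trade-off above can be sketched in a few lines. This is an illustrative comparison, not MuGo's actual API: `play_random_game` and `value_net` are hypothetical placeholders for a random-playout routine and a trained value network.

```python
# Sketch of the two leaf-evaluation strategies discussed above.
# `play_random_game` and `value_net` are hypothetical placeholders.

def evaluate_with_playout(position, play_random_game, n_sims=100):
    # Classic Monte Carlo: average the outcomes of many random games.
    # Each call simulates a full game -- slow in pure Python.
    return sum(play_random_game(position) for _ in range(n_sims)) / n_sims

def evaluate_with_value_net(position, value_net):
    # Value-network evaluation: a single forward pass estimates the
    # same win probability that the playouts approximate.
    return value_net(position)
```

The point is that the value network collapses hundreds of simulated games into one network evaluation at the leaf.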

The goal of this project is to see how strong a Go AI based purely on neural networks can be. In other words, a UCT-based tree search with moves seeded by a policy network, and a value network to evaluate the choices. An explicit non-goal is diving into the fiddly bits of optimizing Monte Carlo simulations.
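To make "moves seeded by a policy network" concrete, here is a minimal sketch of PUCT-style child selection, the rule AlphaGo's tree search uses. The `Node` class and constants are illustrative assumptions, not MuGo's actual data structures.

```python
import math

class Node:
    """Illustrative search-tree node; `prior` is the policy network's
    probability for the move leading to this node."""
    def __init__(self, prior):
        self.prior = prior        # P(s, a) from the policy network
        self.visits = 0           # N(s, a)
        self.value_sum = 0.0      # accumulated value-network evaluations
        self.children = []

def puct_select(node, c_puct=1.5):
    """PUCT selection: pick the child maximizing Q + U, where the policy
    prior seeds exploration toward moves the network likes."""
    total = sum(c.visits for c in node.children)
    def score(c):
        q = c.value_sum / c.visits if c.visits else 0.0
        u = c_puct * c.prior * math.sqrt(total) / (1 + c.visits)
        return q + u
    return max(node.children, key=score)
```

With `c_puct > 0`, an unvisited move with a strong prior can outrank an already-visited move with a decent value estimate, which is exactly how the policy network steers the search.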
