
Zhou Jianming and Liu Yu: Philosophy of Intelligence: AlphaGo Zero and the Culture of Go

Updated: 2017-11-07 11:39:45
Authors: Zhou Jianming (周剑铭), Liu Yu (柳渝)

  


* Department of Computer Science, Université de Picardie Jules Verne, France

  

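Abstract: After AlphaGo's decisive victory over human players, AlphaGo Zero's decisive victory over AlphaGo shows precisely the technical nature of the Go machine as artificial intelligence. The professionalization of ancient Chinese Go in Japan was also a technicalization of the game, and it is what made today's complete victory of Go machines over humans inevitable. The cultural essence of Chinese Go lies in the art of the game (棋艺) and the way of the game (棋道). The principles of Go can be truly explicated only from a perspective in which science and the humanities, and Chinese and Western cultures, intersect.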

  

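AlphaGo defeated human players by learning from records of human games and became an emblem of the age of artificial intelligence. AlphaGo Zero has now made headlines again by learning from a "blank slate" (tabula rasa). The British empiricist philosopher John Locke (1632-1704), in his famous theory of tabula rasa, held that the mind at birth is as blank as a clean slate and that ideas and knowledge arise in it only through experience; for Locke, experience is the sole source of ideas and knowledge. AlphaGo Zero's "blank slate" refers to the empty board, as opposed to records of human play, that is, to "learning" that starts from zero. Locke's blank slate of the mind, by contrast, describes a human being who comes to know or learn from real-world experience. The difference between the two is that AlphaGo Zero needs no human game records, only the "experience" of playing against itself on the board. The subtlety of this difference lies in how human experience and machine "experience" differ in essence. Unlike the ethical challenge that AlphaGo posed to humanity, AlphaGo Zero's "blank slate" poses a challenge to human philosophy. These questions bear deeply on how we understand and define the nature of artificial intelligence. They have in effect become a renewal of our basic cognitive theory of human intelligence, and their significance far exceeds that of AlphaGo Zero's success.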

  

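With regard to the specific case of AlphaGo Zero, this article discusses two questions: (1) How does AlphaGo Zero's "blank slate" differ from the "blank slate" of the human mind? (2) How does the experience AlphaGo Zero gains from self-play differ in essence from human experience? The deeper significance of these questions can be studied within the domain of the philosophy of intelligence.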

  

I. AlphaGo Zero's "Blank Slate" Learning and the "Innate" Endowment of Artificial Intelligence

  

In a paper published in Nature, "Mastering the game of Go without human knowledge", the DeepMind team presented AlphaGo Zero, the latest version of its Go program, with a more powerful capacity to "learn". AlphaGo Zero reportedly defeated the Lee Sedol version of AlphaGo 100 to 0 (http://nature.com/articles/doi:10.1038/nature24270; an introduction in Chinese is available at http://mp.weixin.qq.com/s/68GTn-BaiRPmzi9F-0sCyw). The most striking point is this statement: "we introduce an algorithm based solely on reinforcement learning, without human data, guidance, or domain knowledge beyond game rules. AlphaGo becomes its own teacher."

  

David Silver of DeepMind, the paper's first author and lead of the AlphaGo project, explained it this way in an interview:

  

"AlphaGo Zero, which has learned completely from scratch, from first principles, without using any human data, and has achieved the highest level of performance overall. The most important idea in AlphaGo Zero is that it learns completely tabula rasa. That means it starts completely from a blank slate and figures out for itself only from self-play, without any human knowledge, without any human data, without any human examples or features or intervention from humans. It discovers how to play the game of Go completely from first principles. So tabula rasa learning is extremely important to our goals and ambitions at DeepMind. And the reason is that if you can achieve tabula rasa learning, you really have an agent that can be transplanted from the game of Go to any other domain. You untie yourself from the specifics of the domain you’re in and you come up with an algorithm which is so general that it can be applied anywhere. For us the idea of AlphaGo is not to go out and defeat humans, but actually to discover what it means to do science, and for a program to be able to learn for itself what knowledge is. So, what we started to see was that AlphaGo Zero not only rediscovered the common patterns and openings that humans tend to play, these joseki patterns that humans play in the corners. It also learned them, discovered them and ultimately discarded them in preference for its own variants which humans don’t even know about or play at the moment. And so we can say that really what’s happened is that in a short space of time, AlphaGo Zero has understood all of the Go knowledge that has been accumulated by humans over thousands of years of playing. And it’s analyzed it and started to look at it and discover much of this knowledge for itself. And sometimes it’s chosen to actually go beyond that and come up with something which humans hadn’t even discovered in this time period. And developed new pieces of knowledge which were creative and novel in many ways."

  

DeepMind stresses that AlphaGo Zero teaches itself starting from a blank slate. This means that when the machine enters training or actual play, it does not begin by learning from massive amounts of human data ("People tend to assume that machine learning is all about big data and massive amounts of computation"). Yet AlphaGo Zero at that point is not itself a blank slate (a bare machine), nor is it a clean machine containing only an "operating system"; it is a machine already endowed with powerful machine-learning capability. As David Silver puts it: "But actually what we saw in AlphaGo Zero is that algorithms matter much more than either compute or data availability. In fact in AlphaGo Zero, we use more than an order of magnitude less computation than we used in previous versions of AlphaGo. And yet it was able to perform at a much higher level due to using much more principled algorithms than we had before."

Editor: Mr. Chuan
Posted from: Aisixiang (http://m.aisixiang.com)
Article link: http://m.aisixiang.com/data/106762.html
Source: first published on Aisixiang; please credit the source when reprinting (http://www.aisixiang.com).