西西河

Topic: [Original] Some Speculations Arising Around Brain Science -- 鸿乾

Path Integrals vs. Machine Learning

When I get time, I will read and comment on the following:

"Path Integral Reinforcement Learning"

http://homes.cs.washington.edu/~etheodor/papers/LearningWorkshop11.pdf

1.

Related searches - uploads at www.pudn.com (Programmers United Develop Net)

s.pudn.com/search_uploads.asp?k=lingo

... optimal, and on that basis takes roadside constraints, dynamic obstacle avoidance, and shortest path as the fitness function, proposing (322KB, .... 72. jiqixuexi.rar - this is part three of the complete collection of game-theory algorithms: machine learning; the remaining algorithms will be released in due course. ..... Its job is to convert an integral equation into a difference equation, that is, to turn the integral in the equation into a finite sum, ...

Mathematics related to machine learning and computer vision - godenlove007's column - blog channel ...

blog.csdn.net/godenlove007/article/details/8510392

Jan 16, 2013 – Mathematics related to machine learning and computer vision, part one (reposted from a piece by an MIT researcher, ... and in statistics, marginalization is inseparable from integration; still, expressing it in closed form ..... The directory path must not contain Chinese characters or spaces, must start with a letter, and should not be too long.

ScienceNet: "Lie Group Machine Learning" by Li Fanchang et al. - a blog post by the University of Science and Technology of China Press

blog.sciencenet.cn/blog-502977-684746.html

Apr 28, 2013 – From historical experience, machine learning research should "take cognitive science as its foundation and mathematical methods as its tools, ... approach, and build the theory, techniques, methods, and applications of machine learning along such a path".

A lecture by Liu Kefeng, chair of the Department of Mathematics at Zhejiang University - bluenight's column - blog channel ...

blog.csdn.net/chl033/article/details/4888555

Nov 27, 2009 – The way physicists learn mathematics is perhaps worth borrowing; Witten and his colleagues probably never do ... Although Feynman's path integral still lacks a rigorous mathematical foundation, the theory, owing to its physical ...

A frontier topic in machine learning: Deep Learning - 大枫叶_HIT - blog channel - CSDN ...

blog.csdn.net/datoubo/article/details/8596444

Feb 20, 2013 – Deep learning is a new area of machine learning research, motivated by building neural networks that simulate how the human brain analyzes and learns ... A notable property of such a flow graph is its depth: the length of the longest path from an input to an output. ...
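
The last snippet defines a network's depth as the length of the longest path from an input to an output in its flow graph. That quantity is easy to compute on a directed acyclic graph; here is a minimal sketch (the graph and node names below are invented for illustration, not taken from the cited post):

```python
def depth(graph, node, memo=None):
    """Longest path (counted in edges) from `node` to any sink, via memoized DFS."""
    if memo is None:
        memo = {}
    if node not in memo:
        succs = graph.get(node, [])
        # A sink (output node) has depth 0; otherwise 1 + deepest successor.
        memo[node] = 0 if not succs else 1 + max(depth(graph, s, memo) for s in succs)
    return memo[node]

# Hypothetical flow graph: input -> h1 -> h2 -> output, plus a skip edge input -> h2.
flow = {"input": ["h1", "h2"], "h1": ["h2"], "h2": ["output"]}
print(depth(flow, "input"))  # 3, the longest input-to-output path
```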

2.

The abstract of the following paper:

"Path Integral Reinforcement Learning"

http://homes.cs.washington.edu/~etheodor/papers/LearningWorkshop11.pdf

Abstract—Reinforcement learning is one of the most fundamental frameworks of learning control, but applying it to high-dimensional control systems, e.g., humanoid robots, has largely been impossible so far. Among the key problems are that classical value function-based approaches run into severe limitations in continuous state-action spaces due to issues of function approximation of value functions, and, moreover, that the computational complexity and time of exploring high-dimensional state-action spaces quickly exceed practical feasibility. As an alternative, researchers have turned to trajectory-based reinforcement learning, which sacrifices global optimality in favor of being applicable to high-dimensional state-action spaces. Model-based approaches, inspired by ideas of differential dynamic programming, have demonstrated some success if models are accurate, but model-free trajectory-based reinforcement learning has been limited by problems of slow learning and the need to tune many open parameters.

In this paper, we review some recent developments of trajectory-based reinforcement learning using the framework of stochastic optimal control with path integrals. The path integral control approach transforms the optimal control problem into an estimation problem based on Monte Carlo evaluations of a path integral. Based on this idea, a new reinforcement learning algorithm can be derived, called Policy Improvement with Path Integrals (PI2). PI2 is surprisingly simple and works as a black-box learning system, i.e., without the need for manual parameter tuning. Moreover, it learns fast and efficiently in very high-dimensional problems, as we demonstrate in a variety of robotic tasks. Interestingly, PI2 can be applied in model-free, hybrid, and model-based scenarios. Given its solid foundation in stochastic optimal control, path integral reinforcement learning offers a wide range of applications of reinforcement learning to very complex and new domains.
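
The core move the abstract describes, turning the optimal control problem into an estimation problem via Monte Carlo evaluation of a path integral, can be sketched as: sample noisy rollouts around the current policy parameters, score each trajectory by its cost, and average the exploration noise with exponentiated-cost weights. Below is a minimal toy sketch of that update; the quadratic rollout cost and all parameter values are illustrative assumptions of mine, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def trajectory_cost(theta):
    """Toy stand-in for a rollout: quadratic cost around a fixed target."""
    target = np.array([1.0, -0.5, 0.25])
    return float(np.sum((theta - target) ** 2))

def pi2_update(theta, n_rollouts=32, sigma=0.3, lam=0.1):
    # 1. Explore: perturb the policy parameters with Gaussian noise.
    eps = rng.normal(0.0, sigma, size=(n_rollouts, theta.size))
    costs = np.array([trajectory_cost(theta + e) for e in eps])
    # 2. Weight: exponentiate normalized negative cost (softmax over rollouts).
    s = (costs - costs.min()) / max(costs.max() - costs.min(), 1e-12)
    w = np.exp(-s / lam)
    w /= w.sum()
    # 3. Update: cost-weighted average of the exploration noise.
    return theta + w @ eps

theta = np.zeros(3)
for _ in range(100):
    theta = pi2_update(theta)
print(trajectory_cost(theta))  # small; well below the initial cost of ~1.31
```

The point of the exponential weighting is that no gradient of the cost is ever needed, which is why the abstract can call PI2 a black-box learning system; in practice the real algorithm applies this update per time step along a trajectory rather than to a single parameter vector as in this toy.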




Copyright © cchere 西西河