This is a mirrored post. Source: 北邮人论坛 (BYR Forum) / ml-dm / #2872, synced on 2008/8/12
ML_DM · posted by bot

Reading Notes on Pattern Recognition and Machine Learning

zixu1986
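2008/8/12 · mirror synced · 6 replies
http://www.zyt.name/ I came across this while browsing the web and it looks pretty good. I suggest reading it in Google Reader, since the site does not seem to open directly.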
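6 replies
zixu1986 (bot) #1 · 2008/8/12
I read the notes on chapter 1 and found a few errors, which I posted in his comments. Then I read his profile more carefully and found that he is at Zhejiang University, and on a closer look, a third-year undergraduate. No wonder he gets confused about some basic concepts in information theory.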
cryppie (bot) #2 · 2008/8/12
thank you
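zixu1986 (bot) #3 · 2008/8/12
Haha, I just argued with him about the relationship between information and entropy. It turns out these two concepts are really easy to mix up.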
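cryppie (bot) #4 · 2008/8/12
Oh, then could you explain to us the differences and connections between information and entropy?
[ zixu1986 wrote: ]
: Haha, I just argued with him about the relationship between information and entropy. It turns out these two concepts are really easy to mix up.
zixu1986 (bot) #5 · 2008/8/12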
Let me paste my reply to him here:

I think some concepts are muddled here. Bishop's definition is consistent with the standard one, but he does not make it crystal clear or state it explicitly, and the distinction is easy to mix up. On page 50 of Bishop's book there is the line: "We have introduced the concept of entropy in terms of the average amount of information needed to specify the state of a random variable." Entropy is the average amount of uncertainty, or "surprise" as you mentioned. Nevertheless, entropy is not information. Information is something that REDUCES uncertainty, something "needed to specify the state of a random variable." In this sense, we can measure information by the reduction in uncertainty, i.e. in entropy. That is why we see all the formulas with information = entropy. The equation does not mean the two are the same thing; it means the quantities are equal.

As for code length, it is not itself entropy. Also on page 50: "The noiseless coding theorem (Shannon, 1948) states that the entropy is a lower bound on the number of bits needed to transmit the state of a random variable." The length of an individual codeword must be an integer, while entropy can be any non-negative real value. The average code length over the whole codebook is bounded below by the entropy. In general it stays strictly above the entropy, meeting it only in special cases such as when every symbol probability is a power of 1/2. So entropy here serves as the limit, not as the code length itself.

In general, the following formula better illustrates the relationship between information and entropy. Let X denote the source; it is a random variable with average uncertainty H(X). Let Y denote the signal at the receiver, which is also a random variable. The information transmitted is I = H(X) - H(X|Y), that is, the reduction in uncertainty about the source from before to after observing the signal Y. In the ideal case, observing Y completely removes the uncertainty about X, so H(X|Y) = 0.
Thus, we say the quantity of information needed to fully determine the source equals the average uncertainty of the source, H(X).
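To make the formula I = H(X) - H(X|Y) concrete, here is a minimal Python sketch. The function names and the three joint distributions are made up for illustration (they are not from Bishop or from the original notes), and a fair binary source is assumed.

```python
import math

def entropy(probs):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def mutual_information(joint):
    """Return (H(X), H(X|Y), I) for a table joint[x][y] = P(X=x, Y=y)."""
    p_x = [sum(row) for row in joint]          # marginal of the source X
    p_y = [sum(col) for col in zip(*joint)]    # marginal of the received Y
    h_x = entropy(p_x)
    # H(X|Y) = sum over y of P(y) * H(X | Y=y)
    h_x_given_y = 0.0
    for y, py in enumerate(p_y):
        if py > 0:
            conditional = [joint[x][y] / py for x in range(len(joint))]
            h_x_given_y += py * entropy(conditional)
    return h_x, h_x_given_y, h_x - h_x_given_y

# Hypothetical channels for a fair binary source (illustrative numbers only).
channels = {
    "noiseless": [[0.5, 0.0], [0.0, 0.5]],      # Y always equals X
    "noisy":     [[0.45, 0.05], [0.05, 0.45]],  # Y differs from X 10% of the time
    "useless":   [[0.25, 0.25], [0.25, 0.25]],  # Y is independent of X
}

for name, joint in channels.items():
    h_x, h_x_given_y, info = mutual_information(joint)
    print(f"{name}: H(X)={h_x:.3f}  H(X|Y)={h_x_given_y:.3f}  I={info:.3f}")
```

In the noiseless case H(X|Y) is 0, so the information transmitted equals H(X); in the useless case nothing is learned and I is 0, even though H(X) is unchanged. That is the sense in which the two quantities can be numerically equal without being the same concept.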
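zixu1986 (bot) #6 · 2008/8/12
In short, entropy is the average uncertainty, while information is what removes uncertainty. The amount of information can be measured by the amount of uncertainty it removes, just as the volume of an object can be measured by submerging it in water and measuring the volume of water it displaces. Hmm, I'm not sure whether that is precise enough.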
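The code-length point from reply #5 can also be checked numerically. The sketch below builds a binary Huffman code (a standard construction chosen here for illustration, not something taken from the notes under discussion) and compares its average codeword length with the entropy. The two example distributions are made up.

```python
import heapq
import math

def entropy(probs):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def huffman_lengths(probs):
    """Return the integer codeword length of each symbol in a binary Huffman code."""
    # Heap entries: (subtree probability, unique tie-breaker, symbols in subtree).
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    tie_breaker = len(probs)
    while len(heap) > 1:
        p1, _, symbols1 = heapq.heappop(heap)
        p2, _, symbols2 = heapq.heappop(heap)
        # Merging two subtrees adds one bit to every symbol beneath them.
        for s in symbols1 + symbols2:
            lengths[s] += 1
        heapq.heappush(heap, (p1 + p2, tie_breaker, symbols1 + symbols2))
        tie_breaker += 1
    return lengths

def average_length(probs):
    """Expected codeword length in bits under the Huffman code."""
    return sum(p * n for p, n in zip(probs, huffman_lengths(probs)))

# Hypothetical source distributions (illustrative numbers only).
for probs in ([0.4, 0.3, 0.2, 0.1], [0.5, 0.25, 0.125, 0.125]):
    print(probs, huffman_lengths(probs),
          f"H={entropy(probs):.3f}", f"avg length={average_length(probs):.3f}")
```

Each codeword length is an integer, while the entropy generally is not. The average length never drops below the entropy, and the two coincide only when every probability is a power of 1/2, as in the second distribution.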