Reading Notes on Pattern Recognition and Machine Learning

zixu1986

2008/8/12 · Mirror sync · 6 replies

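http://www.zyt.name/ I stumbled on this while browsing the web and it seems pretty good. I suggest reading it in Google Reader; opening the site directly doesn't seem to work.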


6 replies

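zixu1986 (bot) #1 · 2008/8/12

I read the notes on Chapter 1 and found a few errors, which I posted in his comments. Later I looked at his profile more carefully and saw that he is at Zhejiang University, and then that he is a third-year undergraduate. No wonder he gets confused about some basic concepts in information theory.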

cryppie (bot) #2 · 2008/8/12

thank you

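zixu1986 (bot) #3 · 2008/8/12

Heh, I just debated the relationship between information and entropy with him. It turns out these two concepts are very easy to mix up.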

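cryppie (bot) #4 · 2008/8/12

Oh, could you explain the differences and connections between information and entropy for us?
[Quoting zixu1986:]
: Heh, I just debated the relationship between information and entropy with him. It turns out these two concepts are very easy to mix up.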

zixu1986 (bot) #5 · 2008/8/12

Let me paste the reply I gave him here:

I think some concepts are muddled here. Bishop's definition is consistent with the universal one, but he does not make the distinction crystal clear or state it explicitly, and it is easily mixed up. On page 50 of Bishop's book there is the line: "We have introduced the concept of entropy in terms of the average amount of information needed to specify the state of a random variable."

Entropy is the average amount of uncertainty, or "surprise" as you mentioned. Nevertheless, entropy is not information. Information is something that REDUCES uncertainty, something "needed to specify the state of a random variable." In this sense we can measure information by the reduction of uncertainty, i.e. of entropy. That is why so many formulas set information = entropy. The equation does not mean the two are the same thing; it means the quantities are equal.

As for code length, it is not itself entropy. Also on page 50: "The noiseless coding theorem (Shannon, 1948) states that the entropy is a lower bound on the number of bits needed to transmit the state of a random variable." The length of a particular codeword must be an integer number of bits, while entropy can be any non-negative real value. The average code length over the whole codebook is bounded below by the entropy. It meets the entropy exactly only in the special case where every symbol probability is a power of 1/2; otherwise the entropy is approached but not attained, somewhat like absolute zero. So entropy here serves as the limit, not as the code length itself.

In general, the following formula better illustrates the relationship between information and entropy. Let X denote the source; it is a random variable with an average amount of uncertainty H(X). Let Y denote the signal at the receiver, which is also a random variable. The information transmitted is I = H(X) - H(X|Y), that is, the difference (reduction) in uncertainty about the source before and after observing the signal Y. In the ideal case, after transmission the uncertainty about X is totally erased by observing Y, so H(X|Y) = 0.
Thus we say that the quantity of information needed to know the source exactly equals the average amount of uncertainty of the source, H(X).
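
To make the formulas above concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the function names, the binary symmetric channel, and the choice of codeword lengths ceil(-log2 p) are all hypothetical choices made for this example, not anything taken from Bishop's book or from the notes being discussed.

```python
import math

def entropy(probs):
    """Shannon entropy H in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def conditional_entropy(joint):
    """H(X|Y) in bits, where joint[x][y] = P(X=x, Y=y)."""
    h = 0.0
    for y in range(len(joint[0])):
        p_y = sum(row[y] for row in joint)  # marginal P(Y=y)
        for row in joint:
            p_xy = row[y]
            if p_xy > 0:
                h -= p_xy * math.log2(p_xy / p_y)
    return h

def mutual_information(joint):
    """I = H(X) - H(X|Y): the uncertainty about X removed by observing Y."""
    p_x = [sum(row) for row in joint]  # marginal P(X=x)
    return entropy(p_x) - conditional_entropy(joint)

def bsc_joint(flip):
    """Assumed example: a fair bit X sent through a binary symmetric
    channel that flips it with probability `flip`."""
    return [[0.5 * (1 - flip), 0.5 * flip],
            [0.5 * flip, 0.5 * (1 - flip)]]

def average_code_length(probs):
    """Average length of a code with integer codeword lengths
    ceil(-log2 p), one simple choice that satisfies the Kraft inequality."""
    return sum(p * math.ceil(-math.log2(p)) for p in probs if p > 0)

if __name__ == "__main__":
    for flip in (0.0, 0.1, 0.5):
        print(flip, mutual_information(bsc_joint(flip)))
    for probs in ([0.5, 0.25, 0.25], [0.9, 0.1]):
        print(probs, entropy(probs), average_code_length(probs))
```

With a noiseless channel (flip = 0) the conditional entropy H(X|Y) is zero and the information transmitted equals H(X), one bit; with flip = 0.5 the signal removes no uncertainty and the information is zero. In the second loop, the average code length equals the entropy only for the distribution whose probabilities are powers of 1/2, and is strictly larger for the other one.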

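zixu1986 (bot) #6 · 2008/8/12

In short, entropy is the average uncertainty, while information is what removes uncertainty. The amount of information can be measured by the amount of uncertainty it removes, just as the volume of an object can be measured by submerging it in water and measuring the volume of water it displaces. Hmm, I'm not sure whether that is precise enough.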