BBYR Achieve
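This is a mirrored post. Source: 北邮人论坛 (BYR Forum) / www-technology / #19449, synced on 2013/4/8
This mirror source has not been updated for more than 30 days; the post may have been deleted on the origin site.
WWWTechnology · posted by bot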

A question about converting Chinese characters in UTF-8 and Big5 to URL encoding...

EastDon
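2013/4/8 · mirror synced · 8 replies
I'm a beginner and have recently been tinkering with SimSimi (okay, I'm pretty bored). I send messages by simulating what the web page does. PHP source code for this is already available online, so I just copied it, and swapping in my own cookie and so on works fine. The problem is that when the SimSimi web page sends a message, the URL encoding it produces for Chinese characters looks as if it was converted from Big5... This is the packet capture after I typed "哟". But the PHP file I wrote in UTF-8 does not return that value. How do I produce the same encoding it uses? When I re-saved the file in Notepad as "Unicode big endian", PHP could not run it at all...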
8 replies
xrwbyfs (bot) #1 · 2013/4/8
EastDon (bot) #2 · 2013/4/11
Help, please...
atlantic (bot) #3 · 2013/4/11
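It has nothing to do with the encoding of your PHP file. First do var value = encodeURIComponent("Big5-encoded text"); then what the PHP file receives is the Big5 text. Which encoding the text you see on the page is in depends on your page settings.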
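A small sketch to check what encodeURIComponent actually emits for "哟". One thing worth knowing here: encodeURIComponent works on the JavaScript string itself and always percent-encodes its UTF-8 bytes, whatever charset the page or source file is saved in. The variable names are only for illustration.

```javascript
// encodeURIComponent always percent-encodes the UTF-8 bytes of the string,
// regardless of the charset of the page or of the source file.
const encoded = encodeURIComponent("哟");
console.log(encoded); // %E5%93%9F

// Decoding reverses it, again assuming UTF-8.
const decoded = decodeURIComponent(encoded);
console.log(decoded); // 哟
```

So if the captured request does not look like %E5%93%9F, the difference most likely comes from an extra encoding step rather than from the page charset.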
AlexRezit (bot) #4 · 2013/4/11
OP's avatar is so cute...
scorpin (bot) #5 · 2013/4/12
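It has been encoded twice, both times as UTF-8. First, "哟" is encoded in UTF-8 as %E5%93%9F. Then "%E5%93%9F" is encoded a second time, which gives "%25E5%2593%259F". OP, write your PHP in UTF-8 and URL-encode the value once more before sending the request, and that should do it.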
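The double-encoding explanation can be checked in a few lines. This is only a sketch: the captured value is taken from the reply above, and the variable names are made up for illustration.

```javascript
// First pass: "哟" becomes its UTF-8 bytes, percent-encoded.
const once = encodeURIComponent("哟");
console.log(once); // %E5%93%9F

// Second pass: each "%" is itself escaped as "%25".
const twice = encodeURIComponent(once);
console.log(twice); // %25E5%2593%259F

// Going the other way: decode the captured value twice to recover the text.
const captured = "%25E5%2593%259F";
const decodedBack = decodeURIComponent(decodeURIComponent(captured));
console.log(decodedBack); // 哟
```

In PHP the equivalent would presumably be calling urlencode() on a value that is already URL-encoded; that is an assumption about the OP's script, which is not shown in the thread.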
nuanyangyang (bot) #6 · 2013/4/12
In the URL standard, a URL is ASCII-encoded. Characters outside ASCII must be "encoded", but the standard does not say which encoding to use, and certainly does not say how Chinese should be encoded. UTF-8 is not mentioned either.

Browsers generally use whatever encoding the current page is in: when you submit a form on that page, the URL sent to the server uses that encoding.

More likely, the SimSimi client implementation never considered users typing Chinese at all. A sequence like %25E5 looks as if it assumed every character was English and converted it straight to hexadecimal, hence the error.

RFC 1738 Uniform Resource Locators (URL) December 1994

In most URL schemes, the sequences of characters in different parts of a URL are used to represent sequences of octets used in Internet protocols. For example, in the ftp scheme, the host name, directory name and file names are such sequences of octets, represented by parts of the URL. Within those parts, an octet may be represented by the chararacter which has that octet as its code within the US-ASCII [20] coded character set.

In addition, octets may be encoded by a character triplet consisting of the character "%" followed by the two hexadecimal digits (from "0123456789ABCDEF") which forming the hexadecimal value of the octet. (The characters "abcdef" may also be used in hexadecimal encodings.)

Octets must be encoded if they have no corresponding graphic character within the US-ASCII coded character set, if the use of the corresponding character is unsafe, or if the corresponding character is reserved for some other interpretation within the particular URL scheme.

No corresponding graphic US-ASCII: URLs are written only with the graphic printable characters of the US-ASCII coded character set. The octets 80-FF hexadecimal are not used in US-ASCII, and the octets 00-1F and 7F hexadecimal represent control characters; these must be encoded.
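To make the "% plus two hex digits per octet" rule concrete, here is a minimal sketch that percent-encodes raw bytes, so the result depends entirely on which charset produced those bytes. It assumes Node.js (for Buffer); percentEncodeOctets is a hypothetical helper name, and leaving only ASCII letters, digits and "-._~" unescaped is a conservative choice, not the exact safe set listed in RFC 1738.

```javascript
// Hypothetical helper: percent-encode a sequence of octets.
// Only ASCII letters, digits and "-._~" are left as-is (a conservative
// assumption); every other octet becomes "%" + two uppercase hex digits.
function percentEncodeOctets(bytes) {
  let out = "";
  for (const b of bytes) {
    const isAlnum =
      (b >= 0x30 && b <= 0x39) || // 0-9
      (b >= 0x41 && b <= 0x5a) || // A-Z
      (b >= 0x61 && b <= 0x7a);   // a-z
    const isMark = b === 0x2d || b === 0x2e || b === 0x5f || b === 0x7e; // - . _ ~
    out += isAlnum || isMark
      ? String.fromCharCode(b)
      : "%" + b.toString(16).toUpperCase().padStart(2, "0");
  }
  return out;
}

// Same character, different charsets, different octets, different URL text.
const asUtf8 = percentEncodeOctets(Buffer.from("哟", "utf8"));
const asUtf16le = percentEncodeOctets(Buffer.from("哟", "utf16le"));
console.log(asUtf8);    // %E5%93%9F
console.log(asUtf16le); // %DFT  (bytes DF 54; 0x54 is the letter "T")
```

Node's Buffer has no built-in Big5 or GBK encoder, so UTF-16LE stands in here as the "other" charset; for Big5 the bytes would have to come from a conversion library or, on the PHP side, from a charset conversion function before urlencode.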
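ekittying (bot) #7 · 2013/4/12
I'm a complete tech beginner and was only drawn in by OP's avatar, so here's a bump -.- While I'm here I'll watch the percent-encoding discussion and learn from the experts; I happened to look into it out of curiosity a while back =.=~
jkfbrant (bot) #8 · 2013/4/12
Besides converting to the right character encoding, you also need to urlencode it, right...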