The HTML file returned in the response stream contains many paragraphs like the following:
<DIV class=paragraph style=\" padding:8.2pt 110.8pt 0.0pt 33.8pt; text-align:justify;\">\
<SPAN class=font1 style=\" line-height:10.8pt;\">come established; but they marked a definite advance and be-<BR>yond them the Hapsburgs did not go for over fifty years. His<BR>attempt to abolish serfdom in Hungary failed, as did his other<BR>Hungarian reforms.</SPAN><BR>\
</DIV>\
With the method below, the extracted text seems to silently drop every <BR>. How can I turn each <BR> into a space instead?
NodeList nodes = parser.parse(filter);
nodes.elementAt(k).getChildren().visitAllNodesWith(myVisitor);
String outPutStr = myVisitor.getExtractedText();
Thanks a lot!
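One possible approach, sketched here under the assumption that the raw HTML is available as a String before it is handed to the parser, is to replace each <BR> tag with a space up front, so that whatever text extraction runs afterwards sees ordinary whitespace. The class and method names (BrToSpace, brToSpace) are made up for illustration and are not part of htmlparser. A custom visitor that appends a space whenever it meets a BR tag would be an alternative; it is not shown here.

```java
import java.util.regex.Pattern;

final class BrToSpace {
    // Matches <BR>, <br>, <br/>, <BR /> and variants carrying attributes.
    // [bB][rR] is used instead of a case-insensitive flag to keep the pattern explicit.
    private static final Pattern BR = Pattern.compile("<[bB][rR]\\b[^>]*>");

    /** Replaces every BR tag in the HTML source with a single space. */
    static String brToSpace(String html) {
        return BR.matcher(html).replaceAll(" ");
    }
}
```

The cleaned string would then be parsed and visited exactly as before. Words hyphenated at a line end (for example "be-<BR>yond") come out as "be- yond" with this step alone, so they still need separate handling.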
This is a mirrored post. Source: 北邮人论坛 / java / #14255, synced on 2010/4/26.
Java board, posted by bot
Question: about htmlparser
ps
1 reply
Addendum
The file also contains paragraphs like this:
<DIV class=paragraph style=\" padding:0.6pt 108.0pt 0.0pt 33.8pt; text-align:justify; text-indent:10.2pt;\">\
<SPAN class=font1 style=\" line-height:11.0pt;\">equalization of tfie burden, tak- <BR>ing away the exemptions which the nobles and the clergy still<BR>enjoyed. He determined to have all the land carefully ap- <BR>praised and treated alike when this great task had been accom- <BR>plished. <BR>whole plan.uncompro- <SUP>and</SUP> <SUP>statfl<BR></SUP>mising as to give</SPAN><BR>\
</DIV>\
How can the words that are split by a hyphen "-" be extracted correctly, e.g. taking, appraised, accomplished, uncomproand, statflmising?
Thanks a lot!
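A rough sketch of one way to rejoin such words, assuming the hyphen sits directly before a <BR> (optionally with spaces around it) and the continuation starts with a lowercase letter: remove the hyphen together with the tag, then turn every remaining <BR> into a space and collapse runs of whitespace. The names Dehyphenator and joinHyphenatedBreaks are hypothetical. Note the limits of the heuristic: a genuine compound broken at a line end (such as "well-<BR>known") would also be merged, and fragments scattered by OCR, like the <SUP> pieces around "uncompro-" in the sample, are not repaired.

```java
import java.util.regex.Pattern;

final class Dehyphenator {
    // Letter, hyphen, optional spaces, a BR tag, optional spaces,
    // followed (lookahead) by a lowercase letter that continues the word.
    private static final Pattern HYPHEN_BR =
        Pattern.compile("(\\p{L})-\\s*<[bB][rR]\\b[^>]*>\\s*(?=\\p{Ll})");

    // Any remaining BR tag is treated as a plain line break.
    private static final Pattern BR = Pattern.compile("<[bB][rR]\\b[^>]*>");

    /** Joins words hyphenated across a BR, then replaces other BR tags with spaces. */
    static String joinHyphenatedBreaks(String html) {
        // Keep the letter before the hyphen; drop the hyphen, spaces and the tag.
        String joined = HYPHEN_BR.matcher(html).replaceAll("$1");
        String spaced = BR.matcher(joined).replaceAll(" ");
        // Collapse whitespace runs left behind by the replacements.
        return spaced.replaceAll("\\s+", " ").trim();
    }
}
```

Running this on the HTML string before parsing means the existing visitor code can stay unchanged. A dictionary lookup on the joined word would be a natural refinement if false merges turn out to matter.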