MP Chinese Word Segmentation v0.02.0
http://www.phpclasses.org/browse/package/2508.html

A word segmentation class using the maximum probability approach. It is still being verified.

P.S. With the same lexicon, the results of the maximum probability approach and the reverse maximum matching approach differ by less than 1%.
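As an illustration of the maximum probability idea, here is a minimal sketch in Python (not the PHP class from the package). It assumes a unigram model: each word's probability is its frequency divided by the total, and the best segmentation is the one with the highest product of word probabilities. The function name `segment_max_prob`, the toy lexicon `FREQ`, its frequencies, the `max_len` limit, and the fallback score for unknown single characters are all hypothetical choices made for this example.

```python
import math

# Toy lexicon with made-up frequencies (hypothetical, for illustration only).
FREQ = {"研究": 50, "研究生": 10, "生命": 30, "命": 5, "起源": 20}


def segment_max_prob(text, freq, max_len=4):
    """Return the segmentation of `text` with the highest unigram probability."""
    total = sum(freq.values())
    n = len(text)
    # best[i] = (log-probability of the best split of text[:i],
    #            start index of the last word in that split)
    best = [(-math.inf, 0)] * (n + 1)
    best[0] = (0.0, 0)
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            word = text[start:end]
            if word in freq:
                logp = math.log(freq[word] / total)
            elif end - start == 1:
                # Unknown single character: assumed small fallback probability.
                logp = math.log(0.5 / total)
            else:
                continue  # unknown multi-character strings are not words
            score = best[start][0] + logp
            if score > best[end][0]:
                best[end] = (score, start)
    # Walk the back-pointers from the end of the text to recover the words.
    words = []
    i = n
    while i > 0:
        start = best[i][1]
        words.append(text[start:i])
        i = start
    words.reverse()
    return words


print(segment_max_prob("研究生命起源", FREQ))  # ['研究', '生命', '起源']
```

With these toy frequencies the split 研究 / 生命 / 起源 scores higher than 研究生 / 命 / 起源, so the dynamic program picks the former.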
SCWS renamed
Simple Chinese Word Segmentation has been renamed to Fast Chinese Word Segmentation.

Afterwards, Fast Chinese Word Segmentation v0.05.2 was published.
PHP Class: Simple Chinese Word Segmentation
http://www.phpclasses.org/browse/package/2431.html

Due to the 300 KB limit on phpclasses.org, the lexicon was split into two parts.
Why has phpclasses.org been so slow lately?
I published my Simple Chinese Word Segmentation on phpclasses.org at noon Beijing time on 07/12/2005. It is now 07/16/2005 and it still has not been approved.
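Segmentation speed of the Chinese word segmentation program after the third revision
Program change: PHP's Multi-Byte String functions are no longer used; the program identifies Chinese characters itself

Segmentation method: reverse maximum matching, applied to Chinese characters only

Lexicon size: 73,226 words

Implementation language: PHP

Segmentation speed:

99% Chinese: 211 KB in 2 s, 105.50 KB/s (+92.32 KB/s)

45% Chinese: 2,100 KB in 22 s, 95.45 KB/s (+66.69 KB/s)

0% Chinese: 413 KB in 6 s, 68.83 KB/s (+9.83 KB/s)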
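For readers unfamiliar with reverse maximum matching, the following is a rough sketch in Python rather than the PHP program measured above. It scans from the end of the text, tries the longest candidate first, and shortens it from the left until the candidate is in the lexicon or only one character remains. The function name `segment_reverse_max_match`, the toy `LEXICON`, and the `max_len` limit are hypothetical, and the sketch assumes the input contains only Chinese characters.

```python
# Toy lexicon (hypothetical, for illustration only).
LEXICON = {"研究", "研究生", "生命", "起源"}


def segment_reverse_max_match(text, lexicon, max_len=4):
    """Segment `text` by matching the longest lexicon word from the right."""
    words = []
    end = len(text)
    while end > 0:
        # Try the longest candidate ending at `end`, then shrink it from the left.
        for size in range(min(max_len, end), 0, -1):
            candidate = text[end - size:end]
            # A single character is always accepted so the scan cannot stall.
            if size == 1 or candidate in lexicon:
                words.append(candidate)
                end -= size
                break
    words.reverse()  # words were collected right to left
    return words


print(segment_reverse_max_match("研究生命起源", LEXICON))  # ['研究', '生命', '起源']
```

Matching from the right picks 起源 and 生命 before reaching 研究, whereas a forward scan with the same lexicon would take 研究生 first and leave 命 on its own.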