一道C语言题目，关于前缀配对的

2010/8/8镜像同步10 回复

题目是这样的： Description Maybe there are 750,000 words in English and some words are prefix of other words, for example: the word "acm" can be treat as one prefix of "acmicpc". What's more, most of such pairs of words have relationship between them. Now give you a dictionary, your work is to tell me how many such pairs. There may be other characters in the word, for example '_','-',and so on. Pay attention that 'A' and 'a' are not the same character! Input In the first line of the input file there is an Integer T, which means the number of test cases. Followed by T test cases. For each test case, in the first line there is an integer N(0<N<50000), followed N lines, each contains a word whose length is less than 30 and no space appears in any word. There are no same words in two different lines. Output For each test case, tell me how many such pairs. If the number is larger than 11519, just divide it by 11519 and only output the remainder. Sample Input 2 2 acmicpc acm 3 a abc ab Sample Output 1 3 我写的程序在自己电脑上跑了几个输入比较少的程序没问题，可是在online judge上ac不过去……程序见2楼希望高手指点下哈~

订阅后，新回复会通过你的通知中心匿名送达。

9 条回复

strangerz机器人#1 · 2010/8/8

#include <stdio.h> #include <stdlib.h> #include <string.h> struct wordtree { char c; int pass,end; struct wordtree *child,*brother; }; long insert(struct wordtree **head,char *word) { int len,i,result=0; struct wordtree *now; if (*head==NULL) { *head=(struct wordtree*)malloc(sizeof(struct wordtree)); (*head)->brother=NULL; (*head)->child=NULL; (*head)->c=word[0]; (*head)->pass=0; (*head)->end=0; } now=*head; len=strlen(word); for(i=0;i<len;i++) { while(now->brother!=NULL && now->c!=word[i]) now=now->brother; if (now->c==word[i]) { result+=now->end; if (i!=len-1) { now->pass++; if (now->child==NULL) { now->child=(struct wordtree*)malloc(sizeof(struct wordtree)); now=now->child; now->brother=NULL; now->child=NULL; now->c=word[i+1]; now->pass=0; now->end=0; } else now=now->child; } else { result+=now->pass; now->end++; } } else { if (now->brother==NULL) { now->brother=(struct wordtree*)malloc(sizeof(struct wordtree)); now=now->brother; now->brother=NULL; now->child=NULL; now->c=word[i]; now->pass=0; now->end=0; if (i==len-1) now->end++; else now->pass++; } } } return result; } main() { struct wordtree **head; char word[31]; int cases,words; long sum; head=(struct wordtree **)malloc(sizeof(head)); scanf("%d",&cases); while(cases--!=0) { sum=0; *head=NULL; scanf("%d",&words); while(words--!=0) { scanf("%s",word); sum+=insert(head,word); } printf("%ld\n",sum%11519); } }

jmpesp机器人#2 · 2010/8/9

理想化的、实践不可行的Hash，可以达到算法的下界O(n)，但先对单词排序时间O(nlogn) 然后采用二分查找，只要单词出现的比较随机可以使得时间复杂度达到近似O(nlogn)，但一般情况下，最坏时间复杂度依然O(n^2) 有没优于O(n^2)的算法？ n是单词个数

wks机器人#3 · 2010/8/9

C语言不太好的路过。试试这个算法，O(nlogn)的，n是单词数。瓶颈是排序，如果排好，就是O(n)了。受“奶牛排队”扫描算法启发。 #!/usr/bin/env ruby MOD_NUM = 11519 ncases = gets.to_i ncases.times do nwords = gets.to_i words = [] nwords.times do words << gets.chomp end # sort the list. this is important words.sort! relation_count = 0 # Maintain a stack prefix_stack = [] words.each do |word| # keep the stack in such a state that # every word in the stack is a prefix of the word upon it while not (prefix_stack.empty? or word.start_with? prefix_stack.last) prefix_stack.pop end # then those (and ONLY those) words in the stack are the # prefixes of the current word relation_count += prefix_stack.length # add the current word into the stack prefix_stack<<word end puts relation_count % MOD_NUM end

wks机器人#4 · 2010/8/9

这个大单词表可以用来试验附件(159KB) corncob_lowercase.zip

strangerz机器人#5 · 2010/8/10

感谢嗷，这个是用C++写的么，看不懂我在网上找了找算法啊，大体就是两个，排序还有做个字典树然后我感觉字典树的方法挺好的，貌似复杂度低些……不过我自己测试了一些比较少量的数据都没问题，但是就是ac不了，囧看来我还是先学习数据结构去好了咩…… 【在 wks 的大作中提到: 】 : C语言不太好的路过。 : 试试这个算法，O(nlogn)的，n是单词数。瓶颈是排序，如果排好，就是O(n)了。 : 受“奶牛排队”扫描算法启发。 : ...................

strangerz机器人#6 · 2010/8/10

貌似做个字典树是个好方法，看上去是O(n)的。。。？不过我写的程序可读性太差了……囧【在 jmpesp 的大作中提到: 】 : 理想化的、实践不可行的Hash，可以达到算法的下界O(n)，但 : 先对单词排序时间O(nlogn) : 然后采用二分查找，只要单词出现的比较随机可以使得时间复杂度达到近似O(nlogn)，但一般情况下，最坏时间复杂度依然O(n^2) : ...................

strangerz机器人#7 · 2010/8/10

问题已经解决了，程序里存在一些细微的错误，还是考虑的不够全面…… 谢谢上面两位同学这种方法比排序的方法效率高些

maijian机器人#8 · 2010/8/10

【在 strangerz 的大作中提到: 】 : 貌似做个字典树是个好方法，看上去是O(n)的。。。？ : 不过我写的程序可读性太差了……囧 : 【在 jmpesp 的大作中提到: 】 : ................... 请教：如果我在字典树中插入了acmp，那我怎么知道acm这个单词是否存在呢？

wks机器人#9 · 2010/8/10

字典树每个节点（不仅仅是叶子）都要做标记【在 maijian 的大作中提到: 】 : : 貌似做个字典树是个好方法，看上去是O(n)的。。。？ : : 不过我写的程序可读性太差了……囧 : : 【在 jmpesp 的大作中提到: 】 : ...................