Help wanted: counting the occurrences of a substring in a very long string
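The requirement is as stated in the title.
public int getSubCount(String Content, String Sub)
{
    int Count=0;
    .....
    return Count;
}
I'd like an efficient solution. Please share some good ideas and complete the method. Thanks!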
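public int getSubCount(String Content, String Sub)
{
    int Count=0;
    if(Sub.length()!=0){
        // Remove every occurrence, then divide the length difference
        // by the substring length. Note that replaceAll treats Sub as a regex.
        Count = Content.length();
        Content = Content.replaceAll(Sub,"");
        Count = Count - Content.length();
        Count = Count / Sub.length();
    }
    return Count;
}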
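The length-difference idea above can be sketched with String.replace instead of replaceAll, so that characters such as "." or "+" in the substring are taken literally rather than as regex syntax. This is only a minimal sketch; the class name ReplaceCount and the method name countByReplace are made up for illustration.

```java
final class ReplaceCount {
    // Counts non-overlapping occurrences of sub in content by deleting them
    // and measuring how much shorter the string became.
    static int countByReplace(String content, String sub) {
        if (sub.isEmpty()) {
            return 0; // avoids dividing by zero below
        }
        // String.replace treats sub as literal text, unlike replaceAll.
        String stripped = content.replace(sub, "");
        return (content.length() - stripped.length()) / sub.length();
    }

    public static void main(String[] args) {
        System.out.println(countByReplace("a.b.c", "."));   // prints 2
        System.out.println(countByReplace("abcabc", "bc")); // prints 2
    }
}
```

With replaceAll, the pattern "." would match every character of "a.b.c", so the same call would report 5 instead of 2.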
public int getSubCount(String Content, String Sub)
{
    return Content.split(Sub).length-1 ;
}
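One thing to watch with the split version: String.split(regex) drops trailing empty strings, so "abcj".split("j").length-1 gives 0 rather than 1 when the text ends with the substring. A possible variant, assuming the substring should be matched literally, passes a limit of -1 to keep trailing empty strings and quotes the substring. The names SplitCount and countBySplit are hypothetical.

```java
import java.util.regex.Pattern;

final class SplitCount {
    // Counts non-overlapping occurrences of sub by splitting content around it.
    static int countBySplit(String content, String sub) {
        if (sub.isEmpty()) {
            return 0; // an empty separator has no meaningful count
        }
        // Pattern.quote makes sub a literal pattern; the limit -1 keeps
        // trailing empty strings so a match at the very end is still counted.
        return content.split(Pattern.quote(sub), -1).length - 1;
    }

    public static void main(String[] args) {
        System.out.println("abcj".split("j").length - 1); // prints 0
        System.out.println(countBySplit("abcj", "j"));    // prints 1
    }
}
```

This still builds an array holding every piece of the text, so the memory cost of split remains.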
Following...
up
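Using split takes up a lot of memory. I think AHUA1001(99)'s algorithm is very good. I will test it further, and I hope everyone can offer better suggestions.
Regular expressions.
Efficient indeed.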
public int getSubCount(String content, String sub)
{
    int count=0;
    Matcher m=Pattern.compile(sub).match(content);
    while(m.find()){
        count++;
    }
    return count;
}
Matcher m=Pattern.compile(sub).match(content);
I made a mistake there: the method should be matcher.
It should read:
Matcher m=Pattern.compile(sub).matcher(content);
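Do not use Pattern.matches(sub, content) as a substitute: it returns a boolean that says whether the entire content matches the pattern, not a Matcher, so it cannot be used to count occurrences.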
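The corrected regex approach can be sketched as below. The sketch assumes the substring is meant as literal text and therefore wraps it in Pattern.quote; drop that call if a real regular expression is intended. The names RegexCount and countByRegex are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RegexCount {
    // Counts non-overlapping matches of the literal text sub in content.
    static int countByRegex(String content, String sub) {
        if (sub.isEmpty()) {
            return 0; // an empty pattern would match at every position
        }
        Matcher m = Pattern.compile(Pattern.quote(sub)).matcher(content);
        int count = 0;
        while (m.find()) { // each find() resumes after the previous match
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countByRegex("a+b+c", "+"));   // prints 2
        // Pattern.matches tests the whole input and returns a boolean.
        System.out.println(Pattern.matches("j", "abcj")); // prints false
    }
}
```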
/*
 * Created on 2006-6-7
 */
package test;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StringTest3 {
    /**
     * @param args
     */
    public static void main(String[] args) {
        final int n = 10000000;
        StringBuffer sb = new StringBuffer(n);
        for (int i = 0; i < (n / 10); i++) {
            sb.append("abcdefghij");
        }
        String s = sb.toString();
        String q = "j"; // substring to search for
        long t1, t2;

        t1 = System.currentTimeMillis();
        int l1 = test1(s, q);
        t2 = System.currentTimeMillis();
        System.out.println("1 find count:" + l1 + "\ttime:" + (t2 - t1));

        t1 = System.currentTimeMillis();
        int l2 = test2(s, q);
        t2 = System.currentTimeMillis();
        System.out.println("2 find count:" + l2 + "\ttime:" + (t2 - t1));

        t1 = System.currentTimeMillis();
        int l3 = test3(s, q);
        t2 = System.currentTimeMillis();
        System.out.println("3 find count:" + l3 + "\ttime:" + (t2 - t1));

        t1 = System.currentTimeMillis();
        int l4 = test4(s, q); // regex version; the code as posted called test3 here
        t2 = System.currentTimeMillis();
        System.out.println("4 find count:" + l4 + "\ttime:" + (t2 - t1));
    }

    // Manual scan: find the first character with indexOf, then compare the rest.
    public static int test1(String content, String sub) {
        if (sub.length() == 0) {
            return 0; // nothing to search for
        }
        int count = 0;
        int p = content.indexOf(sub.charAt(0));
        do {
            next:
            if (p > -1) { // first character found; go on to compare the following ones
                if (p <= (content.length() - sub.length())) {
                    for (int j = 1; j < sub.length(); j++) {
                        if (content.charAt(p + j) != sub.charAt(j)) {
                            break next; // mismatch: skip the count and move on
                        }
                    }
                    count++;
                } else {
                    break; // not enough characters left for a full match
                }
            }
            p = content.indexOf(sub.charAt(0), p + 1);
        } while (p > -1);
        return count;
    }

    // Length difference after removing every occurrence with replaceAll.
    public static int test2(String content, String sub) {
        int count = 0;
        if (sub.length() != 0) {
            count = content.length();
            content = content.replaceAll(sub, "");
            count = count - content.length();
            count = count / sub.length();
        }
        return count;
    }

    // Number of pieces produced by split, minus one.
    public static int test3(String Content, String Sub) {
        return Content.split(Sub).length - 1;
    }

    // Regex matcher, counting successful find() calls.
    public static int test4(String content, String sub) {
        int count = 0;
        Matcher m = Pattern.compile(sub).matcher(content);
        while (m.find()) {
            count++;
        }
        return count;
    }
}
q = "j";
1 find count:1000000    time:78
2 find count:1000000    time:1344
3 find count:999999    time:1188
4 find count:999999    time:1062
q = "abc"
1 find count:1000000    time:94
2 find count:1000000    time:1578
3 find count:1000000    time:1250
4 find count:1000000    time:1172
q = "ja"
1 find count:999999    time:63
2 find count:999999    time:1563
3 find count:999999    time:1281
4 find count:999999    time:1172
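Whether you use a regex or replace, underneath they still walk through the text comparing characters one by one, and on top of that they compile a pattern or build new strings and arrays. So scanning the string directly yourself comes out faster!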
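A shorter way to scan directly is to let String.indexOf(String, int) do the comparison and just advance the start position. The sketch below is not the code from the benchmark; the class name IndexOfCount and the allowOverlap parameter are assumptions added to make one difference visible: test1 advances one character at a time and so counts overlapping matches, while the replace, split and regex versions count non-overlapping ones.

```java
final class IndexOfCount {
    // Counts occurrences of sub in content without allocating new strings.
    // allowOverlap = true advances one character after each hit,
    // allowOverlap = false skips past the whole match.
    static int count(String content, String sub, boolean allowOverlap) {
        if (sub.isEmpty()) {
            return 0; // indexOf("") would match at every position
        }
        int step = allowOverlap ? 1 : sub.length();
        int count = 0;
        int pos = content.indexOf(sub);
        while (pos >= 0) {
            count++;
            pos = content.indexOf(sub, pos + step);
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(count("aaa", "aa", true));  // prints 2
        System.out.println(count("aaa", "aa", false)); // prints 1
    }
}
```

For the benchmark strings in this thread the two modes give the same counts, because "j", "abc" and "ja" cannot overlap with themselves.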
AHUA1001's approach is rather clever.
Thanks, everyone.