
Help wanted: counting the occurrences of a substring in a very long string


The requirement is as stated in the title:
public int getSubCount(String Content, String Sub)
{
    int Count=0;
    .....
    return Count;
}

I'd like an efficient implementation. Please share any good ideas and complete the method. Thanks!

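public int getSubCount(String Content, String Sub)
{
    int Count=0;
    if(Sub.length()!=0){
        // Remove every match, then divide the lost length by the substring length.
        // Note: replaceAll treats Sub as a regular expression, not as literal text.
        Count = Content.length();
        Content = Content.replaceAll(Sub,"");
        Count = Count - Content.length();
        Count = Count / Sub.length();
    }
    return Count;
}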
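
One caveat with the version above: since `replaceAll` interprets `Sub` as a regular expression, a substring such as "." or "a+b" would be counted wrongly. Below is a minimal sketch of the same length-difference idea using `String.replace`, which treats its arguments as literal text. The class name `ReplaceCount` is made up for illustration and is not from the thread.

```java
final class ReplaceCount {
    /** Counts non-overlapping literal occurrences of sub in content. */
    static int getSubCount(String content, String sub) {
        if (sub.isEmpty()) {
            return 0; // avoid dividing by zero below
        }
        // String.replace(CharSequence, CharSequence) replaces literal text, no regex involved.
        String stripped = content.replace(sub, "");
        return (content.length() - stripped.length()) / sub.length();
    }

    public static void main(String[] args) {
        System.out.println(getSubCount("a.b.c", "."));  // prints 2
        System.out.println(getSubCount("abcabc", ".")); // prints 0
    }
}
```

Like the original, this still builds a second copy of the string, so it is probably not the most memory-friendly choice for very long input.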

public int getSubCount(String Content, String Sub)
{
return Content.split(Sub).length-1 ;
}
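
Two things to watch with the `split` version: the argument is a regular expression, and `split` with the default limit discards trailing empty strings, so an occurrence at the very end of `Content` is not counted. A hedged sketch that passes a negative limit and quotes the substring follows; `SplitCount` and `naive` are illustrative names, not from the thread.

```java
import java.util.regex.Pattern;

final class SplitCount {
    /** The approach as posted: misses a match at the very end of content. */
    static int naive(String content, String sub) {
        return content.split(sub).length - 1;
    }

    /** Same idea, but keeps trailing empty pieces and treats sub literally. */
    static int getSubCount(String content, String sub) {
        if (sub.isEmpty()) {
            return 0;
        }
        // A negative limit keeps trailing empty strings; Pattern.quote escapes regex metacharacters.
        return content.split(Pattern.quote(sub), -1).length - 1;
    }

    public static void main(String[] args) {
        System.out.println(naive("abcabc", "c"));       // prints 1
        System.out.println(getSubCount("abcabc", "c")); // prints 2
    }
}
```

This still allocates an array holding every piece between matches, so it does nothing for memory use on a very long string.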

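Following this thread...

Using split takes up a lot of memory. I think AHUA1001(99)'s algorithm is very good, and I will test it further. I hope everyone can offer even better suggestions.

Regular expressions

Efficient indeed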

public int getSubCount(String content, String sub)
{
int count=0;
Matcher m=Pattern.compile(sub).match(content);
while(m.find()){
count++;
}
return count;
}

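Matcher m=Pattern.compile(sub).match(content);
That was a typo; the method at the end should be matcher.
It should read:
Matcher m=Pattern.compile(sub).matcher(content);
Note that Pattern.matches(sub,content) is not an equivalent alternative: it returns a boolean saying whether the whole of content matches sub, not a Matcher, so it cannot be used to count occurrences.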
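
To make the difference between the two calls concrete, here is a small sketch, assuming the substring should be matched as literal text (hence `Pattern.quote`, which the posted code does not use). `RegexCount` is a hypothetical class name.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RegexCount {
    /** Counts non-overlapping literal occurrences of sub using Matcher.find(). */
    static int getSubCount(String content, String sub) {
        if (sub.isEmpty()) {
            return 0;
        }
        Matcher m = Pattern.compile(Pattern.quote(sub)).matcher(content);
        int count = 0;
        while (m.find()) { // each successful find() advances past the previous match
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(getSubCount("abab", "ab"));    // prints 2
        // Pattern.matches tests the entire input against the pattern and returns a boolean.
        System.out.println(Pattern.matches("ab", "abab")); // prints false
    }
}
```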

/*
* Created on 2006-6-7
*
*
*/
package test;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StringTest3 {

/**
* @param args
*/
public static void main(String[] args) {
final int n = 10000000;

StringBuffer sb = new StringBuffer(n);
for(int i=0;i<(n/10);i++){
sb.append("abcdefghij");
}

String s = sb.toString();
String q = "j"; // substring to search for

long t1,t2;
t1 = System.currentTimeMillis();
int l1 = test1(s, q);
t2 = System.currentTimeMillis();
System.out.println("1 find count:" + l1 + "\ttime:" + (t2 - t1));

t1 = System.currentTimeMillis();
int l2 = test2(s, q);
t2 = System.currentTimeMillis();
System.out.println("2 find count:" + l2 + "\ttime:" + (t2 - t1));

t1 = System.currentTimeMillis();
int l3 = test3(s, q);
t2 = System.currentTimeMillis();
System.out.println("3 find count:" + l3 + "\ttime:" + (t2 - t1));

t1 = System.currentTimeMillis();
int l4 = test3(s, q); // calls test3 again, so row 4 repeats the split() measurement; test4 is never timed
t2 = System.currentTimeMillis();
System.out.println("4 find count:" + l4 + "\ttime:" + (t2 - t1));
}

public static int test1(String content, String sub){
int count = 0;

int p = content.indexOf(sub.charAt(0));
do{
next:
if(p > -1){ // first character found; go on to compare the remaining characters
if(p <= (content.length() - sub.length()))
{
for(int j=1;j<sub.length();j++){
if(content.charAt(p + j) != sub.charAt(j)){
break next;
}
}
count ++;
}
else{
break;
}
}

p = content.indexOf(sub.charAt(0), p + 1);
}
while(p > -1);

return count;
}

public static int test2(String content, String sub){
int count = 0;
if (sub.length() != 0) {
count =content.length();
content = content.replaceAll(sub, "");
count = count - content.length();
count = count / sub.length();
}
return count;
}

public static int test3(String Content, String Sub)
{
return Content.split(Sub).length-1 ;
}

public static int test4(String content, String sub)
{
int count=0;
Matcher m=Pattern.compile(sub).matcher(content);
while(m.find()){
count++;
}
return count;
}
}

q = "i";

1 find count:1000000    time:78
2 find count:1000000    time:1344
3 find count:999999    time:1188
4 find count:999999    time:1062

q = "abc"

1 find count:1000000    time:94
2 find count:1000000    time:1578
3 find count:1000000    time:1250
4 find count:1000000    time:1172

q = "ja"

1 find count:999999    time:63
2 find count:999999    time:1563
3 find count:999999    time:1281
4 find count:999999    time:1172

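Whether you use a regular expression or replace, underneath they all end up going through the string character by character (via charAt) to search, compare, and replace, so doing the scan directly yourself is faster!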
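
For a direct scan that is shorter than test1, `String.indexOf(String, int)` can do the searching. The sketch below is not from the thread; `IndexOfCount` and the `overlapping` flag are assumptions added for illustration. The flag matters because test1 as posted moves on by one character after each match and therefore counts overlapping occurrences, while the replace, split, and regex versions count non-overlapping ones. With the benchmark's inputs the two coincide.

```java
final class IndexOfCount {
    /**
     * Counts literal occurrences of sub in content without creating new strings.
     * If overlapping is true, "aa" is found twice in "aaa"; otherwise once.
     */
    static int getSubCount(String content, String sub, boolean overlapping) {
        if (sub.isEmpty()) {
            return 0;
        }
        int step = overlapping ? 1 : sub.length();
        int count = 0;
        int p = content.indexOf(sub);
        while (p != -1) {
            count++;
            // Resume the search after (or one past the start of) the current match.
            p = content.indexOf(sub, p + step);
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(getSubCount("abcdefghijabcdefghij", "j", false)); // prints 2
        System.out.println(getSubCount("aaa", "aa", true));                  // prints 2
        System.out.println(getSubCount("aaa", "aa", false));                 // prints 1
    }
}
```

No timings are claimed for this variant; it would need to be measured alongside the others on the same input.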

AHUA1001's approach is quite clever.

Thanks, everyone.
