
Help: counting the occurrences of a substring in a very long string



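The requirement is as stated in the title:
public int getSubCount(String Content, String Sub)
{
    int Count=0;
    .....
    return Count;
}

I'd like an efficient solution. Please share any good ideas and complete the method. Thanks!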


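public int getSubCount(String Content, String Sub)
{
    int Count=0;
    if(Sub.length()!=0){
        // Remove every occurrence of Sub and divide the length difference
        // by Sub's length. Note: replaceAll interprets Sub as a regular expression.
        Count = Content.length();
        Content = Content.replaceAll(Sub,"");
        Count = Count - Content.length();
        Count = Count / Sub.length();
    }
    return Count;
}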
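
A side note on the method above: because replaceAll takes a regular expression, a Sub such as "." or "+" will not be counted literally. Below is a minimal sketch of the same length-difference idea using String.replace, which treats its argument as plain text. The class name ReplaceCount and the method name countByReplace are made up for illustration and are not from the original post.

```java
class ReplaceCount {
    // Counts non-overlapping occurrences of sub by measuring how much shorter
    // content becomes once every occurrence has been removed.
    // String.replace(CharSequence, CharSequence) matches sub literally,
    // whereas replaceAll would interpret it as a regular expression.
    static int countByReplace(String content, String sub) {
        if (sub.isEmpty()) {
            return 0; // avoid dividing by zero for an empty substring
        }
        String stripped = content.replace(sub, "");
        return (content.length() - stripped.length()) / sub.length();
    }

    public static void main(String[] args) {
        System.out.println(countByReplace("a.b.c", "."));   // prints 2
        System.out.println(countByReplace("abcabc", "bc")); // prints 2
    }
}
```

Like the original, this still builds a second copy of the string, so it is not the cheapest option for very long input.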


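public int getSubCount(String Content, String Sub)
{
    // Note: split interprets Sub as a regular expression and drops trailing
    // empty strings, so a match at the very end of Content is not counted.
    return Content.split(Sub).length-1 ;
}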
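
One caveat with the split version: String.split(regex) discards trailing empty strings, so the count comes out one short when Content ends with Sub. A possible adjustment, sketched here under the assumption that Sub should be matched literally, is to pass a negative limit and quote the pattern. SplitCount and countBySplit are hypothetical names used only for this example.

```java
import java.util.regex.Pattern;

class SplitCount {
    // Counts non-overlapping occurrences of sub using split.
    // Pattern.quote makes sub a literal pattern; the limit -1 keeps
    // trailing empty strings so a match at the end is still counted.
    static int countBySplit(String content, String sub) {
        if (sub.isEmpty()) {
            return 0;
        }
        return content.split(Pattern.quote(sub), -1).length - 1;
    }

    public static void main(String[] args) {
        // Plain split loses the final match:
        System.out.println("abcdefghij".split("j").length - 1); // prints 0
        // With limit -1 the final match is kept:
        System.out.println(countBySplit("abcdefghij", "j"));    // prints 1
    }
}
```

This fixes the count but not the memory cost, since split still allocates an array holding every piece.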


Following this thread...


up


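The split approach uses a lot of memory. I think AHUA1001(99)'s algorithm is very good; I will test it further, and I hope others can offer even better suggestions.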


Use a regular expression.


Efficient indeed.


public int getSubCount(String content, String sub)
{
    int count=0;
    Matcher m=Pattern.compile(sub).match(content);
    while(m.find()){
        count++;
    }
    return count;
}


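Matcher m=Pattern.compile(sub).match(content);
That line was a typo; the method at the end should be matcher.
It should read:
Matcher m=Pattern.compile(sub).matcher(content);
Note that Pattern.matches(sub,content) is not an equivalent alternative: it returns a boolean that says whether the whole of content matches sub, so it cannot be used to count occurrences.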
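
To make the difference concrete, here is a small sketch of the find() loop next to Pattern.matches. It assumes sub should be matched as literal text, so it wraps it in Pattern.quote, which is a change from the posted code. RegexCount and countByRegex are illustrative names only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexCount {
    // Counts non-overlapping occurrences of sub with Matcher.find().
    // Pattern.quote escapes regex metacharacters so sub is matched literally.
    static int countByRegex(String content, String sub) {
        if (sub.isEmpty()) {
            return 0;
        }
        Matcher m = Pattern.compile(Pattern.quote(sub)).matcher(content);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countByRegex("abcdefghij", "j"));    // prints 1
        // Pattern.matches tests the entire input and returns a boolean:
        System.out.println(Pattern.matches("j", "abcdefghij")); // prints false
    }
}
```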




/*
 * Created on 2006-6-7
 */
package test;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StringTest3 {

    /**
     * @param args
     */
    public static void main(String[] args) {
        final int n = 10000000;

        // Build a 10,000,000-character test string from "abcdefghij".
        StringBuffer sb = new StringBuffer(n);
        for (int i = 0; i < (n / 10); i++) {
            sb.append("abcdefghij");
        }

        String s = sb.toString();
        String q = "j"; // substring to search for

        long t1, t2;
        t1 = System.currentTimeMillis();
        int l1 = test1(s, q);
        t2 = System.currentTimeMillis();
        System.out.println("1 find count:" + l1 + "\ttime:" + (t2 - t1));

        t1 = System.currentTimeMillis();
        int l2 = test2(s, q);
        t2 = System.currentTimeMillis();
        System.out.println("2 find count:" + l2 + "\ttime:" + (t2 - t1));

        t1 = System.currentTimeMillis();
        int l3 = test3(s, q);
        t2 = System.currentTimeMillis();
        System.out.println("3 find count:" + l3 + "\ttime:" + (t2 - t1));

        t1 = System.currentTimeMillis();
        // As posted, this calls test3 again rather than test4.
        int l4 = test3(s, q);
        t2 = System.currentTimeMillis();
        System.out.println("4 find count:" + l4 + "\ttime:" + (t2 - t1));
    }

    // Manual scan: locate the first character with indexOf, then compare the rest.
    public static int test1(String content, String sub) {
        int count = 0;

        int p = content.indexOf(sub.charAt(0));
        do {
            next:
            if (p > -1) { // first character found; compare the remaining characters
                if (p <= (content.length() - sub.length())) {
                    for (int j = 1; j < sub.length(); j++) {
                        if (content.charAt(p + j) != sub.charAt(j)) {
                            break next; // mismatch: skip the count and move on
                        }
                    }
                    count++;
                } else {
                    break; // not enough characters left for a full match
                }
            }

            p = content.indexOf(sub.charAt(0), p + 1);
        } while (p > -1);

        return count;
    }

    // Length difference after removing every occurrence with replaceAll.
    public static int test2(String content, String sub) {
        int count = 0;
        if (sub.length() != 0) {
            count = content.length();
            content = content.replaceAll(sub, "");
            count = count - content.length();
            count = count / sub.length();
        }
        return count;
    }

    // Number of pieces produced by split, minus one.
    public static int test3(String Content, String Sub) {
        return Content.split(Sub).length - 1;
    }

    // Regex matcher with a find() loop.
    public static int test4(String content, String sub) {
        int count = 0;
        Matcher m = Pattern.compile(sub).matcher(content);
        while (m.find()) {
            count++;
        }
        return count;
    }
}


q = "i";

1 find count:1000000    time:78
2 find count:1000000    time:1344
3 find count:999999    time:1188
4 find count:999999    time:1062

q = "abc"

1 find count:1000000    time:94
2 find count:1000000    time:1578
3 find count:1000000    time:1250
4 find count:1000000    time:1172


q = "ja"

1 find count:999999    time:63
2 find count:999999    time:1563
3 find count:999999    time:1281
4 find count:999999    time:1172


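Whether you use a regex or replace, both end up scanning and comparing the string character by character, with extra work on top (compiling a pattern, building new strings), so doing the scan directly yourself is faster!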
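
For comparison, a direct scan can also be written with String.indexOf(String, int), which avoids both the regex machinery and any intermediate strings. This is only a sketch, not the code benchmarked above; IndexOfCount, count, and the allowOverlap flag are names invented for this example.

```java
class IndexOfCount {
    // Counts occurrences of sub in content by repeatedly calling indexOf.
    // With allowOverlap = false the search resumes after the end of each
    // match; with allowOverlap = true it resumes one character further on.
    static int count(String content, String sub, boolean allowOverlap) {
        if (sub.isEmpty()) {
            return 0;
        }
        int step = allowOverlap ? 1 : sub.length();
        int count = 0;
        int pos = content.indexOf(sub);
        while (pos >= 0) {
            count++;
            pos = content.indexOf(sub, pos + step);
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(count("abcdefghijabcdefghij", "ja", false)); // prints 1
        System.out.println(count("aaaa", "aa", false));                 // prints 2
        System.out.println(count("aaaa", "aa", true));                  // prints 3
    }
}
```

Whether overlapping matches should count depends on the requirement; the replace, split, and regex versions all count non-overlapping matches, while the manual test1 loop advances one character at a time.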


AHUA1001's approach is rather clever.


Thanks, everyone.

