Hi all... I'm currently trying to manipulate strings obtained from tokens. May I know how to eliminate a char from a token without splitting it? Below is some code that I tried. How can I eliminate a certain word after the tokens are set?

package autotextsum;
import java.util.*;
import java.util.StringTokenizer;

public class AutoTextSumm {
    public static void main(String[] args) {
        String s1;
        String sDelim = ".?!";

        StringTokenizer str1 = new StringTokenizer("i like you very much.i do not hate you!i do.", sDelim);

        while (str1.hasMoreTokens()) {
            s1 = str1.nextToken();
            s1 = s1.trim();
            System.out.println(s1 + "\n");
        }
    }
}

From the above code, the text is split into 3 tokens, which are:
i like you very much
i do not hate you
i do

How do I eliminate the word "do" from each of the tokens without affecting the number of tokens involved? The output should look like this:

i like you very much
i not hate you
i

Thanks in advance for your time^^

how do i eliminate the word "do" from each of the tokens? without affecting the number of tokens involved?

Not sure I understand what you are asking. 'do' is a token.
It is not contained within any other tokens.
Are you asking how to remove one token ('do') from a list of tokens returned by nextToken()?
Removing implies that you are saving all the tokens somewhere and that there are some tokens you do not want to save.
In that case, skip the save step if the token is not one you want to save.

The next part is harder:

without affecting the number of tokens involved

If there are two tokens: 'i' and 'do' and you remove one token then there will not be two tokens left. Can you explain what you mean by the number of tokens?

So, every word is a token?
Because I thought that by passing sDelim here:

StringTokenizer str1=new StringTokenizer("i like you very much.i do not hate you!i do.",sDelim);

the sentences are separated into 3 tokens, which are:
1)i like you very much
2)i do not hate you
3)i do

the line here:

s1=str1.nextToken();

is putting "i like you very much" as the 1st token, and "i do not hate you" as the 2nd one... no? Do correct me if I'm wrong.

Norm - he's using the punctuation as a delimiter. He's actually getting three tokens, as you'll see if you run the code. (put in a counter if you don't believe me!)

So the problem is, how do you remove a specified substring from a String. There are a few ways to do this. One way would be to use a String method to get the index of that substring, and build a new StringBuilder using the stuff before and after the index you find, and then check it again in case the substring you're looking for appeared twice, and ugh, how tedious.
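For what it's worth, the indexOf/StringBuilder route could look roughly like the sketch below. The class and method names (IndexOfRemoval, removeWord) are made up for illustration, and it assumes words are separated by single spaces and that only whole words should be removed, so "undo" is left alone.

```java
class IndexOfRemoval {
    // Hypothetical helper: removes every whole-word occurrence of `word`.
    // Assumes words in `sentence` are separated by single spaces.
    static String removeWord(String sentence, String word) {
        if (word.isEmpty()) {
            return sentence; // nothing to remove; also avoids an endless loop
        }
        StringBuilder sb = new StringBuilder(sentence);
        int from = 0;
        int idx;
        while ((idx = sb.indexOf(word, from)) >= 0) {
            int end = idx + word.length();
            boolean startsWord = idx == 0 || sb.charAt(idx - 1) == ' ';
            boolean endsWord = end == sb.length() || sb.charAt(end) == ' ';
            if (startsWord && endsWord) {
                if (end < sb.length()) {
                    sb.delete(idx, end + 1);              // word plus the space after it
                } else {
                    sb.delete(Math.max(0, idx - 1), end); // space before it plus the word
                }
                from = Math.min(idx, sb.length());        // look again from the same spot
            } else {
                from = idx + 1;                           // part of a longer word, skip it
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(removeWord("i do not hate you", "do"));
        System.out.println(removeWord("i do", "do"));
    }
}
```

Calling removeWord on each token inside the existing while loop leaves the number of tokens unchanged, since the tokenizer itself is not touched.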

Or you could use a split() on the substring and then join the resulting array back together - that would be a very Perl-like solution.
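A rough sketch of the split-and-join idea, in case it helps. The names here (SplitJoinRemoval, removeWord) are hypothetical, String.join needs Java 8 or later, and the regex assumes you only want to drop "do" as a whole word along with the whitespace after it.

```java
import java.util.regex.Pattern;

class SplitJoinRemoval {
    // Hypothetical helper: split the sentence around the word, then glue the pieces back.
    static String removeWord(String sentence, String word) {
        // \b keeps the match on word boundaries, so "undo" is not affected;
        // \s* also swallows the whitespace that followed the removed word.
        String regex = "\\b" + Pattern.quote(word) + "\\b\\s*";
        String[] pieces = sentence.split(regex);
        // trim() drops the space left over when the word was last in the sentence
        return String.join("", pieces).trim();
    }

    public static void main(String[] args) {
        System.out.println(removeWord("i do not hate you", "do"));
        System.out.println(removeWord("i do", "do"));
    }
}
```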

Following NormR1's suggestion:
(1) Add a space character as an extra delimiter to the sDelim string (line 8 of the original code):
String sDelim = ".? !";
(2) Insert the following line of code after line 14 (the s1=s1.trim(); line):
if (s1.compareTo("do")!=0)

so that "do" will not be printed to the console.

tong - he doesn't want to split on the space. Read the original post. He wants three tokens, three sentences, and he's getting them.
The posters aren't always wrong, guys! :)

To tong,
I tried your method... but I do not think that the line
if (s1.compareTo("do")!=0)
is having any effect on the output T_T

Thank you, jon, for reminding me about keanoppy's intention.
keanoppy, thank you for trying. You may have put a semicolon after the if (...) line, which would stop it from working. I have tested it and it works.
keanoppy, if you want to remove the "do" from the three sentences, then as I understand it one has to write a method that removes the substring "do" from any string, since only a single character should be a delimiter.
The code I tested is as follows.

import java.util.*;
import java.util.StringTokenizer;

public class AutoTextSumm {
    public static void main(String[] args) {
        String s1;
        String sDelim = ".?! ";
        StringTokenizer str1 = new StringTokenizer("i like you very much.i do not hate you!i do.", sDelim);
        while (str1.hasMoreTokens()) {
            s1 = str1.nextToken();
            s1 = s1.trim();
            if (s1.compareTo("do") != 0) {
                System.out.print(s1 + " ");
            }
        }
    }
}

since only single character should be a delimiter.

Check the API:

String[] split(String regex)
Splits this string around matches of the given regular expression.
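To illustrate that point, here is a small sketch showing that split() is not limited to single-character delimiters. The class and method names (RegexSplitDemo, sentences) are just for this example.

```java
import java.util.Arrays;

class RegexSplitDemo {
    // Splits the text into sentences on '.', '?' or '!',
    // ignoring any whitespace around the punctuation.
    static String[] sentences(String text) {
        return text.split("\\s*[.?!]\\s*");
    }

    public static void main(String[] args) {
        String text = "i like you very much.i do not hate you!i do.";
        System.out.println(Arrays.toString(sentences(text)));

        // The delimiter can be several characters long, here " do ".
        System.out.println(Arrays.toString("i do not hate you".split(" do ")));
    }
}
```

split() drops trailing empty strings, so the final "." in the text does not produce an extra empty sentence.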

Keanoppy - your code is actually pretty much right. All you need is to remove the substring "do" wherever it appears. As I said, split() is one way to do this. There's also a "replaceAll()" method in the String class that will do the job. Those are probably the two easiest ways I can think of to do this.
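Something along these lines should work with replaceAll(), keeping the original three-token loop. This is only a sketch: the class and method names (ReplaceAllDemo, stripWord) are invented here, and the regex assumes "do" should go only when it stands as a whole word.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;
import java.util.regex.Pattern;

class ReplaceAllDemo {
    // Hypothetical helper: tokenizes `text` on the characters in `delims`
    // and removes `word` from each token without changing the token count.
    static List<String> stripWord(String text, String delims, String word) {
        // whole word only, plus any whitespace that follows it
        String regex = "\\b" + Pattern.quote(word) + "\\b\\s*";
        List<String> result = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(text, delims);
        while (st.hasMoreTokens()) {
            String sentence = st.nextToken().trim();
            // trim() again in case the removed word was the last one
            result.add(sentence.replaceAll(regex, "").trim());
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "i like you very much.i do not hate you!i do.";
        for (String s : stripWord(text, ".?!", "do")) {
            System.out.println(s);
        }
    }
}
```

The tokenizer still splits only on the punctuation, so there are three sentences going in and three coming out.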

A nested while loop may do the job.
Here is the code:

import java.util.*;
import java.util.StringTokenizer;

public class AutoTextSumm {
    public static void main(String[] args) {
        String s1, s2;
        String sDelim = ".?!";
        StringTokenizer str1 = new StringTokenizer("i like you very much.i do not hate you!i do.", sDelim);
        while (str1.hasMoreTokens()) {  // get the 3 sentences
            s1 = str1.nextToken();
            s1 = s1.trim();
            StringTokenizer str2 = new StringTokenizer(s1, " ");
            while (str2.hasMoreTokens()) {  // check each word
                s2 = str2.nextToken();
                s2 = s2.trim();
                if (s2.compareTo("do") != 0) {  // skip the word "do"
                    System.out.print(s2 + " ");  // print/store the valid words
                }
            }
            System.out.println();
        }
    }
}

thanks all for helping me out^^
