StringTokenizer

Question

keanoppy 0 Light Poster

14 Years Ago

Hi all, I'm currently trying to manipulate strings obtained from tokens. May I know how to eliminate a char from a token without splitting it? Below is some code that I tried. How can I eliminate a certain word after the tokens are set?

package autotextsum;
import java.util.*;
import java.util.StringTokenizer;

public class AutoTextSumm {
    public static void main(String[] args) {
        String s1;
        String sDelim = ".?!";

        StringTokenizer str1 = new StringTokenizer("i like you very much.i do not hate you!i do.", sDelim);

        while (str1.hasMoreTokens()) {
            s1 = str1.nextToken();
            s1 = s1.trim();
            System.out.println(s1 + "\n");
        }
    }
}

From the above code, the sentences are split into 3 tokens, which are:
i like you very much
i do not hate you
i do

How do I eliminate the word "do" from each of the tokens without affecting the number of tokens involved? The output should look like this:

i like you very much
i not hate you
i

Thanks in advance for your time^^

java

NormR1 563 Posting Sage

14 Years Ago

how do i eliminate the word "do" from each of the tokens? without affecting the number of tokens involved?

Not sure I understand what you are asking. 'do' is a token.
It is not contained within any other tokens.
Are you asking how to remove one token ('do') from the list of tokens returned by nextToken()?
Removing implies that you are saving all the tokens somewhere and that there are some tokens you do not want to save.
In that case, skip the save step if the token is not one you want to save.

The next part is harder:

without affecting the number of tokens involved

If there are two tokens: 'i' and 'do' and you remove one token then there will not be two tokens left. Can you explain what you mean by the number of tokens?

jon.kiparsky 326 Posting Virtuoso

14 Years Ago

since only single character should be a delimiter.

Check the API:

String[] split(String regex)
Splits this string around matches of the given regular expression.

Keanoppy - your code is actually pretty much right. All you need is to remove the substring "do" wherever it appears. As I said, split() is one way to do this. There's also a "replaceAll()" method in the String class that will do the job. Those are probably the two easiest ways I can think of to do this.
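
A minimal sketch of the replaceAll() route, keeping the three sentence tokens intact. The class name RemoveWordDemo and the helpers removeDo and process are made up for illustration and are not from the thread. The \b word boundaries are my own assumption about what "the word do" should mean, so that "do" inside a longer word such as "done" is left alone.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class RemoveWordDemo {
    // Hypothetical helper: removes the whole word "do" from one sentence.
    public static String removeDo(String sentence) {
        // \b is a word boundary, so "do" inside "done" is not touched.
        String without = sentence.replaceAll("\\bdo\\b", "");
        // Collapse the double space left behind and trim the ends.
        return without.replaceAll("\\s+", " ").trim();
    }

    // Splits on sentence punctuation only, so the token count stays the same.
    public static List<String> process(String text) {
        List<String> sentences = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(text, ".?!");
        while (st.hasMoreTokens()) {
            sentences.add(removeDo(st.nextToken()));
        }
        return sentences;
    }

    public static void main(String[] args) {
        for (String s : process("i like you very much.i do not hate you!i do.")) {
            System.out.println(s);
        }
        // prints:
        // i like you very much
        // i not hate you
        // i
    }
}
```

The tokenizer still sees only ".?!" as delimiters, so there are three tokens before and after the word is removed.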

tong1 22 Posting Whiz

14 Years Ago

A nested while loop may do the job.
Here is the code:

import java.util.*;
import java.util.StringTokenizer;

public class AutoTextSumm {
    public static void main(String[] args) {
        String s1, s2;
        String sDelim = ".?!";
        StringTokenizer str1 = new StringTokenizer("i like you very much.i do not hate you!i do.", sDelim);
        while (str1.hasMoreTokens()) { // get the 3 sentences
            s1 = str1.nextToken();
            s1 = s1.trim();
            StringTokenizer str2 = new StringTokenizer(s1, " ");
            while (str2.hasMoreTokens()) { // check each word
                s2 = str2.nextToken();
                s2 = s2.trim();
                if (s2.compareTo("do") != 0) { // skip the word "do"
                    System.out.print(s2 + " "); // print/store the valid words
                }
            }
            System.out.println();
        }
    }
}

keanoppy 0 Light Poster · Answer 1 · 2010-08-17T08:38:52+00:00

So, is every word a token?
I thought that by writing

StringTokenizer str1=new StringTokenizer("i like you very much.i do not hate you!i do.",sDelim);

the sentences are separated into 3 tokens, which are:
1) i like you very much
2) i do not hate you
3) i do

The line here:

s1=str1.nextToken();

returns "i like you very much" as the 1st token and "i do not hate you" as the 2nd one... no? Do correct me if I'm wrong.

jon.kiparsky 326 Posting Virtuoso · Answer 2 · 2010-08-17T08:40:05+00:00

Norm - he's using the punctuation as a delimiter. He's actually getting three tokens, as you'll see if you run the code. (put in a counter if you don't believe me!)

So the problem is, how do you remove a specified substring from a String. There are a few ways to do this. One way would be to use a String method to get the index of that substring, and build a new StringBuilder using the stuff before and after the index you find, and then check it again in case the substring you're looking for appeared twice, and ugh, how tedious.

Or you could use a split() on the substring and then join the resulting array back together - that would be a very perl-like solution.
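
Here is a rough sketch of both ideas side by side, in case it helps to see them. The class and method names (SubstringRemoval, removeByIndex, removeBySplit) are invented for this example. Note that both remove the plain substring wherever it occurs, including inside longer words, and both leave the surrounding spaces alone.

```java
import java.util.regex.Pattern;

public class SubstringRemoval {
    // Index-based removal: find the substring, delete it, look again.
    public static String removeByIndex(String text, String target) {
        if (target.isEmpty()) {
            return text; // nothing to remove; also avoids an endless loop
        }
        StringBuilder sb = new StringBuilder(text);
        int at = sb.indexOf(target);
        while (at >= 0) {
            sb.delete(at, at + target.length());
            at = sb.indexOf(target, at); // continue from where we deleted
        }
        return sb.toString();
    }

    // Split-and-join removal: split on the substring, glue the pieces back.
    public static String removeBySplit(String text, String target) {
        // Pattern.quote makes split() treat the target literally, not as a regex.
        String[] parts = text.split(Pattern.quote(target));
        return String.join("", parts);
    }

    public static void main(String[] args) {
        String sentence = "i do not hate you";
        System.out.println("[" + removeByIndex(sentence, "do") + "]"); // [i  not hate you]
        System.out.println("[" + removeBySplit(sentence, "do") + "]"); // [i  not hate you]
    }
}
```

The double space in the output is what is left where "do" used to be; a trim() or a whitespace clean-up afterwards would tidy that if it matters.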

tong1 22 Posting Whiz · Answer 3 · 2010-08-17T08:46:14+00:00

Following NormR1's suggestion:
(1) Add a space character as an extra delimiter to the sDelim string in line 8:
String sDelim = ".? !";
(2) Insert the following line of code after line 14:
if (s1.compareTo("do")!=0)

so that "do" will not be printed to the console.

AutoTextSumm.java (0.49 KB)

import java.util.*;
import java.util.StringTokenizer;

public class AutoTextSumm {
    public static void main(String[] args) {
        String s1;
        String sDelim = ".?! ";
        StringTokenizer str1 = new StringTokenizer("i like you very much.i do not hate you!i do.", sDelim);
        while (str1.hasMoreTokens()) {
            s1 = str1.nextToken();
            s1 = s1.trim();
            if (s1.compareTo("do")!=0)
                System.out.print(s1 + " ");
        }
    }
}

jon.kiparsky 326 Posting Virtuoso · Answer 4 · 2010-08-17T08:50:14+00:00

tong - he doesn't want to split on the space. Read the original post. He wants three tokens, three sentences, and he's getting them.
The posters aren't always wrong, guys! :)
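
For anyone who wants to check the token counts, a small sketch using StringTokenizer's countTokens() method. The class name TokenCountDemo and the count helper are made up for this example.

```java
import java.util.StringTokenizer;

public class TokenCountDemo {
    // Returns how many tokens the given delimiter set produces.
    public static int count(String text, String delims) {
        return new StringTokenizer(text, delims).countTokens();
    }

    public static void main(String[] args) {
        String text = "i like you very much.i do not hate you!i do.";
        System.out.println(count(text, ".?!"));  // 3: one token per sentence
        System.out.println(count(text, ".?! ")); // 12: one token per word
    }
}
```

Every character in the delimiter string counts as a separate delimiter, so adding the space is what turns sentence tokens into word tokens.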

keanoppy 0 Light Poster · Answer 5 · 2010-08-17T08:53:08+00:00

To tong,
I tried your method, but I do not think that the line
if (s1.compareTo("do")!=0)
is having any effect on the output T_T

tong1 22 Posting Whiz · Answer 6 · 2010-08-17T09:30:40+00:00

Thank you, jon, for reminding me of keanoppy's intention.
keanoppy, thank you for trying it. You have probably put a semicolon after the if (...) line, which would stop it from working. I have tested the code and it works.
The code is printed below.
keanoppy, if you want to remove "do" from the three sentences, then as I understand it you have to write a method that removes the substring "do" from any string, since only a single character can be a delimiter.

import java.util.*;
import java.util.StringTokenizer;

public class AutoTextSumm {
    public static void main(String[] args) {
        String s1;
        String sDelim = ".?! ";
        StringTokenizer str1 = new StringTokenizer("i like you very much.i do not hate you!i do.", sDelim);
        while (str1.hasMoreTokens()) {
            s1 = str1.nextToken();
            s1 = s1.trim();
            if (s1.compareTo("do")!=0)
                System.out.print(s1 + " ");
        }
    }
}
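
To illustrate the semicolon guess above (it is only a guess about what went wrong, not something confirmed in the thread), here is a small sketch. The class StraySemicolonDemo and its two methods are hypothetical; the only difference between them is the semicolon after the if.

```java
public class StraySemicolonDemo {
    // Intended behaviour: the if guards the append.
    public static String filtered(String[] words) {
        StringBuilder sb = new StringBuilder();
        for (String w : words) {
            if (w.compareTo("do") != 0)
                sb.append(w).append(' ');
        }
        return sb.toString().trim();
    }

    // Same code with a stray semicolon: the if now guards an empty
    // statement, so the append below runs for every word.
    public static String unfiltered(String[] words) {
        StringBuilder sb = new StringBuilder();
        for (String w : words) {
            if (w.compareTo("do") != 0);
                sb.append(w).append(' ');
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        String[] words = {"i", "do", "not", "hate", "you"};
        System.out.println(filtered(words));   // i not hate you
        System.out.println(unfiltered(words)); // i do not hate you
    }
}
```

Putting braces around the body of the if makes this kind of slip much easier to spot.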

keanoppy 0 Light Poster · Answer 7 · 2010-08-17T10:23:00+00:00

thanks all for helping me out^^