how to use StringTokenizer

Question

linezero 1 Newbie Poster

11 Years Ago

I have code in a txt file

public class TestClass {
    static void main(String[] args) {
        int id;
        int number;
        id = 3;
        number = 33
        telNo = id
        int id;
    }
}

I know how to read the file using BufferedReader and get the line number, but I would like to know how StringTokenizer can be used here while reading line by line, and how to print a simple error with System.out.println, e.g. telNo cannot be assigned id because telNo is not declared, and int id is declared twice.

java read textfile


All 18 Replies

Taywin 312 Posting Virtuoso

11 Years Ago

You need a string for StringTokenizer. Then you define the delimiter on which the string is split into tokens. You could look at this link for simple examples of how to use the class.

PS: Your current main code would not compile anyway, because telNo is used but never declared.

Edited 11 Years Ago by Taywin

Taywin 312 Posting Virtuoso

11 Years Ago

When you read the file, what data type did you obtain from it? Did you read it line by line? Does each line come in as a string? Also, what is the format of the line data you are reading? Do you split the whole string, or do you need to split only a portion of it? Could you give some sample data from the file, to show how it is laid out? You could use the code tag to make it readable.

Taywin 312 Posting Virtuoso

11 Years Ago

Your line variable is the string that can be used in StringTokenizer. You just need a proper delimiter to split that string in the way you want.
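
As a minimal sketch of that idea (the class and method names are made up for illustration, and the delimiter sets are only an assumption about what you want to throw away), every character in the second constructor argument of StringTokenizer is treated as a separator:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical helper, not part of the code posted in this thread.
class TokenizeDemo {

    // Splits one line into tokens; each character in delims is a separator.
    static List<String> tokens(String line, String delims) {
        List<String> result = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(line, delims);
        while (st.hasMoreTokens()) {
            result.add(st.nextToken());
        }
        return result;
    }

    public static void main(String[] args) {
        // Space, tab and semicolon are separators, so indentation and ';' vanish.
        System.out.println(tokens("        int id;", " \t;"));
        // Adding '=' to the delimiters leaves only the names in an assignment.
        System.out.println(tokens("telNo = id", " \t;="));
    }
}
```

Choosing the delimiter set is the real decision here: the more punctuation you list, the less cleanup each token needs afterwards, but the less information about the line's structure survives.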

Taywin 312 Posting Virtuoso

11 Years Ago

Then could you give me a real example of one or a couple of lines of data? Then explain which part you want and which part you don't. I need to see the text format. It is not easy to follow what you said...

Taywin 312 Posting Virtuoso

11 Years Ago

Oh I see... It is from a compiler class, I guess. Now I get it.

So you simply obtain each line first, then split it on white space. Keep in mind that you are expecting a certain format and certain keywords in each line.

I will give you a very simple example (it does not catch everything, but it should give you some ideas). It works for variables only (no class or method names). Also, ignore any leading empty tokens.

If a line starts with int, float, double, or boolean (they are all keywords), flag that you found a symbol type. The next token should be a symbol name, and the line number where it was found is the identifier. If instead the next token is another keyword, a symbol that already exists in the table, or a malformed symbol name, it is an error. There are many other errors, but I will leave those to you.
If a line starts with a capitalized word that is not in the symbol table, treat it as a new symbol type. The next token is then handled the same way as above (as a symbol name).
If you want to add anything else, such as lines starting with public, you could use the same rule -- decide which token to expect next when you see the keyword.
Once you have finished this simple way of parsing, you can go on to recognize tokens anywhere within the line as well (such as public String s() { int a=7; }, which is written in a one-line style).

Now let's see how it would look below...

/*
public class TestClass {  // not found variable symbol
    static void main(String[] args) {  // not found variable symbol
        int id;      // found variable symbol 'id' type 'int' at line 3
        int number;  // found variable symbol 'number' type 'int' at line 4
        id = 3;      // not found
        number = 33  // not found
        telNo = id   // not found
        int id;      // flag error because of duplicated
    }
}
*/

PS: Remember that all keywords are case-sensitive. So you need to check with equals(), not equalsIgnoreCase().
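
The first rule above could be sketched roughly as follows. This is only an illustration under stated assumptions: the class name, the message wording and the use of a Map as a stand-in symbol table are all made up here, and it only recognizes a type keyword followed by a name at the start of a line.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: only handles "<type> <name>;" at the start of a line.
class DeclarationScanner {

    // Type keywords from the rule above; contains() compares case-sensitively.
    static final Set<String> TYPES =
            new HashSet<>(Arrays.asList("int", "float", "double", "boolean"));

    // Returns one message per declaration found; line numbers start at 1.
    static List<String> scan(List<String> lines) {
        Map<String, Integer> declaredAt = new LinkedHashMap<>();
        List<String> messages = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            int lineNo = i + 1;
            // trim() drops the indentation so the first token is never empty
            String[] tokens = lines.get(i).trim().split("\\s+");
            if (tokens.length < 2 || !TYPES.contains(tokens[0])) {
                continue; // not a variable declaration
            }
            String name = tokens[1].replace(";", "");
            Integer first = declaredAt.get(name);
            if (first != null) {
                messages.add("line " + lineNo + ": error: '" + name
                        + "' already declared at line " + first);
            } else {
                declaredAt.put(name, lineNo);
                messages.add("line " + lineNo + ": found symbol '" + name
                        + "' of type '" + tokens[0] + "'");
            }
        }
        return messages;
    }

    public static void main(String[] args) {
        List<String> source = Arrays.asList(
                "public class TestClass {",
                "    static void main(String[] args) {",
                "        int id;",
                "        int number;",
                "        id = 3;",
                "        number = 33",
                "        telNo = id",
                "        int id;",
                "    }",
                "}");
        for (String message : scan(source)) {
            System.out.println(message);
        }
    }
}
```

Run on the sample file, this reports id at line 3, number at line 4, and the duplicate id at line 8. It does not yet validate the symbol name or handle the capitalized-type rule.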

Edited 11 Years Ago by Taywin


linezero 1 Newbie Poster · Answer 1 · 2012-10-12T18:49:29+00:00

You need a string for StringTokenizer. Then you define what delimiter you want the string to be split into token. You could look at this link for simple examples of how to use the class.

PS: Your current main code would not pass compiler anyway because telNo is not declared anywhere but being used?

Actually, that piece of code is in the text file. I'm reading that text file using BufferedReader and printing it out as a table. For example, the table has the columns Symbol Name, Type and Identifier: under Symbol Name, id; under Type, int; under Identifier, 3. I managed to get the line number for each line of that text, but couldn't manage to split the lines accordingly.

linezero 1 Newbie Poster · Answer 2 · 2012-10-12T19:26:42+00:00

When you read the file, what data type did you obtain from the file? Did you read it line by line? Would each line comes in as string? Also, what format of the line data you are reading. Do you split the whole string or you need to split only a portion of the string? Could you give a sample data in the file? I mean how it lays out. You could use the code tag to make it readable.

I have started from scratch. Here is my code so far. I'm having trouble with splitting the lines to form a table like the one I described. min.txt contains the code from the first post. It's more like trying to create a symbol table. Here is the code I'm currently working on; I got the line number.

    import java.io.*;
    public class Reader {

        public static void main(String[] args) throws IOException {
            FileReader f = new FileReader("min.txt");
            LineNumberReader Lieader = null;
            try {
                Lieader = new LineNumberReader(f);
                String line = "";

                while ((line = Lieader.readLine()) != null) {
                    System.out.println("Line:  " + Lieader.getLineNumber() + ": " + line);
                }
            }
            finally {
                if (Lieader != null) {
                    Lieader.close(); // also closes the underlying FileReader
                }
            }
        }
    }

linezero 1 Newbie Poster · Answer 3 · 2012-10-12T20:09:20+00:00

Your line variable is the string that can be used in StringTokenizer. You just need a proper delimiter to split that string in the way you want.

And that's what I'm having trouble with >.<

linezero 1 Newbie Poster · Answer 4 · 2012-10-12T20:36:44+00:00

Then could you give me a real example of one or a couple lines of data. Then explain which part you want and which part you don't. I need to see the text format. Not easy to follow what you said though...

Okay, a text file (min.txt) contains Java code as text, which is:

public class TestClass {
    static void main(String[] args) {
        int id;
        int number;
        id = 3;
        number = 33
        telNo = id
        int id;
    }
}

Now I have to read the min.txt file and generate a table. For example, the table has the columns Symbol Name, Type and Identifier: under Symbol Name, id; under Type, int; under Identifier (line number), 3. I managed to get the line number for each line, but couldn't manage to split the lines accordingly. So far I have the line number, using the code below:

import java.io.*;
public class Reader {

    public static void main(String[] args) throws IOException {
        FileReader f = new FileReader("min.txt");
        LineNumberReader Lieader = null;
        try {
            Lieader = new LineNumberReader(f);
            String line = "";

            while ((line = Lieader.readLine()) != null) {
                System.out.println("Line:  " + Lieader.getLineNumber() + ": " + line);
            }
        }
        finally {
            if (Lieader != null) {
                Lieader.close(); // also closes the underlying FileReader
            }
        }
    }
}

I have no idea how to split the code as just described, which is what I am having trouble with now >.< Hope this explanation is better :D

linezero 1 Newbie Poster · Answer 5 · 2012-10-12T21:01:50+00:00

Oh I see... It is from a compiler class, I guess. Now I got it.

So you simply obtain each line first. Then split using white spaces. You should know that you are expecting certain format & keyword in the line.

I will give you a very simple example (not catch all but at least give you some ideas). This would work for variables only (no class or method name).

If a line starts with int, float, double, or boolean, flag that you found a type of a symbol. The next token should be the symbol name, and the found line is the identifier. If, instead, you found another keyword or the symbol already exists in the table, it is an error. There are many other errors (such as symbol name format), but I will leave it to you.

If a line starts with a capitalized word and it is not in the symbol table, expect it as a new symbol type. Then the next token would be determined the same way as above.

If you want to add anything else, such as start with public, etc, you could use the same rule -- expected next token when you see the keyword.

Could you provide an example of a symbol table? I have researched a lot about symbol tables, but I have no idea how to create one. One more question: if I manage to create this table, will I be able to use it with the code where I already get the line number?

Taywin 312 Posting Virtuoso · Answer 6 · 2012-10-12T21:06:29+00:00

A symbol table should be a collection (such as an ArrayList) of symbol details (which could be another class), somewhat as follows...

class SymbolDetail {
  String type;
  String name;
  int identifier;  // line number where the symbol was declared

  // constructor matching the call used below
  SymbolDetail(String type, String name, int identifier) {
    this.type = type;
    this.name = name;
    this.identifier = identifier;
  }
  ...
}

// then use it as... (needs import java.util.ArrayList;)
ArrayList<SymbolDetail> symbolTable = new ArrayList<SymbolDetail>();

// then once you have collected all the info, you could simply add...
symbolTable.add(new SymbolDetail(type_value, symbol_name, identifier_line_number));

// you could even implement this table in another class instead,
// so it would look cleaner when you call to use it... In other words,
// you would have at least 3 classes in your program.
// 1)Test class to test the program
// 2)SymbolTable class for symbol table
// 3)SymbolDetail class for each symbol's data

PS: I'm not sure how much you know about Java and about Object Oriented (OO) programming. You need to think in OO terms, or this will be difficult to deal with.

Starstreak 13 Light Poster · Answer 7 · 2012-10-12T21:21:33+00:00

From Java 6 javadoc:
"StringTokenizer is a legacy class that is retained for compatibility reasons although its use is discouraged in new code. It is recommended that anyone seeking this functionality use the split method of String or the java.util.regex package instead."
http://download.java.net/jdk7/archive/b123/docs/api/java/util/StringTokenizer.html

linezero 1 Newbie Poster · Answer 8 · 2012-10-12T21:28:13+00:00

A symbol table should be a collection (or ArrayList) of symbol detail (which could be another class).... Somewhat as follows...

Isn't it called SymbolTable? Can it be applied to the project I'm working on? I know OOP; I just don't know how to create a symbol table and implement it.

Taywin 312 Posting Virtuoso · Answer 9 · 2012-10-12T21:59:23+00:00

No no. The SymbolTable is a custom class you need to implement. Everything should be your own implementation. I just used that name for the class because it relates directly to what you are working on; you could use any name. I am going to be away for some hours... In the meantime, I would like you to think about it. Don't worry about coding yet, only about the abstract structure of the assignment. I will come back and try to explain later. Sorry for now.

@Starstreak, it does not really matter in this case. This is obviously for practice, not for real work. He could simply use the split() method of String with a regex as the argument. However, for this simple assignment, the main issue is how to parse the data and store it in a custom data class.

stultuske 1,116 Posting Maven Featured Poster · Answer 10 · 2012-10-13T05:57:07+00:00

Best way to use StringTokenizer: not at all.
Here's a quote straight from the StringTokenizer API:

StringTokenizer is a legacy class that is retained for compatibility reasons although its use is discouraged in new code. It is recommended that anyone seeking this functionality use the split method of String or the java.util.regex package instead.
The following example illustrates how the String.split method can be used to break up a string into its basic tokens:

      String[] result = "this is a test".split("\\s");
      for (int x=0; x<result.length; x++)
          System.out.println(result[x]);
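
The quoted Javadoc also mentions the java.util.regex package. A rough sketch of that route for the declaration lines in this thread might look like the following; the class name, the pattern and the choice of type keywords are assumptions made for illustration, and the pattern does not reject a keyword used as a variable name.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper showing the java.util.regex route.
class DeclarationPattern {

    // optional indentation, a type keyword, a name, an optional ';'
    static final Pattern DECLARATION = Pattern.compile(
            "^\\s*(int|float|double|boolean)\\s+([A-Za-z_$][A-Za-z0-9_$]*)\\s*;?\\s*$");

    // Returns {type, name} for a declaration line, or null for anything else.
    static String[] parse(String line) {
        Matcher m = DECLARATION.matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] parts = parse("        int id;");
        System.out.println(parts[0] + " " + parts[1]);
        // an assignment is not a declaration, so parse() returns null
        System.out.println(parse("        id = 3;") == null);
    }
}
```

Compared with split(), a single pattern checks the shape of the whole line at once, at the cost of being harder to read and extend.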

StringTokenizer

Starstreak 13 Light Poster · Answer 11 · 2012-10-13T11:37:29+00:00

I think the same applies to the old IO system, which was replaced by NIO. If people are new to Java, they should not be coding in the old way. Maybe if they see legacy code, then sure .. but not people learning from scratch.

Taywin 312 Posting Virtuoso · Answer 12 · 2012-10-13T17:32:09+00:00

The main issue in this requirement is NOT splitting the string at all, but the parsing. The OP may use any string-splitting method he/she wants, so sorry, I'm not going to dwell on that.

Back to the OP: do you have any ideas yet? If not, I can tell you that this assignment divides into 2 parts -- parsing and storing data. Storing the data is trivial because you could store it in any form you like. I suggested creating a custom class because it is more flexible and convenient. Below is a simple design.

/*
The Test class may contain more methods for parsing & testing.
You could move the parsing part to the SymbolTable class by simply
  pass the file to be read from to the class. Though, I just keep
  it clean from the diagram.
+-------+  used  +---------------------------------+   
|  Test | <----  | SymbolTable Class               | // constructor does not need
| Class |        |  variable                       | // to accept any argument
+-------+        |  - table(ArrayList<SymbolData>) |
                 |  methods                        |
                 |  - void addSymbolData()         |
                 |  - boolean contains(SymbolData) |
                 |  - void display()               |
                 |  - String toString()            |
                 +---------------------------------+
                        ^
                        | used
                 +----------------------+
                 | SymbolData Class     | // constructor should accept arguments of
                 |  variables           | // all type, name, and identifier at once
                 |  - type (String)     |
                 |  - name (String)     |
                 |  - identifier (int)  |  // could be type of String
                 |  methods             |
                 |  - void display()    |
                 |  - String toString() |
                 +----------------------+
  */
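
One possible reading of the diagram in code, as a sketch only. The diagram leaves several things open, so the following are assumptions: addSymbolData takes the row to add as a parameter, contains() compares symbols by name, and the tab-separated output layout is invented here.

```java
import java.util.ArrayList;
import java.util.List;

// One row of the table: type, name and the line number used as identifier.
class SymbolData {
    final String type;
    final String name;
    final int identifier;

    SymbolData(String type, String name, int identifier) {
        this.type = type;
        this.name = name;
        this.identifier = identifier;
    }

    @Override
    public String toString() {
        return name + "\t" + type + "\t" + identifier;
    }

    void display() {
        System.out.println(this);
    }
}

class SymbolTable {
    private final List<SymbolData> table = new ArrayList<>();

    // Assumption: two symbols are "the same" when their names match.
    boolean contains(SymbolData data) {
        for (SymbolData existing : table) {
            if (existing.name.equals(data.name)) {
                return true;
            }
        }
        return false;
    }

    // The diagram shows no parameter; taking the row to add is an assumption.
    void addSymbolData(SymbolData data) {
        table.add(data);
    }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder("Symbol Name\tType\tIdentifier\n");
        for (SymbolData data : table) {
            sb.append(data).append('\n');
        }
        return sb.toString();
    }

    void display() {
        System.out.print(this);
    }
}

// Stand-in for the Test class in the diagram.
class SymbolTableDemo {
    public static void main(String[] args) {
        SymbolTable table = new SymbolTable();
        table.addSymbolData(new SymbolData("int", "id", 3));
        table.addSymbolData(new SymbolData("int", "number", 4));
        SymbolData again = new SymbolData("int", "id", 8);
        if (table.contains(again)) {
            System.out.println("error: " + again.name + " is already declared");
        }
        table.display();
    }
}
```

Keeping the duplicate check in contains() rather than inside addSymbolData leaves the caller free to decide whether a duplicate is an error or should be ignored.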

Now, the main part is the parsing. I explained how to do it earlier, but it seems you didn't get the idea, so I will explain again. Always keep in mind that you know exactly what format the incoming string will have: it is code, read line by line. Each word in the split result is a string token (e.g. "my post here" gives ["my", "post", "here"]).

When you parse, you keep several flags (booleans, or whatever variables you like) that track what you have seen so far. You also need a list of the keywords you expect to see. Remember that matching is case-sensitive.

A flag is turned on when you see the keyword you are expecting, and turned off again after the next token, whether or not that token is what you expected. Either way, you must handle the token that comes right after the flag is set -- either store the expected data or deal with the error.

In your Reader class, the line variable on line 11 holds each incoming line. You could use the StringTokenizer class to split it. Just to be clear, the code below is a brute-force way to deal with it...

// i.e. line = "int id;"
String token;
// after tokenized, there will be "int" and "id;" tokens
StringTokenizer stkn = new StringTokenizer(line, " ");
String keyword="";
while (stkn.hasMoreTokens()) {
  token = stkn.nextToken();  // now this is each word
  if (token.isEmpty() || token.equals(" ")) { continue; } // ignore empty string
  if (keyword.isEmpty() && isAKeyword(token)) {
    keyword += token;
    continue;
  }
  if (keyword.equals("int")) {
    // seen "int" keyword before, expecting a symbol name
    if (isValidSymbolName(token)) {
      // create new SymbolData here because you have all data now
      // type is the keyword
      // name is the token
      // identifier is the line number
      // then add to your SymbolTable variable if it doesn't exist
      // (if it already exists in the table, either ignore or deal with the error)
    }
    else {
      // error, unexpected value.. You could ignore or do something here.
    }
    keyword = "";  // reset
    continue;
  }
  ...  // check with other keywords one-by-one
}

You could also use the split() method of the String class directly; you would then iterate with a for-loop instead, because the return type is a String[].

// use "\\s+" instead of "\\s" so that runs of white space between
// words do not produce empty strings; trim() first, because leading
// indentation would otherwise yield an empty first element.
String[] tokens = line.trim().split("\\s+");
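
A short sketch of the for-loop form, which also covers the other error from the original question (assigning to a name that was never declared). The class and method names are hypothetical, and the check assumes there are spaces around '=' as in the sample file. The line is trimmed before splitting because split("\\s+") on an indented line returns an empty first element.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch; assumes spaces around '=' as in the sample file.
class AssignmentCheck {

    // Returns the assigned name if it was never declared, otherwise null.
    static String undeclaredTarget(String line, Set<String> declared) {
        String[] tokens = line.trim().split("\\s+");
        boolean isAssignment = tokens.length >= 2 && tokens[1].equals("=");
        if (isAssignment && !declared.contains(tokens[0])) {
            return tokens[0];
        }
        return null;
    }

    public static void main(String[] args) {
        // names collected from earlier declaration lines
        Set<String> declared = new HashSet<>(Arrays.asList("id", "number"));
        String line = "        telNo = id";

        // split() returns an array, so a for-each loop replaces hasMoreTokens()
        for (String token : line.trim().split("\\s+")) {
            System.out.println("token: " + token);
        }

        String name = undeclaredTarget(line, declared);
        if (name != null) {
            System.out.println("error: " + name + " is not declared");
        }
    }
}
```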

JamesCherrill 4,733 Most Valuable Poster Team Colleague Featured Poster · Answer 13 · 2012-10-13T19:16:19+00:00


When parsing you may find this useful
