my secondary project at the moment is a complex calculator program, a graphing calculator, actually. i want to write a parser for it by hand, as it needs to do very specific complex stuff, but i cant find anything on writing a parser. i have a start, it will split the string into the correct parts and make tokens out of them, but i don't know how i should take the tokens and actually get a value from them, parenthesis are also giving me a headache. any information, or documentation.

Recommended Answers

All 18 Replies

Go to google and do the following look-ups--

Java class Double
Java class Integer
Java class String

You'll find that you can do "parsing" much like this--

public class TestClass{

String myNumber = "5.2";
double value = (double)Double.parseDouble(myNumber);

public static void main(String[] args)
{

System.out.println(new TestClass().value);

}
}

What did we just do here?

We parsed the string value of myNumber into a valid double.

I did an unnecessary cast of class-type Double to primitive type double. The reason for this is because I wanted the example to remain intuitive for others that look at it. value is expecting the same type that it is, so I justified it with a valid type-cast.

In any case, parsing is very easy, however... exceptions can occur.

For now don't worry too much about it until something that requires you to handle exceptions is brought up.

oh no, god no!

big misunderstanding, sorry for not being clear i know what parsing is, and how to get a single number from a string (way past that kind of parsing, new here, not to java some of my work), i am trying to do something way more complex than that...

say i want to find the value of "2*(3+9x)" when previously x is defined as 2. for you and me it is easy the value is easily found to be 42, but making Java do it is complex and that is what i am trying to parse, i thought i explained well enough but maybe not

all of my google searches for "writing a math parser" turn up parser generating programs(especially if java is included with the search, if not it's all C++), but i want to write my own.

oh no, god no!

big misunderstanding, sorry for not being clear i know what parsing is, and how to get a single number from a string (way past that kind of parsing, new here, not to java [URL="http://matrixpeckham.googlepages.com/gravitysimulator"]some of my work[/URL]), i am trying to do something way more complex than that...

say i want to find the value of "2*(3+9x)" when previously x is defined as 2. for you and me it is easy the value is easily found to be 42, but making Java do it is complex and that is what i am trying to parse, i thought i explained well enough but maybe not

all of my google searches for "writing a math parser" turn up parser generating programs(especially if java is included with the search, if not it's all C++), but i want to write my own.

I don't see how that's any different from parsing a string or char...

In this case, you simply need to have a global scoped variable that "reads" x--

int x = 0;

--and if you're using this as a graph or differential, you simply need to make a variable alligned with x--

char xChar = 'x'--

and do a comparison after iterating through the characters in the string and locating x--

char[] myCharArray = "2*(3+9x)".toCharArray();

for(int i = 0; i < myCharArray.length; i++)
{
      if(myCharArray[i].equals(xChar))
      {
            myCharArray[i] = (char)x;
           //should cast the value of x into a char then throw into array
      }
}

Of course x might be a bigger number than 1-9 so you'll most likely have to take a similar technique using--

String.subString(int start, int end)

--and subbing out the part via iteration that has the variable in question, replacing it then running the new String threw an iteration to do the math via encountering different symbols in your expression.

You're going to have to find an api that handles that or create your own.

Actually this sounds like an interesting project, I think I'll work on it tonight.

i see your idea, but what about "2*sin(x+1)-log(90)"? that presents a mess of other problems, that i have slight idea on how to solve, but i am unsure of the way i should do it. and order of operations?

i see your idea, but what about "2*sin(x+1)-log(90)"? that presents a mess of other problems, that i have slight idea on how to solve, but i am unsure of the way i should do it. and order of operations?

PEMDAS is your best friend

Paranthesis first, then Exponents, then Multiplication/Division then Addition/Subtraction.

What you want to do is create a stack that accepts parenthesis characters and then check the entire char-array for parenthesis.

If there is an uneven amount of parenthesis on one side of the char array than the other, throw an exception...

otherwise...

find the innermost expression from the parenthesis and substring it. From there find out what your variables are and what they are suppose to be when you encounter them (i.e., you'll have to make a String cosine = "cos", String sine = "sin").

When you run into a parenthesis char and it isn't aligned with a sin or cos then do an evaluation of the expression, solve it (by replacing variables if possible) by parsing it as a double or int argument (or long, etc) then concatenate the value into a String and replace that argument in parenthesis that you evaluated with the new evaluated String.

Rinse and repeat.

most real math parser or even grammar parsers use tokens, and that is the way i started to work it out

i have this as a token:

package parser;

/**
 * Write a description of class term here.
 * 
 * @author (your name) 
 * @version (a version number or a date)
 */
public class Token {

    static final int T_CONSTANT = 0;
    static final int T_VAR = 1;
    static final int T_UNARY_OPP = 2;
    static final int T_BINARY_OPP = 3;
    static final int T_PARENTHISIS_OPEN = 4;
    static final int T_PARENTHISIS_CLOSE = 5;
    static final int T_FUNCTION = 6;
    static final int B_OPP_NOT_OPP = 0;
    static final int B_OPP_PLUS = 1;
    static final int B_OPP_MINUS = 2;
    static final int B_OPP_MULT = 3;
    static final int B_OPP_DIVIDE = 4;
    static final int B_OPP_POW = 5;
    static final int U_OPP_NOT_OPP = 0;
    static final int U_OPP_PLUS = 1;
    static final int U_OPP_MUNUS = 2;
    static final int F_NOT_FUNCTION = 0;
    static final int F_SIN = 1;
    static final int F_COS = 2;
    static final int F_TAN = 3;
    static final int F_ASIN = 4;
    static final int F_ACOS = 5;
    static final int F_ATAN = 6;
    static final int F_LOG  = 7;
    static final double NAN = 0D/0D;
    int type;
    int type_of_type;
    Token before = null;
    Token after = null;
    String content;
    double value = NAN;

    public ValReturn value () {
        if ( canHaveVal () ) {
            if ( type == T_CONSTANT ) {
                return new ValReturn ( true, Double.parseDouble ( content ) );
            }
        }
        return new ValReturn ( false, 0 );
    }

    public boolean canHaveVal () {
        if ( type == T_CONSTANT ) {
            return true;
        }
        else if ( type == T_FUNCTION ) {
            return true;
        }
        else if ( type == T_UNARY_OPP ) {
            return true;
        }
        else if ( type == T_BINARY_OPP ) {
            return true;
        }
        return false;
    }
    public void fillSelf(){
        if(testValOf(content)==NAN){
            if(content.equalsIgnoreCase("x") || content.equalsIgnoreCase("y") || content.equalsIgnoreCase("z")){
                type = T_VAR;
            }
        } else {
            type = T_CONSTANT;
            value = testValOf(content); 
        }
    }
    public static double testValOf ( String s ) {
        try {
            return Double.valueOf ( s );
        }
        catch ( NumberFormatException nfe ) {
            return 0D / 0D;
        }
    }
    public String toString(){
        String b,a;
        if(before!=null){
            b = before.content;
        } else {
            b="null";
        }
        if(after!=null){
            a = after.content;
        } else {
            a="null";
        }
        return "before " + b + " after " + a + " content " + content + " type " + type + " typeoftype " + type_of_type;
    }
}

getting the value of that exactly is something i have trouble with, scince this code ther are a few changes i made, there is now another class that extends this one but contains other a tokenizer of its own it is for parenthesis ie. "(x+2)" where as this class will hold something like "9090" or "+"

i am in the middle of a final project for an into course (knew java before the class, and figured i should get a grade for knowing what i already knew) which is a simple version of MS paint so this is kinda on the back burner

i just remembered one of the more powerful reasons that ordinary parsing will not work for this project.

say instead of predefining x i wanted the actual multiplied form of 2(x+2)(x+2)(2x^2+2x+2). (^==to power of) i do not believe that ordinary parsing would suffice here, as anything in the ordinary JDK cannot multiply x*x to x^2.

sorry about the *bump*

i just remembered one of the more powerful reasons that ordinary parsing will not work for this project.

say instead of predefining x i wanted the actual multiplied form of 2(x+2)(x+2)(2x^2+2x+2). (^==to power of) i do not believe that ordinary parsing would suffice here, as anything in the ordinary JDK cannot multiply x*x to x^2.

sorry about the *bump*

Thats why I referred to using PEMDAS and substring.

Substring from the String class returns a particular "fragment" of the String inquestion.

What you can do is make a recursive method that pulls a chunk of information out of your String, solves it then places that solved chunk back into the original String.

You'd have to do this in order of operations, of course.

Either way it shouldn't be hard.

Right now I'm working on a C++, C# and Java project. Once I have some free time I will supply some code to help you understand what I'm getting at.

Edit: Also you should most likely use an enum type for all of your constants listed above. Either that or a final array but the array would be more difficult since you don't have a way of knowing what the constant is for without making an adjacent String array.

i already have it breaking into smaller pieces, it's getting the pieces to do the right thing that i'm having trouble with.
i have a class for these pieces (see above post), and i am not sure how i should go about having it find a value.
say for example i have an array list of instances and the second (.get(1)) contains the substring "+", it would have a type of 3 and a type_of_type of 1, both constants at the top.
it also has an a reference to the instance before it in the array list and the instance after it (.get(0), and .get(2)).
so in the value method of it there would be an if statement for the type, and then the type of type, if the type is 3 (binary operator) it would have to preform the operation with the two surrounding it.
but if one of those happens to be a variable (x) then it cannot preform the operation unless a value is already given to the variable (x) and if that is the case (no value) than it should return the string ("x +" + instance after) and if that gets returned than the rest of the whole string would have to work with that instead of a number.

i want the program to be able to tell me that 6+6*6 is 42 (PEMDAS) but i also want it to tell me that (x+1)*(x+1) is x^2+2x+1

am i being clear at all? if not please say something so i can find a better example

i already have it breaking into smaller pieces, it's getting the pieces to do the right thing that i'm having trouble with.
i have a class for these pieces (see above post), and i am not sure how i should go about having it find a value.
say for example i have an array list of instances and the second (.get(1)) contains the substring "+", it would have a type of 3 and a type_of_type of 1, both constants at the top.
it also has an a reference to the instance before it in the array list and the instance after it (.get(0), and .get(2)).
so in the value method of it there would be an if statement for the type, and then the type of type, if the type is 3 (binary operator) it would have to preform the operation with the two surrounding it.
but if one of those happens to be a variable (x) then it cannot preform the operation unless a value is already given to the variable (x) and if that is the case (no value) than it should return the string ("x +" + instance after) and if that gets returned than the rest of the whole string would have to work with that instead of a number.

i want the program to be able to tell me that 6+6*6 is 42 (PEMDAS) but i also want it to tell me that (x+1)*(x+1) is x^2+2x+1

am i being clear at all? if not please say something so i can find a better example

No I think I understand now. I'm sorry for the ignorant previous reply.

In this case I'd suggest using a map for potential polynomial values...

Say you run into a long substring of (x^2 - 4) then you can index this and find a potential value that would also satisfy this expression.

Of course the only desirable other would be (x + 2)*(x -2) but if you do this (which may or may not take a lot of time) then it may save you more time compared to creating a generalized polynomial factor extractor where the value is preserved. Either approach is worth a shot though.

I'm going to work on this like I promised to do. I'll post my results as soon as I get them.

I wrote down the process of how the program must run efficiently. Tell me if anything is wrong with this--

User
*user needs to provide equation
*user needs to specify the different variables in the equation

Equation
->Check for syntax errors
->Perform Calculation in order of operations PEMDAS
->Combine like-terms
->Simplify (not solve) the equation

Here is what I have so far. Although it's not much it may be of some help.

i like the isSolvable method,
i also like the idea of pre-defined stuff, i could save a lot of processing time, but because it is impossible to predict what a user will think of, i think it would be interesting to have a log-type of file:
i think it would be cool to get an equation, and search the log for it and if it is there it will substitute in the simplified form in the file,
and if not simplifies it and adds it to the log.
thinking of it now, to populate the initial log, i could release the program (whenever i final figure it out) to a few or more people who will later give there logs back after a while, and combine them to get an original.

what i actually have is here:
equation calculator.zip

this actually contains the bluej project folder, the actual parsing is taken care of the stuff in the parser folder, creating a tokenizer with a string parameter will do what it needs to do, it will print out the parts

i like the isSolvable method,
i also like the idea of pre-defined stuff, i could save a lot of processing time, but because it is impossible to predict what a user will think of, i think it would be interesting to have a log-type of file:
i think it would be cool to get an equation, and search the log for it and if it is there it will substitute in the simplified form in the file,
and if not simplifies it and adds it to the log.
thinking of it now, to populate the initial log, i could release the program (whenever i final figure it out) to a few or more people who will later give there logs back after a while, and combine them to get an original.


what i actually have is here:[ATTACH]6339[/ATTACH] this actually contains the bluej project folder, the actual parsing is taken care of the stuff in the parser folder, creating a tokenizer with a string parameter will do what it needs to do, it will print out the parts

I've been looking into a few sites to see if group-parsing can be done in an easier fashion (i.e. parsing a group from the "deepest" paranthesized group then simplifying that expression and appending the simplification back to it's proper location in the String).

I found a few that are useful in case your idea doesn't work out. I've been studying and tinkering around with the "Pattern" and "Matcher" classes. They're pretty useful for String operations. Links are provided here--

http://java.sun.com/javase/6/docs/api/java/util/regex/Pattern.html#matcher(java.lang.CharSequence)

http://java.sun.com/javase/6/docs/api/java/util/regex/Matcher.html#appendReplacement(java.lang.StringBuffer,%20java.lang.String)

http://java.sun.com/docs/books/tutorial/essential/regex/quant.html

http://www.roseindia.net/java/example/java/util/CaptureTextInGroup.shtml

Oh by the way... I tried using your Tokenizer class (I created main in the Tokenizer class) and I got an Array out-of-bounds error. Am I running it the wrong way?

commented: gave valid info +1

no that is an unresolved error, i didn't finish modifying it since adding the parentoken class.

it should have printed something before that happened, and that is all it should do at the moment. sorry for not mentioning it earlier, i forgot about that...

on a different not i found a book on the subject, google preview: http://books.google.com/books?id=uB-eYkcxImEC&printsec=frontcover&dq=parser+java&ei=VbNUSNaQC4KejgGB0f2BDA&sig=LCBoHCM9XBfwFNFXyVLuQ0pgRj0

unfortunately google cuts some of the most important stuff, so you have to buy the book, and the library here is behind the times... ganna try to buy it from amazon

thanks for the links its 2:17 AM now and i am too tired to read at the moment, but i will be sure to look at them tomorrow... later today... after sleep...

yes that is what i am trying to accomplish, but as said in the first post, i want to write my own by hand.

just ordered two books, one specifically in parsers with java, and one that includes but has many other things too.

my first book arrived today, it has the information i need, this code:

/*   
   This module contains the recursive descent  
   parser that uses variables.  
 */

// Exception class for parser errors.  
class ParserException extends Exception {
	String errStr; // describes the error 

	public ParserException(String str) {
		errStr = str;
	}

	public String toString() {
		return errStr;
	}
}

public class Parser {
	// These are the token types. 
	final int NONE = 0;
	final int DELIMITER = 1;
	final int VARIABLE = 2;
	final int NUMBER = 3;

	// These are the types of syntax errors. 
	final int SYNTAX = 0;
	final int UNBALPARENS = 1;
	final int NOEXP = 2;
	final int DIVBYZERO = 3;

	// This token indicates end-of-expression. 
	final String EOE = "\0";

	private String exp; // refers to expression string  
	private int expIdx; // current index into the expression  
	private String token; // holds current token  
	private int tokType; // holds token's type  

	// Array for variables.  
	private double vars[] = new double[26];

	// Parser entry point.  
	public double evaluate(String expstr) throws ParserException {
		double result;
		exp = expstr;
		expIdx = 0;

		getToken();
		if (token.equals(EOE))
			handleErr(NOEXP); // no expression present  

		// Parse and evaluate the expression. 
		result = evalExp1();

		if (!token.equals(EOE)) // last token must be EOE  
			handleErr(SYNTAX);

		return result;
	}

	// Process an assignment.  
	private double evalExp1() throws ParserException {
		double result;
		int varIdx;
		int ttokType;
		String temptoken;

		if (tokType == VARIABLE) {
			// save old token  
			temptoken = new String(token);
			ttokType = tokType;

			// Compute the index of the variable.  
			varIdx = Character.toUpperCase(token.charAt(0)) - 'A';

			getToken();
			if (!token.equals("=")) {
				putBack(); // return current token  
				// restore old token -- not an assignment  
				token = new String(temptoken);
				tokType = ttokType;
			}
			else {
				getToken(); // get next part of exp  
				result = evalExp2();
				vars[varIdx] = result;
				return result;
			}
		}

		return evalExp2();
	}

	// Add or subtract two terms.  
	private double evalExp2() throws ParserException {
		char op;
		double result;
		double partialResult;

		result = evalExp3();

		while ((op = token.charAt(0)) == '+' || op == '-') {
			getToken();
			partialResult = evalExp3();
			switch (op) {
				case '-':
					result = result - partialResult;
					break;
				case '+':
					result = result + partialResult;
					break;
			}
		}
		return result;
	}

	// Multiply or divide two factors.  
	private double evalExp3() throws ParserException {
		char op;
		double result;
		double partialResult;

		result = evalExp4();

		while ((op = token.charAt(0)) == '*' || op == '/' || op == '%') {
			getToken();
			partialResult = evalExp4();
			switch (op) {
				case '*':
					result = result * partialResult;
					break;
				case '/':
					if (partialResult == 0.0)
						handleErr(DIVBYZERO);
					result = result / partialResult;
					break;
				case '%':
					if (partialResult == 0.0)
						handleErr(DIVBYZERO);
					result = result % partialResult;
					break;
			}
		}
		return result;
	}

	// Process an exponent.  
	private double evalExp4() throws ParserException {
		double result;
		double partialResult;
		double ex;
		int t;

		result = evalExp5();

		if (token.equals("^")) {
			getToken();
			partialResult = evalExp4();
			ex = result;
			if (partialResult == 0.0) {
				result = 1.0;
			}
			else
				for (t = (int) partialResult - 1; t > 0; t--)
					result = result * ex;
		}
		return result;
	}

	// Evaluate a unary + or -.  
	private double evalExp5() throws ParserException {
		double result;
		String op;

		op = "";
		if ((tokType == DELIMITER) && token.equals("+") || token.equals("-")) {
			op = token;
			getToken();
		}
		result = evalExp6();

		if (op.equals("-"))
			result = -result;

		return result;
	}

	// Process a parenthesized expression.  
	private double evalExp6() throws ParserException {
		double result;

		if (token.equals("(")) {
			getToken();
			result = evalExp2();
			if (!token.equals(")"))
				handleErr(UNBALPARENS);
			getToken();
		}
		else
			result = atom();

		return result;
	}

	// Get the value of a number or variable.  
	private double atom() throws ParserException {
		double result = 0.0;

		switch (tokType) {
			case NUMBER:
				try {
					result = Double.parseDouble(token);
				}
				catch (NumberFormatException exc) {
					handleErr(SYNTAX);
				}
				getToken();
				break;
			case VARIABLE:
				result = findVar(token);
				getToken();
				break;
			default:
				handleErr(SYNTAX);
				break;
		}
		return result;
	}

	// Return the value of a variable.  
	private double findVar(String vname) throws ParserException {
		if (!Character.isLetter(vname.charAt(0))) {
			handleErr(SYNTAX);
			return 0.0;
		}
		return vars[Character.toUpperCase(vname.charAt(0)) - 'A'];
	}

	// Return a token to the input stream.  
	private void putBack() {
		if (token == EOE)
			return;
		for (int i = 0; i < token.length(); i++)
			expIdx--;
	}

	// Handle an error.  
	private void handleErr(int error) throws ParserException {
		String[] err = { "Syntax Error", "Unbalanced Parentheses",
			"No Expression Present", "Division by Zero" };

		throw new ParserException(err[error]);
	}

	// Obtain the next token.  
	private void getToken() {
		tokType = NONE;
		token = "";

		// Check for end of expression.  
		if (expIdx == exp.length()) {
			token = EOE;
			return;
		}

		// Skip over white space. 
		while (expIdx < exp.length()
				&& Character.isWhitespace(exp.charAt(expIdx)))
			++expIdx;

		// Trailing whitespace ends expression. 
		if (expIdx == exp.length()) {
			token = EOE;
			return;
		}

		if (isDelim(exp.charAt(expIdx))) { // is operator  
			token += exp.charAt(expIdx);
			expIdx++;
			tokType = DELIMITER;
		}
		else
			if (Character.isLetter(exp.charAt(expIdx))) { // is variable  
				while (!isDelim(exp.charAt(expIdx))) {
					token += exp.charAt(expIdx);
					expIdx++;
					if (expIdx >= exp.length())
						break;
				}
				tokType = VARIABLE;
			}
			else
				if (Character.isDigit(exp.charAt(expIdx))) { // is number  
					while (!isDelim(exp.charAt(expIdx))) {
						token += exp.charAt(expIdx);
						expIdx++;
						if (expIdx >= exp.length())
							break;
					}
					tokType = NUMBER;
				}
				else { // unknown character terminates expression 
					token = EOE;
					return;
				}
	}

	// Return true if c is a delimiter.  
	private boolean isDelim(char c) {
		if ((" +-/*%^=()".indexOf(c) != -1))
			return true;
		return false;
	}
}

from the book if you want to use it, you instantiate it, and call evaluate(String) with the string you want to use, accepts parentheses and the operators +-()*^%/ as well as assignment and use of the letters a-z as variables, not case sensitive

im going to end up modifying it (very heavily) so that it can do the algebra with strings, too. i also need to modify it to do trigonometric functions and such. a rather cool solution i wouldn't have thought of myself, recursion, something i hardly ever use but this is pretty cool.

Be a part of the DaniWeb community

We're a friendly, industry-focused community of developers, IT pros, digital marketers, and technology enthusiasts meeting, networking, learning, and sharing knowledge.