Calculator using shunting-yard algorithm

L7Sqr 2 Tallied Votes 6K Views Share

Hello.

This code snippet is a basic calculator.

The general concept is that the calculator accepts infix expressions as strings, converts them to reverse polish notation by way of the shunting-yard algorithm and then evaluates the resulting expression.

I tried to encapsulate the functionality of each piece so that, in theory, they could be pulled out and used independently. The shunting-yard behavior is provided via class ShuntingYard; I used the visitor pattern for the Token; and RPNExpression could be built by hand (or some other process) and still be used with Calculator.

I removed almost all error checking and input validation for the sake of brevity and readibility. Code compiles with MSVC++ and GCC 4.6.3.

Any feedback is appreciated.

mike_2000_17 commented: Very nice! +13
/*
   Author: Jesse Brown
   Date: July 7, 2012
 */
 
#include <iostream>
#include <stack>
#include <vector>
#include <string>
#include <cstdlib>
#include <map>
#include <stdexcept>

class Calculator;

/*
   Represents a 'token' in an RPN expression. 
   An RPN expression looks something like:
     2 3 4 + *
   This class provides a common interface for handling both
   operators and operands.
 */
struct TokenBase { 
    virtual void evaluate (Calculator *) = 0; 
    virtual ~TokenBase() {}
};

/*
   Concrete 'token' of an RPN expression. 
   Operators are of type Token< char >
   Operands are of type Token< double >
 */
template< class T > class Token : public TokenBase {

    T token_;

public:

    /* Allow a calculator to consume this token */
    void evaluate (Calculator  *c);

    Token (T t) : token_(t) {}
};

/*
   Represents an expression in Reverse Polish Notation.
   This object basically acts as a FIFO queue of tokens
 */
class RPNExpression {

    std::vector< TokenBase* > stack_;

public:

    /* Add a token to the end of the expression */
    void push (TokenBase *t) { stack_.push_back (t); }

    /* Grab the next token from the front of the expression */
    TokenBase* pop () {
        TokenBase *t = stack_.front ();
        stack_.erase (stack_.begin ());
        return t;
    }

    bool empty () const { return stack_.empty (); }
};

/*
   Convert an expression in infix format to RPN format
 */
class ShuntingYard {

    const std::string expr_;
    RPNExpression rpn_;
    std::stack< char > op_stack_;
    mutable std::map< char, int > op_precedence_;

    /* Returns a precedence value for the given operator */
    int precedence (char op) const { return op_precedence_[op]; }

    /* Returns the precedence of the top item in the stack */
    int stack_precedence () const { 
        if (op_stack_.empty ()) { return -1; }
        return precedence (op_stack_.top ());
    }

    /* Reset precedence to allow for new scope */
    void handle_left_paren () { op_stack_.push ('('); }

    /* Consume all operators in current scope and restore previous scope */
    void handle_right_paren () {
        while ('(' != op_stack_.top ()) {
            rpn_.push (new Token< char >(op_stack_.top ()));
            op_stack_.pop ();
        }
        op_stack_.pop ();
    }

    /* Consume operators with precedence >= than op then add op */
    void handle_op (char op) {
        while (! op_stack_.empty () &&
                precedence (op) <= stack_precedence ()) {
            rpn_.push (new Token< char >(op_stack_.top ()));
            op_stack_.pop ();
        }
        op_stack_.push(op);
    }

    /* Convert infix to RPN via shunting-yard algorithm */
    RPNExpression convert(const std::string &infix) {

        const char * token = infix.c_str ();

        while (token && *token) {
            while (*token && isspace (*token)) { ++token; }
            if (! *token) { break; }
            if (isdigit (*token)) {
                char * next_token = 0;
                rpn_.push (new Token< double >(strtod (token, &next_token)));
                token = next_token;
            } else {
                char op = *token;
                switch (op) {
                    case '(':
                        handle_left_paren ();
                        break;
                    case ')':
                        handle_right_paren ();
                        break;
                    default:
                        handle_op (op);
                }
                ++token;
            }
        }
        while (! op_stack_.empty ()) {
            rpn_.push (new Token< char >(op_stack_.top ()));
            op_stack_.pop ();
        }
        return rpn_;
    }

public:

    ShuntingYard (const std::string& infix) : expr_(infix) {
        op_precedence_['('] = -1;
        op_precedence_['+'] = 2; op_precedence_['-'] = 2;
        op_precedence_['*'] = 3; op_precedence_['/'] = 3;
    }

    RPNExpression to_rpn () { return convert (expr_); }

};
/*
   A calculator that evaluates expressions by first converting them to
   reverse polish notation then processing the result.
 */
class Calculator {

    std::stack< double > operands_;

    double pop () { 
        double d = operands_.top ();
        operands_.pop ();
        return d; 
    }

    void push (double d) { operands_.push (d); }

    /* Returns the most recent operation result (top of the operand stack) */
    double result () const { return operands_.top (); }

    /* Empty the operand stack */
    void flush () { 
        while (! operands_.empty ()) { operands_.pop (); }
    }

protected:

    /* Process an operand token from the input stream */
    void consume(double value) { push (value); } 

    /* Process an operator token from the input stream */
    void consume(char op) { 
        switch (op) {
            case '+':
                push (pop () + pop ());
                break;
            case '*':
                push (pop () * pop ());
                break;
            case '-':
                {
                    double right = pop ();
                    push (pop () - right);
                }
                break;
            case '/':
                {
                    double right = pop ();
                    push (pop () / right);
                }
                break;
            default:
                throw std::domain_error("Unknown Operator");
        }
    } 

public:

    /* 
        Evaluate expression 
        Note: Expression is expected to be in infix form.
     */
    double calculate (const std::string& expr) { 
        ShuntingYard shunting(expr);
        RPNExpression rpn = shunting.to_rpn ();
        flush ();
        while (! rpn.empty ()) { 
            TokenBase * token = rpn.pop ();
            token->evaluate (this);
            delete token;
        }
        return result ();
    }

    /* Expose the consume() methods to the Tokens */
    template< class T > friend class Token;
};

template< class T > 
void Token< T >::evaluate (Calculator * c) { c->consume (token_); }

int main () {
    Calculator c;
    std::cout << c.calculate ("(20+10)*3/2-3") << std::endl;
    return 0;
}
mike_2000_17 2,669 21st Century Viking Team Colleague Featured Poster

This is very nice! It's a neat little piece of designed code to solve a somewhat simple problem but in a fairly generic way. That's cool, it's the kind of code I like to see.

As usual, I could point out a few things to improve. At least, there is one thing that could be a nice improvement to this design and other similar designs (as this is a somewhat recurring design pattern).

That one thing in particular is that dependency between the TokenBase and the Calculator. I don't see how that is necessary, in fact, I think the code would be better without it (and also, because I kind of despise the OOP Visitor Pattern, it's horrible to maintain and it doesn't scale, I consider it a last resort if all else fails). Let's say you defined your TokenBase class as follows:

struct TokenBase { 
    virtual void evaluate (std::stack<double>&) = 0;   // note std::stack here. 
    virtual ~TokenBase() {}
};

This way, you could have token classes that actually do the evaluation by picking elements from the stack, without having to go through the "consume" function, as so:

class BinaryOperator : public TokenBase {
    std::function<double(double,double)> func_;
public:
    void evaluate (std::stack<double>& operands);
    BinaryOperator(std::function<double(double,double)> aFunc) : func_(aFunc) {}
};

void BinaryOperator::evaluate(std::stack<double>& operands) {
  double right = operands.top(); operands.pop();
  double left  = operands.top(); operands.pop();
  operands.push( func_(left, right) );
};

class LiteralValue : public TokenBase {
    double value_;
public:
    void evaluate (std::stack<double>& operands) { operands.push(value_); };
    LiteralValue(double aValue) : value_(aValue) { };
};

And so on for other operator arities. Then, you use a factory function to create them:

TokenBase* make_new_token(double aValue) {
  return new LiteralValue(aValue);
};

TokenBase* make_new_token(char op) {
  switch (op) {
    case '+':
        return new BinaryOperator(std::plus<double>());
    case '*':
        return new BinaryOperator(std::multiplies<double>());
    case '-':
        return new BinaryOperator(std::minus<double>());
    case '/':
        return new BinaryOperator(std::divides<double>());
    default:
        throw std::domain_error("Unknown Operator");
  };
};

A further improvement on that, to get rid of this ugly switch-case in the factory function is to use a singleton object that stores a std::map of archetypical tokens for each operator character. Then, each new operator class that you create can just register its archetype to this repository. Something like this (this is just a global variable, not a singleton, for simplicity):

std::map<char, TokenBase*> supportedOpTokens;

TokenBase* make_new_token(char op) {
  std::map<char, TokenBase*>::const_iterator it = supportedOpTokens.find(op);
  if(it != supportedOpTokens.end())
    return it->second->clone();  // say you add a "clone" function to TokenBase.
  throw std::domain_error("Unknown Operator");
};

int main() {
  supportedOpTokens['+'] = new BinaryOperator(std::plus<double>());
  supportedOpTokens['-'] = new BinaryOperator(std::minus<double>());
  supportedOpTokens['*'] = new BinaryOperator(std::multiplies<double>());
  supportedOpTokens['/'] = new BinaryOperator(std::divides<double>());
  supportedOpTokens['%'] = new BinaryOperator(std::modulus<double>());

  Calculator c;
  std::cout << c.calculate ("(20+10)*3/2-3") << std::endl;

  for(std::map<char, TokenBase*>::iterator it = supportedOpTokens.begin();
      it != supportedOpTokens.end(); ++it)
    delete it->second;
  return 0;
};

Of course, in reality the map would be wrapped in a singleton that would take care of the deletion (and would allow static initialization too). Imagine you had a bigger parser system with tons of operators of different kinds and arity, and you wanted the set of supported operators to be extendable and reconfigurable, it is much nicer to be able to register (and unregister) supported operators from anywhere, and not have to re-code the visiting function every time.

In any case, the code you posted will make a great contender for the code-snippet contest!

Member Avatar for iamthwee
iamthwee

Nice snippet.

Mike makes some really good points about the semantics of the language you have used to program this in.

I'm going to make a few suggestions about the practical use of this... Bear in mind I haven't tested your code completely at all so other issues may exist.

Your code falls flat on it's face when unary operators are considered.

Take for example:
-1+2

One such solution might be:
http://stackoverflow.com/questions/2431863/infix-to-postfix-and-unary-binary-operators

See my snippet:

http://www.daniweb.com/software-development/cpp/code/216831/expression-evaluator

Not quite as aesthetically pretty as yours, by any stretch of the imagination but slightly more complete nonetheless.

L7Sqr 227 Practically a Master Poster

@mike_2000_17:
Thank you for taking the time to provide the feedback you did! I agree with decoupling the Calculator and TokenBase and just didn't consider it as I was writing the code.
As for the reduction of the switch to a more maintainable construct - I really like the approach you suggested. I toyed with a way to make that better but gave up as each solution was more complex than what is there. I wanted to keep it pretty simple - also, I'm still pretty much a C++ n00b.

@iamthwee:
Yes. I focused more on the basic algorithm and less on a fully compliant solution. I also neglected to address exponentiation, modulus and a number of other features one would expect from a real calculator.
Supporting the 'transparantly adding a zero' could easily be added as a pass over the input. I see this as an easy way to add any number of verification steps: balanced paranthesis, unary operators, valid constructs, etc.

TrustyTony 888 pyMod Team Colleague Featured Poster

I think better approach would be to just 'transparentely add +'. I mean that you do not have substraction routine but enter adding with negation flag set for number scanner. Then you just toggle the negation flag for minus ignoring plusses. That could be handled by suffiently competent number parser.

Not quite as aesthetically pretty as yours, by any stretch of the imagination but slightly more complete nonetheless.

Looks like that other snippet is not also complete:

4+--4--4--6
Expression invalid retard!

Sorry, other language but I posted the code in Python to clarify what I meant: http://www.daniweb.com/software-development/python/code/427548/simple-expression-cleanup

happyuk 0 Newbie Poster

Nice article!

For coding up the Shunting Yard algorithm I personally found the following Wiki pages very useful:

http://en.wikipedia.org/wiki/Shunting-yard_algorithm
http://en.wikipedia.org/wiki/Reverse_Polish_notation

So long as you're OK with using stacks and queues, then I found it reasonably straightforward
to implement the algorithms as described in those Wiki pages.

I made a few modifications of my own, especially with regard to the tokenization of the RPN, ensuring it can distinguish minus signs from unary operators, so that it can handle calculations like -8+3, or 11 ^ -3, for example. That plus other sanity checks like ensuring there are no mismatched parentheses etc, and adding additional methematical operators like sin, cos, log etc...

However, I still need to clean up my own attempt a little and adopt the object oriented approach as you have ably done.

Explanation and code downloadable from here:

http://www.technical-recipes.com/2011/a-mathematical-expression-parser-in-java/

Be a part of the DaniWeb community

We're a friendly, industry-focused community of developers, IT pros, digital marketers, and technology enthusiasts meeting, networking, learning, and sharing knowledge.