c++ count html tags
Hi guys.
I'm doing an assignment. I am getting on OK with it, but I am really stuck at the moment.
The part I am stuck on involves counting HTML tags in a text file. I have thought of a method for doing this, but I have no idea how to implement it in code. My idea is to look for the start symbol of the tag (<), then the contents, then the end symbol (>). Does anyone know how I could do this in C++?
Many thanks
Mat
You could just go through the file, and when your program comes across a '<' it should ignore everything after it until it comes across a '>'; at that point you count a tag ...
Last edited by tux4life; Apr 5th, 2009 at 6:11 pm.
"Never argue with idiots, they just drag you down to their level and then beat you with experience."
Probably something like this:
std::string str = "<html>";
if( str.find("<") != std::string::npos && str.find(">") != std::string::npos)
{
    // most likely an html tag
}
Don't PM me with questions -- you might get a nasty PM in response. If you have a question then post it in one of the forums.
// assignment program
// read file and copy to another
// count amount of characters, lines, comments and tags
// change XHTML tags from upper case to lower case
// place in new file
#include <iostream>
#include <fstream>
#include <string>
#include <cstring>
#include <iomanip>
#include <cctype>
using namespace std;

int main()
{
    string file1,file2;
    string str = "<>";
    ifstream ipfile;
    ofstream opfile;
    char c;
    int amountline = 0;
    int amountcha = 0;
    int amounttag = 0;
    int amountcomment = 0;

    cout << "Please enter the name of the file you wish to check" << endl;
    cin >> file1;
    ipfile.open(file1.c_str());
    if (!ipfile.is_open())
    {
        cout << "Oops! Couldn't open " << file1 << "!\n"<<endl;
        return 1;
    }
    {
        cout << " Please enter the file you wish the edited contents to be copied to" << endl;
        cout << " This will be created if it does not already exist"<< endl;
        cin >> file2;
    }
    opfile.open(file2.c_str());

    while (!ipfile.eof())
    {
        ipfile.get(c);
        opfile << c;
        if(c!='\n' && !ipfile.eof() && c!=' ')
        {
            amountcha++;
        }
        if(c=='\n')
        {
            amountline++;
        }
        if( str.find("<") != string::npos && str.find(">") != string::npos)
        {
            amounttag++;
        }
        if ( c == '!')
        {
            amountcomment++;
        }
    } // closing brace of the while loop (missing in the code as posted)

    cout << " This file contains :" << amountline << " lines" << endl;
    cout << " This file contains :" << amountcha << " charecters" << endl;
    cout << " this file has : " << amountcomment<< " comments " << endl;
    cout << " This file contains : " << amounttag << "tags" << endl;
    cout << " Copy complete, edited code located in " << file2 << endl;
    return 0;
}
That's all my code so far.
Why not just use getline() to read an entire line at a time?
I don't do HTML coding, but I think any given tag must be on one line (for example, "<html>" cannot be split between lines), so it doesn't make sense to read the HTML file one character at a time.
std::string line;
while( getline(ipfile, line) )
{
    // blabla
}
Last edited by Ancient Dragon; Apr 5th, 2009 at 7:16 pm.
You use getline() to read an entire line that is terminated with '\n'. Then use string::find() to look for the < and > characters, as shown in the previous example code.
Also note that ipfile.eof() is not needed in my loop because the loop stops on error or end-of-file.
Last edited by Ancient Dragon; Apr 5th, 2009 at 7:19 pm.
I'll try these ideas out.