Hi everybody
I have an HTML file (newfolder.html) and I want to preprocess its content: tokenization, removing stop words, and counting the number of words. I know how to do these operations on a plain text file (.txt), but now I have to do them on an HTML file.
How can I do it?
Thanks
aseeman
Recommended Answers
You just put what you have on the text file, using the HTML tags.
"I know how to do these operations if I have a text file(.txt)"
An HTML file is text as well, so have you tried using the same approach as if it were a text file? HTML is plain text, just saved with a .html extension so the computer …
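To make the "HTML is just text" point concrete, here is a rough sketch in Python (the thread never says which language is being used, so that is an assumption). It strips the markup with the standard library's html.parser, then runs the same tokenize / drop stop words / count steps you would run on a .txt file. The names TextExtractor, html_to_text and preprocess, the tiny STOP_WORDS set, and the token regex are all made up for illustration; substitute whatever you already use for text files.

```python
import re
from collections import Counter
from html.parser import HTMLParser

# Hypothetical, deliberately tiny stop-word list; replace it with your own.
STOP_WORDS = {"a", "an", "and", "the", "is", "of", "to", "in"}


class TextExtractor(HTMLParser):
    """Collects the text nodes of a page, skipping <script> and <style> bodies."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def html_to_text(html):
    """Return the visible text of an HTML string with the tags removed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def preprocess(html):
    """Tokenize, drop stop words, and count the remaining words."""
    text = html_to_text(html)
    # Simple assumption: a token is a run of lowercase letters, digits or apostrophes.
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    kept = [t for t in tokens if t not in STOP_WORDS]
    return kept, Counter(kept)
```

To run it on the file from the question, read the file into a string first, for example with open("newfolder.html", encoding="utf-8") (the encoding is a guess), and pass the result to preprocess. Once the tags are gone, everything after html_to_text is the same as the plain .txt case.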