
Use Java to remove a block of html from a number of files?

The background to this question is that I have discovered a virus on one of my computers that adds a block of encrypted ActiveX code to HTML and PHP pages. Unfortunately all the virus cleanup tools delete the infected files. Since I'm a web developer, this means a lot of my 'working files' would be deleted. Luckily none of them are current projects, but I'd really hate to lose my records of past developments.

Since I'm learning Java as part of my degree, I was wondering: could Java be used either to search my hard drive for PHP/HTML files and remove the block of ActiveX code from them, or to take a text file listing all the file names (one of the antivirus programs gave me a list of all the infected files as a CSV) and remove the block from those files?

DaveSW
Master Poster
769 posts since Jul 2004
Reputation Points: 54
Solved Threads: 20
 

Sure it could.
If you can find something that uniquely identifies the start and end of the block you can use that as search criteria for the block and filter it all out.
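
As a rough illustration of the idea above, here is a minimal sketch that removes everything between a start marker and an end marker from a string. The class name `MarkerFilter`, the method `removeBlock`, and the marker strings in `main` are hypothetical; the real markers depend on what the injected code actually looks like.

```java
/** Sketch: remove a block delimited by a start and an end marker from a string. */
final class MarkerFilter {

    /**
     * Returns text with the first region from startMarker through endMarker
     * (both inclusive) removed. If either marker is missing, the text is
     * returned unchanged.
     */
    static String removeBlock(String text, String startMarker, String endMarker) {
        int start = text.indexOf(startMarker);
        if (start < 0) {
            return text; // no start marker: nothing to remove
        }
        int end = text.indexOf(endMarker, start + startMarker.length());
        if (end < 0) {
            return text; // unterminated block: leave the text alone rather than guess
        }
        return text.substring(0, start) + text.substring(end + endMarker.length());
    }

    public static void main(String[] args) {
        // Hypothetical markers, chosen only for this demonstration.
        String page = "<p>hello</p><!--BAD-->evil()<!--/BAD--><p>bye</p>";
        System.out.println(removeBlock(page, "<!--BAD-->", "<!--/BAD-->"));
    }
}
```

This only handles the first matching block and only works if the markers really are unique to the injected code, which is the condition stated above.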

jwenting
duckman
Team Colleague
8,392 posts since Nov 2004
Reputation Points: 1,662
Solved Threads: 337
 

Thanks for your response.

Well, I think my best idea there is this:
I always use valid HTML with only one html tag per page. The virus is tacked on the end of the page between two html tags, so if I count the first html tag with a temp variable, I can start and stop deleting at the second set.

Can you point me to any examples that show how to 'search through the hard drive' or 'go down a list of files opening them', and also how to 'delete a block of code'? I'm fairly new to Java: I've opened one file and read its contents from the command line, but I've never done anything with multiple files, or deleted anything from a file.
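
For the two ways of collecting files mentioned here, a minimal sketch might look like the following. It assumes Java 8 or later (for `Files.walk`). The class and method names are hypothetical, and the CSV layout is an assumption: the thread does not say what the antivirus report looks like, so the sketch treats the first comma-separated column as the path.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class FileFinder {

    /** True for names ending in .html, .htm or .php (case-insensitive). */
    static boolean isWebFile(String fileName) {
        String lower = fileName.toLowerCase(Locale.ROOT);
        return lower.endsWith(".html") || lower.endsWith(".htm") || lower.endsWith(".php");
    }

    /** Option 1: walk a directory tree and collect every matching regular file. */
    static List<Path> findWebFiles(Path root) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths.filter(Files::isRegularFile)
                        .filter(p -> isWebFile(p.getFileName().toString()))
                        .collect(Collectors.toList());
        }
    }

    /**
     * Option 2: take the paths from the lines of a CSV report.
     * ASSUMPTION: the path is the first column and contains no commas.
     * Header rows and non-web files are dropped by the isWebFile check.
     */
    static List<String> pathsFromCsv(List<String> csvLines) {
        List<String> result = new ArrayList<>();
        for (String line : csvLines) {
            String first = line.split(",", 2)[0].trim().replace("\"", "");
            if (isWebFile(first)) {
                result.add(first);
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Walks the directory given as the first argument, or the current one.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        for (Path p : findWebFiles(root)) {
            System.out.println(p);
        }
    }
}
```

Walking an entire drive can hit folders the program is not allowed to read, in which case `Files.walk` fails with an exception partway through; starting from the folder that holds the web projects is likely to be less trouble.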

DaveSW
 

Valid HTML with only one tag? Valid HTML requires ALL tags to be closed, so if you don't use closing tags your HTML is invalid by definition ;)

Does the virus code always start with a fixed and unique sequence? If so, you can just load the files line by line and write everything to a memory buffer up to the first line with that sequence.
Then keep reading (but don't write) until the first line that's not part of the virus code.

When you're done, close the file and reopen it for writing, clearing the file content (File has methods for this). Move the entire buffer into the file and close it.

Of course if the virus doesn't (always) put itself on new lines of the files but inserts itself into existing lines you need to do a bit more.
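
The line-by-line approach described above could be sketched roughly as follows. The names `LineCleaner`, `stripBlock` and `cleanFile` are hypothetical, and two things are assumptions rather than facts from this thread: that the end of the injected block can be recognised by an end marker, and that the block sits on lines of its own (the in-line case mentioned above is not handled).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

final class LineCleaner {

    /**
     * Copies lines into a buffer until a line contains startMarker, then skips
     * lines up to and including the first line that contains endMarker.
     * If the end marker never appears, everything after the start is dropped.
     */
    static List<String> stripBlock(List<String> lines, String startMarker, String endMarker) {
        List<String> kept = new ArrayList<>();
        boolean skipping = false;
        for (String line : lines) {
            if (!skipping && line.contains(startMarker)) {
                skipping = true; // first line of the unwanted block
            }
            if (!skipping) {
                kept.add(line);
            } else if (line.contains(endMarker)) {
                skipping = false; // last line of the block; resume copying afterwards
            }
        }
        return kept;
    }

    /** Reads the file, filters it in memory, and overwrites it only if something changed. */
    static void cleanFile(Path file, String startMarker, String endMarker) throws IOException {
        // ISO-8859-1 maps every byte to a character, so unknown encodings survive the round trip.
        List<String> lines = Files.readAllLines(file, StandardCharsets.ISO_8859_1);
        List<String> cleaned = stripBlock(lines, startMarker, endMarker);
        if (cleaned.size() != lines.size()) {
            // Files.write truncates the existing file by default before writing.
            Files.write(file, cleaned, StandardCharsets.ISO_8859_1);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 3) {
            System.err.println("usage: LineCleaner <file> <startMarker> <endMarker>");
            return;
        }
        cleanFile(Paths.get(args[0]), args[1], args[2]);
    }
}
```

Rewriting the file this way normalises line endings to the platform default, so it is worth trying on copies of the files first.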

jwenting
 

Yes, but it's one opening html tag and the other is a closing html tag.

DaveSW
 

Yes, delete anything below the first closing html tag.
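
Assuming the cut-off point meant here is the first closing `</html>` tag (the tag itself appears to have been lost from the post), a minimal sketch of "keep everything up to that tag, drop the rest" could look like this. `HtmlTruncator` and its method name are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class HtmlTruncator {

    // Matches </html>, </HTML>, or </html > regardless of case.
    private static final Pattern CLOSING_HTML =
            Pattern.compile("</html\\s*>", Pattern.CASE_INSENSITIVE);

    /**
     * Returns the page up to and including the first closing html tag.
     * Pages without such a tag are returned unchanged.
     */
    static String keepThroughFirstClosingHtml(String page) {
        Matcher m = CLOSING_HTML.matcher(page);
        return m.find() ? page.substring(0, m.end()) : page;
    }

    public static void main(String[] args) {
        // Made-up page content for demonstration only.
        String infected = "<html><body>ok</body></html>\n<html>injected</html>";
        System.out.println(keepThroughFirstClosingHtml(infected));
    }
}
```

This is only safe when nothing legitimate follows the first closing tag. A PHP page with code after its HTML, or a page that mentions `</html>` inside a script or comment, would be cut short, so checking a few files by hand first seems prudent.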

jwenting
 

Do you know where I could find some examples that look through a file and delete parts of it? I don't really know where to start...

DaveSW
 

Experiment and you'll learn. I basically told you what to do in an earlier reply; all you have to do is translate that into code.

jwenting
 

Happily, Merijn has sent me a program that solved my problem. However, I would still like to try to make this program myself. Do you have any suggestions as to what commands I need? At present my knowledge of Java commands is limited to drawing circles and seeing if they collide... well, maybe a bit more than that.

DaveSW
 

Just a thought: ANTIVIRUS????? Any antivirus should have a "remove virus from file" option, all the ones I have seen anyway... Have you tried Symantec AntiVirus version 7/8/9?

1o0oBhP
Posting Pro in Training
445 posts since Dec 2004
Reputation Points: 16
Solved Threads: 6
 

Antivirus deletes the entire file... As I'm a web developer, this nearly wiped out all my files. So no, antivirus is not the solution ;)

DaveSW
 

Again, Symantec can remove it from the file; I've witnessed the effect myself! Maybe it depends on which antivirus you use? I know it's not your degree, but a C/C++ program would be VERY easy to make using the standard libraries.

1o0oBhP
 

Yeah, some of the files it cleans, some it deletes...

DaveSW
 

OIC. I got Symantec free from uni anyway so I can't complain :) I still recommend C++, as there is a good file manipulation tutorial on the DaniWeb C/C++ tutorials forum, and because I have done next to no Java programming...

1o0oBhP
 

Anything you can do with C++ you can do with Java (at least where file handling is concerned ;) ) .

Norton is probably the WORST antivirus product on the market, far from being the best.
It has just about the worst detection ratio I've ever seen. During one test I performed, a two-week-old McAfee found, in just a few minutes, 5 viruses that an updated Norton had missed completely.

jwenting
 

Well, it's done me fine so far. I'm a bit hesitant to upgrade as I have no idea how much it will cost, especially as the stores in my area are notorious for rip-off prices for software! And I got mine free! I'm interested in the comparison of these two antiviruses, as surely it comes down to the definitions and functionality in the end?

1o0oBhP
 

Just about all AV products cost pretty much the same, which is between 60 and 80 Euro a year.
I've used quite a few over the years. McAfee is good but (like Norton) a bit too much of a worldwide standard, and therefore a target for clever evasion tactics on the part of virus authors.

I'm now using Kaspersky, which is good. Some false positives, but I'd rather have those than the false negatives Norton is prone to giving.

jwenting
 
