Use Java to remove a block of html from a number of files?

Join Date: Jul 2004
Posts: 764
Reputation: DaveSW is on a distinguished road
Solved Threads: 17
DaveSW
Master Poster

  #1
Dec 22nd, 2004
The background to this question is that I have discovered a virus on one of my computers that adds a block of encrypted ActiveX code to HTML and PHP pages. Unfortunately, all the virus cleanup tools delete the infected files. Since I'm a web developer, this means a lot of my 'working files' would be deleted. Luckily none of them are current projects, but I'd really hate to lose my records of past developments.

Since I'm learning Java as part of my degree, I was wondering: could Java be used either to search my hard drive for PHP/HTML files and remove the block of ActiveX code from them, or to take a text file of file names as input (one of the antivirus programs gave me a list of all the infected files as a CSV) and remove the block from those files?

Join Date: Nov 2004
Posts: 6,144
Reputation: jwenting is just really nice
Solved Threads: 212
Team Colleague
jwenting
duckman

  #2
Dec 23rd, 2004
Sure it could.
If you can find something that uniquely identifies the start and end of the block you can use that as search criteria for the block and filter it all out.
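To make that idea concrete, here is a minimal sketch of filtering out one delimited block from a page held in a String. The class name, method name, and the two marker strings are made up for illustration; the real markers would have to come from looking at an infected file.

```java
// Hypothetical helper: removes the first region that starts at startMarker
// and ends at endMarker (both markers included).
final class BlockRemover {

    static String removeBlock(String text, String startMarker, String endMarker) {
        int start = text.indexOf(startMarker);
        if (start < 0) {
            return text; // no start marker: nothing to remove
        }
        // Only look for the end marker after the start marker.
        int end = text.indexOf(endMarker, start + startMarker.length());
        if (end < 0) {
            return text; // start without end: leave the text alone rather than guess
        }
        return text.substring(0, start) + text.substring(end + endMarker.length());
    }

    public static void main(String[] args) {
        // The markers below are placeholders, not the real virus signature.
        String page = "<p>hello</p><!--BAD-->junk<!--/BAD--><p>bye</p>";
        System.out.println(removeBlock(page, "<!--BAD-->", "<!--/BAD-->"));
        // prints: <p>hello</p><p>bye</p>
    }
}
```

Returning the text unchanged when a marker is missing is a deliberately cautious choice, so a clean file is never altered by accident.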

DaveSW

  #3
Dec 23rd, 2004
Thanks for your response.

Well I think my best idea there is:
I always use valid HTML with only one html tag. The virus block is tacked onto the end of the page between two more html tags, so if I track the first html tag with a temp variable, I can start and stop deleting at the second ones.

Can you give me any examples showing how to 'search through the hard drive' or 'go down a list of files, opening each one', and how to 'delete a block of code'? I'm fairly new to Java: I've opened one file and read its contents from the command line, but I've never done anything with multiple files, or deleted anything from a file.
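For the two ways of collecting files mentioned here, a rough sketch using the java.nio.file API (available in Java 8 and later, so newer than this thread) might look like the following. The class and method names are invented for the example, and the list format is an assumption: one file per line, with the path in the first comma-separated column and no quoting. The actual antivirus CSV may well differ.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper for gathering the files to clean.
final class PageFinder {

    // True for file names ending in .html, .htm or .php (case-insensitive).
    static boolean isPage(Path p) {
        String name = p.getFileName().toString().toLowerCase(Locale.ROOT);
        return name.endsWith(".html") || name.endsWith(".htm") || name.endsWith(".php");
    }

    // Option 1: walk a directory tree and collect every matching regular file.
    static List<Path> findPages(Path root) throws IOException {
        try (Stream<Path> walk = Files.walk(root)) {
            return walk.filter(Files::isRegularFile)
                       .filter(PageFinder::isPage)
                       .collect(Collectors.toList());
        }
    }

    // Option 2: read paths from a list file (assumed: path is the first column).
    static List<Path> readList(Path listFile) throws IOException {
        List<Path> result = new ArrayList<>();
        for (String line : Files.readAllLines(listFile, StandardCharsets.UTF_8)) {
            String first = line.split(",", 2)[0].trim();
            if (!first.isEmpty()) {
                result.add(Paths.get(first)); // blank lines are skipped
            }
        }
        return result;
    }
}
```

One caveat: walking an entire drive with Files.walk can fail partway through on folders the program is not allowed to read, so starting from a project folder is probably the safer first experiment.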

jwenting

  #4
Dec 23rd, 2004
Valid HTML with only one tag? Valid HTML requires ALL tags to be closed, so if you don't use closing tags your HTML is invalid by definition.

Does the virus code always start with a fixed and unique sequence? If so, you can just load the files line by line and write everything to a memory buffer up to the first line with that sequence.
Then keep reading (but don't write) until the first line that's not part of the virus code.

When you're done, close the file and reopen it for writing, which clears the file content (opening a FileWriter or FileOutputStream without the append flag truncates the file). Write the entire buffer to the file and close it.

Of course if the virus doesn't (always) put itself on new lines of the files but inserts itself into existing lines you need to do a bit more.
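The line-by-line steps above could be sketched roughly as follows. This assumes the simple case only: the virus block starts on a line containing one known string and ends on a line containing another. Both marker strings, and the class and method names, are placeholders. The sketch also assumes ISO-8859-1 for reading and writing, because that charset maps every byte to a character and back, so unknown encodings pass through unchanged.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper implementing the "buffer the good lines, skip the bad ones" idea.
final class LineFilter {

    // Keeps every line outside a block; a block runs from a line containing
    // startMarker through the next line containing endMarker (inclusive).
    static List<String> stripBlock(List<String> lines, String startMarker, String endMarker) {
        List<String> kept = new ArrayList<>();
        boolean inBlock = false;
        for (String line : lines) {
            if (!inBlock && line.contains(startMarker)) {
                inBlock = true; // stop writing to the buffer
            }
            if (!inBlock) {
                kept.add(line);
            }
            if (inBlock && line.contains(endMarker)) {
                inBlock = false; // resume writing from the next line
            }
        }
        return kept;
    }

    // Reads the file into memory, filters it, and overwrites it only if lines were dropped.
    static boolean cleanFile(Path file, String startMarker, String endMarker) throws IOException {
        List<String> original = Files.readAllLines(file, StandardCharsets.ISO_8859_1);
        List<String> cleaned = stripBlock(original, startMarker, endMarker);
        if (cleaned.size() == original.size()) {
            return false; // nothing removed, leave the file untouched
        }
        // Files.write truncates the existing file by default before writing.
        Files.write(file, cleaned, StandardCharsets.ISO_8859_1);
        return true;
    }
}
```

Two things to be aware of: if the end marker never appears, everything after the start marker is dropped, and rewriting the file this way normalises line endings to the platform default. Trying it on copies first seems wise.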

DaveSW

  #5
Dec 23rd, 2004
Yes, but one is an <html> tag and the other is a </html> tag, so if I searched for either string it would only appear once, wouldn't it?

The virus always inserts a new block of code within <html> tags below the page content. I have a few example files in a passworded zip if you want to examine them in notepad? I can also change the file extensions to prevent them being executed if that helps?

edit: the body tag that is inserted also has a constant onload attribute, so yes I guess there is a sequence of things there.
Alternatively it could just delete from after the first </html> tag downwards, and that would always remove all the junk.

ps: the reason I have them in a zip is because I asked merijn and he said he'd see if he could help, and he wanted two samples of each file type in a zip.

jwenting

  #6
Dec 23rd, 2004
Yes, delete anything below the first </html> tag; that should remove all illegal content (no sane person would allow more than one html block per page anyway, as it's not allowed under the HTML standards, and any validating parser will barf over it).
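A possible sketch of the "cut everything after the first </html>" approach is below. The class and method names are invented, the ".bak" backup naming is just one choice, and ISO-8859-1 is assumed so that bytes survive the read/write round trip. It matches the closing tag case-insensitively, since pages may use </HTML>.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: truncates a page right after its first closing html tag.
final class HtmlTruncator {

    // Matches </html>, </HTML>, and variants with whitespace before '>'.
    private static final Pattern CLOSE_HTML =
            Pattern.compile("</html\\s*>", Pattern.CASE_INSENSITIVE);

    static String truncateAfterFirstClose(String page) {
        Matcher m = CLOSE_HTML.matcher(page);
        // If there is no closing tag, return the page unchanged.
        return m.find() ? page.substring(0, m.end()) : page;
    }

    // Truncates the file in place, keeping a .bak copy of the original next to it.
    static boolean cleanFile(Path file) throws IOException {
        String page = new String(Files.readAllBytes(file), StandardCharsets.ISO_8859_1);
        String cleaned = truncateAfterFirstClose(page);
        if (cleaned.length() == page.length()) {
            return false; // nothing after the closing tag, or no closing tag at all
        }
        Path backup = file.resolveSibling(file.getFileName() + ".bak");
        Files.copy(file, backup, StandardCopyOption.REPLACE_EXISTING);
        Files.write(file, cleaned.getBytes(StandardCharsets.ISO_8859_1));
        return true;
    }
}
```

This would also trim a harmless trailing newline after </html>, and it is only safe for PHP files that have no legitimate code after the closing tag, so checking a few results by hand before running it over everything is a good idea.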

DaveSW

  #7
Dec 23rd, 2004
Do you know where I could find some examples that look through a file and delete parts of it? I don't really know where to start...

jwenting

  #8
Dec 23rd, 2004
Experiment and you'll learn. I basically told you what to do in an earlier reply; all you have to do is translate that into code.

DaveSW

  #9
Dec 23rd, 2004
Happily, Merijn has sent me a program that solved my problem. However, I would still like to try to write this program myself. Do you have any suggestions as to what commands I need? At present my knowledge of Java commands is limited to drawing circles and seeing if they collide... well, maybe a bit more than that.

Join Date: Dec 2004
Posts: 445
Reputation: 1o0oBhP is an unknown quantity at this point
Solved Threads: 6
1o0oBhP
Posting Pro in Training

  #10
Dec 23rd, 2004
Just a thought: ANTIVIRUS????? Any antivirus should have a "remove virus from file" option, all that I have seen anyway... Have you tried Symantec AntiVirus version 7/8/9?
http://sales.carina-e.com

no www
no nonsense

coming soon to a pc near you! :cool:

©2003 - 2009 DaniWeb® LLC