Dec 22nd, 2004

Use Java to remove a block of html from a number of files?

The background to this question is that I have discovered a virus on one of my computers that adds a block of encrypted ActiveX code to HTML and PHP pages. Unfortunately, all the virus cleanup tools delete the infected files, and since I'm a web developer, that means a lot of my 'working files' would be deleted. Luckily none of them are current projects, but I'd really hate to lose my records of past developments.

Since I'm learning Java as part of my degree, I was wondering: could Java be used either to search my hard drive for PHP/HTML files and remove the block of ActiveX code from them, or to take a text file listing all the file names (one of the antivirus programs gave me a list of all the infected files as a CSV) and remove the block from those files?
DaveSW

Dec 23rd, 2004

Re: Use Java to remove a block of html from a number of files?

Sure it could.
If you can find something that uniquely identifies the start and end of the block you can use that as search criteria for the block and filter it all out.
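
A minimal sketch of that idea, assuming the whole page is already in a String and that the block is bracketed by two known marker strings. The markers `<!--BAD-->` and `<!--/BAD-->` and the class name `BlockRemover` are made up for illustration; the real start and end sequences would have to come from inspecting an infected file.

```java
final class BlockRemover {
    /**
     * Returns text with the first region from startMarker through endMarker
     * (both inclusive) removed. If either marker is missing, the text is
     * returned unchanged.
     */
    static String removeBlock(String text, String startMarker, String endMarker) {
        int start = text.indexOf(startMarker);
        if (start < 0) {
            return text; // no start marker: nothing to remove
        }
        int end = text.indexOf(endMarker, start + startMarker.length());
        if (end < 0) {
            return text; // start without end: leave the text alone
        }
        return text.substring(0, start) + text.substring(end + endMarker.length());
    }

    public static void main(String[] args) {
        // Hypothetical markers, used only to demonstrate the method.
        String page = "<html><body>ok</body></html><!--BAD-->junk<!--/BAD-->";
        System.out.println(removeBlock(page, "<!--BAD-->", "<!--/BAD-->"));
        // prints <html><body>ok</body></html>
    }
}
```

Returning the text unchanged when a marker is missing is a deliberately cautious choice: a file that does not match the expected pattern is left as it was.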
jwenting

Dec 23rd, 2004

Re: Use Java to remove a block of html from a number of files?

Thanks for your response.

Well I think my best idea there is:
I always use valid HTML with only one html tag. The virus is tacked onto the end of the page between two html tags, so if I count past the first html tag with a temp variable, I can start and stop deleting at the second ones.

Can you point me to any examples showing how to search through the hard drive or work down a list of files, opening each one, and how to delete a block of code? I'm fairly new to Java: I've opened a single file and read its contents from the command line, but I've never done anything with multiple files, or deleted anything from a file.
DaveSW

Dec 23rd, 2004

Re: Use Java to remove a block of html from a number of files?

Valid HTML with only one tag? Valid HTML requires ALL tags to be closed, so if you don't use closing tags your HTML is invalid by definition.

Does the virus code always start with a fixed and unique sequence? If so, you can just load the files line by line and write everything to a memory buffer up to the first line with that sequence.
Then keep reading (but don't write) until the first line that's not part of the virus code.

When you're done, close the file and reopen it for writing, which clears the file content (opening a FileWriter or FileOutputStream on an existing file truncates it unless you ask for append mode). Write the entire buffer to the file and close it.

Of course if the virus doesn't (always) put itself on new lines of the files but inserts itself into existing lines you need to do a bit more.
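
The steps above could be sketched roughly as follows. This assumes the virus block sits on its own lines, that some fixed text identifies its first line, and that some other fixed text identifies its last line; both sequences are passed in as parameters because the thread does not say what they are. The class and method names are invented for the example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

final class LineFilter {
    /**
     * Copies lines into a buffer, dropping each run of lines that starts at a
     * line containing startSeq and ends at the next line containing endSeq
     * (both inclusive). If no end line follows, everything from the start
     * line onward is dropped.
     */
    static List<String> dropBlock(List<String> lines, String startSeq, String endSeq) {
        List<String> kept = new ArrayList<>();
        boolean skipping = false;
        for (String line : lines) {
            if (!skipping && line.contains(startSeq)) {
                skipping = true;          // first line of the unwanted block
            }
            if (!skipping) {
                kept.add(line);           // normal content goes to the buffer
            } else if (line.contains(endSeq)) {
                skipping = false;         // last line of the block, also dropped
            }
        }
        return kept;
    }

    /** Reads the file, filters it in memory, then overwrites the file. */
    static void cleanFile(Path file, String startSeq, String endSeq) throws IOException {
        // ISO-8859-1 maps every byte to one char, so unknown encodings survive
        // the round trip; line endings become the platform default on write.
        List<String> lines = Files.readAllLines(file, StandardCharsets.ISO_8859_1);
        List<String> kept = dropBlock(lines, startSeq, endSeq);
        // Files.write truncates an existing file by default.
        Files.write(file, kept, StandardCharsets.ISO_8859_1);
    }
}
```

It would be sensible to run something like this on copies of the files first, since `cleanFile` overwrites the original in place.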
jwenting

Dec 23rd, 2004

Re: Use Java to remove a block of html from a number of files?

Yes, but one is an <html> tag and the other is a </html> tag, so if I were searching for either string, it would only appear once, wouldn't it?

The virus always inserts a new block of code within <html> tags below the page content. I have a few example files in a password-protected zip if you want to examine them in Notepad. I can also change the file extensions to prevent them being executed, if that helps.

Edit: the body tag that is inserted also has a constant onload attribute, so yes, I guess there is a fixed sequence to look for.
Alternatively, it could just delete everything after the first </html> tag, and that would always remove all the junk.

PS: the reason I have them in a zip is that I asked Merijn and he said he'd see if he could help, and he wanted two samples of each file type in a zip.
DaveSW

Dec 23rd, 2004

Re: Use Java to remove a block of html from a number of files?

Yes, delete anything below the first </html> tag; that should remove all illegal content. (No sane person would allow more than one html block per page anyway, as it's not allowed under the HTML standards, and any validating parser will barf on it.)
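
A minimal sketch of the "cut after the first </html>" approach, assuming the file content has already been read into a String. The class name is invented for the example, and matching the tag case-insensitively (with optional whitespace before the `>`) is my own assumption about what the files might contain.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class HtmlTruncator {
    // Matches </html>, </HTML>, </html > and similar.
    private static final Pattern CLOSING_HTML =
            Pattern.compile("</html\\s*>", Pattern.CASE_INSENSITIVE);

    /**
     * Returns the content up to and including the first closing html tag.
     * If there is no closing html tag, the content is returned unchanged.
     */
    static String keepThroughFirstClosingHtml(String content) {
        Matcher m = CLOSING_HTML.matcher(content);
        if (!m.find()) {
            return content;
        }
        return content.substring(0, m.end());
    }

    public static void main(String[] args) {
        String infected = "<html><body>page</body></html>\n<html><body onload=\"x\">junk</body></html>";
        System.out.println(keepThroughFirstClosingHtml(infected));
        // prints <html><body>page</body></html>
    }
}
```

Note that this only helps for pages that actually contain a closing html tag; a PHP file without one would be left untouched.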
jwenting

Dec 23rd, 2004

Re: Use Java to remove a block of html from a number of files?

Do you know where I could find some examples that look through a file and delete parts of it? I don't really know where to start...
DaveSW

Dec 23rd, 2004

Re: Use Java to remove a block of html from a number of files?

Experiment and you'll learn. I basically told you what to do in an earlier reply; all you have to do is translate that into code.
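
As a starting point for the "search through the hard drive" part, here is a rough sketch using `java.nio.file.Files.walk` from current Java versions. The class name and the choice of extensions (`.html` and `.php`, as mentioned in the question) are assumptions for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class WebPageFinder {
    /** True if the file name ends in .html or .php, ignoring case. */
    static boolean isWebPage(String fileName) {
        String lower = fileName.toLowerCase(Locale.ROOT);
        return lower.endsWith(".html") || lower.endsWith(".php");
    }

    /** Walks the tree under root and collects every regular file that looks like a web page. */
    static List<Path> findWebPages(Path root) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                    .filter(Files::isRegularFile)
                    .filter(p -> isWebPage(p.getFileName().toString()))
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Start directory comes from the command line; defaults to the current directory.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        for (Path page : findWebPages(root)) {
            System.out.println(page);
        }
    }
}
```

Each path it finds could then be handed to whatever cleaning routine is chosen. Walking an entire drive can hit directories the program is not allowed to read, in which case `Files.walk` throws an `UncheckedIOException`, so starting from a project folder is the safer experiment.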
jwenting

Dec 23rd, 2004

Re: Use Java to remove a block of html from a number of files?

Happily, Merijn has sent me a program that solved my problem. However, I would still like to try to make this program myself. Do you have any suggestions as to what commands I need? At present my knowledge of Java commands is limited to drawing circles and seeing if they collide... well, maybe a bit more than that.
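
For the other option raised in the original question, working from the antivirus program's CSV list, a sketch might look like this. The layout of that CSV is not given in the thread, so this assumes the file path is the first comma-separated field on each line and that there is no header row; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

final class InfectedList {
    /**
     * Returns the first comma-separated field of a line, trimmed and with one
     * pair of surrounding double quotes removed. Paths that themselves contain
     * a comma are not handled by this simple split.
     */
    static String firstField(String csvLine) {
        int comma = csvLine.indexOf(',');
        String field = (comma < 0 ? csvLine : csvLine.substring(0, comma)).trim();
        if (field.length() >= 2 && field.startsWith("\"") && field.endsWith("\"")) {
            field = field.substring(1, field.length() - 1);
        }
        return field;
    }

    /** Reads the CSV and turns the first field of every non-empty line into a Path. */
    static List<Path> readPaths(Path csvFile) throws IOException {
        List<Path> paths = new ArrayList<>();
        for (String line : Files.readAllLines(csvFile, StandardCharsets.ISO_8859_1)) {
            String field = firstField(line);
            if (!field.isEmpty()) {
                paths.add(Paths.get(field));
            }
        }
        return paths;
    }
}
```

If the real CSV has a header row or puts the path in a different column, `firstField` would need adjusting accordingly.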
DaveSW

Dec 23rd, 2004

Re: Use Java to remove a block of html from a number of files?

Just a thought: antivirus? Any antivirus should have a "remove virus from file" option; all the ones I have seen do, anyway. Have you tried Symantec AntiVirus version 7/8/9?
1o0oBhP
