Use Java to remove a block of html from a number of files?
The background to this question is that I have discovered a virus on one of my computers that adds a block of encrypted ActiveX code to HTML and PHP pages. Unfortunately, all the virus cleanup tools delete the infected files, and as a web developer that means a lot of my 'working files' would be deleted. Luckily none of them are current projects, but I'd really hate to lose my records of past developments.
Since I'm learning Java as part of my degree, I was wondering: could Java be used either to search my hard drive for PHP/HTML files and remove the block of ActiveX code from them, or to take a text file listing all the file names (one of the antivirus programs gave me a list of all the infected files as a CSV) and remove the block from those files?
Thanks for your response.
Well I think my best idea there is:
I always use valid HTML with only one html tag. The virus is tacked on the end of the page between two html tags, so if I count the first html tag with a temporary variable, I can start and stop deleting at the second ones.
Can you give me any example files where I could find the 'search through the hard drive' or 'go down a list of files opening them' parts, and also the 'deleting a block of code' part? I'm fairly new to Java, so I've opened one file and read its contents from the command line, but I've never done anything with multiple files, or deleted anything from a file.
Valid HTML with only one tag? Valid HTML requires ALL tags to be closed, so if you don't use closing tags your HTML is invalid by definition.
Does the virus code always start with a fixed and unique sequence? If so, you can just load the files line by line and write everything to a memory buffer up to the first line with that sequence.
Then keep reading (but don't write) until the first line that's not part of the virus code.
When you're done, close the file and reopen it for writing, which clears the file content (opening a FileWriter or FileOutputStream without the append flag truncates the file). Write the entire buffer to the file and close it.
Of course if the virus doesn't (always) put itself on new lines of the files but inserts itself into existing lines you need to do a bit more.
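The buffer approach described above could be sketched roughly like this. It is only a sketch under assumptions: the class name `BlockRemover` and both marker strings are hypothetical, and it assumes the injected code starts and ends on lines of its own. The file is read and written as ISO-8859-1 so that every byte round-trips unchanged whatever the page's real encoding is; line endings are rewritten to the platform default.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

final class BlockRemover {

    // Copies lines into a buffer, skipping every run that starts at a line
    // containing startMarker and ends at the next line containing endMarker
    // (both lines included). If endMarker never appears, the rest is dropped.
    static List<String> removeBlock(List<String> lines, String startMarker, String endMarker) {
        List<String> kept = new ArrayList<>();
        boolean skipping = false;
        for (String line : lines) {
            if (!skipping && line.contains(startMarker)) {
                skipping = true;            // first line of the injected block
            }
            if (!skipping) {
                kept.add(line);             // ordinary page content
            } else if (line.contains(endMarker)) {
                skipping = false;           // last line of the injected block
            }
        }
        return kept;
    }

    // Reads the whole file, filters it in memory, and rewrites it only if
    // something was removed. Files.write truncates the existing file.
    static void cleanFile(Path file, String startMarker, String endMarker) throws IOException {
        List<String> lines = Files.readAllLines(file, StandardCharsets.ISO_8859_1);
        List<String> kept = removeBlock(lines, startMarker, endMarker);
        if (kept.size() != lines.size()) {
            Files.write(file, kept, StandardCharsets.ISO_8859_1);
        }
    }
}
```

Try it on copies of the infected files first, since a wrong marker would silently delete real content.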

Yes, but one is an <html> tag and the other is a </html> tag, so if I searched for either string it would only appear once, wouldn't it?
The virus always inserts a new block of code within <html> tags below the page content. I have a few example files in a passworded zip if you want to examine them in notepad? I can also change the file extensions to prevent them being executed if that helps?
edit: the body tag that is inserted also has a constant onload attribute, so yes I guess there is a sequence of things there.
Alternatively it could just delete from after the first </html> tag downwards, and that would always remove all the junk.
ps: the reason I have them in a zip is because I asked merijn and he said he'd see if he could help, and he wanted two samples of each file type in a zip.
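The "delete everything after the first </html>" idea might look something like the sketch below. The class and method names are made up for illustration, and it assumes the clean page really does contain exactly one closing html tag before the junk; a PHP file with no </html>, or with that text inside a string, would need a different rule. It keeps a `.bak` copy before overwriting anything.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class HtmlTailCleaner {

    // Matches </html>, </HTML>, or </html > regardless of case.
    private static final Pattern CLOSING_HTML =
            Pattern.compile("</html\\s*>", Pattern.CASE_INSENSITIVE);

    // Returns the content up to and including the first closing html tag,
    // or the content unchanged if there is no such tag.
    static String stripAfterFirstClosingHtml(String content) {
        Matcher m = CLOSING_HTML.matcher(content);
        return m.find() ? content.substring(0, m.end()) : content;
    }

    // Cleans one file in place; returns true if the file was changed.
    static boolean cleanFile(Path file) throws IOException {
        // ISO-8859-1 maps every byte to one char, so the bytes round-trip
        // unchanged even if the page is really in some other encoding.
        String original = new String(Files.readAllBytes(file), StandardCharsets.ISO_8859_1);
        String cleaned = stripAfterFirstClosingHtml(original);
        if (cleaned.equals(original)) {
            return false;
        }
        // Keep a backup next to the original before overwriting it.
        Path backup = file.resolveSibling(file.getFileName() + ".bak");
        Files.copy(file, backup, StandardCopyOption.REPLACE_EXISTING);
        Files.write(file, cleaned.getBytes(StandardCharsets.ISO_8859_1));
        return true;
    }
}
```

Note that anything after the first closing tag goes, including a trailing newline, so a clean file that ends with a line break will also be reported as changed.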
Do you know where I could find some examples that look through a file and delete parts of it? I don't really know where to start...
Happily, Merijn has sent me a program that solved my problem. However, I would still like to try to make this program myself. Do you have any suggestions as to what commands I need? At present my knowledge of Java commands is limited to drawing circles and seeing if they collide... well, maybe a bit more than that.
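For the "which commands" part, the java.nio.file classes cover both ways of getting the file names. Below is a rough sketch, not a finished tool: the class name `InfectedFileFinder` is invented, and the list reader assumes the antivirus CSV has already been reduced to one file path per line, since the real format of that list isn't known here.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class InfectedFileFinder {

    // True for names ending in .html, .htm or .php (case-insensitive).
    static boolean isWebFile(Path p) {
        String name = p.getFileName().toString().toLowerCase(Locale.ROOT);
        return name.endsWith(".html") || name.endsWith(".htm") || name.endsWith(".php");
    }

    // Walks the directory tree under root and collects matching files.
    // Files or folders that cannot be read are skipped instead of aborting.
    static List<Path> findWebFiles(Path root) throws IOException {
        List<Path> found = new ArrayList<>();
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                if (attrs.isRegularFile() && isWebFile(file)) {
                    found.add(file);
                }
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFileFailed(Path file, IOException exc) {
                return FileVisitResult.CONTINUE; // e.g. access denied
            }
        });
        return found;
    }

    // Reads a text file with one path per line (an assumed format) and
    // returns the non-blank entries as paths.
    static List<Path> readFileList(Path listFile) throws IOException {
        List<Path> result = new ArrayList<>();
        for (String line : Files.readAllLines(listFile, StandardCharsets.ISO_8859_1)) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                result.add(Paths.get(trimmed));
            }
        }
        return result;
    }
}
```

Either method gives you a `List<Path>` to loop over, opening each file in turn and applying whatever removal rule fits the virus block.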