i have a couple of servers that are changing perms on /dev/null so that "other" can't access it, at random (well, i'm sure SOMETHING'S causing it, but the timing is random), so i wrote a script to go in the inittab so it'll respawn if it dies (bash shell)

the script exports different variables, such as the mail recipient, gets the servername from a hostname -s (unless that's null, and then it gets it from some other places), and then simply loops:

i=1
while [ "$i" -lt "2" ]
do
export MINUTE=`date +%M`
if [ "$MINUTE" = "58" ]
then
 #echo "minute is $MINUTE. going to exit, should respawn" >/tmp/minute.txt
  sleep 50
  exit 0
fi
export MODE=`stat /dev/null | grep Access | head -1 | awk -F\( '{print $2}' | awk -F/ '{print $1}'`
if [ "$MODE" = "0660" ]
then
        echo "Changing /dev/null to global rw at `date`"
        /bin/chmod 666 /dev/null
        echo "done"

        echo "$SERVER: had to reset permissions on /dev/null at `date`" > $MAIL_FILE
        send_mail;
        sleep $SLEEP
        i=1
fi
done

where send_mail is just another function to send mail if the mail recipient is defined.

the problem is that this thing is driving the load up to 3 and taking 12% of the cpu, and consistently shows as the top value in top -S

this isn't showing up to the end-user perspective yet, but since one server is a sql db server and the other is a web server, i'm sure that at some point the degradation will creep in.

does anyone have any suggestions
a) as to why the load is driving so high
b) possibly an easy way around it?

if i know the "why" i can probably fix something up.

it sleeps only for 10 seconds. i thought about having it check to see if "file a" exists before starting, but then it'd overspawn from the inittab and flood /var/log/messages i think...

thanks!

Hey there,

If you're running syslog-ng, look into that. It has a history of doing exactly what you're experiencing.

Otherwise, since you know this action is recurring (something is chmod'ing your /dev/null) you could just run this in cron with one line and just "assume" the permissions have been changed.

59 * * * * /bin/chmod 666 /dev/null >/dev/null 2>&1

and run a separate smaller shell script to just run

ls -ld /dev/null
lsof|grep /dev/null
etc...

or whatever info you want to grab every 5 minutes or so for a day and then go over that and see if you can find some likely suspects. lsof should show you what processes/users are tapping /dev/null constantly.
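
A rough sketch of what that smaller script could look like, in case it helps. The function name snapshot, the LOG path and the TARGET variable are made up for illustration, and it assumes lsof is installed (it is skipped if not):

```shell
#!/bin/bash
# Hypothetical helper: append one timestamped snapshot of /dev/null's state to a log.
# TARGET and LOG are illustrative names and can be overridden before calling snapshot.
TARGET="${TARGET:-/dev/null}"
LOG="${LOG:-/tmp/devnull-watch.log}"

snapshot() {
    {
        echo "=== $(date '+%Y-%m-%d %H:%M:%S') ==="
        # Current owner, group and mode of the target
        ls -ld "$TARGET"
        # Processes that currently have the target open (skipped if lsof is absent)
        if command -v lsof >/dev/null 2>&1; then
            lsof "$TARGET" 2>/dev/null || true
        fi
    } >> "$LOG"
}
```

Put a call to snapshot at the end of the file and run the script from cron every 5 minutes or so (for example with a */5 * * * * entry), then read through the log after a day.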

Another crazy thing you could do - if you don't think it'll get you in trouble - would be to lock down /dev/null and, hopefully, make the process that's goofing with it go nuts ;)

If my suggestion sounds glib, I apologize. I'm just thinking you could figure this out using an alternate method and stick with the no-pain quick-fix until you do. Since the script doesn't attempt to find what program changed the perms, you don't need to make it so complicated.

Best wishes,

Mike

the original one was a straight loop that always just chmod'd it :) there was no iterative loop in that one, just a standard "while / do"
i might have to go back to that one. i was hoping to have it notify me and a coworker when it changes so we can find some semblance of timing (it might even only do it after a reboot, not sure)

i would run it from cron once a minute, except that if the oracle web application drops during that one minute it won't restart properly, all because of this permissions thing; then you have to get in, stop the procs via a script, check that it shut everything down properly and removed the files that store the process names and associated pids, and make sure apache and only the related http processes are down.

on a side-note, i added nice -n 19 sleep $SLEEP instead, thinking that maybe sleep is running it high (obviously at this point i'm grasping), but i haven't had a free moment (other than this one while i'm eating!) to log in and check how it's going. i... don't remember if i checked the syslog or not. i'm thinking i must not have if i don't remember!
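
One likely reason for the load, going by the loop as posted: sleep $SLEEP sits inside the if block, so it only runs right after a fix. Whenever /dev/null already has the right mode, the loop goes straight round again and forks date, stat, grep, head and two awks on every pass with no pause, so nice on the sleep won't change much. Below is a minimal sketch of the same check with the sleep moved out of the branch. It assumes GNU stat (for -c %a); check_once, watch_loop and WANT are made-up names, and it resets any mode other than 666 rather than only 0660:

```shell
#!/bin/bash
# Sketch only: the WANT and SLEEP defaults are assumptions, not values from the original script.
WANT="${WANT:-666}"
SLEEP="${SLEEP:-10}"

# Check one path and restore its mode if it has drifted.
# Returns 0 if a fix was applied, 1 if nothing needed doing, 2 on error.
check_once() {
    local target="$1" mode
    mode=$(stat -c %a "$target") || return 2
    if [ "$mode" != "$WANT" ]; then
        chmod "$WANT" "$target" || return 2
        echo "reset $target from $mode to $WANT at $(date)"
        # the original MAIL_FILE / send_mail notification would go here
        return 0
    fi
    return 1
}

# Loop forever, pausing on every pass, not only after a fix.
watch_loop() {
    while true; do
        check_once /dev/null
        sleep "$SLEEP"
    done
}
```

Calling watch_loop at the end of the script gives the inittab-respawn behaviour back; the minute-58 exit from the original is left out here and could be added to the loop if it is still wanted.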

as for what you and comatose say about alternates, i've also thought about scripting it in tcsh, which is pretty much the only other syntax i know; i'm not familiar enough with perl to even try doing it that way. i thought about doing something like the lsof to check, but the problem is i only see the effects of whatever it is changing it (i'll only know after the perms are changed :( )

i have more or less free rein on these machines (although reboots have to be scheduled), so i could lock it down; how do i go about doing that?

thanks so far!

as root, you should be able to basically do something like:

chmod 722 /dev/null

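That gives the owner (root) full permissions and lets everyone else write to it, but not read from it. Only the owner or root can change the permissions in any case, so assuming /dev/null is owned by root as usual, whatever keeps resetting them is running with root privileges. Be aware that this will break any application that needs to read from /dev/null. Something is clearly wrong, though, if some program is randomly modifying permissions that you set....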
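
If you want to see exactly what a mode like 722 grants before touching the real device, you can try it on a scratch file first. A small sketch, assuming GNU stat; show_mode is a made-up helper name:

```shell
# Print the octal and symbolic mode of a path (GNU stat assumed).
show_mode() {
    stat -c '%a %A' "$1"
}

# Try it on a scratch file rather than on /dev/null itself.
scratch=$(mktemp)
chmod 722 "$scratch"
show_mode "$scratch"    # 722 -rwx-w--w-
rm -f "$scratch"
```

The symbolic form shows that group and other get write only, with no read bit.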

That's an excellent suggestion, comatose. I also didn't realize what a huge pain this whole thing was for you, glamiss.

I think you can leave the nice value at 0 (the default); nice only changes scheduling priority, it doesn't make the loop do less work. I also wouldn't recommend rewriting the shell script in another language, since a shell script is, essentially, just using the shell instead of adding another layer of application on top.

As I mentioned, and I agree with comatose here, I would look for a solution that breaks whatever is causing the problem. It might be easier to pick out the culprit when it's not working properly (if it's a proper program, it may even complain) than to try to find it amid a ps-sea of random procs.

Best of luck to you!

Mike
