100% Pure Java OS Process Control

One of the things that frustrates me to no end is the lack of real process management facilities in Java. Sure, it’s a hard thing to do cross platform but for the love of “Bob” why on earth can’t you at least provide adequate models for the stuff that has existed since – I don’t know – THE BEGINNING OF TIME?
Okay, okay. Slack.
One of the fallouts of the lack of apparent support for the management of underlying operating system processes is that you now get C based management forced down your throat. This inherently sucks because now you have platform based dependencies and – even worse – you invariably have the people doing this who are, shall we say, non-OO experts (that is, C++ people who understand Java syntax only because it’s so “C-like”). Having railed against this issue numerous times and gotten nowhere, I finally have had the need to actually produce a 100% pure Java OS process management framework for the project I’m currently working on.
One of the things that never ceases to amaze me about the architecture and design of new systems is the shedding of preconceptions that happens once you finally get sucked into the system after batting it about from a distance while you’re sizing it up. It’s something I dearly love about this profession, in that the process is a transformation of your mindset as well as an act of creation.

Previously, the problem in my mind has always been getting around the sucky java.lang.Process model. Lots of excellent people’s time has been applied to this issue and pretty much the best thing that has come out of it is java.lang.ProcessBuilder, which is very similar to the model that the fine folks working on the Ant project came up with. But, again, this doesn’t solve the actual problem I have, which is process control, not configuration.
The problem – as the C folks are wont to remind me – is that the semantics around java.lang.Runtime.exec aren’t consistent between platforms. Worse, the mechanisms used to exec the process are spotty and leave a lot to be desired. Also, once the parent Java process dies, the java.lang.Process model is useless. So, if you need to manage processes beyond the life of the process which created them, you’re simply SOL. Combine this with all the subtlety of inheriting handles and such, even on the mainstream platforms, and you are simply led to the conclusion that you must do this in C.
Fair enough. However, I started wondering what exactly the requirements are here. I started to look at what we were doing internally with process control and saw that we’re not doing anything fancy whatsoever. Inherited channels? Pshaw! There really isn’t a requirement for this. Sure, it’d be nice to do what inetd does, but since I really don’t need to replace inetd it’s not really something that needs consideration. Seriously, one of the important lessons I learned early on is that while we can do anything, the problem is really that we want to do something specific. As computer programmers, we tend to overgeneralize and do so at the very beginning. In this case, the trick is to figure out what you need to do and forget about all the stuff that you don’t.
So, forget about handle inheritance – it’s a red herring. What’s left? Basically, we need to manage the process life cycle independent of the life cycle of the process which created it. We also need to manage the two channels we can’t ignore – i.e. STD OUT and STD ERR.
If we scope the problem down to this level, things already are starting to look up. The only real sticky issue here seems to be STD IN – how on earth are we going to manage that? Again, I looked around trying to find a use case for STD IN and I couldn’t find one. Yes, it would be nice, but since I couldn’t find any actual use cases, I’m just throwing this functionality under the bus. [Clarification: note that this doesn’t really mean that we can’t effectively deal with STD IN. The functionality here is based around the generic requirement for managing processes over which we have no control (i.e. we cannot change). Creating a JavaProcess abstraction above the mechanism described here is straightforward and once you have that, wrapping any existing Java program such that you can set up STD IN (as well as any other thing you may happen to need to do) is quite trivial.]
So now we simply have:

  • Life cycle management independent of creating process
  • Handling STD OUT
  • Handling STD ERR

Which seems like it could actually be doable.
My first flurry of activity now focused on obtaining the PID of the process. Oddly enough, in java.lang.UNIXProcess, you can actually get the PID of the process. Using reflection, you can even do this without any help from Sun whatsoever. But as I was basking in the reflection of a newly discovered clever hack, it became pretty obvious that while I now had the PID, I still had the issues with STD OUT and STD ERR to deal with, not to mention all the slings and arrows I would have to withstand regarding the aforementioned spotty mechanisms which fork the process used in Runtime.exec – issues that would be there regardless of how clever I was about life cycle management.
This got me to thinking and I created a few prototypes and in one of my many Google searches I found that under the Bourne shell, one can get the PID of the last background process using the variable $!. Suddenly I had a way to completely divorce the issue of process creation from Runtime.exec. What I would do is generate a script which would then be executed – of course – by Runtime.exec. But the process I was interested in – i.e. the process I wanted to manage – would be a process forked from the process running this script. Anyone that argues that the shell fork is screwed up and “spotty” needs to have their reality distortion field checked. It’s used all over the industry, and pretty much all of the infrastructure in the world relies on it.
So now things quickly fell into place. The generated script, under any platform which supports the Bourne shell, is (in template form):

exec 1> ${std.out}
exec 2> ${std.err}
nohup ${command} &
echo $! > ${pid.file}

Note that I’m not actually using environment variables in the script, rather these ${…} variables are used as in Ant for macro expansion.
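As a rough illustration of that macro expansion (not the framework's actual code), here is a minimal Java sketch. The class name `LaunchScriptTemplate` and the `expand` method are made up for this example; only the four template lines and the `${…}` convention come from the text above.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LaunchScriptTemplate {
    // The four-line template from above; "$!" is left alone for the shell to expand.
    static final String TEMPLATE =
            "exec 1> ${std.out}\n"
          + "exec 2> ${std.err}\n"
          + "nohup ${command} &\n"
          + "echo $! > ${pid.file}\n";

    // Matches ${name}. A bare "$!" has no braces, so it is never touched.
    private static final Pattern MACRO = Pattern.compile("\\$\\{([^}]+)\\}");

    /** Replaces every ${name} in the template with the matching entry from values. */
    static String expand(String template, Map<String, String> values) {
        Matcher m = MACRO.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = values.get(m.group(1));
            if (value == null) {
                throw new IllegalArgumentException("no value for ${" + m.group(1) + "}");
            }
            // quoteReplacement keeps any "$" or "\" inside the value literal.
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

Expanding the template with `std.out=/tmp/app.out`, `std.err=/tmp/app.err`, `command=sleep 60` and `pid.file=/tmp/app.pid` yields a four-line script ending in `echo $! > /tmp/app.pid`. The sketch does no shell quoting, so paths or arguments containing spaces would need to be quoted by the caller.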
After this, it was pretty easy to build up a nice framework which uses the ProcessBuilder and all the nice machinery existing in the JDK to generate and execute this script. Killing the process is pretty easy to do now that I have the PID. There’s a bit of command execution and output parsing from things like ps to determine status. For example, when killing a process, I first try “kill -2” so that the process is given a chance to shut down. If, after the parameterized wait time for an orderly shutdown, it still hasn’t terminated according to ps, I then terminate it with extreme prejudice – i.e. “kill -9”.
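The launch and the polite-then-forceful kill described here can be sketched roughly as follows. This is an illustration under assumptions, not the real framework: the names `UnixProcessControl`, `CommandRunner`, `launch`, `isAlive` and `stop` are invented, Java 9 or later is assumed (for `ProcessBuilder.Redirect.DISCARD`), and `ps -p <pid>` is assumed to exit non-zero once the process is gone.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class UnixProcessControl {
    /** Runs a command and returns its exit status. Swappable, so the logic can be tried without real processes. */
    interface CommandRunner {
        int run(String... command);
    }

    /** Default runner: ProcessBuilder, with the command's own output thrown away. */
    static final CommandRunner REAL = command -> {
        try {
            return new ProcessBuilder(command)
                    .redirectErrorStream(true)
                    .redirectOutput(ProcessBuilder.Redirect.DISCARD)
                    .start()
                    .waitFor();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while running " + command[0], e);
        }
    };

    /** Writes the already-expanded script to disk, runs it with sh, and returns the PID it recorded. */
    static long launch(CommandRunner runner, String script, Path scriptFile, Path pidFile) {
        try {
            Files.write(scriptFile, script.getBytes(StandardCharsets.UTF_8));
            int status = runner.run("sh", scriptFile.toString());
            if (status != 0) {
                throw new IllegalStateException("launch script exited with status " + status);
            }
            String pid = new String(Files.readAllBytes(pidFile), StandardCharsets.UTF_8).trim();
            return Long.parseLong(pid);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Assumption: "ps -p <pid>" exits with 0 only while the process exists. */
    static boolean isAlive(CommandRunner runner, long pid) {
        return runner.run("ps", "-p", Long.toString(pid)) == 0;
    }

    /** Sends kill -2, polls ps until graceMillis has passed, then falls back to kill -9. Returns true if no kill -9 was needed. */
    static boolean stop(CommandRunner runner, long pid, long graceMillis, long pollMillis) {
        String id = Long.toString(pid);
        runner.run("kill", "-2", id);
        long deadline = System.nanoTime() + graceMillis * 1_000_000L;
        while (isAlive(runner, pid)) {
            if (System.nanoTime() - deadline >= 0) {
                runner.run("kill", "-9", id);
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for " + id, e);
            }
        }
        return true;
    }
}
```

A call such as `UnixProcessControl.stop(UnixProcessControl.REAL, pid, 5000, 100)` would give the process five seconds before escalating. One caveat worth checking on your own platform: POSIX shells start background commands with SIGINT ignored when job control is off, so a process launched by the script may never react to `kill -2` unless it installs its own handler; `kill -15` may turn out to be the more dependable polite request.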
Dealing with STD OUT and STD ERR is simply done by redirecting them to files. Reading this output from within Java is the straightforward creation of InputStreams for these files. True, there is the issue that the files simply grow without limit – I haven’t found a way to truncate them – but again, I’m erring on the practical side. One simply shouldn’t be dumping reams of crap to STD OUT and STD ERR. One should be using what everyone in the 21st century uses – something called logging. Anything that does manage to get to STD OUT and STD ERR should be stuff that slips through the cracks. Maybe this will be an issue in the end and I’ll have to invent something to work around it. But given that the entire framework is script based, it really won’t be an actual problem because, worst case, I’ll have to expand and modify how I redirect these output streams in the script. No one should be able to argue that this isn’t possible, and it should definitely be straightforward.
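Reading the redirected output might look something like the sketch below, which remembers how many bytes it has already returned so that repeated calls only hand back what the managed process has written since. The `OutputTail` class is hypothetical, Java 9 or later is assumed (for `InputStream.readAllBytes`), and the output is assumed to be UTF-8.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class OutputTail {
    private final Path file;
    private long offset; // number of bytes already handed to the caller

    OutputTail(Path file) {
        this.file = file;
    }

    /** Returns whatever has been appended since the previous call, or "" if there is nothing new. */
    String readNew() {
        if (!Files.exists(file)) {
            return ""; // the process has not been launched yet, or the file was cleaned up
        }
        try (InputStream in = Files.newInputStream(file)) {
            long toSkip = offset;
            while (toSkip > 0) {
                long skipped = in.skip(toSkip);
                if (skipped <= 0) {
                    return ""; // file is shorter than expected, e.g. it was replaced
                }
                toSkip -= skipped;
            }
            byte[] fresh = in.readAllBytes();
            offset += fresh.length;
            return new String(fresh, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because it works purely on the file, any process that knows the path can read the output, not just the one that launched the script. A multi-byte character that happens to be split across two reads would be decoded incorrectly, so a production version would want to decode incrementally.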
Once you start fleshing out this model, it’s also nice to add some features, from the banal – i.e. “is the process active” – to the more interesting – i.e. “what are the stats of the process”, such as CPU, memory and open handles. And since the only thing that is kept about the process is the PID, I can transfer this between processes and keep it in a persistent store, so I can manage these processes from processes other than the one which created them in the first place – even getting STD OUT and STD ERR is trivially accomplished.
Now, you’re probably saying to yourself “Hal, this is wonderful stuff. I see how you can do this for any modern operating system that supports the Bourne shell but I see there’s a fly in the ointment that you have failed to address.” That elephant in the room is, of course, Microsoft Windows.
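For the “stats of the process” case, one plausible approach is to ask ps for specific columns and parse the single line it prints. The sketch below is an assumption-laden example rather than the framework's code: `ProcessStats` is an invented class, Java 9 or later is assumed, and it uses the `pcpu` and `vsz` columns because those are among the format names POSIX defines for `ps -o`.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

final class ProcessStats {
    final double cpuPercent;  // the ps "pcpu" column
    final long virtualSizeKb; // the ps "vsz" column, in 1024-byte units

    ProcessStats(double cpuPercent, long virtualSizeKb) {
        this.cpuPercent = cpuPercent;
        this.virtualSizeKb = virtualSizeKb;
    }

    /** Parses one header-less line of "ps -o pcpu= -o vsz=" output, e.g. "  0.3 123456". */
    static ProcessStats parse(String psOutput) {
        String[] columns = psOutput.trim().split("\\s+");
        if (columns.length != 2) {
            throw new IllegalArgumentException("unexpected ps output: " + psOutput);
        }
        return new ProcessStats(Double.parseDouble(columns[0]), Long.parseLong(columns[1]));
    }

    /** Asks ps about the PID; returns empty if ps reports no such process. */
    static Optional<ProcessStats> query(long pid) {
        ProcessBuilder pb = new ProcessBuilder(
                "ps", "-o", "pcpu=", "-o", "vsz=", "-p", Long.toString(pid));
        pb.environment().put("LC_ALL", "C"); // keep number formatting locale-independent
        pb.redirectErrorStream(true);
        try {
            Process ps = pb.start();
            String output = new String(ps.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
            boolean found = ps.waitFor() == 0 && !output.trim().isEmpty();
            return found ? Optional.of(parse(output)) : Optional.empty();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while querying pid " + pid, e);
        }
    }
}
```

Since `query` needs nothing but the PID, the number read back from the PID file (or from whatever persistent store it was copied to) is enough for any other process to call it.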
Well, luckily, and something that shouldn’t really be a surprise, the analogs of all the mechanisms used above exist in the Windows environment as well. For example, a few seconds of Googling turned up this fantastic post documenting quite clearly how to accomplish everything I’ve done under god’s platform, Unix, under that bastard operating system spawned from Satan himself.
Thus, the only real challenge in the end is coming up with a nice framework that lets you share as much functionality as possible between the various platform-specific implementations, and lets those implementations provide the concrete mechanisms for the manipulation.
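One way such a split could look, purely as a sketch: the shared logic (waiting, escalating, persisting the PID) is written once against a small interface, and each platform only supplies the command lines. Everything named here (`PlatformCommands`, `UnixCommands`, `WindowsCommands`, `forOs`) is hypothetical. The Windows side uses `taskkill` and `tasklist`, which are real utilities, but choosing them as the analogs is an assumption of this example rather than something taken from the post mentioned above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** The platform-specific part: just the command lines. The shared logic lives elsewhere. */
interface PlatformCommands {
    List<String> requestShutdown(long pid);
    List<String> forceKill(long pid);
    List<String> probe(long pid);

    /** Picks an implementation from an os.name style string such as "Linux" or "Windows 10". */
    static PlatformCommands forOs(String osName) {
        boolean windows = osName.toLowerCase(Locale.ROOT).startsWith("windows");
        return windows ? new WindowsCommands() : new UnixCommands();
    }
}

final class UnixCommands implements PlatformCommands {
    public List<String> requestShutdown(long pid) {
        return Arrays.asList("kill", "-2", Long.toString(pid));
    }
    public List<String> forceKill(long pid) {
        return Arrays.asList("kill", "-9", Long.toString(pid));
    }
    public List<String> probe(long pid) {
        return Arrays.asList("ps", "-p", Long.toString(pid));
    }
}

final class WindowsCommands implements PlatformCommands {
    // Assumption: taskkill without /F is the closest match to a polite request.
    public List<String> requestShutdown(long pid) {
        return Arrays.asList("taskkill", "/PID", Long.toString(pid));
    }
    public List<String> forceKill(long pid) {
        return Arrays.asList("taskkill", "/F", "/PID", Long.toString(pid));
    }
    // The caller would have to inspect tasklist's output, not just its exit status.
    public List<String> probe(long pid) {
        return Arrays.asList("tasklist", "/FI", "PID eq " + pid);
    }
}
```

With this shape, `PlatformCommands.forOs(System.getProperty("os.name"))` is the only place that needs to know which operating system is underneath, and the lists it returns can be handed straight to a ProcessBuilder.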
I’m pretty sure that this all could have been accomplished seven years ago as scripting is a well established and cherished management technique – i.e. if you can’t do it in scripts then you probably don’t need to do it. And this holds regardless of the OS platform because the same people are managing them all.
Anyways, hope you enjoyed this – even though the actual implementation isn’t posted here. The actual Java implementation is – as they say – just a matter of turning the crank once this basic functionality has been laid out. Straightforward and generally uninteresting once the problem has been solved.
