Tuesday, September 18, 2007

New career time?

My last few weeks have been a hazy blur of working long hours and not getting enough sleep. My workload is rising to panic-inducing levels as a second customer elbows their way into my schedule. My manager wants to move me from a relatively sane customer to a really horrid, demanding one, and while that would give me a lot to draw on for this blog, really I'd like a peaceful life. Honestly. I agree with Pratchett that 'May you live in interesting times' is one of the worst curses I can think of.

One incident stands out. There's a project that I'm not officially part of, but I occasionally get phone calls or emails from those who are, asking for a bit of help. One such phone call came in as I was trying to eat my breakfast at my desk. It seemed that one of the gentlemen down on $project, I'm going to call him Basil, had rendered a system non-booting.

Apparently Basil did this by editing the fstab to add a new mount. Add the entry, reboot the system (I'm not sure why this was necessary) and bang, Unable to mount root fs. While wandering around the office kitchen making a cup of miso and peeling my mandarin, I tried talking him through recovering the system. First there was the drama of appending boot options in grub. Then there was teaching him how to navigate when all he had was the initrd. I thought everyone knew that you could:

echo *

if you don't have a working ls.
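The trick works because the shell expands the glob itself and echo is a builtin, so no external binary is involved. A minimal sketch of the same idea, one name per line, assuming a POSIX-ish shell where echo and [ are builtins (the function name list_dir is made up for illustration):

```shell
# Poor man's ls: the shell expands the glob, echo prints each match.
# list_dir is a hypothetical helper name, not a standard command.
list_dir() {
    dir="${1:-.}"
    for entry in "$dir"/*; do
        # An unmatched glob is left as the literal pattern, so skip
        # anything that does not actually exist (e.g. an empty directory).
        [ -e "$entry" ] || [ -L "$entry" ] || continue
        # Strip the leading directory so only the file name is printed.
        echo "${entry##*/}"
    done
}
```

Something like `list_dir /dev/mapper` would then print each device-mapper node on its own line. A really stripped-down initrd shell may not even have functions, in which case plain `echo /dev/mapper/*` is as good as it gets.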

Around the office people were smirking at my phone conversation, which went something like this:

"Ok, so you mentioned LVM in the boot options so I guess your root filesystem is in LVM? Right, have a look in /dev/mapper to see what you have there. No, we already established that you don't have ls. Right, either try to tab complete or use 'echo *'. e-c-h-o... got it? Yep. Cool. So now let's try to mount your root filesystem. No, you don't have an fstab so you can't just type 'mount /'. You'll need to type mount, then the full path of the device you've found in /dev/mapper, then a mount point.... right, yep, then a mount point.... where are you up to? Ok, now you type a mount point.... ok, just type '/mnt' for me? Ok. Good. Now hit enter. What's wrong? What error does it give you? I understand it's not working but can you please tell me what the mount command printed on your screen?"

"... ah yes, so correct spelling is not optional."
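Written down, the steps from that call look roughly like the sketch below. The device name is made up for illustration (the real one is whatever turns up in /dev/mapper), and the mount line is only printed here rather than run, so pasting this somewhere harmless does nothing:

```shell
# Hypothetical device name: substitute whatever the echo below actually shows.
root_dev=/dev/mapper/VolGroup00-LogVol00
mount_point=/mnt

# Builtin-only listing of the device-mapper nodes (no ls required).
echo /dev/mapper/*

# With no fstab in the initrd, mount needs both the device and the mount
# point spelled out in full. Printed, not executed: drop the leading echo
# to run it for real.
echo mount "$root_dev" "$mount_point"
```

Once the filesystem is mounted, the offending fstab sits under the mount point (here /mnt/etc/fstab) and can be edited from there.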

We eventually got a root filesystem mounted, and he commented out the new mount he'd added to the fstab and managed to get the system booted. To this day I still can't figure out how he broke it, though. He said he'd typoed the name of the mount point, but I just can't see how, unless the typo was / $name, with a space. That would do it.

Basil isn't as stupid as this post makes him sound - he's actually a pretty smart guy, but in completely the wrong role on $project, which gives him plenty of opportunity to look extremely dumb. I think we'll be seeing more of Basil on here before the project is over.

Outage Window

I have a 2-hour outage window. Another company also needs to make changes at the same time, because I have control of half of the thing, and they have control of the other half. It's clear that we have two hours to make the changes. The outage begins, we both make our changes.

When I call to roll back the changes, inside the outage window, they announce they've gone home. And it'll be half an hour by car to get in to undo the changes.

Sometimes it would just be easier to do it all yourself.

Monday, September 3, 2007

One of those weeks.

It's been one of those days where every time I get up to go to the bathroom, I come back to 3 missed calls from 3 different people all wondering why their work isn't done yet. (Hint: It's all the time I spend talking to you on the phone! If you left me alone, think of all the extra time I'd have to do your work in.)

It's taken me 2 days so far to get access to an SSH gateway that allows me (eventually) into a certain customer's environment. For various reasons I need to get a 3GB database dump back to my local machine from a system nested behind 3 layers of NAT and only accessible through a certain chain of about 6 systems by SSH.

After constructing one of the most arcane ssh command lines I have ever seen, I discover that one server in the chain won't let me forward a port.

AllowTcpForwarding no

I think I'm going to burn someone. This particular server in the chain is really causing some grief for me given how ridiculously tightly it's locked down.

-bash: /bin/vi: Operation not permitted

Thanks guys. I really appreciate the way you help me do my job.
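One possible way around a hop with AllowTcpForwarding no is to skip port forwarding entirely and stream the file over the stdout of nested ssh commands. This is only a sketch under a pile of assumptions: every hop has to allow remote command execution and non-interactive authentication (keys or a forwarded agent), and the host names, file path and function name below are all made up. The helper just builds and prints the command so it can be checked before it is run:

```shell
# build_chain is a hypothetical helper: it prints a nested ssh command that
# cats a remote file to stdout through each hop in turn.
# Usage: build_chain REMOTE_FILE HOP1 [HOP2 ...]   (outermost hop first)
build_chain() {
    remote_file=$1
    shift
    cmd=""
    for hop in "$@"; do
        # Each ssh passes the rest of the line to the next host as a command.
        cmd="${cmd}ssh $hop "
    done
    # Simple paths only: names with spaces would need quoting at every hop.
    echo "${cmd}cat $remote_file"
}

# Made-up hosts and path, purely for illustration.
build_chain /tmp/dump.sql.gz gw1 gw2 dbhost
```

Running the printed command with `> dump.sql.gz` on the end would land the file on the local machine, since each ssh simply relays the output of the next one back up the chain.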