Friday, January 20, 2006

The Storm Before the Calm


The myMason Portal, my first project since coming to GMU, goes live on Monday. Everything is in good shape except for a performance problem with campus and personal announcements. That information is written to and read from an Oracle database, and the JDBC connection keeps timing out. We've been playing with configuration settings and db indexes (on an empty table) since October trying to get this issue solved, but today we discovered something so sexy that I may need 5 minutes to myself:
error 604 detected in background process
OPIRIP: Uncaught error 447. Error stack:
ORA-00447: fatal error in background process
ORA-00604: error occurred at recursive SQL level 1
ORA-00904: "DBMS_AQADM_SYSCALLS"."GET_OWNER_INSTANCE": invalid identifier

Unless you're an Oracle DBA, you're looking at that error and you say: "double-u tee eff? I have no clue what the fuck that means." It means that your database has Full Blown AIDS, and if you don't put on a latex jumpsuit it's going to share. ORA-604 is the error message no DBA ever wants to see.

How did it get AIDS, you might ask? Did it open up its legs and let every other little program in the building take its turn, trying to make a giant binary cream pie? No, think 80s blood transfusion -- our wonderful, hardworking database administrators applied a software update to Oracle but didn't read the directions. You're probably thinking "I never read the directions either," but I'm betting you would if half-a-million dollars in software, on hardware almost as expensive, gave you post-installation instructions.

The technote for this problem reads like this:

Cause

You applied a patchset and did not run Catpatch.sql on the database
The versions reported in dba_registry should be in sync
with the version of software being used.

Fix

Follow the post-install instructions in the patchset readme.

Yes, please follow the post-install instructions. If I take my car to Jiffy Lube, and they drain my oil but forget to put oil back in, that's not cool.
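For the curious, the check the technote describes (registry versions in sync with the software version) can be sketched in a few lines of Python. This is only a rough sketch under my own assumptions: the rows are presumed to come from something like SELECT comp_id, version, status FROM dba_registry, and the names RegistryRow and out_of_sync, along with the sample version numbers, are made up for illustration and are not from the technote or from our actual system.

```python
from typing import Iterable, List, NamedTuple


class RegistryRow(NamedTuple):
    """One row of dba_registry (hypothetical container, not an Oracle API)."""
    comp_id: str
    version: str
    status: str


def out_of_sync(rows: Iterable[RegistryRow], software_version: str) -> List[RegistryRow]:
    """Return the components whose registry version differs from the installed software."""
    return [row for row in rows if row.version != software_version]


# Made-up sample data: one component still reports the pre-patch version.
sample_rows = [
    RegistryRow("CATALOG", "9.2.0.4.0", "VALID"),
    RegistryRow("CATPROC", "9.2.0.6.0", "VALID"),
]

for row in out_of_sync(sample_rows, "9.2.0.6.0"):
    print(f"{row.comp_id} is at {row.version}, software is at 9.2.0.6.0")
```

Anything this flags is a hint that the post-install script probably never ran, which is exactly the situation the technote warns about.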

!!!

A fitting end for a short week with a humorless beginning. Apparently every server we have crashed over the long weekend. Why? Well, if you have several hundred servers hooked to a giant uninterruptible power supply, all hooked to a generator, how much run time do you need on the UPS? Well, if the generator was tested in the past six months, and you knew it was busted, probably more than 5 minutes.

On a more positive note, I finished my Luminis User Management system and documented it in a two-part series on my Luminis Development blog. | Part 1 | Part 2 |

2 comments:

Anonymous said...

check it, yo:

http://sites.gizoogle.com/index2.php?url=http%3A%2F%2Fthealphajohn.blogspot.com%2F

Anonymous said...

What I think you really need is a process to manage, and than you need some machine generated COBOL to run everything.....and than, and ONLY than can you go get yourself a pandadoodle