Oracle DB: Oracle Archive Log Full – ORA-00257

ORA-00257: archiver error. Connect internal only, until freed.

If you are loading large volumes of data and you have archive logging turned on, you will sometimes run into the ORA-00257 error.  This error occurs when the space dedicated to archive logs fills up.  You need to clean up some or all of the logs in order to set the database free to work again.

A relatively straightforward way to do this is to use the RMAN tool:

. oraenv (set your SID/ORACLE_HOME)
rman target /
crosscheck archivelog all;  (you should see all the logs)
delete archivelog all; (removes all the logs)
crosscheck archivelog all; (you should see no logs)
exit;
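
Before deleting anything, it can be worth confirming that the archive destination really is full. A minimal sketch in SQL*Plus, assuming the database archives to the fast recovery area (view and column names as of 11g; older releases use v$flash_recovery_area_usage):

show parameter db_recovery_file_dest

select file_type, percent_space_used, percent_space_reclaimable, number_of_files
from v$recovery_area_usage;

If the archive log rows show a high percent_space_used, the RMAN cleanup above will free the space.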

Now you should be able to access the database again without the ORA-00257 error.  Sometimes it may be necessary to turn archive logging off during these large loads; see the post on turning off archive logging, and the sketch after the note below.

Note: When in ARCHIVELOG mode, the database makes copies of all online redo logs after they are filled. These copies are called archived redo logs, and they are created by Oracle's archiver process (ARCn), which copies the filled redo log files to one or more archive log destination directories. Using this mode requires additional setup, primarily the archive log destination directories. In non-production environments there can be some major downsides to running in ARCHIVELOG mode.  The most common error you will encounter with archive logging turned on is the ORA-00257 error.  It should also be noted that absolutely nothing will process within the system until the archive logs are freed.  This can be a serious issue for an environment during maintenance, upgrades, or large data loads.  In those scenarios it is recommended to back up the environment, turn archive logging off for the duration of the high-traffic work, then back up again and turn archive logging back on if needed.
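
For reference, toggling ARCHIVELOG mode requires bouncing the database. A minimal SQL*Plus sketch of the off-then-on sequence described above (as always, take your backups before and after):

shutdown immediate;
startup mount;
alter database noarchivelog;
alter database open;

-- ... run the upgrade / large data load ...

shutdown immediate;
startup mount;
alter database archivelog;
alter database open;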

Renaming Tables (Dealing with Duplicates)

Sometimes it is helpful to be able to rename a table in PeopleSoft. When dealing with duplicates in tables during upgrades, for example, you can rename the table to a temporary name, re-create the table in question, and then simply insert into the new table from the temporary one using a select distinct statement (a sketch of the full workflow follows the rename syntax below).

SQL SERVER:
sp_rename 'TABLENAME', 'NEWTABLENAME'

ORACLE:
alter table tablename rename to newtablename
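
Putting it together, a minimal Oracle sketch of the de-duplication workflow described above; ps_sometable is a hypothetical table name, and in practice the re-create step would be an Application Designer build of the record:

-- park the table containing duplicates under a temporary name
alter table ps_sometable rename to ps_sometable_tmp;

-- re-create ps_sometable here (e.g. run the Application Designer build script)

-- copy the rows back, collapsing exact duplicates
insert into ps_sometable
select distinct * from ps_sometable_tmp;

-- once the data is verified, drop the temporary copy
drop table ps_sometable_tmp;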

Missing or invalid version of SQL library PSORA

This error is just one of those that drives you crazy. PeopleSoft has been making the move from a 32-bit application to a 64-bit application. Almost everywhere you will find that the application runs in 64-bit mode, except on Windows: the client application is still a 32-bit client. So if you are using Oracle, for example, you need to ensure that the Oracle client software on the Windows machine is 32-bit. The same is true for Tuxedo; when downloading patches, make sure that you download the 32-bit patches.

So if you encounter this error, make sure that the 32-bit libraries are installed and are what the client is actually referring to, the DBBIN variable specifically, and if necessary add the 32-bit client binary path to the Add to PATH variable in the configuration.
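
For illustration only, a hypothetical configuration snippet pointing the client at a 32-bit Oracle home; the path is an assumption, while DBBIN and Add to PATH are the settings named above (exact placement varies by PeopleTools release):

; point DBBIN at the 32-bit Oracle client binaries (hypothetical path)
DBBIN=C:\oracle\product\client32\bin

; ensure the 32-bit client DLLs are found first on the search path
Add to PATH=C:\oracle\product\client32\bin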

Runaway Queries

When PeopleSoft runs a SQL query that runs longer than the service timeout for the service operation, the service will terminate its connection to that query; however, the query may stay running on the database. This can result in long-running queries draining valuable CPU and memory from the database server. Oracle has the ability to check for "dead" connections and terminate them.

Adding the parameter sqlnet.expire_time to the sqlnet.ora file housed on your database server (in the $ORACLE_HOME/network/admin directory) will send a probe to verify that connections are active rather than dead due to a client termination. If a connection is determined to be terminated or no longer in use, an error is returned, which causes the server process to exit. The parameter is specified as a time interval in minutes, and PeopleSoft recommends a 10-minute interval.

sqlnet.expire_time=10

One consideration: you must be using TCP/IP connections for this to work, so if you are using the parameter UseLocalOracleDB=1 in your configurations (a local connection rather than SQL*Net over TCP/IP), dead connection detection does not apply. Most clients use TCP/IP, so this should not be an issue. The probes also put some additional overhead on the server; however, if you have large, long-running SQL queries against the database, the performance hit should be offset by the gains from killing the unnecessary (dead) connections.
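
If a runaway query is already chewing up the database, a sketch for finding and killing it manually (the one-hour threshold is an arbitrary example):

-- active user sessions whose current call has been running for over an hour
select sid, serial#, username, status, last_call_et
from v$session
where status = 'ACTIVE'
and type = 'USER'
and last_call_et > 3600;

-- terminate a specific offender, substituting the sid and serial# found above
alter system kill session 'sid,serial#' immediate;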

Missing Unique Index Producing Slow Performance

I ran into an odd situation yesterday.  I got handed a performance problem that had been going on for months: the developers were blaming the database people, the database people were producing useless reports and statistics, and around and around it went.  All the end user wanted to do was view a journal entry in the finance system from the web tier.

Well, my initial thought was that the journal line table was really large and there might be a need for another index.  The journal line table had 1.1 million rows, and the row count SQL (select count(1) from ps_jrnl_ln) was taking a really long time to return, as was a count restricted by the first two key fields to a specific journal (select count(1) from ps_jrnl_ln where business_unit = 'SHARE' and journal_id = 'PAY1234').  The RED flag is waving madly at this point. So here is what I did:

1) Opened the record JRNL_LN in Application Designer and ran an alter table build (with the "alter even if no changes" flag on).  The build produced the following set of errors:

ORA-12801: error signaled in parallel query server xyz
ORA-01452: cannot CREATE UNIQUE INDEX; duplicate keys found

How in the world did that happen?  Good question. Can this really happen?  Not under normal operation, but it has happened, so how do we fix it:

select business_unit, journal_id, journal_date, unpost_seq, journal_line, ledger, count(1)
from ps_jrnl_ln
group by business_unit, journal_id, journal_date, unpost_seq, journal_line, ledger
having count(1) > 1;

This identified 31 rows having a count greater than 1.  I looked at a couple and found they were completely identical rows. So, how to get rid of them?  I used the data returned by the previous select and built an export script in Data Mover that exported all the duplicate rows to a data file.

set log l:\data\logs\export_bad_data.log;
set output l:\data\bad_data.dat;
export jrnl_ln where business_unit = 'xyz' and journal_id = '123';
rem ..... (fill in all the key fields from the above select);
export jrnl_ln where business_unit = 'xyz' and journal_id = '128';
rem ..... so on until all the rows are exported;

Now, I exited Data Mover and went back in using bootstrap mode. I imported the data using the IGNORE_DUPS setting and the AS command, to import the data into a temporary table.

set log l:\data\logs\import_bad_data.log;
set input l:\data\bad_data.dat;
set ignore_dups;
import jrnl_ln as temp_jrnl_ln;

Now I had the 31 unique rows in a temporary table.  I deleted all the duplicate rows from the jrnl_ln table, turning the results of the first select into delete statements keyed on all six key fields; this deleted a total of 62 rows.  I then did an insert from the temp table, which put the 31 unique rows back into jrnl_ln, and dropped the temp table. A sketch of those last steps follows.
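
For illustration, a sketch of those cleanup statements, assuming the imported copy landed in a table named ps_temp_jrnl_ln (the exact name depends on how Data Mover names the AS target) and using placeholder key values:

-- one delete per duplicate set, keyed on all six key fields (values are placeholders)
delete from ps_jrnl_ln
where business_unit = 'xyz' and journal_id = '123'
and journal_date = to_date('2015-01-31', 'YYYY-MM-DD')
and unpost_seq = 0 and journal_line = 1 and ledger = 'ACTUALS';

-- put the 31 unique rows back
insert into ps_jrnl_ln select * from ps_temp_jrnl_ln;
commit;

-- clean up the temporary table
drop table ps_temp_jrnl_ln;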

I rebuilt the index, and it returned success with no warnings or errors.  On the web, the journal in question now opens in a couple of seconds.  Problem solved!