DBD: Data Replication System - Maintenance and Error Handling
1: Guide to the Data Replication log file
The Data Replication log files are automatically written to the DB Distributor data directory "/usr/lib/pvx/tf2d" on your DB Distributor server. In that directory, you will find files with date-stamped names like "DR2SRV1.20150307" (the log file for March 7, 2015).
A new log is automatically created every day by each running Data Replication server process. You may see two very similarly named files: "DR2SRV1.20150307" and "DR2SRV2.20150307" (notice the single digit change after "DR2SRV"). This indicates that two separate Data Replication server processes are running for the same Data Replication queue. The log file name is automatically adjusted to indicate the first server process ("DR2SRV1"), the second server process ("DR2SRV2"), and so on.
The format for log file names is as follows:
- “DR2SRV” +
- Server number +
- “.” +
- 4-digit Year +
- 2-digit Month +
- 2-digit Day
Example: DR2SRV2.20150307
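The naming rule above can be sketched in a few lines of Python, which may be handy when scripting against the log directory. This is only an illustration: the helper names log_name and parse_log_name are hypothetical and are not part of DB Distributor, and the pattern assumes the server number is one or more digits.

```python
import re
from datetime import date, datetime

# Pattern inferred from the naming rule: "DR2SRV" + server number + "." + YYYYMMDD.
# Assumption: the server number is one or more decimal digits.
LOG_NAME = re.compile(r"DR2SRV(\d+)\.(\d{8})")


def log_name(server, day):
    """Build the expected log file name for a server number and a date."""
    return f"DR2SRV{server}.{day:%Y%m%d}"


def parse_log_name(name):
    """Return (server_number, date) for a log file name, or None if it does not fit."""
    m = LOG_NAME.fullmatch(name)
    if m is None:
        return None
    try:
        day = datetime.strptime(m.group(2), "%Y%m%d").date()
    except ValueError:
        return None  # eight digits that do not form a real calendar date
    return int(m.group(1)), day


if __name__ == "__main__":
    print(log_name(2, date(2015, 3, 7)))
    print(parse_log_name("DR2SRV1.20150307"))
```

Names that do not follow the rule, such as a file with a seven-digit date, are reported as None rather than raising an error, so the helper can be run over every file in the directory.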
The following example illustrates entries in a typical Data Replication log file (no errors are shown):
Each log entry records a single replication transaction. As shown, normal log entries specify the table name involved in the transaction, the date and time, whether it was a WRITE or DELETE action, the table key, an error field (filled in if an error occurred), and a running count of the records logged since the last server restart.
Sometimes a log record will show an error. The example below illustrates the information associated with a write error during an SQL update operation. Pay special attention to this useful information shown in the log:
- timestamp of the occurrence
- an error message and the associated error code
- the related SQL statement
Several other types of error are illustrated below. This log file excerpt shows a number of different instances of “Date Format Errors” that occurred at different times and in various tables (the first green arrow). A date format error might occur when there is a problem storing a date/time field, perhaps because a Date/Time function in the SQL statement returned a date type, but the table column expects a string type.
The errors “Error on open of output port: 15” (the second green arrow) indicate that the database connection failed in some way. Connection failures can occur for several reasons: the database server itself might have been offline; or the database table simply doesn’t exist because it was never created; or the user/password credentials may have been incorrect.
2: Database Connection Errors
When the Data Replication processes can't connect to the database for a particular target, the cause is often a simple syntax problem or typographic error in the database connection string that was specified when the Target was originally defined.
Review your database connection parameters very carefully.
Don't assume the connection information is valid – TEST IT!
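As a first pass before a real connection test, a connection string can be checked mechanically for the kind of typing slips described above. The sketch below assumes the string is a semicolon-separated list of KEY=value pairs (a common ODBC-style layout); the actual format of your Target's connection string may differ. The function name lint_connection_string and the sample keys DSN, UID and PWD are hypothetical. It does not replace an actual test connection to the database.

```python
def lint_connection_string(conn):
    """Return a list of problems found; an empty list means none were detected.

    Assumption: the string is made of 'KEY=value' pairs separated by ';'.
    """
    problems = []
    seen = set()
    for part in conn.split(";"):
        text = part.strip()
        if not text:
            continue  # tolerate a trailing or doubled semicolon
        key, sep, value = text.partition("=")
        key = key.strip()
        if not sep:
            problems.append(f"missing '=' in {text!r}")
        elif not key:
            problems.append(f"missing key before '=' in {text!r}")
        elif not value.strip():
            problems.append(f"empty value for {key!r}")
        elif key.upper() in seen:
            problems.append(f"duplicate key {key!r}")
        seen.add(key.upper())
    return problems


if __name__ == "__main__":
    # Sample keys are illustrative only.
    for problem in lint_connection_string("DSN=sales;UID;PWD="):
        print(problem)
```

A clean result only means the string is well formed. A misspelled server name or a wrong password will still pass this check, so follow it with a live connection attempt.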
3: IDLE messages during a Data Replication session
While executing, the Data Replication server program DR2SRV normally displays an IDLE message to the screen once every sixty seconds. The clock that tracks idle time is reset every time a record is replicated, and the message is triggered whenever the clock ticks past sixty seconds. The clock is then reset and starts tracking the server idle time again.
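The idle-clock behavior described above can be modeled in a few lines of Python. This is a sketch of the described logic only, not the DR2SRV source; the class name IdleClock and its methods are hypothetical, and the clock function is passed in so the behavior can be exercised without waiting a real minute.

```python
import time


class IdleClock:
    """Model of the idle timer: reset on every replicated record, fire after the interval."""

    def __init__(self, interval=60.0, now=time.monotonic):
        self.interval = interval  # seconds of inactivity before an IDLE message
        self.now = now            # injectable clock, defaults to a monotonic timer
        self.started = now()

    def record_replicated(self):
        """A record was replicated, so idle time starts over."""
        self.started = self.now()

    def poll(self):
        """Return True when an IDLE message is due, and restart the clock if so."""
        if self.now() - self.started > self.interval:
            self.started = self.now()
            return True
        return False


if __name__ == "__main__":
    fake_time = [0.0]
    clock = IdleClock(now=lambda: fake_time[0])
    fake_time[0] = 61.0
    if clock.poll():
        print("IDLE")
```

In this model a busy server never reports IDLE, because each replicated record pushes the start of the idle period forward.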
The screenshot shown below is from a live data replication server session running in a Providex/WindX/telnet session. Note that the IDLE messages are only written to the console, not to the log file.
The IDLE message helps to maintain a steady flow of TCP packets between the Providex server and the Providex client processes. This helps to avoid the inactivity timeouts that can otherwise terminate the connection when the Data Replication process is run from a remote connection across a firewall or via a VPN.
An example of "IDLE" messages displayed on the screen is shown below:
4: Required Maintenance
Data Replication will normally continue to work silently with very few issues, but you should still perform the following maintenance chores on the Data Replication servers and server log files:
- Periodically examine the log files for errors
- Periodically remove/rotate log files
- Verify that Data Replication service processes are still running
- Review the length of the queue for each Data Replication server process
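The first two chores in the list above lend themselves to a small script. The sketch below is one possible approach under stated assumptions: log files follow the "DR2SRV" + server number + "." + YYYYMMDD naming rule, and error entries contain the word "error" in some letter case (as in "Error on open of output port: 15"). The function names logs_older_than and error_lines are hypothetical. The script only lists candidates; it does not delete anything.

```python
import re
from datetime import date, datetime, timedelta
from pathlib import Path

# Assumption: log names are "DR2SRV" + server number + "." + YYYYMMDD.
LOG_NAME = re.compile(r"DR2SRV\d+\.(\d{8})")


def logs_older_than(log_dir, keep_days, today=None):
    """List log files whose date stamp is more than keep_days before today."""
    today = today or date.today()
    cutoff = today - timedelta(days=keep_days)
    old = []
    for path in sorted(Path(log_dir).iterdir()):
        m = LOG_NAME.fullmatch(path.name)
        if m is None:
            continue  # not a Data Replication log file
        try:
            day = datetime.strptime(m.group(1), "%Y%m%d").date()
        except ValueError:
            continue  # digits that are not a real calendar date
        if day < cutoff:
            old.append(path)
    return old


def error_lines(log_path):
    """Return the lines of a log file that mention 'error' (case-insensitive)."""
    with open(log_path, errors="replace") as fh:
        return [line.rstrip("\n") for line in fh if "error" in line.lower()]


if __name__ == "__main__":
    log_dir = "/usr/lib/pvx/tf2d"  # data directory named earlier in this document
    for path in logs_older_than(log_dir, keep_days=30):
        print("candidate for removal:", path)
```

If normal entries also print the word "error" as a label for an empty error field, the simple text match will over-report, and the filter should be tightened to match the actual log layout.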