Create a Fast-Start Failover Observer Service on Linux

January 30, 2023 Sean Scott

Oracle Data Guard protects critical database environments with physical standbys: exact copies of production databases. Business continuity and disaster recovery then become a matter of switching or failing over to the standby database when the primary system goes offline.

Fast-Start Failover (FSFO) automates these activities. FSFO monitors participants in Data Guard configurations and, when it detects that the primary is unavailable, fails over to the standby automatically. It’s more responsive and dependable than manual intervention: the time it takes for a DBA to acknowledge a page, log in to the environment, assess the situation, and decide to fail over represents lost revenue and productivity. As an automated solution, FSFO also scales better across multiple databases.

The Observer is the critical component in an FSFO solution. It’s responsible for monitoring the environment, detecting events, and triggering a failover. The Observer is really just a Data Guard Broker client session that connects to the Data Guard configuration and reads its status. Ideally, the Observer (or, better still, multiple Observers) runs on a dedicated machine, not on the primary or standby database host.

A typical Observer setup involves running a script that starts a Data Guard Broker client session, connects to the primary database, and starts the Observer. The process runs in the background (in older versions, the command never returned control to the caller). But the Observer is monitoring mission-critical databases, so there should be some intelligence built around it. A lot can go wrong with the Observer. How do you know if it’s still running properly? If it stops, what’s the mechanism for restarting it?
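As a rough illustration of that kind of startup script, here is a minimal sketch. The client home path, the `primary_db` connect alias, and the `run` argument convention are assumptions for illustration. The `/@alias` form assumes credentials are stored in an Oracle wallet.

```shell
#!/bin/sh
# Sketch of an Observer startup wrapper (hypothetical name: start_observer.sh).
# ORACLE_HOME and the connect alias below are assumptions; adjust for your site.
ORACLE_HOME="${ORACLE_HOME:-/u01/app/oracle/product/19.0.0/client_1}"
DG_CONNECT="${DG_CONNECT:-/@primary_db}"   # wallet-based login, no password in the script
export ORACLE_HOME
PATH="$ORACLE_HOME/bin:$PATH"
export PATH

start_observer() {
    # START OBSERVER runs in the foreground and holds the broker session open
    # for as long as the Observer is alive.
    dgmgrl -silent "$DG_CONNECT" "start observer"
}

# Only start the Observer when called as: ./start_observer.sh run
if [ "${1:-}" = "run" ]; then
    start_observer
fi
```

Called with `run`, the script blocks until the Observer exits, which is why it is usually launched with `nohup` and `&` or handed to something that supervises it.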

So, we often see scripts with some diagnostic capabilities, called by a cron job, checking the health and activity of the Observer.
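A cron-driven check of that sort might look something like the sketch below. It assumes the Observer was started through `dgmgrl` with a `start observer` command, and that a hypothetical wrapper at `/home/oracle/start_observer.sh` starts it when called with `run`. Both the process pattern and the path are illustrative.

```shell
#!/bin/sh
# Sketch of a cron watchdog (hypothetical name: check_observer.sh).
# Pattern and script path are assumptions, not fixed Oracle conventions.
OBSERVER_PATTERN="${OBSERVER_PATTERN:-dgmgrl.*start observer}"
START_SCRIPT="${START_SCRIPT:-/home/oracle/start_observer.sh}"

observer_running() {
    # True when some process command line matches the Observer pattern.
    pgrep -f "$OBSERVER_PATTERN" >/dev/null 2>&1
}

check_observer() {
    if observer_running; then
        echo "observer running"
    else
        echo "observer not running; restarting"
        # Detach the restart so the cron job itself returns immediately.
        nohup "$START_SCRIPT" run >/dev/null 2>&1 &
    fi
}

# Only act when called as: ./check_observer.sh check
if [ "${1:-}" = "check" ]; then
    check_observer
fi
```

A crontab entry such as `*/5 * * * * /home/oracle/check_observer.sh check` would run it every five minutes. Note that a process check only proves the process exists, not that the Observer is healthy, and the Observer can be down for up to a full interval before anything notices.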

This functionality is already built into systemd on Linux systems. On systemd-based distributions, cron itself runs as a systemd service, and systemd’s service architecture has built-in restart features, so why not cut out the middleman and just run the Observer process as a service?
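To make the idea concrete, here is a minimal sketch of a unit file that lets systemd supervise and restart the Observer. The unit name `dg-observer.service`, the `oracle` user, and the `/home/oracle/start_observer.sh run` command are assumptions for illustration. The directives themselves (`Restart=`, `RestartSec=`, `After=`) are standard systemd options.

```shell
#!/bin/sh
# Sketch: write a systemd unit for the Observer into the given directory.
# Unit name, user, and ExecStart path are hypothetical.
write_observer_unit() {
    unit_dir="${1:-/etc/systemd/system}"
    cat > "$unit_dir/dg-observer.service" <<'EOF'
[Unit]
Description=Data Guard Fast-Start Failover Observer (sketch)
Wants=network-online.target
After=network-online.target

[Service]
Type=simple
User=oracle
# The start command must stay in the foreground so systemd can track it.
ExecStart=/home/oracle/start_observer.sh run
# Restart the Observer whenever it exits, after a short pause.
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target
EOF
}
```

Running `write_observer_unit` as root, followed by `systemctl daemon-reload` and `systemctl enable --now dg-observer.service`, would register and start the service. With `Restart=always`, systemd takes over the job the cron watchdog used to do.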

Configure an Observer Host

I’ll demonstrate this setup on Oracle Cloud Infrastructure, using an Always-Free eligible compute instance running Oracle Linux 8. The Observer will run from an Oracle 19c Database Client home but could just as easily use a full Oracle Database installation. And, while I’m using a 19c client, the examples here will also work with older versions (looking at you, 11g) built before the “fire-and-forget” START OBSERVER IN BACKGROUND command was added!

The first step after provisioning the VM is preparing the environment and installing the software. I used the preinstall RPM for this, just as I would for a database installation:
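On Oracle Linux 8, that step might look like the sketch below. It assumes the 19c preinstall package, `oracle-database-preinstall-19c`, is available from the enabled repositories and that the current user has `sudo` rights. The function names are illustrative.

```shell
#!/bin/sh
# Sketch: prepare the host with the Oracle preinstall RPM.
# Assumes Oracle Linux 8 with its default repositories enabled.
PREINSTALL_PKG="oracle-database-preinstall-19c"

install_preinstall() {
    # Installs prerequisite packages, sets kernel parameters and limits,
    # and creates the oracle user and its groups.
    sudo dnf -y install "$PREINSTALL_PKG"
}

verify_oracle_user() {
    # The oracle OS user should exist once the RPM is installed.
    id oracle
}
```

Calling `install_preinstall && verify_oracle_user` runs the installation and confirms that the `oracle` account is in place before the client software is installed.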

Configure an Observer Host

Client Software Installation

Configure and Test Networking

Script Observer Startup

Configure a Service

Start and Test the Service