One of the side effects of Oracle’s nonblocking approach is that if you actually want to ensure that no more than one user has access to a row at once, then you, the developer, need to do a little work yourself.
A developer was demonstrating to me a resource scheduling program (for conference rooms, projectors, etc.) that he had just developed and was in the process of deploying.
The application implemented a business rule to prevent the allocation of a resource to more than one person for any given period of time. That is, the application contained code that specifically checked that no other user had previously allocated the time slot (at least the developer thought it did).
This code queried the SCHEDULES table and, if no rows existed that overlapped that time slot, inserted the new row. So, the developer was basically concerned with two tables:
create table resources
( resource_name varchar2(25) primary key,other_data varchar2(25) );
create table schedules
( resource_name varchar2(25) references resources,start_time date,end_time date );
And, right after inserting a room reservation into SCHEDULES, and before committing, the application would query:
select count(*)
from schedules where resource_name = :resource_name
and (start_time < :new_end_time)
and (end_time > :new_start_time);
It looked simple and bulletproof (to the developer anyway); if the count came back as one, the room was yours. If it came back greater than one, you could not reserve it for that period. Once I knew what his logic was, I set up a very simple test to show him the error that would occur when the application went live—an error that would be incredibly hard to track down and diagnose after the fact. You’d be convinced it must be a database bug.
All I did was get someone else to use the terminal next to him. Both navigated to the same screen, and, on the count of three, each hit the Go button and tried to reserve the same room for an overlapping time. Both got the reservation. The logic, which worked perfectly in isolation, failed in a multiuser environment. The problem in this case was caused in part by Oracle’s nonblocking reads. Neither session ever blocked the other session. Both sessions simply ran the query and then performed the logic to schedule the room. They could both run the query to look for a reservation, even if the other session had already started to modify the SCHEDULES table (the change wouldn’t be visible to the other session until commit, by which time it was too late). Since it would appear to each user they were never attempting to modify the same row in the SCHEDULES table, they would never block each other, and, thus, the business rule could not enforce what it was intended to enforce.
This surprised the developer—a developer who had written many database applications—because his background was in a database that employed read locks. That is, a reader of data would be blocked by a writer of data, and a writer of data would be blocked by a concurrent read of that data. In his world, one of those transactions would have blocked the other—or perhaps the application would have deadlocked. But the transaction would ultimately fail.
So, the developer needed a method of enforcing the business rule in a multiuser environment—a way to ensure that exactly one person at a time made a reservation on a given resource. In this case, the solution was to impose a little serialization of his own. In addition to performing the preceding count(*), the developer first performed the following:
select * from resources where resource_name = :resource_name FOR UPDATE;
What he did here was to lock the resource (the room) to be scheduled immediately before scheduling it, in other words before querying the SCHEDULES table for thatresource. By locking the resource he is trying to schedule, the developer ensures that no one else is modifying the schedule for this resource simultaneously. Everyone wanting to execute that SELECT FOR UPDATE for the same resource must wait until the transaction commits, at which point they are able to see the schedule. The chance of overlapping schedules is removed.
Developers must understand that, in a multiuser environment, they must at times employ techniques similar to those used in multithreaded programming. The FOR UPDATE clause is working like a semaphore in this case. It serializes access to the RESOURCES tables for that particular row—ensuring no two people can schedule it simultaneously.
Using the FOR UPDATE approach is still highly concurrent as there are potentially thousands of resources to be reserved. What we have done is ensure that only one person modifies a resource at any time. This is a rare case where the manual locking of data we are not going to actually update is called for. You need to be able to recognize where you must manually lock and, perhaps as importantly, when not to (I’ll get to an example of this in a bit). Furthermore, the FOR UPDATE clause does not lock the resource from other people reading the data as it might in other databases. Hence, the approach will scale very well.
Issues such as the ones I’ve described in this section have massive implications when you’re attempting to port an application from database to database (I will return to this theme a little later in the chapter), and this trips people up time and time again. For example, if you are experienced in other databases where writers block readers and vice versa, you may have grown reliant on that fact to protect you from data integrity issues. The lack of concurrency is one way to protect yourself from this. That’s how it works in many non-Oracle databases. In Oracle, concurrency rules supreme and you must be aware that, as a result, things will happen differently (or suffer the consequences).
I have been in design sessions where the developers, even after being shown this sort of example, scoffed at the idea they would have to actually understand how it all works. Their response was, “We just check the ‘transactional’ box in our Hibernate application and it takes care of all transactional things for us. We don’t have to know this stuff.” I said to them, “So Hibernate will generate different code for SQL Server and DB2 and Oracle, entirely different code, different amounts of SQL statements, different logic?” They said no, but it will be transactional. This misses the point. Transactional in this context simply means that you support commit and rollback, not that your code is transactionally consistent (read that as “not that your code is correct”). Regardless of the tool or framework you are using to access the database, knowledge of concurrency controls is vital if you want to not corrupt your data.
Ninety-nine percent of the time, locking is totally transparent, and you need not concern yourself with it. It’s that other one percent you must be trained to recognize. There is no simple checklist of “if you do this, you need to do this” for this issue.
Successful concurrency control is a matter of understanding how your application will behave in a multiuser environment and how it will behave in your database.
When we get to the chapters on locking and concurrency control, we’ll delve into this topic in much more depth. There you’ll learn that integrity constraint enforcement of the type presented in this section, where you must enforce a rule that crosses multiple rows in a single table or is between two or more tables (like a referential integrity constraint), are cases where you must always pay special attention and will most likely have to resort to manual locking or some other technique to ensure integrity in a multiuser environment.