Database table column aliasing, and coding standards

Here’s a question I posed to myself recently, during a database code maintenance/upgrade cycle. Can adding a column to the underlying database table break the system? The answer is obviously yes and most database developers would cite an SQL NATURAL JOIN as an example where this may happen. For example:

SQL>
SQL>
SQL>
SQL>
SQL> select * from tbl1;

A
----------
1
2
5

SQL> select * from tbl2;

A          B
---------- ----------
1          100
2          46
3          18
4          21

SQL>
SQL> select a, b
2      from tbl2 t2 natural join tbl1 t1;

A          B 
---------- ----------
1          100
2          46

SQL>
SQL> alter table tbl1 add b number default 100;

Table altered.

SQL>
SQL> select a, b
2     from tbl2 t2 natural join tbl1 t1;

A          B
---------- ----------
1          100

SQL>
SQL>
SQL>

The same SQL query, a change to the table definition, and a different result set. This is rarely the required outcome but it is a fair response to the question being asked.

An organisations internal coding standards document will normally address the specific issue of the SQL NATURAL JOIN, and it will be addressed clearly with a diktat – ‘do not use the SQL NATURAL JOIN’.  Somewhat colloquially, the SQL NATURAL JOIN is a time bomb waiting to explode, and prohibiting its use is justifiable, as demonstrated above.

There are other examples where the system may break however, and I am going to demonstrate a further example that, like above, also focuses on coding standards.

Choosing an example using two simple tables:

SQL>
 SQL>
 SQL>
 SQL>
 SQL> select b
 2     from tbl1 t1,
 3          tbl2 t2
 4       where t1.a = t2.a;
B
----------
100
46

SQL>
SQL>
SQL>

You can see the expected result set. Adding the new table column, as before, the same SQL executed results in a parsing error.

SQL>
SQL> alter table tbl1 add b number default 100;

Table altered.

SQL>
SQL> select b
2     from tbl1 t1,
3          tbl2 t2
4       where t1.a = t2.a;
select b
*
ERROR at line 1:
ORA-00918: column ambiguously defined

SQL>

Imagine if this type of SQL were embedded in a view, database package, or other PSM in production code. By adding a new table column, unrelated code could break, not broken to the point where it returned an incorrect result set for a single method, but broken to the point that the PSM was invalidated and an entire object dependency tree was invalidated. And could you fix this code? Perhaps not. If the PL/SQL is wrapped for example, you may be well and truly  snookered. The code is broken and may not be fixable – the approach to get things working again is to roll back the system changes.

Now look at the following related SQL snippet.

SQL>
 SQL>
 SQL> select a
 2      from tbl0 t0
 3           tbl1 t1,
 4           tbl2 t2      
 5        where t1.a = t0.x
 6          and t1.a = t2.a;
A
----------
1

SQL>

Does “select a” refer to column “a” in tbl1, or tbl2 ? Or does it not matter ? The answer is it doesn’t matter, as the “a” in both is the same, because of the INNER JOIN. This is exceptionally clever isn’t it? Those bright guys at Oracle have implemented some fancy code under the bonnet, likely related to a transitive closure, that doesn’t require the database table column to be explicitly aliased when it doesn’t need to be explicitly aliased.

I am exceptionally impressed.

Oh, hold on …. just merely changing the table order in the query, the same semantic query, gives the now familiar parsing error. That cannot be right, can it? Have I done something silly?

SQL>
 SQL> select a
 2      from tbl1 t1,
 3           tbl0 t0,
 4           tbl2 t2
 5        where t1.a = t0.x
 6          and t1.a = t2.a;
 select a
 *
 ERROR at line 1:
 ORA-00918: column ambiguously defined
SQL>
SQL>
SQL>

Nope, what I observed above was just a plain old Oracle bug, or feature should you prefer. The first piece of SQL with the three table inner join compiled and gave the expected result set, but the second semantically identical piece of code did not. Code embedded within a PSM for the first code snippet would work just fine in 10.2, ……… until you upgraded the database version. Executing all the code above on 11g/12c above and likely a few Oracle patch sets in between, … well the ORA-00918 error becomes very familiar!

So coming back to my argument on explicit database table column aliasing in coding standards – I am surprised that the fundamentals on column aliasing are missing from the internal coding standards documents of so many organisations.* Aliasing the table columns correctly above, everywhere, would mean I’d never had been so familiar with ORA-00918, for normal DBA operations (such as adding new table columns/table restructuring) or working around bugs/features/limitations of Oracle SQL parsing.

*the focus seems to be on source code control system tags, variable naming, method naming, damn indentation, whitespace, and passing automated tests with some stupid static code analyser. Perhaps this is all a rant for another day.

— Published by Mike, 14:50:03 21 August 2016

Leave a Reply