ORDER BY "I am Desperate" DESCENDING

When an SQL result-set needs to be returned in some sort of order, it is not unreasonable to expect that the information to sort the data is available, somewhere. For example when querying a database and returning a result-set sorted in chronological order, it is not unreasonable to expect that some datetime information is available for the purposes of sorting the result-set.

Alas, in some information systems, the required information for ordering is just not available, or not easily available, and to deliver the software requirement the developer gets creative (ie. desperate).

Take the following financial trade/trade lifecycle information (and this information is representative of the type of typical time-series data found in a financial trading data warehouse, albeit vastly oversimplified) commonly referred to as a trade blotter.

ProductID TradeID TradeDate Amount Trader TradeDesk
 10  100010001 09 Feb 2018  1000 Max Mustermann  BI877
211  10B410012  12 Feb 2018  2100 Erika Mustermann  CSV458a
10  10B410013  12 Feb 2018  5000 Otto Normalverbraucher  BI877
10  882622100  08 Feb 2018  5400 Max Mustermann  BI877
100  100010099 05 Feb 2018  14500 Lieschen Müller  B100
10  GH65-12-1a  12 Feb 2018  1000 Max Mustermann  BI877
10  GH65-12-024  12 Feb 2018  1000 Max Mustermann  BI877
10  100010091  12 Feb 2018  800 Lieschen Müller  CD128

For some business reason, there existed a requirement to retrieve a result-set solely containing trade information for ProductID = 10. The result-set was to be returned in trade descending chronological order.

Clearly a trivial

SELECT *
  FROM somewhere
    WHERE productID = 10

will not suffice (the last three trades above are all executed on the same date, and the date has no timestamp component for a more granular chronological sorting – how can that data set be reliably sorted/ordered?).

I have found myself in this situation more than once. Furthermore, regardless of all the sighing and foot stamping that is necessary in these situations, more often than not you have to be pragmatic and just accept it is what it is (after all no large financial organisation is going to change their system to accommodate for your pissy little requirement, conduct the impact and risk analysis, and so on, in your urgent timeframe – they need to focus on running the bank first!).

Other than throwing in the towel, what can the developer do?

For chronologically ordering this data, in a best-efforts approach given there is insufficient information to reliably perform the task, the options are:

  1. First ordering by date, as above SELECT ….. ORDER BY tradeDate DESC.
  2. Eyeballing the data – is there any other data that could be used? No is the answer, not even a numeric PRIMARY KEY sourced from a database sequence that at least infers the order the data was added to the table.
  3. Is there any other column of data not shown in the table containing information that could be used to give the developer a clue as to the order that the trades were added. Yes is the answer, but as I wrote, you cannot see it (it is hidden, so formally the table is not even in 1NF).

What column? Well there are a few, but the one I am specifically referring to is ORAROW_SCN,  and it can be converted into the approximate timestamp associated with the table row SCN using Oracle function SCN_TO_TIMESTAMP. So if you have the approximate timestamp for the last modified (or hopefully first added) row insertion, and find yourself in this type of desperate situation, then this once fallback position. Given the circumstances, it is after all better than nothing.

Some example usage showing the function is below, on a different piece of contrived data.

SQL> select SCN_TO_TIMESTAMP(ORA_ROWSCN)
2  FROM test1;

SCN_TO_TIMESTAMP(ORA_ROWSCN)
---------------------------------------------------------------
12-FEB-18 07.22.42.000000000 PM
12-FEB-18 07.26.16.000000000 PM
12-FEB-18 07.22.42.000000000 PM
12-FEB-18 07.26.44.000000000 PM

A few days back, I had to debug a piece of code containing an ORDER BY SCN_TO_TIMESTAMP(ORA_ROWSCN). The prior developer was clearly desperate. He/she needed to order the database result-set, and there was insufficient data available, and he/she had resorted to using the timestamp derived from the last SCN for the database row. Clearly it isn’t reliable, but it is a best-efforts approach given the data for sorting chronologically was not available.

Recently my client started to notice an error. It is shown below. The background to the problem can also be found on Oracle-l containing a reference to the primary Oracle documentation as to the underlying cause of the error.

ORA-08181: Angegebene Zahl ist keine gültige SCN
ORA-06512: in "SYS.SCN_TO_TIMESTAMP", Zeile 1
08181. 00000 -  "specified number is not a valid system change number"
*Cause:    supplied scn was beyond the bounds of a valid scn.
*Action:   use a valid scn.

The error was eventually fixed (read the Oracle-l thread above) but as it turns out, ORA_ROWSCN only increments with each SCN, and for the purposes of some desperate ordering criteria, there is no necessity to convert the number to a timestamp, a simple ORDER BY ORA_ROWSCN would have had the same effect. This code change was implemented as a quick-fix, but clearly only satisfactory for the same-day urgent BAU purposes. Of course it still remains that the result-set returned may not be returned in chronological order, but this serves well given the limitations of the data at-hand and the various feeds used to populate this table. The use of ORA_ROWSCN however raises two issues. Specifically

  • the SCN is not unique – multiple trades above may have the same tradeDate and a different ORA_ROWSCN, but they may also have the same tradeDate and the same ORA_ROWSCN
  • the ORA_ROWSCN can even change (it reflects the most recent SCN for the most recent update to the table row). This has the knock-on effect of providing an ordered result-set, where the row order may change (sic), but the data giving the impression of not changing.

I had never previously considered using ORA_ROWSCN for ordering, even when desperate. I hope I never have to consider this pseudo-column for the purpose either. I did however feel for the original developer, coming up with a relatively novel solution given the data constraints.

Lastly

  • there is no way around this chronological trade order requirement, other than to introduce some other data that can be reliably be used for the purposes of row sorting/ordering (for example, adding a timestamp component to the tradeDate). This however is a cyclic argument – the whole desperate nature of this blog post is a recollection of some pain, and the background to this pain is that the data set was devoid of the information needed that enabled the business to get what was needed in the first place
  • The issue above is a data issue, not a database implementation issue. I have however recounted the issue from an Oracle perspective
  • Sometimes you have to ask yourself how these systems have ever worked, and what degree of confidence anyone has ever had in them, and what degree of confidence you should have in them.

— Published by Mike, 19:49:22 19 February 2018 (CET)

Leave a Reply