NULLification

Published in Databases on Jan 19, 2008

I've seen some "interesting" things during my time with database systems, but the one that takes the cake by far is variations in how NULL is interpreted. I'm going to provide some examples to showcase what I'm talking about using Oracle and MySQL, being that my experience is mostly with those two particular systems. Examples given are run on Windows XP SP2 using Oracle Express Edition 10.2.0.1.0 and MySQL Community Edition 5.0.45.

The stated intention of the existence of NULL is to convey the absence of any value. Both Oracle and MySQL say as much (and I particularly like MySQL's explanation of the reasoning behind this). Oracle, however, immediately goes on to contradict that principle and maintain that a character value with a length of zero is considered to be equivalent to NULL.

-- Oracle

SELECT CONCAT('test', NULL) FROM dual;

'test'

-- MySQL

SELECT CONCAT('test', NULL);

NULL

This behavior manifests itself in some other "interesting" ways.

-- Oracle

SELECT 'true' FROM dual WHERE LENGTH('') IS NULL;

'true'

-- MySQL

SELECT LENGTH('');

0

If you're wondering why I've formatted my Oracle query differently in this example, it's because this particular release of SQL*Plus appears to not want to give me a definitive answer. If I use the equivalent Oracle query, I simply get blank space where I would expect to see NULL. If you invert the WHERE clause in the query, you'll see what I mean. I wonder why this is?

To make things worse, it is only indirectly noted in the Oracle LENGTH() function documentation in the statement that passing NULL to LENGTH() will result in NULL. You have to read the section on NULL to find out that the empty string is equivalent to NULL, then put two and two together in order to figure this out. Oracle has made the statement that it is possible this behavior will change in the future. Given backward compatibility-related implications, however, I highly doubt that.

Strings aren't the only area to which I have an objection with respect to NULL; numbers are, as well. Prior to MySQL 5.0.13, the following example was handled in what I believed was "the right way." In versions 5.0.13 and later, it was changed to use the same logic as Oracle.

-- Oracle

SELECT 'true' FROM dual WHERE GREATEST(1, NULL) IS NULL;

'true'

-- MySQL

SELECT GREATEST(1, NULL);

NULL

What both do now is cause NULL to be the result if any one operand in the expression is NULL. In dealing with a general expression, I can understand this. When that expression deals with a logical set of associated row- or column-wise values, which this and some other functions do, the absence of a value for a single object in the set should not cause this to happen.

MySQL isn't even consistent about this; in a section on common problems with NULL in its documentation, it's stated that:

Aggregate (summary) functions such as COUNT(), MIN(), and SUM() ignore NULL values.

In another section on working with NULL values, it's stated that:

Two NULL values are regarded as equal in a GROUP BY.

NULL conveys the absence of a value, so how can this be?! Both Oracle and MySQL have a set of functions and operators for dealing with NULL. At least these treat NULL consistently.

In short, I'm not saying NULL shouldn't exist. I know it has its uses, such as rewriting queries to avoid using subqueries in order to improve efficiency. I'm saying that mainstream database systems should bear in mind the reason for the existence of NULL when deciding how any given function call or expression should handle encounters with it. Until they do, all I can do is recommend caution when dealing with fields that may contain NULL and querying against them. Be cognizant of these fields and operations involving them and make use of your respective database's functions and operators for handling these cases so as to avoid unexpected results. Don't let your data be nullified!