Undeclared key/data symbol for hash object

ERROR: DATA STEP Component Object failure. Aborted during the EXECUTION phase.


Although SAS increasingly recommends letting its tools generate the SAS code for you, you may still find yourself in a situation where there is no acceptable alternative to writing your own code. Or maybe, just like me, you enjoy writing code much more than using code generators. Either way, this is one of those weird error messages you might encounter when trying to use a hash object:


101      if(_n_ = 1) then do;
102          declare hash foo(dataset:"work.foo");
103          foo.defineKey('bar');
104          foo.defineData('bar','baz');
105          foo.defineDone();
106      end;
ERROR: Undeclared key symbol bar for hash object at line 105 column 9.
ERROR: DATA STEP Component Object failure.  Aborted during the EXECUTION phase.

Before anything else, you should know that most people are unlikely to encounter this error, because it happens only if you declare key or data variables in your hash definition AND one or more of them is not used anywhere else in your data step. This is either wrong or subtle, depending on what you are actually trying to do. In my case it was meant to be subtle, because I knew exactly what I was doing: I wanted to merge two data sets without having to sort both of them and then re-sort the resulting data set on another key.


Where does this strange error message come from?

If you know why you're not using the variables declared in your hash definition, then you're not guilty and the dumb SAS parser is the one to blame.
To understand what happened, you need to know a little about how SAS works internally. When your program is parsed, SAS looks for each and every variable in your code and allocates space for it in the PDV (Program Data Vector). So, if one of your variables is neither used nor declared anywhere within the data step, the parser does not find it and does not allocate any space for it in the PDV. The interpreter then crashes when it tries to use that variable to define the hash table.

But hey, wait a minute, they are declared in the defineKey and defineData calls, aren't they? Yes, they are! And that's why I blame the dumb SAS parser. They're simply enclosed within quotes (and this is mandatory, as you want to pass the variables' names and not their values), so the parser just thinks they are strings passed as arguments to a function. Up to this point, everything it does is right, but the interpreter/compiler could quickly inspect the abstract syntax tree to check whether that function happens to be one that takes variable names as parameters to define some kind of special object, like a hash.
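
To make this concrete, here is a minimal sketch that should reproduce the error. The contents of work.foo are made up for illustration; only the names foo, bar and baz come from the log above.

```sas
/* Hypothetical lookup table; the two rows are illustrative only. */
data work.foo;
    input bar baz;
    datalines;
1 10
2 20
;
run;

/* bar and baz appear only inside quoted strings, so nothing in this
   step tells the compiler to create them in the PDV. defineDone()
   is then expected to fail with "Undeclared key symbol bar". */
data _null_;
    if _n_ = 1 then do;
        declare hash foo(dataset: "work.foo");
        foo.defineKey('bar');
        foo.defineData('bar', 'baz');
        foo.defineDone();
    end;
run;
```

The line and column reported in your own log will differ from the excerpt above, since they depend on where the step sits in your session.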


What can I do to avoid this and make everything work?

There are several solutions to this problem; choose the one that best fits your situation.

- If you don't use a data set as the source of your hash table, you're probably doing something wrong; double-check what you're doing.
- If you are using a data set as the source but you're not using all of the variables contained in that data set, you can simply declare the variables using a length statement.
- If you are using a data set as the source and all of its variables are defined in the hash table (either explicitly or with "all: 'yes'"), the easiest way (although not the cleanest) is to include, anywhere in your data step code, a set statement that will be parsed but never executed.

Such a statement can be as simple as this:

if (0) then set work.foo;
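
In context, the trick could look like the sketch below. It assumes that work.foo exists with columns bar and baz, and that a second table, here called work.main (a hypothetical name), carries the key bar. The output name work.merged is also made up.

```sas
data work.merged;
    /* Never executed, but seen at compile time: this puts bar and baz
       in the PDV with the types and lengths they have in work.foo. */
    if 0 then set work.foo;

    if _n_ = 1 then do;
        declare hash foo(dataset: "work.foo");
        foo.defineKey('bar');
        foo.defineData('bar', 'baz');
        foo.defineDone();
    end;

    set work.main;        /* hypothetical table to enrich, unsorted */
    call missing(baz);    /* baz is retained, so reset it for each row */
    rc = foo.find();      /* rc = 0 when bar was found; baz is then filled */
    drop rc;
run;
```

Because variables that come from a set statement are retained across iterations, the call missing line keeps a value found for one row from leaking into the next row when the lookup fails.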

Of course, if you are not using all of the variables, you can also combine that last option with the keep= or drop= data set options on the set statement. This saves you the trouble of maintaining the length statements if the lengths ever change, but it makes the variable definitions less explicit.
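
The two alternatives for a partially used source table could be sketched as follows. Both assume that work.foo exists and that bar and baz are numeric; adjust the length statement if that is not the case in your data.

```sas
/* Variant 1: declare the hash variables by hand. */
data _null_;
    length bar baz 8;             /* assumption: both are numeric */
    if _n_ = 1 then do;
        declare hash foo(dataset: "work.foo");
        foo.defineKey('bar');
        foo.defineData('bar', 'baz');
        foo.defineDone();
    end;
    call missing(bar, baz);       /* avoids "uninitialized" notes in the log */
run;

/* Variant 2: let SAS copy the attributes, but only for the listed columns. */
data _null_;
    if 0 then set work.foo(keep=bar baz);
    if _n_ = 1 then do;
        declare hash foo(dataset: "work.foo");
        foo.defineKey('bar');
        foo.defineData('bar', 'baz');
        foo.defineDone();
    end;
    stop;                         /* no executed set statement, so end the step explicitly */
run;
```

Variant 1 keeps the definitions visible in the code; variant 2 follows the source table automatically if a type or length changes there.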