Using Symbols for the Wrong Reason

2007.07.03 update: a related article: 2007.07.03_WhatAreSymbols

The concept of symbols has long been popular in the Lisp community, yet few people outside it know about symbols, mainly due to general unfamiliarity with Lisp.

Then Ruby came along and made symbols a commodity programming construct. People who were not aware of symbols now are.

And they ask about them again and again. Hardly a week goes by on the Ruby mailing list without a question about symbols: what are they, and what are they good for?

Many people have tried to answer that, but the answer has been along the lines presented in these two articles: http://glu.ttono.us/articles/2005/08/19/understanding-ruby-symbols and http://zephyrfalcon.org/weblog2/arch_e10_00850.html#e857.

The answer has put undue emphasis on the way the current Ruby VM implements symbols. It runs roughly as follows: Ruby strings are mutable and are not efficiently implemented in the current Ruby VM, so use symbols when you want efficient, immutable, string-like objects.

That is not wrong; it is correct for the current Ruby VM. However, I think it is a misguided answer to the question. The answer should instead put the emphasis on the programmer's intention.
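For reference, the behaviour that the efficiency-based answer relies on can be observed directly. The following is a minimal sketch; the variable names are only for illustration, and String.new is used so that the two strings are certain to be separate objects.

```ruby
# Two strings with the same content are separate, mutable objects.
a = String.new('host')
b = String.new('host')
same_string_object = a.equal?(b)   # false: two distinct objects
a << 'name'                        # mutates a in place; b is untouched

# Every occurrence of the same symbol refers to one and the same object.
same_symbol_object = :host.equal?(:host)     # true
symbol_can_append  = :host.respond_to?(:<<)  # false: no in-place mutation
```

None of this says what symbols are for; it only shows how they currently behave.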

Rubyists use the #each() method more often than a for loop because it makes clear their intention of iterating over a sequence, even though a for loop, or even hand-written if and goto constructs, might be more efficient.
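As a small illustration, here are three ways of walking the same sequence. The ports array is made up for the example, and since Ruby has no goto, a hand-rolled while loop with an index stands in for the if and goto style.

```ruby
ports = [80, 443, 8080]

# Intent stated directly: do something with each element.
with_each = []
ports.each { |port| with_each << port * 2 }

# The for loop: same result, slightly less direct.
with_for = []
for port in ports
  with_for << port * 2
end

# Hand-rolled index bookkeeping: the reader has to work out
# that this is "just" an iteration over the sequence.
by_hand = []
i = 0
while i < ports.length
  by_hand << ports[i] * 2
  i += 1
end
```

All three build the same array; they differ only in how plainly they state the intent.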

One does not tell another to use if and goto instead of a for loop to iterate over a sequence simply because if and goto may be more efficient. No matter how inefficiently a compiler or interpreter implements the for loop construct, the possibility of an efficient for loop implementation remains. In fact, when you use the for construct, the compiler can have an easier time deducing your intent of looping, and if certain conditions are met (e.g., a loop with a fixed number of iterations), it could unroll your loop for better performance if your system allows it.

In short, any answer that depends on a particular implementation is doomed to be short-lived. What happens if the next Ruby VM implements COW (copy-on-write) strings? COW strings would share their initial instance. Only if there is a modification is the initial instance copied, and the modification is then performed on the copy. As long as one does not try to modify a COW string instance, it can be as efficient as the current VM's symbols. In other words, any answer that rallies around so-called efficiency while abandoning intent would become obsolete, and there would be a new scramble for an updated answer.
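To make the copy-on-write idea concrete, here is a toy sketch in plain Ruby. CowString and its methods are hypothetical names invented for this illustration; this is not how any Ruby VM implements strings, only a model of the sharing behaviour described above.

```ruby
# Hypothetical illustration of copy-on-write; not a real VM mechanism.
class CowString
  def initialize(buffer)
    @buffer = buffer  # possibly shared with other instances
    @owned  = false   # true once this instance holds a private copy
  end

  # A cheap "copy": the new instance shares the same underlying buffer.
  def share
    @owned = false    # our buffer is shared again, so copy before the next write
    CowString.new(@buffer)
  end

  # Writing triggers the actual copy, and only the first time.
  def append(text)
    unless @owned
      @buffer = @buffer.dup
      @owned  = true
    end
    @buffer << text
    self
  end

  def shares_buffer_with?(other)
    @buffer.equal?(other.buffer)
  end

  def to_s
    @buffer.dup
  end

  protected

  attr_reader :buffer
end

original = CowString.new(String.new('localhost'))
copy     = original.share
shared_before = original.shares_buffer_with?(copy)  # true: nothing copied yet
copy.append(':80')                                  # first write forces the copy
shared_after  = original.shares_buffer_with?(copy)  # false: copy now has its own buffer
```

Until the first write, the copy costs no more than a reference, which is the property the efficiency argument currently credits to symbols alone.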

So I finally come to my point: one should not use symbols just for an efficiency gain. Symbols are not meant to be immutable, string-like objects. They are meant to be used to construct user-defined identifiers. The users in this case are the programmers.
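Ruby itself uses symbols this way when it deals with names that the programmer defines. A short sketch, in which the Server class is made up for the illustration:

```ruby
# A made-up class; the symbols name the attributes we are defining.
class Server
  attr_accessor :host, :port
end

server = Server.new
server.host = 'localhost'
server.port = 80

# Ruby reports the methods defined above as symbols...
defined_names = Server.instance_methods(false).sort
# => [:host, :host=, :port, :port=]

# ...and accepts a symbol wherever a method has to be named.
host_value = server.public_send(:host)  # "localhost"
knows_port = server.respond_to?(:port)  # true
```

In each call the symbol stands for an identifier, not for a piece of text.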

Consider:

foo1 = { 
   :host => 'localhost',
   :port => 80
}
foo2 = {
   'host' => 'localhost',
   'port' => 80
}

In foo1, symbols are used to identify the data that follows them. The string 'localhost' is identified as a host, and 80 is not just any number, but a port number.

In foo2, it is less clear what purpose 'host' and 'port' serve. Is foo2 a macro replacement list? That is, if the program reads the string 'host', would it be replaced with 'localhost'? What is the purpose of 'host' there? Is it an identifier for the string 'localhost'?
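The two hashes are not interchangeable at runtime either: a symbol and a string with the same spelling are different keys. A quick check, with the two hashes written out again so the snippet stands on its own:

```ruby
foo1 = { :host => 'localhost', :port => 80 }
foo2 = { 'host' => 'localhost', 'port' => 80 }

host_by_symbol = foo1[:host]   # "localhost"
host_by_string = foo1['host']  # nil: 'host' is not a key of foo1

foo1_key_classes = foo1.keys.map(&:class).uniq  # [Symbol]
foo2_key_classes = foo2.keys.map(&:class).uniq  # [String]
```

So the choice between the two is visible to the program as well as to the reader.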

The programming world should borrow the real estate adage of "location, location, location" and translate it to "intention, intention, intention". It is the main reason why comments that clarify the programmer's intention are so valuable. It is the main reason why there is a variety of language constructs. It should also be the main reason behind your decision whether or not to use symbols.

2005.12.28 update: I am joyful that not everyone resorted to dumbing down the concept of symbols. http://onestepback.org/index.cgi/Tech/Ruby/SymbolsAreNotImmutableStrings.red

2006.01.06 update: What amazing interest in symbols! I don't think I've seen any one topic in Ruby generate 142 posts in a single thread before this.

http://groups.google.com/group/comp.lang.ruby/browse_frm/thread/164ae5f5cbbac02e?q=differences+between+%3Afoo&hl=en&


web AT microjet DOT ath DOT cx