The Yin and Yang of Typing

Published in PHP on Jan 19, 2008

Without a little background in programming languages or computer science in general, it's entirely possible that typing systems are not something that have crossed your mind. I thought I'd take a blog entry to share some of my thoughts on how it's affecting the creation and evolution of languages.

First of all, Benjamin C. Pierce probably has a point: terminology used to refer to typing concepts is about as useful as buzzwords like AJAX or Web 2.0 these days. Be that as it may, I'm going to reach back into the recesses of what I recall from the programming languages course I took in college to recall some of this terminology.

If you aren't familiar with static versus dynamic typing or strong versus weak typing, it may be worth it to read up on those before proceeding with the rest of this blog entry. Here are a few examples of each:

Static/weak - C
Static/strong - Java
Dynamic/weak - PHP
Dynamic/strong - Python

The line between strong versus weak typing seems to be blurred as languages like these evolve. The reason for this is that each side of typing has its advantages. Strong typing allows for compile-time checking, which can serve to eliminate human error, as well as performance optimizations from being aware of types at compile-time. They can also serve to make source code more intuitive to follow in some respects. Weak typing, on the other hand, can allow for higher levels of abstraction and, by proxy, the need for less code in order to allow identical operations to be executed on multiple types. It can also allow for things like variable variables, variable functions, and other interesting features not possible in strongly-typed languages.

Yet languages on either side of the proverbial fence are drawing in strengths from the other side. Java, before limited to the flexibility that could be provided by polymorphism while still maintaining strong typing, introduced generics in 1.5, whereby typing was still enforced but a higher level of logic abstraction was enabled for developers. By the same token, PHP has had explicit typecasting for a while and more recently in 5.1 introduced type hinting for array and object types (which may extend to scalar types in later versions). C# in 3.5 adds type inferencing, which while it's only syntactic sugar at least alleviates the need for verbosity when performing the most common method of initialization (i.e. setting a variable of a given class to an object instance of that class, as opposed to one involving a subclass of one or more of the classes involved).

It's also becoming commonplace for dynamically typed language interpreters to get ported to Java and .NET in order to leverage the features of those languages and the native libraries of the host language in the existing execution environment. Take these examples for instance.

Quercus (Java) and Phalanger (.NET) for PHP
JRuby (Java) and IronRuby (.NET) for Ruby
Jython (Java) and IronPython (.NET) for Python

In short, some level of control over typing is obviously a desired feature in any useful language. As well, I don't think a language can be truly useful without having a bit of both worlds to some degree. The reason for the existence of programming languages is to enable developers to control machines whose primary purpose is to manipulate data (and, as has been pointed out many times before, are stupid and do what we tell them to do). If control over said manipulation is hampered by the typing system, it hampers the effectiveness of the language. In this, I have to agree with Ludwig Wittgenstein, who said, "The limits of my language mean the limits of my world."