Archive for January 2008

Latest Oracle Gripes

I’ve been saving up Oracle-related gripes for a little while now. I’ve organized these into categories and, while each category isn’t really enough to stand as a blog entry on its own, all the gripes collectively make for an entry of a decent length. So, here goes.

Oracle 10g Express Edition

A coworker of mine actually came across one of these issues. Specifically, she found out that XE by default does not include the grants necessary to allow the native UTL_FILE package to be used by PL/SQL routines. It would be nice if this were documented somewhere that didn’t require a significant amount of digging to find.

I managed to come across another one, and apparently an obscure one at that. The development team can’t even figure out how to replicate this one. For no apparent reason, the Enter key stopped working within worksheet tabs. The fix is as simple as going to Accelerators in Preferences, explicitly selecting Default, and clicking OK. This was a really annoying bug to live with until I found out how to fix it.

Oracle Designer

This is the main laundry list of the post.

First off, if you perform a DDL export from a tab other than the DB Admin tab, grants are not included in the resulting deployment script. If you’re deploying a nontrivial number of entities, it makes it annoying to have to switch tabs and go through the deployment process again.

This leads to the next point: all entities have to be reselected each time you want to deploy them. It would be a lot less tedious and annoying to, say, allow for sets of related entities to be created that could be deployed collectively by just selecting the set.

Going back to DDL exports, changes cannot be generated against a database unless you have the privileges necessary to deploy them. I honestly don’t understand why this is, as read privileges are all that are necessary to read the data dictionary tables, which are all that should be required for the test. This limitation doesn’t make the program very conducive to workflow environments.

Again in DDL exports, exported entities are not intelligently ordered in the generated deployment script so that dependencies come before the entities that depend on them. This necessitates manual modifications to those deployment scripts and, in turn, documentation describing the modifications so that the script can be regenerated later, potentially by a different developer.
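To illustrate the kind of ordering I mean, here is a minimal Python sketch. It is not anything Designer actually does; the entity names and the dependency map are made up purely for the example. It uses the standard library's graphlib module (Python 3.9 or later) to sort entities so that each one appears after everything it depends on.

```python
from graphlib import TopologicalSorter

# Hypothetical entities: each key maps to the set of entities it depends on.
dependencies = {
    "orders": {"customers"},
    "order_items": {"orders", "products"},
    "customers": set(),
    "products": set(),
}


def deployment_order(deps):
    """Return entity names with every dependency ahead of its dependents."""
    return list(TopologicalSorter(deps).static_order())


order = deployment_order(dependencies)
print(order)
```

A script generated in that order could be run top to bottom without manual reshuffling, which is all I'm really asking for here.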

Can you tell I’m not overly happy with the DDL export?

PL/SQL

I only really have one gripe for this section, but it’s a big one: the first note in the Oracle Application Server mod_plsql User’s Guide, section 3.6 Parameter Passing lies! It may be technically correct, but what it doesn’t mention is that there is a way to return a default expression as opposed to a default value. It’s a bit of a workaround, and it showcases how difficult it can be to work with aggregate data structures in PL/SQL, but it works. Documentation being this lacking is just completely substandard.

Conclusion

I think that about wraps it up. I don’t do straight-up rants that often, but in these cases I felt it was necessary. It’s entirely possible no one at Oracle will ever read them, but at least I can say it wasn’t because they weren’t stated somewhere.

Book Review: PHP Web 2.0 Mashup Projects

You can find this review in podcast form on the Zend Developer Zone PHP Abstract Podcast.

I received an e-mail recently from a very nice gentleman at Packt Publishing, a UK-based publishing company focused on providing hands-on application-oriented publications to IT professionals, particularly those specific to open source technologies. Their representative asked if I would be willing to review one of their books, namely PHP Web 2.0 Mashup Projects by Shu-Wai Chow. Reviewing books is not something I had done before, so I thought I would give it a good old-fashioned college try.

In a supersaturated market, it is difficult to make an impression with a PHP book these days. The books of real value are those that focus on ways to apply the language to real world problems. These books delve into the depths of a particular application domain, showing PHP code and outlining design principles along the way. They are useful to current and prospective PHP programmers alike because they introduce both groups not to PHP itself, but to an existing class of problems and how PHP can be applied to solve them. PHP Web 2.0 Mashup Projects is one of these books.

Most technology-related books on the shelves are several inches thick and an inherently daunting chore to sift through. Luckily, this book is not one of those. Do not let the size fool you, though; it is positively packed with useful information. It hits the high points of each topic it covers, giving you enough in the way of code samples and step-by-step explanations to get started, as well as resources to help you get better acquainted with topics that might be of particular interest to you.

The book is divided into six chapters, each of which covers a set of particular protocols, data formats, and APIs for acquiring and processing data in order to create a particular mashup application. These projects include:

  • A search engine to find products on Amazon by their Universal Product Code
  • A search engine to combine results from MSN and Yahoo!
  • A video jukebox that pulls songs from Last.fm and videos from YouTube
  • A traffic incident reporting application that sends SMS alerts
  • An illustrated tube station line map using Google Maps and Flickr for related photos

The book’s structure and layout make it easy to follow, whether you prefer to read it linearly or jump around to specific sections. It is an excellent reference that I can see myself returning to time and time again.

One of the strengths of the book is that it has a very wide base of coverage. It starts by introducing basics in interacting with web services and extracting the desired data from their responses using core PHP libraries. The REST, XML-RPC, and SOAP protocols and the WSDL standard are all covered in enough depth to get you started, so you can work with a web service regardless of the protocol or protocols it offers. The author does an excellent job of selecting example web services and data standards from large and well-known to small and obscure. For real world APIs, you will find the likes of Amazon, YouTube, Google, and Flickr, as well as sources that might not be household names, such as the Internet UPC Database. Data standards include general formats like XML, RDF, and JSON and more specialized formats like RSS and XSPF.

Another strength is that the book encourages good principles from the start. It advocates object-oriented design principles for code reuse and a DRY philosophy. It suggests using third-party libraries such as those in PEAR in order to avoid unnecessary reinvention of the wheel, but still shows you how to roll your own if and when it becomes necessary. The book also covers usability, particularly in the last chapter when it discusses AJAX and race conditions, and pays special attention to application security, an area of increasing concern in web applications. Unlike some books, this one includes tips for development outside its own showcased projects to save you from having to spend your own time troubleshooting common issues or digging for solutions to “gotcha” situations.

And last but certainly not least, the book demonstrates that sometimes you have to be resourceful in locating and acquiring your data, particularly in Chapter 5 where one of my own areas of interest, web scraping, is covered. The topic is explained in plain language and supplemented with examples walking you through exactly how it can be used to acquire data for your own mashups. Web scraping is not a frequently broached topic and I applaud the author for making a point to include it. I believe it is a genuinely useful methodology that can help in data acquisition when no other options are available.

I cannot give the book an entirely glowing review, though. There are some errata present, both in content and code samples. Most are small, but some are enough to throw off a reader not already familiar with the material being covered. I’ve submitted some of these via the publisher’s web site already, though I have yet to receive any related communications or see them show up on the web site at the time that I write this review. These issues can be corrected, though, and the quality of the book’s content outshines them.

Overall, PHP Web 2.0 Mashup Projects is an excellent example of creativity in finding new ways to aggregate data sets in useful combinations. It is a testament to the possibilities of the internet when access to data is opened up and freedom to use that data enables developers to create exciting and inspiring new solutions. Mashups show the internet’s potential increasing in leaps and bounds and this book can get you on your way to contributing to their future development.

The Yin and Yang of Typing

Without a little background in programming languages or computer science in general, it’s entirely possible that typing systems are not something that has crossed your mind. I thought I’d take a blog entry to share some of my thoughts on how they’re affecting the creation and evolution of languages.

First of all, Benjamin C. Pierce probably has a point: terminology used to refer to typing concepts is about as useful as buzzwords like AJAX or Web 2.0 these days. Be that as it may, I’m going to reach back into the recesses of what I recall from the programming languages course I took in college to recall some of this terminology.

If you aren’t familiar with static versus dynamic typing or strong versus weak typing, it may be worth it to read up on those before proceeding with the rest of this blog entry. Here are a few examples of each:

  • Static/weak – C
  • Static/strong – Java
  • Dynamic/weak – PHP
  • Dynamic/strong – Python

The line between strong and weak typing seems to be blurred as languages like these evolve. The reason for this is that each side of typing has its advantages. Strong typing, particularly when paired with static typing, allows for compile-time checking, which can serve to eliminate human error, as well as performance optimizations from being aware of types at compile-time. It can also serve to make source code more intuitive to follow in some respects. Weak typing, on the other hand, can allow for higher levels of abstraction and, by proxy, the need for less code in order to allow identical operations to be executed on multiple types. It can also allow for things like variable variables, variable functions, and other interesting features not possible in strongly-typed languages.
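As a small illustration of the dynamic/strong corner of that list, here is a Python sketch. The function names are just for the example. The same untyped function works on integers and on strings, but mixing the two raises an error rather than being silently coerced, whereas a weakly typed language like PHP would convert the string and carry on.

```python
def add(a, b):
    # No declared types: this works on any operands that support "+".
    return a + b


def try_mixed():
    # Strong typing: Python refuses to guess how to combine a str and an int.
    try:
        return add("1", 1)
    except TypeError as exc:
        return f"TypeError: {exc}"


print(add(1, 2))      # both operands are integers
print(add("a", "b"))  # both operands are strings
print(try_mixed())    # mixed operands are rejected at run time
```

The error only shows up when the line actually runs, which is the trade-off of dynamic typing: more flexibility up front, less checking ahead of time.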

Yet languages on either side of the proverbial fence are drawing in strengths from the other side. Java, previously limited to the flexibility that could be provided by polymorphism while still maintaining strong typing, introduced generics in 1.5, whereby typing was still enforced but a higher level of logic abstraction was enabled for developers. By the same token, PHP has had explicit typecasting for a while and more recently introduced type hinting for object types in 5.0 and for arrays in 5.1 (which may extend to scalar types in later versions). C# 3.0, which ships with .NET 3.5, adds type inference, which, while it’s only syntactic sugar, at least alleviates the need for verbosity when performing the most common method of initialization (i.e. setting a variable of a given class to an object instance of that class, as opposed to one involving a subclass of one or more of the classes involved).

It’s also becoming commonplace for dynamically typed language interpreters to get ported to Java and .NET in order to leverage the features of those languages and the native libraries of the host language in the existing execution environment. Take these examples for instance.

In short, some level of control over typing is obviously a desired feature in any useful language. As well, I don’t think a language can be truly useful without having a bit of both worlds to some degree. The reason for the existence of programming languages is to enable developers to control machines whose primary purpose is to manipulate data (and, as has been pointed out many times before, are stupid and do what we tell them to do). If control over said manipulation is hampered by the typing system, it hampers the effectiveness of the language. In this, I have to agree with Ludwig Wittgenstein, who said, “The limits of my language mean the limits of my world.”

NULLification

I’ve seen some “interesting” things during my time with database systems, but the one that takes the cake by far is variations in how NULL is interpreted. I’m going to provide some examples to showcase what I’m talking about using Oracle and MySQL, being that my experience is mostly with those two particular systems. Examples given are run on Windows XP SP2 using Oracle Express Edition 10.2.0.1.0 and MySQL Community Edition 5.0.45.

The stated intention of the existence of NULL is to convey the absence of any value. Both Oracle and MySQL say as much (and I particularly like MySQL’s explanation of the reasoning behind this). Oracle, however, immediately goes on to contradict that principle and maintain that a character value with a length of zero is considered to be equivalent to NULL.

Oracle: SELECT CONCAT('test', NULL) FROM dual; -> 'test'
MySQL: SELECT CONCAT('test', NULL); -> NULL

This behavior manifests itself in some other “interesting” ways.

Oracle: SELECT 'true' FROM dual WHERE LENGTH('') IS NULL; -> 'true'
MySQL: SELECT LENGTH(''); -> 0

If you’re wondering why I’ve formatted my Oracle query differently in this example, it’s because this particular release of SQL*Plus appears to not want to give me a definitive answer. If I use the equivalent Oracle query, I simply get blank space where I would expect to see NULL. If you invert the WHERE clause in the query, you’ll see what I mean. I wonder why this is?

To make things worse, it is only indirectly noted in the Oracle LENGTH function documentation in the statement that passing NULL to LENGTH will result in NULL. You have to read the section on NULL to find out that the empty string is equivalent to NULL, then put two and two together in order to figure this out. Oracle has made the statement that it is possible this behavior will change in the future. Given backward compatibility-related implications, however, I highly doubt that.

Strings aren’t the only area to which I have an objection with respect to NULL; numbers are, as well. Prior to MySQL 5.0.13, the following example was handled in what I believed was “the right way.” In versions 5.0.13 and later, it was changed to use the same logic as Oracle.

Oracle: SELECT 'true' FROM dual WHERE GREATEST(1, NULL) IS NULL; -> 'true'
MySQL: SELECT GREATEST(1, NULL); -> NULL

What both do now is cause NULL to be the result if any one operand in the expression is NULL. In dealing with a general expression, I can understand this. When that expression deals with a logical set of associated row- or column-wise values, which this and some other functions do, the absence of a value for a single object in the set should not cause this to happen.

MySQL isn’t even consistent about this; in a section on common problems with NULL in its documentation, it’s stated that “Aggregate (summary) functions such as COUNT(), MIN(), and SUM() ignore NULL values.” In another section on working with NULL values, it’s stated that “Two NULL values are regarded as equal in a GROUP BY.” NULL conveys the absence of a value, so how can this be?! Both Oracle and MySQL have a set of functions and operators for dealing with NULL. At least these treat NULL consistently.
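The two behaviors quoted above are easy to see side by side. The sketch below uses Python's built-in sqlite3 module purely as a convenient stand-in; it is not Oracle or MySQL, and the table and column names are made up. SQLite happens to follow the same rules for aggregates and GROUP BY, so it shows the same inconsistency.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (grp TEXT, val INTEGER)")
conn.executemany(
    "INSERT INTO t (grp, val) VALUES (?, ?)",
    [("a", 1), ("a", None), (None, 2), (None, 3)],
)

# Aggregates skip NULL: COUNT(val) counts only the non-NULL values,
# and SUM(val) adds only those values.
count_val, total = conn.execute(
    "SELECT COUNT(val), SUM(val) FROM t"
).fetchone()

# GROUP BY treats the two NULL keys as equal and puts them in one group.
groups = conn.execute(
    "SELECT grp, COUNT(*) FROM t GROUP BY grp"
).fetchall()

print(count_val, total)
print(groups)
```

So within a single query, NULL can be both "nothing to count" and "a key equal to every other NULL key", depending on where it appears.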

In short, I’m not saying NULL shouldn’t exist. I know it has its uses, such as rewriting queries to avoid using subqueries in order to improve efficiency. I’m saying that mainstream database systems should bear in mind the reason for the existence of NULL when deciding how any given function call or expression should handle encounters with it. Until they do, all I can do is recommend caution when dealing with fields that may contain NULL and querying against them. Be cognizant of these fields and operations involving them and make use of your respective database’s functions and operators for handling these cases so as to avoid unexpected results. Don’t let your data be nullified!
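As a rough sketch of what that caution can look like, here is a guard using COALESCE, which both Oracle and MySQL provide. The example again uses Python's sqlite3 module as a stand-in rather than either of those systems, with SQLite's scalar MAX playing the role of GREATEST and the || operator playing the role of CONCAT; the scalar helper is just for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def scalar(sql):
    """Run a query that yields one value and return that value."""
    return conn.execute(sql).fetchone()[0]


# Scalar MAX yields NULL as soon as one argument is NULL.
naive = scalar("SELECT MAX(1, NULL)")

# COALESCE substitutes a fallback before the comparison happens.
guarded = scalar("SELECT MAX(1, COALESCE(NULL, 0))")

# The same guard works for string concatenation.
joined = scalar("SELECT 'test' || COALESCE(NULL, '')")

print(naive, guarded, joined)
```

Bear in mind that the choice of fallback matters. On Oracle, an empty string fallback is itself NULL, so the last guard would not behave the same way there.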

Popular Posts of 2007

I noticed that one of the feeds I read regularly, Gadgetopia, did a post recently on popular posts of last year by visits and comments. I thought that was rather neat and, being that I’m also a user of Google Analytics, I thought I’d do the same. I’ve already been looking in on my traffic over time since I launched the blog and have found the results rather surprising.

Top Posts By Visits

  1. Supporting Hierarchical Data Sets
  2. Pondering PHP 6
  3. PHP Abstract Episode 22: Screen Scraping
  4. Log Analysis and PHP
  5. The Acme of Skill

Hierarchical data sets was a very hot topic, so I’ll definitely make it a point to post about that. At some point I’d like to examine how a nested set implementation performs against Oracle’s own hierarchical functions. And I know I still owe you a post on why NULL in Oracle scares me.

I’ve noticed a lot of people seem to be inquiring about PHP 6 on IRC, perhaps partly because PHP 4 is on the way out. PHP 6 is vaporware at this point, people. There’s no official ETA. As the internals people like to say, it’ll be out WIR (When It’s Ready). If you want to put your energy toward something more constructive, monitor internals and get clued into the features that will show up in 5.3, as I’m sure that will be out significantly sooner.

I’m not including my code page in the above list, but it actually got more traffic than the #3 post. Most of that is due to the plug on paste2.org for my PHP script to submit to it and output the resulting URL. It’s usable from any text editor that supports piping a line range to stdout, though I’ve only tested it with vim. Have a look if you like; it might prove useful to you.

Top Posts By Comments

  1. Supporting Hierarchical Data Sets
  2. The Acme of Skill
  3. There and Back Again – A Conference Tale
  4. Pondering PHP 6
  5. Log Analysis and PHP

Not surprisingly, there was a good bit of overlap with the previous list. I did get a few recommendations for conferences in #3, so that might be worth a look if you haven’t attended a conference before. One thing I would like to hear in regard to this is opinions on the best conferences for networking and for supporting speakers. Both, particularly the latter, are interests of mine that I’d like to pursue this year. Feel free to leave a comment if you have conferences you’d like to encourage me to attend.

Thanks to all you readers for making 2007 a great year!

Zend Framework and Remember The Milk

I’ve posted a few times on Twitter related to my latest project and a few people have already asked me about it, so I figured it was worth a blog post.

My first project for the Zend Framework was Zend_Service_Simpy, a service module providing a lightweight wrapper around the API for the Simpy social bookmarking service.

My latest project is another service module for the Zend Framework. This time, though, it’s for the Remember The Milk API. RTM is basically a TODO list on serious steroids. It’s the Swiss Army Knife of task management. It allows you to manage multiple lists of tasks. You can add them easily from a variety of mediums, tag them, prioritize them, set deadlines for them, have them repeat, get reminders for them, tie them to physical real world locations, and share them. RTM offers great support for integration with Google applications including Google Calendar, iGoogle, and Gmail (plus offline access powered by Google Gears). They’re also very big into supporting mobile devices, including those running on Windows Mobile as well as the iPhone.

If you like, you can check out my original proposal for this module. I can already say that the API will end up changing a little, but it’s good enough to give you a general idea of what the capabilities of the finished service module will be. I only actively started implementation recently and things are progressing at a fairly rapid pace. I still have unit tests and documentation to handle, but hopefully there’s a shot at seeing it moved to core within the next two releases of the framework.

2007 in a Nutshell

I know these will probably follow in short order from most of the friends I’ve gained this past year, so I figured I’d follow suit for the sake of posterity and do an annual recount of the past year’s experiences. So, here goes.

First off, I started this blog! The time had come to establish a more permanent residence for myself in the blogosphere. Being known for my inherent laziness, I’m glad I finally got off my laurels and saw the effort through.

I turned 25 in March, and hopefully am a little wiser for the wear as well as being a little older. I celebrated my fourth wedding anniversary, and it still doesn’t cease to amaze me how long it’s been. My children are now 4, 2, and 1; it’s the only time in their lives when they’ll have ages that are consecutive powers of 2, which I think is pretty cool! We got to celebrate our first Thanksgiving and Christmas together in our home, which we hadn’t been able to do last year. It’s been just over a year since we bought the house.

I switched jobs once, which ended up not working out. Thankfully my former (and now current) employer was gracious enough to take me back. Despite the commute subtracting significantly from the time I have in a day, circumstances make it seem that it’s where I’m meant to be for the time being.

I got elected to the position of Vice President on the Baton Rouge Oracle User Group Board of Officers for 2008, which now has a presence on Facebook at my suggestion. I also got to attend the first meeting of the Lafayette .NET User Group, the first such organization to appear in Lafayette in several years.

I attended my first conference and got to meet many of my friends from the PHP Community, as well as make some new ones. While there, I tested for and received my ZCE certification. I published my first podcast and later my first professional magazine article. Toward the end of the year, I received my first request to review a book for a publishing company, which should come out shortly. I also submitted my second Zend Framework proposal and had it approved for incubator status.

And last, but certainly not least, I made it up to being able to hold my own on Hard in Guitar Hero II and III. Had to get that one in somewhere!

I have a number of aspirations for the coming new year. I don’t know that I can accomplish even a fraction of them, but I certainly plan on trying. If nothing else, the list will give me something to look back on and continue to strive for. Here are the ones I can think of off the top of my head at the moment.

  • Serve well in my duties as Vice President of BROUG and help to revitalize the organization.
  • Catch up on reading the backlog of books I’ve been building up, including Sara Golemon’s Extending and Embedding PHP which I recently received as a Christmas gift.
  • Submit one or more papers to one or more conferences (and hopefully get accepted to speak at one or, that notwithstanding, at least get to attend one).
  • Have additional magazine articles and podcasts published.
  • Begin seriously looking at getting a book published.
  • Get Zend_Service_RememberTheMilk into the Zend Framework core.
  • Examine the possibility of migrating this blog to Habari.
  • Start a project to do for content management what Magento did for e-commerce this past year.
  • Restart an old project to establish a web site for the local music scene here in the Lafayette area.

So, wish me luck! Happy New Year!