Phergie on C7Y Again
Part two of the two-part article I wrote for C7Y on experiences gleaned from developing the PHP 5 IRC bot Phergie has been posted. Feel free to leave comments in the article’s forum.
Streams are quite possibly one of the coolest things about PHP. They’re a feature of the core and allow you to do some basic things that might otherwise require a separate extension, which may or may not be available if you’re in a shared hosting environment. Among these things is acting as an HTTP client, which you can do using the HTTP streams wrapper. See Example #1 on that page for a code sample showing how to submit a POST request.
I wrote a small script a while back that’s gained a surprising amount of popularity thanks to a plug from the site that it posts to. The current incarnation of the script uses the cURL extension to send a POST request to paste2.org, the response from which it then parses for the URL corresponding to the code that was originally sent. When I learned that this could be done with streams, I attempted to implement it in that fashion, but ran into strange issues where I would get 404 or 500-level HTTP errors rather than the response I was expecting.
After some digging, it turns out that this is a bug in the 5.2.x branch. The issue has to do with how headers are arranged by the underlying C code. As a result, explicitly specifying a Content-Type header for the operation will result in failure. However, not explicitly specifying the Content-Type header value results in a Notice being output and the correct header value being used automatically, which coincidentally causes the operation to succeed.
The bug has been fixed in the 5.3 and 6 branches and is expected to be fixed in 5.2.6 as well. Hope this workaround proves helpful to anyone who runs into a similar issue.
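A minimal sketch of the workaround described above, assuming a form-encoded POST. The function names, the `$explicitContentType` switch, and the URL and field name in the usage note are all hypothetical and are not taken from the paste2.org script. The point is simply that the Content-Type header is left out of the context options unless explicitly requested, so PHP fills it in on its own.

```php
<?php
// Build the 'http' context options for a form-encoded POST request.
// $explicitContentType is a hypothetical switch for this sketch: leave it
// false on affected 5.2.x builds so that PHP supplies the header itself.
function build_post_options(array $fields, $explicitContentType = false)
{
    $http = array(
        'method'  => 'POST',
        // Encode the fields as application/x-www-form-urlencoded data.
        'content' => http_build_query($fields, '', '&'),
    );
    if ($explicitContentType) {
        $http['header'] = "Content-Type: application/x-www-form-urlencoded\r\n";
    }
    return array('http' => $http);
}

// Send the request and return the response body, or false on failure.
// Without an explicit header, PHP emits a notice and assumes the
// form-encoded content type, as described in the post.
function post_form($url, array $fields, $explicitContentType = false)
{
    $options = build_post_options($fields, $explicitContentType);
    $context = stream_context_create($options);
    return file_get_contents($url, false, $context);
}
```

A call might look like `post_form('http://example.com/paste', array('code' => $source))`, where both the URL and the `code` field are placeholders.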
It’s pretty rare that I encounter a bug in the software I run that hampers my ability to work or my server environment’s ability to function normally. However, I encountered one last week that has taken me and several Rackspace support technicians nearly a week to figure out, namely PHP bug #43677.
The bug is in at least PHP 5.2.5, if not in previous releases in the 5.2.x branch. For the moment, we’ve downgraded the PHP installation to the version packaged with RHEL, 5.1.6, which I’m told includes backports of relevant bug fixes from the 5.2.x branch, to see whether that stabilizes the situation.
So, if it seems that PHP starts to “forget” your include_path setting, your issue may be with PHP and not Apache as I initially suspected since the include_path setting was being set via an Apache configuration file. Hope this saves someone else time and grief.
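If you suspect the same thing is happening to you, a small diagnostic along these lines may help confirm it. This is only a sketch and not a fix for the bug itself: the function name and the `/var/www/lib` directory are hypothetical, and the idea is just to log when an expected directory has dropped out of include_path and to re-append it for the current request.

```php
<?php
// Return the entries from $expected that are absent from an include_path string.
function missing_include_dirs(array $expected, $includePath)
{
    $current = explode(PATH_SEPARATOR, $includePath);
    return array_values(array_diff($expected, $current));
}

// Hypothetical directory; substitute whatever your Apache configuration sets.
$expected = array('/var/www/lib');
$missing = missing_include_dirs($expected, get_include_path());
if ($missing) {
    error_log('include_path is missing: ' . implode(', ', $missing));
    // Re-append the missing entries for the rest of this request.
    set_include_path(
        get_include_path() . PATH_SEPARATOR . implode(PATH_SEPARATOR, $missing)
    );
}
```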
In my last entry, I noted that I was testing my use of SWFUpload in both IE6 and IE7. You may wonder how I managed this. Google has plenty of information on “hack” methods to get both versions of IE to coexist on a single XP installation.
If you have the machine power for it, though, there is a method that is actually supported by Microsoft to accomplish this. It employs virtualization, which is becoming increasingly popular in the computing world. Virtualization is useful for two applications in particular: making server environment installations independent of the host operating system and hardware, and allowing multiple operating systems to coexist on the same hardware without the need for partitioning storage devices.
In 2003, Microsoft bought out a company called Connectix which specialized in virtualization software, one of their main products being Virtual PC. Microsoft subsequently released a rebranded version of VPC as a free download.
Once the need to test applications on IE6 and IE7 for cross-compatibility was realized, Microsoft also began sporadically releasing a freely downloadable series of up-to-date Windows XP images with IE6 pre-installed and expiration dates after which the images would no longer function. The latest image is set to expire in early June 2008.
With Microsoft pushing adoption of Windows Vista and IE7, it’s uncertain as to whether or not Microsoft will continue releasing these VPC images. However, it does appear that Microsoft plans to continue developing Virtual PC. Therefore, if you have an extra Windows XP license to spare, this arrangement is a nice and relatively lightweight solution to making IE6 and IE7 available for testing on the same machine.
A friend of mine who shall remain nameless pointed a post out to me on the PHP DZone web site recently. Noting that the article’s content was misinformed at best and downright ignorant at worst, even when examining it sheerly from the author’s knowledge of PHP as a language, this friend asked that I set the author straight.
I gladly obliged with a comment on the post, having become something of an authority on the topic myself. As much of an unorthodox practice as web scraping may be, there are some methodologies for it that are obviously better than others. The aforementioned post illustrates a lot of the ones to avoid, and my comment lays out my arguments against them.
Later, I randomly encountered a post on the blog at xml.lt on the topic of web scraping using the DOM extension. This article showcases recommended practices and reasoned arguments against bad (and unfortunately common) alternatives. The author comes across as being significantly more informed on both the language and the application in the article’s content and code examples.
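To give a rough idea of what scraping with the DOM extension looks like, here is a minimal sketch. It is not code from the xml.lt article. The function name is made up for illustration, and it assumes the HTML has already been fetched into a string.

```php
<?php
// Extract href and link text pairs from an HTML string using the DOM extension.
function extract_links($html)
{
    $doc = new DOMDocument();
    // Real-world markup is rarely valid, so collect parse warnings
    // internally instead of emitting them, then restore the old setting.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Query for every anchor element that carries an href attribute.
    $xpath = new DOMXPath($doc);
    $links = array();
    foreach ($xpath->query('//a[@href]') as $anchor) {
        $links[] = array(
            'href' => $anchor->getAttribute('href'),
            'text' => trim($anchor->textContent),
        );
    }
    return $links;
}
```

Changing what gets extracted is then mostly a matter of changing the XPath expression.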
If you’re looking for references on the topic of web scraping with PHP, there’s always the article I wrote for the December 2007 issue of php|architect magazine, of which you can still purchase an electronic copy in PDF format. At some point, I also hope to write a short book on the subject. Until then, if you have related questions, you can generally reach me in the #phpc channel on Freenode, under the nick Elazar. I’m always glad to give out advice on web scraping and PHP, as I’m sure my good friend Jared Folkins (who is also my “Little Sis” from the PHPWomen Big Sis/Little Sis mentoring program) will attest.
Situations involving hierarchical data and relational databases are quite common in web applications. Trees lend themselves quite well to providing organizational structures for a web site, such as sitemaps and breadcrumb trails. A slightly less common and different type of situation, where the application is just as useful and a solution is a bit more complex to derive, is one involving graphical data (as in graphs, not graphics) and relational databases. These situations involve issues like the shortest path problem and find their solutions in graph theory, in algorithms such as A* or Dijkstra’s algorithm. An example of such a situation is an airline web site that requires the ability to locate connecting and round-trip flights and find the flight path with the lowest cost in terms of time or ticket price.
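To make the airline example a little more concrete, here is a minimal in-memory sketch of Dijkstra’s algorithm in PHP. It assumes the flights have already been loaded from the database into an array mapping each origin to its destinations and costs. The function name, airport codes, and fares are all hypothetical, and the simple linear scan for the next airport is chosen for readability rather than speed.

```php
<?php
// Find the cheapest route between two airports with Dijkstra's algorithm.
// $flights maps origin => array(destination => cost); costs must be non-negative.
function cheapest_route(array $flights, $from, $to)
{
    $cost = array($from => 0); // best known cost to reach each airport
    $previous = array();       // airport visited immediately before each airport
    $done = array();           // airports whose cost is final

    while (true) {
        // Pick the unfinished airport with the lowest known cost.
        $current = null;
        foreach ($cost as $airport => $c) {
            if (!isset($done[$airport]) && ($current === null || $c < $cost[$current])) {
                $current = $airport;
            }
        }
        if ($current === null) {
            return null; // destination unreachable
        }
        if ($current === $to) {
            break;
        }
        $done[$current] = true;

        // Relax every outbound flight from the current airport.
        $outbound = isset($flights[$current]) ? $flights[$current] : array();
        foreach ($outbound as $next => $price) {
            $candidate = $cost[$current] + $price;
            if (!isset($cost[$next]) || $candidate < $cost[$next]) {
                $cost[$next] = $candidate;
                $previous[$next] = $current;
            }
        }
    }

    // Walk backwards from the destination to rebuild the path.
    $path = array($to);
    while (end($path) !== $from) {
        $path[] = $previous[end($path)];
    }
    return array('cost' => $cost[$to], 'path' => array_reverse($path));
}

// Hypothetical fares between a handful of airports.
$flights = array(
    'AUS' => array('DFW' => 100, 'ATL' => 150, 'JFK' => 500),
    'DFW' => array('JFK' => 300),
    'ATL' => array('JFK' => 200),
);
$route = cheapest_route($flights, 'AUS', 'JFK');
```

The same function works for time instead of ticket price, since the only requirement is that the edge weights are non-negative.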
If you use MySQL, this chapter from “Get It Done with MySQL 5” is a fairly verbose but comprehensive guide to using MySQL to store graphical data. It includes background information such as terminology used in graph theory and has numerous implementation examples of adjacency list graph models, nested set graph models, and breadth-first and depth-first graph search algorithms.
For Oracle users, there’s a slightly more application-oriented tutorial that assumes more theoretical knowledge on the part of the reader. It shows that Oracle’s hierarchical data features unfortunately can’t be used in cases where cycles might exist in graphs (which is handy if you’re trying to detect them) and then goes on to show an implementation that uses temporary tables to store a summary of a graph analysis. A worthy side note is that part 2 of that tutorial deals with a more specialized approach using state machines that may or may not be applicable to your situation.
If you’d like more information on this topic, a good place to look is your local university. Most with a computer science program offer a course in theory of computation, which deals with topics like these as well as context-free grammars in the context of developing programming languages. Even if you never actually use this information to develop a language yourself, it can still serve a good purpose: it makes you more informed when engaging in discussions about language development, and it can increase your appreciation for the beauty of a language from a user perspective.
One of the tasks I was recently given at my job involved examining issues with a browser-based batch file upload component. The component that was being used, iBULC, required a separate installation of proprietary software on the client-side, had a rather clunky browser interface, and proved difficult to troubleshoot. Rather than expending effort to get iBULC to work properly, I went in search of a new solution. That new solution is SWFUpload.
SWFUpload is a small file upload solution that supports selection and uploading of multiple files, all while exposing a flexible event-driven API to allow you to handle events on the client-side in whatever manner suits your needs. It has two components, a Flash file (supports Flash 8 and 9) and a JavaScript include. A little HTML and JavaScript allows you to include these two components in your application and define handler functions to intercept whatever events might be relevant to your interface. I was amazed at how quickly I was able to get a solution working. The examples and bundled documentation are quite good.
I only experienced two issues with SWFUpload in the process of completing my task. The first was that ampersands were being HTML encoded in the POST request being sent by the SWFUpload Flash component to my PHP script on the server side for intercepting uploaded files. This caused PHP to be unable to parse the POST parameters correctly. Luckily, the issue had already been reported and a quick fix in the JavaScript component resolved the problem.
The other issue was in the Flash component not properly sending cookie values to the PHP script, even with the JavaScript plugin for handling cookies enabled. Apparently this issue is most likely due to a known bug in the Flash player for Firefox. The only method I was able to use to circumvent the issue required including cookie data as POST parameters and checking for them there in the server-side script. Obviously this is a quick-and-dirty solution and I hope I eventually get to apply a more proper solution.
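The server-side half of that workaround can be sketched roughly as follows. The helper name is hypothetical, and the commented usage assumes the value being passed is the PHP session ID, which is an assumption on my part about a typical setup rather than a detail given above.

```php
<?php
// Look up a value in the cookies first, then fall back to the POST parameters.
function cookie_or_post($name, array $cookies, array $post)
{
    if (isset($cookies[$name])) {
        return $cookies[$name];
    }
    return isset($post[$name]) ? $post[$name] : null;
}

// Possible use at the top of the upload handler, before starting the session:
// $sessionId = cookie_or_post(session_name(), $_COOKIE, $_POST);
// if ($sessionId !== null) {
//     session_id($sessionId);
// }
// session_start();
```

Accepting a session ID from POST data makes session fixation easier, which is part of why this only qualifies as a quick-and-dirty solution.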
One outside issue unrelated to SWFUpload that I also ran into in the process of completing this task was an odd error in Internet Explorer. Even though the page using SWFUpload ran fine in Firefox, IE consistently returned the error “Internet Explorer cannot open the Internet site, Operation aborted.” For whatever reason, IE apparently doesn’t like it when you try to run JavaScript within <script> tags inside a table if the JavaScript isn’t contained within a function. An obscure bug, but an annoying one nonetheless. Moving the offending JavaScript to its own function appeared to fix the problem.
Outside of those issues, I was quite happy with the results. I was able to put the new solution together within an afternoon, it uses the native OS interface for multiple file selection, and it allows me to respond to the events that occur in the file upload process however might be necessary. I’d definitely recommend that you check SWFUpload out if you’re in need of a similar solution.
So after eventually getting fed up with WordPress, especially after the WYSIWYG editor disappeared in the 2.3.3 update, I finally decided to bite the bullet and migrate my blog over to Habari. Once I’d been through the process, I thought I’d write a short blog entry about the experience.
First, there was the matter of content. Though it wasn’t as easy or intuitive as it could have been to track down how to migrate content from WordPress, once I knew how, it was a snap. Simply go to Admin > Plugins, activate the WordPress Importer plugin (which comes bundled with the release), then go to Admin > Import and you’ll have a WordPress Database option. From that point, it’s just a matter of putting in the authentication credentials to point Habari at the WordPress database and it seamlessly imports all your data into the Habari database.
Next came making Habari support my existing URL scheme from WordPress. It turns out that Habari has a database table for rewrite rules, but currently no section of the admin area to manage it. Ergo, the only way to add to or change these is to do it manually. Luckily, there was a blog entry from Michael Harris that detailed all this and even provided the exact INSERT statement needed.
After that came my blog theme. If the Habari developers are ex-WordPress developers as I’ve heard, they must not have liked the WordPress API much, because the two sure are different. This made theme migration look cumbersome enough that I decided to simply retire my old blog theme in favor of a slightly tweaked version of one of the stock themes available for Habari, namely Whitespace.
Finally, there were plugins. I wanted to continue using Akismet to manage content spam, as that had tended to serve me well while I was using WordPress. Luckily, Chris Davis has created an Akismet plugin. I downloaded the archive into /user/plugins, decompressed it, and then had to dig around in the plugin’s PHP file and add in my WordPress API key and blog URL. It would be nice if this was updated to use the configuration API that Habari offers for plugins. I tried the Blogroll plugin and didn’t really care for its interface. In that particular area, I actually liked how WordPress did things.
I experienced two particularly strange things during the process of migrating my blog. One occurred when I tried to swap out directories to make the new Habari-based version of my blog live. When I did that, all plugins mysteriously deactivated. I had to go back into Admin > Plugins and reactivate them individually. They all seemed to retain their settings, at least.
The other oddity happened after I activated the TinyMCE plugin so that I could use a browser-based WYSIWYG interface to edit content. The dashboard screen in the admin area (and only that screen, from what I can tell) started throwing an “exception without a stack frame” error. I’ve e-mailed the author on that one, so we’ll see what happens.
Overall, though, I’m very satisfied with Habari and look forward to using it to catch up on the backlog of post ideas I’ve managed to build up over the past few weeks.
I’ve been saving up Oracle-related gripes for a little while now. I’ve organized these into categories and, while each category isn’t really enough to stand as a blog entry on its own, all the gripes collectively make for an entry of a decent length. So, here goes.
Oracle 10g Express Edition
A coworker of mine actually came across one of these issues. Specifically, she found out that XE by default does not include the grants necessary to allow the native UTL_FILE package to be used by PL/SQL routines. It would be nice if this was documented somewhere that didn’t require a significant amount of digging to find out.
I managed to come across another one, and apparently an obscure one at that. The development team can’t even figure out how to replicate this one. For no apparent reason, the Enter key stopped working within worksheet tabs. The fix is as simple as going to Accelerators in Preferences, explicitly selecting Default, and clicking OK. This was a really annoying bug to live with until I found out how to fix it.
Oracle Designer
This is the main laundry list of the post.
First off, if you perform a DDL export from a tab other than the DB Admin tab, grants are not included in the resulting deployment script. If you’re deploying a nontrivial number of entities, it makes it annoying to have to switch tabs and go through the deployment process again.
This leads to the next point: all entities have to be reselected each time you want to deploy them. It would be a lot less tedious and annoying to, say, allow for sets of related entities to be created that could be deployed collectively by just selecting the set.
Going back to DDL exports, changes cannot be generated against a database unless you have the privileges necessary to deploy them. I honestly don’t understand why this is, as read privileges are all that are necessary to read the data dictionary tables, which are all that should be required for the test. This limitation doesn’t make the program very conducive to workflow environments.
Again in DDL exports, exported entities are not intelligently ordered in the generated deployment script such that dependencies come before reverse dependencies. This necessitates manual modifications to those deployment scripts and, ergo, documentation describing these modifications so that the script can be regenerated at a later time and potentially by a different developer.
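The ordering I have in mind is a plain topological sort. The following is a minimal sketch of the idea in PHP, not anything Designer provides. The function names and entity names are hypothetical, and it assumes you already have a map of each entity to the entities it depends on.

```php
<?php
// Order entities so that every dependency appears before the entities that need it.
// $dependsOn maps entity => list of entities it depends on.
function deployment_order(array $dependsOn)
{
    $ordered = array();
    $state = array(); // entity => 'visiting' or 'done'
    foreach (array_keys($dependsOn) as $entity) {
        visit_entity($entity, $dependsOn, $state, $ordered);
    }
    return $ordered;
}

// Depth-first visit: place all dependencies of $entity, then $entity itself.
function visit_entity($entity, array $dependsOn, array &$state, array &$ordered)
{
    if (isset($state[$entity])) {
        if ($state[$entity] === 'visiting') {
            throw new RuntimeException("Circular dependency involving $entity");
        }
        return; // already placed in the output
    }
    $state[$entity] = 'visiting';
    $dependencies = isset($dependsOn[$entity]) ? $dependsOn[$entity] : array();
    foreach ($dependencies as $dependency) {
        visit_entity($dependency, $dependsOn, $state, $ordered);
    }
    $state[$entity] = 'done';
    $ordered[] = $entity;
}

// Hypothetical entities: a view that needs two tables, one of which needs the other.
$order = deployment_order(array(
    'ORDERS_V'  => array('ORDERS', 'CUSTOMERS'),
    'ORDERS'    => array('CUSTOMERS'),
    'CUSTOMERS' => array(),
));
```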
Can you tell I’m not overly happy with the DDL export?
PL/SQL
I only really have one gripe for this section, but it’s a big one: the first note in the Oracle Application Server mod_plsql User’s Guide, section 3.6 Parameter Passing lies! It may be technically correct, but what it doesn’t mention is that there is a way to return a default expression as opposed to a default value. It’s a bit of a workaround, and it showcases how difficult it can be to work with aggregate data structures in PL/SQL, but it works. Documentation being this lacking is just completely substandard.
Conclusion
I think that about wraps it up. I don’t do straight-up rants that often, but in these cases I felt it was necessary. It’s entirely possible no one at Oracle will ever read them, but at least I can say it wasn’t because they weren’t stated somewhere.
Without a little background in programming languages or computer science in general, it’s entirely possible that typing systems are not something that has crossed your mind. I thought I’d take a blog entry to share some of my thoughts on how they’re affecting the creation and evolution of languages.
First of all, Benjamin C. Pierce probably has a point: terminology used to refer to typing concepts is about as useful as buzzwords like AJAX or Web 2.0 these days. Be that as it may, I’m going to reach back into the recesses of what I remember from the programming languages course I took in college to dredge up some of this terminology.
If you aren’t familiar with static versus dynamic typing or strong versus weak typing, it may be worth it to read up on those before proceeding with the rest of this blog entry. Here are a few examples of each:
The line between strong and weak typing seems to be blurred as languages like these evolve. The reason for this is that each side of typing has its advantages. Strong typing, particularly when it is also static, allows for compile-time checking, which can serve to eliminate human error, as well as performance optimizations from being aware of types at compile-time. It can also serve to make source code more intuitive to follow in some respects. Weak typing, on the other hand, can allow for higher levels of abstraction and, by proxy, the need for less code in order to allow identical operations to be executed on multiple types. It can also allow for things like variable variables, variable functions, and other interesting features not possible in strongly-typed languages.
Yet languages on either side of the proverbial fence are drawing in strengths from the other side. Java, previously limited to the flexibility that could be provided by polymorphism while still maintaining strong typing, introduced generics in 1.5, whereby typing was still enforced but a higher level of logic abstraction was enabled for developers. By the same token, PHP has had explicit typecasting for a while and more recently introduced type hinting for object types in 5.0 and for arrays in 5.1 (which may extend to scalar types in later versions). C# 3.0, which shipped with .NET 3.5, adds type inference, which, while only syntactic sugar, at least alleviates the need for verbosity when performing the most common method of initialization (i.e. setting a variable of a given class to an object instance of that class, as opposed to one involving a subclass of one or more of the classes involved).
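For anyone who hasn’t seen the PHP side of this, here is a small sketch of type hinting and explicit typecasting together. The class and function names are made up for illustration. One assumption worth stating: in PHP 5 a violated hint raises a catchable fatal error, whereas PHP 7 and later throw a TypeError.

```php
<?php
class Message
{
    public $text;

    public function __construct($text)
    {
        $this->text = $text;
    }
}

// The array hint rejects anything that is not an array; the class hint
// rejects anything that is not a Message (or a subclass of it).
function join_messages(array $messages, Message $footer)
{
    $lines = array();
    foreach ($messages as $message) {
        $lines[] = (string) $message; // explicit typecast to string
    }
    $lines[] = $footer->text;
    return implode("\n", $lines);
}
```

The elements inside `$messages` are still untyped, so the language remains dynamically typed where it matters to the caller while the function boundary gets some enforcement.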
It’s also becoming commonplace for dynamically typed language interpreters to get ported to Java and .NET in order to leverage the features of those languages and the native libraries of the host language in the existing execution environment. Take these examples for instance.
In short, some level of control over typing is obviously a desired feature in any useful language. As well, I don’t think a language can be truly useful without having a bit of both worlds to some degree. The reason for the existence of programming languages is to enable developers to control machines whose primary purpose is to manipulate data (and, as has been pointed out many times before, are stupid and do what we tell them to do). If control over said manipulation is hampered by the typing system, it hampers the effectiveness of the language. In this, I have to agree with Ludwig Wittgenstein, who said, “The limits of my language mean the limits of my world.”