Movable Type 3.34 to WordPress 2.8.4

As you can see, I have moved my blog over to WordPress. Actually, the whole domain is powered by WordPress.

If you see any problems, please comment on this post or contact me.

I was using Movable Type 3.34, which came out in January 2007 and was by now badly outdated. I hadn’t upgraded because I was using a lot of hacks and had made my own modifications to the core code.

A few months ago, I ran into major problems with spam comments. That got me thinking about an upgrade. I was, however, able to solve the spam issue with MT-Akismet.

I downloaded Movable Type 4.2, the latest version, and played around with it. I liked it, but I realized that none of the plugins I was using had an upgrade for MT4. Also, I could find very few amateur bloggers on Movable Type. Professional websites and blogs were mostly using Movable Type, but the rest of the bloggers had switched to WordPress long ago.

I had already been using WordPress for a private blog and so I decided to take a look at the latest version of WordPress and how I could migrate my blog to it.

There were a few things I had to give up: No more MathML or serving all pages as application/xhtml+xml; no OpenPGP signed comments.

I also had to modify the Movable Type export script and WordPress’s import script so I could keep the same post IDs and slugs (basenames) as well as import tags and convert the content based on the text filter used in Movable Type.

Here are changes required in Movable Type 3.34’s lib/MT/ file:

---        2009-09-23 11:25:12.975789000 -0700
+++     2009-09-23 11:25:12.764003000 -0700
@@ -529,27 +529,27 @@
 AUTHOR: <$MTEntryAuthor strip_linefeeds="1"$>
 TITLE: <$MTEntryTitle strip_linefeeds="1"$>
+BASENAME: <$MTEntryBasename$>
 STATUS: <$MTEntryStatus strip_linefeeds="1"$>
 ALLOW COMMENTS: <$MTEntryFlag flag="allow_comments"$>
 CONVERT BREAKS: <$MTEntryFlag flag="convert_breaks"$>
 ALLOW PINGS: <$MTEntryFlag flag="allow_pings"$>
+POSTID: <$MTEntryID$>
-<$MTEntryBody convert_breaks="0"$>
-<$MTEntryMore convert_breaks="0"$>
 <$MTEntryExcerpt no_generate="1" convert_breaks="0"$>
+<MTEntryTags include_private="1" glue=","><$MTTagName$></MTEntryTags>
@@ -558,7 +558,7 @@
 IP: <$MTCommentIP strip_linefeeds="1"$>
 URL: <$MTCommentURL strip_linefeeds="1"$>
 DATE: <$MTCommentDate format="%m/%d/%Y %I:%M:%S %p"$>
-<$MTCommentBody convert_breaks="0"$>

And here are the changes required in WordPress 2.8.4’s wp-admin/import/mt.php:

--- mt.php.orig 2009-05-05 12:43:53.000000000 -0700
+++ mt.php      2009-09-23 11:49:00.182602000 -0700
@@ -375,6 +375,15 @@
                                        $post->post_title = $title;
                                else if ( 'ping' == $context )
                                        $ping->title = $title;
+                       } else if ( 0 === strpos($line, "BASENAME:") ) {
+                               $postname = trim( substr($line, strlen("BASENAME:")) );
+                               if ( '' == $context )
+                                       $post->post_name = $postname;
+                               else if ( 'ping' == $context )
+                                       $ping->post_name = $postname;
+                       } else if ( 0 === strpos($line, "POSTID:") ) {
+                               $postid = trim( substr($line, strlen("POSTID:")) );
+                               $post->import_id = $postid;
                        } else if ( 0 === strpos($line, "STATUS:") ) {
                                $status = trim( strtolower( substr($line, strlen("STATUS:")) ) );
                                if ( empty($status) )
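
To make the idea behind the two patches easier to follow, here is a minimal Python sketch of the same mapping: the export header carries extra BASENAME and POSTID lines, and the importer copies them into the post's slug and ID. This is not the importer's code. The function names `parse_entry_header` and `to_wp_fields` are hypothetical, and the sketch converts the ID to an integer, which the PHP patch above does not do.

```python
def parse_entry_header(lines):
    """Collect 'KEY: value' header lines from one exported entry into a dict."""
    fields = {}
    for line in lines:
        # Split on the first colon only, so values such as titles or
        # timestamps may contain colons themselves.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def to_wp_fields(fields):
    """Map the two added export keys onto the post fields set by the patch."""
    wp = {}
    if "BASENAME" in fields:
        wp["post_name"] = fields["BASENAME"]      # keeps the old slug
    if "POSTID" in fields:
        wp["import_id"] = int(fields["POSTID"])   # keeps the old post ID
    return wp


header = ["TITLE: A post: part 2", "BASENAME: a_post", "POSTID: 123"]
print(to_wp_fields(parse_entry_header(header)))
```

Keeping the old IDs and basenames is what lets the redirects from the old URLs resolve to the right posts.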

I also added the following to the .htaccess file to handle the redirects needed from my old URLs to the new ones:

RewriteEngine On
RewriteBase /
# Old Monthly archives
RewriteRule ^weblog/archives/([0-9]{4})/([0-9]{2}) weblog/$1/$2/ [R,L]
# Old single entry links
RewriteRule ^weblog/archives/000([0-9]{3}).html$ ?p=$1 [R,L]
# Another form of old single entry links
RewriteRule ^weblog/archives/entry/000([0-9]{3}).html$ ?p=$1 [R,L]
# Old category archives
# Change underscores to hyphens
RewriteRule ^weblog/archives/([^_]*)_([^_]*)_([^_]*)_(.*)$ weblog/category/$1-$2-$3-$4/ [R=301,L]
RewriteRule ^weblog/archives/([^_]*)_([^_]*)_(.*)$ weblog/category/$1-$2-$3/ [R=301,L]
RewriteRule ^weblog/archives/([^_]*)_(.*)$ weblog/category/$1-$2/ [R=301,L]
RewriteRule ^weblog/archives/([^0-9]*)$ weblog/category/$1/ [R,L]
# To handle the old MovableType feeds.
RewriteRule ^weblog/atom\.xml$ feed/atom/ [R,L]
RewriteRule ^weblog/index\.xml$ feed/ [R,L]
RewriteRule ^weblog/index\.rdf$ feed/ [R,L]
# To handle old Movable Type permalinks.
RewriteRule ^weblog/([0-9]{4}/[0-9]{2}/.*)\.html$ $1/ [R,L]
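
Rules like these are easy to get subtly wrong, so it can help to check the mapping offline first. The following is a rough Python approximation of a few of the rules above, not a replacement for testing against Apache itself. The function name `map_old_url` is hypothetical, Python's `re` module only approximates mod_rewrite's regex handling, and the leading slash that RewriteBase would add is left out.

```python
import re

# Ordered (pattern, substitution) pairs mirroring a subset of the rules above.
# As with the [L] flag, the first matching rule wins.
RULES = [
    (r"^weblog/archives/([0-9]{4})/([0-9]{2})", r"weblog/\1/\2/"),
    (r"^weblog/archives/000([0-9]{3}).html$", r"?p=\1"),
    (r"^weblog/archives/([^_]*)_([^_]*)_(.*)$", r"weblog/category/\1-\2-\3/"),
    (r"^weblog/archives/([^_]*)_(.*)$", r"weblog/category/\1-\2/"),
    (r"^weblog/atom\.xml$", r"feed/atom/"),
]


def map_old_url(path):
    """Return the redirect target for an old path, or None if no rule matches."""
    for pattern, target in RULES:
        match = re.search(pattern, path)
        if match:
            # mod_rewrite replaces the whole path with the substitution.
            return match.expand(target)
    return None


for old in ("weblog/archives/000123.html",
            "weblog/archives/science_and_technology",
            "weblog/atom.xml"):
    print(old, "->", map_old_url(old))
```

The category rules need one pattern per underscore count because each pattern has a fixed number of capture groups.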

I liked the iNove theme and installed it, with some modifications made via a child theme.

I have also installed the following plugins:

  1. About Me widget
  2. Akismet
  3. AmazonFeed
  4. AVH Amazon
  5. Collapsing Archives
  6. Contact Form 7
  7. Easy AdSense
  8. Efficient Related Posts
  9. Google Analyticator
  10. Google XML Sitemaps
  11. Lifestream
  12. NextGEN Gallery
  13. Now Reading Reloaded
  14. Page Links To
  15. Recommended Reading: Google Reader Shared
  16. Rich Text Biography
  17. JanRain RPX – Authentication from Facebook, Twitter, Google, Yahoo, Windows Live ID and OpenID
  18. Search & Replace
  19. Search Meter
  20. Sociable
  21. Thread Twitter
  22. WP-Syntax
  23. XML Google Maps

I am also working on a fork of Now Reading Reloaded. It will be for movies and will be called Now Watching. Once I have tested it, I’ll release it here and on the WordPress site.

While the blog itself has been completely moved over to WordPress, I am still working on migrating the other static pages. Also, my list of books read and movies watched is not back yet.

Commenting Problems

Commenting using non-ASCII characters is not working right now. We hope to have a fix soon. The temporary fix is to revert a recent upgrade of the MTValidate plugin.

We are having some problems with comments right now. You cannot write any Urdu in the comments or even use single or double quote signs (smart or otherwise). Basically you cannot use any non-ASCII characters in comments.

This seems to be related to the recent upgrade of this weblog to Movable Type 3.34 and MTValidate 0.5. It is probably a result of this blog being the only Movable Type blog running native Unicode.

We are working on fixing this soon.

UPDATE: I have temporarily fixed it by reverting to MTValidate 0.4. So comment away!

Movable Type Security Bug

Movable Type 3.33 has a script injection bug if the nofollow plugin is disabled. Comment text is no longer sanitized as it should be.

Last month, Jacques Distler brought to my attention that Movable Type 3.3 had a script injection problem. Basically, any JavaScript entered in a comment would not be sanitized and would appear on the blog. For example, try typing this in the comment form:

<script type="text/javascript">alert('hi!');</script>

It looked like Movable Type was no longer sanitizing comments, which it did until version 3.2.

Since both our installations were heavily modified, we were not sure whether it was due to our code modifications or an inherent Movable Type problem. I checked at a number of other weblogs and found out that script injection was a problem at some but not at others.

I immediately brought this bug to the attention of Six Apart, the company that makes Movable Type. They confirmed the issue and clarified that it affected only those users who had disabled the nofollow plugin distributed with MT 3.33. They also asked me for 30 days before making the issue public so that they could work on a fix.

While there has not been any announcement by Six Apart on this matter, I expect that they will fix it in the bugfix release 3.34, which is currently being worked on in their code repository.

Meanwhile, if you are using Movable Type 3.3, here are your options. If you have the nofollow plugin enabled (which it is by default), you shouldn’t have a problem. Otherwise:

  1. Enable the nofollow plugin.
  2. Edit your templates by adding sanitize="1" to the MT comment tags, like this:
    <MTCommentBody sanitize="1"> and <MTCommentPreviewBody sanitize="1">.
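
To show why unsanitized comment text is a problem, here is a small Python sketch that neutralizes the script example from above by escaping it before output. This is not how Movable Type's sanitize attribute is implemented. Escaping is a simpler stand-in technique, and the function name `render_comment` is hypothetical.

```python
import html


def render_comment(text):
    """Escape comment text so that markup is displayed instead of executed."""
    return "<p>" + html.escape(text) + "</p>"


comment = "<script type=\"text/javascript\">alert('hi!');</script>"
print(render_comment(comment))
```

With escaping, the browser shows the script tag as literal text. Without it, the script runs for every visitor who views the comment.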

UPDATE: It looks like the sanitize function is completely disabled when you disable the nofollow plugin as it isn’t sanitizing my entry text either.

UPDATE II: Movable Type 3.34 fixes the problem.

Movable Type and Unicode

Running Movable Type natively in Unicode was not as difficult as I thought but it still required a number of patches to the code.

I have been trying to get Movable Type to run Unicode natively for a while. When Movable Type was upgraded to version 3.3, I saw my chance. This new version has a lot of the needed code for encoding and decoding etc. and made my job much easier than before.

If you remember my previous travails, the DBD::mysql module lacked UTF8 support. Almost immediately after my changes, the developer release of DBD::mysql finally included a UTF8 patch. But that was too late for me. Plus, I am going to wait for it to be included in a regular release, since DBD::mysql is somewhat complicated.

What I did was to set the UTF-8 flag for everything coming out of the database using a wrapper around the DBI module. I used Pavel Kudinov’s code for that, which is given below.

# re-implementation by Pavel Kudinov
# originally from:
package UTF8DBI    ; use base DBI    ;
package UTF8DBI::db; use base DBI::db;
package UTF8DBI::st; use base DBI::st;
sub _utf8_() {
use Encode;
if    (ref $_ eq 'ARRAY'){ &_utf8_() foreach        @$_  }
elsif (ref $_ eq 'HASH' ){ &_utf8_() foreach values %$_  }
else                     {         Encode::_utf8_on($_) };
$_;  # return the (now flagged) value so the fetch wrappers can pass it on
}
sub fetch             { return _utf8_ for shift->SUPER::fetch            (@_)  };
sub fetchrow_arrayref { return _utf8_ for shift->SUPER::fetchrow_arrayref(@_)  };
sub fetchrow_hashref  { return _utf8_ for shift->SUPER::fetchrow_hashref (@_)  };
sub fetchall_arrayref { return _utf8_ for shift->SUPER::fetchall_arrayref(@_)  };
sub fetchall_hashref  { return _utf8_ for shift->SUPER::fetchall_hashref (@_)  };
sub fetchcol_arrayref { return _utf8_ for shift->SUPER::fetchcol_arrayref(@_)  };
sub fetchrow_array    {                 @{shift->       fetchrow_arrayref(@_)} };
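
For readers who do not read Perl, the idea of the wrapper can be sketched in Python: walk whatever the database returned and turn raw bytes into text, recursing into lists and dictionaries. This is only an analogy. The function name `decode_utf8` is hypothetical, and unlike Perl's `Encode::_utf8_on`, which just flips a flag without checking the data, Python's `decode` validates the bytes and raises an error on invalid UTF-8.

```python
def decode_utf8(value):
    """Recursively decode UTF-8 bytes to str inside nested lists and dicts."""
    if isinstance(value, bytes):
        return value.decode("utf-8")
    if isinstance(value, list):
        return [decode_utf8(item) for item in value]
    if isinstance(value, dict):
        return {key: decode_utf8(item) for key, item in value.items()}
    return value  # numbers, None, and already-decoded text pass through


row = [b"\xd8\xb3", {"title": b"abc"}, 5]
print(decode_utf8(row))
```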

With the UTF8DBI wrapper in place, I needed to replace calls to the DBI module with calls to the UTF8DBI module, as shown in the patches below.

--- lib/MT/ObjectDriver/	2006-09-06 19:27:17.000000000 -0700
+++ lib/MT/ObjectDriver/	2006-09-06 19:23:09.000000000 -0700
@@ -7,7 +7,7 @@
package MT::ObjectDriver::DBI;
use strict;
-use DBI;
+use UTF8DBI;
use MT::Util qw( offset_time_list );
use MT::ObjectDriver;
--- lib/MT/ObjectDriver/DBI/	2006-09-06 19:26:55.000000000 -0700
+++ lib/MT/ObjectDriver/DBI/	2006-09-06 19:24:20.000000000 -0700
@@ -93,10 +93,10 @@
$dsn .= ';hostname=' . $cfg->DBHost if $cfg->DBHost;
$dsn .= ';mysql_socket=' . $cfg->DBSocket if $cfg->DBSocket;
$dsn .= ';port=' . $cfg->DBPort if $cfg->DBPort;
-    $driver->{dbh} = DBI->connect($dsn, $cfg->DBUser, $cfg->DBPassword,
+    $driver->{dbh} = UTF8DBI->connect($dsn, $cfg->DBUser, $cfg->DBPassword,
{ RaiseError => 0, PrintError => 0 })
or return $driver->error(MT->translate("Connection error: [_1]",
-             $DBI::errstr));
+             $UTF8DBI::errstr));

However, that didn’t fix all the problems. The Perl CGI module was still working in Latin-1 mode. I could have wrapped it in a UTF8CGI module, but newer versions of the CGI module support Unicode, so I just upgraded the version of CGI bundled with Movable Type. I still needed to tell the CGI module that the character set in use was UTF-8. I could either do that every single time the CGI module was called or set the default character set to UTF-8. Since this CGI module lives in the Movable Type extlib folder, I decided to modify its default character set.

--- extlib/	2006-09-15 10:39:30.000000000 -0700
+++ extlib/	2006-09-15 10:39:59.000000000 -0700
@@ -517,8 +517,8 @@
$fh = to_filehandle($initializer) if $initializer;
-    # set charset to the safe ISO-8859-1
-    $self->charset('ISO-8859-1');
+    # set charset to utf-8
+    $self->charset('utf-8');

I also set the utf8 mode for writing the files to disk.

--- lib/MT/FileMgr/	2006-09-27 06:56:39.000000000 -0700
+++ lib/MT/FileMgr/	2006-09-27 06:57:36.000000000 -0700
@@ -75,6 +75,9 @@
binmode($from) if $fmgr->is_handle($from);
+    else {
+        binmode(FH, ":utf8");
+    }
## Lock file unless NoLocking specified.
flock FH, LOCK_EX unless $fmgr->{cfg}->NoLocking;
seek FH, 0, 0;

These changes caused problems with file uploads through the Movable Type interface. I expected this since I have run into this problem with PHP and mbstring as well. The following patch fixed this issue.

--- lib/MT/App/	2006-10-08 21:17:11.000000000 -0700
+++ lib/MT/App/	2006-10-08 21:17:37.000000000 -0700
@@ -8334,6 +8334,7 @@
$app->validate_magic() or return;
my $q = $app->param;
+    $q->charset('iso-8859-1');
my($fh, $no_upload);
if ($ENV{MOD_PERL}) {
my $up = $q->upload('file');

Then it was time to comment out the liberally sprinkled code to switch off the utf8 flag in Movable Type.

--- lib/MT/I18N/	2006-09-16 20:22:22.000000000 -0700
+++ lib/MT/I18N/	2006-09-16 20:23:26.000000000 -0700
@@ -292,7 +292,7 @@
$text = $class->_conv_to_utf8($text, $enc) if $enc ne 'utf-8';
$text = substr($text, $startpos, $length);
-    Encode::_utf8_off($text);
+#    Encode::_utf8_off($text);
$text = $class->_conv_from_utf8($text, $enc) if $enc ne 'utf-8';
@@ -322,7 +322,7 @@
-    Encode::_utf8_off($text) if $to eq 'utf-8';
+#    Encode::_utf8_off($text) if $to eq 'utf-8';

Finally, I had to make changes to the MTHash plugin that I use to force comment previews. The Digest::SHA1 module only accepts bytes, so the UTF-8 characters had to be encoded as bytes before being passed to any functions in the module. Here is my patch:

--- lib/MT/App/	2006-09-16 21:01:21.000000000 -0700
+++ lib/MT/App/	2006-09-16 21:03:08.000000000 -0700
@@ -266,9 +266,10 @@
require Digest::SHA1;
my $sha1 = Digest::SHA1->new;
-     $sha1->add($q->param('text') . $q->param('entry_id') . $app->remote_ip
-                . $q->param('author') . $q->param('email') . $q->param('url')
-                . $q->param('convert_breaks'));
+     my $octets = Encode::encode_utf8($q->param('text') . $q->param('entry_id') . $app->remote_ip
+                                      . $q->param('author') . $q->param('email') . $q->param('url')
+                                      . $q->param('convert_breaks'));
+     $sha1->add($octets);
my $salt_file = MT::ConfigMgr->instance->PluginPath .'/salt.txt';
my $FH;
open($FH, $salt_file) or die "cannot open file <$salt_file> ($!)";
--- plugins/	2006-09-16 20:29:22.000000000 -0700
+++ plugins/	2006-09-16 20:57:22.000000000 -0700
@@ -32,7 +32,8 @@
or return $ctx->error($ctx->errstr);
my $sha1 = Digest::SHA1->new;
-  $sha1->add($content);
+  my $octets = Encode::encode_utf8($content);
+  $sha1->add($octets);
my $salt_file = MT::ConfigMgr->instance->PluginPath .'/salt.txt';
open(FH, $salt_file) or die "cannot open file <$salt_file> ($!)";
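
The same constraint shows up in other languages, which may make the patch easier to follow. Below is a minimal Python sketch of the encode-then-hash step, where `hashlib.sha1` likewise refuses text and wants bytes. The function name `comment_digest` is hypothetical, and the salt handling from the plugin is left out.

```python
import hashlib


def comment_digest(*fields):
    """Join the fields, encode them as UTF-8 bytes, and return the SHA-1 hex digest."""
    octets = "".join(fields).encode("utf-8")
    return hashlib.sha1(octets).hexdigest()


# Hashing text directly fails; it has to be encoded to bytes first.
try:
    hashlib.sha1("text")
except TypeError as err:
    print("hashing text directly failed:", err)

print(comment_digest("\u0633\u0644\u0627\u0645", "42", "127.0.0.1"))
```

Encoding the concatenated fields once, as the patch does, keeps the digest identical between the preview and the final post as long as both sides encode the same way.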

One thing that I still need to do is to fix the Serializer and Un-serializer used by Movable Type plugins.

Movable Type Upgrade

I have upgraded to Movable Type 3.32 and have made modifications to run it natively with Unicode. I like some of the new features (better Unicode support and tags). If there are any problems due to the upgrade, drop me a line.

I just upgraded to Movable Type 3.32 and am now running Movable Type on UTF-8 natively (that’s my doing, not Six Apart’s), but more on the Unicode issues later.

Here’s what I like about the changes in MT 3.3.

  • Movable Type can now be configured (using the DeleteFilesAtRebuild configuration directive) to delete files made unnecessary by changes made in the administrative interface. Individual archive files are deleted when previously published entries are deleted or unpublished. Category archives are deleted when their corresponding categories are deleted.
  • Pings coming from the same IP address with the same Source URL are now silently discarded. A success value is, however, sent to the pinging server so that the server doesn’t keep trying to re-ping. [Not the best approach but better than duplicate pings.]
  • The order of attributes specified in template tags is now observed and respected (e.g. trim_to="10" remove_html="1" is different from remove_html="1" trim_to="10"). In addition, the same attribute can now be processed multiple times if so desired (i.e., regex="abc" regex="def"). [Jacques must like that.]
  • Added textarea resizing controls to the template editing pages.
  • In version 3.2, using certain later versions of MySQL or PostgreSQL, some non-ASCII characters were not returned correctly from the database as originally written. This was caused by a mismatch between the character_set_client and character_set_connection variables. To fix this problem, we’ve added a configuration directive, SQLSetNames, which will inform the database of the character set being used by the client. The database character set must match the PublishCharset used by Movable Type. [I had implemented it already in my system.]
  • Implemented TrackBack transcoding between many character sets via Encode/JCode modules. This allows for correct display of TrackBacks sent in an encoding different than the recipient’s blog character encoding. [It is good that Six Apart is doing this, but I don’t like their implementation and prefer the one by Jacques Distler which I have been using already.]

I also like the inclusion of tags, though it would take me some time to populate my posts with tags and make any use of them.

The search page is barely working right now, but I’ll fix it soon. If there are any other problems with the upgrade (with commenting, trackbacks or anything else), please let me know.

I have also made some template changes. One is the inclusion of a menu bar at the top so that you can find the most common pages easily. Also, I am now including the sidebar in most pages other than individual entries.