جمعہ 11 فروری 2005 / Friday, February 11, 2005
Front-end and Back-end Changes
There have been a lot of changes here recently, most of them on the back-end. Most of this work was related to having a bilingual (English and Urdu) blog along with MathML equations. This required valid XHTML 1.1 and serving the site as application/xhtml+xml as described before.
One strange side effect of a good CSS/XHTML design is that something inevitably doesn’t show up correctly in Microsoft Internet Explorer. I ran into two problems.
- One was the lack of the :lang() pseudo-class selector in MSIE. Therefore, I had to style Urdu text using a class for MSIE.
- Another bizarre effect was that the calendar on the right sidebar overflowed in MSIE. I have fixed that, as long as you are using medium or smaller fonts.
I also changed the Reading list and movie list archive pages so that they show an excerpt from my review on the main blog instead of just showing the book or movie titles. I used the multiblog plugin for the purpose. I found two issues with the plugin:
- I couldn’t use MT tags as the entry ID argument in multiblog to specify which entry should be shown from my main blog. That was fixed with a couple of lines of code.
- Multiblog has a bug where it displays entries from the other blog regardless of whether they are draft or published. I didn’t know how to fix that in the plugin code, so I am using <MTIfEqual> from the Compare plugin along with the <MTEntryStatus> tag to filter out the unpublished entries.
My category pages were becoming huge. The photography archive was more than 1MB in size while a couple of other categories were more than 500KB. So instead of showing the whole entry text in the category archives, I am now showing only an excerpt.
I also wanted to switch the category and monthly archives to dynamic publishing. However, in Movable Type’s implementation of dynamic pages, most of the plugins would not work, and that was unacceptable. So the archives are going to stay static.
I patched lib/MT/Entry.pm so that I could send trackbacks to my own entries. This way, when I write a new entry about something that I have written about before, I won’t have to update the old entry with a link to the new one. Instead, the trackback will add that link for me.
Like a lot of recent changes, I got the weather forecast from Jacques Distler. I am showing the current weather and the 2-day forecast for Atlanta from the National Weather Service on the sidebar on the main page.
To enhance the reader experience, I have added small logos to posts containing Urdu or MathML. Clicking on these logos will open a window telling you how you can view MathML or Urdu content nicely. I stole this idea (and the MathML logo) from Jacques Distler.
I have also added instructions about how to comment in Urdu, write Math in the comments, or PGP sign your comment.
I am trying to increase the Urdu content and interface of the blog to make it truly bilingual. As a first step, the date headers and category names are bilingual. I had to add the month and weekday names in Urdu to lib/MT/Util.pm. For the categories, I am using the category description as the field for the Urdu version of the category title.
The blog statistics section on the right sidebar of the main page already listed the number of entries, comments and pings. Using the Countdown and MTSQL plugins, I added a counter there showing how long this blog has been up.
As for valid XHTML, everything in the zackvision.com domain is XHTML 1.1 + MathML 2.0 valid and served as application/xhtml+xml. This includes cgi pages, which broke Typekey. Validating individual archive pages with the comments and trackbacks wasn’t as difficult as I thought it would be because I am using the textile plugin. Ampersands were the main problem in trackbacks.
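To illustrate the ampersand problem: a bare "&" in a trackback excerpt or URL makes the whole page ill-formed. The following is only a minimal Python sketch of the idea, not the code running on this blog (which is Movable Type and Perl); the function name `escape_bare_ampersands` is made up for the example.

```python
import re

# Matches an "&" that does NOT already start a character or entity reference
# such as "&amp;", "&#169;" or "&#xA9;".
BARE_AMPERSAND = re.compile(
    r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)"
)


def escape_bare_ampersands(text: str) -> str:
    """Replace bare ampersands with "&amp;", leaving existing references alone."""
    return BARE_AMPERSAND.sub("&amp;", text)


if __name__ == "__main__":
    print(escape_bare_ampersands("page.cgi?a=1&b=2 and Tom & Jerry &amp; friends"))
```

Note that this only protects against bare ampersands. A named entity that XHTML does not define would pass through untouched and still break the page.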
Another issue with trackbacks is that Unicode Urdu (or Farsi) text in the trackback excerpt seems to generate invalid characters because the character count doesn’t work as expected. I don’t think it is a MT problem as it happened with trackbacks from Wordpress as well as Typepad.
The only exception to the valid pages is the popup photos. The default uploaded image popup template is horribly invalid. I have managed to write my own template to fix that, but changing the code for all the photos I have put on the blog over more than 2 years is not that easy.
Another place that doesn’t generate valid XHTML 1.1 is the <MTCommentFields> code in lib/MT/Template/Context.pm, which I have modified as well.
Serving my pages as application/xhtml+xml had broken the Google ads on my individual entry pages. They work now thanks to Keystone Websites.
First, the MIME type broke the Typekey commenting because of the use of document.write in the commenting part of the individual entry archive. I fixed it with PHP-Typekey. But serving mt-comments.cgi as application/xhtml+xml broke it badly. So I have removed Typekey since almost no one uses it.
Since application/xhtml+xml requires every served page to be well-formed (otherwise the browser shows an error message instead of the page), I am now requiring comments to be valid XHTML. I have installed MTValidate for the purpose. The plugin wasn’t able to find sgml-lib until I changed its path in the config file from a relative path to an absolute one, but it works beautifully now. Since the MT-Textile 2 filter is the default for comments, most comments should validate easily.
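For a rough idea of what such a check involves, here is a small Python sketch. It is not how MTValidate works: MTValidate validates against the DTD (hence sgml-lib), while this sketch only tests XML well-formedness, which is the weaker condition a browser enforces for application/xhtml+xml. The function name `is_well_formed` is an assumption for the example.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def is_well_formed(comment_markup: str) -> bool:
    """Return True if the comment parses as XML once wrapped in a single root."""
    # A comment is a fragment, so give it one enclosing element before parsing.
    wrapped = f'<div xmlns="{XHTML_NS}">{comment_markup}</div>'
    try:
        ET.fromstring(wrapped)
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    print(is_well_formed("<p>Hello <em>world</em></p>"))  # properly nested
    print(is_well_formed("<p>Hello <em>world</p>"))       # <em> never closed
```

One caveat: without a DTD the parser also rejects named entities such as &nbsp;, so a real validator is more forgiving in that respect and stricter in others (element names, nesting rules, attributes).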
The next step was to force commenters to preview the comment first (and again after any changes). I am doing that with MTHash. An added benefit of this plugin is that it should stop bot-submitted comment spam.
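I have not described how MTHash does this internally, so the following Python sketch is only one plausible way to force a preview, under my own assumptions: the preview page embeds a keyed hash of the exact comment text, and the post handler accepts the comment only if the hash still matches. The names `SECRET_KEY`, `preview_token` and `may_post` are hypothetical.

```python
import hashlib
import hmac

# Hypothetical server-side secret; it must never be sent to the browser.
SECRET_KEY = b"replace-with-a-private-random-value"


def preview_token(comment_text: str) -> str:
    """Token the preview page would embed as a hidden form field."""
    digest = hmac.new(SECRET_KEY, comment_text.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


def may_post(comment_text: str, submitted_token: str) -> bool:
    """Accept the comment only if it is exactly the text that was previewed."""
    expected = preview_token(comment_text).encode("utf-8")
    return hmac.compare_digest(expected, submitted_token.encode("utf-8"))


if __name__ == "__main__":
    token = preview_token("A first draft")
    print(may_post("A first draft", token))   # unchanged since preview
    print(may_post("An edited draft", token)) # edited, must preview again
```

A bot that posts straight to the comment script has no valid token, which is why a scheme like this also helps against automated spam.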
Another counter-measure for spam is Jacques’s version of MT-DSBL which blocks comments and trackbacks from open proxies. This should be especially useful against trackback spam which is a general weak point in the fight against spam.
I have also installed the Real Comment Throttle plugin and set OneHourMaxPings and OneDayMaxPings in mt.cfg as a defense against crap-flooding of comments and trackbacks.
The nofollow plugin seemed useful when it was released. I installed it at first but removed it when I realized that all comment links were getting rel="nofollow", even including comments by me.
Another enhancement to the commenting system is the OpenPGPComment plugin. Installing the Perl module Crypt::OpenPGP, which this plugin requires, was a big hassle. Using the CPAN shell with the LIB variable set to mt/extlib did not work. Actually, I couldn’t install it at all with the CPAN shell. So I downloaded all the required modules and installed them manually. During this process, I found out that Crypt::Random was throwing errors. The patch in this bug thread fixed the problem and the rest was easy.
So how can you sign your comment? Here are some instructions. I use GPGshell with a back-end of GnuPG. I have set the options to word wrap off and UTF-8 character set. With GPGshell, I can clear-sign current window contents (like the comment textarea) as well as clipboard contents. However, GPGshell has Unicode issues, so you can’t clear-sign Urdu text.
A final enhancement is the ability to write itex (a dialect of LaTeX) code in comments. This allows you to write all sorts of equations to explain your argument better. :-) This was made possible by the extensive work done by Jacques Distler to let every commenter switch the comment text filter.
Posted by Zack at February 11, 2005 9:20 AM in Internet
Trackback Pings
TrackBack URL for this entry:
http://www.zackvision.com/mt/zv-trbk.cgi/792
Comments
Posted by: Zack (1823 comments) at February 11, 2005 9:32 AM | PGP Sig
The inverse of a matrix may be written as
Somehow, even the most inane comments seem profound when you throw in an equation or two.
Posted by: Jacques Distler (10 comments) at February 11, 2005 11:32 AM | PGP Sig
OMG what LANGUAGE Is this in? I don’t speak Techie!
Posted by: Leila (20 comments) at February 11, 2005 9:41 PM
Another issue with trackbacks is that Unicode Urdu (or Farsi) text in the trackback excerpt …
The trackback specification is somewhat broken, as it does not provide a way to specify a character encoding. In principle, that means it ought to default to iso8859-1. In practice, you are likely to get some mixture of iso8859-1, windows-1252 and utf-8.
Figuring out how to handle that mess is an outstanding challenge. Welcome to Sam Ruby’s world.
Posted by: Jacques Distler (10 comments) at February 11, 2005 11:27 PM | PGP Sig
Most of this post is Latin to me my dear son, I do make out what you mean though. I have liked the results and am specially interested and happy with the pictures, Urdu and Math. Today I gauged why my knowledge is so obsolete today and recalled:
1. Having worked in MIS department and struggling with computers for seven years when I took retirement in August, 1992, I stopped working on computer software after a year or two of retirement.
2. You passed B. Sc. Engg (Communication and Electronics) in 1993 and would not let me do any trouble shooting or installation of new software. Dad, please get aside. Why you worry (or why you over-work yourself). You should rest. I will do this for you. After you left for USA in 1997, your younger sister and then your younger brother followed your suit.
One thing is there son! I was your first teacher in computer (like pre-nursery teacher) and also it was me who taught you Urdu and English Alpha-bet before you joined Nursery of Sir Syed School. So, I am doubly proud of your successes (masha Allah), firstly because you are my son and second that I was your pre-Nursery teacher). Ha Ha Ha Ha !
Posted by: Ajmal (324 comments) at February 12, 2005 12:24 AM
Leila: I guess you are at a disadvantage here then. :-)
Jacques: Strangely, trackback still does mostly work. The major problem I have been having with it is that MT code truncates the incoming excerpt to 252 characters. This is done in the most absurd way possible as they are actually packing the string into bytes and then truncating to 252 bytes. The result is half (or even quarter) of a Unicode character left dangling, invalidating the page which was pinged.
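Here is a small Python sketch of that failure, just to show the effect (MT itself is Perl, and the helper names are made up for the example). Each Urdu letter takes two bytes in UTF-8, so cutting at a fixed byte count can split a character in half.

```python
def truncate_bytes(text: str, limit: int) -> bytes:
    """Naive truncation: encode first, then cut at a byte offset."""
    return text.encode("utf-8")[:limit]


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def safe_truncate(text: str, limit: int) -> str:
    """Cut at the byte limit, then drop any partial character left at the end."""
    return text.encode("utf-8")[:limit].decode("utf-8", errors="ignore")


# One ASCII letter followed by 200 copies of ARABIC LETTER ALEF (2 bytes each),
# so every Urdu character starts at an odd byte offset.
excerpt = "X" + "\u0627" * 200

if __name__ == "__main__":
    print(is_valid_utf8(truncate_bytes(excerpt, 252)))                # False
    print(is_valid_utf8(safe_truncate(excerpt, 252).encode("utf-8")))  # True
```

The 252-byte cut lands between the two bytes of a character, so the naive result is not valid UTF-8, while the second version gives up one byte of length and stays valid.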
I am looking at the solution Japanese MT users came up with for this. The problem is that all their explanations are in Japanese.
GPGshell has major Unicode issues as well. I am talking to its developer who is trying to fix it.
Posted by: Zack (1823 comments) at February 12, 2005 12:30 AM | PGP Sig
I am looking at the solution Japanese MT users came up with for this. The problem is that all their explanations are in Japanese.
I’ve thought a bit about trackbacks, but haven’t implemented anything. Mostly ‘cuz I keep thinking there must be a better solution, but also ‘cuz I think 6A ought to fix what’s manifestly broken about their current implementation.
The first order of business is figuring out what the character encoding of the ping is. This is not trivial. A possible algorithm is
- If a Charset is declared in the HTTP headers, use that.
- If not, check to see if it’s utf-8 (there’s a regexp for doing so).
- If it’s not utf-8, assume it’s iso8859-1 (or, more realistically, windows-1252).
- Having decided what the encoding of the ping is, truncate to N characters (not bytes).
- transcode the result (using Text::Iconv) to MTBlogCharset.
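The steps above can be sketched in Python as follows. This is only an illustration under a few assumptions, not an implementation for MT: a strict UTF-8 decode stands in for the regexp check, Python’s built-in codecs stand in for Text::Iconv, and the function names and the fallback for a wrongly declared charset are additions of this sketch.

```python
from typing import Optional


def decode_ping(raw: bytes, declared_charset: Optional[str] = None) -> str:
    """Guess the encoding of an incoming ping and decode it to text."""
    # Step 1: trust a charset declared in the HTTP headers, if it works.
    if declared_charset:
        try:
            return raw.decode(declared_charset)
        except (LookupError, UnicodeDecodeError):
            pass  # unknown or wrong label (assumption: fall through and guess)
    # Step 2: check whether the bytes are valid UTF-8.
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Step 3: otherwise assume windows-1252, a superset of most
        # iso8859-1 text seen in practice.
        return raw.decode("windows-1252", errors="replace")


def normalize_excerpt(
    raw: bytes,
    declared_charset: Optional[str] = None,
    max_chars: int = 252,
    publish_charset: str = "utf-8",
) -> bytes:
    """Truncate by characters, then transcode to the publishing charset."""
    text = decode_ping(raw, declared_charset)
    truncated = text[:max_chars]  # Step 4: count characters, not bytes
    # Step 5: transcode; characters the target charset cannot represent
    # become numeric character references.
    return truncated.encode(publish_charset, errors="xmlcharrefreplace")


if __name__ == "__main__":
    print(normalize_excerpt(b"caf\xe9"))  # windows-1252 input, UTF-8 output
```

Because the truncation happens on decoded text, an Urdu excerpt is cut between characters rather than in the middle of one.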
Posted by: Jacques Distler (10 comments) at February 14, 2005 12:37 AM | PGP Sig
Ack! That’s “PublishCharset”, not “BlogCharset”, but you know what I mean.
Posted by: Jacques Distler (10 comments) at February 14, 2005 12:48 AM | PGP Sig
All the above was greek to me. So i wouldnt spend much time trying to understand that. What the hell i am a medi’ anyways. Zack since you and couple of other guys have taken upon themselves to popularize and promote urdu blogging which i think is kinda cool i would like to request if you could point people who use blogger to ways of publishing in urdu on their blogs. thanx man
Posted by: Moiz (62 comments) at February 14, 2005 12:33 PM
Jacques: That sounds like a good idea. I think Phil Ringnalda was doing some part of it. Maybe I’ll get his code and add on to it.
Moiz: That has already been done. See our Urdu blog or Urdu Wiki.
Posted by: Zack (1823 comments) at February 14, 2005 9:37 PM | PGP Sig
Post a comment
Note: Disagreements are welcome, but please keep it civil. Any comments full of hatred, bigotry, trolling or spam will be deleted and the commenter banned. Do read the commenting policy.
Valid XHTML: You have to preview your comment to make sure that it is valid XHTML 1.1. You will see the "Post" button on the preview page.
Urdu: To comment in Urdu, include "p[ur](urdu). " (with a space at the end and without the quotes) at the start of every Urdu paragraph. If you want to write an Urdu word(s) in an English paragraph, do it like this: %[ur](urdu)اردو%. If you want to put an English word(s) in an Urdu paragraph, write it like this: %[en](en)English words%.
PGP Signing: PGP-signed comments are encouraged. However, clearsigning Urdu text with GPGshell produces garbage.
MathML: Select the Textile with itex to MathML text filter. What you'll use is itex, which is a superset of WebTeX and differs somewhat from standard LaTeX.
Text Filters: For regular comments, whether in English or Urdu, keep the text filter setting to its default of Textile 2. Change it to Textile with itex to MathML when writing MathML.

Let’s test math in the comments with the optical flow constraint equation:
It works. Have at it, guys!