Nauroz Mubarak

Happy Persian New Year to Kianoush and my Iranian readers.

سال نو مبارک

Are you guys in the 14th century still?

PS. Kianoush, when are we going for dinner at a Persian restaurant to celebrate the new year? I love Persian food.

I am an American Now

I guess I can claim to be a real American now since I have got that quintessential American trait: monolingualism. Just kidding.

But my language skills are getting worse. Especially in my native language, Urdu. I hadn’t written anything (yes, I mean anything) in Urdu since my college years, now more than a decade in the past. Then, recently I decided to switch this weblog to a bilingual one. Trying to write here in Urdu has been hard. And not just because I was typing in Urdu for the first time in my life. I have found that my Urdu writing skills are no longer adequate. My Urdu in these posts has been stilted and unnatural.

Mind you, I am not one of those yuppie Pakistanis who never learned adequate Urdu. We spoke Urdu at home. While I didn’t read as much Urdu literature as English, I read quite a bit, which is another thing that has changed. I haven’t read any Urdu literature recently (other than the few books, especially poetry, that I brought with me from Pakistan). Even though I have never been a great writer, I used to manage to write decent Urdu essays in high school.

Another thing that has changed is that in recent years most of my friends and acquaintances are not Pakistanis. Hence, the only person I regularly speak Urdu with is Amber. The same is true for her. Even with her, I have noticed that English is creeping into our conversation quite a lot. We seem to switch between English and Urdu all the time while talking to each other.

On the internet, I read mostly the English Pakistani newspapers. I read Dawn regularly and The News on occasion. This is more due to the liberal nature of English journalism in Pakistan as compared to Urdu newspapers than any other factor.

My efforts to be multilingual (i.e., more than two languages) as an adult have been dismal failures. I tried to learn Arabic and Persian after high school and then French a few years ago. Sad to admit, but I am not a language person.

I think I need to work on my Urdu skills more regularly. So expect regular Urdu posts here. Also, I need to find some bookstore which carries good Urdu books.

More about Urdu Blogging

Sorry about another process post, but there were a few issues that I didn’t get to in my last post about Urdu blogging.

Win 98 issues

With a default installation of Windows 98 and Internet Explorer 6.0, you cannot see some of the Urdu characters on this weblog. It seems that Tahoma version 2.3, which is included with Windows, only includes the basic Arabic characters and not all the extra Urdu ones. To solve this problem, I installed the Arabic language support for Internet Explorer from the Windows Update site. This did not fix the character issue, though it seems that it is required to display Arabic script properly in Windows 98. I then downloaded the newer version (3.0) of the font from Umair’s Urdu Blog. This fixed the problem in the entry body of the posts, but not in the entry titles or the sidebar. The reason is probably that the latter two are using Tahoma Bold, which is still the old version.

Since Windows 98 does not seem to support an Urdu keyboard, I downloaded Unipad, a Unicode text editor, and copied and pasted the Urdu text I typed in the editor into Internet Explorer text areas for comments or a new entry. This seems to work well.

CSS issue

While I was struggling with Windows 98, I decided to change my CSS file by adding some more fonts to the font-family property for Urdu text. The purpose was to have at least one font which has all the Urdu characters. Unfortunately, it seems that if the first font in the list (Tahoma in my case) is installed on your machine, the page will display using that even if it does not contain all the characters used. Those characters not present in the font will show up as squares in your browser. The browser will not try to locate those characters in other fonts in the list. This is the behavior in Internet Explorer 6.0 at least, which was disappointing.

Font embedding

I tried to embed the Tahoma font with the website, but in the end decided not to do that when I realized the drawbacks of that approach.

First of all, embedding of truetype fonts works only with Microsoft Windows and Internet Explorer. It doesn’t even work with Internet Explorer on Mac. Secondly, if I embed the whole font, it results in a big file. I can reduce the file size by embedding only the characters used on the website. However, this means that I have to re-analyze the website after every change, using WEFT, and upload the new embedded font file.
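To get a feel for how small such a subset is, the characters a page actually uses can be collected in a few lines. This is only a rough Python sketch of the idea, not what WEFT does internally; the function name `used_characters` is made up for illustration.

```python
def used_characters(text):
    """Return the distinct non-ASCII characters in text, sorted by code point."""
    return sorted({ch for ch in text if ord(ch) > 127})


if __name__ == "__main__":
    # "\u0627\u0631\u062f\u0648" spells the word Urdu in Urdu script.
    sample = "Urdu: \u0627\u0631\u062f\u0648 and again \u0627\u0631\u062f\u0648"
    for ch in used_characters(sample):
        # Print each distinct character once, as a Unicode code point.
        print(f"U+{ord(ch):04X}")
```

Every new post can add characters to this set, which is why the subset font has to be regenerated after each change.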

Blogging Configuration

To elaborate on the essential steps for Urdu blogging in the previous post, here’s what you basically need to do:

  1. Get your computer to type in Urdu either by
    • installing the language pack from your OS, or
    • using a Unicode text editor.
  2. Set the character set for your weblog to UTF-8.
  3. Define some styles for Urdu and English for direction, fonts, etc.

If you blog mainly in Urdu, you might want to set the language and direction for the whole web page to Urdu. This can be accomplished by changing <html> in your blog template(s) to <html lang="ur" dir="rtl">. I think this will result in a scrollbar on the left instead of the default right as well.

Urdu Blogging and Web

ا آ ب پ ت ٹ ث ج چ ح خ د ڈ ذ ر ڑ ز ژ س ش ص ض ط ظ ع غ ف ق ک گ ل م ن ں و ہ ھ ء ی ے

Problems Viewing Urdu Text Above

I am using the Tahoma font for writing Urdu. Unicode is the standard character set for such things nowadays and the Arabic script part of it contains characters for Arabic, Persian, Urdu, etc. Some other fonts provide the Arabic characters which are a subset of Urdu ones, but not all Urdu characters. Tahoma has support for Urdu characters.
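If you are curious which letters fall outside the basic Arabic set, Python's standard unicodedata module can list the code point and Unicode name of each character. This is just a sketch: the function name `describe` is hypothetical, and treating U+0621 to U+064A as "the basic Arabic letters" is my own simplifying assumption, not an official definition.

```python
import unicodedata

# Assumption for this sketch: the basic Arabic letters live in U+0621..U+064A.
BASIC_ARABIC = range(0x0621, 0x064B)


def describe(text):
    """Return (code point, Unicode name, is_extended) for each non-space character."""
    rows = []
    for ch in text:
        if ch.isspace():
            continue
        cp = ord(ch)
        rows.append((f"U+{cp:04X}", unicodedata.name(ch, "UNKNOWN"), cp not in BASIC_ARABIC))
    return rows


if __name__ == "__main__":
    # beh (shared with Arabic), then tteh and yeh barree (extra letters used in Urdu)
    for code, name, extended in describe("\u0628 \u0679 \u06d2"):
        print(code, name, "(extended)" if extended else "")
```

A font that only covers the basic range will show squares for the letters flagged as extended.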

I have checked both Windows 2000 and XP, and the Tahoma versions (2.80 and 3.00 respectively) installed with these two operating systems support Urdu properly. Windows 98 by default does not. However, installing a newer version of the font might help you view Urdu pages.

Older web browsers won’t display Urdu correctly. You should use a recent browser, like Internet Explorer 5.5 or later, Mozilla 1.5 or later, Netscape Navigator 6.0 or later, Opera 6 or later. Alan Wood has a detailed list and description of different browsers’ support for Unicode. He also has specific information about Arabic support.

Mac OS 9.2 and Mac OS X also support the Arabic script. I am not sure about Unix/Linux.

This website has a list of operating systems, browsers and fonts that support the Arabic script.

If you have more information about viewing or creating Urdu web pages on Macintosh or Unix, please let me know. Also, let me know if you would like me to include other Mac- or Unix-specific fonts in my style class for Urdu and how my weblog’s Urdu posts look in other browsers and operating systems.

I have set up my weblog so that if your computer does not have a Tahoma font, then it is provided from my website automatically. This should work on both Windows machines and Macs. I am not sure about Unix. However, if you have a really old version of Tahoma which does not contain all Urdu characters, then my newer version of the font is unfortunately not downloaded (I am actually not sure about this.) I am using Microsoft WEFT 3 to embed the fonts.

If the font is downloaded, it is installed only temporarily to view my weblog. This is due to the licensing provided with the Tahoma font. I can only do “editable embedding” which means temporary installation. “Installable embedding” which would allow a font to be permanently installed is not allowed.

Umair has a font for download as well as instructions on how to install it at the bottom of his Urdu weblog. That should help all Windows users.

UPDATE: Asif has an installable package of Urdu fonts for Windows.

How to Type in Urdu

I’ll only describe the Windows options here since they are the only ones I am familiar with. For other systems, please take a look at the links at the end of the post.

You have two options: You can either install full Urdu/Arabic support or use an Urdu Unicode editor.

Urdu support based on Unicode is available only in Windows 2000 and XP. The Microsoft website has the instructions on how to install Urdu language support and keyboard. Once you have installed it, you can simply switch language to Urdu from the taskbar and start typing in Urdu.

If you are like me, you might not like the Urdu keyboard layout. Shehzad has designed a phonetic Urdu keyboard. This maps Urdu characters on your keyboard such that phonetically similar Urdu and English characters are mapped to the same key. This is good for us who are used to typing in English. I recommend that you install that as well. The installation instructions are on his page.

UPDATE: I like this phonetic Urdu keyboard layout better.

Another thing you might need to get used to an Urdu keyboard is the on-screen keyboard. This is available from the accessibility options of Windows. You can either click on the keys of the on-screen keyboard or just use it as a guide while you type.

If you are not using Windows 2000 or XP, or you don’t want to install the whole Arabic/Urdu support stuff, you can download a Unicode editor. A good one is Unipad. The free version allows you to type up to 1000 characters. For longer text, you have to buy it. It comes with a built-in Urdu keyboard and font. You can display the Unipad keyboard on-screen as well. Unipad does not require that you have the Urdu/Arabic language support installed on your machine. Once you have typed your text in Urdu, you can copy and paste it into other applications. If you are planning to put the text online and are worried that Urdu support might not be set up properly, you can select the Urdu text in Unipad and convert it to XML/HTML entities. All the Urdu characters will change to &#1651; or some other number. This is useful sometimes, though it makes editing harder. Unipad comes through, however, since you can convert the entities back to the characters as well.
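The same kind of conversion can be done outside Unipad too. Here is a minimal Python sketch of it, assuming numeric character references in decimal form; the function names `to_entities` and `from_entities` are my own, and this is not how Unipad itself is implemented.

```python
import html


def to_entities(text):
    """Replace every non-ASCII character with a decimal numeric character reference."""
    return "".join(ch if ord(ch) < 128 else f"&#{ord(ch)};" for ch in text)


def from_entities(text):
    """Turn character references back into the characters they stand for."""
    return html.unescape(text)


if __name__ == "__main__":
    word = "\u0627\u0631\u062f\u0648"  # the word Urdu in Urdu script
    encoded = to_entities(word)
    print(encoded)
    print(from_entities(encoded) == word)
```

The encoded form survives any ASCII-only system, at the cost of being unreadable while you edit.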

How to Type Urdu Comments

Now, let’s talk about how you can type Urdu comments on this weblog. You can follow one of the two methods from the last section.

There is, however, the little matter of correct alignment of the text since Urdu is written from right to left while English is written left to right. To get that correct, you should do the following:

  • At the start of every Urdu paragraph, type in English: p[ur](urdu). followed by a space.
  • If you have an English word within the Urdu paragraph, surround it with some code like this: %[en-US](en)Word%.
  • If you have an Urdu word in an English paragraph, it needs something similar: %[ur](urdu)لفظ%.

Urdu Blogging

Let’s now talk about what I had to do with my blogging software to get Urdu blogging going properly. This is obviously Movable Type specific, but the general principles apply to other blogging tools as well. I might later add stuff about Blogger. If you are blogging in Urdu, please write up something about how to set up your blogging tool for that and let me know.

First of all, you need to make a few changes in mt.cfg. Find the line about PublishCharset. Uncomment it (by removing the # at the start of the line) and change it to PublishCharset utf-8. This will change the character encoding of your weblog from ISO-8859-1 (Latin 1) to the Unicode one. Also, uncomment the NoHTMLEntities line and set it to NoHTMLEntities 1. This will leave your Urdu Unicode characters as they are instead of changing them over to HTML entities like &#1649; etc.
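To see why the charset setting matters, here is a small Python sketch of what happens when UTF-8 bytes are read as ISO-8859-1. The function name `misread_as_latin1` is hypothetical; it only imitates what a browser does when it is told the wrong charset.

```python
def misread_as_latin1(text):
    """Encode text as UTF-8, then decode those bytes as ISO-8859-1 (the wrong charset)."""
    return text.encode("utf-8").decode("iso-8859-1")


if __name__ == "__main__":
    word = "\u0627\u0631\u062f\u0648"  # the word Urdu in Urdu script, 4 characters
    garbled = misread_as_latin1(word)
    # Each Urdu letter takes two bytes in UTF-8, so the misread text is twice as long.
    print(len(word), len(garbled))
    # Reversing the mistake recovers the original text.
    print(garbled.encode("iso-8859-1").decode("utf-8") == word)
```

The bytes themselves are fine; the gibberish comes purely from decoding them with the wrong character set.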

There is one more change that I made in mt.cfg. This one was for comments. Since I am allowing comments in Urdu, I have to allow more HTML tags than the default to take care of the alignment. I’ll explain the tags later, but here’s my sanitize line in mt.cfg:

GlobalSanitizeSpec a href,b,br/,p,strong,em,ul,li,blockquote,p class lang,span class lang,i

If your web server is running Apache, you should also add the following in the .htaccess file in the top weblog directory (if the file doesn’t exist, create a new one):

AddType 'text/html; charset=UTF-8' html

This tells the server that all your files with .html extension should be served as of type text/html and with the character set of UTF-8.
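If you want to check what charset a Content-Type header value actually declares, Python's standard library can parse it. This is a rough sketch that works on the header string only (you would copy the value from your browser or server tools); the function name `declared_charset` is my own.

```python
from email.message import Message


def declared_charset(content_type):
    """Return the charset named in a Content-Type header value, lowercased, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


if __name__ == "__main__":
    print(declared_charset("text/html; charset=UTF-8"))
    print(declared_charset("text/html"))
```

A result of None means the server is not declaring a charset at all, so the browser has to guess.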

The next problem was that the Movable Type interface was using fonts that did not display all Urdu characters properly. So I was seeing a lot of squares while typing my Urdu entries. To fix that, we need to make changes to the Movable Type interface style file. This is styles.css in your Movable Type directory. I am providing my styles.css file with the necessary changes.

The changes basically are to add “tahoma” as the first font in “font-family” for the following classes/styles: a.list, input, textarea, textarea.width500, textarea.wide, textarea.short310, and textarea.short340.

The last thing that needed to be done was to define the alignment for Urdu text in a style class for the weblog. Here is my CSS file. I am using the following two classes:

.urdu {
  font-family: tahoma, "Arial Unicode MS", arial, georgia, verdana, sans-serif;
  font-size: medium;
  text-align: right;
  direction: rtl;
  unicode-bidi: embed;
}
.en {
  text-align: left;
  direction: ltr;
  unicode-bidi: embed;
  font-family: georgia, verdana, arial, sans-serif;
}

Whenever I have an Urdu paragraph, I use <p class="urdu" lang="ur">...</p> around it. I don’t use anything around the English paragraphs.

When I have a few words of Urdu in an English paragraph, I use <span class="urdu" lang="ur">...</span> around the Urdu words. Similarly, when I have some English words in an Urdu paragraph, I put <span class="en" lang="en-US">...</span> around them.

Actually, since I use the MT-Textile plugin, I use the simpler Textile codes as I showed in the section about commenting.
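If you are not using Textile and generate your HTML some other way, the Urdu wrappers are easy to produce in code. Here is a minimal Python sketch; the helper names `urdu_paragraph` and `urdu_inline` are hypothetical and have nothing to do with Movable Type or MT-Textile.

```python
from html import escape


def urdu_paragraph(text):
    """Wrap a whole Urdu paragraph using the urdu class and the ur language code."""
    return f'<p class="urdu" lang="ur">{escape(text)}</p>'


def urdu_inline(text):
    """Wrap a few Urdu words that sit inside an English paragraph."""
    return f'<span class="urdu" lang="ur">{escape(text)}</span>'


if __name__ == "__main__":
    word = "\u0627\u0631\u062f\u0648"  # the word Urdu in Urdu script
    print(urdu_paragraph(word))
    print("<p>This weblog is also in " + urdu_inline(word) + ".</p>")
```

escape() only touches &, < and > and quotes, so the Urdu characters themselves pass through unchanged.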

Resources

  1. Alan Wood’s Unicode Resources
  2. Unicode
  3. Browsers and Fonts that work for Arabic
  4. Shehzad’s website about Urdu websites creation
  5. Mac OS 9 Language Pack Installation
  6. Alan Wood’s List of Arabic Unicode Fonts
  7. Urdu Support for Windows
  8. Justifying Arabic Text
  9. HTML lang and dir attributes
  10. A better way for language-based styles
  11. Authoring HTML for Middle Eastern Content
  12. U-Trans: A program for converting ArabTex transliteration code into Unicode plus an Urdu Nastalique font
  13. Unipad: A Unicode text editor
  14. Urdu Phonetic Keyboard Layout for Unipad
  15. Syed Rizwan Rizvi’s Urdu-related page
  16. Hugo’s Urdu Page
  17. Urdu Alphabet
  18. Yahoo! Group for Urdu Computing

UPDATE: More here and here.

اردو بلاگ ویب رنگ

I have created a webring for Urdu bloggers. The purpose is to collect in one place a list of all blogs which are written in Urdu (fully, or partially like this one). Here is the home page for this webring. There is also a link to the webring near the bottom of the right sidebar, through which you can reach other Urdu weblogs.

Currently, there are three blogs on the ring, including Tuzk-e-Jalali by Jalal who has another blog in English and Urdu Blog by Umair Salam.

If you have a weblog where you post in Urdu, even if only occasionally, please join the webring using the form on the Urdu blogs webring home page. Also, please spread the word about this webring.

میں نے اردو بلاگز کے لۓ ایک ویب رنگ بنایا ہے۔ اس کا مقصد ان تمام بلاگز کی فہرست جمع کرنا ہے جو اردو میں لکھے جاتے ہیں- میرے بلاگ کے دائیں طرف نیچے بھی اس ویب رنگ کا لنک ہے جہاں سے آپ دوسرے اردو بلاگ جا سکتے ہیں-

ابھی اس رنگ میں تین بلاگ ہیں- تزک جلالی جلال کا بلاگ ہے- وہ انگریزی میں بھی لکھتا ہے- عمیر سلام انٹرنیٹ کا پہلا اردو بلاگر ہے-

اگر آپ اپنے بلاگ پر کبھی بھی اردو میں لکھتے ہیں تو اس ویب رنگ میں ضرور شامل ہوں- میں آپ کا مشکور ہوں گا اگر آپ اس اردو بلاگز کی فہرست کا ذکر اور لوگوں سے بھی کریں-

No Saudi Visit for Us

Via Brian’s Study Breaks, it seems that the Saudis are doing what they do best.

Saudi Arabia is barring visits by Jews after launching a new visa scheme to try to attract more tourists. The Saudi tourism department website said tourist visas would not be issued to Israeli passport holders or Jews.

Earlier this month, it began a drive to attract more foreign visitors by issuing visas to non-Muslim tourists for the first time. It has traditionally only issued visas for work purposes, officially-approved visits and pilgrimage to Mecca.

There has not previously been an explicit ban on Jews travelling to Saudi Arabia, though people with Israeli passports or with Israeli stamps in their documents, have not been allowed in.

A page on the Supreme Commission for Tourism website originally said visas would not be issued to Israeli passport holders or those with a passport containing an Israeli stamp; “those who don’t abide by the Saudi traditions concerning appearance and behaviours”; “those under the influence of alcohol”; or “Jewish people”.

The page was later amended, removing details of the restrictions.

That is clear and open anti-semitism and par for the course for the Saudis.

I have dropped Saudi Arabia from the list of countries we would like to visit one day. Granted it was at number 80 or so on that list. Having a restriction against Jews is unconscionable.

The Saudi Commission for Tourism website had this cryptic explanation:

The Kingdom of Saudi Arabia’s visa regulations are available at the Kingdom’s Consulates. When erroneous information was noticed on SCT’s website, it was removed. SCT regrets any inconvenience this may have caused.

برائن سے مجھے پتہ چلا کہ سعودی حکومت اپنے کرتوتوں سے باز نہیں آ سکتی۔ بی بی سی کے مطابق سعودی عرب اب غیرمسلموں کو بھی سیر و سیاحت کے لئے ویزا دے گا- مگر اس اجازت سے کچھ لوگ مستثنی ہوں گے: اسرائیلی پاسپورٹ کے حامل اشخاص، جن کے پاسپورٹ پر اسرائیلی ویزا کی مہر ہے، وہ لوگ جو لباس وغیرہ میں سعودی طور طریقوں کا لحاظ نہیں کر سکتے، جو شراب کے نشے میں ہوں یا یہودی لوگ۔

یہ فہرست سعودی کمیشن براۓ سیاحت کے ویب سائٹ پر موجود تھی- مگر اب وہاں نہیں ہے۔

یہ سعودی حکومت کی یہودیوں سے نفرت کا ایک اور نمونہ ہے۔

یہ خبر سن کر میں نے فیصلہ کیا ہے کہ میں سعودی عرب کو ان ملکوں کی فہرست سے نکال رہا ہوں جہاں ہم کسی دن جانے کا ارادہ رکھتے ہیں۔ اس سے پہلے بھی سعودی عرب اس فہرست میں بہت پیچھے تھا۔

اگر کبھی میری یاد آۓ

This poem is for Amber who is always asking me to recite it for her. It’s by the Pakistani poet Amjad Islam Amjad. It has also been sung by Abrar ul Haq. (Warning: I can’t figure out if the songs on muziq.net are legal or not.)

Sorry I can’t translate poetry into English.

یہ نظم عنبر کے لئے ہے کیونکہ وہ مجھ سے یہ نظم سننا پسند کرتی ہے۔ اسے آپ ابرار الحق کی آواز میں سن سکتے ہیں۔ مجھے معلوم نہیں کہ muziq.net پر گانے قانون کے مطابق ہیں یا نہیں۔

اگر کبھی میری یاد آۓ
تو چاند راتوں کی دلگیر روشنی میں
کسی ستارے کو دیکھ لینا
اگر وہ نخل فلک سے اڑ کر تمہارے قدموں میں آ گرے تو یہ جان لینا
وہ استعارہ تھا میرے دل کا
اگر نہ آۓ؟ ۔۔۔
مگر یہ ممکن ہی کس طرح ہے کہ کسی پر نگاہ ڈالو
تو اس کی دیوار جاں نہ ٹوٹے
وہ اپنی ہستی نہ بھول جاۓ!!
اگر کبھی میری یاد آۓ
گریز کرتی ہوا کی لہروں پہ ہاتھ رکھنا
میں خشبووں میں تمہیں ملوں گا
مجھے گلابوں کی پتیوں میں تلاش کرنا
میں اوس قطرہ کے آئینے میں تمہیں ملوں گا
اگر ستاروں میں، اوس خشبووں میں
نہ پاؤ مجھ کو
تو اپنے قدموں میں دیکھ لینا
میں گرد ہوتی مسافتوں میں تمہیں ملوں گا
کہیں پہ روشن چراغ دیکھو تو جان لینا
کہ ہر پتنگے کے ساتھ میں بھی سلگ چکا ہوں
تم اپنے ہاتھوں سے ان پتنگوں کی خاک دریا میں ڈال دینا
میں خاک بن کر سمندر میں سفر کروں گا
کسی نہ دیکھے ہوۓ جزیرے پہ رک کے تمہیں صدائیں دوں گا
سمندروں کے سفر پہ نکلو
تو اس جزیرے پہ کبھی اترنا!!

Urdu/اردو

اردو زبان میں بلاگ

This is just a test to see if I can make this weblog bilingual with entries in both English and Urdu (اردو). Don’t worry, this weblog will still be mostly in English. Urdu is my first language, but this is my first time typing in Urdu.

There are quite a few things I need help with.

  • First of all, if you notice any problems like gibberish or misaligned text etc. please let me know.
    • UPDATE: RSS feeds have misaligned text because of the right-to-left issue. I have no idea how to fix that. Is there anyone who knows more about RSS who can help?
  • If anyone knows a better way to type Urdu in Unicode than trying each key on the keyboard to find the appropriate character, please let me know. The OS I am using is Windows XP.
    • UPDATE: This is still my biggest problem.
  • When I publish my entry, the Urdu characters change to Unicode HTML entities in the edit screen. That makes editing Urdu posts very difficult. Is there a solution to that? Please note that I am using Textile 2 formatting and have set the character set for my weblog to be UTF-8.
    • UPDATE: Fixed by setting NoHTMLEntities to 1 in mt.cfg. Will that cause any other problems? Ampersand, smart quotes, em dashes etc.?
  • Right now, I am using <p style="text-align: right"> to align the Urdu text to the right (since it’s written from right to left). Is there a better way? How would I put that in CSS? Also, what if I want an Urdu word in the middle of a paragraph in English or vice versa?
    • UPDATE: I have created an urdu class in my CSS and use p[ur](urdu). using MT Textile. I still do want to make it simpler. Also, I am using %[ur]Some Urdu words% for Urdu text within an English paragraph. I need to create similar stuff for English as well.
  • Also, which fonts can I use for Urdu? Which ones are better looking? Which ones are more likely to be installed on my readers’ machines?
  • See how the commenter names in Urdu are misaligned with the numbers and the brackets. Is there any solution to that?
    • UPDATE: Fixed. It was being caused by the text direction issue. The brackets had an ambiguous direction. They could be left to right if in English text or right to left if in Urdu text. Since the brackets came at the boundary of the change in language, they were somehow being interpreted as in right to left direction. I fixed it by putting <span dir="ltr"></span> around them in the template.
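The "ambiguous direction" of the brackets can be seen in Unicode's own character data. Here is a small Python sketch using the standard unicodedata module; the function name `bidi_classes` is made up for illustration. Letters carry a strong direction (L for Latin, AL for Arabic-script letters), while brackets are neutral (ON) and take their direction from the surrounding text.

```python
import unicodedata


def bidi_classes(text):
    """Return (character, bidirectional class) pairs for every character in text."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in text]


if __name__ == "__main__":
    # A bracket, a digit, a Latin letter and the Urdu letter alif.
    for ch, cls in bidi_classes("(1a\u0627)"):
        print(f"U+{ord(ch):04X}", cls)
```

Because the brackets have no direction of their own, wrapping them in a span with an explicit dir attribute settles the question for the browser.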

Umair, you are the pioneer in Urdu weblogs. I need your help here.