Thematically Constipated

Friday, July 30th, 2004

I have had a few things on my mind this week, but am having trouble writing them down; I’d like to log a decent entry but haven’t found the words yet. So in the meantime, I dug up a few cool images which maybe some people haven’t seen before.

The one to the right is called the Checker Shadow Illusion, and should be of interest to people interested in computer graphics and the perceptual blah-blah that goes with it. Believe it or not, the two squares marked here with black dots are exactly the same shade of grey. Go on, open it in Photoshop and check for yourself if you don’t believe me…

Nonvisual Interludicule

O Draconian Devil! Oh Lame Saint! Oh Bollocks! Seems like absolutely everyone is reading Dan Brown. Dan Bloody Brown. Last time I checked he had three novels on the [Australian] top ten best-seller list, with Da Vinci Code holding at number one! Please read my previous indictment of his writerly skill, then nod haplessly in agreement, since chances are it’s too late and you too have already bought the damn thing.

Back to the Show

And from the subtle to the truly eye-gouging, how about this great animated image which seriously writhes and squirms before your very eyes. I’ve modified the original slightly, so as to fit in this space, but I think it still works pretty well. [thanks Coco, original source unknown]

B-B-Blah…

Tuesday, July 27th, 2004

I don’t feel like writing today, I feel like scribbling, but my stupid Wacom tablet is on the fritz so I can’t draw neither. I was glad they came down in price, but maybe that’s because they aren’t made so good. Me talk better soon.

News in Brief:

  • Bloglines subscribers reach 13!
  • I, Robot turns out to be not bad!
  • But not great by any stretch!
  • I’m tired!
  • And boring!

Dry Spell

Sunday, July 25th, 2004

It’s been a while since I went more than a week without blogging. Please rest assured that I am still alive and in fact reasonably healthy.

Among other things, I’ve been doing a bit of paid work this last week. On a Mac no less. Like, as in, not Windows. Only I wasn’t actually in the same room as the Mac, since it was in New Zealand. It was running CodeWarrior, a C++ development environment, and I was controlling it via VNC [Virtual Network Computing], which was running on a Windows XP machine here in Australia.

So I was programming in C++, with CodeWarrior on a Mac in New Zealand from Australia over a VPN via VNC running on a laptop under Windows XP. Neat, eh?

The Mysterious Dancing Sausage

Wednesday, July 14th, 2004

On Monday morning I decided to cook myself some breakfast, specifically sausages and eggs. Whilst frying the sausages however, I got a surprise, when the unusual thing that I’ve already referred to in the title started happening… so by now you are probably less surprised than if I had built up to this fact without giving it away.

Catchy Title or Surprising Twist, sometimes you can’t have both.

Anyway, you might be wondering just what the hell I mean by "dancing", and yes, to say the sausage was dancing is probably a bit of a stretch. What it was doing was more "playfully rolling" back and forth without me even touching the pan. It was doing this a lot, and even after I manually stopped it, it would go right back to doing laps.

Are you getting spooked yet? I know I was. Perhaps this was some sort of physical manifestation of a restless spirit, attempting to communicate some important piece of information to me so that they could finally be at peace. Via sausage.

Or perhaps the pan was simply hot enough that the sausage was being constantly lifted by a cushion of its own steam, with the net effect that it was always "rolling downhill" whichever way it was going, always falling forward from its cushion of steam.

The moral of this rather anti-climactic little tale is this:

Use Your Brain.

There are too many people who would stop before the "Or perhaps…" part of this story, convincing themselves that they were witness not only to the paranormal, but to a paranormal event of great personal significance to them.

I cite the case of a sad female guest on James Van Praagh’s show, who told a story so dismally stupid that even Van Praagh had to look at his feet and change the subject. She told of how twice she had seen the toilet paper completely unroll itself from the dispenser onto the floor of her bathroom, all by itself, and that when this happened she would look out into her backyard, and sure enough, she would see a leaf wave at her. And she knew that this was all her dear departed one saying hello to her. I have a mental picture of her standing there misty eyed, doing absolutely nothing as an entire roll of toilet paper piles up on her floor.

When I’m dead, I think I’ll come back and communicate with my loved ones by knocking their toothbrushes into the toilet; I can think of no better way to tell them that I miss them but that things are going fine on The Other Side.

On a related note, beware of people who make it a point of pride to never absorb any technical or scientific information, and yet can suddenly grant themselves the authority to say that something is “scientifically impossible” in order to justify their lame-ass paranormal experience.

Subscription Drive!

Sunday, July 11th, 2004

Bloglines, my favourite RSS aggregator, has updated their interface with obligatory shaded buttons etcetera, which is cool, it was bound to happen sometime. But they have also added a subscribers count to the header for each blog, so that everyone can see how many other [Bloglines] readers there are for any feed.

[ Bloglines is a simple web based aggregator, which means no software will be installed on your machine, and you can read your subscribed blogs from anywhere. It’s pretty neat. Click here to see a sample of my personalized Bloglines feed covering the last 12 hours. ]

Whilst it might seem a little nonsensical for me to feel uncomfortable about this minor addition — “Oh no! All six of my Bloglines readers will realize that there’s only six of them!” — as an author it still bugs me that this number appears on the screen like that… It’s like having to wear a hat that tells people how many friends you have!

So, if you are not already a Bloglines user, I hereby highly recommend the service for its convenience and… errmmm… implore you to please use it to subscribe to JujuBlog intepid to push that tiny number into double digits at least…

Together we can make it to eleven!

[If you are already reading this through Bloglines, then thank you, One-of-Six, and please don’t hesitate to recommend Bloglines and JujuBlog to anyone else you think might be even vaguely interested…]

An Introduction to UTF-8

Saturday, July 10th, 2004

I think I’ve finally got UTF-8 support across the board with my HTML Editor, the custom software that I use to maintain this website. This is an improvement on my lame-ass "paletted text" approach [see previous entry] which worked fine except for when I tried to run clean-up functions on the code and everything unusual got stripped out. I noticed my RSS feed was being exported with all non-standard ASCII characters removed, and decided it was about time I implemented at least minimal UTF-8 support.

UTF-8 and Unicode

UTF-8 is a method for encoding Unicode text [UTF-8 stands for Unicode Transformation Format, 8-bit encoding], so to understand UTF-8, you first need to know what Unicode is.

Unicode is fast becoming the standard for encoding characters from all written languages. In a nutshell it is simply an agreement by all parties that any given character from any given language will always be represented by the same numeric value. From the Unicode web site:

Unicode provides a unique number for every character,
no matter what the platform,
no matter what the program,
no matter what the language.

For example, the Greek character α [alpha] is always represented by the numeric value 945 [hexadecimal: 3B1, Unicode notation: U+03B1].
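To see that mapping in action, here is a tiny sketch in Python [chosen purely for illustration], using the built-in ord and chr functions to go from character to number and back:

```python
# Look up the Unicode number for the Greek letter alpha.
alpha = "\u03b1"                     # the character α, written by code point

code_point = ord(alpha)              # character -> number
notation = f"U+{code_point:04X}"     # the usual Unicode notation

print(code_point)                    # 945
print(notation)                      # U+03B1
print(chr(945) == alpha)             # True: number -> character
```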

The problem with Unicode is that there are a rather large number of characters which can be represented, so it’s hard to know how many bytes should be used to store a character digitally. Growing up with ASCII [American Standard Code for Information Interchange ] we programmers have become pretty comfortable with the convenience of using a single byte per character, but with only 256 possible combinations, this is woefully inadequate for the richness that is Unicode. Two bytes seemed sufficient a few years back — allowing over 65000 characters — but Unicode has already outgrown that limit. Three bytes would be plenty — with more than 16 million possible values — but we programmers hate numbers which aren’t powers of two! So the next viable option is four bytes per character; more than four billion possible values, more than we’ll ever need, and in general an enormous waste of space.

This is where UTF-8 comes in. The design philosophy of UTF-8 can be paraphrased: ASCII is compact and ubiquitous, so let’s try to keep it, but let’s also add extensions to support the rest of the scripts of the world.

This is achieved by introducing variable byte lengths for characters, with high bits used to signal "extended" characters. For standard 7-bit ASCII characters [U+0000 to U+007F] ASCII and UTF-8 are identical; eg: "cat" translates to the same three bytes in both ASCII and UTF-8.

Character Range (hex)   Unicode (UCS-2/UTF-16)   UTF-8
0-7F                    00000000 0xxxxxxx        0xxxxxxx
80-7FF                  00000xxx xxxxxxxx        110xxxxx 10xxxxxx
800-FFFF                xxxxxxxx xxxxxxxx        1110xxxx 10xxxxxx 10xxxxxx
10000-1FFFFF            - out of range -         11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
200000-3FFFFFF          - out of range -         111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
4000000-7FFFFFFF        - out of range -         1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
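The table above can be turned into code fairly directly. What follows is a minimal sketch in Python, not a production encoder: the function name encode_utf8 is made up for this example, and it follows the table literally [all six ranges, with no special handling of surrogates or any other reserved values].

```python
def encode_utf8(code_point: int) -> bytes:
    """Encode one code point using the bit layouts from the table (illustrative only)."""
    if code_point < 0 or code_point > 0x7FFFFFFF:
        raise ValueError("code point out of range")
    if code_point <= 0x7F:
        return bytes([code_point])               # 0xxxxxxx: identical to ASCII

    # (upper limit of the range, total bytes, marker bits of the lead byte)
    layouts = [
        (0x7FF, 2, 0xC0),        # 110xxxxx
        (0xFFFF, 3, 0xE0),       # 1110xxxx
        (0x1FFFFF, 4, 0xF0),     # 11110xxx
        (0x3FFFFFF, 5, 0xF8),    # 111110xx
        (0x7FFFFFFF, 6, 0xFC),   # 1111110x
    ]
    for limit, length, marker in layouts:
        if code_point <= limit:
            break

    tail = []
    for _ in range(length - 1):
        tail.append(0x80 | (code_point & 0x3F))  # 10xxxxxx carries the low six bits
        code_point >>= 6
    lead = marker | code_point                   # whatever is left fits in the lead byte
    return bytes([lead] + tail[::-1])


print(encode_utf8(ord("c")).hex())   # 63
print(encode_utf8(0x03B1).hex())     # ceb1
```

For ordinary characters the result should agree with Python’s own "…".encode("utf-8"), which is a handy way to cross-check the table.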

Note that all bytes of multi-byte UTF-8 characters have the high-bit set to one, and only the first byte of a multi-byte character has both its highest bits set. This means there can never be confusion about where a character starts. So in UTF-8, the combined Greek and Latin sequence aβcδe is represented by the following seven bytes, and looking at the high bits you can pick out the extended characters without too much trouble:

01100001 11001110 10110010 01100011 11001110 10110100 01100101
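The seven bytes above can be reproduced, and sorted into plain ASCII, lead and continuation bytes, with a few lines of Python. This is only a sketch: the helper name classify is invented for this example, and it looks at nothing but the top two bits of each byte.

```python
text = "a\u03b2c\u03b4e"             # aβcδe
data = text.encode("utf-8")


def classify(byte: int) -> str:
    """Label a byte by its high bits (hypothetical helper for this example)."""
    if byte & 0x80 == 0:
        return "ascii"               # 0xxxxxxx
    if byte & 0xC0 == 0xC0:
        return "lead"                # 11xxxxxx starts a multi-byte character
    return "continuation"            # 10xxxxxx


bits = [f"{b:08b}" for b in data]
kinds = [classify(b) for b in data]

for pattern, kind in zip(bits, kinds):
    print(pattern, kind)
```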

Now the really clever bit about UTF-8 is that it is capable of passing unharmed through ASCII only systems [programs which don’t even recognize UTF-8], thanks to the fact that each character beyond U+007F looks like a valid sequence of extended ASCII when read as a byte-per-character. This is in stark contrast to other Unicode encodings such as UCS-2, which are full of zero bytes and therefore wreak havoc with ASCII processing systems. To an ASCII system, the UTF-8 representation of aβcδe parses as aÎ²cÎ´e [if the extended characters are read as Latin-1]. On the surface this may seem like a corruption, but the important thing to note is that no illegal ASCII characters appear in a UTF-8 bytestream, and so the same string can be read and written out again as raw ASCII and then decoded later as the original UTF-8. With the exception of 7-bit text systems [a legacy email standard, unfortunately, for which the hideous UTF-7 had to be invented] UTF-8 should be able to pass through ASCII systems unscathed.
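Here is a rough sketch of that round trip in Python. The big assumption is that Latin-1 stands in for the "extended ASCII" system, since Python’s latin-1 codec maps every possible byte to exactly one character; a system that strips or rewrites high bytes would still do damage.

```python
original = "a\u03b2c\u03b4e"                      # aβcδe

utf8_bytes = original.encode("utf-8")

# A byte-per-character program reads the bytes as Latin-1 text...
seen_by_legacy = utf8_bytes.decode("latin-1")     # looks like 'aÎ²cÎ´e'

# ...and writes them back out without ever understanding them.
written_back = seen_by_legacy.encode("latin-1")

# Decoded later as UTF-8, the original text is intact.
restored = written_back.decode("utf-8")

print(seen_by_legacy)
print(restored == original)
```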

Cons

Although UTF-8 is an incredibly useful and largely backwards compatible method of encoding the ever growing Unicode character set, there are a few things to watch out for:

  1. Not all byte sequences can be interpreted as valid UTF-8. Some might think this a good thing — like built-in data verification — but I find it simply annoying because it means you have to do error checking as you read a UTF-8 string to make sure it conforms to the rules, and if it doesn’t then you have to decide on an appropriate response to this error. For example [C1, 34] is an invalid UTF-8 sequence because it has a lead byte which implies a two byte character, and yet the following byte does not have its high bit set. I for one don’t want to reject text files where such invalid codes appear, and yet there is no single approach to dealing with them.
  2. Conversely, there are multiple ways of encoding the same value in UTF-8, such that a naive parser will not notice the difference. This causes security risks, because a character can "sneak through" disguised as a higher value. Especially dangerous here are NULLs, slashes, ampersands, percent signs and other symbols commonly given special treatment in software. The ampersand for instance should always be encoded with the single byte [26], but could easily be encoded as [C0,A6] or even [E0, 80, A6] as an attempt to slip by dodgy parsers [like mine for example]. Technically, such overlong sequences are illegal, but again the onus is on the software to check for them.
  3. UTF-8 is slower to process, by virtue of requiring any processing at all. This is unlike ASCII, which can be read directly into memory byte for byte. To find the fifth character of an ASCII string one simply reads the fifth byte, whereas UTF-8 requires every character up to the fifth to be read in order to establish how many bytes they occupy. This can be a pain.
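To make point 3 concrete, here is a small sketch in Python of what "finding the fifth character" involves. The function name byte_offset_of_char is invented for this example, and it assumes the input is already valid UTF-8.

```python
def byte_offset_of_char(data: bytes, index: int) -> int:
    """Return the byte offset of character number `index` (0-based) in valid UTF-8."""
    seen = -1
    for offset, byte in enumerate(data):
        if byte & 0xC0 != 0x80:      # anything but 10xxxxxx starts a new character
            seen += 1
            if seen == index:
                return offset
    raise IndexError("fewer characters than requested")


ascii_data = b"abcde"
utf8_data = "a\u03b2c\u03b4e".encode("utf-8")     # aβcδe

print(byte_offset_of_char(ascii_data, 4))   # 4: in ASCII the fifth character is the fifth byte
print(byte_offset_of_char(utf8_data, 4))    # 6: in UTF-8 we had to walk past β and δ
```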

Points 1 and 2 are the biggest drawbacks for me. If UTF-8’s invalid and ambiguous byte sequences could be collapsed, I think it would be a brilliant encoding scheme. Sadly, point 2 could have been easily avoided with only a minor modification to the specification. Point 1 is trickier though.
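Both problems are easy to demonstrate. The sketch below, in Python, uses a deliberately naive two-byte decoder [naive_decode_2byte, a made-up name, and not how anyone should write one] next to Python’s built-in strict decoder, which rejects both malformed and overlong sequences. Replacing bad bytes with U+FFFD is shown as one possible response to invalid input, not the only one.

```python
def naive_decode_2byte(lead: int, cont: int) -> int:
    """Decode a two-byte sequence WITHOUT checking for overlong forms (unsafe)."""
    return ((lead & 0x1F) << 6) | (cont & 0x3F)


def is_strictly_valid(data: bytes) -> bool:
    """True if Python's strict UTF-8 decoder accepts the bytes."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Point 2: the overlong form [C0, A6] sneaks an ampersand past the naive decoder.
sneaky = chr(naive_decode_2byte(0xC0, 0xA6))
print(sneaky)                                        # &

# A strict decoder refuses the invalid and overlong sequences from the list above.
print(is_strictly_valid(bytes([0xC1, 0x34])))        # False
print(is_strictly_valid(bytes([0xC0, 0xA6])))        # False
print(is_strictly_valid(bytes([0xE0, 0x80, 0xA6])))  # False
print(is_strictly_valid(b"&"))                       # True

# Point 1: one lenient response is to swap each bad byte for U+FFFD and carry on.
lenient = bytes([0xC1, 0x34]).decode("utf-8", errors="replace")
```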

Is it possible to devise a similar variable byte length encoding scheme where every conceivable byte sequence can be unambiguously interpreted as a valid character sequence, and every character sequence can be represented by one and only one byte sequence? I think probably yes, but it’s a bit late to worry about that now. [unfortunately it is impossible for such an encoding to also guarantee that f(A+B) will always be equivalent to f(A) + f(B), where A and B are arbitrary bytestreams and f is a string decoding function]

As it is, UTF-8 is the best we’ve got: It’s supported by almost everyone, it’s fairly easy to parse, and it replaces a hideously parochial code-page system, which benefit alone can hardly be overstated.

The Book of the Blog

Tuesday, July 6th, 2004

Hooray! A crappy slow day during which I achieved nothing at all was somewhat ameliorated by the arrival of the printed version of JujuBlog, compiled for, uploaded to, and ordered from CafePress just thirteen days ago.

I’ve done a couple of other test versions over the last six months, but this is the first one which feels "finished". This time I’m happier with the details, like getting the page numbering right, adding a contents part, putting in those little intro pages before the contents etc. Also I spent quite a while designing the cover.

So if you feel like catching up on JujuBlog and would like to do so away from a computer screen, you can purchase it here.

The Point [of the Book of the Blog]

Actual pages! Digital data, whilst infinitely reproducible, is also inherently disposable. For the same reason that it costs nothing to create an electronic document, it costs nothing to destroy it either. And although I do still have records of my work from more than ten years ago, I honestly can’t be bothered digging through my digital archives [read: old stacks of CDs] to try to find stuff.

Having stuff printed and bound in book form — even if it’s only a single copy for your own reference — means that in ten or twenty years there is a reasonable chance that it might still be lying around your house, for you to pick up spontaneously, flip through and reminisce. Think of how interesting a ten year old magazine can be, and then think of how much less interesting a ten year old article is when published on a web site. There’s still that extra special something about the printed word.

Philip K Dick’s prose kicks Dan Brown’s ass!

Monday, July 5th, 2004

Opening sentence from The Da Vinci Code:

Renowned curator Jacques Saunière staggered through the vaulted archway of the museum’s Grand Gallery.

Opening sentence from A Scanner Darkly:

Once a guy stood all day shaking bugs from his hair.

Need I say more?

[previous Dan Brown rant]

The Littlest Pronoun

Saturday, July 3rd, 2004

I think it’s probably time that I started capitalizing "I", since that’s what 99% of the English-speaking world do[es]. Using the minuscule form is a habit of mine that started with the advent of email, and probably comes across as a bit pretentious. Really it’s just a combination of laziness and hardline nerdulence, with my internal logic unit complaining that since "me" and "myself" aren’t capitalized, why should "I" be the one dopey exception?

Well, language is full of dopey exceptions — and on the whole is probably better for it — so from this day forward I will undertake to use the conventional majuscule for that tiniest of pronouns.

_________

UPDATE (2005): I have since cracked and updated all blog entries to use the proper form I. I just couldn’t take it any more ;)

We have no past, we won’t reach back

Saturday, July 3rd, 2004

Reading an entry on Ben’s weblog a little while ago reminded me that there exists an artist known as Cyndi Lauper. Listening to her 1983 album She’s So Unusual, I am indeed surprised at the emotional power in some of these songs; I keep having to drop whatever I am doing to stare glassily into the middle distance.

If you’re lost you can look and you will find me
Time after time
If you fall I will catch you I’ll be waiting
Time after time

So is this just eighties nostalgia kicking in? Perhaps. Maybe that’s why Time After Time was so effective in the dance finale of 1997’s Romy and Michele’s High School Reunion [IMHO a grossly underrated film, judging by its measly 5.8 rating on IMDB… even American Pie: Reheated gets 6.3].

But come on, how can anyone not get goosepimples at the chorus of All Through the Night? It’s such a powerful song, notwithstanding the nails-on-blackboard synth solo.

We have no past we won’t reach back
Keep with me forward all through the night
And once we start the meter clicks
And it goes running all through the night
Until it ends there is no end

The most exciting thing that’s happened to me this fiscal year

Thursday, July 1st, 2004

Last night I was driving home around midnight and noticed some flashing lights and fire trucks two blocks away from my house. I parked the car and ran back up the street, and there was a big fire just getting going!

I took some pictures, but what with it being night time and the fact that I can’t hold a camera still, almost none of them turned out. This is about the best I’ve got, and it utterly fails to capture the impressive scale of the turgid orange smoke plumes:

About half an hour later the power went out across the entire suburb, I think because there was an electricity sub-station close to the fire that was shut down for safety reasons [news story here]. Blackouts in urban environments are spooky and surreal! All the streetlights and shopfronts were dark, and with a nearly full moon everything was in silhouette, except for all of the floodlights of the fire trucks [and the fire itself of course].

I hung around partaking of the novelty for a while and then, when the fire started to die down a bit [always a bit of a letdown really], I returned to an utterly dark house, with nary a standby LED to wink at me.

Electrickery

I know it’s a totally banal thing to say, but it is amazing how crippled we are without electricity! Just getting to my front door was a trial, ascending four flights of stairs in pitch darkness and fumbling for the keyhole.

With no lights inside I had to find the candles… Exciting! [I then remembered that I had one of those keychain torches all along. What an idiot.] So with candles lit, I wondered what my next move should be.

I couldn’t use my computer… Obviously.

I couldn’t turn on the television… Derr!

I couldn’t use the house phone, since it is a cordless requiring AC power. Shit!

I couldn’t boil the jug, and I couldn’t heat up a cup of tea in the microwave. I couldn’t do any washing. The only hot water available to me was whatever was left in the system, so I had a late night shower, just in case the power wasn’t back up in the morning.

I couldn’t warm myself because my heater is electric. Brrr-r-r-r!

I couldn’t recharge the camera batteries to go take more pictures of the fire.

Somehow, I don’t even seem to own a radio. I wanted to be able to hear what was happening in the world, but had only a mobile phone to do it with. Was there a number I could ring for "general information"? This was getting stupid. Then I remembered that my absent housemate’s old phone has a built-in radio. Hooray! I powered it up only to discover that the radio would only function with the headset plugged in, which I didn’t have [or want… how would I hear the looters coming if I was wearing headphones?]. Damn! Damn! Damn!

And so, eventually, after pacing about for a while with a candle and being mesmerized by the big spooky shadows it cast, I went to bed.

And I’d completely forgotten about the poor bloody goldfish, who nearly suffocated overnight.