Douwe Osinga's Blog: March 2004

Tuesday, March 30, 2004

Open Source and the Third World

Rishab Aiyer Ghosh, of Electric Dreams fame, wrote an interesting article about license fees and GDP per capita. He expressed the cost of buying MS Office and Windows in terms of the average income in each country. In Viet Nam it takes 16 months of average income to earn enough to buy this software. In American terms this would mean over 48 thousand dollars for something you can get basically for free if you download OpenOffice and install Linux. No wonder 94% of all software in Viet Nam is pirated. His conclusion is that Open Source is a big chance for ICT development in the third world and thereby a huge implied threat to Microsoft's kingdom. There's a point to this, but it isn't that straightforward.


Setting GDP per head against the cost of software isn't the most relevant comparison. The proportion of people using computers in the Third World is often much smaller than in the West. The people who actually work with computers are much richer than the people who don't. Of course, a programmer in Bangalore makes much less than one in San Francisco, but it has been argued that their living standards are not that different. There are lots of cars in the Third World too, though they are comparatively very expensive (relative to income, a 50 thousand dollar Mercedes in the US compares to over 4 million in Viet Nam, and still you see expensive cars over there).


Another thing is of course the piracy. In Viet Nam only 6% of the software is legal, while in the US it is 75%. This means that on average a copy of Windows/Office in use brings Microsoft 420 dollars in the US, while in Viet Nam it only brings 33.5 dollars, which evens things out nicely. Fighting piracy tooth and nail in poor countries could change that, so this might very well be the reason why the BSA has been aiming at richer countries so far.
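The arithmetic behind those two figures can be sketched in a few lines. This is only a rough illustration: the list price of 560 dollars is an assumption, back-calculated from 420 dollars at a 75% legal share, and is not a number taken from the article. With that assumed price the Viet Nam figure comes out at about 33.6 dollars, close to the 33.5 quoted above.

```python
# Assumed list price for Windows plus Office (not stated in the post):
# back-calculated as 420 / 0.75 = 560 dollars.
ASSUMED_LIST_PRICE = 560.0


def revenue_per_copy_in_use(list_price, legal_share):
    """Average revenue per installed copy when only `legal_share` of copies are paid for."""
    return list_price * legal_share


us_revenue = revenue_per_copy_in_use(ASSUMED_LIST_PRICE, 0.75)
vietnam_revenue = revenue_per_copy_in_use(ASSUMED_LIST_PRICE, 0.06)

print(f"US: {us_revenue:.2f} dollars per copy in use")
print(f"Viet Nam: {vietnam_revenue:.2f} dollars per copy in use")
```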


The third thing is of course that the marginal cost of Windows/Office for Microsoft is negligible. Given a well-functioning mechanism of market separation, Microsoft could easily charge 20 dollars for Windows in Viet Nam and 200 in the US. Here the Redmondians have two options: either license certain versions of Windows only for use in certain countries or discount the Vietnamese version heavily.


The second option should work; not a lot of Americans will learn Vietnamese to avoid the cost of a copy of Windows, but it is rather restrictive. The first option would work within the EU for example, because Microsoft could block parallel import of cheap Windows versions from Viet Nam with reference to their trademark (there is free movement of trademarked products within the EU, but not into the EU). Books are cheaper in India than in the UK, but are stamped "to be sold in India only".


But what about the point of the article? Does Open Source open a route for Third World countries to build up expertise cheaply so that they can join the world market for software? Maybe in some cases, but outsourcing is just another market, with market forces at work. If I were a programmer in the Third World I wouldn't necessarily concentrate just on gcc, but would also include Visual Basic and its other * Studio cousins. In all fairness, Microsoft is not so bad for starting developers. Their tools are usually pretty easy to get going with and they come with a host of good documentation.


For governments it might be a different story. Releasing all software developed by the central government under the GPL isn't a bad idea; the people paid for it, so they should have rights to the results too. Also, getting a localized version out for Open Source products is usually not that much work and can bootstrap the local IT industry faster than waiting for Microsoft to come out with something.

Friday, March 26, 2004

The Future of Searching

A number of emails from different people got me thinking about the future of search. The Third Era of Searching, opened by Google, is drawing to an end. The new era might prove as disruptive as the other three.


The first generation of search engines (Lycos, Excite, Infoseek) did little more than match pages with keywords. There were not that many pages (remember when there was a page on Yahoo called What's New, featuring all new websites of that week?) and getting a list of all pages containing travel AND amsterdam was usually enough search power. If you wanted more than 10 results, you had to pay.
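That kind of keyword matching is easy to sketch. The snippet below is a toy illustration, not how Lycos or Excite actually worked: it builds an inverted index (word to set of pages) and answers an AND query by intersecting the sets. The page addresses and texts are made up.

```python
from collections import defaultdict


def build_index(pages):
    """Map every lowercased word to the set of page addresses containing it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search_and(index, *keywords):
    """Return the pages that contain all of the given keywords."""
    if not keywords:
        return set()
    matches = [index.get(keyword.lower(), set()) for keyword in keywords]
    return set.intersection(*matches)


# Hypothetical mini web, for illustration only.
pages = {
    "amsterdam-guide.example": "Travel tips for Amsterdam museums and canals",
    "paris-guide.example": "Travel tips for Paris",
    "amsterdam-history.example": "A short history of Amsterdam",
}

index = build_index(pages)
print(search_and(index, "travel", "amsterdam"))
```

Note that nothing here ranks the results; every page containing both words counts equally, which is exactly what stopped working once the web grew.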


This didn't last that long. The World Wide Web grew really fast and the search engines of old started to drown in the results. Spidering the whole web got very slow and returning all pages containing amsterdam AND travel no longer did the trick; too many results were only slightly relevant.


Hotbot and Altavista were the new kids in town. Complex algorithms decided from now on which page scored how well for which search terms. The spiderbots were much faster, reducing the time between submission and appearing in the list. All seemed well again.


But two things happened. Search Engine Optimization became big business. Companies started specializing in reverse engineering the ranking algorithms of search engines. These companies then used their findings to create pages that would score perfectly for certain keywords. The result was that search engines no longer returned the best matches, but the pages that were best optimized. At the same time, portals became the cool thing. No longer was it enough for Altavista to offer the best search engine; they also wanted to offer the latest news, horoscopes, stock quotes and weather information. And lots of colorful and moving advertisements, of course. And all of this on the same page.


Search seemed broken until Google opened its doors. Gone were the graphic-heavy pages with lots of information. Just a search box with two buttons: Google Search and I'm Feeling Lucky. Google also introduced their PageRank algorithm, which took search engine optimization out of the equation (at least for a while). From now on, only the most relevant pages were returned. Google didn't just look at the keywords on the pages themselves, but even more at the links pointing to a page and the keywords surrounding those links. The search engine optimization companies could modify the pages all they wanted, but could hardly influence the links pointing to the pages.
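The idea of treating links as votes can be sketched as a small power iteration. This is a simplified illustration of the published PageRank idea, not Google's production ranking: it ignores the keywords around links entirely, and the damping factor of 0.85 and the number of iterations are assumptions chosen for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank over a dict mapping each page to the pages it links to."""
    pages = set(links) | {target for targets in links.values() for target in targets}
    count = len(pages)
    rank = {page: 1.0 / count for page in pages}

    for _ in range(iterations):
        # Every page gets a small base score, independent of links.
        new_rank = {page: (1.0 - damping) / count for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its vote evenly over the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page without outgoing links spreads its vote over all pages.
                share = damping * rank[page] / count
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical link graph: "a" and "b" both link to "c", and "c" links back to "a".
ranks = pagerank({"a": ["c"], "b": ["c"], "c": ["a"]})
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

The page nobody links to ends up at the bottom, no matter what keywords it contains, which is why editing your own pages stopped being enough.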


But the growing importance of a good Google ranking made it worth it to go the extra mile. People started link farms, websites that did nothing but link to other websites, thus increasing the PageRank of those sites. People started exchanging links just for the increased Google rank. And building pages became even more an art - not only did you have to optimize the content, you now also had to pay attention to your links, incoming and outgoing.


Google is putting up a brave fight, but I think the end of the third era is nigh. It is hard to create a link-sphere around your site, but not impossible. In the end market forces will wreck the algorithm.


The next generation search engines will have to immunize themselves completely against search engine optimization. This seems very hard, but I see a few options:



  • Clicks on search engine results. If I see a page of returned results from Google, I can usually see from the description whether the page is really relevant or just a bit. If it seems relevant, I click the link; otherwise I don't. A search engine could just check which pages are clicked more often for which search terms and push the ranking of those pages for those search terms up.

  • Page popularity. Google takes the links to a page as votes for that page. But that is rather indirect. Why not return the page that has the most visitors and is relevant for Amsterdam Travel (based on the content), when somebody searches for Amsterdam AND Travel?
    Sure, this will create a self-perpetuating situation, because popular pages will get the lion's share of search engine traffic, but pages with a lot of visitors are usually better than the ones with almost no traffic. And it is very hard to fake, at least much harder than links to a page.
    How would a search engine go about this? Well, they would need something on the computer of the user that could measure browsing habits, something like... the Google Toolbar. Makes you wonder.

  • Personal Search. If two people search for something using the same keywords, it is not necessarily the case that the same pages are the most relevant for both of them. Some people would like pages about the great museums of Amsterdam; for others the local tolerance of some substances is more of a reason to go. From clicking and searching, a profile could be built up, returning more specific results (and ads).
    This of course has tremendous privacy implications and might well not be acceptable to a lot of people, but it is an idea.
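The first of these options, ranking by clicks, is simple enough to sketch. The class below is a hypothetical illustration (the name ClickRanker and the example addresses are made up): it counts clicks per search term and page, and moves the most-clicked pages to the top while leaving the original order intact for pages with equal clicks.

```python
from collections import Counter


class ClickRanker:
    """Re-rank search results by how often each result was clicked for a query."""

    def __init__(self):
        # Counts clicks per (query, page) pair.
        self.clicks = Counter()

    def record_click(self, query, url):
        self.clicks[(query.lower(), url)] += 1

    def rerank(self, query, results):
        """Most-clicked pages first; ties keep their original order (stable sort)."""
        key = query.lower()
        return sorted(results, key=lambda url: -self.clicks[(key, url)])


ranker = ClickRanker()
original = ["spam.example", "museums.example", "canals.example"]

# Hypothetical click log for the query "amsterdam travel".
ranker.record_click("amsterdam travel", "museums.example")
ranker.record_click("Amsterdam Travel", "museums.example")
ranker.record_click("amsterdam travel", "canals.example")

print(ranker.rerank("amsterdam travel", original))
```

A real engine would also have to correct for position (the top result gets clicked more simply because it is on top) and for faked clicks, which this sketch ignores.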

So will Google lose their spot as top dog? It is possible. MSFT is out there to conquer the search engine space, while AOL and Yahoo! still control a lot of eyeballs. And then there are the outsiders. Of course a new startup might develop startling new technology and sweep everything away, just like Google did before them. But I wouldn't rule out Amazon either. Amazon bought Alexa. Not only does Alexa have a large database of what people think about websites, they also measure actual browsing behavior. It would not be that hard to build a very competitive search engine around Alexa. Amazon has a history of expanding into other things.

Monday, March 22, 2004

Some new stuff

I was cleaning up some code; I'm writing a piece for the upcoming Mapping Hacks book from O'Reilly. Anyway, in the process I created a standalone version of Google News Map (i.e. it shows a news map on your desktop rather than in your browser) and a nifty program that you can use to grep a property from the CIA factbook database.


The first program should work on any OS as it is pure Python, using the Tkinter toolkit. The second one is a command line program. You give it one parameter, for example 'capital', and it will list the capitals of all countries. It works with any keyword of the CIA World Factbook, e.g. 'population growth', 'geographical location', etc.

Thursday, March 18, 2004

The European Free State Project

In the United States, the land of the free, a group of libertarians has decided that there is no such thing as too much freedom. They’ve set up a free state project. Their plan: take over one state (New Hampshire) and get rid of all redundant laws. Such a thing seems unlikely to succeed in the US and completely impossible in Europe, where the state is mightier and more controlling. Or so it would seem. Maybe Europe offers the best options for a free state project.


European governments are much more intrusive than the American Federal Government in general, but if you compare the European Union with the American Federal Government, you get a completely different picture. In the US, 25% of GNP is spent by the central government. The EU spends about 1%. European law might be complicated, but it is still simpler than American law. European countries of course spend much more than American states, and their laws are even more complicated.


But what if you could start a European country without all that? Just have European law, 1% federal taxes and that’s it. Also, the European Union doesn’t seem to be that critical when it comes to new members. Respect for the law, a functioning market economy, respect for European values: all things liberty-loving people can live with. The only thing we need is an available part of Europe.


Kaliningrad springs to mind. It is now part of Russia, but not connected to the rest of the country. It used to be German (Immanuel Kant lived here) and once Poland and the Baltics have joined the EU, it will be surrounded by the Union. Putin will need to be bought off, of course, and the recent history of Kaliningrad probably doesn’t make its people the most suitable inhabitants for a free state.


An alternative is new land. The Dutch have a great tradition of making land out of sea and there is still one rather big piece of water available: the Markerwaard. On older maps you see this area marked as a planned polder, but the Dutch government has decided not to go through with the plan (while at the same time insisting that the Netherlands are full). The Markerwaard would be the ideal location for a European Free State. So how does a country secede that doesn’t exist yet?

Tuesday, March 16, 2004

New project: Poetry in Translation

There was this story about one of the first translation programs, which translated from and to Russian. After a return trip, the sentence 'the spirit is willing but the flesh is weak' read 'the vodka is ok, but the meat stinks.' Computer aided translation is still not quite there and Poetry in Translation is a case in point. Here I use Google to translate a sentence from English into German, then into French and then back into English again. Results vary from perfectly understandable to rubbish, with sometimes something funny in between.
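The round trip itself is just a chain of translations. The sketch below shows only that chaining; the translate function is something you have to supply yourself (for instance a wrapper around whatever translation service you have access to), and its signature of text, source language, target language is an assumption made for this example. The stand-in translator at the bottom merely tags the text so that the example runs without any network access.

```python
def round_trip(sentence, translate, languages=("en", "de", "fr", "en")):
    """Pass a sentence through a chain of languages and return the final text.

    `translate` is a caller-supplied function taking (text, source, target).
    """
    text = sentence
    for source, target in zip(languages, languages[1:]):
        text = translate(text, source, target)
    return text


def fake_translate(text, source, target):
    """Stand-in translator for demonstration: only records the hops taken."""
    return f"{text} [{source}->{target}]"


print(round_trip("the spirit is willing but the flesh is weak", fake_translate))
```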

Monday, March 15, 2004

Back from the USSR

Back from Armenia and Georgia. All in all a very interesting week with lots of sights, activities feeling like real adventure and of course amazingly friendly people. No night without an offered drink and sometimes not even a morning.


Travel books about the region are filed under either Asia or Europe. Technically speaking, Armenia and Georgia are part of Europe. This might have a lot to do with the fact that they used to be part of Russia/the USSR and that you have to draw the line somewhere (though Siberia is usually deemed to be Asia). But it makes you wonder, where are the boundaries of Europe?


The first question, of course, is: which Europe? The song contest? The European soccer championship? Or the big one, are they eligible for the European Union?


Walking in downtown Yerevan (the capital of Armenia), you could swear you're in any other Eastern European capital recovering from socialism with a vengeance. Young people sipping their lattes in hip bars, expensive shops selling the latest gadgets from Japan. Up in the mountains it is a different story. Things are in general disrepair and anybody you meet has a 70% chance of being drunk, usually on homemade stuff that tastes awful (but does grow on you).


But it is all fixable and if the right economic policies are implemented, no doubt these countries will grow richer, healthier and happier. But EU candidates? Why not. But where do you draw the line then? Should Kazakhstan be able to apply? Or Morocco? Or even Iran? I'd say, why not. Europe is an idea more than a place. If countries adopt the right policies and implement democracy and freedom of the press, they're welcome. Take Canada, I'd say it is a perfect candidate.


So the EU will keep absorbing countries until it borders the US on all sides? Why stop there? I'm hoping for the day the US and the EU will merge.


 

Saturday, March 6, 2004

Georgia on my mind

So I'm in Georgia. No left luggage and no sleep made for a day of hard work in Tbilisi, walking around and taking in the sights, finished off in style with a night of drinking and eating the local specialties. And man, did we sleep on that night train.


Batumi, once the jewel in the crown of Soviet Black Sea resorts, had a surprise for us. The predicted 20 degrees and sunshine didn't materialize. Everything was covered in a thick layer of snow instead. And it rained. Since the main point of a visit to the town was the famed boardwalks, and 7 am really is a bit early to start on the vodka again, we went to Gonio, the Byzantine last stand after they lost Constantinople to the Turks.


The fort was impressively big and hard to enter, whether you were a Turk looking to stamp out Christianity in the area or a tourist looking for a culture fix, especially since the main gate was locked. When asked about it, a local pointed vaguely in the direction of the nearby village and mimed that the official who looked after the fort was still in bed, which was possibly related to the amount of drinking he'd done the night before.


The nearby village consisted mainly of two crumbling apartment buildings that once housed the proud workers of the Soviet Union and are now also used for rearing goats. One of the guys hanging around took it upon himself to wake the gatekeeper and ten minutes later we were walking inside the fort, which looked remarkably like the outside, except for the orange trees. Orange trees in the snow against the background of a Byzantine fortress are an unusual sight.


Anyway, tomorrow we're off to a ski resort. Maybe we'll get some sunbathing there.

Wednesday, March 3, 2004

Australian Broadcasting Corporation

There are not many reasons to blog at 4:54 in the morning. Being featured on ABC is one. I just got off the telephone from an interview with them! In Australia, of course, it was close to lunch time, while here the word 'ungodly' seemed more appropriate to describe the time, but it was fun, though not very deep. It also shows how global the Internet really is; some Australian researcher discovers Visual Poetry and there you go. Could have happened anywhere really. It was a good week, press-wise, with the Guardian calling on Tuesday to talk about World66. Now I'm back to bed and then off to Georgia later in the day.

Tuesday, March 2, 2004

Memory Systems

Wired has a piece about the American memory championship. People compete in things like remembering the order of a deck of cards, random poetry or binary digits. The world champion in binary digits does around 3000 in 30 minutes, i.e. memorizes 100 bits per minute. Computers laugh at this, of course, but for a human it sounds pretty impressive.


I was discussing this article with Hans-Peter and how you could do something like this. We imagined they'd probably memorize a lot of words, say 1024, and then name every 10 bits after a word, thus reducing the stack of 3000 bits to a list of 300 random words, which is still hard, but already sounds a lot more doable.
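To make the idea concrete: 10 bits can take 1024 different values, so each group of 10 bits picks exactly one word from a list of 1024. The sketch below is only our guess at such a scheme, not a description of what memory champions really do, and the word list is a placeholder (word0000 to word1023) standing in for 1024 real, memorable words.

```python
import random

# Placeholder vocabulary; a real scheme would use 1024 memorable words.
WORDS = [f"word{number:04d}" for number in range(1024)]
WORD_TO_NUMBER = {word: number for number, word in enumerate(WORDS)}


def bits_to_words(bits):
    """Turn a string of 0s and 1s into one word per group of 10 bits."""
    if len(bits) % 10 != 0:
        raise ValueError("the number of bits must be a multiple of 10")
    return [WORDS[int(bits[start:start + 10], 2)] for start in range(0, len(bits), 10)]


def words_to_bits(words):
    """Recover the original bit string from the list of words."""
    return "".join(format(WORD_TO_NUMBER[word], "010b") for word in words)


bits = "".join(random.choice("01") for _ in range(3000))
words = bits_to_words(bits)
print(len(words), "words to remember instead of", len(bits), "bits")
```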


Random words are, of course, harder than words that make some sense, so it would probably be better to mix in some colors, other adjectives, numerals and verbs. This could actually be a useful technique in everyday life. Make a list of ten colors, ten verbs, ten locations, etc. and you can code long numbers in sentences like 'seven gray birds jumped happily into the dark river.' Remembering a telephone number was never this easy.
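Here is a minimal sketch of that sentence trick for a seven-digit number. The word lists and the sentence template are made up for the example; any ten words per slot would do, as long as you always use the same lists. Each digit selects one word from the list that belongs to its position.

```python
# Hypothetical word lists: ten words per slot, so each digit selects one word.
NUMERALS = ["zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine"]
COLORS = ["red", "orange", "yellow", "green", "blue", "purple", "pink", "gray", "brown", "black"]
ANIMALS = ["cats", "dogs", "birds", "frogs", "horses", "mice", "owls", "pigs", "sheep", "wolves"]
VERBS = ["jumped", "walked", "ran", "crawled", "danced", "rolled", "slid", "marched", "tumbled", "wandered"]
ADVERBS = ["happily", "slowly", "quickly", "quietly", "loudly", "bravely", "calmly", "eagerly", "lazily", "proudly"]
ADJECTIVES = ["dark", "bright", "cold", "warm", "deep", "wide", "old", "new", "quiet", "busy"]
PLACES = ["river", "forest", "cave", "lake", "garden", "valley", "tunnel", "meadow", "harbor", "market"]

SLOTS = [NUMERALS, COLORS, ANIMALS, VERBS, ADVERBS, ADJECTIVES, PLACES]
TEMPLATE = "{} {} {} {} {} into the {} {}"


def encode(digits):
    """Turn a seven-digit string into a sentence, one word per digit."""
    if len(digits) != len(SLOTS) or any(digit not in "0123456789" for digit in digits):
        raise ValueError("expected exactly seven digits")
    words = [slot[int(digit)] for slot, digit in zip(SLOTS, digits)]
    return TEMPLATE.format(*words)


def decode(sentence):
    """Turn a sentence produced by encode() back into its seven digits."""
    tokens = sentence.split()
    # Skip the fixed filler words "into the" at positions 5 and 6.
    words = tokens[:5] + tokens[7:]
    return "".join(str(slot.index(word)) for slot, word in zip(SLOTS, words))


sentence = encode("7720000")
print(sentence)
print(decode(sentence))
```

With these particular lists the number 7720000 becomes 'seven gray birds jumped happily into the dark river', and decoding the sentence gives the number back.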