MIT Handbook of Collective Intelligence opens up

Wednesday, September 5th, 2007

Since last year I have been contributing to an MIT project that attempts to create a comprehensive Handbook of Collective Intelligence. This project was initiated by the newly created MIT Center for Collective Intelligence (CCI). It is quite logical that the managers of this project decided to use a collective intelligence technique to describe collective intelligence itself. Collective intelligence techniques that process natural language (e.g. human-based computation) are ideally suited for this purpose and have been used successfully to describe themselves in the past. In 1998 I used a human-based genetic algorithm to evolve a short description for the website that implemented it. Wikipedia has had an evolving page describing itself since at least 2002. Dr. Terence Fogarty used a human-based genetic algorithm to evolve another name for itself (“Automated Concept Evolution”) in 2003. A more recent example is Assignment Zero by Jay Rosen and collaborators, a successful experiment in using collective intelligence and crowdsourcing to report on crowdsourcing itself. The MIT project, which has the same purpose, may be even more ambitious than the previous projects, but it can’t be called a success so far.

I was initially surprised that the MIT Handbook of Collective Intelligence team decided to use the SocialText wiki software and not MediaWiki (especially taking into account that Jimmy Wales is on the advisory board of the CCI). I found SocialText less convenient to work with than MediaWiki (though I am biased here, as I had been a contributor to Wikipedia for several years before I tried SocialText for the first time). Content accumulated in the Handbook rather slowly. Researchers had to request an account by email to contribute to the Handbook. In addition, it is often suggested that researchers are reluctant to contribute content to wikis because the academic system pressures them to submit their writing to traditional peer review and to avoid publications that are not officially recognized as “peer-reviewed.” I contributed the majority of the content to the page on Examples of Collective Intelligence in February. Others have made few edits to this page since then. Other pages were not frequently updated either. It seems that the page on Examples of Collective Intelligence was the most visited content page of the Handbook, with 10885 views at the time of writing. The only page I could find with a larger number of views is the main page (now missing), with 16619 views.

This summer, the Handbook of Collective Intelligence team moved the Handbook from SocialText to the scripts.mit.edu domain, changed the software from SocialText to MediaWiki, and, more importantly, opened the Handbook to public contributions as Wikipedia does: anyone can now register, or edit the content of the Handbook without registering. Apparently it was the lack of progress that motivated opening up the project for anyone to contribute. People at MIT must have thought that their project would share the success of Wikipedia once it opened itself up to anyone’s contributions. However, the reality so far doesn’t seem to support this. A lot of new content is indeed being contributed, and a lot of activity can be seen in the list of recent changes (at the time of writing it looks like this). My first impression was that the Handbook had gone international. I had a hard time finding anything related to collective intelligence in this list, though many irrelevant pages are created every day in different languages. The majority of the newly created pages seem to be in Simplified Chinese. In a random sample of three pages, all were in Chinese and had no relation to collective intelligence whatsoever.

This brings us back to the topic that I discussed in my previous post, Bugs of collective intelligence: why the best ideas aren’t selected?. The common failures of collective intelligence clearly suggest that it is not a phenomenon that automatically emerges once someone sets up a shared space like a wiki and brings it to the attention of many people. Making these systems work requires an understanding of their dynamics, and this is especially true of wikis. There is still serious research to be done on the factors that make different collective intelligence methods effective. That is beyond the scope of this post, but here I want to give some hints as to why some wiki-based projects perform poorly.

The main weakness of wikis as a collective intelligence platform is their weak mechanisms of selection. This may lead to what is known as genetic drift. Selection in current wikis is strongly biased towards the most recently contributed content (“the last edit wins”). For a wiki-based project to work, the community has to have enough people who put some effort into overcoming this temporal selection bias present in the software. Those people should be motivated enough to go into the revision history, revert unhelpful changes, and select better versions of the content (the software doesn’t encourage the ordinary user to do this). They also have to check the recent changes history to delete obvious spam pages. Deletion is necessary in wikis because there is no other way to focus people’s attention on important pages (like importance sampling in human-based genetic algorithms). Any wiki-based project pretty much depends on a community of people to overcome the bias present in its software. The MIT CCI project so far hasn’t created a community that effectively performs these functions.
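To make the selection argument concrete, here is a toy simulation. It is only a sketch under my own assumptions, not a model of any real wiki: each edit gets a random quality score between 0 and 1, every edit replaces the current version, and a hypothetical patroller notices a worse edit and reverts it with some probability. The names simulate_page, average_final_quality and revert_probability are invented for this illustration.

```python
import random


def simulate_page(edit_qualities, revert_probability, rng):
    """Return the quality of the version left standing after all edits.

    Every edit replaces the current version ("the last edit wins").
    With probability revert_probability, a hypothetical patroller looks at
    the revision history and restores the previous version if the new edit
    made the page worse. Quality scores are an assumption of this sketch.
    """
    current = edit_qualities[0]
    for quality in edit_qualities[1:]:
        previous, current = current, quality
        if current < previous and rng.random() < revert_probability:
            current = previous  # revert the unhelpful change
    return current


def average_final_quality(revert_probability, trials=2000, edits=50, seed=0):
    """Average final page quality over many simulated pages."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        qualities = [rng.random() for _ in range(edits)]
        total += simulate_page(qualities, revert_probability, rng)
    return total / trials


if __name__ == "__main__":
    for p in (0.0, 0.5, 1.0):
        avg = average_final_quality(p)
        print(f"revert probability {p:.1f}: average final quality {avg:.2f}")
```

With no reverting, the final version is simply whatever was contributed last, so its quality is no better than that of a random edit. The more reliably someone reverts worse edits, the closer the page stays to the best version it has ever had. In this toy model, the work of counteracting the temporal bias is done entirely by the patroller, not by the software.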

Update: I found it curious that the license under which the content of the Handbook is published prohibits editing it (link). It is the Creative Commons Attribution-NonCommercial-NoDerivs 2.5 license, which allows no derivative works, while any edit creates a derivative work. The license explicitly says “You may not alter, transform, or build upon this work,” and yet the content is provided in the editable form of a wiki. On the other hand, the same license requires attribution, and yet when the content I contributed was copied, presumably by MIT CCI employees, to the mit.edu domain, the attribution information was stripped, so no attribution is given to me or to any of the other contributors. Hopefully, whoever is responsible for this project will fix this, because currently there are too many contradictions. Meanwhile, I would recommend Wikipedia as a better organized resource on the topic of collective intelligence and, importantly, a working example of the concept.

Programming vs writing

Saturday, November 25th, 2006

Paul Graham here compares the relative difficulty of programming and writing.

Switching back to writing has confirmed something I’ve always suspected: writing is harder than hacking. They’re both hard to do well, but writing has an additional element of panic that isn’t there in hacking.

I feel pretty much the same way, though I would rather explain the difficulty by the context switch and a difference in experience. Apparently, for most non-programmers writing is easier. I have done quite a bit of both, but when I switch between them, I still feel difficulty and a lack of confidence for some time. Here is my understanding of why this happens.

There are at least two major differences between programming and writing that require adaptation. One is related to how a new piece of work is created; the other is about how it is evaluated.

Programming languages are less expressive than natural languages. Pieces of code in a programming language depend less on their context, and when they do depend on it, they do so mostly in strict and predictable ways. Their relative independence makes it easy to combine them. In natural languages, the degree of context dependence is much higher, and the dependencies can be subtle and less predictable. Different readers have different word associations, which can lead them to different interpretations and conclusions.

There are many automated tools to test and evaluate the results of programming, and they all give you objective feedback. Most programmers are rewarded psychologically when their program compiles, runs, and, finally, passes all tests. There is immediate feedback at each of these stages, which supports their confidence. Writing is quite different. While there are automated ways to check spelling and syntax, I don’t know of any tools to test semantics. It is not even clear whether such tests are possible. We can specify the architecture a program will run on, but we can’t specify what kind of reader will read our writing. In some sense, our writing is executed concurrently in the minds of people, each with a unique architecture.

However, writing becomes easier if we stop thinking like programmers. Instead of trying to achieve a specific goal in an optimal way, we can open our minds to other people and share our thoughts.