
Technaut's Idea Forge

Below are the 20 most recent journal entries recorded in technaut's LiveJournal:

    Tuesday, May 9th, 2006
    1:53 am
    Dependency Hell
    There is a major problem I have with every single Linux distribution I've ever used: before very long, I manage to get myself into what is known as dependency hell.

    It always starts innocently enough. I have some project in mind that requires the latest feature of some piece of open-source software. By far the easiest way to install software in Linux is to use one of the many package management tools such as apt-get, yum, yast, urpmi or emerge. These automate the process of installing pre-compiled files and save the user the difficulty of knowing how best to compile a given piece of software for their particular distribution.

    The trouble with these tools is that they often insist that a given piece of software can only be installed in an environment that has a particular set of libraries available. This is not really a problem in Linux itself, which can happily host multiple versions of a given library in a seamless way. Theoretically, if you need a library you don't have, you can simply install it and you're good to go.

    In practice it's much harder than that. None of the package managers I've ever used has a way of saying that you want to install a second (or third or fourth) version of a library. They always want to remove the old version before installing the new one. This, I suppose, reduces the problem of having a dozen different (and presumably unused) versions of a single library on a machine.

    In practice though, it leads to a far worse problem. You often discover that your current versions of a dozen different tools rely on a currently installed library. If you want to upgrade the library, you have to upgrade all of these tools as well. Of course, these new tools may themselves depend on other new libraries that need to be installed and so on and so on.
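
    To make the cascade concrete, here is a toy sketch (in Python, with invented package names and dependency data) of how one upgrade snowballs into many:

        # Upgrading one library pulls in new versions of every tool built
        # against it, whose new versions pull in still more new libraries.
        # All package names and version numbers here are invented.
        forces = {
            "libfoo-2.0": ["tool-a-2.0", "tool-b-3.1"],
            "tool-a-2.0": ["libbar-1.5"],
            "tool-b-3.1": ["libbar-1.5", "libbaz-0.9"],
            "libbar-1.5": [],
            "libbaz-0.9": ["tool-c-4.0"],
            "tool-c-4.0": [],
        }

        def upgrade_closure(pkg):
            """Everything that ends up upgraded once you touch pkg."""
            seen, stack = set(), [pkg]
            while stack:
                p = stack.pop()
                if p not in seen:
                    seen.add(p)
                    stack.extend(forces[p])
            return seen

        print(sorted(upgrade_closure("libfoo-2.0")))
        # one 'simple' library upgrade drags five other packages with it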

    I once installed a (I thought) simple upgrade to a program because it had a new feature I desperately needed. By the time all of the dependencies were resolved, my entire operating system had jumped two versions. I had started out running Fedora Core 2 and was now running Fedora Core 4. Or rather, a horrible mongrelized version of Fedora Core 2 and Fedora Core 4 which happened to think it was Fedora Core 4.

    This is the equivalent in the Microsoft world of trying to install a new program in Windows 95, only to discover yourself running Windows 2000 when you're done.

    There are many ways that one might try to combat the problem of dependency hell. The current system that most distributions use is a system of repositories. These hold versions of the software that are all consistent, and can be safely installed together. Of course that means that if a new piece of software requires a new library feature, one that isn't supported by the base libraries of the repository, then that version of the software doesn't get into the repository.

    It can be very frustrating to face a software problem that is fixed by the latest version of some piece of software, when that version isn't in the repository. The temptation is to link to one of the experimental or 'unstable' repositories to grab the version of the software you need. Three times in four, it works. You install the program and a couple of supplemental libraries and you've happily solved your problem. The fourth time, you end up back in dependency hell.

    There are other obvious ways to deal with aspects of the problem, from having an explicit language to describe how library APIs change, so that library dependencies can be derived rather than being set by the software author, to having a far more strongly-typed and object-oriented operating system that could more easily deal with these issues. Both of these are difficult and manpower-intensive changes to implement, and what you'd have when you were done would arguably not be Linux any more.

    A far simpler solution and one that would go a long way toward solving the problem would be to add a pair of features to most package managers. The first would be a flag to specify that you wish to install a parallel version of a package, be it a software tool or just a library. Thus, I would finally have a simple way of telling my distro that I need 3 different versions of GCC installed for development testing purposes.

    The second feature would be a way of installing just a library. Often you'll install a package and discover that it needs Library X. Library X is part of Program X and you have to install that program (which requires you to satisfy its dependencies first), even if you never want to use that program. Sometimes you have the option of installing Development Kit X instead, which also contains Library X and is designed to let you write programs that use Library X. The thing is, you don't want to write programs that use Library X; you already have one that you simply want to install.

    By paring down the number of files you need to install to satisfy a given set of dependencies, and by having tools that are willing to explicitly manage multiple versions of the same installed library, we could go a long way to eliminating dependency hell.
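
    As a rough sketch of what the first of those features might look like, here is a toy package database (in Python; the interface and all names are invented for illustration) keyed by name and version, so that a parallel install never forces a removal:

        # The key idea: index installed packages by (name, version), not by
        # name alone, and make parallel installation an explicit request.
        installed = {}  # (name, version) -> install metadata

        def install(name, version, parallel=False):
            existing = [v for (n, v) in installed if n == name]
            if existing and not parallel:
                raise RuntimeError(
                    f"{name} {existing[0]} already installed; "
                    "request a parallel install to keep both")
            # each version gets its own prefix, so files never collide
            installed[(name, version)] = {"prefix": f"/opt/{name}/{version}"}

        install("gcc", "3.4")
        install("gcc", "4.0", parallel=True)  # the proposed flag
        install("gcc", "4.1", parallel=True)  # three compilers, side by side
        print(sorted(v for (n, v) in installed if n == "gcc"))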
    Monday, May 1st, 2006
    11:59 am
    File Indexing Service.
    There is an online media database called freedb that performs a very useful service. Almost anyone who has ever converted an Audio CD to MP3 format has used freedb, even if they weren't aware of it at the time.

    The service stores metadata about the songs that can be found on audio CDs. Since the CDs themselves only store the raw song data, information about the names of the tracks, the musicians, even the name of the CD itself has to be sought elsewhere. Rather than forcing everyone who has ever 'ripped' a CD to enter this data manually, the freedb service lets you enter the information into a central repository.

    That way, 99.9% of the time, when you put a CD into a ripping program, it will find an already-existing entry in the database for you to work with. Only in cases where the CD is so new, or so obscure, that no one has seen it before will you have to enter the data by hand. Once you've done so, you can then upload it to the central repository so that the next person won't have to do the same.
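
    For the curious, the key freedb looks up is computed from nothing but the disc's table of contents. Here is a sketch of the classic CDDB disc-ID calculation, written in Python and simplified to take track start times in seconds:

        def digit_sum(n):
            # sum of the decimal digits of n
            s = 0
            while n:
                s += n % 10
                n //= 10
            return s

        def cddb_disc_id(track_starts, disc_length):
            """track_starts: start time of each track in seconds (including
            the standard 2-second lead-in); disc_length: total seconds."""
            n = sum(digit_sum(t) for t in track_starts)
            t = disc_length - track_starts[0]
            return (n % 0xFF) << 24 | t << 8 | len(track_starts)

        # a hypothetical 3-track disc: tracks at 2s, 150s and 310s, 40 min long
        print(f"{cddb_disc_id([2, 150, 310], 2400):08x}")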

    Now, CDs are not the only digital objects that could benefit from an online database of meta-information. Anyone who has ever used a P2P client has come across the problem of files that are misnamed, misidentified, corrupt or just plain not what you thought they were from the name. A similar service that all P2P services could hook into to store and retrieve metadata on arbitrary data files could be a boon to P2P users everywhere.

    And I'm not just talking about a boon for pirates and illegal downloaders. If the P2P system is ever going to be seen as a legitimate method of distributing software and music, then it will be necessary to have a way of distinguishing between content that is free for everyone and content that one needs to have purchased.

    As a slight aside, this doesn't necessarily mean downloading copyrighted works should be forbidden. I have a large number of books that I have purchased. My understanding of my rights to that content means that I have the right to an electronic copy for backup purposes. I could, of course, use an OCR system to scan in the book, but it's usually just easier to download it off the net. Besides, it's slow and time-consuming to search a physical book for a particular fact you remember it having, while searching a text file is usually much faster and easier. Ultimately the only person who can actually know if they have the right to download a particular file is the person that is doing it, and I would like to make it easier for them to know where they stand. For the rest of this article I'll simply assume that it's a good idea to let people be better informed of the details of files they encounter on the Internet, including their copyright status, as a thorough debate of this issue could fill several books (and has).

    Considering the acrimony of the debate on issues surrounding P2P distribution and the confrontational natures of the various parties involved, anyone that provided this service would face a number of technological hurdles. To deal with high traffic loads, denial-of-service attacks, database poisoning attacks, and a number of other dirty tricks, the system would need to be carefully designed.

    It should be distributed across multiple servers in different geographical locations and use a collaborative filtering system to help maintain a high degree of data accuracy. At the same time, it needs to be able to accept information updates from large numbers of users who may well have legitimate reasons to do so anonymously due to the regulations of their home countries. These problems are all solvable, although they would require some hard work on the developers' part.

    The result would be a very useful service, and the owner of that service would find themselves in possession of a valuable database. The predecessor to freedb sold all rights to their database and made a fortune, albeit at the cost of taking the service offline. The real success of a P2P file metadata database would come when it becomes so useful that copyright owners are willing to pay to have authoritative data inserted into the database.
    Monday, April 17th, 2006
    11:36 am
    Audio Games
    Here is an idea that has been kicking around in my mind ever since I first saw a simple 3D maze game for the Commodore PET computer back in 1980. The idea was a simple one: create a video game without the graphics. I thought it would be interesting to see how well someone could navigate a maze if their only clues were provided by a pair of stereo headphones. The maze would have to have some characteristic sources of noise that could be used as clues to when you were getting closer or farther from a given point, and what direction you were virtually facing.

    With time, this idea evolved and grew into a very Doom-like scenario where one is hunting monsters through a pitch-black series of caverns and tunnels, and some of them are hunting you. There would be careful attention given to the sounds echoing off different surface types, and the clue noises that would let you know that another creature was in the area.
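
    The heart of the idea is rendering direction and distance through the headphones alone. As a minimal sketch of the underlying math (constant-power panning plus distance falloff; a real game would layer echoes and proper head modelling on top of this):

        import math

        def stereo_gains(listener, heading, source):
            """Left/right gains for a sound at `source`, for a listener at
            `listener` facing `heading` radians. Crude but illustrative."""
            dx, dy = source[0] - listener[0], source[1] - listener[1]
            dist = math.hypot(dx, dy)
            bearing = math.atan2(dy, dx) - heading  # 0 = straight ahead
            pan = -math.sin(bearing)                # -1 = hard left, +1 = hard right
            loudness = 1.0 / max(dist, 1.0) ** 2    # inverse-square falloff
            left = loudness * math.cos((pan + 1) * math.pi / 4)
            right = loudness * math.sin((pan + 1) * math.pi / 4)
            return left, right

        # a monster 3 metres away, directly to the player's left
        print(stereo_gains((0, 0), 0.0, (0, 3)))  # all the sound in the left ear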

    Of course, by the time you are working on something this sophisticated, you will need the dedication of a full-blown sound engineer to get the modelling right, as well as the usual complement of level designers and game coders.

    It would also help to have a catchy name. At one point I was calling the game 'Alone in the Dark', but that title has long-since been used elsewhere for a computer game, so I suppose a new name would have to be devised if anyone were to actually produce this.

    The big issue that has prevented this being created up till now has always been the cost of production. I had always envisioned it as something the size of a walkman with high-quality stereo output. The trouble is that high-quality sonic modelling of an area is a fairly hefty computing task. Far more difficult than I ever realized back in 1980.

    On the other hand, programmable walkman-sized devices with quality stereo output have recently become commonplace, as exemplified by the Apple iPod. Most of these devices can have additional software installed on them, so the prospect of writing audio-only games for them has become less improbable.

    Now an obvious question might be: why not implement this on a desktop PC? In fact, until this moment the option hadn't occurred to me. When I first came up with this idea there were no stereo outputs available on PCs, nor were there any for many years after. I got used to thinking of the program as requiring some sort of dedicated hardware. Nowadays, though, most computers have sound chips and stereo output that could probably handle the computing load easily.

    Of course, there's a big difference in user experience between sitting in front of a computer and wearing a dedicated device, but it would certainly make sense to write and test a prototype on a desktop machine. That way you could demo it to an MP3 player maker as a possible item to bundle with their latest offering so as to make it stand out from the competition.

    How well it would sell is another question indeed. For most of my suggestions in these essays, I have a good estimate of how well some product will sell or be adopted. For this game, I have no idea. I only know that I would want to play it.
    Monday, April 10th, 2006
    8:12 pm
    Quack
    One of the oldest still-playable computer games is the venerable Nethack (and its various relations like Slash'EM, the only variant still in production). What it has going for it is enormous replayability due to a huge set of monsters, items and spells, complex semi-randomizable interactions, and auto-generated levels with carefully balanced creature strengths, power progression, and treasure probabilities.

    What it sorely lacks is any sort of graphic interface. Nethack is a text-only game that has only relatively recently even allowed for colour. There have been many attempts to add isometric views, tiled maps and the like but they have all failed due to the clunkiness of the resulting interface, and the fact that in some pretty deep parts of the code, it really thinks that humanoids look like the letter @.

    Nethack has long been an open-source project, although any spelunking through the actual code base will rapidly reveal that multiple generations of coders over several decades have rendered its internal structure well-nigh incomprehensible.

    Now, a game with the opposite set of problems is Quake. There was a game with wonderfully interactive 3D graphics, a fully immersive fantasy experience, but fixed levels and almost zero replayability. Still, when it came out in 1996, it was a sensation. The Quake game engine has long since been released under the GPL and has been updated and tweaked in various different ways in a huge variety of open source projects. There are now a fair number of other open source game engines which one could use instead, such as Cube, Sauerbraten and Crystal Space, but since Quake was the engine of choice when I first came up with this idea years ago, I'll continue to use it as the suggested basis. In practice I'd spend a fair amount of time assessing the various alternatives before making a choice.

    At some point it occurred to me that it would be an interesting experience to marry Quake and Hack together into a sort of 'Quack' game. Steal the dungeon generation as well as the vast sets of items, monsters and interactions from Nethack, and use the real-time play and immersive 3D effects of Quake to interact with the results.

    Now, this wouldn't be a trivial operation. Nethack has been tweaked so much that one would be better off taking one of the various data-dumper programs that have been written for Nethack, using it to extract all the relevant datasets, and throwing most of the code away, rather than trying to port it directly. Still, there is a treasure-trove of game design there that can be salvaged.

    There would also have to be some careful user interface design. 3D worlds practically cry out for real-time play, but Nethack was always more of a thinking game than a reaction game. One would have to make it practical to scout out areas without alerting monsters, pause the game when necessary, and generally make it possible to think one's way out of trouble, rather than try to hack and slash through it.

    So, the work would not be trivial, but I think it would be very rewarding and since most of the design work would already be done, the job would be a "simple matter of coding". If done right I predict the results would be very popular, and quite lucrative to produce due to possible derivatives, even if one ended up giving it away (after all, it would be derived from open source products).
    Monday, March 13th, 2006
    5:03 pm
    Online Form Design.
    This is an idea I had a number of years ago, but that I shelved because the then-current state of the art in interactive websites wouldn't easily support the idea. Now that AJAX has become the next big thing, it would seem to be time to dust it off and take another look at it.

    The general concept is to produce a website that allows one to easily generate complex forms. By 'complex' I mean things that are beyond the abilities of plain HTML, and that require a constraint engine. With a constraint engine you can say things like "All columns are to be sized so that they are wider than their contents" and have it handle all the picky details, including having intelligent defaults when the constraint turns out to be impossible to satisfy.
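
    As an illustration of what a constraint engine buys you, here is a sketch of that column rule using kiwisolver, a Python implementation of the Cassowary constraint-solving algorithm (my choice for the example, not something a finished site would be tied to):

        from kiwisolver import Solver, Variable

        col1, col2 = Variable("col1"), Variable("col2")
        solver = Solver()

        solver.addConstraint(col1 >= 120)  # each column wider than its contents
        solver.addConstraint(col2 >= 60)
        solver.addConstraint((col1 + col2 == 400) | "strong")  # fill the page
        solver.addConstraint((col1 == col2) | "weak")          # prefer symmetry

        solver.updateVariables()
        print(col1.value(), col2.value())  # 200.0 200.0: all rules reconciled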

    I had originally thought that OpenAmulet would be the perfect constraint engine to use for this task, but that project seems to be moribund. Nevertheless, there are other alternatives.

    In any case, one would go to the website and construct forms and pages for all sorts of endeavours: graph paper (polar, log/log, hierarchical hex, whatever), personal planner pages (such as Day Runner and Day Timer produce), various sorts of calendars, project planning schedules, Role-playing character sheets, etc.

    These forms could have a certain amount of intelligence behind them so that a Monthly Calendar form could configure itself for any given month, or instead, a form could be parameterized by what particular customizations you might want to add on a case-by-case basis (such as a company logo).

    These forms could be downloaded either in their native language (which would only really be useful for archival purposes) or as PostScript or PDF files. The latter formats would allow one to print the forms locally whenever needed.

    Registered users would be able to store the forms they created on the website, for later re-use, or for sharing purposes. Users would be able to generate lists of forms useful to them and rate the value of different forms produced. One could also search for existing public forms rather than make one's own.

    The most popular forms would get showcased on the main page, as would their designers.

    The site would make money by advertising a number of paper- and stationery-related services and supplies, and would also have an agreement with a printing company to produce and ship short-run sets of forms on demand. Thus, although you could design and download a bizarre hex-based polar-log chart for free, if you needed several thousand copies, or cardboard-backed pads of the forms, you could buy that through the website.
    Wednesday, March 8th, 2006
    7:13 pm
    Walkabout.
    I am, quite naturally, very happy with Google's Map service. It has many things to recommend it, including ease-of-use and an open API. One thing it doesn't currently have is any acknowledgement that folks get around by means other than by car. I have heard that there are plans in the works for Google Maps to start including things like bus routes, subways and commuter rail lines on its maps, and that's definitely a step in the right direction.

    Still, it does me very little good when what I want to know is how to walk somewhere. Out here in the wilds of Montreal's West Island, there are many streets without sidewalks, or with sidewalks on only one side of a street. There are also numerous parks with paths through them (and this matters in winter, when the paths are the only plowed -- and therefore navigable -- way through the park), and sidewalks that take shortcuts between streets.

    One simple example of this is the fact that Google Maps shows a trio of dead-end roads a few blocks north of my house. What they don't show is that these roads are all connected by bicycle and walking paths, and form a convenient shortcut when walking to the store.

    In many ways things are worse in the city centre. There are numerous alleys between buildings that are never shown on the urban maps, but which one can walk or ride down, often saving a block from one's trip. There is also our famous underground city with its huge numbers of paths, tunnels and connecting buildings, none of which Google maps.

    All of this would be useful information to have, but I can't really blame Google for not providing it. After all, Google isn't a cartographer. They buy their geographic information from numerous suppliers, and no one bothers to collect walking information to resell.

    That doesn't mean it can't be collected though. What is too expensive to do when one is a cartographic information company is not necessarily too expensive when one can harness thousands of enthusiastic volunteers from the internet.

    So, today's idea is to build a web-site that lets folks edit Google map overlays in a wiki-like manner. The site would allow for the upload of data from GPS units, or let people draw directly on the maps. There would be ways to add annotations, for example to distinguish between a hiking trail, a sidewalk and a back alley.
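
    A rough sketch of what one user-contributed overlay segment might look like (all field names are invented for illustration):

        # One annotated path segment, as a site like this might store it.
        segment = {
            "kind": "walking_path",   # vs. "sidewalk", "alley", "bike_path"
            "plowed_in_winter": True,
            "points": [               # (lat, lon) pairs from a GPS upload
                (45.4765, -73.8340),  # end of the first dead-end street
                (45.4771, -73.8352),
                (45.4778, -73.8361),  # joins the next street over
            ],
            "notes": "shortcut between the dead ends, handy when walking to the store",
        }

        def to_polyline(seg):
            """Flatten a segment into the point list a map overlay API wants."""
            return [{"lat": lat, "lng": lng} for lat, lng in seg["points"]]

        print(to_polyline(segment))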

    The resulting user-created maps could then be displayed with a Google interface, and could even export a similar API so that folks could build upon the data presented.

    If one ended up being quite lucky, then Google (or a cartographic service) might buy out the web page. Even if this did not happen (and having a business plan that requires being "discovered" by someone with deep pockets is seldom a good idea), I suspect that there are probably as many uses for user-created mapping services as there are for user-created text services, and the huge number of extant wikis show just how popular the latter are.
    Monday, February 27th, 2006
    5:16 pm
    Next Generation Online Social Network.
    Recently I wrote a proposal for the creation of the next generation of online social networks. Seeing as how I've not received any response to that proposal, and I think the ideas therein are good, I'm going to talk about it here.

    To start with, the current generation of social networks is narrowly focused on particular groups. Teenagers tend to gather on MySpace. Business professionals use sites like Ryze and LinkedIn. Writers tend to like LiveJournal. Each of these groups is attracted to a system that caters to its preferred modes of social interaction, and as a result the look-and-feel of these networks varies considerably.

    In addition, in existing networks there is inevitably an element that refuses to follow (or is oblivious to) the social contract that has been set in place. This gives rise to problems with such disruptive users as spammers, trolls, stalkers, and serial complainers.

    To tackle the first of these issues, look and feel can be easily separated from the basic functioning of the social network software, so that one can provide one system that is almost everything to almost everybody. Each participating member would see a different user interface depending on which group they interact with. There need not even be any overt indication that such social networks as (for example) www.ravers.com and www.christianconservatives.com are hosted on the same network and handled by the same software.

    In effect, there would be a network of networks; each networked group free to build its own culture in its own online space. The interconnected nature of these networks would only become apparent when someone joined multiple groups and found that their one account let them switch seamlessly between their different groups.

    To aid in this seamless interoperation, the software would allow a user to carefully partition what information was public and which was private to particular individuals, groups, and the network as a whole. A businessman might simultaneously be a member of groups customized for marketing directors, golf enthusiasts and homosexuals, but might well wish to present themselves very differently to these three groups. They might (for example) have a staid business profile for the first group, a flashy and deliberately tacky profile for the second, and a completely anonymous one for the third.

    Once such a basic network-of-networks with its security and privacy models is defined, there is a huge range of additional features that can be provided on top of it. Just a few among these are:

    • Flavoured Contacts. Most social networks allow one to have links to 'friends', but this would be too general a connection for a truly universal social network. We would allow links to be distinguished by the relationship they implied. One could have 'friend' links, 'business contact' links, 'sports buddy' links and so on as the user desired. They could also associate an intensity so as to indicate how much of a friend someone was, how sexy they appeared, or how trustworthy they were as a business partner, and so on. These could be kept private or aggregated to provide ranking predictions for new contacts (which could, in turn, be used to inform an automated introduction service). A rough sketch of such links appears at the end of this entry.
    • Reputation systems and ranking. As much of how people interact socially is governed by considerations of status and social ranking, the system could explicitly keep track of each user and provide Google-like page ranks relative to each of their various interest groups. By careful consideration of how these ranks are calculated, the system could be made largely self-policing by automatically discouraging disruptive or destructive behaviour.
    • 'Private' interests. In some cases it might be desired to hide particular interests in a profile from everyone who does not have that same interest in their own profiles.
    • Full multimedia support. Current social interaction sites are moving to support the hosting of images, but I think it should go much further. The system should support the hosting of all kinds of digital content, including photos, videos, sound, and software. Where appropriate it should allow one to easily post a 'clip' from the middle of a sound or video file on one's account.
    • Full Internet interactivity. As an aid to gaining widespread adoption, the system should cater to a wide variety of different Internet protocols and interaction types, including SMS messages, voice messages, web cams, email gateways, RSS gateways, Usenet gateways, news services, and even other competing social networks. The more inclusive the system, the lower the barrier to migrating from existing systems that people may already be members of.
    These last two items may sound like huge development investments, but the adoption of an architecturally open software model in which technically sophisticated users can contribute plugins can greatly reduce the need for local development efforts.

    I could probably go on for page after page detailing the features that should exist in a good online social network, and that are missing in all current offerings, but this should give a good idea of what features the next generation should provide.
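
    As promised above, here is that sketch of flavoured, intensity-weighted contact links (in Python; all names and fields are invented for illustration):

        from dataclasses import dataclass

        @dataclass
        class Link:
            to_user: str
            flavour: str      # 'friend', 'business contact', 'sports buddy', ...
            intensity: float  # 0.0 .. 1.0: how strong the relationship is
            private: bool     # visible only to the link's owner?

        links = [
            Link("alice", "business contact", 0.9, private=False),
            Link("bob", "friend", 0.4, private=False),
            Link("carol", "sports buddy", 0.8, private=True),
        ]

        def average_intensity(all_links, flavour):
            """Aggregate public links of one flavour -- the raw material for
            ranking predictions and an automated introduction service."""
            vals = [l.intensity for l in all_links
                    if l.flavour == flavour and not l.private]
            return sum(vals) / len(vals) if vals else None

        print(average_intensity(links, "business contact"))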
    Monday, February 20th, 2006
    1:09 pm
    Virtual Corporation Portal
    The term "Web Portal" has changed in meaning since it was introduced. Wikipedia's definition currently talks about integration of various offerings from different vendors, personalized services for users and multiple-platform independence. None of this was part of the core idea behind Portals.

    They were originally intended to serve as one-stop-shopping for a particular class of internet user. That is, they were narrowly focused on providing, in one location, all of the services needed to conduct some particular bit of business on the web. The idea was that if there was one site that, for instance, told you everything that you wanted to know about buying, caring for and raising exotic fish, then it would quickly be the place that exotic fish aficionados would go first when looking for something. As a result you would have a captive audience for purposes of advertising, and would be able to act as a middleman (and take a cut) by connecting consumers with vendors.

    The idea was (and is) basically a good one, but it turns out that the look and feel of the portal is crucial. Someone visiting for the first time, even if they know almost nothing about the industry, must be able to spot at once what the portal is for, and how to find out what they want. At the same time a seasoned user of the service must not have to jump through hoops to get to the particular item that interests them today.

    During the 1990's, many companies created portals, and most of them failed miserably. Mainly this was because they were hard to navigate, did a poor job of being a one-stop-shop for their users, and were often very difficult for the uninitiated to understand. As a result, most portals deployed today are part of a company intraweb, coordinating the work of all of the members of that corporation for an audience that has little or no choice in the use of the service.

    You've probably guessed the reason that I'm providing this background: I would like to propose that someone create a new web portal. In particular, I would like to see one dedicated to the setting up and running of virtual corporations.

    The idea behind a virtual corporation is quite simple. If most of the work is outsourced and the few actual employees work from home offices, the costs of running the company can be drastically reduced. One success story is Topsy Tail Co., which has revenues of $100 million per year, but only three employees.

    Once everyone is working from home offices, the cost of the corporate offices can be almost eliminated. I say 'almost' because some corporate workspace is often useful. Sometimes one of the team members visits from out-of-town and needs a temporary office for the duration. Sometimes you need to get everyone together for an actual physical meeting. Sometimes it's necessary to sit a client down in a business setting with members of your team.

    In all of these cases it's useful to have a single office or meeting room that one can use at need, although since they are used quite rarely it's often more economical to rent them on an as-needed basis from a virtual office supplier.

    These suppliers are just one of the many services that a virtual corporation needs to contract for. In fact, it turns out that a virtual corporation has many specialized needs, including such items as:
    • Dedicated mail and web services.
    • Coordination and collaboration software.
    • Telecommuting employees.
    • One-click legal and accounting services (such as incorporation in Vanuatu).
    • Phone answering and call-forwarding services.
    • Receptionist and physical mail services.
    And the list can continue in that vein for some time. All of the above services are available on the web somewhere, but most are extremely hard to find, even with Google's help. A web portal that brought it all together and provided comprehensive one-stop-shopping for these services would be a boon for the modern entrepreneur who is trying to start a new business as inexpensively as possible.

    Such a service might prove quite lucrative. Besides, the company that runs the portal could have rather low expenses, as it too could be virtual.
    Monday, February 13th, 2006
    12:42 pm
    RSS Aggregators.
    There are a large number of companies out there who are trying to get in on the RSS aggregation bandwagon. I'm going to assume for the sake of this essay that you already know about RSS and aggregation and the many uses to which it is currently being put (podcasting, vlogging, etc). If not, I suggest you check out the link above before reading further.

    Now, the RSS system has a number of flaws, the most serious of which is that it doesn't scale. The server load from having several hundred thousand subscribers poll a website every few minutes can bring even major web services like MSN to their knees. That, however, is not what this essay is about. Instead of proposing a replacement for RSS (which I may well post about in the future), I'm going to propose a replacement for the current crop of aggregators.

    There are many aggregators and I've looked at a fair number of them, but by no means all, so for all I know what I am about to propose may already exist out there. If it does, though, I was unable to find it.

    What I want is a service that doesn't just merge multiple RSS streams into a single stream. That's useful, but it's far from enough. I would very much like to see a service that can create RSS streams out of non-web material like newsgroups and mailing lists and syndicate them. The resulting output from the RSS aggregator should default to RSS (of course) but it would also be very useful if it could be output in the form of a mailing list or newsgroup gateway.

    Then, I want to be able to merge streams and perform operations upon the results. I would like to collapse together all articles with identical texts. Too many of the news streams I subscribe to carry the exact same story and I would like to see it only once. It would also be good if the aggregator could tell when multiple articles were all about the same current hot topic and group them together, perhaps by merging the articles into a single meta-article.
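
    As a sketch of that merge-and-collapse step (reading the feeds with the feedparser library, which is my choice for the example, and with invented feed URLs):

        import hashlib
        import feedparser

        def merged_entries(feed_urls):
            """Merge several feeds, collapsing articles whose texts are
            identical up to whitespace into a single entry."""
            seen = set()
            for url in feed_urls:
                for entry in feedparser.parse(url).entries:
                    text = entry.get("summary", entry.get("title", ""))
                    key = hashlib.sha1(" ".join(text.split()).encode()).hexdigest()
                    if key not in seen:
                        seen.add(key)
                        yield entry

        for entry in merged_entries(["http://example.org/tech.rss",
                                     "http://example.net/news.rss"]):
            print(entry.title)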

    Another thing that would be good would be the ability to split a stream (either a regular stream, or one of these merged streams) so that I could filter technical discussions into a different feed than political discussions about technology, both of which currently show up in some of the technology streams I follow.

    Once all that is in place, then I would like to see some collaborative filtering layered on top of it all. Something that allows the subscribers to the resulting feeds to give feedback, not only in the form of comments, but as ratings and subject tags so that the readers can refine the initial stream filtering done by the RSS engine. Think of it as a Digg for RSS with the ability to add and vote on topic categories.

    Now THAT is a service I would like to see!
    Tuesday, February 7th, 2006
    6:41 am
    File Signature Database
    During the dot-com bubble, I was working with some friends to build an internet service based on the many uses of digital signature technology. Since we went under when the bubble burst, and no one else has yet provided a similar service, I thought I would talk about what sorts of things we were doing, as I still think it's a viable business. I mention in the FAQ for this journal that some businesses fail for reasons unrelated to their technical merit. This was one of those.

    Now, to start with, when I say 'digital signature', I don't just mean the output of cryptographic hash systems used in many authentication systems, but the output from the general class of hash functions as applied to digital objects. I make this distinction because we used far more than just one type of hashing. In our project we made use of at least three different sorts of hashes:
    • Cryptographic Hashes
    • Structural Hashes
    • Analytic Hashes
    The first type is the sort that I imagine comes first to mind for most technologists these days. Cryptographic hashes are commonly used in many internet protocols. They are designed so that even a one-bit change in the input data will change a random half of the bits in the hash. They provide authentication that a file hasn't been tampered with, can act as unique IDs for a data object, are very difficult to forge, and can be used to verify the origins of a digital object.

    Structural hashes are a different sort of fish. They provide information about the structure of the data, including such things as its type and format. They can be used to characterize the sorts of operations that can be safely performed on a file that one has never seen before, and what sorts of content-specific signatures could be applied. Thus, if one knows that a given binary stream is a JPEG-encoded image in a JFIF wrapper, then it makes sense to attempt to display it visually. The treatment for a tar-ed, zipped, C-source file is very different. The standard Unix command 'file' provides a simplified structural hash.

    Finally, analytic hashes are designed to change their bits in characteristic ways when the data undergoes certain transformations. We were able to develop analytic hash systems that allowed us to tell what editor or compiler had been used to construct a data object. We could tell from its analytic signature if it was likely to have been infected with a virus. Best of all, we could tell if its content was substantially the same, but its format had been altered. Admittedly, we had achieved this last only for text documents, but we were researching ways to do the same things with images and audio data when the company shut down. This same technique allowed us to tell that two documents were textually related -- possibly even different articles on the same topic.
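
    To make the distinction concrete, here are toy stand-ins for the three signature types (the cryptographic one is standard; the structural and analytic ones are deliberately simplified caricatures of what our real system did):

        import hashlib

        def cryptographic_hash(data: bytes) -> str:
            # flip one input bit and roughly half the output bits change
            return hashlib.sha256(data).hexdigest()

        MAGIC = {b"\xff\xd8\xff": "jpeg", b"\x89PNG": "png",
                 b"PK\x03\x04": "zip", b"\x1f\x8b": "gzip"}

        def structural_hash(data: bytes) -> str:
            # crude type sniffing, like a much-simplified Unix 'file'
            for magic, kind in MAGIC.items():
                if data.startswith(magic):
                    return kind
            return "text" if all(b < 0x80 for b in data[:512]) else "binary"

        def analytic_hash(text: str, k: int = 8) -> set:
            # a shingle fingerprint: insensitive to reformatting, and two
            # related documents will share a large fraction of shingles
            words = text.lower().split()
            return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}

        a = analytic_hash("The quick brown fox jumps over the lazy dog once again")
        b = analytic_hash("The  quick brown fox\n jumps over the lazy dog once again")
        print(len(a & b) / len(a | b))  # 1.0: same content, different format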

    Given this basic system of hashes and a robust distributed database to store them in, there were several different businesses that we could engage in. The most obvious was to store as much meta-data as we could on every data object on the internet, and associate it with the signature for the object. We could then answer specific queries about all objects we had registered. This included such things as:
    • What is this object?
    • Where did it come from?
    • Is it part of some specific collection?
    • What rights do I have with respect to this object?
    • What other versions of this object are there?
    • Are there functional replacements for this object?
    • Are there security concerns with this object?
    • Do I need a license to own/use/have this object?
    • Where can I get support for this object?
    And many similar bits of information. The plan was that queries would be free but we would charge companies to store useful information about their products in our database. The cost to store the data would be based on the cost of the software, so that (for example) free software could be registered for free, while companies making large amounts of revenue from their software would have to pay more.

    The reason that a company would want to register their software, images, music files and so-on with us, is that it would cut down on piracy. Right now, most major corporations do not know what software is running on their systems, or if they have a legal right to what is there. Our system could scan a huge network overnight and report that it had found 762 copies of a particular piece of commercial software. If the company had only bought a 500-unit license, then it would know that it was in violation and would need to take some action to remedy the situation.

    This could be extended in several ways. We could allow promotional information and service information to be registered. A company could also register local, private information about proprietary data objects and track their use within their organization. Version histories, comments, and distribution information could all be centrally collected and managed by a system built on top of the database.

    There are many other uses for such a system as well. Just one example is that by keeping statistics on the sorts of queries we were receiving, we could estimate the installed base for any registered software product, and sell the accumulated statistics. Another idea we had was to build a monitor for all ingoing and outgoing data traffic on a site and ensure that no proprietary data was being accidentally leaked.

    The longer we worked with the system, the more potentially profitable uses we found for it, and I think the business model is as valid today as it was in 2000.
    Monday, January 9th, 2006
    5:59 pm
    Universal Profile Language
    Recently I joined a professional networking group, and was confronted with the need to provide a profile. Now, I already have a profile here on LiveJournal, (well, three actually) as well as one on MySpace, and one on Slashdot, and one on Monster, and one on dozens of other sites. Each of these profiles is different, since they are intended for different audiences, and the individual sites want different information (in different formats) but they all contain fundamentally the same information.

    The worst sites are the ones where I have to enter my previous work experiences. I end up cutting and pasting from my resume, and it can take hours to go through the process of entering all of the information. What I would love to see is some sort of Universal Profile Language for data interchange amongst all of the sites that want my profile.

    I'm imagining that a universal profile would be in XML, not because I like that format, but because it's popular, and the secret to any data interchange standard is that it's useless unless you can get folks to adopt it. It would also need a number of features in order to be useful:
    • Data needs to be stored at several levels of detail, so that a summary or more detailed information can be pulled out as needed.
    • It should handle variations and attributes so that, for instance, a biography can be stored in several versions depending on audience.
    • It must have a carefully thought out privacy and security system, since it will store data of a sensitive nature.
    • It should have a large controlled vocabulary of tags, as well as a way to handle extensions cleanly.
    • Sections should have defaults and be empty until needed. For instance, I've yet to need to fill out my medical information online, and while I may want to support this, I hardly feel I should have to fill the information in until the need arises.
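
    As a toy illustration of what a fragment of such a profile might look like (every tag name here is invented; a real standard would pin down the controlled vocabulary described above):

        import xml.etree.ElementTree as ET

        profile = ET.Element("profile")
        ET.SubElement(profile, "name").text = "J. Doe"

        # one biography, stored in several audience-specific variants
        bio = ET.SubElement(profile, "biography")
        for audience, text in [("professional", "Fifteen years in software."),
                               ("casual", "I tinker with things.")]:
            variant = ET.SubElement(bio, "variant", audience=audience,
                                    visibility="public")
            variant.text = text

        work = ET.SubElement(profile, "employment")
        ET.SubElement(work, "position", title="Developer", years="1998-2004",
                      detail="summary")

        print(ET.tostring(profile, encoding="unicode"))
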
    Once a data format has been determined and documented, it needs to have a library of useful open-source routines written in the most popular web-scripting languages, including JavaScript, Java, Perl, PHP and Python. Where possible, these libraries need to be incorporated into as many open source programs that ask for profile data as possible, so that adoption becomes increasingly likely.

    Finally, there should be a protocol developed for transferring the information across the web as needed. Ideally, each person would choose a profile information repository, either on their own servers or on someone else's, and enter whatever information they feel relevant. From then on, whenever they find themselves needing to provide profile information to a site that supports the Universal Profile standard, they can give it a URL for their repository (or part thereof).

    The web site could then read in the profile and extract any needed information. Any remaining gaps would still have to be filled in by the user, but the data they placed into the online forms would be encoded in the profile language and sent back to the repository for later perusal and acceptance by the user. This way, over time, a user's personal profile would slowly come to hold the set of fields that they most often have to fill in online.

    If such a system were in place, maybe next time I need to join a professional society that wants a copy of my work history, it won't take me two hours to provide it.
    Monday, December 5th, 2005
    9:46 am
    Personal Database.
    Before I start on today's topic, I just want to let folks know that, barring further unforeseen emergencies, this column will now be published every Monday, starting today.

    Today's idea is another one which is simple to state, but which (due to the lack of data interchange standards) is far from simple to build. What I want is a personal database that keeps track of everything on my computer, and everything I ever see on the web, in newsgroups or in email.

    This came up because I recently had to find an address for a party I was invited to. The trouble was, I couldn't remember how I had gotten the invite, so I didn't know where to look for the address. If it had been a telephone invite, I would have made a note in a file, or on my online Yahoo calendar, depending on how rushed I was at the time. Had it come by email it would be in an email folder for all mail from that sender. Of course, there were half a dozen folks I know who might have invited me, so it could be in any of those folders. It was also possible I had seen it on one of the various online forums where it's possible to send a private note to a group of people. I never did find the address, and had to phone someone and have them look it up in their notes.

    Clearly, there has to be a better way of organizing things. What is needed is a free-form database that will sit in the background and watch all email, newsgroups, new files and everything else I add to my system, and keep notes. When I burn a CD it should note what files I've put on it, and what its serial number is. It needs to do that all without ever prompting me or asking any questions. So, the next time I need to find an obscure article about a new ink-jet system I saw six weeks ago on the net, I can do a search on my personal database. Since it will have indexed every word of every textual file on my machine (.pdf, .doc, .txt etc) and every word I've seen on the net, I should be able to quickly find what I'm looking for.

    Of course, in order to do this it needs to be able to monitor all sorts of things I do on my machine. It needs to index files as I save them, and it has to be able to read my email and see what I'm browsing on the web. It will also need hooks into my CD burning software and other storage solutions that I use. Finally, it has to be able to run with minimal user setup and intervention. I must not need to know what a database schema is, never mind being asked to customize one. It should do its job silently by default, although it should be possible to augment its data in a simple way.

    For example, when I'm reading my email, I should be able to click on a missive and add a note or set of attributes to it, so that I can find it more easily later. Similarly, when I burn a CD, it should be possible to call up a database window in which I can associate a CD title with the serial number, to aid in later finding it on my shelf.
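
    Here is a minimal sketch of the watch-and-index loop, using the watchdog library for file events and SQLite's built-in full-text index (both are my choices for the example; a real system would also hook mail, browsing and CD burning, as described above):

        import sqlite3
        from watchdog.observers import Observer
        from watchdog.events import FileSystemEventHandler

        db = sqlite3.connect("personal.db", check_same_thread=False)
        db.execute("CREATE VIRTUAL TABLE IF NOT EXISTS docs USING fts5(path, body)")

        class Indexer(FileSystemEventHandler):
            def on_created(self, event):
                # plug-in modules would handle .pdf, .doc and friends
                if event.is_directory or not event.src_path.endswith(".txt"):
                    return
                with open(event.src_path, errors="ignore") as f:
                    db.execute("INSERT INTO docs VALUES (?, ?)",
                               (event.src_path, f.read()))
                db.commit()

        observer = Observer()
        observer.schedule(Indexer(), path="/home/me", recursive=True)
        observer.start()  # runs silently in the background from here on

        # ...weeks later: find that ink-jet article again
        for (path,) in db.execute(
                "SELECT path FROM docs WHERE docs MATCH '\"ink-jet\"'"):
            print(path)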

    Ideally, of course, the database would entirely replace the file system on my computer, but that is a more massive undertaking than I am proposing. What I am really asking for is something like a beefed-up Microsoft File Indexing system for XP, except that I want one that works. To this end there are some features it must have:

    • It needs to be open source, so that folks can extend its API to gather and index new types of information that weren't planned for in the original design. This will also facilitate it becoming a standard.

    • It needs a way to plug-in modules for dealing with various open-source and proprietary data types. That way, for example, Wordperfect could be encouraged to produce a module that reads their file types, without making their file format public.

    • It needs to be invisible when not needed and easy to use when searching for something. It must have a reasonable UI that caters to both the novice and the power user, without condescending to either — in other words, no cartoon avatars.

    • You must be able to search for individual words in titles and contents, attributes and other various metadata, and be able to do complex boolean searches.

    • It must be simple to backup and restore, and able to deal with such activities as migrating a partition to a larger hard drive.

    • It needs to have a good data export/import system and a data interchange API. There are too many personal databases out there that handle only one type of data, and which cannot talk to any other databases. This makes them almost useless.

    • Although it's not a priority, it would be nice if it didn't use up outrageous amounts of disk space. Disks are getting cheap, but whatever amount it does use up, it should more than pay for the resources it uses in added convenience.

    As I said before, it would be better (from various standpoints) to have this integrated into a file system so that noticing when a file is renamed or moved is more straightforward, but database file systems have been promised for both Linux and Windows for years now, and neither appears any closer to existing than when I first heard about the idea. Maybe this interim solution will help spur development in the right direction.
    Wednesday, September 21st, 2005
    10:05 am
    Hiatus
    Due to me being out of the country for the next few weeks, there won't be any new technaut posts until mid-October.

    Addendum: I have returned from my trip to be greeted by no less than six crises that are making major demands on my time and effort. This column will resume as quickly as I can manage. Believe me, I would far rather be writing this than dealing with dysfunctional bathrooms, collapsed ceilings and broken servers.
    Monday, September 12th, 2005
    3:25 pm
    Permanent Storage
    Most ISPs charge obscenely large amounts of money for storage on their systems, especially when one considers its actual value. Currently, mass storage can be gotten as cheaply as 60¢ per Gigabyte, and the price is dropping steadily, such that it halves roughly every 2 years.

    Now, the storage that you buy in the form of a hard drive is not going to last forever. I don't know what the typical Mean Time To Failure of a modern hard drive is, but they are usually only warrantied for three years, so let's go with that number. (EDIT: This report may shed some light on actual MTTF rates in modern disk drives.) This would seem to indicate that a base rental price for storage, right now, is about 20¢ per Gig per year. I don't know about your ISP, but mine charges around $48 per Gig per year, a substantial difference.

    If we assume that a hard drive needs to be replaced at the end of its warranty period, then the cost to replace that drive will have fallen to roughly 35% of its initial cost by that time (prices halve every two years, so over three years they fall by a factor of 2^1.5). Three years later, the next replacement will have fallen to 35% of that cost, and so on. With a bit of math we can then figure out the actual cost of storage for any length of time.

    In fact, since the cost is exponentially decreasing, there is a finite total price for storing data for an infinite amount of time. In this case, it works out to about 1.55 times the initial cost. So the current cost to store a Gig of information forever is only about 93¢.
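
    The arithmetic is a three-line calculation (a geometric series, using the numbers above):

        # prices halve every 2 years, drives are replaced every 3 years
        r = 2 ** (-3 / 2)              # each replacement costs ~0.35x the last
        forever = 1 / (1 - r)          # 1 + r + r^2 + ... ~= 1.55
        print(forever * 0.60)          # ~= $0.93 to store one Gig forever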

    That number is something one could build a business case around. One would have to charge more per Gig if one wanted to run a business, as there are all sorts of overheads that I haven't taken into account above, but the total price to a consumer for permanent storage would still end up being quite low. Keep in mind I'm being very conservative with these numbers. Most drives last far longer than three years.

    Now, what kind of business could you run based on this? I can see several possibilities. You could sell infinitely long warranties on hard drives, promising to replace them with an equal amount of storage free of cost if they ever failed for any reason. You could start up an online data-vault and severely undercut all of the competition. You could also make a commercial version of the Internet Archive.

    This last possibility deserves some explanation. The Internet Archive takes regular snapshots of web pages and rehosts them, so that you can look to see what a web site looked like at one or more different times in the past. Naturally, they can't archive everything, and there are many holes in their coverage. Many short-lived events, like the defacement of a website appear and vanish between snapshots.

    What a commercial venture would do is release plugins for Internet Explorer and Firefox which allow you to visit an arbitrary website and request that it be saved forever. You could save just the current page (which will usually cost a few fractions of a penny), or request that a whole site be spidered and stored, and the user billed accordingly.

    People could use any number of micropayment schemes (including the new one proposed by PayPal) or the micropayment issue could be sidestepped by letting someone open an account with the storage company, which would keep track of the minuscule charges for later combined billing.

    The beauty of this scheme is that multiple people are likely to ask to have the same site preserved, but the storage need only ever be bought once, even in the case of a rapidly changing site if the company uses proper delta technology. Each additional request to store a site is pure profit.

    In addition, the costs of permanent storage aren't all accrued by the storage company at once. When someone pays their 93¢ for storing a Gig forever, the company need only spend 60¢ immediately. The rest will go into the bank where it will gain interest until that storage needs to be replaced. Thus the company can end up with substantial income earned on the money saved to meet future demands for technology replacement.

    Thus, there are at least two obvious revenue streams that one can tap in such a company, and probably several more indirect ones depending on the details of how the company is set up.
    Monday, September 5th, 2005
    9:52 pm
    Digital Agora.
    Over the labor day weekend I was engaged in the clearing-out-the-junk ritual known as a garage sale. During one part of the preparation phase, I was obliged to walk around the local area and post a fair number of 'Garage Sale' signs, so that folks would know we were having one. While doing so, I noticed the lists of other garage sales and lost pet messages that already adorned the various lamp posts and telephone poles that I affixed my notices to, and I started wondering if there wasn't a better way to do this. I figured out that there was, but as far as I know no one has done it yet.

    The trick would be to take some features from a bunch of existing websites and combine them in a new way. There have been some attempts to join together the features of Google Maps and Craigslist, such as housingmaps, but so far these have mostly just been proof-of-concept sites. What would be extremely useful is a site like Craigslist that allows you to place various sorts of notices and adverts indexed geographically and that can display the information as map overlays. If that's all it did, you would just have a bigger version of housingmaps, but you needn't stop there.

    Folks should be able to register one or more physical locations, and be able to call up a "what's new" view of a selected area. The older items would be grayed-out or missing entirely, and the newer items would be on top. This should make it possible to tell at a glance if anything is going on in the area.

    It would also be useful to have an API for use by software that wants to query the site. That way it would be possible to write a bit of code for a GPS-equipped cell-phone and let it inform the user of selected events in the area they are walking through. It could even go as far as the oft-predicted 'virtual coupon' that could be offered as an enticement to folks walking by a store.
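
    The core of that query is simple. A sketch, with invented notice records (distance via the haversine formula):

        import math

        def km_apart(lat1, lon1, lat2, lon2):
            """Great-circle distance between two (lat, lon) points, in km."""
            p1, p2 = math.radians(lat1), math.radians(lat2)
            dp, dl = p2 - p1, math.radians(lon2 - lon1)
            a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
            return 6371 * 2 * math.asin(math.sqrt(a))

        notices = [
            {"kind": "garage_sale", "lat": 45.478, "lon": -73.833, "text": "Sat 9am"},
            {"kind": "lost_pet",    "lat": 45.492, "lon": -73.820, "text": "grey cat"},
        ]

        def nearby(lat, lon, radius_km, kinds=None):
            """What a GPS-equipped phone would ask as its owner walks around."""
            for n in notices:
                if kinds and n["kind"] not in kinds:
                    continue
                if km_apart(lat, lon, n["lat"], n["lon"]) <= radius_km:
                    yield n

        print(list(nearby(45.4765, -73.8340, 1.0, kinds={"garage_sale"})))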

    There is also no need to restrict the system to pull only. Folks should be able to subscribe to different categories of local information and have new notices delivered by email. Thus, everyone in an area who was signed up could find out about a lost cat.

    That brings up an interesting point though: the system will be of limited use unless most of the folks in any given area are signed up. To encourage just that, the site should provide a number of local community services and forums for discussion. The aim would be to have the system become the de facto place to go to find out about public events in the community, to talk about changes to local bylaws, or just to get to know your neighbors.

    This might be done by making alliances with local town officials so that they have their own special announcement categories and official forums via free privileged accounts. They, in turn, would announce the site in the quarterly fliers and similar community information channels that they would already have in place.

    Thus, it would need to become a digital agora, an online place for the whole community to come together to discuss subjects of mutual interest, if it is to succeed at its mission of freeing our lamp posts of the faded notices that accumulate year after year. In the end, it would not be a small project, but it would be one well worth doing.

    Updated 2006-03-26: Check out Loki for a company trying something similar to this idea.
    Wednesday, August 31st, 2005
    5:22 pm
    Book Watch.
    In an earlier article I talked about online gift systems, and this particular idea is tangentially related to that one. The basic concept is to put together a web site dedicated to book lovers that makes their efforts to accumulate books much easier.

    I know that there are many book-selling websites out there, and Amazon currently dominates them, but this site would not be a bookseller per se; it would act more as a front end to sites like Amazon. The idea is to provide a large number of value-added features that modern book-selling sites lack, and really should have. If done right, the goal would be to get bought out by Amazon for a hefty piece of change.

    The main idea revolves around book wish lists and how they could be improved. All of the various wish list features from the online gift system I mentioned above should be included: lists imported from and exported to other sites, letting folks buy books from other sites via local lists, and building models of user buying habits so as to make better suggestions.

    Many additional features would be added to the wish lists, so that they begin to resemble personal online book databases. For example, someone should be able to add an announced title to their lists even if it is not yet available, and folks should be able to add authors and series in progress. That way they could put all of Tom Clancy's books on their lists as a single item, or specify the whole Harry Potter series by Rowling.

    The purpose of letting people put not-yet-available books on their lists is so that they can be alerted via email the moment such books become available for sale. The email would include a link to an online page where one could finalize the details for purchase and shipping of the book.
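
    A minimal sketch of how that alert step might work; the data shapes, addresses and the purchase URL are invented for illustration:

        # Watch lists keyed by email; called when an announced book goes on sale.
        watch_lists = {
            "alice@example.com": {"The Announced Novel", "Some Series, Book 7"},
            "bob@example.com":   {"Some Series, Book 7"},
        }

        def send_alert(email, message):
            # Stand-in for a real mail call (e.g. via smtplib).
            print(f"To {email}: {message}")

        def on_book_available(title, book_id):
            for email, wanted in watch_lists.items():
                if title in wanted:
                    link = f"http://bookwatch.example.com/buy?book={book_id}"
                    send_alert(email, f"'{title}' is now available: {link}")

        on_book_available("Some Series, Book 7", 12345)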

    Taking that idea further, the site would encourage users to register a credit card, and any shipping or purchasing preferences they might have (in case they, for example, never want to do business with Amazon, even if their books are cheapest). In that case, they could get an email informing them that a book they have specified as 'must have' has become available and has already been shipped. Mechanisms could be put in place to control the maximum number of books shipped to, or the maximum amount of money billed to, a person in a given time period.
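
    The cap mechanism itself could be quite simple; here's a sketch where the limit and the spending ledger are made-up placeholders:

        MAX_MONTHLY = 100.00   # assumed per-user ceiling on automatic billing
        monthly_spend = {}     # email -> dollars auto-billed so far this month

        def may_auto_buy(email, price):
            spent = monthly_spend.get(email, 0.0)
            if spent + price > MAX_MONTHLY:
                return False   # hold the order and email the user instead
            monthly_spend[email] = spent + price
            return True

        print(may_auto_buy("alice@example.com", 29.95))   # True: well under the cap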

    The site would use a system like those found at PriceGrabber.com and other online sales portals, ranking the sellers of an item by the total expense of having it delivered, including all shipping costs (and possibly customs duties), so that a user of the system gets their books for the lowest available price. The booksellers ranked by the system would explicitly include major shippers in Europe and Asia, to ensure that the site is the first place to check for a book, even if it's an obscure or out-of-print one.
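
    Conceptually the ranking is just a sort on total delivered cost; a sketch, with every seller and figure invented for illustration:

        # Offers for one title; all sellers and figures are invented.
        offers = [
            {"seller": "BigRiver US",  "price": 24.00, "shipping": 4.50, "customs": 0.00},
            {"seller": "EuroBooks DE", "price": 19.00, "shipping": 8.00, "customs": 2.50},
            {"seller": "AsiaPress JP", "price": 17.50, "shipping": 9.00, "customs": 3.00},
        ]

        def landed_cost(offer):
            return offer["price"] + offer["shipping"] + offer["customs"]

        # The cheapest delivered total wins, not the cheapest list price.
        for offer in sorted(offers, key=landed_cost):
            print(f"{offer['seller']}: {landed_cost(offer):.2f} delivered")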

    Speaking of out-of-print books, they and second-hand books would be handled by the exact same mechanism and afforded the same status as in-print books, which is not how places like Amazon treat them.

    Another thing we would encourage folks to do is enter their current personal libraries into their book databases, annotated with how much they liked each book. They could also note that a book had been read, but that they don't own a copy and don't want one. This would allow the site to warn a user when they were attempting to purchase a book that they had already bought, or had read but didn't like and had forgotten about. It would also give the recommendation engine far more information to work with when deciding which books to suggest to a user.
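
    The warning itself is little more than a lookup against the user's book database; a sketch, with the status labels being my own invention:

        # A user's book database; the status labels are invented.
        library = {
            "isbn-0345391802": "owned",
            "isbn-0441172717": "read-disliked",
        }

        def purchase_warning(isbn):
            status = library.get(isbn)
            if status == "owned":
                return "You already own this book. Buy anyway?"
            if status == "read-disliked":
                return "You read this one and didn't like it. Buy anyway?"
            return None   # no history: no warning

        print(purchase_warning("isbn-0441172717"))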

    Once folks had online databases of books, they could designate subsets of books that they could recommend or pan in public forums. Users could specify a desire for books from these lists, just like any other series. Thus the concept of buying everything Oprah (or anyone else) recommends would be explicitly supported. These lists could also be restricted to a limited group of users, such as the members of a book reading club.

    There are probably dozens more features that one could come up with to make the book lover's life a little less chaotic, but this should give you a basic idea of the sorts of things the site could do. In the end it should resemble a personal book management site that just happens to facilitate the purchase of books.
    Tuesday, August 30th, 2005
    10:22 pm
    Book-o-mat.
    Here's an idea for a device and a service that I actually started working on in early 2000. Various things, including the collapse of the dot-com bubble and the fact that Xerox was working on something remarkably similar, killed the project. That was five years ago, and I would have expected these devices to be common by now, but for some unfathomable reason they aren't. So I'm giving away the idea here, and maybe someone else will run with it.

    The idea is simple enough: build a print-on-demand book printing press that will fit into a space the size of a mall photo booth and give it a net connection. You now have a device that can print a paperback book for anyone who needs something to read.

    These devices would be put in places like train and bus stations and airports where folks find they have time on their hands and might want to buy a book with which to occupy themselves. Since they are (relatively) small, these book-o-mats, as I call them, could even be put in places like hospitals.

    The design of one of these devices is quite simple in principle, since Xerox and other companies will happily sell you the technology to print a paperback book whenever you wish. A certain amount of engineering would still be required, of course. A customer is likely to get impatient if a book needs more than five to ten minutes to print, so a book-o-mat would need multiple print engines running in parallel.

    There would also have to be a status display and a monitor so that a customer could browse through the list of available books and authors. It should be designed so that the next customer can be browsing for the books they want while the previous one waits for their purchase to finish printing.

    The book-o-mat would have a large hard drive for storing .pdf or similar files, but it would mainly be a cache to hold the more popular books. When a book not on the drive is requested, it would be fetched over the net connection. The net connection would also allow the printer to report its status to a central location, either on demand or at regular intervals.
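
    A sketch of how that fetch-on-miss cache might work; the cache size and the stand-in download call are assumptions for illustration:

        from collections import OrderedDict

        CACHE_LIMIT = 500        # assumed number of titles the local drive holds
        cache = OrderedDict()    # book_id -> file contents, least recent first

        def fetch_from_central(book_id):
            # Stand-in for a download from the publisher's central server.
            return f"contents of book {book_id}".encode()

        def get_book(book_id):
            if book_id in cache:
                cache.move_to_end(book_id)        # mark as recently used
                return cache[book_id]
            data = fetch_from_central(book_id)    # cache miss: pull over the net
            cache[book_id] = data
            if len(cache) > CACHE_LIMIT:
                cache.popitem(last=False)         # evict the stalest title
            return data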

    This would ensure that maintenance folks would know when the printer was low on paper or ink, and when it had a paper jam or similar malfunction that required immediate service. The devices would be ruggedized to maximize the mean time between these maintenance visits.

    The company that owns and operates the book-o-mats would be a normal publishing house, and would sign publishing contracts with the various authors whose books it wishes to sell. It would have certain advantages that set it apart, though.

    To begin with, they would not have a concept of an 'out-of-print' book. Once they had the typesetting file for a given volume, they could print it whenever there was a request. They would also be able to go back to the old system of A-List, B-List and C-List authors, as there would be very little expense in hosting books for authors who are not yet a big name (but might yet become one).

    Although print-on-demand systems cost more per copy than traditional print runs, a book-o-mat book has no associated shipping costs, and there is no worry about unwanted books being returned. As a consequence, this publisher would have a higher profit margin and could afford to pay substantially more than the 8% royalty that is a typical author's payment today. 20%, 30% or even as much as 50% becomes possible as the overhead costs of getting a book to a reader drop to negligible levels.
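
    To see why, here's a back-of-the-envelope comparison; every figure in it is a number I've made up purely to illustrate the shape of the argument, not real publishing data:

        cover_price = 8.00

        # Traditional paperback: cheap to print in bulk, but distribution,
        # returns and the retailer's share eat most of the cover price.
        trad_costs = {"printing": 0.80, "distribution": 2.20, "returns": 1.00,
                      "retailer": 3.20}
        trad_margin = cover_price - sum(trad_costs.values())

        # Book-o-mat: pricier to print one copy, but no shipping, returns
        # or retailer in the middle.
        pod_costs = {"printing": 2.50, "machine upkeep": 1.00}
        pod_margin = cover_price - sum(pod_costs.values())

        print(f"traditional margin: {trad_margin:.2f}")   # 0.80
        print(f"book-o-mat margin:  {pod_margin:.2f}")    # 4.50
        # Even a 30% royalty (2.40 on this 8.00 book) leaves the book-o-mat
        # publisher with 2.10, well ahead of the traditional 0.80 margin.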

    These higher returns would allow the publisher to out-bid others for printing rights, and would go a long way to allay the natural fears of authors about what negative impact this new technology might have on their profession. Once the technology was perfected, one might even start to see these machines cropping up in book stores.
    Saturday, August 27th, 2005
    7:34 pm
    Lego meets IKEA.
    As a person steeped in the culture of the IT industry, I tend to think of design space as being characterized by reusable modules and carefully-chosen interfaces. This tendency persists even when I am dealing with the real world. As a result, on a recent trip to Ikea, I was struck by what IKEA and Lego could teach each other when it comes to design and assembly. The thing these companies have in common, of course, is modularity, interfaces and design.

    Lego is well known for the huge variety of modular toy components they make, and the fact that they all adhere to three or four carefully thought-out interconnection interfaces. Lego's motto for many years was "It Fits", and it has infused their whole production philosophy. On the other hand, Lego does not cater to its customers and has been known to act with complete indifference even to its largest distributors. It also cannot be said that most Lego toys are particularly stylish or elegant (although there are exceptions).

    Ikea, on the other hand, is a customer-oriented producer of a wide range of low-cost, good-looking furniture that is assembled by hand by the purchaser. Having put together a fair number of Ikea pieces over the years, I've been struck by the wide variety of connection interfaces they use. It seems like every other item I've put together has had a novel way of being joined.

    To a large extent this is because modularity is just not a key design criterion for Ikea. Even when they explicitly make a set of modular furniture (such as some of their wall units), the various pieces are reusable only within that line, and often only with sets made within a few years of each other as the design tolerances are known to wander in their products.

    This got me to thinking about a furniture store that combined these approaches. The idea would be to produce well-made, elegant looking furniture that was highly modular and configurable. There would still be different lines of furniture featuring different styles, colors and materials, but they would all be made of pieces that used the same set of standard joining interfaces.

    The individual parts would also conform to a number of standard shapes, widths and lengths so that they would more easily support the interchange of parts. It should be possible to exchange the legs on a chair and a table, or to mount kitchen cupboards on a bed frame, if one wished.

    Normally one would buy sets specifically designed to put together a given piece of furniture, but it should also be possible to buy individual pieces to mix and match furniture that fits a particular design goal.

    Now, none of this will be particularly easy to do. Good furniture design is an art form, as is the design of good construction interfaces. On the other hand, I think the result would be a line of furniture with not only high usability (due to its customization) but high reusability as well. After all, if the pieces can go together in multiple ways, one can update one's living room just by taking everything apart and putting it together in novel ways.

    The main competitor to any attempt to produce a line of furniture like this will probably be Ikea, but I think there are significant advantages that could be had over that company. To begin with, the modular furniture would support a far greater degree of customizability with respect to things like choices of materials, colors and textures. In fact, if the company adopted a just-in-time production philosophy it would be possible to have some lesser-demanded combinations of materials and colors produced automatically when an order came in.

    In addition, the modularity means that the store could cater to the rarer customer, simply by providing a few special-purpose pieces. Thus, it should be possible to produce smaller versions of furniture for children or little people, or to make a kitchen or bathroom designed to deal with the needs of those who are blind or confined to a wheelchair.

    As we enter the 21st century, I believe that demand will increase for ever more customizable and personalizable products, and furniture will be just one of these. Starting with modular furniture would give a company the chance to be in on the ground floor as new custom manufacturing technologies become ever more prevalent in the next couple of decades.
    6:15 pm
    Avatars Inc.
    In a recent article about a simulation engine, I briefly mentioned a revenue source that I would like to revisit here. In that article I talked about letting folks design and register a particular look for a virtual paper doll.

    A website that lets users personalize their visiting experience, to the extent that they feel a sense of ownership of their customizations, is not difficult to set up. Such a venture has a far lower cost of entry than may have been implied by mentioning it in relation to such a large project as a simulation engine. In fact, this is one of the simplest money-making ideas I can think of, and all it requires is some minimal ability in web programming and some artistic talent.

    There already exist a number of sites that do something along these lines, but again I see no evidence that the owners have tried to maximize their revenue. For example, The Cyborg Name Generator has very little in the way of customizability, but has enjoyed waves of popularity on a number of blogging sites. You simply choose one of six icons, type in a name, and it generates a customized image for you.

    If you look at the URL of the image it generated for me (http://www.cyborgname.com/webimages/edox-TECHNAUT.png) you will see that it encodes all of the information needed to recreate that icon. This allows the website either to serve the images from cache (if they are being requested often) or to generate them on the fly when needed. Thus, these customized images take up very little storage room on the cyborgname server. Cyborgname makes a modest income by letting folks order a t-shirt or mug with their customized graphic on it.
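
    The mechanism behind that is straightforward; here's a sketch in which the path format and the render step are invented for illustration:

        image_cache = {}   # name -> image bytes, populated on first request

        def render(style, label):
            # Stand-in for real image composition (e.g. with an image library).
            return f"<{style} icon with '{label}' stamped on it>".encode()

        def serve_image(path):
            # e.g. path = "/webimages/edox-TECHNAUT.png"
            name = path.rsplit("/", 1)[-1].removesuffix(".png")
            style, label = name.split("-", 1)   # the URL carries the whole spec
            if name not in image_cache:
                image_cache[name] = render(style, label)   # generate on the fly
            return image_cache[name]            # repeat hits come from cache

        print(serve_image("/webimages/edox-TECHNAUT.png"))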

    In contrast, an avatar creator like the one at SouthParkStudios has a much more configurable system for generating images (in this case in the style of characters from the South Park cartoon). Here the image is generated entirely inside a Flash application, and there is no simple way to refer to that image again. If you want to actually use the avatar you have created, you must resort to a screen grabber or similar program, import the capture into a graphics editor, and then export it in a useful format. This strongly discourages the use of the tool for its stated purpose (creating avatars). The only revenue this site derives from the customization tool is side revenue generated by attracting additional folks to the web site.

    Finally, there are sites like the Portrait Illustration Maker and the Candybar Doll Maker that take the customization to an extreme, but in both cases I found the task of making a character sufficiently complex and the user interfaces sufficiently bad that I was unable to actually make a character that I liked.

    So, how can we avoid the mistakes that these sites make, while capitalizing on their good ideas? The answer is to make the site easy to use, highly customizable, responsive and extensible.

    By making it easy to use the images and allowing hot linking to them, the site reduces barriers to adoption and encourages users to tell their friends about it. Encoding the content of the images in their URLs ensures that the web servers do not need to actually host each image being hot linked. The bandwidth costs that hot linking entails will end up being the only real cost of doing business, and should only rise as revenues rise.

    By having a large range of customization options available for the images, it becomes more likely that a user will be able to produce an image that matches their desires. It also opens up an additional revenue stream: ownership of particular configurations. For a nominal fee ($1.00 or so), a user can register a look, and the system will not allow anyone else to generate an image that is too similar to a registered one. Granting users a sense of ownership of your virtual products is one way to gain brand loyalty while encouraging them to buy related products such as t-shirts and mugs.
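
    One simple way to enforce that "too similar" rule would be to treat a look as a tuple of part choices and require some minimum number of parts to differ; both the representation and the threshold here are assumptions for the sketch:

        registered_looks = {
            ("hair3", "eyes1", "coat7", "boots2"),   # a look someone paid to register
        }

        def differences(a, b):
            return sum(1 for x, y in zip(a, b) if x != y)

        def is_allowed(look, min_diff=2):
            # Reject any look within min_diff part-swaps of a registered one.
            return all(differences(look, reg) >= min_diff for reg in registered_looks)

        print(is_allowed(("hair3", "eyes1", "coat7", "boots5")))  # False: one swap
        print(is_allowed(("hair1", "eyes2", "coat7", "boots2")))  # True: two differ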

    The system needs to be easy to use and responsive. It needs to have a user interface that makes it easy to generate and play with multiple looks, and to save alternatives for side-by-side comparison. A responsive technology like AJAX would be worth looking into as well.

    Finally, it should be extensible, since any non-changing system will eventually lose its novelty appeal. One way to do this is to create an API for the various image parts and the engine that pieces them together. An online editor that understands the API and lets artists upload new image pieces opens up a whole new line of interaction.

    Once outside artists are contributing to the site (and increasing its value to the users), you can start creating an artists' community. Different image parts can be tagged with the artists who created them. Statistics could be kept and displayed showing which artists produced the most popular images in a number of different categories. A forum could be put in place and feedback obtained on which parts of the system were easy or hard to use.

    When there are enough artists contributing to the system, they become a source of revenue that can be tapped. It might, for instance, be reasonable to let artists pay to register image parts that they have built and uploaded, so that only they (or perhaps some small group) have access to them. It all depends on how the system evolves and what proves to work best with the interface and implementation technology that is used.
    Friday, August 26th, 2005
    10:42 pm
    Online Gift Systems.
    It has occurred to me that there is profit potential in building an online gift system done right. Now, there are many websites out there that handle various sorts of gift giving, wrapping and similar things, but so far they all seem to have been doing it wrong. Rather than dwell on the mistakes being made out there, let's look at the desiderata of a well-constructed online gift site.

    To begin with, it seems that everyone supports wish lists, but since everyone does it differently and there is no interchange standard, it's not very useful. It doesn't do you any good to know that I have a ThinkGeek wishlist if you are shopping on a different site, even if that site has things I may want. So, the proposed site will have a wish list, but will explicitly let it be exported, downloaded, or queried from another site. The thing will talk standard XML and it will be well documented. If possible, it will be based on Amazon's wishlist output format (although it may need extending for some of the ways we'll want to use it).
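
    A sketch of what the export side might produce; the element names here are a made-up schema for illustration, not Amazon's actual format:

        import xml.etree.ElementTree as ET

        def export_wishlist(owner, items):
            root = ET.Element("wishlist", owner=owner)
            for title, url in items:
                entry = ET.SubElement(root, "item")
                ET.SubElement(entry, "title").text = title
                ET.SubElement(entry, "url").text = url
            return ET.tostring(root, encoding="unicode")

        print(export_wishlist("technaut", [
            ("Some Gadget", "http://www.thinkgeek.com/some-gadget"),
        ]))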

    Let's not stop there though. Letting others easily browse the site's wishlists is only half the battle. There would also be a need to write a number of query tools capable of fetching wishlists from other sites and reformatting them for local use. In fact, one should be aiming to bootstrap an entire distributed wishlist database system, spread out among as many merchants as wish to participate.

    Once the various wishlists are available on site, they can be used in two different ways. First of all, they will allow for the modelling of the desires of a particular consumer, especially if using wishlists from several different specialty sites. This will allow for the selection of items that aren't specifically on the wishlist, but which may be desirable. Secondly, the site can happily buy items off of wishlists from other websites, charging a commission for having done so. The convenience to the user of having only one place to go to shop for gifts should make up for the added cost.

    Since the site will cater to convenience, it should also try to make sure that folks who register with its service don't forget any birthdays, anniversaries or other special occasions. Not only should there be a calendar on which they can record who they need to buy things for and when, it should also happily interoperate with other systems, like Yahoo's calendar.

    When a special occasion is near, an email can be sent with a gentle reminder and a URL to click on to go to a customized selection of possible gifts for the recipient. If there isn't much info available on that person yet, the gift selection will be whatever seems popular, but the system will take note of what choices the buyer makes, and may even request feedback later. The user will be encouraged to add notes about certain categories of items that a person does or does not like.

    There would also be a system set up to allow someone to request random reminders to do something thoughtful for a friend or loved one. These would also come with suggestions for gifts, which could be as simple as a surprise delivery of gourmet cookies, or as elaborate as a complete night out including a limo, reservations at a restaurant and theatre tickets.

    There could also be a way to set up a layaway plan for paying for expensive items slowly. If a user has their credit card charged by a set amount every month, they will be able to afford that ruby pendant when the big anniversary rolls around.
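
    The arithmetic is trivial, but here it is as a sketch anyway, with made-up numbers:

        import math

        price = 600.00    # the ruby pendant; an invented figure
        monthly = 50.00   # charged to the card each month

        months = math.ceil(price / monthly)
        print(f"paid off in {months} months")   # 12: in time for the anniversary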

    Finally, if someone has set up one of these advance payment options, they could ask to be given gifts off of their own wishlist. Depending on the user's choices, they could get the gifts at set intervals or at random times, and they could receive the next most-desired gift on their list or one chosen at random.

    One temptation that would have to be avoided when setting up a site like this is automating it too far. Only in the case of a person asking for gifts for themselves is it appropriate to completely automate the system. After all, a gift is supposed to show someone that you care and are thinking of them. An automated system that remembers birthdays and sends out gifts without any user interaction would completely defeat that. If the site were to get the reputation of catering to the cold and uncaring who merely wish to appear to care, public opinion could kill the venture.