Now that I’m using del.icio.us again I’ve made some delectable discoveries by people hopping: browsing the bookmarks of people and those of the people connected to them. It’s a lot more interesting, if you start in a good place, than following links on a series of web pages.
Once, when I studied cladistics and produced trees of evolutionary history, I recall somebody producing a cladogram of cladists. It was a humorous exercise, of course, but it grouped people together based on shared characteristics–which could now be bookmarks. Being able to browse social networks visibly organized by interest affinity would be exciting.
Today, I found this article on visualizing information in Wikipedia via someone else’s bookmark, and this blog post by Jeremy Wagstaffe of the Wall Street Journal Asia on tree visualizing tools. Delicious! But… could I get information like this to find me, or at least better facilitate its discovery?
I discovered this paper some time ago while looking for a tool to help me with an increasingly common problem: fast and efficient deduplication of partially redundant and chronologically overlapping data sets (Don’t ask! Scientists + Laptops + Network = Chaos).
Sure, storage management software in future will manage this automatically, replacing identical files with pointers, but in at least some cases this is not desirable.
There are hundreds of tools to remove duplicates from collections of individual files. All of them give either a listing of duplicate files or a simple side-by-side comparison of directories. The best tools I’ve found for Windows for each of these are, respectively, DiskState from Geekcorp and Beyond Compare from Scooter Software.
Where they fall down is in providing a graphical view that would allow one to see identical and similar directories and structures, other than in side-by-side comparisons with the starting directories already selected. Beyond Compare permits the easy identification of differences any number of levels deep:
Note the different colour of the top level folders.
I have been searching Google for “find identical directories” on and off for years and there’s surprisingly little out there for anything other than pairwise comparisons. This is odd. There ought to be a tool that would do for tree structures something like what Beyond Compare does for pairs of directories. It’s surprising because biologists, in particular phylogeneticists, have been dealing with the problem of representing similarities in the form of trees for years.
At the moment the most efficient way to deduplicate similar directory structures is:
1. Find identical files.
2. If there are many from the same directory, flip to a directory comparison and eliminate duplicates.
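The two steps above can be joined up in a few lines of Python. This is only a rough sketch, not how DiskState or Beyond Compare work internally: the function names (`duplicate_groups`, `directory_pairs`) are my own, and it assumes that two files with the same SHA-256 digest are identical. It finds duplicate files, then counts how many duplicates each pair of directories has in common, so the pairs most worth a directory comparison come first.

```python
import hashlib
import os
from collections import defaultdict
from itertools import combinations


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def duplicate_groups(root):
    """Step 1: map content digest -> paths, keeping only duplicated content."""
    by_digest = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            by_digest[file_digest(path)].append(path)
    return {d: sorted(p) for d, p in by_digest.items() if len(p) > 1}


def directory_pairs(groups):
    """Step 2 candidates: count duplicate files shared by each directory pair."""
    shared = defaultdict(int)
    for paths in groups.values():
        # Assumption: only the immediate parent directory counts as context.
        dirs = sorted({os.path.dirname(p) for p in paths})
        for a, b in combinations(dirs, 2):
            shared[(a, b)] += 1
    return sorted(shared.items(), key=lambda kv: kv[1], reverse=True)
```

Calling `directory_pairs(duplicate_groups(some_root))` gives a list of `((dir_a, dir_b), count)` tuples, which is roughly the hand-off between the two tools that currently has to be done by eye.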
Doing this manually and repeatedly is tiresome, especially when one has to use two tools that can’t pass data. The problem is that comparisons are made on files without reference to context. Even without a tree we should be able to do better than this. It would be useful to have, e.g., a view like this:
Left pane: a list of directories (like Beyond Compare or any similar program)
Right pane: for the currently selected directory on the left: a list of directories (icons with path info) that are identical or similar (with % similarity shown, also using color)
Ideally, the directories in the left pane could be sortable by total size and perhaps by other criteria (e.g., most recent differences from directories on the right). The differences should be determined in a variety of controllable ways, as in the case of Beyond Compare, and differences deep in the structure should cause color changes higher up, as shown above. And, as with Beyond Compare, it should be possible to exclude similarities from the display.
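To make the right pane a little more concrete, here is a minimal sketch of one way the % similarity could be computed. It is an assumption on my part, not something any existing tool is known to do: each directory is reduced to the set of content digests of the files beneath it, and two directories are compared by Jaccard similarity (shared contents divided by total distinct contents). File names and sizes are ignored, and whole files are read into memory, which would need rethinking for large data sets. The names `content_set`, `similarity_percent` and `right_pane` are hypothetical.

```python
import hashlib
import os


def content_set(directory):
    """Set of SHA-256 digests for every file under directory (recursive)."""
    digests = set()
    for dirpath, _dirnames, filenames in os.walk(directory):
        for name in filenames:
            with open(os.path.join(dirpath, name), "rb") as f:
                digests.add(hashlib.sha256(f.read()).hexdigest())
    return digests


def similarity_percent(a, b):
    """Jaccard similarity of two digest sets, as a percentage."""
    if not a and not b:
        return 100.0  # two empty directories are treated as identical
    return 100.0 * len(a & b) / len(a | b)


def right_pane(selected, candidates, threshold=1.0):
    """List (directory, % similarity) for candidates resembling `selected`."""
    base = content_set(selected)
    rows = [(c, similarity_percent(base, content_set(c)))
            for c in candidates if c != selected]
    # Drop directories below the threshold, then show the closest match first.
    rows = [r for r in rows if r[1] >= threshold]
    return sorted(rows, key=lambda r: r[1], reverse=True)
```

A real tool would want to weight by file size and cache the digests, but even this crude measure would let the left pane be sorted by "has a near-identical twin somewhere".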
Such a tool would be useful to any network manager, but I have found I could use one at home too, and once home storage systems take off, more people will as well. Note: A nasty shortcut, worthwhile in some cases, is to zip directories and compare the resulting archive files.
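A gentler variant of the zip shortcut is to compute one digest per directory from the names and contents of everything inside it, so that identical directories end up with identical digests wherever they sit in the tree. As far as I know, zip archives also record timestamps and other metadata, so two archives of the same content need not be byte-identical; a digest built only from names and contents avoids that. The sketch below is an illustration under assumptions: the function names are hypothetical, symbolic links are assumed to be absent, and SHA-256 is simply a convenient choice.

```python
import hashlib
import os
from collections import defaultdict


def tree_digest(directory, index=None):
    """Digest of a directory built from entry names and their content digests.

    If `index` is given, the digest of this directory and of every
    subdirectory is recorded in it as digest -> list of paths.
    """
    h = hashlib.sha256()
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isdir(path):
            kind = b"d"
            child = tree_digest(path, index)
        else:
            kind = b"f"
            with open(path, "rb") as f:
                child = hashlib.sha256(f.read()).hexdigest()
        # Entry type, name and child digest all feed the parent digest.
        h.update(kind + os.fsencode(name) + b"\0" + child.encode("ascii") + b"\n")
    digest = h.hexdigest()
    if index is not None:
        index[digest].append(directory)
    return digest


def identical_directories(root):
    """Group directories at or below root whose entire contents match."""
    index = defaultdict(list)
    tree_digest(root, index)
    return [sorted(dirs) for dirs in index.values() if len(dirs) > 1]
```

Note that this only finds exact matches (and reports all empty directories as identical to one another); a single changed file deep in the structure changes every digest above it, which is the same "colour changes higher up" behaviour, but without any notion of how similar two directories are.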
Which brings me to: thoughts about bookmarks as files in (virtual) folders and visual representation thereof. Imagine each level in the diagram above is a person.
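To sketch what that might mean in practice: if each person is treated as a folder and each bookmark as a file in it, the same kind of similarity measure carries over. The snippet below is purely illustrative. The people and URLs are made up, the function names are hypothetical, and it does not use any del.icio.us API; it just inverts a person-to-bookmarks mapping and scores each pair of people by the share of bookmarks they have in common.

```python
from collections import defaultdict
from itertools import combinations


def shared_bookmark_counts(bookmarks):
    """Count bookmarks shared by each pair of people.

    `bookmarks` maps a person to the set of URLs they have saved.
    """
    people_by_url = defaultdict(set)
    for person, urls in bookmarks.items():
        for url in urls:
            people_by_url[url].add(person)
    shared = defaultdict(int)
    for people in people_by_url.values():
        for a, b in combinations(sorted(people), 2):
            shared[(a, b)] += 1
    return dict(shared)


def affinity(bookmarks):
    """Return {(a, b): percent} for pairs sharing at least one bookmark."""
    counts = shared_bookmark_counts(bookmarks)
    return {
        (a, b): 100.0 * n / len(bookmarks[a] | bookmarks[b])
        for (a, b), n in counts.items()
    }


# Hypothetical data: three people and their saved URLs.
example = {
    "ann": {"http://example.org/1", "http://example.org/2", "http://example.org/3"},
    "bob": {"http://example.org/2", "http://example.org/3", "http://example.org/4"},
    "cat": {"http://example.org/5"},
}
print(affinity(example))  # {('ann', 'bob'): 50.0}
```

A matrix of such scores is the kind of input from which a tree of people, like that cladogram of cladists, could be built.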
