Friday, July 23, 2004
Earlier this week I posted excerpts from the lead article in the current issue of CiSE. The article was titled, "Managing XML Data: An Abridged Overview," which is a good, accurate title. The excerpts contain useful links, too.
The second article is titled, "Information Retrieval Techniques for Peer-to-Peer Networks." Fortunately, a full-text PDF copy of this paper can be accessed at either http://dblab.cs.ucr.edu/, although the URL for the former looks a little bit too generic and might change at a moment's notice (also, the two papers are slightly different). I have 19 bookmarks on my smartphone for this paper, but I guess I can summarize by saying that IR for P2P networks is hard and very different from "traditional" search. The last statement actually says a lot -- read between the lines. This paper covers all the usual suspects and also includes Skype. This paper is based upon the lead author's Master's thesis, which can be accessed from http://tinyurl.com/696ml. Other papers by the lead author can be accessed at http://tinyurl.com/43kkh. This is an important issue which needs to be resolved, especially as collaborative grid computing (CGC) comes to life.
Two figures; 20 references (28 references in the preprint).
Less luck with the paper titled, "Web Searching and Information Retrieval," i.e., I couldn't find a free copy on the Web. The author's site is woefully outdated, too. The author does speak favorably of a particular approach to decentralized P2P web crawling called "Apoidea." A copy of a paper describing Apoidea can be accessed at http://tinyurl.com/4m2v5; accompanying slides can be accessed at http://tinyurl.com/4b4sh. As described in the CiSE paper, "Apoidea is both self-managing and uses the resource's geographical proximity to its peers for a better and faster crawl."
Two figures; 21 references.
"Web Mining: Research and Practice" is not available, either, but a lot of excellent info on the senior author's projects related to this paper is available. First, take a look at the eBiquity research areas
. Next, you may want to take a look at the abstracts for papers published
as part of the eBiquity Group at http://tinyurl.com/5om58
(current through December 2004 -- it doesn't get more current!!). Move on to their "Semantic Web" page
. I then downloaded a PDF copy of their paper titled, "Mining Domain Specific Texts and Glossaries to Evaluate and Enrich Domain Ontologies
" (see http://tinyurl.com/3lg2m
). It looks like a relatively recent paper, newer than the CiSE
paper (different authors and different subject matter, though). The PDF is part of their Semantic Web research, whereas the CiSE
paper is more "generic." Anyway, the "Web Mining" paper is another call for distributed mining techniques
, and covers fuzzy clustering as well as content-based recommender systems -- but doesn't forget good 'ol HITS (Hyperlink-Induced Topic Search), the basis for IBM's Clever and Google (to a certain extent).
No figures; 31 references.
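For anyone who hasn't run into HITS before, here is a minimal Python sketch of the idea: every page gets a "hub" score (it points to good authorities) and an "authority" score (good hubs point to it), and the two are updated in turn until they settle. This is my own toy illustration, not code from the CiSE paper. The function name `hits`, the dictionary-of-lists graph format, the fixed iteration count, and the three-page example graph are all assumptions made for the sake of the example.

```python
from math import sqrt


def hits(graph, iterations=50):
    """Toy HITS. `graph` maps each page to the list of pages it links to."""
    # Collect every page that appears as a source or as a link target.
    pages = set(graph)
    for targets in graph.values():
        pages.update(targets)

    hubs = {p: 1.0 for p in pages}
    auths = {p: 1.0 for p in pages}

    for _ in range(iterations):
        # Authority update: sum the hub scores of the pages linking in.
        new_auths = {p: 0.0 for p in pages}
        for src, targets in graph.items():
            for dst in targets:
                new_auths[dst] += hubs[src]
        norm = sqrt(sum(v * v for v in new_auths.values())) or 1.0
        auths = {p: v / norm for p, v in new_auths.items()}

        # Hub update: sum the authority scores of the pages linked to.
        new_hubs = {p: sum(auths[d] for d in graph.get(p, ())) for p in pages}
        norm = sqrt(sum(v * v for v in new_hubs.values())) or 1.0
        hubs = {p: v / norm for p, v in new_hubs.items()}

    return hubs, auths


# Hypothetical three-page web: "a" and "b" both link to "c".
example = {"a": ["c"], "b": ["c"], "c": []}
hubs, auths = hits(example)
print("best authority:", max(auths, key=auths.get))
```

In this toy graph, "c" ends up as the only authority while "a" and "b" share the hub role. The real algorithm runs on a query-specific subgraph of the Web rather than a hand-built dictionary, so treat this as a sketch of the scoring loop only.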
Finally, "Intelligent Agents on the Web: A Review" was very disappointing. The lead author has impeccable credentials, but his paper is based on yesterday's news: Old, outdated, buried stuff (like Firefly). Matter of fact, the only live link I can recall finding was Recursion Software's "Voyager" home page, which states that the "Voyager applications development platform provides the software layer which handles communications across the network for distributed JAVA applications." (Looks interesting.)
As the chair of the Internet and Web applications session of the First International Conference on Autonomous Agents (1996), I have a soft spot for agent-oriented everything (especially Web apps). I remember an old saying from IJCAI (International Joint Conference on Artificial Intelligence) in the mid-70's: Artificial intelligence is better than none. (I probably still have a button with this saying somewhere.) I'm keeping the faith, sans the hype and leaning more toward the realities of software agents. BTW, this CiSE paper isn't bad if you don't have any background in this space. It covers the basics, such as ACLs (agent communication languages), but with an "updated" perspective.
No figures; 27 references.
The Ultimate Killer App
BTW, the "Ultimate Killer App" is attached and in some browsers it will automatically download. (See the bottom of this message.) You have to admit, this really is the ultimate killer app!!
I've never sent an attachment this way simultaneously to both my e-newsletter and blogs (and blog variants). Just in case the attachment isn't included, I've uploaded it to the "Photos" section of the e-newsletter (see http://tinyurl.com/2r3pa).
>> Note to AlwaysOn readers: You'll need to go to the e-newsletter (http://tinyurl.com/2r3pa) in order to see the "Ultimate Killer App." You can try the blogs, but no guarantees.
Tidbits on Enterprise Software
.NET wins converts, according to a VARBusiness story. Evans Data reports a sharp YoY increase in .NET adoption, with 52% saying they use .NET and 68% saying they plan to deploy .NET apps by 2005. In May, Forrester reported that 56% of developers consider .NET their primary development environment, contrasted with 44% for J2EE. (It must have been a binary choice!) VARBusiness found in a May survey that 53% have already deployed a .NET app and 66% plan to do so within the next 12 months. In the VARBusiness survey, the most important reasons for going with .NET were ease of use and quicker time to market. A developer goes on to state that .NET development time is to Java what Java is to C++. (Wow, what a claim!)
Python and Perl beat Java? (See http://tinyurl.com/44m5t for the PDF file.) Actually, an indirect "attack" against all "mainstream" programming languages, notably Java, C and C++. The idea is that the "mainstream" languages are ill-suited for many distributed computing and integration apps. Gives a "thumbs up" to Python, Perl and PHP, with a peek at PEAK -- the Python Enterprise Application Kit. (Sorry for the pun.) PEAK's developers claim future superiority over J2EE. They also knock Java for not being suited to rapid application development. PEAK's developers believe a Python-based approach to component-based apps will result in systems that are simpler, faster and easier to install, manage and maintain than variants in J2EE. PEAK, however, is still immature.
- Should the IT agenda include investment in outsourcing technologies or services?
- Does the future of the business include operations in, or electronic trade with, additional countries - China, for example?
- Are the services of an outside provider being considered to help in managing proliferating applications or complex "interenterprise" business relationships?
- What role will utility computing play in the future of IT?
(All items in bold are my emphasis.) The article goes on to discuss various ways of evaluating ROI, including one of my favorite ways, ROA (real options analysis).
TTFN. Have a GREAT weekend!