Hitwise: Web Intelligence

On jill/text I came across an interesting graph about OpenSocial vs. Facebook showing the difference in market share. Hitwise provides statistics and analysis of internet usage. They get their data from ISPs, which sounds like it could be a privacy issue. See their Product Features for the services they provide that most of us can’t afford. See what they say about how they gather information in How We Do It, or here is a quote from their press release, Hannah Montana Most Searched for Halloween Costume:

Since 1997, Hitwise has pioneered a unique, network-based approach to Internet measurement. Through relationships with ISPs around the world, Hitwise’s patented methodology anonymously captures the online usage, search and conversion behavior of 25 million Internet users. This unprecedented volume of Internet usage data is seamlessly integrated into an easy to use, web-based service, designed to help marketers better plan, implement and report on a range of online marketing programs.

They have blogs by their analysts, most of whom seem to be in the UK, that have interesting notes about trends like iTunes overtakes Free Music Downloads in Internet Searches.

The LongPen

The Globe and Mail has a story about Margaret Atwood’s LongPen technology, Border no barrier for Black’s autograph pen. I remain convinced this is a really stupid idea, but I have the feeling no one else does. Exactly why would someone want to get their book signed by telepresence? The videoconferencing with the author may be a draw, but the remote signing? The answer, according to the site, is that,

According to fans, this is a more intimate experience than a traditional signing, as you are looking directly into the face of the fan, as opposed to briefly looking up from your chair when signing in person. The video conferencing also makes it easier for the fan to be expressive about your work, as the technological distance makes them less nervous.

Atwood must really hate book signing tours.

As It Happens, Privacy, and the Mechanical Turk

As It Happens on CBC Radio just played a good double segment on “Google Eyes”. The first part looked at the Amazon Mechanical Turk task looking for Steve Fossett’s plane on satellite images. The second part looked at privacy issues around street level imaging from outfits like Google.

Mechanical Turk (Artificial Artificial Intelligence) is a project where people can contribute to tasks that need many human eyes, like looking at thousands of satellite images for a missing plane. It reminds me of the SETI@home project, which lets users install a screen saver that uses their unused processing cycles for SETI signal processing. SETI@home is now part of a generalized project, BOINC, that, like the Mechanical Turk, has a process for people to post tasks for others to work on.

The Privacy Commissioner of Canada announced yesterday that she has written both Google and Immersive Media (who developed the Street View technology used by Google) “to seek further information and assurances that Canadians’ privacy rights will be safeguarded if their technology is deployed in Canada.” The issue is that,

While satellite photos, online maps and street level photography have found useful commercial and consumer applications, it remains important that individual privacy rights are considered and respected during the development and implementation of these new technologies.

This is a growing concern among privacy advocates as a number of companies have considered integrating street level photography in their online mapping technologies.

In street level photography the images are, in some cases, being captured using high-resolution video cameras affixed to vehicles as they proceed along city streets.

Google, according to the commissioner on the radio, has not replied to the August 9th letter.

Plagiarism and The Ecstasy of Influence

Jonathan Lethem had a wonderful essay, The Ecstasy of Influence: A Plagiarism, in the February 2007 Harper’s. The twist to the essay, which discusses the copying of words, gift economies, and public commons, was that it was mostly plagiarized – a collage text – something I didn’t realize until I got to the end. The essay challenges our ideas of academic integrity and plagiarism.

In my experience plagiarism has been getting worse with the Internet. There are now web sites like Customessay.org where you can buy customized essays for as low as $12.95 a page. Do the math – a five page paper will probably cost less than the textbook and it won’t get detected by services like Turn It In.

These essay writing companies actually offer to check that the essay you are buying isn’t plagiarized. Here is what Customessay.org says about their Cheat Guru software:

Custom Essay is using the specialized Plagiarism Detection software to prevent instances of plagiarism. Furthermore, we have developed the special client module and made this software accessible to our customers. Many companies claim to utilize the tools of such kind, few of them do and none of them offer their Plagiarism Detection software to their customers. We are sure about the quality of our work and provide our customers with effective tools for its objective assessment. Download and install our Cheat Guru and test the quality of the products you receive from us or elsewhere.

Newspapers have been running stories on plagiarism like JS Online: Internet cheating clicks with students, which connects it to ideas from a book by David Callahan, The Cheating Culture (see the archived copy of the Education page that was on his site).

There is a certain amount of research on plagiarism on the web. A place to start is The Plagiarism Resource Site or the University of Maryland College’s Center for Intellectual Property page on Plagiarism.

I personally find it easy to catch students who crib from the web by using Google. When I notice a shift in writing professionalism I take a sequence of five or so words and Google the phrase in quotation marks. Google will show me the web page the sequence came from. The trick is finding a sequence short enough to not be affected by paraphrasing while long and unique enough to find a web site the student used. This Salon article, “The Web’s plagiarism police” by Andy Dehnart, talks about services and tools that do similar things.
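For anyone who wants to try the quoted-phrase trick on a longer passage, here is a rough Python sketch that lists every run of five consecutive words, each wrapped in quotation marks and ready to paste into a search box. It doesn't contact Google or any other service; the function name candidate_phrases and the sample sentence are made up for illustration, and picking which phrase is "unique enough" is still left to the reader.

```python
import re


def candidate_phrases(passage, size=5):
    """Return every run of `size` consecutive words, wrapped in double quotes."""
    # Keep word characters, apostrophes and hyphens; other punctuation is dropped.
    words = re.findall(r"[\w'-]+", passage)
    if len(words) < size:
        return []
    return ['"' + " ".join(words[i:i + size]) + '"'
            for i in range(len(words) - size + 1)]


# Made-up sample sentence, standing in for a suspicious passage in an essay.
suspect = "students sometimes paste polished sentences into otherwise rough essays"
for phrase in candidate_phrases(suspect):
    print(phrase)
```

A smaller size survives paraphrasing better but returns more unrelated pages; a larger one is more distinctive but breaks as soon as the student changes a word.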

Perhaps the greatest use of these plagiarism catching tools is that they might show us how anything we write is woven out of the words of others. It’s possible these could be adapted to show us the web of connections radiating out from anything written.

Note: This entry was edited in Feb. 2018 to fix broken links. Thanks to Alisa from Plagiarism Check for alerting me to the broken links.

WikiScanner

The media has been reporting on a neat tool that Virgil Griffith developed called the WikiScanner, which scans the Wikipedia for entries edited by a particular domain. This has allowed people to find that people at the Department of Defence, for example, are hard at work editing the entries for abortion and the pill. The BBC has a story at Wikipedia ‘shows CIA page edits’. Wired has a story, See Who’s Editing Wikipedia – Diebold, the CIA, a Campaign. Wired also has a place where you can submit interesting Wikipedia Spin Jobs.

All this raises the question of what/who is a legitimate Wikipedia author? Is there something wrong with a company editing its own entry? Isn’t the point of the open editing to let all the various interests out there negotiate the entries?

I should add that the WikiScanner is an example of the unexpected uses of datamining. It uses information no one expected could be mined and combined to produce interesting results that can be interpreted.
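To make the idea of combining data concrete, here is a minimal Python sketch. I don't know how Griffith actually built the WikiScanner; this assumes only that anonymous Wikipedia edits are recorded under the editor's IP address and that one has a table of which address ranges belong to which organization. The organization names, ranges (reserved documentation addresses), and edits below are all invented.

```python
import ipaddress

# Hypothetical organizations and address ranges (reserved documentation blocks).
ORG_RANGES = {
    "Example Corp": [ipaddress.ip_network("192.0.2.0/24")],
    "Example Agency": [ipaddress.ip_network("198.51.100.0/24")],
}

# Hypothetical anonymous edits as (article title, editor IP address) pairs.
EDITS = [
    ("Example Corp", "192.0.2.17"),
    ("Voting machine", "192.0.2.200"),
    ("Weather", "203.0.113.5"),
    ("Example Agency", "198.51.100.9"),
]


def edits_by_org(edits, org_ranges):
    """Group edited article titles under the organization whose range holds the IP."""
    found = {org: [] for org in org_ranges}
    for title, ip_text in edits:
        ip = ipaddress.ip_address(ip_text)
        for org, networks in org_ranges.items():
            if any(ip in net for net in networks):
                found[org].append(title)
    return found


for org, titles in edits_by_org(EDITS, ORG_RANGES).items():
    print(org, "->", titles)
```

Neither list is revealing by itself; the interesting result only appears when the two are joined, which is the sense in which this is datamining.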

CAUT: Email Outsourcing Threatens Privacy & Academic Freedom

The Canadian Association of University Teachers’ recent Bulletin has a timely story, Email Outsourcing Threatens Privacy & Academic Freedom. The story is about Lakehead University switching over to Gmail. The switch means that students and faculty now have gigabytes of email space as opposed to the megabytes they had from the campus-run service (a situation similar to what we have at McMaster). The switch also raises privacy concerns because Google’s terms of use include the following:

As a condition to using the Service, you agree to the terms of the Gmail Privacy Policy as it may be updated from time to time. Google understands that privacy is important to you. You do, however, agree that Google may monitor, edit or disclose your personal information, including the content of your emails, if required to do so in order to comply with any valid legal process or governmental request (such as a search warrant, subpoena, statute, or court order), or as otherwise provided in these Terms of Use and the Gmail Privacy Policy. Personal information collected by Google may be stored and processed in the United States or any other country in which Google Inc. or its agents maintain facilities. By using Gmail, you consent to any such transfer of information outside of your country.

As Google adds functionality so that they can offer more than just email, I suspect this problem will become more acute. Soon we might see universities outsourcing calendar, word processing, spreadsheets, and web site functions.

Sweden upstaged by Maldives in virtual diplomacy

Sweden is the second country to open an embassy in Second Life according to this story from the Associated Press, Sweden upstaged by Maldives in virtual diplomacy. The Maldives beat them to it by a couple of weeks. What is interesting is that the embassy will feature an exhibit about Wallenberg.

It provides visitors with information about Swedish culture and history, as well as tips about places to visit and visa rules. It will also host exhibits, including a virtual version of the Budapest office of Swedish diplomat Raoul Wallenberg, who helped thousands of Jews escape Nazi-occupied Hungary during World War II.

Thanks to Jean-Claude Guedon, who told me about this yesterday.

Buying friends online

Do you want more friends for your MySpace presence? The Globe and Mail has an article by Keith McArthur, Trouble making friends online? Buy them (May 22, 2007) about services known as “friend trains” that help people make lots of friends. These services are selling enhanced access so that people can get lots of friends faster. Companies that use Facebook for viral marketing can then get lots of friends who then get their feed.

Obviously you may not be able to buy love, but you can buy friends.