As It Happens, Privacy, and the Mechanical Turk

As It Happens on CBC Radio just played a good double segment on “Google Eyes”. The first part looked at the Amazon Mechanical Turk task looking for Steve Fossett’s plane on satellite images. The second part looked at privacy issues around street level imaging from outfits like Google.

Mechanical Turk (Artificial Artificial Intelligence) is a project where people can contribute to tasks that need many human eyes, like looking at thousands of satellite images for a missing plane. It reminds me of the SETI@home project, which lets users install a screen saver that uses their unused processing cycles for SETI signal processing. SETI@home is now part of a generalized project, BOINC, which, like the Mechanical Turk, has a process for people to post tasks for others to work on.

The Privacy Commissioner of Canada announced yesterday that she has written both Google and Immersive Media (who developed the Street View technology used by Google) “to seek further information and assurances that Canadians’ privacy rights will be safeguarded if their technology is deployed in Canada.” The issue is that,

While satellite photos, online maps and street level photography have found useful commercial and consumer applications, it remains important that individual privacy rights are considered and respected during the development and implementation of these new technologies.

This is a growing concern among privacy advocates as a number of companies have considered integrating street level photography in their online mapping technologies.

In street level photography the images are, in some cases, being captured using high-resolution video cameras affixed to vehicles as they proceed along city streets.

Google, according to the Commissioner on the radio, has not replied to the August 9th letter.

The ECAR Study of Undergraduate Students and Information Technology, 2007

The ECAR Study of Undergraduate Students and Information Technology, 2007 is a study from the EDUCAUSE Center for Applied Research about undergraduate IT experiences. They describe the study thus,

This 2007 ECAR research study is a longitudinal extension of the 2004, 2005, and 2006 ECAR studies of students and information technology. The study, which reports noticeable changes from previous years, is based on quantitative data from a spring 2007 survey and interviews with 27,846 freshman, senior, and community college students at 103 higher education institutions. It focuses on what kinds of information technologies these students use, own, and experience; their technology behaviors, preferences, and skills; how IT impacts their experiences in their courses; and their perceptions of the role of IT in the academic experience.

The Executive Summary doesn’t include any surprising insights. I conclude that a) faculty need to keep up with technology like social networking because their students expect it; b) we need to use appropriate IT effectively, but not too much; and c) we need to keep the F2F, because the IT just doesn’t do it all.

While most respondents are enthusiastic IT users and use it to support many aspects of their academic lives, most prefer only a “moderate” amount of IT in their courses (59.3 percent). This finding has been consistent over the past three years’ studies, and students continue to tell us that they do not want technology to eclipse valuable face-to-face interaction with instructors. (ECAR Research Study, P. 13)

The irritating thing about the report is that whoever wrote the Executive Summary doesn’t seem to get it. They start with that 1980s type of “the world is changing dramatically” hype which, like any crying of “wolf”, ceases to work after a while. Here is the opening,

Chris Dede’s Introduction to this study argues that the ongoing technology revolution is driving a sea change in communicating, teaching, and learning. Further, while faculty and institutions have automated conventional forms of instruction and made some steps in using technology to expand the range of students’ academic experiences, we have barely scratched the surface. (p. 9)

The results reported, like those on the importance of face time, suggest that we have thoroughly scratched the surface and discovered that IT is only good for certain things. It helps learning, it is convenient, it adds a way of communicating, but it isn’t that engaging compared to a real face. When will EDUCAUSE give us cause to think they are capable of a balanced opinion on technology and education? Who believes that IT is necessarily going to change education any more?

Stop Spam Here – Combat Spam, Spyware and Phishing

Stop Spam Here – Combat Spam, Spyware and Phishing is a web site that grew out of the Task Force on Spam, which released a report in 2005 that includes recommendations like:

10. … the federal government should lead in establishing a Canadian spam database (i.e. the “Spam Freezer”).

15. As part of its ongoing effort to increase user awareness and education, the federal government, in cooperation with interested stakeholders, should continue to promote the “Stop Spam Here / Arrêtez le pourriel ici” user-tips campaign by encouraging others to link to these websites, and through the use of other appropriate methods and media. (Executive Summary)

The report says that spam has hit a ratio of 80% of global e-mail and there seems to be nothing we can do about it. The Stop Spam Here web site suggests three tips to protect yourself:

  • Protect your computer with virus scanning software and a firewall.
  • Protect your email address and use expendable addresses.
  • Don’t respond to spam.
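The second tip can be partly automated with “plus” addressing (subaddressing), which many mail providers support, though not all. A minimal sketch of generating a per-site expendable alias (the names and example addresses here are hypothetical):

```python
def expendable_address(user, domain, site):
    """Derive a per-site "plus" alias. Spam arriving at the alias
    reveals which site leaked the address, and that alias can then
    be filtered or retired without abandoning the main account."""
    tag = "".join(c for c in site.lower() if c.isalnum())
    return f"{user}+{tag}@{domain}"

# e.g. signing up for a hypothetical shopping site:
print(expendable_address("pat", "example.com", "Shop-O-Rama"))
```

Mail for the alias still lands in the main inbox, but it can be sorted or blocked by the tag after the `+`.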

The first and third don’t seem likely to make much of a difference. The second option is a form of giving up – it accepts that we have to keep multiple addresses and manage them. The alternative is to stop using e-mail and switch to a secure environment like Facebook where only friends can message you. Is it surprising that youth are not using e-mail the way we do?

ForensicXP for forensic document analysis

ForensicXP is a device that does forensic document imaging. It combines 3D imaging with chemical analysis to do Hyperspectrum Imaging and Processing. This can be used to recover “obliterated” writing, to figure out the sequence of line drawing (what lines/words were drawn first), and to detect additions and substitutions. Obviously it also helps identify the chemistry (ink) used.

Thanks to John for this.

Plagiarism and The Ecstasy of Influence

Jonathan Lethem had a wonderful essay, The Ecstasy of Influence: A Plagiarism, in the February 2007 Harpers. The twist to the essay, which discusses the copying of words, gift economies, and public commons, was that it was mostly plagiarized – a collage text – something I didn’t realize until I got to the end. The essay challenges our ideas of academic integrity and plagiarism.

In my experience plagiarism has been getting worse with the Internet. There are now web sites like Customessay.org where you can buy customized essays for as low as $12.95 a page. Do the math – a five page paper will probably cost less than the textbook and it won’t get detected by services like Turn It In.

These essay writing companies actually offer to check that the essay you are buying isn’t plagiarized. Here is what Customessay.org says about their Cheat Guru software:

Custom Essay is using the specialized Plagiarism Detection software to prevent instances of plagiarism. Furthermore, we have developed the special client module and made this software accessible to our customers. Many companies claim to utilize the tools of such kind, few of them do and none of them offer their Plagiarism Detection software to their customers. We are sure about the quality of our work and provide our customers with effective tools for its objective assessment. Download and install our Cheat Guru and test the quality of the products you receive from us or elsewhere.

Newspapers have been running stories on plagiarism like JS Online: Internet cheating clicks with students connecting it to ideas from a book by David Callahan, The Cheating Culture (see the archived copy of the Education page that was on his site.)

There is a certain amount of research on plagiarism on the web. A place to start is The Plagiarism Resource Site or the University of Maryland College’s Center for Intellectual Property page on Plagiarism.

I personally find it easy to catch students who crib from the web by using Google. When I notice a shift in the professionalism of the writing I take a sequence of five or so words and Google the phrase in quotation marks. Google will show me the web page the sequence came from. The trick is finding a sequence short enough not to be affected by paraphrasing while long and unique enough to find the web site the student used. This Salon article, “The Web’s plagiarism police” by Andy Dehnart, talks about services and tools that do similar things.
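The trick of quoting a short word sequence can be sketched in a few lines. This is a hypothetical illustration, not a tool I use: it chops a suspect passage into fixed-length phrases and builds an exact-phrase (quoted) search URL for each.

```python
import re
import urllib.parse

def candidate_phrases(text, length=6):
    """Split a passage into consecutive word sequences of a fixed
    length, each a candidate for an exact-phrase web search."""
    words = re.findall(r"[A-Za-z']+", text)
    return [" ".join(words[i:i + length])
            for i in range(0, max(len(words) - length + 1, 0), length)]

def search_url(phrase):
    """Build a search URL for an exact-phrase (quoted) query."""
    return "https://www.google.com/search?q=" + urllib.parse.quote(f'"{phrase}"')

# A hypothetical suspect passage:
passage = ("The ontological turn in recent continental thought "
           "problematizes the very conditions of possibility for discourse")
for phrase in candidate_phrases(passage):
    print(search_url(phrase))
```

Six words is a guess at the sweet spot the entry describes: short enough to survive light paraphrasing, long enough to be distinctive.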

Perhaps the greatest use of these plagiarism catching tools is that they might show us how anything we write is woven out of the words of others. It’s possible these could be adapted to show us the web of connections radiating out from anything written.

Note: This entry was edited in Feb. 2018 to fix broken links. Thanks to Alisa from Plagiarism Check for alerting me to the broken links.

Flores: Upload, Download

While in Berlin I took some pictures in a gallery of an art installation upload download overload by Dolores Flores. I took pictures of Upload, Overload and Download for the Dictionary of Words in the Wild. What I didn’t realize until I checked the web about the installation, was that the little paper figures were significant. They are “beer people” and, as the web site explains,

The “beer people” were created in the year 1993, in an old Berliner beer “Kneipe”.

Made out of the paper ring that is attached to the bottom of a fresh tapped pilsner beer glass, to absorb the foam drippings. This paper ring was the inspiration behind the making of the “beer people”. Each figure is a unikat, torn, ripped and folded.

As they say, “Do not fold, spindle or mutilate”.

Harley: The Journal of Electronic Publishing

The Journal of Electronic Publishing has an article, The Influence of Academic Values on Scholarly Publication and Communication Practices, which nicely summarizes the state of perceptions about electronic publishing. The article doesn’t talk much about how they arrived at their conclusions, but the conclusions strike me as likely. Some of the conclusions worth noting for digital humanists:

  • Peer review is still important for tenure and promotion, which makes it difficult for un-reviewable works to be treated as scholarly contributions.
  • Academics are worried there is too much stuff on the web and that lower costs of publication lead to lower standards. Therefore print peer-reviewed publications are still taken more seriously, while online peer-reviewed publications are viewed as less important.
  • Online publication is seen as a way to make a name, while print publication is seen as how you get tenure.
  • Print is seen as more archival and therefore the best place for finished work while online publication is seen as less likely to survive and therefore better suited to scholarly communication. This, by the way, accords with The Credibility of Electronic Publishing report that I contributed to. Print does seem to last longer and therefore is better suited to final archival publication.

Kirschenbaum: Hamlet.doc?

Matt Kirschenbaum has published an article in The Chronicle of Higher Education titled, Hamlet.doc? Literature in a Digital Age (from the issue of August 17, 2007.) The article teases us with the question of what we scholars could learn about the writing of Hamlet if Shakespeare had left us his hard drive. Kirschenbaum has nicely described and theorized the digital archival work humanists will need to learn to do in his forthcoming book from MIT Press, Mechanisms. Here is the conclusion of the Chronicle article,

Literary scholars are going to need to play a role in decisions about what kind of data survive and in what form, much as bibliographers and editors have long been advocates in traditional library settings, where they have opposed policies that tamper with bindings, dust jackets, and other important kinds of material evidence. To this end, the Electronic Literature Organization, based at the Maryland Institute for Technology in the Humanities, is beginning work on a preservation standard known as X-Lit, where the “X-” prefix serves to mark a tripartite relationship among electronic literature’s risk of extinction or obsolescence, the experimental or extreme nature of the material, and the family of Extensible Markup Language technologies that are the technical underpinning of the project. While our focus is on avant-garde literary productions, such literature has essentially been a test bed for a future in which an increasing proportion of documents will be born digital and will take fuller advantage of networked, digital environments. We may no longer have the equivalent of Shakespeare’s hard drive, but we do know that we wish we did, and it is therefore not too late — or too early — to begin taking steps to make sure we save the born-digital records of the literature of today.

Mashing Texts and Just in Time Research

With colleagues Stéfan Sinclair, Alexandre Sevigny and Susan Brown, I recently got a SSHRC Research and Development Initiative grant for a project called Mashing Texts. This project will look at “mashing” open tools to test ideas for text research environments. Here is a PowerPoint file that shows the first prototype for a social text environment based on Flickr.

From the application:

The increasing availability of scholarly electronic texts on the internet makes it possible for researchers to create “mashups” or combinations of streams of texts from different sources for purposes of scholarly editing, sociolinguistic study, and literary, historical, or conceptual analysis. Mashing, in net culture, is reusing or recombining content from the web for purposes of critique or creating a new work. Web 2.0 phenomena like Flickr and FaceBook provide public interfaces that encourage this recombination (see “Mashup” article and Programmableweb.com.) Why not recombine the wealth of electronic texts on the web for research? Although such popular social networking applications as mashups seem distant from the needs of humanities scholars, in many ways so-called mashups or repurposing of digital content simply extend the crucial principle developed in humanities computing for the development of rich text markup languages: that content and presentation should be separable, so that the content can be put to various and often unanticipated uses.

Mashing Texts will prototype a recombinant research environment for document management, large-scale linguistic research, and cultural analysis. Mashing Texts proposes to adapt the document repository model developed for the Text Analysis Portal for Research (TAPoR) project so that a research team interested in recombinant documents can experiment with research methods suited to creating, managing and studying large collections of textual evidence for humanities research. The TAPoR project built text analysis infrastructure suited to analysis of individual texts. Mashing Texts will prototype the other side of the equation ��� the rapid creation of large-scale collections of evidence. It will do this by connecting available off-the-shelf open-source tools to the TAPoR repository so that the team can experiment with research using large-scale text methods.