Social vs. Paper-Based Text Annotation in an ESL Class

The focus of this project was to develop students’ close reading ability by having them socially annotate texts via Genius. The students in the course are learners of English as a Second Language at the high-intermediate level. Most of these students (85%) plan to pursue a degree at an American university in the following year, with all but one student at the graduate level. Despite their aspirations and relatively strong conversational skills, these students struggle to understand academic texts. They may be able to understand the main idea of an article and can often identify the topics discussed in a text, but they often demonstrate quite limited close reading ability. Close reading implies an understanding of the text based on ongoing interpretation of text elements like transition signals, organization, and connotation. To some extent, this is due to a lack of language knowledge. (For example, students may not understand the author’s tone because they don’t know the difference between challenge and obstacle. They may not know some vocabulary altogether, including important transition signals like nevertheless or otherwise. Grammatical elements like verb tenses, negatives, and conditionals can also greatly impact students’ understanding of a text.)

At least in part, students’ lack of close reading ability is due to a lack of experience. Though most of the students in this course have completed a bachelor’s degree and some even a master’s degree, many come from educational systems and cultures which value oral expression and rote learning. In this course, 77% of the students were from Saudi Arabia, which largely fits this mold (e.g., Elyas & Picard, 2010). Students who have not been trained in reading strategies may not realize, for example, that thinking about the relationship between paragraphs can enhance comprehension.
Even among American students—who arguably come from a school system that de-emphasizes rote learning and memorization—students often focus on surface learning (to remember), rather than deep learning (to understand) (Roberts & Roberts, 2008).

Many teachers and researchers hypothesize a strong link between text annotation and reading comprehension (e.g., Simpson & Nist, 1990; Porter-O’Donnell, 2004). While it is debatable whether developing annotation abilities positively impacts close reading, text annotation at a minimum can provide concrete evidence of close reading processes at work, which is helpful in an instructional context as a basis for discussing and illustrating reading comprehension strategies. Thus, in my reading/writing course, I require students to devise a personal annotation system for marking aspects of the text including main ideas, supports, relationships, personal connections, and questions. Although some students have responded quite positively and constructively, annotation is generally a solitary endeavor in which struggling students are largely left to struggle. Genius provides a platform for engaging in cooperative annotation, so potentially students can model annotation for each other and work together to extend their comprehension of the text.

  • Elyas, T., & Picard, M. (2010). Saudi Arabian educational history: Impacts on English language teaching. Education, Business and Society: Contemporary Middle Eastern Issues, 3(2), 136–145.
  • Porter-O’Donnell, C. (2004). Beyond the Yellow Highlighter: Teaching Annotation Skills to Improve Reading Comprehension. The English Journal, 93(5), 82–89. doi:10.2307/4128941
  • Roberts, J. C., & Roberts, K. A. (2008). Deep Reading, Cost/Benefit, and the Construction of Meaning: Enhancing Reading Comprehension and Deep Learning in Sociology Courses. Teaching Sociology, 36(2), 125–140.
  • Simpson, M. L., & Nist, S. L. (1990). Textbook Annotation: An Effective and Efficient Study Strategy for College Students. Journal of Reading, 34(2), 122–129.

By comparing the annotations and performance of students in a diagnostic reading evaluation, the Genius tasks, and the end-of-course reading evaluation, and considering these in light of students’ survey responses about their reading habits and their opinions of the task, I hope to identify whether and/or how social annotation might be effective scaffolding for developing readers. Specifically, my research questions include the following:

  1. How does social annotation compare with individual annotation in terms of number of annotations, types of annotations, quality of annotations, and time? Is there any evidence of an impact on student engagement?
  2. How do students perceive social annotation and individual annotation?
  3. What evidence is there for a link between annotation and close reading ability? Does student performance show a correlation between annotation and close reading ability?

Data collection began at the outset of the semester with a paper-based diagnostic along with a questionnaire on students’ reading and annotation habits. The reading text for the diagnostic was “Society and the Sexes” (680 words), excerpted from a college introduction to sociology text and reprinted in a high-intermediate ESL textbook. In weeks 7-8 of the semester, students had an initial training session on Genius, in which they signed up for accounts, located a text that I had posted on a class page, and independently but synchronously posted annotations to the text, an essay by Jeffrey Sachs entitled “Message to Wall Street” (992 words). It was the final reading from a unit in our course textbook, but it was originally published in the Huffington Post. In week 15, as “practice” for their end-of-course reading evaluation, students again participated in synchronous annotating on a text entitled “In India, Caste Discrimination Still Plagues University Campuses” (1,626 words) and published in The Chronicle of Higher Education. Immediately following the reading and note-taking session, students completed reading comprehension questions; they were allowed to view the annotations on Genius but were not allowed to post additional annotations. The “practice” with Genius was followed by an individual, paper-based reading evaluation of the article “Research Ties Economic Inequality to Gap in Life Expectancy” (1,259 words) published in the Washington Post. Students were given the reading comprehension questions at the same time as the article text, but the questions generally required an understanding of the full text, and students were informed that their note-taking would be one criterion of evaluation on this final test. The amount of time with each text was not standardized but was roughly between 60 and 90 minutes.
Approximately two days after using Genius and one day after the final reading evaluation, students completed a survey on their opinions on the use of the technology in class.

To answer the first research question—How does social annotation compare with individual annotation in terms of number of annotations, types of annotations, quality of annotations, and time? Is there any evidence of an impact on student engagement?—I compared student annotations on the readings to the final end-of-course paper-based reading evaluation.

The difference in the number of annotations was striking. On the initial text, “Message to Wall Street,” thirteen students produced 69 total annotations. On the second and final Genius text, “India,” twelve students produced 48 annotations. In contrast, on the final reading evaluation, students produced a median of 61 annotations each (with a range from 19 to 131 annotations per student), for a total of 857 annotations. At least three factors likely played a part in this wide difference. First, though students did have a training session on Genius, it was still a relatively new technology and not always intuitive to use. Second, Genius does not allow for simple underlining (or highlighting), which was by far the most frequent annotation on the paper text. And third, the public nature of annotations in Genius likely worked as an inhibitor for many students.

It quickly became evident that the different formats invited vastly different types of annotations. For example, annotations may be explicit (easily understood by others) or telegraphic (coded with personal markings) (Marshall, 1997). A note in the margin that anyone could read and understand is explicit, whereas underlining is telegraphic; another reader cannot be sure about the reason for or the purpose of the underline. While 100% of annotations on Genius were explicit by their very nature—students were annotating with and for an audience, and the platform does not encourage telegraphic marks—the vast majority of annotations on the paper-based evaluation were telegraphic. Of 857 total annotations on 13 paper-based evaluations, only 82 markings (9.6%) were explicit.

Table 1: Annotations Posted to Texts in Genius

In Genius, nearly all annotations involved vocabulary, background knowledge, and paraphrases. Approximately half of all the annotations were vocabulary related. These included both questions and explanations. Another quarter or so were annotations related to background knowledge, including cultural references and finance jargon. Several annotations were paraphrases, though the proportion dropped significantly from the training text (26% of all annotations) to the final text (8% of all annotations). On the one hand, these paraphrases could be understood as close reading, a valuable skill; on the other hand, they can be an indication that students are engaged in decoding the text bit by bit, which is often a strategy of lower-level readers who are not attempting to construct a larger view of the text. The remaining nine annotations across both texts included predictions, explanations of pronoun referents, remarks on text structure, conclusions based on evidence within the text, and connections to another text.

The annotations on the final paper-based reading evaluation, in contrast, showed a marked focus on annotating main ideas and supporting details, and significantly less focus on vocabulary. Looking solely at the explicit annotations, it appears students were much less likely to make a vocabulary-related annotation, roughly just as likely to write a paraphrase, and much more likely to note the topic or function of a segment of text. Significantly, 3 of the 13 students did occasionally have an annotation in their first language (L1); these were likely direct translations of words, given their length. In contrast, there was no translation on the Genius text even though 10 of the 13 students shared a first language.

Table 2: Explicit Annotations on the Paper-Based Evaluation

As noted above, about 90% of marks on the final paper-based reading evaluation were telegraphic, meaning it is difficult to discern what the student’s purpose was in marking the text. Still, it is apparent that while some students marked main ideas, the vast majority of marks were on small details. For instance, more than half of all students marked more than 10 numbers in the text, with one student marking as many as 30. Similarly, 3 students marked more than 10 different names, with one student marking names in 23 instances. In this 88-line text, the median number of lines marked in some way was 40, or nearly half of all lines. The most common marking was an underline, followed by a circle, with squiggly underlining, double underlining, arrows, brackets, boxes, highlighting, and stars also common.

Table 3: Annotations on the Paper-Based Evaluation

Given the different contexts for the paper-based and Genius texts, it is not possible to draw any conclusions about time efficiency in taking notes. One repeated complaint, however, was that the Genius platform was slow to update and that the text color was difficult to read. This suggests that the efficiency of taking notes online is highly platform dependent.

The two platforms also heavily influenced potential evidence of student engagement. Clearly, student annotations were much more extensive on the paper text. Yet, the Genius platform allowed students to engage one another: 26% of annotations on the initial Genius text and 56% on the final text were either a question or a response to a question or annotation. As the teacher, I have no anecdotal data or observations pointing to higher student engagement with one of the platforms—students tended to voice frustrations about the general skill of reading and not the platform—and the students made no remarks comparing the platforms in course surveys or evaluations.

Table 4: Student Interactions in Genius

To answer the second question—How do students perceive social annotation and individual annotation?—I reviewed student responses to the initial questionnaire on student reading habits and their responses on the course and ITEL surveys administered at the end of the semester, as well as oral remarks made by students.

Again, students made no written remarks comparing social and individual annotation; still, their responses suggest they find both appropriate and useful. On the ITEL survey, 7 of 11 respondents agreed or strongly agreed that Genius was “a good complement to the content.” (Nevertheless, an equal number of students would not recommend Genius to other students. The written comments suggest this is primarily a response to a dislike for the web design, particularly the color of the text—white on a black background—as well as slow computer functioning.) On the end-of-course survey, students were asked questions about their reading habits similar to the questions asked at the beginning of the course. Of the 10 students who completed the survey, 9 agreed that “taking notes on readings is useful” and that “taking notes on readings helps you understand the text better.” Students specifically commented that taking notes helps them understand main ideas, remember details, summarize, locate information easily, understand the sequencing of the text, and review. Students did not, however, note other positive attributes of note-taking, such as feeling more engaged with the text or fostering more ideas for writing. In the pre- and post-course surveys, students also identified the reasons they might not take notes, including time constraints, difficulty understanding the text, lack of knowledge about how to take notes, and having a recreational purpose for reading. In the pre-course survey, two students also commented that they prefer to read without the interruption of taking notes; no similar comments were made in the post-course survey. Eight of 10 students agreed that this class has “changed your mind about taking notes” and explained their answer by making a positive remark about note-taking. (The other two students’ minds were not “changed” because they already saw value in taking notes.)
Nevertheless, only 3 of the 10 students answered that they “often” take notes “if your teacher doesn’t make you”; 4 students do so “half the time” and 4 “rarely.”

To answer the third question—What evidence is there for a link between annotation and close reading ability? Does student performance show a correlation between annotation and close reading ability?—I compared the extent and type of annotations on the texts in the diagnostic and final reading evaluations with each student’s performance on each assessment. Using the Excel data analysis tool, I calculated correlation coefficients among the following variables: total number of annotations, number of explicit annotations, vocabulary-related annotations, detail annotations (numbers and names), first-language annotations, number of different types of annotations, number of lines annotated, and scores on the comprehension questions including subparts (main idea, close reading, vocabulary, paraphrasing, argumentation, short answer). I also rated the quality of each student’s notes on a scale of 1 to 5. There was no clear correlation between a student’s text annotation and their performance on the reading evaluation.

Some variables did show a correlation; however, the findings were not replicated in both the diagnostic and final assessments, so even a preliminary conclusion would require further investigation. The most statistically significant finding was a negative correlation on the final evaluation between the score on the main idea section of the evaluation and text annotations in a first language, r(11) = -.76, p < .01. The final assessment also showed a statistically significant correlation between the number of vocabulary annotations and the vocabulary score, r(11) = .58, p < .05. Given the small sample size and the lack of an instrument with established reliability and validity, it is not surprising that the findings were not more informative or conclusive. Still, they do point to potential avenues for further study.
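For readers who want to reproduce this kind of analysis outside Excel, a minimal sketch of the calculation follows. It computes the Pearson correlation coefficient and the associated t statistic, which is compared against the critical value for n − 2 degrees of freedom (matching the r(11) notation above for 13 students). The sample data in the example are hypothetical, not the actual class data.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

def t_statistic(r, n):
    """t statistic for testing r against zero; compare to the two-tailed
    critical value for n - 2 degrees of freedom (about 2.201 for
    df = 11 at p = .05, and 3.106 at p = .01)."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

# Hypothetical example: vocabulary annotations vs. vocabulary scores
# for 13 students (illustrative values only).
vocab_annotations = [2, 5, 1, 8, 4, 7, 3, 6, 2, 9, 5, 4, 7]
vocab_scores = [3, 6, 2, 9, 4, 8, 3, 5, 4, 9, 6, 5, 7]
r = pearson_r(vocab_annotations, vocab_scores)
t = t_statistic(r, len(vocab_annotations))
```

As a sanity check against the reported values: an observed r of .58 with 13 students yields t ≈ 2.36, just above the .05 critical value of 2.201, and r of -.76 yields |t| ≈ 3.88, above the .01 critical value, consistent with the significance levels reported.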

As I began to use Genius in my class, I saw the implementation fall short, and I came to realize that the purpose of annotating in Genius was not entirely clear to me—and thus it was at least somewhat opaque for my students. I suggested to them that by co-annotating, the students would together be able to puzzle out a challenging text—and this was true; students indicated that their understanding of “Message to Wall Street” went from an initial estimated 20% to about 50% after annotation, finally reaching 70-80% after answering comprehension questions. Nevertheless, the experience felt like a frustrating episode of the blind leading the blind. The constructive modeling I had hypothesized might take place did not seem to materialize. I had also assumed a correlation between annotation skills on a paper-based text and on Genius, but given the differences in telegraphic and explicit marks on each platform, it is unclear what the correlation might be, if one indeed exists. While using Genius did not detract from students’ learning, it did not appear to substantially add to it, and it did introduce an additional small element of frustration.

As in reading, the purpose of annotating must be clear and authentic. As a student, my primary purpose for annotating has been to aid understanding and retention of ideas in a text. As an avid reader, my primary purpose for annotating is to engage in a personal dialogue with the text. Yet my students, as English language learners (in a non-content-based course), are generally not trying to learn content. And as indicated in the data above, the students never expressed awareness of or interest in dialoguing with a text. Unlike students in a literature course, they are also generally not trying to interpret a text. They are simply trying to understand it. As their teacher, I am trying to move them beyond reading as decoding word by word and help them experience how noticing the structure of the text, transition signals, and patterns within the text can build understanding, and how making connections within and beyond the text can increase retention, foster learning, and spark ideas for writing. Can Genius help get them there? Honestly, I’m still not sure. It seems likely that my students’ cognitive load in reading is already so high that an unwieldy technological interface is more hindrance than help. What is clear is that social annotation does not play the same role in reading as personal annotation, and that social annotation must have a purpose.

If I do use Genius again with my students, the implementation will be much more structured. I will make a much greater effort to provide models for my students of appropriate and inappropriate annotations and to have them read Genius-annotated texts before attempting to post their own annotations. The texts on Genius will also be shorter to aid navigation. Based on student feedback during the Genius class session, I would encourage students to discuss the text face-to-face before or while they post annotations. Perhaps pairs or small groups could discuss the text and mark it through a series of lenses (vocabulary, inference, structure, tone, background information/allusions) to develop their awareness of different aspects of reading. In this project, the students seemed to feel constrained by having to communicate via a computer; they asked if they could talk aloud with their classmates about the text, and when I said no (so that students would be pushed to engage with all their classmates rather than a select few), several still had intermittent communications in whispers. Though I initially dismissed the idea of having my students annotate public texts because of their language and reading skills, perhaps annotating a text about foreigners for an American audience would be a worthwhile and purposeful integrated reading/writing project. However it is implemented, it has become clear that using Genius in the classroom requires a substantial investment of class time, and I will have to more seriously consider whether the payoff will be worth it.

One frustration in this project has been the lack of resources geared toward my teaching context and the lack of awareness in the education community about the needs of English language learners (despite high immigrant populations around the country in all levels of schooling). Genius has an active and enthusiastic “Educator” group, mostly made up of secondary school and college English literature teachers. Though I predicted there might be some differences and some challenges in using this technology in an ESL context, advice from other Educators fell short. For instance, I received multiple suggestions to refer my students to “A Students’ Guide to Genius” for tips about how to annotate a text. This guide would need to be heavily scaffolded to be accessible for my students, and it discourages practices like paraphrasing a text or asking questions, which I generally encourage my students to do.

In interactions with the Genius community, the gap between research, teachers, and resources has also become more evident. I wrote to both the “Education Czar” at Genius and on the Genius Educator forum to request references linking annotation and reading comprehension. Though teachers praise Genius for facilitating reading, these teachers—like me—are basing their methods on ‘logical’ assumptions. The resources they referred me to were mostly teacher tips on how to read closely. Given the amount of money invested in Genius and the enthusiasm for using it for learning purposes, the lack of research-based support available within this community is a bit astounding.

In terms of my future teaching, this project did help highlight the extent to which my students are over-focusing on minor details. In the Genius texts, half of the annotations were related to vocabulary. In the paper-based texts, students marked details, including names and numbers, excessively. These findings suggest that I need to emphasize to students that, despite having a system for marking details, not all details need to or should be marked. I also may need to reconsider whether the difficulty of the texts is appropriate for the students I teach at the high-intermediate level. On the final paper-based reading evaluation, I would assess only two students as having effective annotations, while four had completely ineffective notes and the rest fell somewhere in between. Fewer than a handful of annotations suggested a student might be engaged in reasoning or in making their own connections within or beyond the text—which is what I hope to inspire students to begin to do. The lingering question for me is how to begin to make this kind of engagement feel authentic, and whether Genius can play a role.

I have no plans to share this work at the current time. I did participate in a Google Hangout on October 15, 2014 with several high school and college English teachers to discuss our experiences using Genius in the classroom.

I do plan to create a bibliography—perhaps an annotated bibliography—of research on social annotation and on annotation and close reading for distribution to Educators. (The references, which I have collected in Zotero, began with a lit review conducted by a CNDLS graduate assistant. Following the trail of those four original sources, I have generated a focused reference list of approximately 60 sources on this topic.)