The content of courses and the methods by which students learn are crucial in teaching the life sciences (NRC 1999, NRC 2003). Skills in data analysis and graph interpretation are particularly critical, not only in training future scientists (Mathewson 1999) but also for all students. As members of the general public, all students must make informed decisions about scientific issues and controversies (von Roten 2006). However, graph presentation and interpretation are difficult skills that require cognitive steps that may be new to college students (Preece and Janvier 1992; Bowen et al 1999; Roth et al 1999; Bowen and Roth 2002; Roth 2004; Roth and McGinn 1997). Faculty teaching ecology and environmental courses should assess whether our courses are improving critical skills such as graph interpretation and should evaluate the most effective practices (D'Avanzo 2000; 2003a; Handelsman et al. 2004). In this study, we assessed changes in graph interpretation skills demonstrated by undergraduate students in our courses at four colleges.
Our study had two goals. The first was to use a variety of quantitative materials to train students to interpret ecological data. We developed analytical and graphing exercises to improve analytical skills, and we integrated these exercises into lectures and labs. The exercises were adapted from the ESA's electronic publication, Teaching Issues and Experiments in Ecology (TIEE). TIEE provides teachers with case studies, graphs, data sets, and essays that encourage active learning and impart a greater understanding of the science behind ecology. We developed exercises that would engage and challenge students with the material through student-active learning and other strategies demonstrated to be effective for teaching difficult content and scientific skills (Ebert-May and Brewer 1997; McNeal and D'Avanzo 1997; D'Avanzo 2003a,b; Brewer 2004; Handelsman et al. 2004). Our exercises required students to interpret scatterplots, line graphs and bar graphs, and to produce their own graphs from data. Several of these exercises are appended as tools for faculty to adopt in their own courses (see Resources).
Our second goal was to develop assessment tools to measure students' abilities to create and interpret graphical information. At the beginning, middle, and end of our courses we tested students' analytical skills in order to assess the impacts of our teaching and to reveal which skills were most challenging to our students. Our study was not designed to assess the effectiveness of any particular teaching method we used (lectures, labs, or analytical exercises), but rather the effectiveness of each course as a whole. As such, our study provides tools and recommendations for outcomes assessment, which is increasingly required by state and regional accrediting agencies. Despite extensive experience doing research, most ecologists have little background in educational research and assessment of their teaching (D'Avanzo 2003a,b). Such assessment, however, is an important first step to improve the quality of our teaching and to develop more scientific approaches to teaching (D'Avanzo 2000; 2003a; Handelsman et al. 2004). An example assessment tool is appended (see Pre-Post Test in Resources).
Most previous work on graph interpretation has focused on middle and secondary students (reviewed in Phillips 1997). Our assessment research contributes to the field of pedagogical research by adding to the few studies that have addressed analytical skills at the tertiary level of education (Bowen et al 1999; Bowen and Roth 2002). By assessing large populations of undergraduates from two different student populations (science majors and nonmajors) at four different institutions, we can draw general conclusions about analytical skills and methods of teaching these skills at this level.
We assessed skills and progress of 240 students at four institutions: Fitchburg State College (MA), Georgia College & State University (GA), Rider University (NJ) and Westfield State College (MA). Most students tested (66%) were nonscience majors in introductory Environmental Science or Life Science courses, and the remainder (33%) were science majors in introductory Ecology courses (Table 1).
| Institution | Instructor | Course (year) | Students Tested | Class Size | Analytical Exercise Topics | Lab? | Assessment Tool |
|---|---|---|---|---|---|---|---|
| Fitchburg State College | Picone | Environmental Science (2005, 2006) | 8 majors, 22 nonmajors | 14-16 | | yes | Pre- and posttest with identical questions; in 2006, also used exam questions every few weeks |
| Fitchburg State College | Picone | Ecology (2005, 2006) | 56 majors | 26-31 | | yes | Pre- and posttest with identical questions; in 2006, also used exam questions every few weeks |
| Georgia College & State University | Rhode | Environmental Science (2005) | 45 nonmajors | 45 | | yes | Pre- and posttest with different questions |
| Georgia College & State University | Rhode | Ecology (2005) | 24 majors | 24 | | yes | Pre- and posttest with identical questions; in 2006, also used exam questions every few weeks |
| Rider University | Hyatt | Life Science (2005) | 50 nonmajors | 50 | | no | Questions at beginning and every few weeks on exams |
| Westfield State College | Parshall | Environmental Science (2005) | 48 nonmajors | 48 | | yes | Pre- and posttest with identical questions |
Each investigator used several strategies to teach analytical and graphing skills. First, we began with a single lecture or lab that provided background on interpreting and creating graphs. While we each developed this background material independently, it was based on the "Step-One, Step-Two" strategy (TIEE 2005). In step-one, students describe how the graph is set up: the variables, axes, legend, and patterns in the data. In step-two, students interpret the graph and the relationships among variables. An example handout from this presentation is appended (see How To Read A Graph in Resources).
Second, we created exercises in which students interpreted data and graphs as a means to learn course content. We included graphs and data sets available from the TIEE site, supplemented with graphs from primary literature. Because our courses covered different content, we did not use identical exercises, although some exercises were shared among two or three investigators (Table 1). Example exercises from four topics are appended (see Examples in Resources). Exercises were presented every few weeks when appropriate, given the schedule of lecture and lab topics. Most exercises occupied only 20-30 minutes within a single lecture or lab, while a few required a 2-3 hour lab period, and a few were assigned as homework. Exercises were designed as small-group, collaborative activities in which students presented their work orally in class or as a written assignment. Students received oral and written feedback during class discussions and assignments. In addition to these exercises, every week's lectures included graphs to reinforce principles covered in both the background material and analytical exercises.
Five of the six courses in this study also included a lab (Table 1). In most labs, students created graphs from raw data, including data the students collected. Skills included generating scatterplots and bar graphs of means with error bars, and most importantly, interpreting the trends to test hypotheses. To improve understanding, we required students to first plan their graphs by sketching them out by hand before plotting the data with Microsoft Excel.
To assess whether our courses improved students' skills, we compared responses to test questions before, during, and after each course. Three investigators emphasized pre- and posttests (see Pre-Post Test in Resources for an example). Two of these researchers used pre- and posttests with identical questions, and one changed the questions in the posttest (Table 1). The fourth researcher monitored skills throughout the course with a pre-course survey and analytical questions incorporated into course exams every few weeks. Because we used different assessment strategies and may have worked with different types of students, we analyzed the results from each researcher separately.
Despite differences in testing design, we generally assessed similar skills in our students:
We developed rubrics to determine whether answers in posttests could be categorized as "Improved," "No change, satisfactory," "No change, unsatisfactory," or "Worsened" compared to the pretest. The rubric depended on the skill assessed and the test question. Specific rubrics are provided with their corresponding test questions in the Results.
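The four rubric categories can be illustrated with a minimal sketch. It assumes, for illustration only, that a grader has already scored each pretest and posttest answer as satisfactory or unsatisfactory; the actual rubrics varied by skill and question and were more detailed than this. The function names and the example scores are hypothetical and are not data from the study.

```python
from collections import Counter

def categorize(pre_ok: bool, post_ok: bool) -> str:
    """Map one student's pre/post scores to a rubric category.

    Assumption: each answer was already judged satisfactory (True)
    or unsatisfactory (False) by a grader using a question-specific rubric.
    """
    if post_ok and not pre_ok:
        return "Improved"
    if post_ok and pre_ok:
        return "No change, satisfactory"
    if not post_ok and not pre_ok:
        return "No change, unsatisfactory"
    return "Worsened"

def tally(pairs):
    """Count how many students fall into each category."""
    return Counter(categorize(pre, post) for pre, post in pairs)

# Invented (pretest, posttest) scores for five students; not study data.
example_pairs = [
    (False, True),
    (True, True),
    (False, False),
    (True, False),
    (False, True),
]
example_tally = tally(example_pairs)

for category, count in example_tally.items():
    print(f"{category}: {count}")
```

Tallying categories per question in this way gives the counts needed for a simple summary figure of improvement by skill.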
At all four institutions our courses and exercises improved students' abilities to interpret graphs (Figure 1). Students were presented graphs and asked to explain the patterns among variables. Test questions were either open-ended (short-answer) or multiple-choice (e.g., see Example #1 in Pre-Post Test in Resources). The percent of correct answers varied with the complexity of the graph and with the school or instructor (Figure 1). Prior to our courses, only 25-60 percent of students could correctly describe the patterns among variables in a graph (Figure 1). For instance, students' descriptions often omitted trends in a complex graph, or they used imprecise language to describe trends (e.g., "this graph describes effects of…", "the variables are related," or "the variables are linear"). Sometimes students confused cause and effect, or indicated poor understanding of the figure. After our courses, 75-90 percent of students at each institution were proficient in interpreting graphs (Figure 1). Students were more thorough in their descriptions, and they used more precise language (e.g., "nitrogen and phosphorus are positively correlated"). Their descriptions indicated they had increased their understanding of the ecology depicted in the graphs.
Our courses also improved students' ability to create graphs, and therefore interpret data. In one example, students were presented with data that should be summarized as a scatterplot (Example #4 in Pre-Post Test). By the end of each course, more than 75 percent of students could create a proper scatterplot, with the axes correctly placed and labeled, and with accurate descriptions of trends (Figure 2). The number of proficient students increased 35-45 percent compared to the pretest. To assess skills in making bar graphs, students at Fitchburg State were also asked to plot categorical data (Example #3 in Pre-Post Test). Almost 50 percent of students improved in this basic skill (Figure 3).
Our results also indicated several areas where most undergraduates continued to struggle despite our lectures, labs and exercises. First, we tested for both superficial and deeper understandings of independent and dependent variables. This concept may be important for students to understand experimental design and to interpret data. Our students could easily identify independent and dependent variables in simple graphs, but not in graphs with more than two variables. For example, when exam questions asked students to identify the independent/dependent variables in simple graphs, 80-90 percent of students answered correctly at Rider University (Figure 4) and at Fitchburg State (N=43; data not presented because they were from a single test). However, when complex graphs included multiple independent or dependent variables, far fewer students were successful. For instance, Example #1 in the Pre-Post Test presents a scatterplot with two dependent variables (nitrogen and phosphorus concentrations) and one independent variable (biomes tested). When the posttest asked students to list all dependent and independent variables in this figure, only 30-40 percent correctly listed and categorized all three variables. Earlier in the semester at Fitchburg State, only a few more students (50-57 percent) had accomplished this task with similarly complex graphs on exams, when the definitions of these variables had been recently learned and were easier to recall. Therefore, this concept seems to have been understood by only half the students and retained by even fewer.
Likewise, half of the students struggled with the following multiple-choice question from the pre- and posttest (see Pre-Post Test in Resources):
In the posttest, only 51 % answered correctly (Figure 5). This represents only a slight improvement from the 43 % who answered correctly in the pretest.
A second area in which undergraduates struggled was the ability to discern general trends amid statistical "noise" in data. Many students believed that any variation in the data resulted from important factors worth emphasizing. In one example, students were presented the number of days of lake ice on Lake Mendota, WI over the last 150 years (see Climate Change in Resources). An especially warm or cold year (outlier) often distracted them from seeing more important, long-term trends. Similarly, most students graphed every data point in a bar graph, rather than summarize the trends with a mean value. In the posttest, students were given categorical data on the number of eggs laid by Daphnia fed from two sources, and they were asked to summarize the pattern with an appropriate graph (Example #3 in Pre-Post Test). The "replicate number" was listed in the first column of data as a distracter. Most students (57 %) plotted the replicate number as the independent variable on the x-axis (Figure 6A), and most (67 %) did not use a mean to summarize the trends (Figure 6B). Similar results were obtained from questions incorporated into course exams (data not presented). These data from bar graphs and scatterplots suggest that our students generally emphasized individual data points rather than overall trends.
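As an illustration of the summary step that most students skipped, the sketch below reduces replicate counts to one mean and one standard error per treatment, which are the values a bar graph with error bars would display. The egg counts, the treatment labels "food A" and "food B", and the function name are invented for this example; they are not the data from the test question.

```python
from math import sqrt
from statistics import mean, stdev

# Invented egg counts, four replicates per food source (not study data).
# The replicate number is only a label, so it is not used as a variable.
eggs = {
    "food A": [12, 15, 11, 14],
    "food B": [20, 18, 23, 19],
}

def summarize(values):
    """Return (mean, standard error of the mean) for one treatment."""
    standard_error = stdev(values) / sqrt(len(values))
    return mean(values), standard_error

# One bar per food source: height = mean, error bar = standard error.
summary = {food: summarize(counts) for food, counts in eggs.items()}

for food, (avg, se) in summary.items():
    print(f"{food}: mean = {avg:.1f} eggs, SE = {se:.2f}")
```

The independent variable on the x-axis is then the food source (two categories), and the dependent variable on the y-axis is the mean number of eggs.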
Finally, students seemed to have difficulty interpreting interactions among variables. To test this skill, we presented a bar graph from an experiment with a 3x3 factorial design (Example #2 in Pre-Post Test). Frog survival was measured in relation to exposure to three predator treatments crossed with three pesticide treatments. Answers were only considered correct ("Improved" or "Satisfactory") if students recognized that, according to the graph, malathion increased frog survival in the presence of beetles, and therefore should not be banned to protect frogs. This required students to recognize the significant interaction between pesticides and predators. Answers were unsatisfactory if they were unclear, confused, or incomplete, including statements such as "pesticides decreased frog populations" or "there is little effect of pesticides," or if students recognized that "malathion killed beetles" while also recommending that it should be banned. In the posttest only 23 of 74 students recognized a likely benefit of malathion, and there was no net improvement in the posttest answers (Figure 7).
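The idea of an interaction can be made concrete with a small sketch: the effect of a pesticide is computed separately within each predator treatment, and the interaction is the difference between those two effects. The example is simplified to a 2x2 subset of the design, and the survival values, treatment labels, and function name are invented for illustration; they are not the values plotted in Example #2.

```python
# Invented mean proportions of frogs surviving (not study data).
# Keys are (pesticide treatment, predator treatment).
survival = {
    ("no pesticide", "no predator"): 0.90,
    ("no pesticide", "beetles"): 0.40,
    ("malathion", "no predator"): 0.85,
    ("malathion", "beetles"): 0.75,
}

def pesticide_effect(predator):
    """Change in survival due to malathion within one predator treatment."""
    return survival[("malathion", predator)] - survival[("no pesticide", predator)]

effect_without_beetles = pesticide_effect("no predator")
effect_with_beetles = pesticide_effect("beetles")

# Interaction contrast: how much the pesticide effect depends on the predator.
interaction = effect_with_beetles - effect_without_beetles

print(f"Malathion effect without beetles: {effect_without_beetles:+.2f}")
print(f"Malathion effect with beetles:    {effect_with_beetles:+.2f}")
print(f"Interaction (difference of effects): {interaction:+.2f}")
```

With these invented numbers the pesticide slightly lowers survival when no predator is present but raises it when beetles are present, so the pesticide's effect cannot be described without naming the predator treatment. That dependence is what an interaction means.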
Our assessment tools revealed some analytical skills that can be taught to undergraduates with relative ease and other areas where students continued to struggle despite our efforts to include extensive data analysis and interpretation in our courses. In posttests, 75-90 % of students were capable of creating and interpreting simple bar graphs, scatterplots and line graphs (Figures 1-3). Success with simple graphs has also been found in studies of middle and secondary school students (e.g., Phillips 1997; Tairab and Khalaf Al-Naqbi 2004).
Our study was designed to determine whether our courses as a whole improved analytical skills, so we cannot compare the relative effectiveness of any particular strategy we used. However, at the end of their courses, students at Fitchburg State were asked to comment if there were any activities, exercises, labs or concepts that helped them with the posttest. All of the strategies we used were praised in their responses. The most commonly cited strategy was the background introduction to graphing (e.g., when to use a line graph vs. a bar graph, and "which axes are which"). Some students cited the graphs we discussed from group exercises and lectures. Others noted the benefits from plotting data from their labs as a way to better design and interpret graphs. Several recalled that using Microsoft Excel helped them, even though "Excel is very frustrating." A few students noted how "everything combined helped" or that "it takes repetition when it comes to understanding graphs."
Although our courses improved some analytical skills, students continued to struggle in several specific areas. First, most students lacked a profound understanding of dependent and independent variables: most could define these variables from simple graphs but not from complex graphs with more than two variables.
We thought that the ability to define and identify independent and dependent variables would be essential to understanding experimental design and the graphs. However, our results suggest that misapplying these terms does not necessarily inhibit general analytical skills. While only 30-40 percent of students were able to identify these variables from a complex graph in the posttest, most (75 %) could clearly describe the relationships among those same variables (Figure 1A). Because our goal was to help students improve broad analytical understanding, and to apply rather than memorize definitions, perhaps their understanding of these variable types was sufficient.
A second area in which students struggled was the ability to distinguish trends in statistically variable or "noisy" data (Clement 1989). In scatterplots, many students emphasized individual variation, failing to discern general trends or perceiving trends where none existed. When plotting categorical data, most students graphed individual data points rather than summarizing trends with means (Figure 6). During the semester we included several lab exercises in which students plotted means from data they had collected, yet most did not seem to internalize these lessons.
Alternatively, the results in Figure 6 may be due to a poorly designed test question rather than poor student skills. Example #3 in the posttest required a bar graph from treatments with only four replicates. In contrast, the bar graphs from lab exercises included treatments with dozens of replicates. If the test question had been more like the data that students had collected and summarized, many more students might have chosen to graph a mean.
In any case, the ability to find patterns amid variable data is a difficult skill that deserves special attention in our courses, particularly because variation is the norm in ecological data. In introductory courses and textbooks, students get little exposure to noisy, highly variable data in graphs (e.g., Roth et al 1999). Ecological data, by contrast, are typically noisy because phenomena are influenced by multiple (and often unpredictable) independent factors such as climate, community interactions, and disturbance history. Moreover, different mechanisms will determine the outcome from these factors depending on the scale of space and time. Undergraduate students need more practice plotting, interpreting, and making predictions from such complexity (Brewer and Gross 2003). Ecology and environmental science courses, perhaps more than other areas of biology or the physical sciences, provide valuable opportunities to practice working with these kinds of data sets.
When students are learning to plot data, computer-based graphing programs must be used carefully to avoid interfering with learning. Software used for graphing, such as Microsoft Excel, can reinforce students' misperceptions about plotting data, or worse, allow them to produce meaningless graphs. We recommend that students first sketch by hand a basic format of their graph before plotting any data on a computer (Roth and Bowen 2006). Quick sketches are sufficient to determine: 1) what type of graph is appropriate (scatterplot, bar graph, etc.), 2) how the data should be organized (as means, with legends, etc.), and 3) how the axes should be placed and labeled. This simple method forces students to think actively about what message they want to convey from the data, rather than passively allowing the computer to produce a graph for them or following a lab manual's step-by-step instructions. Anecdotally, we found that students understood graphing principles better when they started with a quick, hand-drawn sketch. Moreover, sketching a graph from their hypotheses or predictions is useful even before they collect data, and may improve experimental design in inquiry-based labs.
A third analytical skill in which we saw little improvement was the ability to interpret interactions among variables. Students were presented a bar graph of frog survival with interactions between pesticide and predator treatments (Example #2 in Pre-Post Test). The interaction was interpreted correctly if the student recognized that one pesticide (malathion) should not be banned to protect frogs. In the posttest, only 31 % of students answered correctly (Figure 7); however, incorrect answers to this question might have been influenced by content knowledge, rather than a lack of analytical skills. Perhaps answers were confounded by knowledge of the typical negative effects that pesticides have on amphibian survival (e.g., Relyea 2005), which was discussed earlier in the course. Alternatively, perhaps students did not understand that beetle predators are typical in environments with amphibians. This example may illustrate how graph interpretation, even of simple figures, is greatly influenced by the experience and context that are familiar to the viewer (Preece and Janvier 1992; Phillips 1997; Bowen et al 1999; Roth 2004). Moreover, interactions can be difficult to interpret for scientists at any level, so the fact that undergraduates showed little improvement is neither surprising nor discouraging.
Assessment of student skills and content knowledge is an increasingly common requirement of college accreditation, and an important component of "scientific teaching" to discern effective practices (D'Avanzo 2000; 2003a; Handelsman et al 2004). Like most ecologists (D'Avanzo 2003a,b), we were new to course assessment and educational research when we began this study. We learned from some mistakes in our strategies and assessment tools, and from them we developed the following advice for others beginning to study their own teaching of analytical skills.
Our test instruments (e.g., Pre-Post Test) covered a wide range of analytical skills (see Methods). This "shotgun" approach was useful at the beginning of our study to reveal students' strengths and weaknesses that we would not have predicted a priori. However, this approach quickly generated a large quantity of different assessment materials, and it was difficult to transform that material into useful data regarding student learning. Moreover, long assessment tests can be tiring and annoying for students, especially when they do not count towards a grade. Therefore, shorter assessments that focus on only a few (≤ 3) questions or skills may be more practical for pedagogical researchers, classroom instructors, and students alike. In addition, shorter tests are easily incorporated into mid-course assessments.
Our long list of assessed skills emphasized ecological data with realistic variation plotted as scatterplots or bar graphs with error bars. However, ecology courses also feature line graphs to demonstrate models about fundamental concepts such as population size, growth rates, and diversity relationships. Because line graphs are among the most difficult for students to grasp (Weintraub 1967; Berg and Phillips 1994), they provide another important source of data for assessing development of analytical skills.
In addition to selecting a short list of skills to assess, researchers should carefully choose questions that have something quantifiable in the answers or some objective means to determine whether students improved. Focusing on objective answers will save the researcher time. However, easily assessed answers might come at a cost in accuracy. As questions become easier to score and less open-ended (such as multiple-choice), answers might not reflect student progress and understanding (Berg and Smith 1994). Students may get the correct answer for the wrong reasons or get an incorrect answer through a sophisticated interpretation. The reasons for student mistakes are more easily discerned with free-response questions. If open-ended questions are used, then clear rubrics are needed, and they should be coordinated among all researchers collaborating on a study.
Pre- and posttests are useful for several reasons. The same questions can be used in both tests, making it easier to compare answers to assess whether students have improved. Because we placed our posttest at the end of the semester, students were able to draw upon all of the exercises and experiences from the semester, and the course as a whole was assessed rather than a single exercise. Moreover, the posttest came many weeks after some skills were introduced, and therefore it assessed whether the analytical skills were really understood and retained, rather than simply repeated from short-term memory.
However, posttests also present some disadvantages. Students could improve in a posttest simply from increased content knowledge about the topics or context of the data rather than increased analytical skills (Preece and Janvier 1992; Phillips 1997; Bowen et al 1999). Alternatively, saving all of the assessment for one large posttest might reduce student scores because they are simply tired from a long test. Student schedules are especially hectic at the end of the semester, which can further reduce the number and/or quality of responses for posttests, especially when they are not part of the course grade. Therefore, long, end-of-semester posttests probably underestimate skills and progress.
To overcome these drawbacks, data should be collected at intervals throughout a course. Such data can be used to corroborate trends from the pre- and posttests or replace a separate posttest entirely. Assessment questions can be incorporated easily into exercises during lecture and/or graded exams, as was done at Rider University in this study. Most importantly, such data provide formative assessment during a course, and corrections can be made before the semester ends (D'Avanzo 2000; Brewer 2004). One mistake we made in our study was to examine the data only after each course was completed, when it was too late to improve our pedagogy for that group of students.
Besides timing, assessment questions must be carefully developed. For example, should pre- and posttest questions use identical data and graphs, or should the graphs have similar format but different topics? Using identical questions makes it easier to compare answers between tests, but it runs the risk of students remembering questions they had seen in the pretest (or worse, repeating errors because they had practiced making them earlier on that same question). If pretests have different questions from posttests, then the level of difficulty should remain constant. It is tempting to increase the difficulty in subsequent tests, as we often do when testing content knowledge in a course, but increasing the difficulty of assessment questions confounds interpretation of the data.
Finally, the researcher must decide whether (or how) to separate tests of analytical skills from tests of course content. Some of our posttest results were probably confounded by different levels of content knowledge in the students, and these differences could have masked increased analytical skills. For example, most students may not have really misinterpreted the interaction among variables in the bar graph in Example #2 (Pre-Post Test). Perhaps they simply did not realize that predaceous beetles are ubiquitous in freshwater habitats. As much as possible, assessment tools should test analytical skills that are independent of content knowledge. Subjects of graphs should be familiar to all the students, or perhaps described on the test. Even with simple graphs, poor interpretation often results from unfamiliarity with the context or topics in a graph, not from poor analytical skills (Preece and Janvier 1992; Phillips 1997; Bowen and Roth 1998; Bowen et al 1999). Indeed, even professional scientists in different disciplines can interpret the same graph in different ways, based on their different experiences and examples they use as references (Bowen et al. 1999; Roth 2004).