October 14th, 2010

Minda: A strategy for the new examination format

GENERALLY speaking, tests and examinations are a weighty matter, requiring a person to memorise and retain what has been learned and to reproduce it in the examination. Most school tests, whether at primary or secondary level, lean towards serious examinations that require students to rack their brains.

Students focus on scoring as many A+ grades as possible in examinations, which is the hope of parents who want their children to excel. Even so, there is no guarantee that a student who excels in primary and secondary school examinations can sustain that excellence in higher education, particularly at university.

If parents of primary and secondary school children have until now been waiting for the government's decision on whether to continue or replace the Year Six and Form Three examinations, they now have their answer. The form, format and style of testing may also change.


A survey found that many parents were relieved by the announcement by the Deputy Prime Minister, Tan Sri Muhyiddin Yassin, of the government's decision not to continue the Penilaian Menengah Rendah (PMR) examination and to replace it with a new school-based format from 2016.


Meanwhile, the Ujian Pencapaian Sekolah Rendah (UPSR) will continue with improvements to its form and subject composition. The new format for both examinations takes effect in 2016, including a new name for the PMR. This step can strengthen the field of education.


This interval allows the relevant parties to conduct studies and surveys, and gives students time to adjust so that they do not face a shock from the new school-based test or examination format. The change and refinement are in line with society's demand that the school system not over-emphasise examinations. The change has also won the support of many parties, especially parents who have long awaited the government's decision.


The cycle of policy change for the two examinations will take six years. By then the UPSR will be in its new format, and the PMR may take a new form. The rationale, from the students' standpoint, is that abolition would free them from an examination-oriented system of learning. The examinations held all these years have in truth done little for students' future lives, especially in whatever careers they go on to pursue. Yet without examinations, students would not concentrate on teaching and learning (P&P) at school. If examinations were abolished outright, the question is how we would gauge students' ability and achievement in each subject they study. The decision is therefore a relief to all parties, especially parents who still want examinations at Year Six and Form Three. The change is in line with a society that wants a new form of testing. Parents will also be spared burdensome tuition fees, and the stress they suffer when their children fail to excel in the two examinations.


May both examinations in their new format benefit students as a whole for the sake of their future, helping them achieve excellence through education. Moreover, teachers will no longer focus classroom P&P solely on achieving examination excellence.


For teachers, this will also raise the professionalism of teaching: they can exercise high competence and skill and make full use of P&P without being tied to examinations. Teachers can also educate students more effectively across intellectual development, personal potential, and the refinement of character and conduct, without focusing entirely on examinations.


Quality teachers have a great influence on students' achievement and personal development and on the progress of education in our country. A teacher's main duty is to teach in the classroom to produce knowledgeable and highly skilled students.


But if tests and examinations take a new school-based format, there is concern that teachers will be burdened with preparing modules or formatted questions set by the school. Beyond that, can the credibility of school-level assessment be upheld? According to the Western writer Charles Caleb Colton, tests or examinations are a weighty matter even for those who are well prepared, because even the cleverest person can ask harder questions than the wisest person can answer.


Therefore, the policy change for the two examinations deserves detailed study, especially regarding the content of the new tests or the form of the new examinations. Many parents feel that the names of the tests and examinations should also be changed to keep up with the times.


Several important elements, such as matters of intellect and sensibility, character building, and the refinement of morals and conduct, should be incorporated into the new curriculum or into the content of the new tests and examinations. Their implementation should not be as narrowly examination-driven as it has been.


Strengthening education is vital to producing human capital that is knowledgeable, skilled and of praiseworthy character. Shaping people through education requires study and a long time so that they are balanced physically, emotionally, spiritually and intellectually.


The foundation of nation building begins at school, preparing Malaysians to become one united and cooperative people. This matters greatly for a country whose citizens come from many ethnic groups and still practise their own ideologies and cultures.

Beyond examinations, the goals of education must come first: to produce individuals who are loyal to the nation, knowledgeable and faithful, of noble character, competent and living in well-being. Education must then supply the human resources needed for national development and provide employment opportunities for all Malaysian citizens.


The writer is a former educator with a PhD in Malay Letters

Source: Strategi format baru peperiksaan

Wise and Positive Words Oct 14

Uplifting words of Praise and Encouragement

It is hard to fail, but it is worse never to have tried to succeed. - Theodore Roosevelt

We must have a theme, a goal, a purpose in our lives. If you don't know where you're aiming, you don't have a goal. My goal is to live my life in such a way that when I die, someone can say, she cared. - Mary Kay Ash

He who is kind to the poor lends to the LORD, and he will reward him for what he has done.


He who ignores discipline despises himself, but whoever heeds correction gains understanding.

You will never find time for anything. If you want time you must make it. - Charles Buxton

Great Quotes from Great Women!

"Life is like a coin. You can spend it any way you wish, but you only spend it once." ~ Lillian Dickson

We need to find the courage to say NO to the things and people that are not serving us if we want to rediscover ourselves and live our lives with authenticity. ~ Barbara De Angelis

"Sadness flies away on the wings of time." -- Jean de La Fontaine

"Beauty comes in all sizes, not just size 5." ~ Roseanne

School-Based Assessment: East Meet West KL 2005 I

 School-Based Assessment and Assessment for Learning: Concept, Theory and Practice

Dr. Lorayne Dunlop-Robertson, Ontario Institute for Studies in Education, University of Toronto
lorobertson@oise.utoronto.ca


Assessment policy in the province of Ontario, Canada has undergone significant change in the past decade – change that is aimed at literally transforming curriculum and assessment for the approximately two million students in Ontario’s elementary and secondary schools. In a series of sweeping policy changes, the government introduced changes to school finance and governance, while reorganizing the school boards. This shifted the decision making on curriculum and assessment policy to a central authority. This was accompanied by rapid curriculum and assessment policy changes, along with significant changes in the traditional instruments of assessment and evaluation policy implementation, such as revised curriculum guides and new report cards. A range of new assessment policy implementation instruments was introduced also, such as exemplar booklets, and an online curriculum unit planner. This paper seeks to examine the theory behind the direction of the assessment policy changes; to summarize the policy changes as reflected in policy documents; and to reflect on the instruments designed to support the assessment policy changes.

Why Assess? The Changing Purposes of Assessment and Evaluation

Traditionally, student assessment and evaluation information has been collected for the central purpose of communicating the results of student achievement (Marzano, 2000). For decades, the grading, reporting and communicating of student learning has been a key responsibility for teachers (Guskey, 1996). In the province of Ontario, Canada, teachers are required to report student progress to parents or guardians a minimum of three times during the school year. Schools and school districts also use the results of student evaluations for communication purposes when they inform their constituents of the progress of their schools or their districts. A second traditional purpose for student assessment and evaluation has been to “select and sort” students. Based on the results of certain key evaluations, students gain access to various programs or courses, such as entrance to university or college. In Ontario for example, students must pass an exit examination in literacy before they are permitted to graduate from secondary school. Traditionally, assessment results have been norm-referenced, comparing students to one another, and student achievement has typically been presented as a bell curve or “normal distribution” (Marzano, 2000, p. 17).
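To make the norm-referenced idea concrete, here is a minimal sketch in Python of how a raw score can be located within a group's "normal distribution"; the class scores and the normality assumption are illustrative, not drawn from any Ontario data.

from math import erf, sqrt
from statistics import mean, pstdev

def percentile_rank(score, scores):
    # Locate one score in the group's distribution, assuming normality:
    # the cumulative normal probability Phi(z), expressed as a percentage.
    mu, sigma = mean(scores), pstdev(scores)
    z = (score - mu) / sigma
    return 50 * (1 + erf(z / sqrt(2)))

class_scores = [52, 61, 64, 68, 70, 73, 77, 81, 88]
for s in (61, 73, 88):
    print(f"raw {s}: about the {percentile_rank(s, class_scores):.0f}th percentile")

The point of the sketch is that a norm-referenced result says only how a student compares with classmates; the same raw score earns a different percentile in a different group.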

Within the last decade, however, a strong case has been put forth to broaden the purposes of student assessment and evaluation beyond the traditional ones of reporting and sorting (Marzano, 2000; McMillan, 2004; Shepard, 2000; Stiggins, 1994; Wiggins, 1998). Educators are realizing that student assessment can serve other purposes such as improving student learning, improving teaching effectiveness, and increasing the levels of student engagement with the material. Assessment and evaluation strategies have the potential to be teaching strategies also – another means to educate and to help students understand. Student assessment and evaluation tasks can be used also to support more effective planning toward meeting the learning outcomes of courses or units of study. More recently, it has also been suggested that assessment and evaluation strategies can be used to engage students more deeply in their learning. In this next section, these three broader purposes for student assessment and evaluation are explored.

Wiggins (1998) sees that student assessment can be “educative”. He advocates that student assessment can be used for purposes of educating and for improving student learning, rather than solely for the purpose of reporting. This view of assessment theory can be illustrated in a simple way with two scenarios. In one classroom, students are given a multiple choice test. The test questions are secret because the teacher wants to ensure validity and fairness in the test. The teacher has built the test from item banks of questions that have been provided by the district. The teacher administers the test, scores it, and communicates the results with precision and a degree of certainty that the test has been rigorous. In this scenario, however, the test itself cannot be used as a learning tool, because the answers are carefully guarded to be used another year. The students do not know which questions they answered incorrectly or how their thinking “went wrong”. The teacher is clearly communicating the results to the students, but the results are not a learning tool that they can use effectively to improve. The teacher assigns a mark or a letter grade which could potentially be a motivator to students, but the letter grade or percentage itself does not improve student learning.

Contrast this with a second scenario where the teacher uses a more authentic, performance-based assessment. In this second scenario, the students are required to produce a product – a letter to the editor, which is one of the expectations or outcomes for their grade. The criteria for the assignment and the scoring of the assignment are posted. The teacher provides some models of the task from a booklet of samples of student work that has been provided by the district. The students complete the assignment, and then write a self-assessment of their work on this task. The self-assessment is a key task because the students analyze and explain what they understood and did not understand about the task. The teacher grades the assignments, writes feedback to the students, and reports on the results of the evaluation. This time, the students have some key information – focused, personalized feedback - that will help them to improve. The teacher is using the “test” for multiple purposes: to communicate student progress, to give students feedback for improvement, and to increase students’ understanding of both the subject and the criteria for quality work. The test has become a learning tool, giving students focused feedback to assist them in their learning.

In this second scenario, the assessment task can have further applications if the teacher uses it also to analyze the effectiveness of the teaching. Through an analysis of the errors and strengths of the students’ work, the teacher decides what lessons need to be reinforced or perhaps even taught again in a different way. The assessment has a second purpose – to guide program decisions and to help with planning.

Assessment for planning is perhaps best illustrated through the use of a circle to describe the cycle of planning learning. Picture a teacher who is receiving a new class of thirty-five students at the start of a school year. While she knows that the learning plan for the year must be based on the expectations or outcomes of the curriculum, she does not know the skills of the students on entry to her class. Experience tells her that there will be variations in reading ability and mathematics acquisition that will span several grades. Some of the students will be strong in number skills while others will be strong in the social sciences. It would be a waste of time to re-teach material that students already know, and it would also be a waste of time to introduce topics before students are ready, so the teacher undertakes some diagnostic assessment. This is not complex assessment, but a series of simple assessments designed by the teacher herself, to gauge the prior knowledge of the students. She is also looking for indicators of how the students learn, such as their reading ability; their writing ability; their ability to concentrate; and their ability to listen to instructions and process them (Sutton, 1995). The teacher is also checking the students’ skills in mathematics. Applying this information helps the teacher to select reading resources that her students can grasp, and she has some idea of where to begin in mathematics. She learns which students can handle a significant quantity of printed material on a page, and which ones will need support with written materials.

The teacher uses diagnostic assessment numerous other times during the year whenever she wants to determine students’ prior learning relative to the learning outcomes for history or science or other subjects. With so many outcomes to be taught for the subjects in the grade she is teaching, she uses diagnostic assessment for several purposes:

• To avoid repetition of previously-learned material,
• To determine connections with other subjects and prior learning,
• To plan learning that is an attainable “next step” for the students.

It is important that the learning should be a challenge to the student (not a repetition) and still attainable. Sutton (1995) refers to this intended or planned learning as the learning that is within the “extended grasp” of a student (p.22).

After the teacher has diagnosed the prior learning of the class, she plans a portion of their learning in a subject. This could be a unit, a topic or a module. The teacher asks a key assessment question, “What will students know and understand as a result of the learning in this unit?” She lists the learning outcomes for the unit, and designs a summative assessment task that will allow students to demonstrate what they have learned. This summative task or culminating task is designed to be as authentic a task as possible. The task reflects real life and is engaging or interesting to the students. The task has a defined purpose, and is generally rich or complex in its design. The task is one that is “worthy” of the efforts of the students. The teacher has decided on the intended end product of the lesson. Deciding this, she then begins to plan the unit and the daily lessons, using the summative task as her guidepost.

This is the opposite of an approach to planning that begins with the textbook or with the learning activities. Instead, the first consideration is the end product – evidence of the outcomes of student learning relative to an established standard. Wiggins and McTighe (1998) refer to this process as “backward design” (p. 8) because the teacher is going about this in the opposite way to the traditional one. The teacher is considering first what can be accepted as evidence that the students have learned and have understood the learning outcomes. Wiggins and McTighe refer to the teacher as the designer who undertakes three steps: the identification of the desired results, the determination of the acceptable evidence of learning, and the planning of the learning experiences and instruction (p. 9). In order for teachers to accomplish this, Wiggins and McTighe encourage teachers to decide which of the curriculum outcomes are worth being familiar with, which ones represent important knowledge, and which curriculum outcomes represent “enduring understandings” about the subject or topic – or the knowledge and skills that are at the heart of the subject discipline. The final assessment is designed to allow students to provide evidence that they have grasped these enduring understandings (p. 13).

As the lessons in the unit are taught each day, the teacher uses a third form of assessment, formative assessment, to determine whether or not the students have understood and grasped the material of the day’s lesson. In this case, the teacher is using formative assessment for two purposes: to provide ongoing feedback to the student, and to “inform” the teaching of the next lesson. Sutton (1995) refers to this process as “feed forward to the next learning task” (p. 66). This ongoing form of assessment ensures that the teacher does not go on to the next lesson until she has determined whether or not the students have mastered the learning outcomes of the previous lesson. This ongoing assessment can be time-consuming, or it may take the form of a quick homework check, or an analysis of the students’ application of the lesson through written exercises. What is key is that the assessment serves a useful purpose for student learning, which is a higher form of accountability to students than completing the daily assessments in order to have a mark in a markbook. The goal is to provide clear feedback that assists the student toward the attainment of the learning outcome (Sutton, 1995). In order to do this, the feedback must be given to students in a timely way, and must be in a format that is meaningful to the students. Hattie (1992), as cited in Marzano (2000), finds in a review of 7,827 studies of education that “accurate feedback to students can increase their level of knowledge and understanding by 37 percentile points” (p. 25). Assigning a grade with an explanation of the strengths, weaknesses, and next steps is meaningful formative assessment feedback.
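The “37 percentile points” figure can be read through the normal curve: moving an average student up by an effect size of d standard deviations lands them at the cumulative probability Phi(d). The short Python sketch below shows that arithmetic under that reading; it is our illustration, not Hattie's own computation.

from math import erf, sqrt

def percentile_gain(d):
    # Percentile points gained by a student starting at the 50th
    # percentile who improves by d standard deviations.
    return 100 * 0.5 * (1 + erf(d / sqrt(2))) - 50

print(round(percentile_gain(1.13), 1))  # about 37 percentile points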

The teacher assigns the planned summative assessment in order to capture the extent to which students have grasped the material. She explains the intended outcomes of the summative assessment or culminating task, along with descriptors of the levels of quality in the completed work. During the completion of the summative task, she may organize ongoing formative assessment for learning, in the form of student self-assessment and peer assessments. Two of the implications of this process are that the teacher may “marginally reduce the quantity of the teaching in the interests of the quality of the learning”; and students may need explicit training in assessing against given criteria (Sutton, 1995, p. 69). The teacher uses the results of the summative assessment for purposes of reporting to parents, reporting to the teacher at the next level, and for planning the next unit (the feedforward application). The concept map below illustrates the concept of this broader purpose of assessment: the application of assessment in planning for student learning.

[Concept map omitted: assessment in planning for student learning (Dunlop-Robertson, 2005)]

A third broader purpose of assessment for learning is that assessment can be used to deepen student engagement with the learning material. Wiggins (1998) states that, “We sacrifice our aims and our children’s intellectual needs when we test what is easy to test rather than the complex and rich tasks that we value in our classrooms and that are at the heart of our curriculum. That is, we sacrifice information about what we truly want to assess and settle for score accuracy and efficiency. That sacrifice is possible only when all of us misunderstand the role assessment plays in learning. In other words, the greatest impediment to achieving the vision described is not standardized testing. Rather, the problem is the reverse: we use the tests we do because we persist in thinking of the assessment as not germane to learning and therefore best done expediently” (p. 7).

Wiggins advocates that once assessment is seen to be educative, it becomes a “major, essential, and integrated part of teaching and learning.” (p. 8). He encourages an examination of current testing practices to move toward the view of curriculum as a set of performance tests of mastery of key outcomes. For example, a test for a driver’s permit requires a performance. Teacher evaluation is based on performance. Many student skills can be demonstrated well only through a performance (such as playing an instrument, or demonstrating skills in physical education). He encourages teachers to make the performance or demonstration of the learning as “adult-like” as possible – stating that traditional tests may engage students’ attention but they do not engage students’ respect, passion and persistence (p. 16). The key to providing tasks that engage students is to use authentic forms of assessment.

Authenticity in assessment involves providing assessment tasks that have a purpose. These tasks mimic real-life and real-world applications of knowledge at a high level of intellectual skill and performance. They are tasks that students find to be engaging because they can see that the content is relevant to them for life-long learning. The tasks are generally complex. Authentic tasks involve application and synthesis and other forms of higher learning (Bloom, 1956, as cited in Wiggins, 1998). While there is generally only one right answer in a traditional test, in an authentic task, the result is a quality product or performance that differs from student to student, but the indicators of quality do not change. The scoring for quality in an authentic task is made clear from the outset, and the feedback from the task is designed to provide students with next steps to consider in their learning. McMillan (2004) cites research by Brookhart (1997) finding that,


Recent research on motivation suggests that teachers must constantly assess students and provide feedback that is informative. By providing specific and meaningful feedback to students and encouraging them to regulate their own learning, teachers encourage students to enhance their sense of self-efficacy and self-confidence, important determinants of motivation. (p. 12).

An authentic assessment task also has validity; in other words, the assessment task assesses what it purports to assess. For example, asking students to write an explanation of how a microscope works could be considered more an assessment of writing, than of the actual performance of correct use of a microscope. The authentic assessment task should not limit the student by its design. A valid authentic assessment task is one that allows the student to demonstrate what he or she knows, can do, and understands. Another key criterion of authentic assessment that has not, as yet, been addressed is that the authentic assessment task must be feasible, given the expected workload of the students and their teachers. In summary, authentic assessment tasks engage students for the following reasons: they are purposeful and linked to real life; they are individualized or closer to the student as a person; they allow the student to demonstrate understandings and a grasp of the knowledge and skills; and they provide focused feedback for improving student learning.

These recent advances in classroom assessment and evaluation theory are summarized by McMillan (2004). Traditional assessment of outcomes (isolated skills and facts) has been replaced by assessments that have integrated outcomes and applications of knowledge. The assessment tasks are more authentic and contextualized. The standards are no longer secret but public. Assessment and evaluation no longer occur after the instruction but during the instruction, and considerable feedback is provided to the students. Single assessments have been replaced by multiple assessments. In other words, assessment of learning is being replaced by assessment for learning.

In an era of increased accountability for student learning relative to agreed-upon international standards, authentic assessment as described in this paper appears to be working against some long-held beliefs about objectivity, fairness, reliability and validity in student evaluation. Shepard (2000) explains how earlier assessment theory was based on theories of motivation, theories of cognitive development and theories of scientific measurement. Many teachers continue to believe that tests must be uniformly administered to ensure fairness and objectivity. Shepard suggests that a reconceptualization of assessment theory is needed to match new conceptions about teaching and learning. She argues that new forms of assessment are needed “to be compatible with and to support” the social-constructivist view of learning that has been advocated by key theorists such as Vygotsky (1978) because fixed theories of intelligence have been replaced “with new understanding that cognitive abilities are developed through socially supported interactions” (p.7).

Stiggins (2002) also addresses the changing landscape of assessment theory. He finds that the assessment landscape in the United States in the past fifty years has led to the clearer articulation of higher assessment standards, more rigorous assessment for those standards and increased accountability on the part of educators. He sees a flaw, however, in the “belief in the power of accountability-oriented standardized tests to drive school improvement” (p. 762). The flaw is that only some of the students are motivated to higher excellence by the high-stakes testing. The testing is having the opposite effect on the motivation of many other students. They are becoming discouraged learners in the face of the intimidation of the tests, and assessment policies do not seem to accommodate this concern. He advocates for a more powerful vision where assessment for learning and the assessment of learning are both important (p. 762). In order for this change to take place, he advocates that teachers need the assessment tools to accomplish this task.

Research has demonstrated that improving classroom assessment – assessment for learning – can have a strong impact on student achievement. Bloom (1984), as cited in Stiggins (2002), demonstrates that changing the classroom instructional environment (and one of the changes was assessment for learning) could produce “differences from one to two standard deviations in student achievement attributable to the differences between experimental and control conditions” (p. 763). In a review of literature in 1998, Black and Wiliam determine that improving classroom assessment can raise standards, and they cite effects of one-half to one standard deviation. More importantly, they found that improving classroom assessment advantages the lower achievers while raising the overall standards. They argue that “… standards can be raised only by changes that are put into direct effect by teachers and pupils in classrooms. There is a body of firm evidence that formative assessment is an essential component of classroom work and that its development can raise standards of achievement. We know of no other way of raising standards for which such a strong prima facie case can be made” (p. 143).

In summary, there is a theoretical and research basis that points toward the usefulness of a broader set of purposes for student assessment. In the section that follows, changes in the Ontario assessment and evaluation policies and instruments are described relative to these theoretical constructs.



Ontario Education: Curriculum and Assessment Policy Changes

In 1995, a Conservative government with an agenda of sweeping educational reform was elected in Ontario, Canada. For the next five years, the reforms to the curriculum and assessment policy continued until there was virtually a complete reform of curricula for all of the grades in the school system, ending with the publication of a new Grade 12 curriculum in 2001. In published news releases, the Ministry linked some of the changes to an earlier provincial consultation report, For the Love of Learning (Queen’s Press, 1994) while stating that other changes were based on a stated need to show fiscal responsibility while improving the quality of the school system.

One of the earliest reforms was the establishment of both a testing program and an “arms-length” agency of the government, the Education Quality and Accountability Office or EQAO (Queen’s Park, November 1995). At this time, there were no census assessments of Ontario students, and there were no exit examinations for secondary school. According to the press releases, the EQAO was designed to respond to the public’s demand for closer scrutiny and greater accountability. EQAO introduced a system of testing for all students in Grades 3 and 6 in Language and Mathematics, and for all students in Grade 9 in Mathematics. The test instruments are a combination of multiple-choice items and essay items. The result for the individual student is a Level from 1 to 4 reported for Language and for Mathematics. School results and district results are published.

The Ministry introduced a secondary school graduation requirement – a literacy test for students in Grade 10. This test is a performance-based literacy assessment, and the results are reported to individual students as either a successful pass or unsuccessful. The school results and the district results are published. Students who are not successful in the test are encouraged to take remedial courses during their remaining years in secondary school.

Commencing in 1997, the Ministry of Education introduced sweeping changes to its curriculum, commencing with new policy documents in Language and Mathematics for the elementary schools. Prior to this time, the published elementary curriculum policy “The Formative Years: 1967” had remained essentially unchanged for thirty years. This earlier document did not have grade-specific outcomes. Many individual school districts had developed their own grade-specific curriculum outcomes and established their own systems for assessment, evaluation and reporting of student grades. The new curriculum was intended to bring consistency across the province. At the same time, the government announced that the province’s school boards would be re-organized for efficiency. The following year, the Ministry of Education reduced the 129 major school boards to sixty-six new district school boards. The first task of the newly-reorganized school boards was to implement new elementary curriculum and assessment policies, working under a reduced funding model.

By 1998, new elementary curriculum was introduced for all of the subjects in elementary schools. The new curricula included grade-specific learning outcomes organized into strands. An entirely new element was also introduced at this time, intended to assist teachers with the assessment of student performance: “The Levels of Achievement Chart.” This chart is explained in the following way in The Ontario Curriculum Grades 1-8: Language 1997,

The achievement levels are brief descriptions of four possible levels of student achievement… (p. 5) A student will be assessed on how well he or she reasons, communicates, organizes ideas and applies language conventions. For each of these categories, there are four levels of achievement. These levels contain brief descriptions of degrees of achievement on which teachers will base their assessment of children’s work. (p. 8)

The introduction of the levels of achievement charts appears to have been an attempt to meet two goals: to bring greater consistency to student assessment across the province, and to broaden the levels of cognitive development at which students in the province were assessed. Teachers were to judge their assessment of student performance, not just on knowledge, but on the student’s demonstrated ability to reason, communicate, organize ideas, and to apply the skills. This was a key change for both curriculum and assessment.

In 1999, the Ministry published revised secondary school curricula for Grade 9, followed by new curriculum in each of the subsequent years for Grades 10 through 12. These curriculum documents also present grade- and course-specific learning expectations and achievement level charts for all of the subjects in secondary school. There is one difference: in the secondary school curriculum, the levels of achievement charts are more consistent from subject to subject. Secondary students are assessed across the four categories of knowledge and skills:

• Knowledge / Understanding
• Thinking / Inquiry
• Communication
• Application / Making Connections

Again, the outcome of this change is the requirement for teachers to assess student learning above the level of knowledge acquisition. The Ministry of Education also published a policy document, Program Planning and Assessment (2000). In this document, the requirements for assessment are prescriptive. In order to ensure validity and reliability, teachers are advised to conduct assessments over a period of time that are varied in nature and “designed to provide opportunities for students to demonstrate the full range of the learning” (p. 13). Teachers are advised to give students clear directions for improvement and to use samples of student work to provide evidence to substantiate marks assigned for student achievement. In this policy statement, the final grade for the course is to be determined in the following way: seventy percent of the final grade is to be based on evaluations throughout the course, and thirty percent of the grade is to be based on a final evaluation that may be “an examination, performance, essay and/or other method of evaluation suitable to the course content and administered toward the end of the course” (p. 15). In conclusion, the secondary assessment policy states that “In all of their courses, students must be provided with numerous and varied opportunities to demonstrate the full extent of their achievement of the curriculum expectations, across all four categories of knowledge and skills” (p. 15).
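A small sketch of the prescribed 70/30 weighting may help; the marks below are invented for illustration.

def ontario_final_grade(term_marks, final_evaluation):
    # 70% from evaluations throughout the course, 30% from the
    # final evaluation, per the 2000 policy described above.
    term_average = sum(term_marks) / len(term_marks)
    return 0.7 * term_average + 0.3 * final_evaluation

print(ontario_final_grade([72, 78, 81, 75], 68))  # 0.7*76.5 + 0.3*68 = 73.95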

The current iteration of these charts is available online for public consultation. In the current version of the charts (Ministry of Education, Ontario, 2004), teachers are encouraged to base their assessments “on clear performance standards and on a body of evidence collected over time” (p. 4). The performance standards are presented to give teachers a “common framework” to guide the development of assessment tasks across a variety of aspects, to assist in planning instruction, and to provide meaningful feedback to students. In the latest iteration, the assessment categories have been standardized across all subjects to the following: knowledge and understanding; thinking; communication; and application.

With the introduction of new curriculum, the Ministry of Education also introduced new standardized provincial report cards. The report cards use letter grades to report student progress in Grade 1 to 6, and percentages to report student progress in Grades 7 to 12. One of the most significant changes of the new report cards was the requirement for teachers to evaluate students’ learning skills separately from the evaluation of their achievement of the curriculum outcomes. This presented a significant change for teachers, who had traditionally factored in student effort, homework completion and other factors in the composite evaluation (percentage or letter grade) for a student.

To support the implementation of new assessment practices and new report cards for elementary and secondary schools, the Ministry of Education developed an online electronic curriculum planner, a software application for curriculum planning that contains a resource library. One of the resources is the Assessment Companion (2002), which is also available online. With this resource, teachers can review current assessment policy, review assessment literacy terms and see explanations of different assessment methods. They can utilize the curriculum planner software also to construct rubrics (an assessment checklist that provides descriptors of student work at different degrees of quality).
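As a rough illustration of the rubric structure just described, the sketch below models descriptors at four degrees of quality; the criteria and wording are hypothetical, not taken from the Assessment Companion.

rubric = {
    "organization of ideas": {
        1: "ideas are listed with no discernible order",
        2: "some grouping of related ideas is evident",
        3: "ideas are logically ordered with clear transitions",
        4: "organization strengthens the argument throughout",
    },
    "language conventions": {
        1: "frequent errors obscure meaning",
        2: "errors are noticeable but meaning is clear",
        3: "few errors; conventions support readability",
        4: "conventions are used with precision and effect",
    },
}

def descriptor(criterion, level):
    # Look up the quality descriptor a teacher would report.
    return rubric[criterion][level]

print(descriptor("organization of ideas", 3))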
The second major assessment implementation resource for Ontario, initiated at the time of the new report cards, is the project to provide samples or exemplars of student work. Working with teams of teachers across the province, the Ministry collected samples of student work and organized these samples into booklets that present models of student achievement across different levels of achievement in the various subjects and grades. These examples of student work are available both in print format and on the government website, so that students, parents, and teachers are able to view demonstrated student performance at different grades and levels. While there have been numerous other Ministry of Education initiatives designed to support the revised assessment methods, the initiatives described in this paper give an indication that the change was given some support. Whether or not these instruments provided sufficient support for this degree of change over a short period of time is a topic that is worthy of educational research.

Reflection

This paper has attempted to outline some of the current trends in assessment theory, and to outline one government’s approach toward changes in curriculum and assessment policy in order to change teacher practice. At this point, there has been insufficient research on the change in assessment practices in Ontario to indicate whether or not teachers have gained in assessment literacy, or whether the overall quality of student performance is improving.

Significant research needs to be undertaken to determine the level of implementation of the current curriculum and assessment policies and to articulate important barriers. Leithwood, Fullan and Watson (2003) caution that, while evidence shows that pressure (such as the recent focus on accountability and student learning outcomes) is helpful to direct attention to priority areas of student learning, the pressure alone is not likely to “lead to substantial positive change, especially in the face of scarce resources and hasty implementation” (p. 6). They find that Ontario’s implementation has been “highly problematic, reducing potential benefits that might have accrued.”

One of the most controversial of the changes has been the decision of the Ministry of Education to separate the learning skills from the reporting of student achievement on the report card. It is challenging for teachers to evaluate on achievement alone without including effort, behaviour and homework completion in the final grade. The Elementary Teachers’ Federation of Ontario (2001) has recommended that a section for the reporting of effort should be included in the next revision of the report cards. Marzano (2000), in a review of the factors included in assessment across school districts, finds that while student achievement is generally considered to be the most important factor in reporting grades, the factoring in of student effort has a “relatively broad acceptance”. He also finds “significant support” for the inclusion of behaviour (p.29). These findings would indicate that this is just one of the topics in Ontario’s assessment policy that is ripe for future investigation.

If key criteria for quality assessment are considered to be reliability, validity and fairness, then the changes in Ontario education have created an interesting background for research in assessment policy implementation. Ontario has redefined validity and reliability – moving from external examinations toward an increase in the range and number of assessment tasks administered by the classroom teacher. The Ministry has attempted to build quality and consistency in assessment through policy documents and numerous supports for implementation. Yet, important questions need to be answered. In the new era of educational outcomes and consistent provincial standards, what are the results of these changes? Leithwood and colleagues (2003) have given a mixed review of the changes, finding that there have been some negative consequences. They find that the changes in the assessment landscape have created a “harsh environment for less advantaged and diverse student populations” (p.7). They caution that teachers feel demoralized by the change process and see “few benefits” to most of the changes. Leithwood and colleagues advocate that there has been a “lack of sustained opportunities for teachers and principals to develop the necessary understanding and expertise” (p.7). These cautions need to be addressed through research-informed implementation strategies that include strategies to help teachers to become more assessment literate and to feel a stronger sense of efficacy for curriculum and assessment change.

More research is needed to identify the impact on students from the changes in assessment policy, especially for those students who find learning challenging. Finally, studies need to be conducted in the institutions that receive the graduates of Ontario education. What is the perception of the universities, colleges and workplaces regarding the knowledge and skills of Ontario graduates? The answers to these measures of accountability and quality assurance are a rich source for educational research, and are definitely worth knowing.



References

Black, P. & Wiliam, D. (1998). Inside the black box: Raising standards through classroom assessment. Phi Delta Kappan, October 1998, p. 141. Retrieved from http://www.pdkintl.org/kappan/kbla9810.htm

Bloom B. S. (1956). Taxonomy of educational objectives, Handbook I: The cognitive domain. New York: David McKay Co Inc.

Elementary Teachers’ Federation of Ontario. (2001). Adjusting the optics: Assessment, evaluation and reporting. Toronto, Ontario: ETFO.

Guskey, T. (1996). Communicating student learning. ASCD Yearbook. Alexandria, VA: ASCD.

Hargreaves, A. (2001). Beyond subjects and standards: A critical view of educational reform. Toronto, ON: Ontario Association for Supervision and Curriculum Development.

Leithwood, K., Fullan, M. & Watson, N. (2003). The schools we need: A new blueprint for Ontario. Toronto, Ontario: Ontario Institute for Studies in Education of the University of Toronto. Retrieved from http://schoolsweneed.oise.utoronto.ca

Marzano, R. (2000). Transforming classroom grading. Alexandria, VA: ASCD.

McMillan, J. (2004). Classroom assessment: Principles and practice for effective instruction. New York: Pearson Education.

Ministry of Education, Ontario. (December 1994). For the love of learning: A report of the Royal Commission on Learning. Queen’s Park Printer for Ontario. Retrieved from http://www.edu.gov.on.ca/eng/document/nr/95.11/eqao1.html

Ministry of Education, Ontario. (November 1995). News release. Queen’s Park Printer for Ontario. Retrieved from http://www.edu.gov.on.ca/eng/document/nr/95.11/eqao1.html

Ministry of Education, Ontario. (2000). The Ontario Curriculum Grades 9 to 12: Program Planning and Assessment. Queen’s Park Printer for Ontario. Retrieved from http://www.edu.gov.on.ca/eng/document/policy/achievement/charts1to12.pdf

Ministry of Education. (2002). The Ontario Curriculum Unit Planner. Retrieved from http://www.ocup.org

Ministry of Education. (2002). The Assessment Companion. Queen’s Park Printer for Ontario. Retrieved from www.ocup.org/resources/documents/companions/assess2002.pdf

Ministry of Education, Ontario. (2004). The Ontario Curriculum Achievement Charts 1-12 (Draft). Queen’s Park Printer for Ontario. Retrieved from www.edu.gov.on.ca/eng/document/policy/achievement/charts1to12.pdf

Principles for Fair Student Assessment Practices for Education in Canada. (1993). Edmonton, Alberta: Joint Advisory Committee. Retrieved from http://www.bctf.bc.ca/education/assessment/FairStudentAssessment.pdf

Semple, B. (1992). Performance assessment: An international experiment. The Scottish Office Education Department: IAEP Educational Testing Service. Report No. 22-CAEP-06. U.S. Department of Education and the National Science Foundation.

Shepard, L. (2000). The role of assessment in a learning culture. Educational Researcher, 29(7), 4-14.

Stiggins, R. (1994). Student-centered classroom assessment. New York: Macmillan.

Stiggins, R. (2002). Assessment crisis: The absence of assessment FOR learning. Phi Delta Kappan, 83(10), 758-765. Retrieved from http://www.pdkintl.org/kappan/k0206sti.htm

Sutton, R. (1995). Assessment for learning. Salford, UK: RS Publications.

Wiggins, G. & McTighe, J. (1998). Understanding by design. Alexandria, VA: ASCD.

Wiggins, G. (1998). Educative assessment. San Francisco: Jossey-Bass.


Source: East Meet West KL 2005, An International Colloquium for International Assessment, APEC Paper 1

School-Based Assessment: East Meet West KL 2005 II

 School-Based Assessment and Assessment for Learning: How can it be implemented in developing and under-developed countries effectively?

Dr. Gavin Brown and Prof. John Hattie
University of Auckland, New Zealand

Many states in the USA and many nations throughout the world have introduced some form of state-wide or ‘National’ Testing. While there are examples of innovative methodologies (e.g., Kentucky, Maryland, Queensland), most systems employ a variant of the following (but not always in this order of implementation):

a. Create or agree to a ‘National/State’ Standard Course of Study/Curricula;
b. Create ‘National/State’ Tests oriented to a set of outcomes (e.g., numeracy and reading) based on these curricula;
c. Administer these tests to “all” students (usually under test-like conditions on a certain day(s) of the school year, and consider which students should be “accommodated” to be in this “all”);
d. Score, analyse, and distribute a report form to students, parents, schools, and/or Government;
e. Expect that teachers and students will learn from these reports and thence improve the quality of teaching, coverage, abilities, learning, and attitudes;
f. Expect that the reports will lead to the public having renewed and greater confidence in the public school system and support additional resources being spent on such a value-added service.

Darling-Hammond (2003, p. 1) aptly summed up the goals of these ‘National’ models of assessment:

“Advocates hoped that standards outlining what students should know and be able to do would spur other reforms that mobilize resources for student learning, including high quality curriculum frameworks, materials, and assessments tied to the standards; more widely available course offerings that reflect this high quality curriculum; more intensive teacher preparation and professional development guided by related standards for teaching; more equalized resources for schools; and more readily available safety nets for educationally needy students.”

Linn (2000) has argued that tests are used for this purpose because (a) they are relatively inexpensive to implement, (b) they can be externally mandated, (c) testing and assessment changes can be rapidly implemented, and (d) testing results are easily reported to the public.

National assessment programs based on these principles have been shown to have significant and usually deleterious impacts (Hamilton, 2003) on curriculum (Darling-Hammond & Wise, 1985; Herman & Golan, 1993), teachers (Firestone, Mayrowetz, & Fairman, 1998; Smith & Rottenberg, 1991), and teaching (Shepard & Dougherty, 1991 – see also Darling-Hammond, 2003; Klein, Hamilton, McCaffrey, & Stecher, 2000; Koretz & Barron, 1998; Koretz, Linn, Dunbar, & Shepard, 1991; Linn, 2000; Linn, Graue, & Sanders, 1990; Stecher, Baron, Kaganoff, & Goodwin, 1998). It should be noted, though, that some evidence of positive consequences is beginning to be reported (e.g., Cizek, 2001; Monfils, Firestone, Hicks, Martinez, Schorr, & Camilli, 2004).

We have argued that there are eight principles for building an excellent system of ‘national’ assessment that can help maximise the advantages and minimise the disadvantages of national assessment (Hattie, Brown, & Keegan, 2003). The principles assert that ‘national’ assessment should (a) mirror important rich ideas, (b) make rich ideas rather than items dominant, (c) have low-stakes consequences, (d) use more than tests to communicate standards, (e) ensure ‘national’ compatibility information is available, (f) ensure that teachers value it as part of teaching, (g) assess what is taught, and (h) provide meaningful feedback to all participants. It is the system that we have devised and implemented in New Zealand in conjunction with the NZ Ministry of Education—Assessment Tools for Teaching and Learning (asTTle)—that we wish to use as the basis for our conceptualization of how school-based assessment could be introduced in developing and under-developed nations.

asTTle is a software application that has been released to all NZ schools for free on a voluntary-usage basis for the implementation of standardized, yet teacher-controlled and school-managed, classroom assessment. asTTle, at Ministry requirement, generates a 40-minute test of any one subject customized to the teacher’s priorities in terms of curriculum content and difficulty. Once student, school, and question performance data are entered into asTTle, teachers and administrators have a wide range of graphical reports by which they can interpret student performance against norms, criteria, and standards and which, in conjunction with a website, can identify appropriate teaching resources. This system supports diagnostic, formative, and summative interpretations and gives teachers feedback as to priorities for teaching and learning activities and reporting to parents, students, administration, and government. This resource is an example of SBA that meets the requirements of accountability at the national level while providing improvement information at the school level. What lessons have been learned from the design and implementation of this system that could be used in implementing SBA in developing and under-developed countries?

First, SBA tools need to be clearly aligned to the jurisdiction’s required curriculum statements so that teachers, students, and parents know exactly what is being taught and learned; otherwise, SBA functions as a proxy for general ability or intelligence. In asTTle we did this through curriculum mapping, item signatures, and an interface for teacher-controlled test assembly.


Curriculum mapping.

Curriculum mapping is a process by which curriculum specialists and experts analyse and describe the rich ideas underlying the content being taught (in asTTle there are no more than 8 such big ideas) and note the major developmental or learning signposts by which progress in those ideas can be identified (see asTTle Technical Reports #4 & 34 for reading, #6 & 37 for writing, #11 & 36 for mathematics, #13 & 38 for pāngarau, and #23 & 39 for pānui & tuhituhi).

Item signatures.

Item signatures are descriptions of the important educational and technical characteristics of the assessment materials contained within the asTTle software. See asTTle Technical Reports #12, 16, 25, and 28 for a description of how these were conducted. An item’s signature or profile in asTTle is a relational database record that contains the following (a sketch of such a record, with hypothetical field names, follows the list):

a) important curriculum-related information about the item:

a. curriculum rich idea,
b. achievement objective,
c. curriculum process (if required),
d. cognitive process (using the SOLO taxonomy—see asTTle Technical Report #43), and
e. relationship to international curriculum categories (e.g., PIRLS, PISA, TIMSS, etc.).

b) Important psychometric information about the item:

a. difficulty (logit),
b. discrimination,
c. pseudo-chance value (if multiple-choice),
d. response format,
e. percentage correct at each year level, and
f. curriculum level (both as designed and actual through standard setting process).

c) Important item history information:

a. answer,
b. answer rules,
c. writer,
d. reviewers,
e. status (in use, in development, retired, etc.),
f. answer image (if required),
g. stimulus material (if required),
h. source of stimulus material (if required), and
i. order of presentation within a testlet (if required).
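Read as a single record, the signature might look like the following Python sketch; the class and field names are ours, chosen to paraphrase the list above, and do not reproduce the asTTle database schema.

from dataclasses import dataclass
from typing import Optional

@dataclass
class ItemSignature:
    # curriculum-related information
    rich_idea: str
    achievement_objective: str
    cognitive_process: str            # SOLO taxonomy category
    # psychometric information
    difficulty_logit: float
    discrimination: float
    pseudo_chance: Optional[float]    # multiple-choice items only
    response_format: str
    curriculum_level: int
    # item history
    answer: str
    writer: str
    status: str = "in use"

example = ItemSignature(
    rich_idea="finding information in text",
    achievement_objective="locate explicit detail",
    cognitive_process="unistructural",
    difficulty_logit=-0.35,
    discrimination=1.1,
    pseudo_chance=0.18,
    response_format="multiple-choice",
    curriculum_level=3,
    answer="B",
    writer="item panel A",
)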



User Interface.

All of this information is used within the asTTle software to create tests and to report the meaning of student performance. For example, teachers get the power to create a standardised test using curriculum rich ideas and curriculum level difficulty. asTTle generates reports that show performance of students by achievement objective, curriculum process, cognitive process, and curriculum level. Future versions of asTTle, currently on the drawing board, will allow teachers to refine the test by type of stimulus material, to include or exclude particular items, and even to specify item response format.
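As a hypothetical sketch of such teacher-controlled test assembly (not asTTle's actual algorithm), one could fill a 40-minute budget from a calibrated bank, filtered by the teacher's chosen strands and curriculum levels:

def assemble_test(bank, strands, levels, budget_minutes=40):
    # Greedily add matching items, easiest first, until the
    # time budget for the test is exhausted.
    chosen, used = [], 0.0
    for item in sorted(bank, key=lambda i: i["difficulty"]):
        if (item["strand"] in strands and item["level"] in levels
                and used + item["minutes"] <= budget_minutes):
            chosen.append(item)
            used += item["minutes"]
    return chosen

bank = [
    {"id": "R-101", "strand": "reading", "level": 3, "difficulty": -0.4, "minutes": 2.5},
    {"id": "R-102", "strand": "reading", "level": 4, "difficulty": 0.3, "minutes": 3.0},
    {"id": "M-201", "strand": "number", "level": 3, "difficulty": 0.1, "minutes": 2.0},
]
print([i["id"] for i in assemble_test(bank, {"reading"}, {3, 4})])

A real assembler would also balance response formats and achievement objectives, but the sketch captures the central idea: the teacher, not a central agency, chooses the content and difficulty of the standardised test.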

Second, for SBA to be powerful to students, teachers, parents, and administrators, it must be calibrated to criteria, norms, standards, and progress. Such calibration requires creation of large banks of valid and valued assessment tasks and items, trialled with representative samples of students, and scored using item response theory. asTTle contains six banks of test questions in two languages (English and Māori): reading (1600 items), writing (60 prompts), mathematics (1500 items), pānui (600 items), tuhituhi (31 prompts), and pāngarau (600 items). All of these items have been calibrated on the performance of thousands of students (currently totalling over 90,000 students) in Years 4 to 12 (ages 8 to 17) and calibrated against the achievement objectives of the national curriculum statements for those subjects at Levels 2 to 6 inclusive. Calibration is done through item response theory statistical analyses to locate items relative to each other using large sparse matrices of data—this avoids the need for all students to do all items, provided around 250 responses per item are available. Tests of item quality are conducted to ensure the following (a sketch of the 3PL model underlying several of these checks appears after the list):

(1) students of equal ability are not discriminated against if they are of a different ethnicity or sex (DIF—no systematic bias discovered in asTTle reading for sex or major ethnic groups);
(2) chance factors do not confound interpretation of student performance (3 PL IRT pseudo-chance factor identification—determined to be about .18 on asTTle’s 4 option multiple-choice questions);
(3) items discriminate consistently in favour of students who actually have more ability than those with less (2 PL IRT discrimination index—items with low values are eliminated or revised before publication);
(4) wrong options in selected response items perform appropriately (CTT distractor analyses allow elimination or correction of distractors and confirmation of keys before publication);
(5) items assess their intended location in the curriculum difficulty spectrum (standard setting through the ‘bookmark’ procedure—see asttle technical report #22 and asttle V4 Manual); and
(6) the items measure what they claim to measure (multiple validity checks are carried out through review panels, item proof readers, teacher administrators, readability analyses, sensitivity reviews, item signature studies described above, software that validates item information to ensure it complies with rules for items and testlets).
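
To make checks (2) and (3) concrete, here is a minimal Python sketch of the three-parameter logistic (3PL) item response function that underlies them; the parameter values are illustrative, not taken from the asTTle banks.

    # A minimal sketch of the 3PL item response function behind checks
    # (2) and (3) above. Parameter values are invented for illustration.
    import numpy as np

    def p_correct_3pl(theta, a, b, c):
        """Probability that a student of ability `theta` (in logits)
        answers correctly an item with discrimination `a`, difficulty
        `b`, and pseudo-chance (guessing) parameter `c`."""
        return c + (1.0 - c) / (1.0 + np.exp(-a * (theta - b)))

    # With pseudo-chance near .18, as the text reports for asTTle's
    # four-option multiple-choice items, even a very weak student
    # retains roughly an 18% chance of answering correctly by guessing.
    for theta in (-3.0, 0.0, 3.0):
        print(theta, round(p_correct_3pl(theta, a=1.2, b=0.5, c=0.18), 3))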

Based on these calibration procedures, users can have confidence that the items are of high quality, that the reports produce valid information about the strengths and weaknesses of students, and that alternative explanations for student performance can be discounted. asTTle uses this information to generate IRT-based reports (traditionally called ‘kidmaps’) that show student performance on achievement objectives according to the difficulty of each item relative to the student’s overall performance and to whether the item was answered correctly; in this way, teachers get a report on strengths, achievements, objectives still to be achieved, and gaps in a student’s learning of the curriculum (see asTTle Technical Report #15 for details). Further, given the large sample of students on which the items were trialled, it is possible to report student performance relative to that of similar students; like-with-like comparisons help teachers make appropriate interpretations. Without such comparisons, if a student is performing below the average, it may be possible to excuse this on the basis that low scores are always associated with students of that kind. With a like-with-like comparison, performance below the average can no longer be excused; the teacher must accept that, on average, ‘students like mine do better elsewhere’. This looking-in-the-mirror effect focuses the teacher’s mind on what the teacher must do differently or better rather than on blaming or excusing the learner for below-average performance.
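
The following minimal Python sketch illustrates the kidmap logic described above; the region labels and the simple cut at the ability estimate are assumptions for illustration, not the exact asTTle reporting rules.

    # A minimal sketch of 'kidmap'-style classification: each item is
    # sorted into a region by comparing its difficulty with the
    # student's overall ability estimate and checking correctness.
    # Labels and the cut at the ability estimate are assumptions.
    def kidmap_region(item_difficulty, student_ability, correct):
        easier = item_difficulty <= student_ability
        if correct and easier:
            return "achieved"        # expected success
        if correct and not easier:
            return "strength"        # surprising success
        if not correct and easier:
            return "gap"             # surprising failure
        return "to be achieved"      # expected failure; next learning step

    # (difficulty in logits, answered correctly?) for four example items
    responses = [(-1.0, True), (0.8, True), (-0.5, False), (1.5, False)]
    ability = 0.2
    for b, ok in responses:
        print(b, ok, "->", kidmap_region(b, ability, ok))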

Additionally, this reporting system allows comparison of schools to similar schools without the invidious effects of ‘league tables’; asTTle currently compares schools based on socio-economic status, region, size, ethnic mix, and sector (see asTTle Technical Report #14 for a description of how this was done in New Zealand). Again, comparing the performance of my students in my school to that of other similar schools focuses the administrator’s mind not on excusing or blaming poor performance based on characteristics of the school population, but rather on identifying the areas in which the school might benefit from consulting other schools that are doing a better job with the same type of students. This comparative information (norm-referenced interpretations) helps identify whether a learning objective or a group of learners represents a matter of concern or pride; but without the criterion-referenced interpretations supported by the IRT diagnostics and the standards-based interpretations supported by the curriculum level analyses, most users of assessments become complacent about rank-order scores. Further, the expense of the asTTle items and software could not be justified if all that were provided to users were a rank-order score.


Third, teacher-administrators of SBA must be given choice over what is assessed, how hard the assessment is, when the assessment takes place, who is given which assessment, what interpretations should be made, and what actions should be taken; in this way, validity is ensured. The great criticism of externally-mandated, centrally-controlled national assessment is the poor fit of the test to the local school context. Teachers treat the test and its scores as invalid if the material is “too hard”, “taught last year”, “taught next term”, or “too easy” (see asTTle Technical Reports #27, 33, 42, 44, 46, & 49 for New Zealand teacher feedback on this issue). Another facet of this criticism is the impression that standardised testing requires everyone to take the same test (‘one size fits all’). Clearly, not all students in any class of 30 or more are at the same level of ability in any subject; thus, one size does not fit all, and teachers and students deserve a mechanism by which the difficulty of the test can be custom-fitted to student abilities. Generally, in SBA teachers can to some extent do this themselves without a great deal of external assistance. However, if we want these custom-designed tests to have rich norm-referenced, criterion-referenced, and standards-based interpretations, the test questions have to be calibrated to these scales; clearly teachers are unable to do this, even if they had all the necessary training in assessment development. The cost and effort required to produce this information is prohibitive at the school level; only nations and large jurisdictions can underwrite such activity. But the validity of a test from the centre will always be questioned; thus a mechanism that gives teachers the power to create customised tests that are nonetheless standardised would be incredibly powerful for improving the quality of assessment and information available to teachers, administrators, students, and parents.

If scores (or any feedback) are delayed through large-scale central scoring, data entry, item analysis, and report generation (as they inevitably are, by some three to six months), then the potential for that information to actually shape meaningful learning activities is practically nil: the students have changed class or grade, the teachers have moved on to new material, the class may already have been successfully taught that content, and so on. Prompt feedback and teacher control over what is in the test are two key features of successful SBA, and both are delivered in the asTTle software: teachers review and validate the test content and difficulty, and they get rich interpretive feedback as soon as the test is scored and the data entered; certainly well within the time to make a difference.

Fourth, SBA developers and publishers must seek to communicate with teachers, parents, and students in novel and powerful ways. Instead of focusing on numbers (especially rank-order values such as stanines, percentiles, grade equivalents, etc.), SBA is greatly enhanced by graphical feedback that shows participants where learners are relative to teaching goals (see asTTle Technical Report #15 for details of how and why the reports were developed). Displaying performance graphically relative to a norm or a standard, rather than reporting a numeric score, reduces the need for teachers to be assessment literate in the classical sense and takes advantage of well-developed intuitive understandings of what charts mean. Simple and consistent use of conventions greatly enhances the clarity and communicability of educational reporting. For example, in asTTle we use Tukey’s box-and-whisker plot to communicate the distribution and central tendency of scores, we use red scores on blue fields to show ‘my group’ compared to the norm group, and we use point indicators broad enough to encompass the standard error of measurement. Indeed, the asTTle reports are so transparent that some schools report that students are leading parent-teacher conferences using the Individual Learning Pathways report, a variant of the ‘kidmap’ report.
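
As an illustration of these conventions, here is a minimal Python sketch (using matplotlib) of a norm-group box-and-whisker plot on a blue field, with ‘my group’ marked in red by a band wide enough to suggest the standard error of measurement; all values are invented, and this is not asTTle’s actual report code.

    # A minimal sketch of the reporting conventions described above:
    # a Tukey box-and-whisker plot for the norm group on a blue field,
    # with 'my group' shown in red as a band as wide as the standard
    # error of measurement. All numbers are invented for illustration.
    import numpy as np
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(1)
    norm_scores = rng.normal(500, 100, size=1000)   # invented norm group
    my_group_mean, sem = 540, 15                    # invented values

    fig, ax = plt.subplots()
    ax.set_facecolor("#dbe9ff")                     # blue field
    ax.boxplot(norm_scores, vert=False, widths=0.5) # Tukey box and whisker
    ax.axvspan(my_group_mean - sem, my_group_mean + sem,
               color="red", alpha=0.6, label="my group (within SEM)")
    ax.set_xlabel("score")
    ax.set_yticks([])
    ax.legend()
    plt.show()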

Research into the accuracy of users’ understanding of the reports improves the quality of the charts (see asTTle Technical Reports #1, 9, & 10) and identifies professional development or training requirements (see asTTle Technical Report #35, where answers to comprehension questions about asTTle reports showed that teachers who had received PD had a more accurate understanding than those who had not). Without item response theory scoring and calibration of items, educational reporting is more limited, since classical test theory can provide only a test-based score rather than an item-based report. IRT is essential to enable meaningful reporting of SBA, as students complete different items at different times from a bank of items that covers a much wider range of valued content than any one test ever could.

Fifth, removing central control, central reporting, and centrally-mandated consequences has been found to assist in the uptake of SBA. This policy in New Zealand takes advantage of teachers’ professionalism and respects their vital role in enhancing school effectiveness, while minimising the negative consequences of national testing. Indeed, research into teachers’ conceptions of assessment in New Zealand (Brown, 2004) has shown that primary school teachers not only agree with the goal of using assessment to improve education but also agree that assessment can help identify schools and teachers who are doing a good job. Fundamentally, these teachers are willing to make assessment demonstrate accountability because Big Brother is NOT watching. Furthermore, they have high-quality resources that provide data that external agencies accept as credible and not just a function of wishful thinking or bias.

New Zealand teachers, on the whole, are not afraid to look into the mirror of assessment and discover that students are not learning as expected or even as well as the norm. This fearlessness comes not from some special attribute of our teachers, but rather from a policy context of high trust and school-based management of learning against national objectives. Teachers and administrators are expected to be the first to identify learning needs in cohorts and individuals and to make appropriate plans and reports long before inspectors or external agencies come along to determine whether the school is doing a good job. In such a context, the teacher and school are supported in identifying ‘bad news’, rather than, as in the classic high-stakes context reported in many jurisdictions (Cannell, 1989), making the bad news go away by cheating, making extra accommodations, and teaching to the test. Since the data belong to the school, not the government or the developers, the school can inspect the ‘bad news’ safely without fear of being blind-sided by unexpected public exposure or humiliation. What matters in New Zealand is not that there be NO bad news, but rather that teachers and schools identify needs, implement appropriate educational plans, and monitor the effectiveness of those plans prior to external vetting or inspection. The New Zealand inspectors want to know what evidence was used to identify a problem, what was done about it, and how the school knows that the planned intervention is working; all of these questions can be answered by using a high-quality SBA resource.

Sixth, the New Zealand experience has shown that making the use of a new government-sponsored SBA resource voluntary can have a positive impact, provided the resource gives new and valuable information to teachers. Related to point five, the psychology of educators in New Zealand is such that compulsion may have a negative impact. That teachers could choose among the various options available to them meant that those choosing asTTle did so because they were convinced it gave them information about the effectiveness of their teaching that they would not otherwise have obtained. New Zealand, consistent with Hamilton’s (2003) recommendations, provides teachers with many resources to monitor learning: not just a bank of tests, but exemplars of learning, a wide variety of assessment resources, means of monitoring system-wide developments, and high-quality curriculum-based resources. Within this multi-faceted context, teachers are gravitating to the use of asTTle because it does tell them something they did not know; for example, surprisingly hard things that students could actually do, surprisingly easy things the students could not do, students who had not made much progress, and so on. Further, the asTTle website provided teachers with access to catalogued, high-quality teaching resources that could meet the identified learning needs of their students; that surprising resource (you are no longer on your own with this unpleasant assessment result) meant that teachers could close the feedback loop and begin to respond appropriately to learning needs. Truly this is a matter of bringing the thirsty camel to the water and letting it choose which well to drink from.

Seventh, the use of computer technology is critical to permit customised test creation and sophisticated IRT-based score calculations. Teachers simply could not create photocopy-ready classroom assessments of a learning area at a certain difficulty level in the seven minutes it takes asTTle to create a 40-minute test. The many hours saved here can easily be transferred to the straightforward tasks of scoring and data entry. More importantly, without such technology teachers could only calculate a total score for a test under the assumption that all items on the test were equally difficult, although all of us know intuitively that items are not equally difficult even on the highest-quality assessments ever made. Thus, the computer can accurately and rapidly estimate what a student’s strengths and needs are, freeing the teacher to concentrate on the important decisions and actions he or she will take based on that information. The transformation of scores into meaningful pictures requires computer technology that captures expert processes, again freeing teachers from the need to be assessment literate, while requiring them to be extremely literate about their teaching content and their students.
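
As a hedged illustration of the kind of calculation the computer performs here, the following Python sketch estimates a student’s ability by maximum likelihood over a grid of ability values, given calibrated item parameters; a production system would use more efficient estimators (e.g., EAP or Newton-Raphson), and all parameter values are invented.

    # A minimal sketch of IRT ability estimation: maximum likelihood
    # over a grid of theta values, given calibrated 3PL parameters.
    # Item parameters and responses are invented for illustration.
    import numpy as np

    def p_correct(theta, a, b, c):
        return c + (1 - c) / (1 + np.exp(-a * (theta - b)))

    def estimate_ability(items, responses, grid=np.linspace(-4, 4, 801)):
        """items: list of (a, b, c) tuples; responses: list of 0/1."""
        best_theta, best_ll = 0.0, -np.inf
        for theta in grid:
            ll = 0.0
            for (a, b, c), x in zip(items, responses):
                p = p_correct(theta, a, b, c)
                ll += np.log(p) if x else np.log(1 - p)
            if ll > best_ll:
                best_theta, best_ll = theta, ll
        return best_theta

    items = [(1.0, -1.0, 0.18), (1.2, 0.0, 0.18), (0.9, 1.0, 0.18)]
    print(estimate_ability(items, [1, 1, 0]))  # a mid-range ability estimate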

Nevertheless, the design, selection, and deployment of ICT must be done in a manner consistent with the infrastructure status and development plans of the country. When asTTle began in New Zealand in 2000, all that was required was that tests could be created and reported upon with the kinds of computers schools and teachers already had. Hence, it functioned on stand-alone computers such as the Mac Classic and Windows 95 machines. Based on the positive response of teachers to that resource and the call for improvements, versions 3 and 4 extended the ICT functionality, so that asTTle users now have the option of shared database systems (Multi-user asTTle V4) that operate on multiple servers (Mac OSX Panther and Tiger; Windows 2000, 2003, & NT; Linux Redhat 9 & Enterprise; and Novell Netware). In addition, asTTle V4 operates on laptops and desktops, and still supports Windows 98 while keeping pace with other technical requirements.

However, the requests for new technical features and options never stop. Teachers are now asking for on-screen testing to save paper and data-entry work, parents want access to the reports about their children, teachers want more flexibility in test creation, and others have asked for computer adaptive testing so that less time is spent testing and more time is spent teaching. Administrators want greater and more seamless interaction with school or student management systems. These new wish-list elements bespeak a fundamental commitment to using ICT to serve educational needs; this is in contrast to the traditional environment of ICT in search of a real application. asTTle is an educational resource that happens to use technology, not a technology resource for education. Assessing real learning is a real educational application desperately in need of better tools to relieve teacher workload and to improve teacher effectiveness. The touchstone of the asTTle project has not been “are we using the newest and best?”, but rather “when teachers use asTTle, do they focus on the technology or the education?”. If the answer is the latter, then we have succeeded, and this is the benchmark against which all ‘smart’ education innovations need to be judged.

Finally, the deployment of SBA needs to take place on an incremental basis during which research is conducted, both to ensure that any technology is developed appropriately and to ensure that teachers understand and implement it for the purpose of improvement. The New Zealand experience clearly showed the benefit of gradual implementation. First, it allowed time for teachers to get used to the idea of customised, yet standardised, tests that they controlled rather than the central agency. Second, it gave time for item development, trialling, reviewing, calibration, norming, and so on. Third, it gave time for teachers to understand what the system could do and to make clear what they wanted it to do; experience-based requirements specification means that what the end user really wants is what is actually developed. Fourth, it gave a robust basis for determining whether what was done was having the expected impact; research-based evaluations over time meant that those ‘summative’ analyses could feed ‘formatively’ into powerful revisions. Fifth, from a government perspective, it meant that each stage of development had a fixed budget and timeline that was met without fail; success in delivery on this basis is rare in the ICT industry and even rarer in government, so we are justifiably proud of what we did. Sixth, it meant that the level of technology was always at an appropriate level of complexity; it was not overspecified or overdelivered for the infrastructure capacity at the time the software was released. In other words, we currently supply on-paper testing only because that is what schools can deliver now, not what we or anyone else could dream of. As infrastructure capacity and demand increase, the enhancements can be put in place.

Throughout, we have taken the approach that the design and delivery of asTTle should be on a research basis: our design was informed by new information and theory throughout, and we have not mindlessly followed a master plan from the beginning. Agile responsiveness to new knowledge was made feasible by an incremental approach. Further, as a research project we are proud to have made accessible to all users the technical reports that provide a robust basis for confidence in our processes and results. This research commitment has extended to publishing articles, the completion of research theses, and the provision of training; our team teaches SBA on the basis of its success at designing and delivering a powerful mechanism for doing it.

To conclude, SBA can be introduced effectively into any jurisdiction provided certain conditions are met. There must be a will to provide teachers with educational resources that help them improve the quality of their work; we suggest doing so by giving them feedback on goals, progress, and next steps based on real observation of student performance. It is not about giving them technology, nor about implementing a centralised system of checking on teachers. It is about respecting teachers so much that we trust their professionalism to monitor their own work and respond effectively to learning needs. It is not about making teachers into world-class assessors, but rather about helping teachers do what they really exist for: improving the life chances of a nation’s young people by easily, accurately, and appropriately identifying their learning needs and responding appropriately. SBA, developed in this manner, can effectively assist developing and under-developed nations in meeting the needs of the knowledge economy in the 21st century.



References

Brown, G.T.L. (2004). Teachers’ conceptions of assessment: Implications for policy and professional development. Assessment in Education: Policy, Principles and Practice, 11(3), 305-322.
Cannell, J. J. (1989). How public educators cheat on standardized achievement tests. Albuquerque, NM: Friends for Education.
Cizek, G. J. (2001). More unintended consequences of high-stakes testing. Educational Measurement: Issues and Practice, 20(4), 19-27.
Darling-Hammond, L., & Wise, A. E. (1985). Beyond standardization: State standards and school improvement. Elementary School Journal, 85(3), 315-336.
Darling-Hammond, L. (2003, February). Standards and assessments: Where we are and what we need. Teachers College Record, http://www.tcrecord.org ID Number: 11109, Date Accessed: 8/2/2005.
Firestone, W. A., Mayrowetz, D., & Fairman, J. (1998). Performance-based assessment and instructional change: The effects of testing in Maine and Maryland. Educational Evaluation and Policy Analysis, 20(2), 95–113.
Hamilton, L. (2003). Assessment as a policy tool. Review of Research in Education, 27, 25-68.
Hattie, J.A.C., Brown, G.T.L., & Keegan, P.J. (2003). A national teacher-managed, curriculum-based assessment system: Assessment Tools for Teaching & Learning (asTTle). International Journal of Learning, 10, 771-778.
Herman, J. L., & Golan, S. (1993). The effects of standardized testing on teaching and schools. Educational Measurement: Issues and Practice, 12(4), 20-25, 41-42.
Klein, S. P., Hamilton, L. S., McCaffrey, D. F., & Stecher, B. M. (2000). What do test scores in Texas tell us? Santa Monica, CA: RAND. Available as ERIC Document ED447219
Koretz, D. M., & Barron, S. I. (1998). The Validity of gains on the Kentucky Instructional Results Information System (KIRIS). Santa Monica, CA.: RAND. Available as ERIC Document ED428131.
Koretz, D. M., Linn, R. L., Dunbar, S. B., & Shepard, L. A. (1991, April). The effects of high-stakes testing on achievement: Preliminary findings about generalization across tests. Paper presented at the annual meeting of the American Educational Research Association, Chicago.
Linn, R. L. (2000). Assessments and accountability. Educational Researcher, 29(2), 4-16.
Linn, R. L., Graue, E. M., & Sanders, N. M. (1990). Comparing state and district test results to national norms: The validity of claims that “everyone is above average”. Educational Measurement: Issues and Practice, 9, 5-14.
Monfils, L. F., Firestone, W. A., Hickes, J. E., Martinez, M. C., Schorr, R. Y., & Camilli, G. (2004). Teaching to the test. In W. A. Firestone, R. Y. Schorr, & L. F. Monfils (Eds.). The ambiguity of teaching to the test: Standards, assessment, and educational reform (pp. 37-61). Mahwah, NJ: LEA.
Shepard, L. A., & Dougherty, K. C. (1991). Effects of high-stakes testing on instruction. Paper presented at the annual meeting of the American Educational Research Association and National Council on Measurement in Education, Chicago.
Smith, M. L., & Rottenberg, C. (1991). Unintended consequences of external testing in elementary schools. Educational Measurement: Issues and Practice, 10, 7–11.
Stecher, B. M., Barron, S. I., Kaganoff, T., & Goodwin, J. (1998). The effects of standards-based assessment on classroom practices: Results of the 1996-97 RAND survey of Kentucky teachers of mathematics and writing (CSE Tech. Rep. 482). Los Angeles: Center for Research on Evaluation, Standards, and Student Testing.

asTTle Technical Reports Available at www.asttle.org.nz

Curriculum Maps

Christensen, I., Trinick, T., & Keegan, P. J. (2003). Pängarau curriculum framework and map: Levels 2-6 (Tech. Rep. No. 38). Auckland, NZ: University of Auckland/Ministry of Education.
Coogan, P., Hoben, N., & Parr, J. M. (2003). Written language curriculum framework and map: Levels 5-6 (Tech. Rep. No. 37). Auckland, NZ: University of Auckland/Ministry of Education.
Ell, F. (2001). Mathematics in the New Zealand Curriculum - A concept map of the curriculum document. (Tech. Rep. No. 11). Auckland, NZ: University of Auckland, Project asTTle.
Fairhall, U., & Keegan, P. J. (2001). Pängarau curriculum framework and map: Levels 2-4. (Tech. Rep. No. 13). Auckland, NZ: University of Auckland/Ministry of Education.
Glasswell, K., Parr, J., & Aikman, M. (2001). Development of the asTTle writing assessment rubrics for scoring extended writing tasks. (Tech. Rep. No. 6). Auckland, NZ: University of Auckland, Project asTTle.
Limbrick, L., Keenan, J., & Girven, A. (2000). Mapping the English curriculum. (Tech. Rep. No. 4). Auckland, NZ: University of Auckland, Project asTTle.
Murphy, H., & Gray, A. (2003). Review of Mäori literacy framework for koeke 2-6 panui/tuhituhi of the Mäori language curriculum statement, Te Reo Mäori i roto i ngä Marautanga o Aotearoa (Tech. Rep. No. 39). Auckland, NZ: University of Auckland/Ministry of Education.
Murphy, H., & Keegan, P. J. (2002). Te Reo Mäori literacy curriculum map. Levels 2-4 (Tech. Rep. No. 23). Auckland, NZ: University of Auckland/Ministry of Education.
Nicholls, H. (2003). English reading curriculum framework and map: Levels 2-6 (Tech. Rep. No. 34). Auckland, NZ: University of Auckland/Ministry of Education.
Thomas, G., Holton, D., Tagg, A., & Brown, G. T. L. (2003). Mathematics curriculum framework and map: Levels 2-6 (Tech. Rep. No. 36). Auckland, NZ: University of Auckland/Ministry of Education.


Item Signatures

Brown, G. T. L. (2002). Item signature study: Report on the characteristics of reading texts and items from calibration 3 (Tech. Rep. No. 28). Auckland, NZ: University of Auckland, Project asTTle.
Meagher-Lundberg, P., & Brown, G. T. L. (2001). Item signature study: Report on the characteristics of reading texts and items from calibration 1. (Tech. Rep. No. 12). Auckland, NZ: University of Auckland, Project asTTle.
Meagher-Lundberg, P., & Brown, G. T. L. (2001). Item signature study: Report on the characteristics of reading texts and items from calibration 2. (Tech. Rep. No. 16). Auckland, NZ: University of Auckland, Project asTTle.
Thomas, G., Tagg, A., Holton, D., & Brown, G.T.L. (2002). Numeracy item signature study: A theoretically derived basis. (Tech. Rep. No. 25). Auckland, NZ: University of Auckland, Project asTTle.

Reports

Brown, G. T. L. (2001). Reporting assessment information to teachers: Report of Project asTTle outputs design. (Tech. Rep. No. 15). Auckland, NZ: University of Auckland, Project asTTle.
Hattie, J. A. (2002). Schools like mine: Cluster analysis of New Zealand schools. (Tech. Rep. No. 14). Auckland, NZ: University of Auckland, Project asTTle.
Hattie, J. A. C., & Brown, G. T. L. (2003). Standard setting for asTTle reading: A comparison of methods. (Tech. Rep. No. 21). Auckland, NZ: University of Auckland/Ministry of Education.
Hattie, J. A. C., & Brown, G. T. L. (2004). Cognitive processes in asTTle: The SOLO taxonomy. (Tech. Rep. No. 43). Auckland, NZ: University of Auckland/Ministry of Education.
Meagher-Lundberg, P. (2000). Comparison variables useful to teachers in analysing assessment results. (Tech. Rep. No. 1). Auckland, NZ: University of Auckland, Project asTTle.
Meagher-Lundberg, P. (2001). Output reporting design: Focus group 1. (Tech. Rep. No. 9). Auckland, NZ: University of Auckland, Project asTTle.
Meagher-Lundberg, P. (2001). Output reporting design: Focus group 2. (Tech. Rep. No. 10). Auckland, NZ: University of Auckland, Project asTTle.

Teacher Feedback

Brown, G. T. L., Irving, S. E., Hattie, J. A. C., Sussex, K., & Cutforth, S. (2004). Summary of teacher feedback from the secondary school calibration of asTTle reading and writing assessments for Curriculum Levels 4 to 6. (Tech. Rep. No. 49). Auckland, NZ: University of Auckland/Ministry of Education.
Irving, S. E., & Higginson, R. M. (2003). Improving asTTle for secondary school use: Teacher and student feedback (Tech. Rep. No. 42). Auckland, NZ: University of Auckland/Ministry of Education.
Keegan, P. J., & Pipi, A. (2002). Summary of the teacher feedback from the calibration of asTTle v2 pänui, pängarau and tuhituhi assessments (Tech. Rep. No. 27). Auckland, NZ: University of Auckland/Ministry of Education.
Keegan, P. J., & Pipi, A. (2003). Summary of the teacher feedback from the calibration of the asTTle V3 pängarau assessments. (Tech. Rep. No. 44). Auckland, NZ: University of Auckland/Ministry of Education.
Keegan, P. J., & Ngaia, T. (2004). Summary of teacher feedback from the V4 calibration of asTTle pänui and tuhituhi assessments for Curriculum Levels 2 to 6. (Tech. Rep. No. 46). Auckland, NZ: University of Auckland/Ministry of Education.
Lavery, L., & Brown, G. T. L. (2002). Overall summary of teacher feedback from the calibrations and trials of the asTTle reading, writing, and mathematics assessments (Tech. Rep. No. 33). Auckland, NZ: University of Auckland, Project asTTle.


Source: East Meet West KL 2005, An International Colloquium for International Assessment, APEC Paper 2

School-Based Assessment: East Meet West KL 2005 III

The Epistemology of School-Based Assessment and Assessment for Learning in the Eastern Civilization: The Linkage to Current Knowledge and Practices

Jahja Umar, Ph.D., Satya Buana Foundation, Jakarta, Indonesia

Introduction

The term “assessment” in an educational context can have many different meanings, and it is often used interchangeably with other terms such as “evaluation”, “educational measurement”, “testing”, and “examination”. It is an elastic word that stretches to cover judgements of many kinds. However, there is one thing in common every time educational assessment is discussed: it always has something to do with information on learning and learning acquisition, which is usually based on measures of what individuals know. It is there to tell whether, and what, people have learned. The roles and purposes of educational assessment have not changed substantially throughout much of history: educational and occupational selection, placement, making certification decisions, and promoting learning. In many cases (e.g., Little and Wolf, 1996), the word “assessment” is used broadly, to cover all judgements of educational performance which are used in individual or aggregated form, for one or more purposes, and by a range of persons and institutions (Foreword, p. x).

School-based assessment is a term usually used for the measurement of the knowledge and skills of individual students, at school or classroom level, in order to obtain information on what they have learned as a result of their educational experiences. In this context, the assessment is sometimes labelled according to Scriven’s (1967) “formative” and “summative” evaluation. The term “formative assessment” is used when the purpose is to improve or develop teaching-learning processes, and “summative assessment” is used when it is for decision-making about the end result of those processes. However, again, the terms are not used consistently. Brookhart (2001), for example, mentioned that some authors see all classroom assessment as formative and discuss summative assessments primarily in terms of external assessments (pp. 153-154). In this paper, the author will use the term “school-based assessment” (SBA) for summative assessment done by the school, particularly in regard to making decisions on certification and selection (as opposed to external high-stakes assessment), and “assessment for learning” (AFL) when the purpose is formative.


SBA and AFL in the Eastern Civilization

The most frequently cited formal assessment in the eastern civilization is the early Chinese civil service examination, used for selection. It was certainly not a school-based summative assessment, nor was it formative. In the old eastern tradition, learning activities usually took the form of a boarding school in which students stayed and lived with their teachers at a location sometimes apart from the surrounding communities. Alternatively, the learning activities were less structured, or even informal interactions between a student and his or her teacher. The long history of formal education in the eastern civilization is mostly associated with religious schools, especially Buddhist, Hindu, and Muslim ones. Quite often, the teaching of martial arts was an inseparable part of the school.

Under this tradition, a competency-based approach, or a principle of “learning for mastery”, was strictly implemented. The qualification of graduates was kept under control as intended, depending upon both student potential and the capability of the teacher. All forms of assessment were internal, and the purposes were both learning and promotion/selection. Public recognition of the school was usually gained when its graduates or teachers won a local or national contest or competition.

Schooling in Asian countries as it exists today is, in fact, a product adopted from the western civilization.

When schooling became a mass program and was no longer in the boarding tradition, quality control and quality assurance activities became more and more difficult. At the same time, the need for standards became part of the modern world, and standards needed to be set at both national and international levels. In this case, the developed western civilization has set up most of the standards and certifications, which usually become international standards. Some countries in the east adopt the standards of their former colonial power in Europe, while others try to develop their own. School-based assessment is considered an important part of quality assurance activities, while external examination using a national or international standard is equally important because of the need for recognition and for survival in the competitions that are part of any modern life.

In this regard, school-based assessment and assessment for learning are usually made comprehensive, while an external examination using a national or international standard is typically limited to a small number of subjects considered important for competition and recognition. Therefore, school-based assessments and the public examination are, in fact, complementary: neither one can substitute for the other.


National Examination Or School-based Assessment?

In the last 30 years, there has been a strong tendency to look at school-based assessment as the only assessment that is needed. Many educationists considered that examination-driven learning activities could narrow the curriculum and reduce education to the teaching of a small number of subjects. In line with developments in educational philosophy in which the student is the focus of education, the phrase “teaching and learning” has been replaced with “student learning”, and high-stakes examinations have tended to be abolished. Any form of learning activity that is not enjoyable to students is often considered to be against learning. In the extreme, even the terminology “pass the exam” has been taken out of the educational dictionary. In Indonesia, for example, certificates of competency/literacy at the primary and secondary levels were changed into certificates of completion. This lasted for more than 30 years. As a result, variability in the quality of graduates grew higher and could not be measured. Recognition became a difficult matter at both national and international levels, not to mention in the global competition.

Developments in psychological theory have also influenced the practice of educational assessment. For example, the situated perspective on learning (Greeno, 1997) claimed that learning and development should not be viewed as progress along a trajectory of skills and knowledge, but rather as progress along trajectories of participation and growth of identity. Greeno argued that “learning” should replace “knowledge”, that “abstraction” should be replaced with “generality”, and that knowledge cannot be transferred. In his view, measurement of the knowledge and skills of the individual is not important, since “learning” always occurs in groups. On the other hand, the cognitive perspective on learning (Anderson, Reder, and Simon, 1997) commented that only a rhetorical language game is being played, since Greeno’s claim can be true only if one believes that “knowledge” is not what is learned; it is entirely a matter of the definitions one chooses for “knowledge” and “learning”. The movement against testing might be rooted in radical behaviorists like Skinner (1948), who wrote: “… Since our children remain happy, energetic, and curious, we don’t need to teach ‘subjects’ at all. We teach only the techniques of learning and thinking”.

It may be interesting to take the Indonesian experience with national examinations as a case to look at. Prior to 1970, there was a national examination using items of quite a high level of difficulty. All questions were essays. Students had to study very hard in order to pass the exam, and teachers did their best since otherwise their students might fail. Schools were considered good when many of their students passed the exam, even if the schools lacked resources.

Variability in the quality of graduates was small. In 1970 the national exam was abolished for the following reasons:
(1) mass education,
(2) the national exam was expensive,
(3) a desire for more humanistic education,
(4) the belief that a very detailed and highly structured curriculum could improve teaching practices, and
(5) school-based examination was believed to be more appropriate.

After 10 years, it was found that:
(1) the quality of graduates was decreasing,
(2) university selection had become a national and very high-stakes exam,
(3) variability in the quality of graduates was very high nationally,
(4) a school was seen as good if it was rich and fully equipped,
(5) most teacher training concerned teaching methodologies,
(6) most students felt that they did not need to study hard, and
(7) virtually all students passed the school-based exam.

Since 1981 there has been awareness that the variability of quality should be minimised and that the national exam should exist. As a result, since then in Indonesia there has always been a so-called “national examination”, but the decision to pass or fail the student is left to the school. It is hard to classify whether this is a national or a school-based examination.

The variability is still very large, and there are still a lot of people who think that learning is more important than knowledge and skills, because they believe that knowledge is not what is learned!


Recent developments in assessment methodology and technology could in fact bring compatibility between school-based assessment and examinations using a national standard. Advances in fields such as Item Response Theory, Computerized Adaptive Testing, and Item Banking, for example, could mean that the implementation of a national standard is no longer limited to a small number of subjects. It is now possible to apply one item bank for both purposes: school-based assessment and national examination.

A national standard could even be applied at different levels of competency within each subject matter, with a high degree of flexibility and efficiency. It is equally possible to provide quick feedback to teachers and schools on student performance by item or topic. In other words, the role of assessment for learning can be greatly enhanced through the application of modern technology and methodology.
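
As an illustration of the computerized adaptive testing idea mentioned above, here is a minimal Python sketch in which the next item administered is the unused bank item with maximum Fisher information at the current ability estimate (a 2PL model for simplicity); the bank contents and the fixed three-item loop are assumptions for illustration, not any operational system’s algorithm.

    # A minimal sketch of computer adaptive testing: at each step, pick
    # the unused bank item with maximum Fisher information at the
    # current ability estimate. 2PL model; all values are invented.
    import numpy as np

    def info_2pl(theta, a, b):
        p = 1 / (1 + np.exp(-a * (theta - b)))
        return a * a * p * (1 - p)      # Fisher information of the item

    def next_item(bank, used, theta):
        candidates = [i for i in range(len(bank)) if i not in used]
        return max(candidates, key=lambda i: info_2pl(theta, *bank[i]))

    bank = [(1.0, -2.0), (1.2, -1.0), (0.9, 0.0), (1.1, 1.0), (1.3, 2.0)]
    theta, used = 0.0, set()
    for _ in range(3):
        i = next_item(bank, used, theta)
        used.add(i)
        print("administer item", i, "with difficulty", bank[i][1])
        # after scoring the response, theta would be re-estimated here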

As a closing statement, I would like to quote a paragraph of Heyneman’s (1997) personal story about Jim Coleman: “…It isn’t poverty which drives scores of U.S. students down… but rather impoverished spirit. It is the general lack of desire to learn and this, in turn, is affected by public policy. What differentiates American children from other children in the world … is American public policy toward children. In general, children in the United States are provided with too much opportunity and too few obligations; too much choice and too few responsibilities. … In addition, U.S. school children are influenced by a common assumption that curriculum has to be entertaining, and that there is a scarcity of opportunities to participate in adult roles” (pp. 29-30).

The statement above can be compared with a proverb popular in South East Asian countries (in Malay/Indonesian): “Berakit-rakit ke hulu, berenang-renang ke tepian, bersakit-sakit dahulu, bersenang-senang kemudian”, which means that someone has to give up what he or she could enjoy today in order to have great success in the future. This would best describe the original philosophy of learning in the eastern civilization.

References

Anderson, J. R., Reder, L. M., & Simon, H. A. (1997). Situative versus cognitive perspectives: Form versus substance. Educational Researcher, 26(1), 18-21.
Brookhart, S. M. (2001). Successful students’ formative and summative uses of assessment information. Assessment in Education: Principles, Policy & Practice, 8(2), 153-169.
Greeno, J. G. (1997). On claims that answer the wrong questions. Educational Researcher, 26(1), 5-17.
Heyneman, S. P. (1997). Jim Coleman: A personal story. Educational Researcher, 26(1), 28-30.
Little, A., & Wolf, A. (Eds.). (1996). Assessment in transition: Learning, monitoring and selection in international perspective. Oxford: Elsevier Science.
Little, A. (1996). Contexts and histories: The shaping of assessment practice. In A. Little & A. Wolf (Eds.), Assessment in transition: Learning, monitoring and selection in international perspective. Oxford: Elsevier Science.
Scriven, M. (1967). The methodology of evaluation. In R. W. Tyler, R. M. Gagne, & M. Scriven (Eds.), Perspectives of curriculum evaluation. Chicago: Rand McNally.
Skinner, B. F. (1948). Walden two. New York: Macmillan.



Source: East Meet West KL 2005, An International Colloquium for International Assessment, APEC Paper 3

Kopi-O Kosong is Good For You


Coffee: Drink to Your Health. Wednesday, May 5, 2010. By Sylvia Booth Hubbard, Newsmax




For many years, coffee was considered a vice, linked with sleepless nights and cigarettes. But scientists have discovered that coffee contains potent antioxidants that can fight numerous ailments, including heart disease and diabetes. According to the American Coffee Association, 54 percent of Americans drink coffee on a daily basis, and they drink, on average, over three cups each. 

The diseases coffee may help protect against include:

• Dementia. Drinking moderate amounts of coffee during middle age — classified as three to five cups daily — can decrease the risk of dementia by 65 percent, according to a 2009 study by Swedish and Finnish researchers. 

• Liver disease. In those who drink too much alcohol, those who drank the most coffee — more than four cups every day — reduced their risk of developing alcoholic cirrhosis by 80 percent. 

• Heart disease. Research associated with The Nurses' Health Study found that women who drank two to three cups of coffee daily had a 25 percent lower risk of dying from heart disease. Along the same line, a Spanish study found that men who drank more than five cups of coffee each day lowered their risk of dying from heart disease by 44 percent, and that women who drank four to five cups each day reduced their risk by 34 percent. 

• Prostate cancer. A recent study from Harvard Medical School found that men who drank the most coffee slashed their risk of developing the fastest growing and most difficult to treat prostate cancers by more than half when compared to men who drank no coffee. 

• Gout. Drinking four or more cups of coffee each day dramatically reduces the incidence of gout, say U.S. and Canadian researchers. Men who drank four to five cups daily lowered their risk by 40 percent, and those who drank six or more cups daily reduced their risk of developing gout by 59 percent when compared to men who didn't drink coffee. 

• Breast cancer. Coffee can either reduce the risk of developing breast cancer or delay its onset, according to Swedish studies. They found that coffee alters a woman's metabolism and produces a safer balance of estrogens. Women who drank two to three cups of coffee a day reduced their cancer risk by as much as two-thirds, depending on the specific type of breast cancer. 

• Diabetes. Enjoying six or more cups of coffee daily can cut the chances of Type 2 diabetes by 54 percent in men and 30 percent in women, compared with those who don't drink coffee.

• Parkinson's disease. Several studies show that drinking coffee lowers a man's risk of developing Parkinson's up to 80 percent — and the more the better. 

• Colon cancer. A Japanese report found that women who drank three or more cups of coffee every day slashed their risk of developing colon cancer in half.

via ahmad rahimi


Taxi in Dubai

While in Dubai, take a ride in one of your favourite cars! Now eat your heart out!!! Under capitalism, the Dubai taxi. Wow! It is not just flying to Dubai for shopping; the taxis themselves can be an attraction!

 BMW 745, it is common

 Mercedes-Benz 600

 Mercedes-Benz E240 Wagon

 Porsche Cayenne

 Porsche GTS 

 Toyota RAV4 

Porsche rental company?

A leased Land Rover dedicated specifically to women drivers carrying female passengers

Lexus RX400; ah, this is the top of the range

Volvo SC70

 

Volvo S80 extended version; the starting fare in Dubai is said to be 6 yuan

 Hummer H2

 Hummer H2

 Chrysler 300C

Rolls-Royce Phantom, said to be the exclusive taxi of the Burj al-Arab hotel

 Maybach 57

 Audi R8

Lamborghini ~ this is too much! ~

Ferrari Enzo, a worldwide limited edition? Even this is used as a taxi?


Can we envy them? They don't have to pay 200 to 500% tax lah!!

via Ab Hadi Yaacob