Practical Assessments

This section considers the setting of tasks and assessing of achievement by some form of practical or portfolio work rather than a written or oral exam. Methods include essays and prepared presentations as well as pure programming tasks. It is based on examples from the field of computing.

Practical work is popular with many students because it eliminates or reduces the overall weighting of an exam at the end of a teaching unit. This sort of work can be hard to assess: while there will be good and less good work, there will be fewer clear-cut answers to the questions. If a question asks what 2 + 2 is, the expected answer is 4. A task that involves describing persistence of vision with regard to computer animation is much harder to mark as correct or wrong. This question has been asked, and a well-stated opinion that animation has nothing to do with persistence of vision would be one of several possible correct answers.

There is also a lack of clarity as to who has done the work. With a set written task in a quiet exam room it will be clear who has written what. If a piece of work is set and then collected in, it is not easy to prove whether some or all of the solution came from the Internet, friends and family, or even the teacher. An exam board inspector quoted one case in which the assessments submitted for checking still included the helpful ‘insert your name here’ phrase in the footer.

One way to reduce the likelihood of the student getting help is to set a time limit to complete the task. The City and Guilds programming modules made good use of this. The time allowed was usually 3 or 4 hours but could be as long as 22 hours for some tasks that required a lot of written planning. When the first of these tasks were written the Internet was not a major source of assistance. The likes of Yahoo and Lycos were not much use for quickly solving programming problems. There was no ‘Stack Overflow’, nor a large body of users from whom to hope for a relevant reply on a forum. The time allowed was just about enough to work out the set problem, then write and test a solution. Even with access to the modern Internet it would be good going to Google the code solutions and still complete within the given time.

Some of the problems set were rather clever. One was the game of ‘Bulls and Cows’ or ‘Mastermind’. The computer selects a set of 4 colours from a possible 6. The user must guess the colours selected (which could include duplicates) and their correct order. The first guess is going to be random, but with each guess the computer feeds back a white token for each correct colour and a black token for each correct colour in the correct position. The player may receive a black token and know that 1 of the 4 colours guessed is correct and in the right position, but will not know which of the 4. The computer logic to work this out takes a little thought. This, together with a graphical front end in which the user clicks on colours to select and test them, had to be achieved within 4 hours.
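The scoring logic that "takes a little thought" can be sketched in a few lines of Python. This is only an illustration, not the City and Guilds model answer. It assumes the usual Mastermind convention that a white token is awarded only for a correct colour that has not already earned a black token, and the function name `score_guess` and the single-letter colour codes are made up for the example.

```python
from collections import Counter


def score_guess(secret, guess):
    """Return (black, white) token counts for a guess against the secret code.

    Assumption: a peg that earns a black token does not also earn a white one.
    """
    # Black tokens: correct colour in the correct position.
    black = sum(s == g for s, g in zip(secret, guess))
    # Colours shared by both codes regardless of position. The Counter
    # intersection keeps the smaller count, so duplicates are not over-scored.
    shared = sum((Counter(secret) & Counter(guess)).values())
    # White tokens: shared colours that are not already black.
    white = shared - black
    return black, white


if __name__ == "__main__":
    # Hypothetical colour codes: R, G, B, Y, O, P.
    print(score_guess("RRGB", "RGRY"))
```

The duplicate handling is the part students tend to get wrong: counting every matching colour without the intersection step awards too many white tokens when the secret or the guess repeats a colour.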

This was about the most complex logic test in the City and Guilds portfolio. Some of the seemingly more involved programming tasks, such as classes and inheritance, were easier to pass if the student understood the concept of object-orientated programming. In general a unit of teaching would require the student to pass 1 or 2 of these tasks, each drawn from a bank of up to 4 (a possible 8 available tests in total). Where there were 2 tasks to be passed, the first would be considerably easier than the second and could be set and assessed at an earlier point in the teaching plan. No single task covered the entire syllabus, but if there were 4 possible tasks then together they would cover all of it. For example, one task might concentrate on file access and another would test programming drag and drop interactions. There would be considerable overlap, so some syllabus areas would be covered in all of the tasks. If there were 4 good tasks available, one could be used as a mock, set open book but not counting towards the final assessment. If no suitable mock were available, at least one would need to be created. The various programming units tended to have similar logical content, so it could be possible to modify an assessment set for C programming to use as a Java mock assessment.

These assessments need to be graded. The City and Guilds solution was to identify a number of key points. If the student achieved enough of these they would gain a pass, credit or distinction. The ideal number of key points would be 100, as this makes the grading maths easy, but any number above about 40 gives something to work with. To add some constraints, a small number of specified key points must be achieved in order to pass or gain a credit for the unit. Some of these points are easier to check off than others, and some are less than ideally clear. There can also be a chain of things to check. For example, a website creation test may require a page called master.htm to be created as one key point. Others would be graded on the code within the page master.htm. If the original page had been named home.html then all of the criteria relying on the content of master.htm are in theory failed. In practice some of the original name-based key points would be marked as failed, but most of the points that follow through from the named page would be noted as passed. In general the number and subject of the various key points ensured that a student would pass first time, pass another assessment soon after, or be unlikely to pass without substantial additional learning. Some of the criteria might be unclear but would be marked as passed if the general area seemed to be covered by the student. If the student were just on the line of a grading level, perhaps having 40 of the 40 points needed, the overall quality of the work would be considered. This could be used to mark some criteria up or down to give a clear pass or fail.
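The key point scheme can be expressed as a small calculation. The sketch below is a guess at the general shape rather than the City and Guilds rules: the grade thresholds (40, 55 and 70 points) and the key point names are invented for illustration, and the function `grade` is hypothetical.

```python
def grade(achieved, mandatory, thresholds):
    """Return a grade name from the key points a student has achieved.

    achieved   -- set of key point identifiers marked as passed
    mandatory  -- set of key points that must all be achieved for any grade
    thresholds -- mapping of grade name to the minimum number of key points
    """
    # Missing any mandatory key point fails the unit regardless of the total.
    if not mandatory <= achieved:
        return "fail"
    count = len(achieved)
    # Try the most demanding grade first.
    ordered = sorted(thresholds.items(), key=lambda item: item[1], reverse=True)
    for name, minimum in ordered:
        if count >= minimum:
            return name
    return "fail"


if __name__ == "__main__":
    # Hypothetical boundaries; a real scheme would come from the exam board.
    boundaries = {"pass": 40, "credit": 55, "distinction": 70}
    student = {f"kp{n}" for n in range(1, 61)}  # 60 key points achieved
    print(grade(student, {"kp1", "kp2"}, boundaries))
```

A calculation like this deals only with the clear cases. The borderline student sitting exactly on a boundary still needs the judgement about overall quality described above.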

This timed assessment plan depends on having more than 1 assessment and keeping the nature of those assessments under wraps before they are issued. There is a limit to our imagination in creating programming problems, and example solutions to most of these problems will be lurking somewhere on the Internet. It is virtually certain that at least 1 student from a class will fail or simply not show up for the set session. If they then know what is involved they will have an advantage when it is their turn to take the test. If the class knows that the initial test will be the easiest to achieve and that subsequent settings are possibly slightly harder, then they have some incentive to show up and get it right first time. If the class can be trusted then they can each be set an assessment as soon as they complete enough class work to indicate that they will probably pass. When they have passed they can be set additional work or go through any remaining assessments for fun. The City and Guilds units usually ran at levels 1, 2 and 3. A student who completed level 1 could be set on level 2 work and so on, even if they had not been entered for the exam and could not pass out at that level in that particular course.

To summarise, here are the key decisions in working out a timed practical assessment.

Will the assessment cover all of the syllabus or just part of it? It would be best to cover only part of the whole. There are probably sections that rely heavily on theory knowledge that would be best covered by an essay or exam. In a programming environment there will be a number of skills or techniques to be covered. Jamming all of these into a single assessment may end up with some pieces of work that do not fit in well with the whole.

Having worked out what tasks will be covered, must all of them be achieved and how will success be measured? A written exam could conceivably cover the whole of a syllabus, but the student is not often required to get 100% correct answers. Success could be measured by simple yes or no points as with City and Guilds. Another approach is to set a range of guidelines gradually increasing in complexity. The student might be required to create at least 6 linked web pages for a pass. For a merit correct CSS must be used, and for a distinction the HTML and CSS must pass a code validation check. There should be some deliberate vagueness in these criteria. If it is expressly stated what must be achieved for a certain points grade, it may be possible for a piece of otherwise respectable work to not get enough marks to be awarded a pass. With some room for movement in the grade boundaries, the colour and design of a website could be used to outweigh some code errors (assuming that design and code are both within the syllabus).

The duration of the assessment will be partly determined by the length of class sessions. If work has to be stopped and restarted in a subsequent session, the student will have the opportunity to continue working on the problem outside class. Even if the instructions and student work in progress are kept at a location that is only accessible from the classroom, the students will still have knowledge of the task between sessions. They can search the Internet, practise and try out similar problems outside the allocated assessment schedule. The final time set should certainly be no shorter than that taken by the teacher to complete the tasks. On the other hand, giving too much time will allow the students to search for solutions and take away the time pressure that this sort of situation relies on (the task is not too hard, but having to do it under pressure tests the depth of knowledge). It is hard to set a time without some experience. If a mock has been previously set with no time limit, the actual time taken can be used to work out how long the real assessment should be. Plucking a number out of the air, twice the time taken by the teacher to complete the assessment is one place to start.

From a programming point of view, here are some examples that could be considered as relatively short tasks. ‘Bulls and Cows’ is described above. Data validation is always popular; in real life this is often achieved by running through regular expressions. At lower levels of ability some sort of breaking up of Strings and data type checking will get through this. Barcodes in their various forms are an example data type to validate; email addresses are really too complex to do properly without regular expressions. Hangman poses a useful test for working with Strings. Most languages have some sort of String checking library, but not so many will cover multiple occurrences of a letter and their positions in words such as ‘initiative’. Hangman examples can be expanded to graphics handling, with the gallows being drawn, and file handling, as the chosen word is loaded from a dictionary. A simpler problem is a lottery simulator; this should avoid the same number being picked twice in the same drawing of lottery balls.
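To give a feel for the size of these tasks, here is a rough Python sketch of the core of three of them. The choices are assumptions made for the example: EAN-13 is picked as one of the "various forms" of barcode, the lottery is taken to be 6 balls from 49, and all three function names are invented.

```python
import random


def is_valid_ean13(code):
    """Check the length, characters and check digit of an EAN-13 barcode string."""
    if len(code) != 13 or not (code.isascii() and code.isdigit()):
        return False
    digits = [int(ch) for ch in code]
    # The first 12 digits are weighted 1, 3, 1, 3, ... from the left.
    total = sum(d * (3 if i % 2 else 1) for i, d in enumerate(digits[:12]))
    # The check digit brings the weighted total up to a multiple of 10.
    return (10 - total % 10) % 10 == digits[12]


def letter_positions(word, letter):
    """Return every index at which the guessed letter occurs (Hangman)."""
    return [i for i, ch in enumerate(word) if ch == letter]


def draw_lottery(balls=6, highest=49):
    """Draw distinct balls; random.sample never picks the same number twice."""
    return sorted(random.sample(range(1, highest + 1), balls))


if __name__ == "__main__":
    print(is_valid_ean13("4006381333931"))
    print(letter_positions("initiative", "i"))
    print(draw_lottery())
```

Using `random.sample` hides the duplicate problem that makes the lottery task worth setting, so an assessment would probably ask the student to write that part by hand.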

If the estimated time required to complete the assessment runs to more than a school day’s work, it is going to run without an exact time limit. There will be a date on which the assessment is first set and a date when it is expected back, but no limit (other than the 24 available) to the hours spent per day on that task. If there is no penalty relating to when the assessment is completed it could be an endless task. The practical final limit is not even when the teacher has finally gone on holiday and is not around to mark the work. The really final limit will be some set time after the qualification has finally expired and is no longer valid. In practice this works out at around 4 years to complete an assessment if no other constraint is put upon it. This will not be of much help to a student wishing to progress to the next year of study the following September.

A university approach is to reduce the percentage of marks awarded for each working day (5% or 10% each day) the work is late, until a final date (often 10 days) when a mark of 0 is awarded. I have had to mark work that was submitted 9 days overdue; in that case the work would not have passed even if completed on time. There will be a mitigation process if the student has some very good reason why they cannot complete in time. Not being able to solve the problem does not count as a reason; any application for mitigation requires proof to be provided.
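The sliding penalty is simple to calculate once the rule is pinned down. The sketch below makes several assumptions that a real institution would define for itself: the penalty is taken as percentage points deducted from the raw mark, working days are Monday to Friday with no allowance for public holidays, and the mark drops to 0 once the work is 10 or more working days late. The function names are hypothetical.

```python
from datetime import date, timedelta


def working_days_late(deadline, submitted):
    """Count the weekdays after the deadline, up to and including submission."""
    days = 0
    current = deadline
    while current < submitted:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday is 0, Friday is 4
            days += 1
    return days


def penalised_mark(raw_mark, deadline, submitted, per_day=5, cutoff_days=10):
    """Deduct per_day percentage points per working day late; 0 at the cutoff."""
    late = working_days_late(deadline, submitted)
    if late >= cutoff_days:
        return 0
    return max(0, raw_mark - per_day * late)


if __name__ == "__main__":
    deadline = date(2024, 3, 1)  # a Friday
    # Handed in on the following Monday: one working day late.
    print(penalised_mark(60, deadline, date(2024, 3, 4)))
```

Under these assumptions, work handed in 9 working days late at 5 points a day loses 45 points, so it would need a very high raw mark to survive the penalty.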

So a time limit needs to be set, together with some understanding of what will happen if that time limit is not met. The assessment can be chunked up into small units taking short time periods, or many criteria can be rolled into a much longer piece of work. If smaller chunks are used then the student can see progress as the syllabus is gradually checked off, and the marking is broken down into manageable chunks. Combining as many criteria as possible into a single piece of work can reduce a lot of duplication, as a single piece of student evidence can be mapped to more than one part of the overall syllabus. The student should be creating less overall work than by completing several smaller tasks and will build up some larger product that will be more satisfying for them. This sort of combined portfolio tends to be a nightmare to mark, as the individual pieces have to be pulled apart again to map to their respective parts in the original syllabus.

Regardless of what is to be assessed, the achievement needs to be quantified. This is to deflect the arguments of certain students who assert that their minimalist approach still satisfies the criteria that they have been set. It is probable that the effort put into these arguments is greater than that required to make a good job of the work in the first place. Occasionally an examination board will provide exact guidance, such as stating that a design storyboard must have at least 6 frames. More often the guidelines are less helpful. If the student is required to produce computer animations then they must produce more than one. Requiring the student to create two is enough. Asking for three and then marking the best two offers some backup in case the student has difficulty coming up to the mark.

Having chosen the number of animations, their quality and length need to be set. An animated gif together with a stop motion short film could make two animations, but are these of enough depth and length for the exam board? If there are no clear guidelines then the teacher must make a guess at the standard and quantify it. The minimum length of the animation should be stated in seconds or frames. Flash is a common animation creator and it makes use of tweens. Following the hand-drawn animation idea, the animation expert draws the key images and a cheaper worker (now the computer) creates the images in between (or tweens). An animation involving many tweens could be relatively long but involve less student work than one that makes less use of tweening. If the Flash approach is required, the number of key frames and layers should be stated.

Where more than one animation is required these do not all need to be of equal depth. One could be a relatively short practice piece and another a longer story that has been built up over several sessions. The longer piece should be used to showcase the student’s ability, with the shorter animation addressing some criteria that are not covered by the longer one and, of course, blatantly fulfilling the multiple animation requirement.

Together with quantities such as length or the inclusion of stated programming constructs, there are certain presentation aspects that must also be stated. These relate to the audience, the language used and respect for copyright. If these boundaries are not stated within the original syllabus, or are hidden within the exam board rules, they should be made clear to the student. At higher levels of work the student should be expected to find these details within a course handbook. Less advanced students will need guidelines spelt out in each assignment. This is to avoid the poor but common excuse that the guidelines did not say that it could not be done. Key factors that should be refused are poor grammar, bad language, swearing, offensive terms and mobile phone shorthand. If the piece includes images, these should equally not be violent or offensive and should be created by the student. The ease of use of graphics packages makes it possible to acquire an image, trace around it and otherwise modify its appearance. There is some skill involved in doing this well, but any modifications need to be very substantial to qualify as the creation of new images owned by the modifier and not the original author. A target audience may be set for the student work and the language used should reflect that. If an animation is required to teach a subject to secondary or high school students aged from 14 to 16, any speech or documentation would be different from that for an audience of 8 to 11 year olds.

When a written piece is required the topics need to be spelt out, along with the length and the degree of analysis expected. A page count is not sufficient; it invites pages with a large font size, or the use of title and index pages with only a handful of words on each. An exam board criterion for a game design document required at least 15 pages of work. The poorer student work included pages with limited text and extensive lists of in-game objects, all aiming to bulk up the submission. One teacher appeased the students by interpreting the pages as PowerPoint pages. If the exam board had wanted this they would have specifically stated a presentation of 15 or more pages. The board was partly at fault for stating length in pages. From a quality point of view, 10 pages of good game ideas and a well described setting would be better than 20 pages of lists of game items. If 15 pages of text are required then that approximates to 12,000 words, a respectable word count for a final year undergraduate paper. The teacher needs to interpret what the board requires and set limits. The requirements could be set as at least 15 pages including 2,000 words describing the game setting, 2,000 words on puzzles and game interactions, 1,000 words on graphics, 1,000 words on game items and character attributes, and at least 6 storyboard frames indicating game levels. A suitable design document template can be provided, with sections to aid in setting up the content areas and helping with the word count.
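Section word budgets of this kind are easy to check automatically before marking starts. The sketch below uses the four budgets suggested above, which add up to 6,000 words. The section keys, the `MINIMUM_WORDS` table and the function `shortfalls` are invented for the example, and words are counted crudely by splitting on whitespace.

```python
# Hypothetical section names mapped to the minimum word counts suggested above.
MINIMUM_WORDS = {
    "setting": 2000,
    "puzzles": 2000,
    "graphics": 1000,
    "items": 1000,
}


def shortfalls(sections, minimums=MINIMUM_WORDS):
    """Return {section: words still needed} for every section under its minimum.

    sections maps a section name to its text; a missing section counts as empty.
    """
    missing = {}
    for name, minimum in minimums.items():
        # Crude count: any run of non-whitespace characters is one word.
        count = len(sections.get(name, "").split())
        if count < minimum:
            missing[name] = minimum - count
    return missing


if __name__ == "__main__":
    draft = {
        "setting": "word " * 2000,
        "puzzles": "word " * 1500,
        "graphics": "word " * 1000,
    }
    print(shortfalls(draft))
```

A count like this says nothing about quality, but it does settle the argument over whether a title page and a list of game items amount to the required length.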

Exam boards like to use words to qualify how much work should be put into a written topic. They rarely define these words to help the student and teacher decide if a piece of written work is sufficient. One of the few exceptions is the International Baccalaureate, which does define the key verbs for assessment. The depth of analysis required runs roughly in this order: list, describe, compare, explain, discuss or analyse. Ideally a student should be asked to write about and discuss a subject; the quality of writing and analysis can then be used to mark the work at a level from fail, through list, up to analyse, with marks awarded accordingly. Unfortunately board criteria often require one thing to be described and a quite different thing to be analysed.

List is the simplest instruction; it may even include how many items should be listed. There is no need to explain why an item has been added to the list. The list might omit some correct items but should not include any incorrect ones.

Describe requires some wording about the selected subject. This is a little more than a definition but not much. Unlike a list, proper sentence construction is required. The information has probably come from class notes or the Internet. The student would need to cite the source and put the work in their own words, but no interpretation of the facts is required.

Compare needs 2 or more items to compare and is often achieved by a table with a column for each. Because the task states what to compare, a compare task can be easier than a description: the task is more specific. An exam board example required the features of animated gifs to be described as a simple task. A more complex task was to compare gifs with another animation format. Flash animations make a good comparison, and students often did better in this comparison than in the description of the features of animated gifs.

Explain requires some student input as to why the subject has been chosen or why it behaves as it does. This explanation should come from the student and not, as is often the case, from some text or Internet article. The student could cite a range of articles and discuss how each is right or wrong. The clue is in the wording here: the student is moving into the discuss category.

Discuss and analyse require the student’s own views to be made clear and to be justified. This type of work is often substantially shorter than an explanation. The reader does not need to be informed about details of the subject as in the explanation. Sources can be referenced and briefly summarised, leaving the bulk of the words available to put forward the student’s opinions and to justify them. A student may require some practice to get the hang of discuss and analyse. A common problem is for them to assume that longer is better. Essays of 10,000 or more words may be submitted when 1,500 would be quite sufficient for the discussion. The extra length benefits no one and is a curse on the teacher, who will need to read all of it if only to give feedback on grammar and spelling issues. The quality or paucity of the document should be apparent from the first couple of pages. With an increased length of document there is an increased chance of information being included that is irrelevant, misunderstood or just plain wrong.
