Saturday, 8 December 2007

DUMP2.0 Development News

The next iteration of development work for DUMP2.0 is now underway. This iteration focuses on implementing the technical feature that was judged to be most popular when we surveyed the DUMP user base earlier in the year, namely the ability to export questions from DUMP into other popular quiz systems, such as Respondus and QuestionMark Perception. We are also going to implement the slightly less popular feature of exporting questions in various versions of the IMS QTI specification during this iteration, since the paths to implementing all of these export formats are actually quite closely intertwined.

Before I go on to explain what's going to be done next in detail, it is worth pausing to look at the export functionality that DUMP already offers. The current release of DUMP (0.9.7) supports:
  1. Exporting a Question Bundle as a custom web bundle for formative assessment.
  2. Exporting a Question Bundle as a paper test, with the option of highlighting the correct answers.
The nice thing about our custom web bundle is that it gives you a self-contained ZIP file that can be exploded into a VLE or your own web space and made available as a formative test resource for students. No server software is required as all of the answer checking is done by some JavaScript in the browser. The web content itself is also rather "cutting edge" in that it uses MathML for display of mathematics, giving high quality rendering of mathematics on browsers which support MathML (namely Firefox/Mozilla and Internet Explorer 6/7 with the MathPlayer plugin). Indeed, we use this functionality extensively in a number of undergraduate courses here in the School of Physics (most notably in Physics 1A). However, this simplicity also comes with a number of limitations:
  • The questions cannot be used for summative assessment (since web-savvy students can easily find the answers)
  • It's not possible to import our questions into other test systems.
  • Many VLEs and web-based systems that we may wish to house content in have surprisingly inept support for web standards so successfully deploying MathML within them ranges from difficult to impossible. (The University of Edinburgh's chosen VLE, WebCT, is an especially bad offender... which I have ranted at length about previously!)
In order to overcome these limitations, it became evident that we should:
  1. Look at how we can export our questions in alternative formats that can be imported into other test systems.
  2. Look at how we can export our maths content so that it can be imported into other systems.
As a first study, I chose to look at Respondus, which is one of the most popular quiz/test systems in use. Respondus is an easy-to-use authoring tool which allows its content to be exported in a number of formats, including the well-established QTI 1.2 specification, as well as specific targeted output formats suitable for importing into various versions of WebCT and other VLEs. This approach of having multiple exports is pragmatic, since the quality of support for QTI varies amongst different VLEs and test systems. In terms of getting material into Respondus, it accepts CSV files and QTI 1.2 files wrapped in an appropriate structure. Once inside Respondus, the data is actually stored in a native binary format, which does not appear to be documented publicly, so the export/import route is our only real possibility here. As far as maths support goes, Respondus does support MathML to a limited extent, but not as an input format, and MathML is converted to images or applets on output. The applet approach requires end-users to have non-free proprietary software installed on client machines, so is not a viable option for us in my opinion.

As a result of this investigation, I am planning to add the following new functionality to DUMP during this work iteration:
  1. Offer the option of MathML or "maths as images" in all web outputs.
  2. Add functionality to export a single question or question bundle in QTI 1.2 format, wrapped up in various ways, including an IMS Content Package and whatever is required to ensure the question imports into Respondus. (This output format will have MathML converted to images as QTI 1.2 does not support MathML.)
  3. Add functionality to export a single question or question bundle in QTI 2.1 format, wrapped up as an IMS Content Package. (This newer version of QTI does support MathML, so will offer the option of having MathML or not. However, few systems currently support QTI 2.1 so this option is currently more theoretical than usable.)
Much of this work will actually take place inside the Aardvark Processing Framework, which is the part of DUMP (and our Aardvark content management system) that does all of the heavy lifting and conversion on the raw content. Once Aardvark has been updated, an interim release of DUMP will be issued with the new functionality shoe-horned into its existing user interface.

As of this moment, feature 1 (maths as images) has been successfully prototyped and is nearly completed as a new general Aardvark feature that can be used for all Aardvark web outputs. Further work on refining this will occur in the run-up to Christmas. Feature 3 (export as QTI 2.1) has already been done (with MathML only) so will only need a little further work once feature 1 is finished. Feature 2 (export as QTI 1.2) will be implemented shortly after feature 1, concentrating on QTI 1.2 initially and then performing any further work required to get this into Respondus. Do note that one downside of the "maths as images" approach is that it will be much harder to edit or adapt questions once they've been exported, as that would involve redrawing or otherwise replacing the images, rather than editing the MathML. We do not anticipate that this will be a concern for most users though.

These new features will open up a plethora of new ways of using the questions in our database. More news soon...

Friday, 30 November 2007

Online Educa Berlin 2007

David and I have made it to OEB2007 this year for the first time. The first thing to note is that I think I have been pronouncing 'Educa' wrong. General wisdom seems to suggest that it is 'ed-doo-kah' rather than 'ed-you-ca'. So that is that sorted, at least.

It's a very corporate event, so there's an interesting mix of people, some of whom you feel are here to be seen rather than to engage in much by way of meaningful discussion. Networking events abound.

It's also huge; I mean, really H-U-G-E. The opening plenary (more of that below) had all 2000 of us in a single hall. Feeding and watering times are interesting as well, just in terms of sheer numbers.

So, the opening plenary was plagued by technical problems. In the second talk, Sugata Mitra started with only half a slide displayed. He coped admirably ("this is going to be a bit of a challenge...") reducing the audience to fits of laughter as all projection cut out ("...now this is going to be an even bigger challenge!")

His talk, on the hole in the wall experiments, bringing computers to remote locations in India, was inspirational, with the warmth of the sustained applause at its end reflecting the fact that it had clearly struck a chord with delegates.

The final speaker in the plenary session was Andrew Keen (and now for something completely different, as the phrase goes). He is widely regarded as Web2.0's most vocal contrarian and he certainly knows how to rattle people's cages. His delivery was very fire-and-brimstone; arms planted on the podium, loud and slow delivery. I felt like I was being condemned as a sinner by an aggressive priest. I don't agree with all of what he has to say, and there was not too much of an educational focus in his talk (more cultural). But he has got a point about the read-write web encouraging 'the cult of the amateur' (the title of his book). It's good to see that someone will take up the challenge of questioning the value of some of these things. (There's still a fair amount of technology-comes-first in evidence here.)

He came back later in the day to do a panel session. I missed it as we were out taking in a bit more of Berlin than just the inside of a hotel (David was chief photographer). But I heard it was a lively session.

Speaking of lively sessions (here's hoping), we are doing a best practice showcase today, talking about good ol' Physics 1A and Aardvark in a bistro-style session (their terminology, not mine). Following an initial presentation, you stand at bistro-style tables and talk to whoever wanders past. We'll see how it goes..... David has promised to get some photos of us working here (so it wasn't just a jolly).

Wednesday, 21 November 2007

Get Out of Our Facebooks!

It is surely almost impossible not to have noticed the rise of social networking over the last three years or so and, in particular, the recent dominance of Facebook over competitors such as MySpace and Bebo. This has been accompanied by a number of high-profile headlines in the press - concerning issues of privacy, libellous content, identity theft and online safety - and more general discourse about all aspects of social networking has helped maintain the typically elevated levels of hot air up in the blogosphere.

As we have seen many times in learning and teaching circles, the rise of a new technology results in an initial miasma of experimentation, excitement, confusion and fear as educators look for ways of incorporating these new ideas into their teaching. In the case of Facebook, this process appears to have started some time in 2006 and, since then, initial results and feedback from early adopters have been appearing with increasing regularity.

Recently, we have started to think about whether we could (or should) add Facebook to the mix of technologies offered in Physics 1A. Some of our initial observations and questions were:
  • Are students using Facebook to learn or as a break from learning?
  • We've seen examples where students have created their own informal "support networks" (via Facebook Groups) around certain courses. What should our involvement be with these? Should we encourage them or not?
  • Could a centrally-managed Facebook Group be used as an alternative to some or all aspects of a WebCT Course? Is that even a good idea?
As a quick and dirty attempt at getting some insight into some of these issues, we decided to ask the class about Facebook (and a number of other e-Learning ideas) during a lecture using our electronic voting system. This was followed by a focus group session with a number of student volunteers. The Facebook question we posed was as follows:
If we were to integrate the online course material with social networking sites (Facebook, MySpace, Bebo etc.) would you be...
  1. All for it, great idea
  2. Fine with it
  3. Pretty neutral
  4. Against it
  5. Dead against it; keep social and academic things separate
The response to this question is shown in the bar chart below:

Interestingly, the results were noticeably skewed in the direction of "keep out!". This was perhaps not entirely unexpected!

The follow-up discussion that took place as part of the focus group was also quite revealing:
  • Most (4 out of 5) of the focus group sample use Social Networking (mainly Facebook).
  • They could not see the benefit of integrating Social Networking with the course - Social Networking was very much seen as a place for "friends".
  • Nobody uses Facebook to explicitly connect with fellow students on the course.
The idea of creating a student-run "unofficial" Facebook group to complement the official WebCT Course was quite popular and was seen as a good way of helping everyone in such a large class get to know each other. However, it became very evident that more thought would be needed on all sides to address issues such as who initiates the group (us or them?) and who gets access to the group (open? closed? should we agree to stay out?).

Food for thought, indeed!

Wednesday, 5 September 2007

Just In Time

Just as I was starting to get a little bit fed up with the themes that seem to be emerging from this year's ALT Conference, Dylan Wiliam saved the day with a wonderfully refreshing, well-delivered keynote speech.

I'll blog more about the over-arching trends that have dominated this year once the conference has properly finished, but one of note has been this idea of putting the learner in charge of learning. This encapsulates the "we don't understand our students so should be more like them", "their attention span is only 5 nanoseconds therefore we can never expect them to be able to concentrate", "the students should decide what to learn even though they probably can't be bothered" and the "if we were more like Facebook then they'd want to learn more" concepts that we've touched on already and seen at other conferences over the last year or so. (Indeed, as I write this while skiving the last session of the day, I can hear delegates performing the equivalent of penis envy with regard to Facebook!)

These ideas are not without merit, but I'm still not convinced that we should be tearing everything up to implement them. That's where Dylan's keynote was so interesting. One of his more memorable points was about forcing students to work using various learning styles, including those that they might not necessarily like: "School is where students go to watch teachers work." (I probably did not paraphrase this quite right!) This is at odds with current trends of aiming to get educators to accommodate students' individual learning styles.

Much of his talk was on using face-to-face contact time to improve "classroom aggregators". In a nutshell, this is data accumulated during classes in response to questions or other activities that all students are required to engage in. (Using clickers is an example of this.) He gave some useful examples of good questions to ask from the fields of Mathematics and Physics, ranging from traditional MCQ and MRQ questions through to more complicated examples that might elicit a relatively large but still finite set of responses. This gives instant feedback on the current state of the class's understanding, allowing a (good) educator to take the most appropriate next step. ("Just In Time" teaching!)

The refreshing thing about this, as we know from our previous low-tech "clickers" involving bits of coloured cardboard, is that you don't really need complex technology to try these ideas out. And, despite being "disruptive" technologies in many ways, educators do not have to completely redesign curricula in order to try them out, as we've seen with the gradual roll-out of clickers across the College of Science and Engineering. Of course, creating good diagnostic/feedback questions is not easy and requires skilled educators, and technology can't really help here, so it's good to see pedagogy winning out over shiny new gadgets.

Where technology will really help is in Wiliam's ideas for modelling student progress using data gathered from these aggregators and analysed alongside existing student data. Using clickers rather than coloured cards already makes student voting data ready to be mined, so is a good start. But this is just the soft fluffy edge of a really complex (but interesting) problem which kind of falls outside the usual domain of learning technologists, so it's maybe hard to see who'll rise to the challenge.

All in all, a good keynote delivered in an enjoyably dry style with a suitably loud tie!

Tuesday, 4 September 2007

Lost at (Alt-)Sea

So, Alt-C has started and immediately the actual face-to-face conversation dips to nothingness as delegates bury their heads in the huge programme of abstracts and the all-important Russian-roulette timetable - which, outdoing last year, has 10 strands - great - even LESS chance of finding a session that'll give me that "Ah - that's interesting" wake-up call.

As David said, there's definitely still that feeling of trying to catch up: "Argh - the digital natives are restless!" and indeed the pace of change of technology and practice threatens to thwart researchers and their findings - how can one do a comparative study of students' experience and expectation of technology over time when the web landscape of today has moved on so much over just a period of 2 years - the tools used by digital natives (and some immigrants) are adopted and discarded in almost real time. Such is the length of preparation in writing and presenting at Alt-C that by the time a talk is presented its relevance to today is somewhat diminished. It's a shame but it's the way it is - conferences are just too slow and clumsy these days - so 1.0 :(


The Smell of Fear

Well, here we are at another ALT conference! For fun, I'm going to count the number of times I hear a speaker say something along the lines of "I've got a 17-year-old daughter/son/nephew/etc. and I just don't understand what they do when they're online". So far, after 3 talks and some introductory sessions, I've heard this three times.

I noticed this at our last conference back in June in Hertfordshire as well. There seems to be this sense of fear that we don't understand what young folks do when they're online these days, and I'm sometimes worried that this fear is being too much of a driver in e-Learning at the moment. "Don't throw the baby out with the bathwater" is a cliché that springs to mind!

Blog feed at ALT-C

I am at ALT-C, along with seemingly hordes of other people from Edinburgh. There is this blog feed where people can register their blog posts and have them aggregated for people to browse.

This is proving a source of high amusement as we sit outside (in the car park, on the kerbside, sunny, cup of coffee) and have a good ol' laugh at the inanity of some of the posts. People talking about the deficiencies of the sinks in the student digs, the M1 on the way down here, safe sex simulations in Second Life. Innovators in e-learning, the lot of 'em, I am sure.....

And even me talking about people talking about it.

I should go and figure out which one of the myriad of parallel sessions I will go to next.....

Wednesday, 15 August 2007

"Homogeneity is great for milk, but not for ideas"

I have to give a presentation in a few weeks time at ALT-C, in Nottingham. I started to think today about getting the details together to build my talk with. The nice people at ALT have sent some presenters' guidelines which I decided to read - they're only 3 pages long and really rather useful (I'll come back to reading documents of greater than 10 phrases in length later....)

One of the links was advice from Peter Norvig, Director of Research at Google. I am effectively his warm up man at ALT-C (I speak in the session before him.... if you don't count the theme summaries which I am not....)

So there is a page about PowerPoint presentations (Shot By Its Own Bullets) which I found myself thinking "Oh dear, that's me" as I read through it.... It is also where the title for this post came from....

Delving a bit deeper, I found Edward Tufte's page where he looks at the way NASA use ppt to deliver (or rather obscure) technical information. Look out for 6 levels of hierarchy to represent 11 phrases....fantastic. And only ever concepts that fit onto one slide!

And a while ago I found this on YouTube. I've sat through talks like this and not laughed (but I think I will be unable to help myself next time).

All this got me thinking as to whether I could (I knew I should) try and do something different this year in my ALT-C talk. I'll probably still use ppt but will try and move away from BulletLand and ... not read out every word on every slide. This could be painful, but worth it.
It's worth a shot anyway.....

This particular disease (ppt that is) could be in part responsible for something else I am coming up against more and more these days. Twice in the last week I have been told by colleagues:
"You can't possibly expect me to read all that / find it in that / notice that - it is N pages"

Here N is usually between 5 and 25. Twice this week when I have encountered this, N was 6 and 16. Interestingly it was (supposedly) academic colleagues who could not be expected to read 6 pages and students who were not going to read 16.

Why is this? Is it because we all get force-fed bullet points and snippets? I shall go and try and learn the art of speaking again for my talk instead of reading......

Thursday, 26 July 2007

DUMP User Survey Opens

As a result of DUMP 2.0 being funded, we are now gathering feedback from our user base on the outcomes of the first iteration of the DUMP project and system, which has been available since early 2007.

All registered DUMP users should have received an email about the survey and how to participate in it. If you have registered with DUMP then please do make the effort to complete the survey - it really won't take long and the results will be extremely valuable in helping us to steer the next development cycle of DUMP to help ensure that it does what you - its community - want.

If you have not tried DUMP, there's still time to give it a try and participate in our survey! See the DUMP Home Page for more information on how to register and use DUMP.

Monday, 18 June 2007

Fast time and slow time

(Not a lecture on Special Relativity. Nor does it have "2.0" appended to it.......)

I was at a colloquium today on Research-Teaching Linkages. Amongst all the talks that claimed what a Good Thing it is (and I agree with that), there was a question from Ray Land that threw up an interesting tension.

What constitutes "scholarship" and mastery of the subject (Physics in my case, but could apply to almost anything) is built on time spent engaged with material, in a somewhat cloistered, solitary environment (the lone scholar). Mastery is based on authority. In this world, time runs rather slow.

In contrast, acquisition of knowledge and information in today's world is based on immediacy, with authority based on consensus and trust (the wisdom of crowds). Chunked knowledge and information is everywhere: Wikis, Blogs (ahem....), Google, IM, chat. Here, time is fast. Really fast. Eriksen has written about this in his book The Tyranny of The Moment.

So here's the tension; research is traditionally predicated on slow time. Student engagement with information and their learning is now based on fast time. How to solve? Dunno. But we at least need to acknowledge that it is there. This in a way underlines the anecdote of a colleague of mine, who recently reported how aghast his students looked when he told them a problem would take an hour to solve. "An hour?? One problem??"

Friday, 8 June 2007

DUMP2.0 is funded

(First, a title disclaimer; it may seem like every posting on this blog gets "2.0" triumphantly added to it. But in this case there's a good reason for the 2.0. Two, in fact.... )

DUMP2.0 has been funded by the HEA Physical Sciences Centre as a development project, building on last year's project in this area (called, wait for it, DUMP - that is, Database of Useful MCQs for Physics). I am sure that DUMP as a project acronym contravenes some rule about not having acronyms that contain acronyms.....

DUMP aimed to take a collection of ad hoc MCQs (and MRQs) that we have developed to support a first year Physics course over a number of years, and turn them into an online, browseable library for the HE community to be able to make use of. It did that and you can see the system for yourself (following a simple registration procedure on our e-learning site). It's got elements of well-known online shopping and auction sites that make it easy to discover and browse materials.

So the first claim to appending a 2 (but perhaps not the ".0"?) to DUMP is that the new project builds on the development of the first. We have nearly 500 questions within the system, but the coverage is very heavily biased towards the syllabus of the first year mechanics course that many of the resources grew out of. Having populated it with a reasonable volume of useful content, we are keen for the project to be developed further, not as “more of the same”, but as a unique opportunity to take something from cottage industry to more widespread adoption. The pedagogical spirit of the questions that are currently within DUMP is just as transferable to thermodynamics, electromagnetism and quantum mechanics (all topics which students find challenging!) A related project to produce more content like this has been awarded to Bruce Sinclair and Antje Kohnle at the University of St. Andrews.

Previous experience has taught us that such projects require a critical mass of users and involvement to succeed; otherwise they are destined to become stale and stagnate. In the case of question banks or online repositories, a key issue (aside from technical concerns such as interoperability etc) is the bottleneck of content creation / provision. There are good examples of worthy systems or tools that lie sparsely populated, serving as a real disincentive to wider uptake amongst the academic community.

So here's the second "2.0"; we want to take the spirit of the wave of Web2.0 tools that are currently widely used to foster and build an online community to support and sustain the repository. We plan to use existing open source environments to develop a community around DUMP, to facilitate sharing of contact details, interests, best practice and use cases for questions. We also aim to use this to develop a ‘content ecosystem’ where users who download material from DUMP are encouraged to add-in a small amount of new content, thus ensuring the sustainability of the system and its value in the long term.

The project doesn't formally start until September 2007 and runs for one year, but we hope to have some time over the summer to start thinking (and maybe even doing). More to follow.

Tuesday, 29 May 2007

Poor Web Standards in WebCT Content/Learning Modules

One annoyance we've encountered a number of times with WebCT (both our "old" WebCT 4 Campus Edition and "new" WebCT Vista) is its surprisingly poor support for basic web standards in its Learning Modules (formerly known as Content Modules).

For those who are unaware, a "Learning Module" is essentially a bundle of learning materials that WebCT aggregates into a tree structure for easy navigation. At its simplest, this allows you to build up some kind of structure from a bunch of disparate standalone resources such as PDF files, PowerPoint presentations, images and suchlike. At its richest, it allows you to build complex learning structures from linked hypertext resources like HTML files. It's at this end of the spectrum that significant flaws start to appear in WebCT's delivery of these modules.

The content we deliver in Physics 1A is highly granular in nature with a lot of links to related material. The core of the material is highly mathematical and is deployed in standards-compliant XHTML+MathML using our Aardvark Course Content Management System.

While we found it easy to get WebCT to deliver "single pages" of this type of content, we soon realised it was impossible to use this type of material properly inside Learning Modules.

How WebCT Learning Modules Work

WebCT classifies content inside Learning Modules as either:
  1. "HTML"
  2. Not "HTML"
Both types of content can be added to the module's Table of Contents but WebCT does different things with them:
  1. HTML content is dynamically altered as it is delivered to add in some JavaScript and rewrite hyperlinks so that the WebCT navigation and breadcrumb frames are all updated correctly. Unfortunately, WebCT does this in a bizarrely cack-handed fashion and breaks all standards-compliant HTML or XHTML in the process. Most people don't notice this as most of the web is made of broken HTML and browsers are therefore very good at handling this kind of stuff by going into something usually called "quirks" mode. But this is a showstopper for us as our mathematically rich content must be delivered as well-formed XHTML+MathML in order for browsers to render it correctly. (It also doesn't help in making your content accessible.)
  2. Non-HTML content is delivered unchanged by WebCT so doesn't get mangled like HTML content and actually displays correctly. However, hyperlinks followed from these pages do not correctly update the WebCT navigation and breadcrumb frames and, for complex bodies of material like ours, this is a huge usability flaw.
As a result, we ended up having to recreate the Learning Module functionality using a lot of client-side JavaScript trickery and link to it from WebCT. (A positive outcome from this is that our content can be deployed to any web server as a rich, fully integrated frameset without requiring any server-side software, which is actually quite nice.)

If you reflect on this for a minute though, this situation is actually absurd: a web-based Virtual Learning Environment that claims to be serious about supporting standards can't even support the most basic web standard - HTML - correctly. The sad thing is, it would be reasonably easy for WebCT to be a bit more sophisticated about its HTML handling and fix this issue. Let's see if it ever happens...

More details on this issue can be found in a short note we wrote.

Wednesday, 23 May 2007

XML Pipelining in Aardvark

One thing I've tried to do with Aardvark is identify where bits of code are truly reusable and factor them out so that they can be used in other projects. From this, we've built up some nice general utility classes for doing stuff with Strings and Objects (everyone else has probably done this too!), some helper classes for doing nice things with XML and a simple framework for doing cheap and efficient databinding (that is, converting Objects to and from XML). These reusable classes are collected under the package hierarchy.

One such generalisation that's proved really useful is our class for doing XML pipelining (XMLPipeline).

What's XML pipelining?

All of the text-based Knowledge Objects in Aardvark are ultimately stored as XML, which is great for representing the underlying structure of the content. (For example, lists, paragraphs, key points, mathematics, ...). On its own, this XML is a bit abstract so needs to be processed to turn it into the various outputs Aardvark produces (e.g. nice web pages, digital overheads, PDF files). XML pipelining basically works like a traditional factory conveyor belt: the raw XML gets passed along the conveyor belt and gets gradually refined into the target output format. Why do it like this? Well, the factory analogy applies here too. People in a factory generally get very good at doing one thing repetitively and that works with XML pipelining too - we can create "pipeline steps" that do a single thing rather well, and then join all of the required steps together to build up something more complex. This is good for a number of reasons:
  • Breaking a complex process down into steps makes it easier to work with;
  • Individual steps are usually simple so can be verified to work correctly and do their job well;
  • Steps can be reused in related pipelines;
  • Steps are often so general that they can be refined for reuse in other projects.

How XML Pipelining works

(Warning: the rest of the post is very geeky!)

An XML pipeline normally consists of 3 components:
  1. A "source": that is, information flowing into the pipeline. In Aardvark, we assume that this is something which generates a stream of SAX events. (e.g. a SAX parser)
  2. Zero or more "handlers": these take incoming SAX events, do stuff to them, and send possibly different SAX events on to the next handler.
  3. A "serializer": This takes incoming SAX events and turns them into some kind of finished article. For example, it might create an XML document file or even use the incoming SAX events to build a Java Object or perform some kind of configuration work.
Not all components are necessary. For example, you can have a pipeline with no serializer. In this case, all of the data will "wash away" as it falls out the bottom of the pipeline. That sounds daft but can be useful if some of the handlers are building up information about the incoming data, such as hunting out hyperlinks or suchlike. An explicit source is also optional: we can simply fire SAX events directly at the first handler in the pipeline. We can also have pipelines with no handlers, which means that the data flowing out will be exactly the same as the data flowing in. Again, this sounds daft but can be a simple way of turning incoming SAX events into an XML document and is used in the Aardvark databinding classes. (The vanilla XML APIs in Java make this more awkward than it should be!)
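As a concrete starting point, the "no handlers" case above can be reproduced with nothing but the standard JAXP identity transform: the input is parsed and then serialized straight back out. (A minimal sketch; the class name IdentityPipeline is just for illustration.)

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// A pipeline with a source and a serializer but no handlers:
// the identity transform parses the input and writes it straight back out.
public class IdentityPipeline {
    public static String copy(String xml) throws Exception {
        StringWriter out = new StringWriter();
        TransformerFactory.newInstance().newTransformer()
            .transform(new StreamSource(new StringReader(xml)),
                       new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(copy("<doc><p>Hello</p></doc>"));
    }
}
```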

What kind of handlers can we use?

Handlers generally fall into 2 categories:
  1. A SAX filter. This is a low-level filter that simply receives SAX events, does stuff to them and fires out new SAX events. SAX filters are great if you want to make minor perturbations to a document (e.g. do something to hyperlinks, miss sections out).
  2. An XSLT transform. This lets you make really major changes to the incoming data. In Aardvark, we use these to go from the "raw" document formats to more polished output formats. XSLT is much more expensive than SAX but is often necessary and actually performs very well, especially if you reuse your stylesheets.
It's common for there to be a mixture of these two types of handler in a pipeline. Be aware that most XSLT processors will build a DOM tree from incoming SAX events so it makes sense to group XSLT handlers together and have SAX stuff before and/or after all of the XSLT.
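On the stylesheet-reuse point: TrAX lets you compile a stylesheet once into a javax.xml.transform.Templates object and then create cheap per-document Transformers from it. A small sketch (the inline stylesheet and class name are invented for illustration):

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class TemplatesDemo {
    // A toy stylesheet that wraps the document's text content in an <out> element.
    private static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:template match='/'><out><xsl:value-of select='.'/></out></xsl:template>"
      + "</xsl:stylesheet>";

    // Compile the stylesheet once; the resulting Templates object is reusable.
    static Templates compile() throws Exception {
        return TransformerFactory.newInstance()
            .newTemplates(new StreamSource(new StringReader(XSL)));
    }

    // Each call gets a fresh, cheap Transformer from the compiled Templates.
    static String transform(Templates templates, String xml) throws Exception {
        StringWriter out = new StringWriter();
        templates.newTransformer().transform(
            new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        Templates templates = compile();
        System.out.println(transform(templates, "<doc>one</doc>"));
        System.out.println(transform(templates, "<doc>two</doc>"));
    }
}
```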

XML pipelining in Java

It's possible and fairly easy to do XML pipelining using the existing Java APIs but it's not quite as nice as it should be. One reason for this is that setting up a pipeline often requires a mixture of the standard SAX API and the Java TrAX API (used for XSLT) and, being designed by two completely different bodies, they're not at all alike: a filter handler is represented by the org.xml.sax.XMLFilter interface; an XSLT handler is represented by the javax.xml.transform.sax.TransformerHandler interface. Making the pipeline work consists of configuring each handler to ensure it passes its output on to the next handler in the pipeline, and the resulting code can be a bit messy. This is where XMLPipeline comes in.
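To illustrate the messiness, here's roughly what hand-wiring even a tiny "parser -> SAX filter -> identity serializer" pipeline looks like using only the standard APIs. (A sketch; RenameFilter is an invented example filter.)

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLFilterImpl;

public class ManualPipeline {

    // An invented example filter: renames <b> elements to <strong>.
    static class RenameFilter extends XMLFilterImpl {
        private String rename(String name) {
            return "b".equals(name) ? "strong" : name;
        }
        @Override
        public void startElement(String uri, String local, String qName, Attributes atts)
                throws SAXException {
            super.startElement(uri, rename(local), rename(qName), atts);
        }
        @Override
        public void endElement(String uri, String local, String qName) throws SAXException {
            super.endElement(uri, rename(local), rename(qName));
        }
    }

    public static String run(String xml) throws Exception {
        // Source: a plain SAX parser.
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();

        // Handler: the SAX filter, wired to pull events from the parser.
        RenameFilter filter = new RenameFilter();
        filter.setParent(reader);

        // Serializer: an identity TransformerHandler writing to a StringWriter.
        SAXTransformerFactory stf = (SAXTransformerFactory) SAXTransformerFactory.newInstance();
        TransformerHandler serializer = stf.newTransformerHandler();
        StringWriter out = new StringWriter();
        serializer.setResult(new StreamResult(out));

        // Join the pieces: filter output flows into the serializer.
        filter.setContentHandler(serializer);
        filter.parse(new InputSource(new StringReader(xml)));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run("<doc><b>hi</b></doc>"));
    }
}
```

Note how the plumbing (setParent, setContentHandler, setResult) is spread across two unrelated APIs; that's exactly the boilerplate the builder class hides.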

Our class

The design of XMLPipeline is intentionally simple. (My first stab at this tried too hard to be clever and suffered as a result, so I learned from the mistakes made there!) It follows the 'builder' design pattern and is just a thin wrapper over all of the gubbins we usually need to do pipelining. Its main advantage is that it makes it really easy to assemble a pipeline, making the resulting code very easy to understand and less prone to errors and future changes.

To get started, create a new XMLPipeline(). You can then build the pipeline by adding a number of handlers using zero or more of the following methods:
  1. addFilterStep() lets you add a SAX filter to the pipeline. This is overloaded to accept either a "standard" org.xml.sax.helpers.XMLFilterImpl or a more general "lexical" filter. The difference between the two filters is that the latter also receives information about comments, entities and DTDs.
  2. addTransformStep() lets you add an XSLT transform to the pipeline. This is overloaded to take either an implementation of javax.xml.transform.Source, which locates the stylesheet to be read in or loaded, or a javax.xml.transform.Templates, which is a stylesheet that has already been compiled for reuse.
Calls to these methods simply ensure that each handler gets configured to pass its output to the next handler "downstream".

Once you've added a number of handlers, you can choose to terminate the pipeline as follows:
  1. addSerializer() will serialize the resulting XML into the javax.xml.transform.Result you pass to this method. This is the most common way of terminating the pipeline - passing a javax.xml.transform.stream.StreamResult allows you to save the resulting XML to a String or file, which is a common use scenario.
  2. addTerminalStep() takes a generic SAX org.xml.sax.ContentHandler or org.xml.sax.ext.LexicalHandler and makes that the receiver of the pipeline's output. This can be useful if you want to plug a pipeline into another pipeline or someone else's SAX input.
Once you've added a terminal step, the pipeline will not allow you to add any more handlers. You can also choose not to terminate the pipeline, as mentioned earlier.
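Putting those methods together, assembling a pipeline might look something like the sketch below. Note that XMLPipeline is our internal class, so the exact signatures and the helper names here are assumptions based on the method names described above:

```java
// Sketch only: the exact signatures are assumed, and LinkRewritingFilter
// is a hypothetical SAX filter used for illustration.
XMLPipeline pipeline = new XMLPipeline();
pipeline.addFilterStep(new LinkRewritingFilter());              // SAX filter step
pipeline.addTransformStep(new StreamSource("web-output.xsl"));  // XSLT step
pipeline.addSerializer(new StreamResult(new File("out.html"))); // terminate
```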

Once set up, you can run the pipeline in two ways:
  • Call execute(), passing in the XML to be processed (e.g. as an org.xml.sax.InputSource). This will parse the incoming XML and pass it through the pipeline.
  • Call getStep(0) to receive the first handler in the pipeline and fire your own SAX events at it. (This is how our Object -> XML databinding works.)
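The second style (no explicit source) works with the standard APIs too: a TransformerHandler will happily accept SAX events fired at it by hand. (A sketch; the class name EventSource is invented.)

```java
import java.io.StringWriter;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.helpers.AttributesImpl;

public class EventSource {
    public static String build() throws Exception {
        // An identity TransformerHandler acting as the serializer.
        SAXTransformerFactory stf = (SAXTransformerFactory) SAXTransformerFactory.newInstance();
        TransformerHandler serializer = stf.newTransformerHandler();
        StringWriter out = new StringWriter();
        serializer.setResult(new StreamResult(out));

        // No parser involved: fire SAX events at the handler directly.
        serializer.startDocument();
        serializer.startElement("", "greeting", "greeting", new AttributesImpl());
        char[] text = "hello".toCharArray();
        serializer.characters(text, 0, text.length);
        serializer.endElement("", "greeting", "greeting");
        serializer.endDocument();
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(build());
    }
}
```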
And that's it! It's nice and easy. The XMLPipeline class also tries to help with any runtime XSLT errors by unravelling any Exceptions that are produced; in normal pipelines they tend to get wrapped up by each step in the pipeline and get lost in stacktrace noise. For other goodies, have a look at the JavaDoc or source.

Friday, 11 May 2007

Learning Content 2.0

There's been a lot of excitement about "Web 2.0" in e-Learning circles over the last year or so. Some of this is undoubtedly hot air but there's a lot of interesting stuff happening and I find the social side of things very interesting. (At least, interesting enough to be trying it all out here!)

One area of e-Learning where Web 2.0 hasn't really permeated yet is good old "Learning Content". (This is the name we'll use for the stuff we supply to our students to support their learning.) I've always felt that Learning Content is thought of as untrendy and uninteresting in e-Learning circles and it doesn't get the attention it deserves. There are probably many reasons for this. In some disciplines, Learning Content is not actually all that important and the role of the educator is more concerned with "facilitating", guiding students to find, curate and assimilate materials from elsewhere, be it the web or the library. Therefore, Learning Content is not seen as important, since it can always be imported from elsewhere.

In Physical and Mathematical sciences domains like ours, we've traditionally attached more importance to Learning Content and it's common for this to form a fundamental component of the teaching experience we give to students. Why's this? One reason is that, especially in early undergraduate courses teaching fundamental concepts, we feel it's important to be very clear about what students need to know, and the Learning Content we give them quite explicitly maps out the depth and breadth they are required to cover.

The traditional form of Learning Content in our domain has been good old "Printed Lecture Notes". These are normally produced by lecturers, written in LaTeX and turned into PDF files using the normal LaTeX processing workflow. Early use of the web for teaching involved dumping these files on the web, which is equivalent to the (sadly) ubiquitous "PowerPoint slides on the web" that permeates a lot of so-called e-Learning even to this day. Many Courses haven't gone any further than this (and some don't really need to). With Physics 1A, we've gone further by offering richer, more interactive XHTML+MathML-based content for a good few years now. We think it's quite good and the students like it too.

Over the last few months, we've been thinking about how this can be improved. One criticism of the existing content is that it's fairly static and rigid. This was actually intentional as it allows the content to be deployed online (both within or outside VLEs) or offline with virtually no fuss and no effort. We now think this can be done better and that we can make it easier for students to navigate and organise what is actually quite a substantial body of content. So we're going to look at improving things for the students in this area. Example possibilities are letting students build "To Do lists" or attach notes or reminders to pages, better breadcrumb and contextual navigation and revision aids like "Build me a random self test and let me know my score".

We also want to see if we can make it easier for students to "make the content their own".
There are lots of things we can do here. One is to allow students to weave private annotations into the notes, for example, by writing down the conclusion of a rare but important "A ha!" or even "D'oh" moment. Similarly, we could also allow students to make public comments about sections in the notes in order to get feedback or assistance from their peers, making the notes a bit bloggy. Educators could use this feedback to improve notes for subsequent years. This draws on the increased use of Wikis in educational contexts and even things like the MySQL online handbook, which has been around for years now and is often highly praised for allowing the community to build on and enhance the core material within. There are lots of other possibilities which we plan to look into. All of these are made do-able by the way the existing material is constructed and deployed (using Aardvark) so we've already got a good foundation to build on.

Another aspect worth looking at is whether we can exploit students' use of social networking tools within an educational context. Our Physics 1A course usually attracts around 250 students and it's very difficult to form a sense of community in such a large course. If lots of our students already use Facebook, then it's worth looking at whether we can use it to help the class bond a little better. Public annotations in notes could very easily link to a little "profile" page for each student that has links to their favourite networking sites. That avoids us having to try to build our own networking tools (which probably wouldn't work as well or be used as much) and is actually quite cheap for us to implement. Will students like this? Or will they resent us encroaching into aspects of their online lives that they consider to be separate from learning?

We're calling all of this "Learning Content 2.0" and will hopefully be able to study this in more detail pending funding becoming available! I'll leave it up to a tag cloud expert to create a formal definition of Learning Content 2.0... I'm off to have lunch, which I've been looking forward to all morning.

Thursday, 10 May 2007

Aardvark 0.13.3 hits the wild!

After a couple of months of fairly intensive work, I finally got the latest release of Aardvark out the door onto our teaching server last month. Yay! Today sees the 3rd update since then, which adds a number of conveniences when creating new Nodes that might benefit people who are new to Aardvark. It also adds a bit of debugging code to diagnose a bit of occasional (but minor) strangeness that has been identified on the teaching server. Hmmm...

I think there will be a few more minor releases over the next few weeks as Aardvark is currently being used in anger by a new user for a brand new Course and it's been quite interesting getting some feedback from newbies. (Actually, I've already found a silly bug in the caching code in the Aardvark Content Manager that will be fixed tomorrow so expect 0.13.4 to appear very soon. D'oh!)

Want to find out what's new in Aardvark? Then read the release notes. If you don't know what Aardvark is, you can find out at our e-Learning site.

Erm, welcome?!

Despite worries about contributing to global warming in the blogosphere, we thought it might be a fun idea to start a blog for our group in the School of Physics at the University of Edinburgh.

What possessed us? Well, it seemed like a good idea at the time! It's also a potentially useful experiment in a number of ways:
  • We're planning to do some work over the next few months concerning the use of social software and so-called "Web 2.0" tools by our students. Attempting to use this kind of stuff ourselves is the best way of finding out how well it all works. (Or doesn't...)
  • It's interesting to see how well we can use tools that are not explicitly provided or supported by the University's computing services for work purposes. How do we integrate them? How do we manage having some of our work stuff scattered over the internet? How do we export stuff from these tools if something else comes along? ...
  • I'm interested in seeing how useful a blog is as a tool for reflection, learning and discussion within (and outwith) our group.
So, erm, welcome!