Folksy specification

Folksy is a framework for educational games.

Introduction

The idea is to develop a simple framework to run graphical educational games on a variety of platforms. The first priority is the web, then smartphones.

The games should be easily modified and personalized. For example, you should be able to create an alphabet learning game for your kid with images and sounds of relatives, "A for Amanda" etc.

Game

A game is a complete game of some game type with media files.

To the game engines — the code that runs the game on various platforms — the game is a JSON file with metadata and links to images and sounds.

File formats

JPEG should be used for photographic images. PNG should be used for computer generated images, such as letters.
- Future: GIF might be usable for simple animations (although looping through PNGs with code might work just as well?). I hope to also provide SVG support soon.
For sound files, both Ogg Vorbis and MP3 should be provided. I really wish we could stick with Ogg Vorbis, since MP3 has patent problems. Unfortunately, that would leave a large part of our audience in the cold. The web engine should try HTML5 audio and as a fallback use Flash.
- I’d like more specification here on what bitrate, sample rate etc. we want to use — based on what works best in browsers
I’m guessing the same kind of issue will appear when we want to support video. Theora should be our first hand choice, but needs to have a fallback.

JSON structure

These attributes should be set at the root of the JSON.

id: A simple string identifying the game. (Maybe use Java style unique identifiers, like org.folksy.* ??)
name: A more descriptive name.
format: File format version, an integer. This should only be bumped when truly incompatible changes to the format are introduced. Features can be added by simply adding fields. (Unknown fields should be silently ignored by game engines.)
gametype: See below for available gametypes.
gametype_format: Like format, but just for this gametype.

Game types

The currently implemented game type—see {Folkes Folk}--is whatletter. Currently reworking into the one-to-one gametype.

whatletter: The game consists of a set of images (of faces, objects…) with corresponding sounds, typically voices that say: "A is for apple", and so on. The player then has to click on the right letter.
one-to-one: This is where we’re going. See below.

One idea is to separate the game logic into platform independent JavaScript.

One to one

A learning tool that works by establishing relations between stimuli. Let’s define some terminology:

A stimulus (plural: stimuli) is, as you’ll find in any psychology textbook, an energy pattern (e.g. light or sound) which is registered by the senses (definition from Wikipedia). We’ll use it broadly. The following are some more specific terms used in this project.
At the moment, we use image stimuli (still images) and sound stimuli (simple sound files). An atomic stimulus is either an image stimulus or a sound stimulus; basically, a stimulus that will never be torn apart into several sub-stimuli. A composite stimulus is a stimulus made up from one or several atomic stimuli.
Borrowing from experimental psychology research procedures (e.g. Dymond and Whelan, 2010):
- a cue stimulus is a cue for what the task at hand is, e.g. "choose the similar object". This is, at the time of writing, implicit in Folksy games. It may be any composite stimulus.
- a sample stimulus is a question to be answered. For example, an image of an apple, where the task is to respond with the word "apple". It may be any composite stimulus. We will also call this a prompt stimulus, since the word sample is confusing here.
- a comparison stimulus is an answer. For exampe, the word "apple"; usually presented next to other comparison stimuli. It may conceptually be any composite stimulus, but in practice it is difficult or unintuitive to choose between sound stimuli (although I have some ideas), so it might be presented as just the image part of a image+sound stimulus.

Specify the game logic algorithms. Say we have a set of sound stimuli, S(i) for i = 1…n, and a set of image stimuli in the form of glyphs, G(i) for i = 1…n. We want to learn the relation R(S, G); presented with sample stimulus S(i), the user learns to respond with G(i).

We measure the strength of each relation as p(i), a value that varies from 1 (completely unlearned) to 0 (theoretical max; fully learned). Values above 1 represent anti-learning, when the user performs worse than would be expected by chance. The strength can be interpreted as the probability of being correct by chance. We’ll get to that. Initially, p(i) = 1 for all i.

Picking a relation. We assume p(i) > 0 since we never reach perfect learning. Add up all p(i) cumulatively to psum(i) so that psum(1) = p(1); psum(i + 1) = psum(i) + p(i + 1). Pick a random value r from [0, psum(n)]. The smallest i such that psum(i) > r is our pick.

With this algorithm, any relation can get picked at any moment, but the more learned a relation is, the less likely it is picked.

However, we might want to limit the set of relations chosen from, so that we do not introduce too many new stimuli at the same time. How about we try to keep the unlearned mass — psum(n) — beneath an upper bound? This could possibly be adjusted for the cognitive capacity of the learner.

Statistics. If we correctly respond when presented with n comparison stimuli, the probability that this was pure chance is 1/n. If this happens m times in a row, the probability of the null hypothesis ("no learning") is 1/mn. We could use this as a measure of the strength of the relation; the lower p-value, the stronger relation. If a false answer is emitted, the p-value should be adjusted upward. We then multiply it by n. One way to make sense of this: If we keep giving random answers, our p should oscillate somewhere around 1.

(I guess any algorithm like this needs to be adjusted heuristically after observing how it works.)

Picking the comparison stimuli. Consider the situation where we want to learn a completely new relation between unknown stimuli such as, for me, the relation between Arabic glyphs and their vocalization. We want the tool to not only help us memorize, but to teach the relation. It should be theoretically possible to learn the relation without ever having to guess. (The game needs to continuosly adapt so that it’s at the right level of difficulty to stay interesting.) Therefore, on the very first trial, the user is presented with a sample stimulus (a voice saying "’alif") and only one comparison stimuli (an ’alif glyph). The user responds, and the relation is then marked as teached. On each following trial, at maximum one unteached relation should be exposed, allowing the user to logically conclude the right answer.The second trial might then be the vocalization of "bā’" as sample stimulus, and ’alif and bā’ glyphs as comparison stimuli. The bā’-bā’ relation is then considered teached (even if the user does not initially give the right answer). However, the second trial might also be the ’alif sample stimuli again, with alif and bā’ glyphs as comparison stimuli — this does not make the bā’-bā’ relation teached, obviously.

Some stimuli are more similar to each other then others, making them difficult to differentiate. For example, when learning Swedish script, it might be hard to learn the difference between Å and Ä. To make it easy on the learner, we could make Å and Ä less likely to occur in the same comparison trial. A stimulus set S might specify a similarity function sim(i, j) that gives the similarity between 0 and 1.

After a few relations are taught, we need to decide what the size of the comparison set should be. I’m playing with the idea of letting the user simply decide for herself, even during game. Yes, this allows you to "cheat", but the premise of this game is that you’re curious, want to learn, and know best yourself what kind of cognitive load you can handle at the moment. This should all be very configurable for various situations. One possibility is to have a "parent"/"teacher" console.

Anyway, a good choice could be csize = 2 for preschool kids and csize = 4 for the rest uf us.

Composing the comparison set then becomes a random pick weighted with p (we want well-learned alternatives) and similarity with the correct stimulus.

Scoring. A mean value of the p’s could measure how well the whole relation set is learned. There should be continuous rewards, e.g. a star every time you reach below pmean = 0.9, 0.8, 0.7…

Themes

Graphical and auditive themes in common for many games.

The typeface used for letter images is GNU FreeFont Sans, which is licenced with GNU GPL, but allows any licence to be used for documents where the typeface is used.

Game engines

There is currently only the web game engine. A mobile version will be written next.

The web game engine

The game engine for the web is written in JavaScript using jQuery. Audio on the web is hard to get right. We’re using http://www.createjs.com/SoundJS which uses HTML5 audio or Flash.

jsdoc-toolkit is used for documentation. (Ubuntu package: jsdoc-toolkit) (Actually, it’s not, and I’m not so sure about it. Seems inflexible. I think an explicit documenting system, like PDoc, is a better idea for a language like JavaScript. Google Closure Compiler, however, uses jsdoc tags for controlling optimization in detail. So I’m not sure what to do.)

Jasmine (included) is used for unit testing.

{jquery.transform} is a nice plugin to allow animations of CSS transforms such as rotations.

SoundManager problem: What I’d like is if I could give it an ogg file and an mp3 file, and then it would use the ogg wherever possible, otherwise use mp3. I therefore don’t want either mp3 or ogg to be required. But SoundManager forces this. This doesn’t work:

soundManager.preferFlash = false;
soundManager.audioFormats = {
    'mp3': {
        'type': ['audio/mpeg; codecs="mp3"', 'audio/mpeg', 'audio/mp3', 'audio/MPA', 'audio/mpa-robust'],
        'required': false
    },
    'ogg': {
        'type': ['audio/ogg; codecs=vorbis'],
        'required': false
    }
};

Update: Now using SoundJS instead of SoundManager2, haven’t tested it thoroughly in many browsers though!

W3 HTML5 spec for audio — WHATWG spec
type for ogg is type=audio/ogg; codecs=vorbis (spec for codes
Speex is a free codec specifically designed for speech. Probably no browsers support this though…

Mobile game engine

Possibly using PhoneGap!
Possibly also jQuery Mobile!

Game tool

The game tool is a tool to generate games from source files, written in Python. Here we use the very readable and editable file format YAML.

The game tool can be used in different ways depending on your setup. The simplest is to type

folsky.py buildzip my-game.yaml

This will give you a zip file to uncompress at your web server, ready to run.

Note	This is not how it’s working at the moment.

Libraries and other tools used:

Python Imaging Library for image manipulations
PyYAML 3.10 — Ubuntu package: python-yaml
pycountry — Ubuntu package: python-pycountry [not used at the moment]
ffmpeg — Ubuntu packages: ffmpeg libavcodec-extra-53 (there is a Python wrapper called PyFFmpeg, but shell calls suffice)

PyYAML 3.10

On a Ubuntu system, run sudo apt-get install python-yaml to install.

On a system where you’re not root, download PyYAML-3.10.tar.gz from the PyYAML, unpack it and install with:

python setup.py --without-libyaml install --user

Game designer

A web application where you can design your own games.

Use wami-recorder to record audio

References

Dymond, S., & Whelan, R. (2010). Derived relational responding: a comparison of match-to-sample and the relational completion procedure. (available from Simon Dymonds homepage, http://psy.swan.ac.uk/staff/dymond/ )

Appendix A: Internationalization

In many ways, the games are localized in themselves. A game is usually in one specific language. There are messages from the game engines that need to be translated.

What’s a good way to handle internationalization in JavaScript?

jquery-i18n-properties is a "lightweight jQuery plugin for providing internationalization to javascript from ‘.properties’ files, just like in Java Resource Bundles".
Stack Overflow question with some good thoughts.
A blog entry describing a simple method. I’ll go for something like this. Just need a good sprintf!
- JavaScript sprintf().