More import and export formats

Registered by Marten de Vries

It would be nice if OpenTeacher supported more formats, e.g.:

- Overhoor (.ohw, .oh, .oh4)?
- Anki?
- Pauker (.xml.gz)?
- Audivididici (.avd)
- Overhoringsprogramma Talen (.ovr)?
- Vocabularium (.voc)?
- backpack?
- Ludem?
- ProVoc (.pvoc)?
- ?
- Open Exam Format (

- Mnemosyne (open source, Python)?
- CueCard (open source)?
- SuperMemo (closed source)?
- DingsBums?! (open source, discontinued in favor of Anki)

- .kvtml:
 - KVocTrain
 - KVTML 2.0
  - KWordQuiz
  - Parley
- pdf -> done
- excel/odf? -> not necessary with csv
- csv? -> done
- ABBYY Lingvo Tutor (import is already available) -> no need for export?

Blueprint information

OpenTeacher Maintainers
Series goal:
Accepted for 3.x
Milestone target:
3.1
Started by
Marten de Vries
Completed by
Marten de Vries


The most interesting formats are supported now, I guess.


- zip file
 - info.txt
  - text based
  - line 1: word count
  - repeating lines:
   1. question (rtf encoded)
   2. answer (rtf encoded)
   3. empty line
   4. audio file
   5. unknown (separator?) 'LL'
 - other files (resources)
  - wav
  - other?
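
The zip layout above could be read roughly as follows. This is a minimal sketch, not a tested importer: the function name `read_zip_list` is made up for illustration, the text encoding of info.txt is an assumption (UTF-8), and the RTF in the question and answer fields is returned as-is rather than converted.

```python
import io
import zipfile


def read_zip_list(path_or_file, encoding="utf-8"):
    """Return the items described by info.txt as a list of dicts.

    Assumes five lines per item after the word count: question, answer,
    empty line, audio file name, and the unexplained 'LL' line.
    """
    with zipfile.ZipFile(path_or_file) as archive:
        text = archive.read("info.txt").decode(encoding)

    lines = text.splitlines()
    count = int(lines[0].strip())  # line 1: word count

    items = []
    for index in range(count):
        start = 1 + index * 5
        record = lines[start:start + 5]
        if len(record) < 4:
            break  # file is shorter than the word count promised
        items.append({
            "question_rtf": record[0],
            "answer_rtf": record[1],
            "audio": record[3],  # name of a resource file in the archive
        })
    return items


# Try it on a tiny in-memory archive with one made-up item.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr(
        "info.txt", "1\n{\\rtf1 cat}\n{\\rtf1 kat}\n\ncat.wav\nLL\n"
    )
items = read_zip_list(buffer)
```

The 'LL' line is skipped here because its meaning is unknown; if it turns out to carry data, the record dict would need another field.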

RTF -> HTML: , otherwise heavier tools are needed. (Maybe not worth the effort.)

== Ludem ==
Just a question = answer\n file. The project seems inactive; let's mail ~dsprenkels to ask whether it is still alive. If not, only import support is needed.

Open-source program

See for a lot of test files, XML based.

Open-source programs

See for test files.

Both the extensions .pau.gz and .xml.gz seem to be used; the first one is newer. Both contain XML, but the first seems to use a superset of the format used by the second. One parser should be able to handle both, though.
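
Since both extensions are gzip-compressed XML, the first step of an importer could look like the sketch below. Only the decompress-and-parse step is shown. The element names in the sample are placeholders, not the real schema, and `load_gzipped_xml` is a hypothetical helper name.

```python
import gzip
import io
import xml.etree.ElementTree as ET


def load_gzipped_xml(path_or_file):
    """Decompress a .pau.gz/.xml.gz file and return the XML root element."""
    with gzip.open(path_or_file, "rb") as stream:
        return ET.parse(stream).getroot()


# Placeholder document: <root> and <item> are NOT the real element names.
sample = gzip.compress(b"<root><item/><item/></root>")
root = load_gzipped_xml(io.BytesIO(sample))
```

If the newer format really is a superset, the code that walks the tree can simply ignore elements it does not know.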

Open-source program.

The program itself seems dead, but there are some programs that export to it. It uses a simple format: question\tanswer, without spaces. Maybe import only is a good idea.

.oh4, .ohw and .oh are all used extensions.

.oh4 and .ohw are both text based formats (question = answer\n), but the .oh4 file includes a [FONT:Arial,9] = [FONT:Arial,9] header, which can be stripped.

.oh4 seems UTF-8 encoded (not sure), .ohw uses a DOS code page. (I could not immediately find out which one.)
According to the site, .oh is also an extension; it looks like it's equivalent to .oh4.
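
A rough import sketch for these files, under loud assumptions: the DOS code page is unknown, so cp850 is only a guess used as a fallback after UTF-8, and the header is matched with a pattern derived from the single example above. `parse_overhoor` is a hypothetical name.

```python
import re

# Matches a header line such as "[FONT:Arial,9] = [FONT:Arial,9]".
FONT_HEADER = re.compile(r"^\[FONT:[^\]]*\]\s*=\s*\[FONT:[^\]]*\]$")


def parse_overhoor(data, encodings=("utf-8-sig", "cp850")):
    """Turn the raw bytes of a .oh/.oh4/.ohw file into (question, answer) pairs.

    The encodings are tried in order; cp850 is an ASSUMED stand-in for the
    unidentified DOS code page.
    """
    for encoding in encodings:
        try:
            text = data.decode(encoding)
            break
        except UnicodeDecodeError:
            continue
    else:
        raise ValueError("could not decode file with any known encoding")

    pairs = []
    for line in text.splitlines():
        line = line.strip()
        if not line or FONT_HEADER.match(line) or "=" not in line:
            continue  # skip blanks, the font header and malformed lines
        question, answer = line.split("=", 1)
        pairs.append((question.strip(), answer.strip()))
    return pairs


pairs = parse_overhoor(b"[FONT:Arial,9] = [FONT:Arial,9]\r\nhuis = house\r\n")
```

Splitting on the first "=" means an answer may itself contain "=", but a question may not; whether the real program allows that is not known.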

Closed source program, not updated (website mentions Windows 7, however) but still available.

==Overhoringsprogramma Talen==

Not updated since 2005, but some word lists are available online and WRTS can export to it. Only importing should be enough.


Closed source program


Only import?

Some lists available. Simple file format with this structure:
1: the text 'Vocabularium 2.0'
2: questionLanguage %space% answerLanguage
3: empty line
4: !space List description (name + version)
5: repeating: question\tanswer\n
6: 2 newlines (needed? Maybe.)
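
The structure above could be parsed along these lines. This is a sketch based only on the six points listed. It assumes the language names contain no spaces, and that line 4 starts with a literal "!" followed by a space, which is an interpretation of the note above. `parse_vocabularium` is a made-up name.

```python
def parse_vocabularium(text):
    """Parse the text of a .voc file into a dict with languages and pairs."""
    lines = text.splitlines()
    if len(lines) < 4 or lines[0].strip() != "Vocabularium 2.0":
        raise ValueError("not a Vocabularium 2.0 file")

    # Line 2: question language, a space, answer language.
    languages = lines[1].split(" ", 1)
    question_language = languages[0]
    answer_language = languages[1] if len(languages) > 1 else ""

    # Line 4: description, with the ASSUMED leading "! " marker removed.
    description = lines[3].lstrip("! ").strip()

    # Remaining lines: question<TAB>answer; trailing blank lines are ignored.
    pairs = []
    for line in lines[4:]:
        if "\t" not in line:
            continue
        question, answer = line.split("\t", 1)
        pairs.append((question, answer))

    return {
        "question_language": question_language,
        "answer_language": answer_language,
        "description": description,
        "pairs": pairs,
    }


sample = "Vocabularium 2.0\nNederlands Engels\n\n! Dieren 1.0\nhond\tdog\nkat\tcat\n\n\n"
word_list = parse_vocabularium(sample)
```

Because blank lines are skipped, the parser does not care whether the two trailing newlines are present.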

Closed source? program

By default, you download online lists. Let's support this API! (Start Wireshark/download source :D.)

It uses a sqlite database (with quite a lot of columns) to save to the hard disk. It can also open lists from some other programs, maybe ideas for us :P?
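
Before writing an importer for the sqlite database, it may help to dump its tables and columns. The sketch below does only that, using Python's standard sqlite3 module. The table in the demo is invented and says nothing about the real schema.

```python
import sqlite3


def describe_database(connection):
    """Return {table name: [column names]} for an open sqlite connection."""
    tables = [
        row[0]
        for row in connection.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    schema = {}
    for table in tables:
        # PRAGMA table_info yields one row per column; index 1 is the name.
        columns = connection.execute(f'PRAGMA table_info("{table}")').fetchall()
        schema[table] = [column[1] for column in columns]
    return schema


# Demo with an invented table; a real file would be opened with
# sqlite3.connect("path/to/file").
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE cards (id INTEGER, question TEXT, answer TEXT)")
schema = describe_database(connection)
connection.close()
```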

Open-source program

==ProVoc (.pvoc)==
Project is dead, some files available however. Import support is enough: for some files.

File format seems binary, maybe the source can provide some clues...

Open-source program
Closed source web-based service, but active. No idea if they have a file format, but an API seems more appropriate. Let's make a test account and mail the owner if we still want to support it.

