
How to use the open source XOWA interface for Wikipedia

Marco Fioretti explains how to use a program that creates offline Wikipedia resources for users who don't have reliable Internet access. It could be a valuable tool for teachers in some circumstances.

Every second, lots of people worldwide misuse Wikipedia by citing it as their only source, even though Wikipedia itself says that this is wrong. In spite of this huge issue, Wikipedia remains a great, immensely beneficial resource, even for people without Internet access: there are several ways to make and use offline copies of Wikipedia. This is really important both for developing nations (see the Afripedia project for an example of what I mean) and for the many "first world" students still without broadband.

This week I'll present a way to use Wikipedia offline that is simple enough for personal use, or to be implemented by most teachers. They may use it to distribute reading material, or to teach their students how to create and share it.

Enter XOWA

XOWA is a Free Software, multiplatform, offline Wikipedia browser. In order to use it, you must download the ZIP archive for your operating system and architecture, and unpack it somewhere. Just remember that, on this point, the documentation of XOWA 0.2.3.0 for 64-bit Linux, which is the version I tested, is wrong. The README file explicitly says to "unzip to '/xowa/'" and run the software from there. While this is possible, installing any software in "/" goes against the Linux Filesystem Hierarchy Standard and the other software management practices of (almost) every GNU/Linux distribution. Besides, that advice implies that users without root access couldn't install XOWA.

The good news is that you not only should ignore such bad advice, but can do so without any problems. The screenshots here come from a copy of XOWA installed in $HOME/xowa, which works just fine. You could also install XOWA on a USB stick. Uninstalling XOWA is as simple as erasing the whole folder in which you unpacked it.
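A per-user installation along those lines could be sketched as below. This is only an illustration: the function names install_xowa and uninstall_xowa are made up for this example, the archive name is whatever you downloaded (none is assumed here), and the only default taken from the article is the $HOME/xowa folder.

```shell
# Unpack a XOWA archive into a per-user folder (default: $HOME/xowa).
# $1 = path to the ZIP archive you downloaded
# $2 = optional target folder
# Both function names are hypothetical helpers, not part of XOWA.
install_xowa() {
    zip_file=$1
    target=${2:-"$HOME/xowa"}
    # Refuse to continue if the archive is not there.
    [ -f "$zip_file" ] || { echo "no such archive: $zip_file" >&2; return 1; }
    mkdir -p "$target" && unzip -q "$zip_file" -d "$target"
}

# Uninstalling is just deleting the folder XOWA was unpacked into.
uninstall_xowa() {
    target=${1:-"$HOME/xowa"}
    rm -rf -- "$target"
}
```

Since neither function touches anything outside the chosen folder, no root access is needed, and the same calls work with a folder on a USB stick as second argument.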

To start XOWA you must run this command:

java -Xmx256m -jar XOWA_DIR/XOWA_JAR

where XOWA_DIR is the complete path to the folder where you unpacked the archive, and XOWA_JAR is the Java jar containing the code: xowa_linux_64.jar on 64-bit systems, xowa_linux.jar on the others.
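If you do not want to type that every time, a small wrapper can pick the jar for you. The sketch below is an assumption-laden convenience, not something shipped with XOWA: the function names are invented, XOWA_DIR defaults to $HOME/xowa, and "64-bit system" is taken to mean what uname -m reports as x86_64 or amd64.

```shell
# Map a machine type (as printed by `uname -m`) to the jar names given above.
# Assumption: only x86_64/amd64 count as "64-bit" for this purpose.
xowa_jar_for() {
    case "$1" in
        x86_64|amd64) echo "xowa_linux_64.jar" ;;
        *)            echo "xowa_linux.jar" ;;
    esac
}

# Start XOWA from XOWA_DIR (default: $HOME/xowa).
# Any extra arguments are handed to XOWA unchanged.
start_xowa() {
    xowa_dir=${XOWA_DIR:-"$HOME/xowa"}
    java -Xmx256m -jar "$xowa_dir/$(xowa_jar_for "$(uname -m)")" "$@"
}
```

After defining these functions in your shell startup file, typing start_xowa is equivalent to the full command above.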

Right after installation, you must "fill" XOWA, so to speak, with a whole copy of the Wiki you want to access with it. Only after that operation will you be able to use that Wiki offline. In practice, you must simply follow the clear instructions in the XOWA initial window (Figure A).


As written, those instructions download the Simple English Wikipedia, a separate edition written in simpler English and much smaller than the main English one. Installing that Wiki is the simplest and fastest way to test XOWA. If you like it, the procedure to install Wikipedia in other languages is practically the same; it just takes much longer because you must download much more data.

In fact, if you want, you can access several Wikipedias with the same installation of XOWA, as well as Wiktionary, Wikisource and Wikiquote! XOWA will also interlink these Wikis: you will be able to look up a word in your offline Wiktionary when you select it in your offline Wikipedia. If you plan to use non-English wikis, though, click on "Set up non-English languages" in the initial page and follow the instructions.

All the Wikis you download end up in the "wiki" subfolder of your XOWA installation. Remember to exclude it from your backup procedures, or they'll have to deal with whole Wikipedias every time!
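How to exclude that folder depends on your backup tool. As one possible illustration, assuming you archive the XOWA folder with tar (the function name below is invented for this example), the --exclude option keeps the downloaded Wikis out of the archive:

```shell
# Archive a XOWA folder but leave out its "wiki" subfolder.
# $1 = XOWA folder (for example $HOME/xowa)
# $2 = archive file to create
# Hypothetical helper; adapt the idea to whatever backup tool you use.
backup_xowa_without_wikis() {
    xowa_dir=$1
    archive=$2
    parent=$(dirname "$xowa_dir")
    name=$(basename "$xowa_dir")
    # --exclude comes before the folder name, so it applies to it;
    # -C makes the paths in the archive start at the XOWA folder name.
    tar --exclude="$name/wiki" -czf "$archive" -C "$parent" "$name"
}
```

For instance, backup_xowa_without_wikis "$HOME/xowa" "$HOME/xowa-backup.tar.gz" would save the program and its settings while skipping the bulky Wiki data, which can always be downloaded again.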

The XOWA interface and its gotchas

XOWA renders Wikipedia pages so well (Figure B) that I forgot it's not my browser. It took me a few seconds to understand why middle-clicks wouldn't open new tabs anymore!

Figure B

The menus that only make sense in the online Wiki are missing, but almost everything needed for offline usage is there. The main exception, at least in version 0.2.3.0, is printing, although a somewhat dirty workaround exists (see below). Also, I found no way to zoom the page text. Apart from those minor complaints, XOWA works well, but... some parts of the interface are invisible. This may be a general Java issue with my current version of Linux (Fedora 17 x86_64), but still, it can be pretty confusing until you figure out what is happening.

Inside XOWA you can search both for text in the current page (Ctrl+F) and for pages related to some keyword (Ctrl+Alt+S). However, the corresponding text input boxes are invisible. Besides being small and at the extreme corners of the window, they have no border, background, title or anything else to prove to users they actually exist and are ready for input (Figure C).

Once you accept that, search works just fine (Figure D).

Even the navigation "buttons," that is, the clickable "<" and ">" characters in the upper left corner, are really hard to notice. The XOWA Help Page, however, lists all the shortcuts and tricks you need to do almost everything with your keyboard.

The last hard-to-find part of XOWA can be... the actual Wikis! When the download is complete and you restart XOWA, you get (apparently) the same Welcome page as before. Don't worry, though. The reason is simply that XOWA puts the links to the local Wikis at the very bottom of that page, which is not a big deal. As you can see in Figure E, you can edit that or any other page inside XOWA just as you would on the live Wikipedia!

Figure E

Alternatively, you may tell XOWA to directly open some Wiki with the --url option:

java -Xmx256m -jar XOWA_DIR/XOWA_JAR --url 'simple.wikipedia.org/Main Page'

Of course, the command line!

I'll close this short introduction to XOWA with its geekiest feature. You can run it at the prompt to get copies of each page in plain HTML (also usable for printing!) or Wiki format:

java -Xmx256m -jar ~/xowa/xowa_linux_64.jar --app_mode cmd --show_license n --show_args n --cmd_text "app.shell.fetch_page('simple.wikipedia.org/wiki/Linux' 'html');" > linux.html

The command above tells XOWA to start in command line mode, instead of using the default GUI. The last three options disable printing of the license and of the command line arguments, and send to STDOUT, in HTML format, the content of the given page. Using "wiki" instead of "html" would return the actual Wiki source of that page.
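The same command can be wrapped in a loop to save several pages in one go. What follows is a rough sketch, not an official XOWA script: the function names and the XOWA_JAR variable are invented here, the jar path defaults to the one used in the command above, and page titles are assumed to be simple ones (no quotes or slashes) so that they can double as file names.

```shell
# Build the --cmd_text argument in the same form as the command above.
# $1 = wiki domain, $2 = page title, $3 = output format ("html" or "wiki")
xowa_fetch_cmd() {
    printf "app.shell.fetch_page('%s/wiki/%s' '%s');" "$1" "$2" "$3"
}

# Export several pages of one wiki, one output file per page.
# $1 = wiki domain, $2 = format, remaining arguments = page titles
# Assumption: XOWA_JAR, if set, points at the jar; otherwise the path
# from the article's example is used.
export_pages() {
    domain=$1
    format=$2
    shift 2
    jar=${XOWA_JAR:-"$HOME/xowa/xowa_linux_64.jar"}
    for page in "$@"; do
        java -Xmx256m -jar "$jar" --app_mode cmd --show_license n --show_args n \
            --cmd_text "$(xowa_fetch_cmd "$domain" "$page" "$format")" > "$page.$format"
    done
}
```

With these definitions, export_pages simple.wikipedia.org html Linux Earth would write Linux.html and Earth.html in the current folder, provided both pages exist in your offline copy.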

About

Marco Fioretti is a freelance writer and teacher whose work focuses on the impact of open digital technologies on education, ethics, civil rights, and environmental issues.

2 comments
gnosygnu

I'm the developer for XOWA. I'm hoping it's not bad form for me to comment, but I read your article, and really do appreciate the review. I also wanted to provide some additional info, in case you're curious.

[quote]The README file explicitly says to “unzip to ‘/xowa/’”... (which is) bad advice[/quote]
Sorry, this was definitely a mistake on my part, and I'll correct it for the next release. I was aiming for simplicity, but I shouldn't have sacrificed correctness.

[quote]I found no way to zoom the page text.[/quote]
This may be clumsy, but you can edit the xowa.gfs file to increase the size of the text. Search for the following text "/*body, td {font-size: 16pt;}*/" and remove the "/*" and the "*/". This will change the font size to 16. I'll make this an option within the app for the next release.

[quote]This may be a general Java issue with my current version of Linux (Fedora 17 x86_64)[/quote]
No, sadly, your version works fine. XOWA uses SWT, and I found its basic widgets difficult to customize (its web-browser is fantastic though). I tried to "minimize" them as much as possible, since 90% of the time the user would be looking at the wiki page, and I didn't want to distract them with stock app widgets. However, as you point out, making them near-invisible is quite awkward. I'll add some basic border/tooltips for either the next release or the one afterwards. I'll also put in real back / forward buttons.

[quote]the command line!... also usable for printing[/quote]
Wow, I'm impressed that you're using the command-line option for it. Regarding printing, I'll put in a link that will pop up the Print box (similar to doing File -> Print).

Again, thanks for the article and the feedback. I will definitely incorporate your points into upcoming releases. If you have any others, let me know!

mfioretti

Looking forward to readers' comments.