Distributed Proof-reading at Project Madurai
"Distributed Proof-reading" is a web-based method (adopted from Project Gutenberg)
for preparing etexts of Tamil literary works for Project Madurai. By breaking the
work into individual pages, several volunteers (based in different parts of the world) can
work on the same book at the same time. This significantly speeds up the
keying-in and proof-reading parts of the etext creation process.
Scanned image files of individual pages of the printed version of a Tamil work are
stored on the web-server. Promads (PM volunteers) simply open one of these image files in a
split screen window, where the equivalent Tamil text can be keyed in directly.
The frames that display the image and the Tamil text can be arranged horizontally or vertically.
For Project Madurai, the equivalent etext is keyed in Tamil script using Unicode encoding.
Several freeware text editors (e.g. Murasu Anjal, ekalappai) are available for
Windows, Macintosh and Unix platforms that allow the Tamil text to be keyed in directly
in the browser display window.
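As a rough illustration of what "keyed in Tamil script as per Unicode encoding" means in practice, the sketch below checks whether the letters in a piece of text come from the Unicode Tamil block (U+0B80 to U+0BFF). The function names are hypothetical and are not part of the DPPM site; the check is only a heuristic for spotting text typed in a non-Unicode Tamil font encoding, which usually shows up as Latin letters.

```python
# Unicode Tamil block: U+0B80 .. U+0BFF (range end is exclusive).
TAMIL_BLOCK = range(0x0B80, 0x0C00)


def non_tamil_letters(text: str) -> list[str]:
    """Return the alphabetic characters in `text` that lie outside the Tamil block.

    Digits, punctuation, whitespace and combining marks are ignored.
    """
    return [ch for ch in text if ch.isalpha() and ord(ch) not in TAMIL_BLOCK]


def looks_like_unicode_tamil(text: str) -> bool:
    """True if the text has at least one letter and every letter is in the Tamil block."""
    has_letters = any(ch.isalpha() for ch in text)
    return has_letters and not non_tamil_letters(text)


if __name__ == "__main__":
    sample = "\u0ba4\u0bae\u0bbf\u0bb4\u0bcd"  # the word "Tamil" written in Tamil script
    print(looks_like_unicode_tamil(sample))
    print(non_tamil_letters(sample + " abc"))
```

A page whose keyed-in text fails this check is not necessarily wrong (a work may quote Latin-script words), so a result like this would be a prompt for a proofer to look again, not an automatic rejection.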
Click here to see
a screen shot of the split screen display of scanned image and equivalent Text
When a proofer elects to proofread a page for a particular work, the text and image
file are displayed in the same way. This allows the text file to be easily reviewed
and compared to the image file, thus assisting the proofreading of the text file. The edited
text file is then submitted back to the site via the same webpage that it was edited on.
Once all pages for a particular book have been processed, the Project Manager joins
the pieces, formats them properly into a PM E-Text and submits it to the PM archive.
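The page-by-page flow described above (key in, proofread, then join) can be pictured as a small state model. This is a minimal sketch under stated assumptions, not the DPPM server's actual implementation: the `Page` class, the `Status` values and the function names are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    SCANNED = "scanned"   # image uploaded, no text yet
    KEYED = "keyed"       # text typed in by a volunteer
    PROOFED = "proofed"   # text proof-read at least once


@dataclass
class Page:
    number: int
    image_file: str
    text: str = ""
    status: Status = Status.SCANNED


def key_in(page: Page, text: str) -> None:
    """Record the text typed in for a freshly scanned page."""
    if page.status is not Status.SCANNED:
        raise ValueError(f"page {page.number} has already been keyed in")
    page.text = text
    page.status = Status.KEYED


def proofread(page: Page, corrected_text: str) -> None:
    """Store the corrected text for a page that has been keyed in."""
    if page.status is Status.SCANNED:
        raise ValueError(f"page {page.number} has not been keyed in yet")
    page.text = corrected_text
    page.status = Status.PROOFED


def assemble(pages: list[Page]) -> str:
    """Join all pages in page order; refuse if any page is not yet proofed."""
    pending = sorted(p.number for p in pages if p.status is not Status.PROOFED)
    if pending:
        raise ValueError(f"pages still to be proofed: {pending}")
    return "\n".join(p.text for p in sorted(pages, key=lambda p: p.number))
```

Because each page carries its own status, different volunteers can work on different pages of the same book independently, and the joining step only has to wait for the last page to be proofed.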
Advantages of DPPM are: i) image files are available online 24 hours a day
and accessible to anyone anywhere (no need to have a printed copy
of the book yourself to participate in etext preparation); ii) Promads
deal with only one page at a time (hence time commitments are smaller);
iii) several Promads can work at the same time on the same work, sharing
the load. At Project Gutenberg, Distributed Proofreaders posted their
5,000th book, out of a total of 14,000 in the collection; about 300 new books
were being finished each month.
For Project Madurai, the Distributed Proof-reading web-server can be accessed using
the following URL:
- DP-PM server at Virginia Tech, USA
What is the entire process for creating an etext via Distributed Proof-reading?
Here is a list of the steps involved in the preparation of an etext by yourself independent of DP-PM implementation.
- Locate a target Tamil literary work for etext preparation. Target works must either be
in the public domain (free of all copyrights) or be works whose author (or legal heirs) is willing
to give permission to Project Madurai to reproduce the work electronically and to distribute
it free on the Net. For the types of works that have been covered earlier, please consult
the webpage
Listing of Project Madurai Etexts Released.
Prior to starting etext preparation, send details to the Project Coordinators
(email to pmadurai AT gmail.com) and get their approval. This is mainly to ensure that we do respect
copyright restrictions if any and that other PM volunteers are not concurrently involved
in the preparation of the same etext.
- You need to have access to an authentic version (printed copy) of the literary
work for which you want to prepare an etext. Here again, please consult the Project Leader
by providing him with the details of the source work you are planning to use.
For Distributed Proof-reading, scanned image files of each page of the printed
version need to be produced. Scan resolution needs to be at least 150-200 dpi,
preferably 300 dpi. If you have access to image cleaning software (such as Adobe
Photoshop), try to use it to increase the contrast.
- If you can scan pages of Tamil literary works from print editions for use in DPPM, please contact the Project Coordinators by email so that we can provide guidelines for uploading the image files by ftp to the DPPM web-server.
- Promads (PM volunteers) visit the DPPM web-site and after registration/log-in,
view image files of individual pages and key in the corresponding Tamil Text in Unicode
format.
- For pages that have already been keyed in, Promads call up the same
pages at the DPPM web-server and proof-read/correct the equivalent Tamil text.
- When the image files of all pages of a single Tamil work have
been keyed in and proof-read at least once, the Project Manager assembles all the
Tamil text and compiles a single etext file. He then forwards it to the
Project Leader for preparation of the HTML and PDF versions and release of
the work to the general public.
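For the scanning step in the list above, the resolution of an image can be estimated from its pixel width and the physical width of the printed page, since dpi is simply pixels divided by inches. The sketch below is only an illustration: the function names are hypothetical, and treating 150 dpi as the hard minimum is an assumption based on the lower end of the "150-200 dpi" range mentioned in the list.

```python
MIN_DPI = 150        # assumed minimum: lower end of the "150-200 dpi" guideline
PREFERRED_DPI = 300  # preferred resolution named in the guideline


def scan_dpi(pixel_width: int, page_width_inches: float) -> float:
    """Estimate scan resolution as pixels per inch across the page width."""
    if page_width_inches <= 0:
        raise ValueError("page width must be positive")
    return pixel_width / page_width_inches


def rate_scan(dpi: float) -> str:
    """Classify a resolution against the guideline thresholds."""
    if dpi < MIN_DPI:
        return "too low"
    if dpi < PREFERRED_DPI:
        return "acceptable"
    return "preferred"


if __name__ == "__main__":
    # A page 5.5 inches wide scanned to an image 1650 pixels wide.
    dpi = scan_dpi(1650, 5.5)
    print(dpi, rate_scan(dpi))
```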
This file was last updated on 20 September 2010.
Email to reach Project Coordinators: pmadurai AT gmail.com