wiki:PPTConversion
Last modified on 01/28/09 16:44:31

This page will contain instructions for setting up OpenOffice.org so that it can convert PowerPoint presentations into images that LeMill can use. While the feature is under development, the page collects information about all relevant methods and technical details.

Related methods in PresentationMaterial.py:

  • createPieces -- Currently receives image files from the upload form and converts them to pieces. This should eventually be replaced.
  • createExternalImagesFromForm -- The replacement for createPieces. Takes files from the upload form and stores them as external image files.
  • convertPiecesToExternalImages -- A converter method that checks whether the pieces in this presentation are used only by this presentation. If so, the pieces are stored as external images and deleted, and the presentation is changed to refer to those external images.
  • createPieceFromSlide -- The reverse conversion, from external image to piece. There are no plans to expose this in the UI, so it can be ignored.
  • createExternalImages -- The image conversion and save function used by createExternalImagesFromForm and convertPiecesToExternalImages. As the code shows, it has problems. It takes the file objects that forms provide, tries to produce several PNGs of different sizes from each, and saves them to external file storage. Unfortunately, this raises a PIL error at various points for about 30% of images. When conversion fails, the image is usually stored as a piece instead. I'm not comfortable working with I/O streams, so the cause may be something very simple there. After all, the conversions required for saving images as pieces cause far fewer problems.

Conversion process

The basic idea is that one virtual server runs a headless (non-GUI) instance of OpenOffice.org. The conversion goes roughly like this:

  1. Assume a common shared folder (SHARE) is available to all Zope instances.
  2. The LeMill product stores the PPT (or Impress) file as SHARE/queue-in/${UID} and marks the resource as "pending conversion", which is shown to the user.
  3. A script (pasted below) goes through queue-in and processes the files one by one: it converts each file to PDF using python-uno (attached to this message), then uses pdftoppm and ppmtojpeg to create individual JPEG files of the slides, placing them in SHARE/queue-out/${UID}/p-N.jpg, where N is the slide number as 6 digits.
  4. After conversion, the script calls a method in Zope, which sets the state to "ready" and sends an email to the author.
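The queue layout described in the steps above can be sketched in Python. This is only an illustration of the directory conventions, not the project's converter: the function names pending_uids and slide_path are hypothetical, and the shared folder is passed in as a plain path.

```python
import os


def pending_uids(share):
    """Return the UIDs of files waiting in SHARE/queue-in, sorted by name."""
    queue_in = os.path.join(share, "queue-in")
    return sorted(
        name for name in os.listdir(queue_in)
        if os.path.isfile(os.path.join(queue_in, name))
    )


def slide_path(share, uid, number):
    """Return SHARE/queue-out/UID/p-N.jpg with N zero-padded to 6 digits."""
    return os.path.join(share, "queue-out", uid, "p-%06d.jpg" % number)
```

A converter loop could call pending_uids(share) to find work and slide_path(share, uid, n) to decide where slide n of a presentation belongs.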

Here's a sample script that assumes the existence of ooextract.py, which connects to openoffice.org via py-uno.

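#!/bin/bash
# Convert every file in queue-in/ to per-slide JPEGs under queue-out/<name>/.
for inp in queue-in/*
do
 # Skip anything that is not a regular file (this also covers an empty queue-in).
 [ -f "$inp" ] || continue
 inp=${inp#queue-in/}
 echo "Processing: $inp"
 # -b omits the file name from the output, so only the detected type is matched.
 if file -b "queue-in/$inp" | grep -q 'PDF'
 then
  echo "PDF file, no OO conversion needed"
  mv "queue-in/$inp" "queue-in/$inp.pdf"
 else
  echo "Non-PDF file, attempting OO conversion..."
  # ooextract.py is expected to write queue-in/$inp.pdf
  python ooextract.py --pdf "queue-in/$inp"
  rm "queue-in/$inp"
 fi
 # If no PDF was produced, move on to the next file.
 mv "queue-in/$inp.pdf" processing/ || continue
 echo "Converting PDF pages to images..."
 pdftoppm -r 75 "processing/$inp.pdf" processing/p
 mkdir -p "queue-out/$inp"
 echo "Image recoding..."
 for img in processing/*.ppm
 do
   f=${img#processing/}
   # Keep the page numbering that pdftoppm produced and only swap the extension.
   ppmtojpeg "$img" >"queue-out/$inp/${f%.ppm}.jpg"
 done
 echo "Cleanup..."
 rm processing/*
 echo "Done processing $inp"
done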
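
The script relies on the file utility to recognise uploads that are already PDF. If the same check were needed from Python, a minimal sketch could compare the leading bytes of the file. The helper name is_pdf is hypothetical, and the check is stricter than file, since it only accepts a header at the very start of the file.

```python
def is_pdf(path):
    """Return True if the file starts with the PDF magic bytes '%PDF-'."""
    with open(path, "rb") as handle:
        return handle.read(5) == b"%PDF-"
```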

Implemented solution

An OpenOffice.org 3 server is required. To start the server:

/opt/openoffice.org3/program/soffice -nofirststartwizard -invisible -headless "-accept=socket,host=0,port=2002;urp;"
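
Before running a conversion it can help to confirm that the server is actually accepting connections on the port given in the -accept option (2002 above). The following is a minimal sketch using only the standard library; is_listening is a hypothetical helper, and it only checks that a TCP connection can be opened, not that the UNO bridge behind it works.

```python
import socket


def is_listening(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return False
    conn.close()
    return True
```

For the command above, the check would be is_listening("localhost", 2002).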

Configuring LeMill

Two configuration directives must be added to the zope.conf file.

shared_presentations_path (SHARED_PRESENTATIONS in LeMill config.py) - this is where the converter script stores converted files.

external_resources_path (EXTERNAL_RESOURCES in LeMill config.py) - this is where LeMill looks for external files (it contains two subfolders: pieces and presentation).

When the converter script finishes, it notifies LeMill about the presentations that have been converted. LeMill then moves the files from the SHARED_PRESENTATIONS folder to the EXTERNAL_RESOURCES folder. It is important that Zope has read AND write permissions to both of these directories.
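
The permission requirement can be checked with a short Python sketch. The helper name missing_permissions is hypothetical, and the result reflects the user running the check, so it is only meaningful when run as the same user as Zope.

```python
import os


def missing_permissions(paths):
    """Return the paths that are not directories readable and writable by this process."""
    return [
        path for path in paths
        if not (os.path.isdir(path) and os.access(path, os.R_OK | os.W_OK))
    ]
```

Passing the two configured directories should return an empty list when the setup is correct.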

Create a user with the Manager role for the converter script.

Configuring converter-script

The converter script, lemill-convert.py, is located in the utilities directory. It is recommended to copy it elsewhere before use.

LeMill's SHARED_PRESENTATIONS directive and the script's SHARED directive must point to the same directory. Again, the script must have read AND write permissions to the SHARED directory.

Schedule the script to run with crontab or some other scheduling software.

Possible integration with Apache or other web server

If LeMill sits behind Apache or some other web server, the web server can serve presentation files directly without involving Zope. To do so, catch URLs that start with /ext/ in the web server configuration and serve those files from the EXTERNAL_RESOURCES directory.
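
In practice the web server does this mapping itself through its own configuration. The rule it has to implement can be illustrated with a small Python sketch; resolve_ext_url is a hypothetical name, and the example paths are assumptions. URLs outside /ext/, or that try to climb out of the directory, are rejected.

```python
import os
import posixpath


def resolve_ext_url(url_path, external_root):
    """Map a URL path under /ext/ to a file path under external_root.

    Returns None for URLs outside /ext/ or that try to escape the root.
    """
    prefix = "/ext/"
    if not url_path.startswith(prefix):
        return None
    # Collapse "." and ".." segments before deciding whether the path is safe.
    relative = posixpath.normpath(url_path[len(prefix):])
    if relative in (".", "..") or relative.startswith("../") or relative.startswith("/"):
        return None
    return os.path.join(external_root, *relative.split("/"))
```

For example, resolve_ext_url("/ext/presentation/abc/p-000001.jpg", root) points into the presentation subfolder of root, while a request such as /ext/../zope.conf yields None and would be left to Zope or refused.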