User:Mattwj2002/1911 Encyclopedia scripts

These are some scripts for working on the 1911 Encyclopedia project.
Here is my script for doing OCR. This OCR script is version 1.

#!/bin/bash
# Convert the DjVu volume to one multi-page TIFF, then split it into one TIFF per page.
ddjvu -format=tiff eb1911-vol01-a-androphagi.djvu eb1911-vol01-a-androphagi.tif
tiffsplit eb1911-vol01-a-androphagi.tif eb1911
rm eb1911-vol01-a-androphagi.tif
sleep 2
# OCR each page in turn; tesseract writes page1.txt, page2.txt, ...
let i=1
ls -1 *.tif | while read line; do
    echo $i
    tesseract "$line" page$i -l eng
    let i++
    sleep 1
done
sleep 2
rm *.tif
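The script above names its output page1.txt, page2.txt and so on, so a plain file listing puts page10.txt before page2.txt. A minimal sketch of zero-padded names that sort in page order; `page_name` is a hypothetical helper and not part of the original script, and the default width of 4 digits is an assumption.

```shell
#!/bin/bash
# Hypothetical helper (not in the original script): build a zero-padded
# output base name for tesseract so the text files sort in page order.
# $1 = page number, $2 = pad width (assumed default: 4 digits)
page_name() {
    printf 'page%0*d' "${2:-4}" "$1"
}
```

Inside the loop this could be used as `tesseract "$line" "$(page_name "$i")" -l eng`, which would write page0001.txt for the first page.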

Here is my script for doing OCR. This OCR script is version 2.


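#!/bin/bash
# Convert the DjVu volume to one multi-page TIFF, then split it into one TIFF per page.
ddjvu -format=tiff volume1.djvu eb1911-vol01-a-androphagi.tif
tiffsplit eb1911-vol01-a-androphagi.tif eb1911
rm eb1911-vol01-a-androphagi.tif
sleep 2

let i=1
ls -1 *.tif | while read line; do
    echo $i
    if [ $i -le 32 ]; then
        # The first 32 pages are OCRed as a whole.
        tesseract "$line" page$i -l eng
    else
        # Later pages are cut into left and right halves (tmp00.tif, tmp01.tif),
        # each half is OCRed separately, and the text is joined left half first.
        convert "$line" -crop 50%x100% +repage tmp%02d.tif
        tesseract tmp00.tif tmp00 -l eng
        tesseract tmp01.tif tmp01 -l eng
        cat tmp00.txt tmp01.txt > page$i.txt
    fi
    # Keep the page image under its page number.
    mv "$line" $i.tif
    let i++
done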
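The number 32 in the version 2 script is the last page that is OCRed whole; every later page is split into halves. A small sketch that turns that threshold into a parameter, so another volume can use a different cut-off. `page_mode` is a hypothetical name, and the default of 32 is simply copied from the script.

```shell
#!/bin/bash
# Hypothetical helper (not in the original script): decide how a page is OCRed.
# $1 = page number, $2 = last page to OCR whole (default 32, as in the script)
page_mode() {
    if [ "$1" -le "${2:-32}" ]; then
        echo whole
    else
        echo split
    fi
}
```

In the loop, `if [ "$(page_mode "$i" 32)" = whole ]; then ...` would take the place of the hard-coded comparison.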

This is my script for taking TIFF files and converting them to a single DjVu file.

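#!/bin/bash
# Encode each TIFF page as a single-page DjVu file: 1.djvu, 2.djvu, ...
let i=1
ls -1 *.TIFF | while read line; do
    cjb2 "$line" $i.djvu
    let i++
done

# Create the volume from the first page.
djvm -c volume1.djvu 1.djvu

# Insert the remaining pages (2 to 1029) one at a time.
for ((i=2; i<=1029; i+=1)); do
    echo $i
    djvm -i volume1.djvu $i.djvu
done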
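The page count 1029 is written directly into the loop above. A sketch of counting the page files instead, assuming they are numbered 1.djvu, 2.djvu and so on without gaps, as the first loop produces them. `count_pages` is a hypothetical helper, not part of the original script.

```shell
#!/bin/bash
# Hypothetical helper (not in the original script): count consecutively
# numbered page files 1.djvu, 2.djvu, ... in directory $1 (default: the
# current directory), stopping at the first missing number.
count_pages() {
    local dir=${1:-.} n=0
    while [ -f "$dir/$((n + 1)).djvu" ]; do
        n=$((n + 1))
    done
    echo "$n"
}
```

The insert loop could then start with `last=$(count_pages)` and run `for ((i=2; i<=last; i+=1))`.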

Here is my script for cropping images.

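#!/bin/bash
# Trim the top 1% of each page, keeping the bottom 99% (-gravity South).
# The +0+0 offset asks for a single crop region; without an offset
# ImageMagick cuts the image into tiles instead.
ls -1 *.TIF | while read line; do
    convert "$line" +compress -gravity South -crop 100%x99%+0+0 "$line.TIFF"
done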

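Here is my script for converting PNG files to a single PDF file.

#!/bin/bash
# Convert each PNG to a single-page PDF: 1.pdf, 2.pdf, ...
let i=1
ls -1 *.png | while read line; do
    convert +compress "$line" $i.pdf
    echo $i
    let i++
done

# Start the output with page 1, then append the remaining pages in order.
# The loop walks the numbered page files only, so outputfile.pdf is not counted.
mv 1.pdf outputfile.pdf
let i=2
while [ -f $i.pdf ]; do
    pdfjoin outputfile.pdf $i.pdf --outfile outputfile.pdf
    echo $i
    let i++
done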
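The page order of the finished PDF depends on the order in which the PNG files are listed. If the PNG files are named by number (1.png, 2.png, ... 10.png), which is an assumption and not stated above, a plain listing puts 10.png before 2.png. A minimal sketch of listing them in numeric order; `list_numeric` is a hypothetical helper, and it assumes file names without newlines.

```shell
#!/bin/bash
# Hypothetical helper (not in the original script): print the file names in
# directory $1 (default: current directory) that have extension $2 (default:
# png), sorted by their leading number.
list_numeric() {
    local dir=${1:-.} ext=${2:-png} f
    for f in "$dir"/*."$ext"; do
        [ -e "$f" ] || continue   # nothing matched: skip the literal pattern
        basename "$f"
    done | sort -n
}
```

The first loop could then read from `list_numeric . png` instead of `ls -1 *.png`.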