AI

LaTeX OCR Model

Reduced handwritten equation transcription time from 20 minutes to under 30 seconds with an 8.3% Character Error Rate.

TensorFlow · Python · CNN · LSTM · CTC Loss · NLP

The Problem

Researchers and students manually transcribing handwritten mathematical equations into LaTeX spent 15–30 minutes per page, introducing errors and slowing publication workflows.


The Solution

Built a CNN + LSTM + CTC deep learning pipeline: the CNN extracts spatial features from equation images, bidirectional LSTM layers model sequential token dependencies, and CTC loss handles variable-length output without requiring explicit input-output alignment. A custom StringLookup tokenizer maps LaTeX tokens to integer indices, and beam search decoding generates the final token sequence.
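As a concrete sketch, the pipeline above might look like the following in TensorFlow/Keras. The image size, vocabulary size, and layer widths here are illustrative assumptions, not the project's actual hyperparameters:

```python
import tensorflow as tf
from tensorflow.keras import layers

# Assumed sizes: 64x256 grayscale equation crops, 120-token LaTeX vocabulary.
IMG_H, IMG_W, VOCAB = 64, 256, 120

def build_model():
    inp = tf.keras.Input(shape=(IMG_H, IMG_W, 1), name="image")
    # CNN backbone: conv blocks extract spatial features from the equation image.
    x = layers.Conv2D(32, 3, padding="same", activation="relu")(inp)
    x = layers.MaxPooling2D(2)(x)                        # -> (H/2, W/2, 32)
    x = layers.Conv2D(64, 3, padding="same", activation="relu")(x)
    x = layers.MaxPooling2D(2)(x)                        # -> (H/4, W/4, 64)
    # Treat image width as the time axis for the recurrent layers.
    x = layers.Permute((2, 1, 3))(x)                     # -> (W/4, H/4, 64)
    x = layers.Reshape((IMG_W // 4, (IMG_H // 4) * 64))(x)
    # Bidirectional LSTM models token dependencies along the width axis.
    x = layers.Bidirectional(layers.LSTM(128, return_sequences=True))(x)
    # One extra output class for the CTC blank symbol.
    return tf.keras.Model(inp, layers.Dense(VOCAB + 1, name="logits")(x))

def ctc_loss(labels, logits, label_length):
    # CTC aligns variable-length label sequences to the time axis implicitly,
    # so no per-character position annotations are needed in the training data.
    logit_length = tf.fill([tf.shape(logits)[0]], tf.shape(logits)[1])
    return tf.nn.ctc_loss(labels, logits, label_length, logit_length,
                          logits_time_major=False, blank_index=VOCAB)
```

Collapsing the image height into the feature dimension while keeping width as the sequence axis is what lets the recurrent layers read the equation left to right.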


The Impact

Achieved a Character Error Rate (CER) of 8.3% and a Word Error Rate (WER) of 12.1% on the IM2LATEX-100K benchmark. Reduced transcription time from 20 minutes to under 30 seconds per equation.
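The CER figure above is character-level edit distance normalized by reference length. A minimal sketch of the metric (the 8.3% comes from the actual evaluation, not this snippet):

```python
def edit_distance(ref, hyp):
    # Single-row dynamic-programming Levenshtein distance.
    m, n = len(ref), len(hyp)
    dp = list(range(n + 1))
    for i in range(1, m + 1):
        prev, dp[0] = dp[0], i
        for j in range(1, n + 1):
            cur = dp[j]
            dp[j] = min(dp[j] + 1,                           # deletion
                        dp[j - 1] + 1,                       # insertion
                        prev + (ref[i - 1] != hyp[j - 1]))   # substitution
            prev = cur
    return dp[n]

def cer(ref, hyp):
    # Character Error Rate: edits needed, per reference character.
    return edit_distance(ref, hyp) / max(len(ref), 1)
```

WER is computed the same way after splitting both strings into LaTeX tokens instead of characters.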


Tech Details

  • CNN backbone for spatial feature extraction from equation image patches
  • Bidirectional LSTM layers for sequential token dependency modeling
  • CTC (Connectionist Temporal Classification) Loss — eliminates need for explicit input-output alignment
  • Custom StringLookup tokenizer mapping LaTeX tokens to integer indices
  • Beam search decoding (beam width = 5) for optimal sequence generation
  • Data augmentation: rotation, noise injection, contrast variation for training robustness
  • Evaluated on IM2LATEX-100K benchmark: CER 8.3%, WER 12.1%
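The CTC decoding step in the list above can be illustrated by the collapse rule CTC applies to per-frame predictions. A greedy version is shown for brevity; the project applies beam search (width 5) over the same per-frame distribution before collapsing:

```python
BLANK = 0  # assumed index of the CTC blank symbol

def ctc_collapse(frame_ids):
    # Merge consecutive repeats, then drop blanks: e.g. "aa-ab-" -> "aab".
    # This is how CTC maps a long per-frame sequence to a short label sequence.
    out, prev = [], None
    for t in frame_ids:
        if t != prev and t != BLANK:
            out.append(t)
        prev = t
    return out
```

Note that a blank between two identical tokens preserves the repeat, which is how CTC can emit doubled tokens such as `{{`.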

Interested in working together?

I'm available for freelance projects and full-time roles.

Hire Me