Outcome/Impact
Michigan Medicine implemented a scalable OCR pipeline to process high volumes of inbound clinical faxes. It currently handles approximately 2,000 faxes per day, with plans to scale to 10,000 per day. A CPU-based model (RapidOCR) extracts text from each document, and a confidence threshold flags low-quality or handwritten regions. Only these low-confidence regions are routed to a vision-language model via the U-M GPT Toolkit, which improves accuracy without sending entire documents through the larger model. Because the documents contain PHI, the solution relies on local models and approved U-M tools, keeping processing compliant while still delivering advanced AI capabilities. This approach enables high-throughput, secure document processing and significantly improves the quality of data extracted from traditionally difficult-to-parse sources such as handwritten faxes.
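
The confidence-based routing described above can be illustrated with a minimal Python sketch. It is not the production implementation: the `OcrRegion` structure, the `route_regions` function, the `transcribe_with_vlm` callback, and the 0.80 cutoff are all hypothetical names and values chosen for illustration, and the sketch calls neither the OCR engine nor the U-M GPT Toolkit.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


# Hypothetical structure: one recognized text region from the OCR pass.
@dataclass
class OcrRegion:
    box: Tuple[int, int, int, int]  # (x0, y0, x1, y1) in page pixels
    text: str                       # text produced by the CPU-based OCR model
    confidence: float               # 0.0 to 1.0, as reported by the OCR model


# Assumed cutoff for illustration; the actual threshold is not stated.
CONFIDENCE_THRESHOLD = 0.80


def route_regions(
    regions: List[OcrRegion],
    transcribe_with_vlm: Callable[[OcrRegion], str],
    threshold: float = CONFIDENCE_THRESHOLD,
) -> Tuple[List[str], int]:
    """Return the final text for each region and the count sent to the VLM.

    Regions at or above the threshold keep their OCR text. Regions below it
    are passed to the caller-supplied vision-language-model callback.
    """
    texts: List[str] = []
    escalated = 0
    for region in regions:
        if region.confidence >= threshold:
            texts.append(region.text)
        else:
            texts.append(transcribe_with_vlm(region))
            escalated += 1
    return texts, escalated


if __name__ == "__main__":
    # Stand-in for the real vision-language-model call (hypothetical).
    def fake_vlm(region: OcrRegion) -> str:
        return f"[VLM re-read of region {region.box}]"

    page = [
        OcrRegion((40, 30, 560, 60), "Referral for cardiology consult", 0.96),
        OcrRegion((40, 80, 560, 140), "p1s sch3d w/in 2 wks", 0.42),
    ]
    final_texts, sent_to_vlm = route_regions(page, fake_vlm)
    for line in final_texts:
        print(line)
    print(f"{sent_to_vlm} of {len(page)} regions escalated")
```

Passing the vision-language model in as a callback keeps the routing logic independent of any particular service, so the same function can be exercised with a stub in tests and with an approved endpoint in production.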