What does 93% accuracy mean here? My understanding is that tech like LLMs are not able to give a confidence score unlike traditional OCR methods. How was 93% calculated? How are the parsing errors surfaced?
Great question.
The 93% refers to field-level accuracy on a labeled test set; a field is counted as correct only if it matches the ground truth (after basic normalization for dates, currency, etc.). It’s not document-level accuracy.
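As a rough sketch of what that metric looks like (the normalization rules and field names below are illustrative, not the actual pipeline):

```python
from datetime import datetime

def normalize(field_name, value):
    """Normalize a field value before comparison (dates, currency).
    These rules are examples, not the product's actual normalizers."""
    v = value.strip()
    if field_name == "date":
        # Accept a few common formats; emit ISO 8601 for comparison.
        for fmt in ("%Y-%m-%d", "%m/%d/%Y", "%d %b %Y"):
            try:
                return datetime.strptime(v, fmt).date().isoformat()
            except ValueError:
                pass
        return v
    if field_name == "amount":
        # Strip currency symbols/commas so "$1,200.00" matches "1200.00".
        return v.replace("$", "").replace(",", "")
    return v.lower()

def field_level_accuracy(predictions, ground_truth):
    """Fraction of ground-truth fields whose predicted value matches
    after normalization. Each field counts once; missing fields count
    as wrong. This is field-level, not document-level, accuracy."""
    total = correct = 0
    for doc_id, true_fields in ground_truth.items():
        pred_fields = predictions.get(doc_id, {})
        for name, true_val in true_fields.items():
            total += 1
            pred_val = pred_fields.get(name)
            if pred_val is not None and normalize(name, pred_val) == normalize(name, true_val):
                correct += 1
    return correct / total if total else 0.0

preds = {"doc1": {"date": "03/15/2024", "amount": "$1,200.00", "vendor": "Acme"}}
truth = {"doc1": {"date": "2024-03-15", "amount": "1200.00", "vendor": "ACME"}}
print(field_level_accuracy(preds, truth))  # 1.0
```

The key point is that the denominator is fields, not documents: a document with nine correct fields and one wrong one contributes 90%, not 0%.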
We don’t use raw LLM probabilities for confidence. The score is based on additional validation checks (cross-field consistency, format rules, reconciliation). Each field is returned with its confidence and any validation flags so errors are visible and reviewable.
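To make the validation-based scoring concrete, here is a minimal sketch; the specific check names, field names, and penalty weights are hypothetical stand-ins for whatever rules the real pipeline runs:

```python
def score_field(name, value, doc):
    """Attach validation flags and a rule-based confidence to one field.
    Confidence comes from deterministic checks, not LLM token probabilities.
    The checks and the 0.4 penalty step are illustrative assumptions."""
    flags = []
    if name == "total":
        # Reconciliation check: subtotal + tax should equal total (to the cent).
        try:
            if abs(float(doc["subtotal"]) + float(doc["tax"]) - float(value)) > 0.01:
                flags.append("reconciliation_mismatch")
        except (KeyError, ValueError):
            flags.append("unparseable_amount")
    if not value.strip():
        flags.append("missing_value")
    # Each failed check lowers confidence by a fixed step, floored at 0.
    confidence = max(1.0 - 0.4 * len(flags), 0.0)
    return {"value": value, "confidence": confidence, "flags": flags}

doc = {"subtotal": "100.00", "tax": "8.00", "total": "120.00"}
print(score_field("total", doc["total"], doc))
# flags ['reconciliation_mismatch'], confidence 0.6
```

Because every field carries its flags, a reviewer can filter for low-confidence or flagged fields instead of re-checking whole documents.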