FoVer: First-Order Logic Verification for Natural Language Reasoning

Yu Pei; Yongping Du; Xingnan Jin

Vol. 13 (2025)

TACL approved

FoVer: First-Order Logic Verification for Natural Language Reasoning

Published 2025-12-25

Pei, Yu
Yongping Du
Xingnan Jin

Pei, Yu
Beijing University of Technology

Yongping Du
Beijing University of Technology

Xingnan Jin
Beijing University of Technology

Abstract

Large Language Models (LLMs) have shown remarkable capabilities in various tasks, including logical reasoning. However, their propensity for generating incorrect or inconsistent responses remains a significant concern. To address this issue, we propose FoVer (First-order logic Verification), an automated pipeline that verifies the logical correctness of reasoning texts using first-order logic. The pipeline operates in two main steps: (1) LLM-driven translation of natural language into executable logical expressions, and (2) automated logical verification using the Z3 theorem prover. We evaluate FoVer on specialized logical datasets (ProofWriter and FOLIO) and real-world LLM outputs (REVEAL). The results demonstrate that FoVer significantly outperforms existing methods in logical verification, showing notable improvements in reliability and accuracy across both ideal and practical scenarios. The pipeline also demonstrates its potential in identifying annotation errors in existing datasets, and it could be utilized for constructing new logical reasoning datasets. This work presents a significant step forward in enhancing the reliability and trustworthiness of LLM outputs, particularly in tasks requiring logical integrity.

Article at MIT Press Presented at EMNLP 2025