- OpenAI and PNNL have teamed up to assess AI coding agents for environmental review drafting tasks
- DraftNEPABench supports federal efforts to speed up environmental impact statement development and permitting workflows
- The benchmark includes 102 real-world test cases from environmental impact statements across 19 federal agencies
The Department of Energy’s Pacific Northwest National Laboratory and OpenAI have partnered to evaluate how artificial intelligence coding agents could assist federal agencies in drafting environmental impact statements under the National Environmental Policy Act, or NEPA.
PNNL said Monday that the research effort centers on DraftNEPABench, a benchmark developed by PNNL’s PermitAI team and OpenAI. It assesses whether AI coding agents originally designed for software development can generate structured draft sections of environmental impact statements and, in doing so, support faster federal environmental reviews.
What Is PermitAI?
PermitAI is a data platform that uses AI tools to streamline and speed up the environmental review process for critical federal infrastructure projects. Backed by the DOE’s Office of Policy, it started as a pilot project designed to centralize NEPA decision data.
The platform uses a testbed equipped with large language models and a large repository of historical environmental review data. The PermitAI team also developed the NEPA Text Corpus, a machine-readable dataset containing more than 120,000 searchable environmental review documents and decisions. PNNL said releasing the dataset has made historical NEPA data and decisions easier to search and access.
What Can DraftNEPABench Do?
According to PNNL, DraftNEPABench evaluates AI systems that can generate targeted sections of environmental impact statements based on detailed prompts rather than simply summarizing information. The benchmark also measures a system’s ability to incorporate information from multiple sources while producing drafts supported by technical references and citations that allow reviewers to trace the original documentation.
PNNL said coding agents differ from conventional AI chat systems because they complete a task through a series of steps: retrieving relevant information, processing multiple documents and refining draft content through repeated iterations, much as a human analyst would.
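To make the multistep behavior described above concrete, here is a minimal Python sketch of a retrieve, draft and refine loop. It is a toy illustration only, not PNNL's or OpenAI's implementation: the names `Draft`, `retrieve`, `missing_topics` and `run_agent`, the keyword matching and the document IDs are all hypothetical stand-ins for what a real agent would do with a language model and a document store.

```python
from dataclasses import dataclass, field


@dataclass
class Draft:
    """Hypothetical container for draft text and the sources it cites."""
    text: str = ""
    citations: list = field(default_factory=list)


def retrieve(topic, corpus):
    """Toy retrieval: IDs of documents whose text mentions the topic."""
    return [doc_id for doc_id, body in corpus.items()
            if topic.lower() in body.lower()]


def missing_topics(draft, topics):
    """Topics the draft does not mention yet."""
    return [t for t in topics if t.lower() not in draft.text.lower()]


def run_agent(topics, corpus, max_steps=5):
    """Repeat: find a gap, retrieve sources, extend the draft with citations."""
    draft = Draft()
    for _ in range(max_steps):
        gaps = missing_topics(draft, topics)
        if not gaps:
            break  # every requested topic is covered, stop iterating
        topic = gaps[0]
        sources = retrieve(topic, corpus)
        if sources:
            draft.text += f"{topic}: see [{', '.join(sources)}]\n"
            for source in sources:
                if source not in draft.citations:
                    draft.citations.append(source)
        else:
            # No supporting document: flag the gap for a human reviewer.
            draft.text += f"{topic}: no source found; needs human review\n"
    return draft


if __name__ == "__main__":
    # Made-up documents, used only to exercise the loop.
    corpus = {
        "DOC-A": "Wetlands impacts near the project site",
        "DOC-B": "Air quality and wetlands monitoring results",
    }
    print(run_agent(["wetlands", "air quality", "noise"], corpus).text)
```

The design point is that each pass inspects the current draft, decides what is still missing and records which sources support each addition, so a reviewer can trace every statement back to a document or see that it was flagged for human review.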
“Our evaluation showed that AI coding agents can generate structured and domain-specific draft sections for environmental impact statements with promising results,” said Anurag Acharya, a data scientist at PNNL and DraftNEPABench research lead. “While the systems still require human oversight, the benchmark highlights both the potential and current limitations of these approaches.”
How Was DraftNEPABench Evaluated?
According to PNNL, DraftNEPABench was evaluated using 102 test cases derived from published environmental impact statements produced by 19 federal agencies. The dataset covers a variety of federal actions, including energy development, restoration and waste-related projects.
PNNL said 19 specialists with experience preparing environmental impact statements designed the evaluation tasks. According to the national lab, the benchmark has shown promise in reducing the time required to prepare draft documents, enabling subject matter experts to devote more attention to technical analysis and document review.
“The combination of data science expertise and private-sector AI prowess has produced a system that empowers the federal professionals responsible for scientifically credible and publicly accountable decision-making,” said Sameera Horawalavithana, a principal investigator of the PermitAI project.