Judiciary — Teaching the Bench
Working with the federal judiciary on the practical use of AI tools — through training, research, chambers-facing tools, and structured feedback.
The Lab’s judiciary workstream — the PennLaw-Judiciary AI Testbed — provides participating federal and state judges and their chambers with structured AI access, training, and ongoing support, governed by a formal Memorandum of Understanding. The work spans use-case research, chambers-facing tools, and a continuous feedback loop with participating chambers.
Per the Lab’s pillar architecture, this work sits in Teach — judicial AI training is teaching the bench, not partnership-mediated Building.
What it is
A research partnership across multiple federal courts and the Delaware Court of Chancery, organized around shared infrastructure, structured onboarding, and a regular cadence of feedback.
- Participating courts: E.D. Pa., D.N.J., the Third Circuit, and the Delaware Court of Chancery — 15+ chambers and Pro Se offices across the four jurisdictions.
- Infrastructure: Shared workspace per chambers; orientation materials maintained across model upgrades; onboarding scripts; a formal Memorandum of Understanding.
- Cadence: Monthly Zoom feedback meetings, plus one-on-one sessions with individual chambers as needs arise.
- Deliverables: Orientation guide, best-practices documents distilled from clerk interviews, activity reports.
What we’ve learned (so far)
The Testbed is producing a working empirical picture of where AI helps in chambers and where it creates risk.
Where chambers get genuine value:
- Section-by-section opinion drafting (full drafts hallucinate; section-by-section consistently works)
- Procedural history and factual background from uploaded pleadings
- Summarizing party arguments across multiple briefs
- Oral-argument question generation
- Plea colloquy and scheduling-order scripts (repetitive, template-driven tasks)
- Timelines and charts assembled from case records
- Proofreading and citation formatting
- Digesting voluminous pro se pleadings
- Rewriting content for different audiences
- Custom GPTs for specific motion types or doctrinal tests
Where AI fails — and where the failure mode matters:
- Full opinion drafts (hallucination)
- Independent legal research and case law retrieval (high hallucination rates on case citations)
- Nuanced or cutting-edge legal analysis
- Writing-style mimicry
- Stream-of-consciousness organization
- Working with sealed or multimedia content
Emerging governance issues:
- AI-generated filings from litigants — both pro se and represented — with fabricated quotes and plausible-sounding but legally unsound arguments.
- Judicial-ethics questions reaching the Judicial Conference level.
- The transparency question — what should the public know about how chambers use AI?
Status
Active and expanding. Monthly meetings with 15+ chambers continue; the orientation materials have been updated across successive model generations; new chambers are being onboarded.
Sensitivity
The most external-facing of the Lab’s workstreams. Chambers context is presumptively confidential. Public materials describe the program structure and aggregated findings — not the work of any individual chambers.