Legal AI Glossary
100+ legal AI and legal technology terms explained in plain language for lawyers, paralegals, and legal operations professionals.
Artificial Intelligence (AI)
Computer systems that perform tasks typically requiring human intelligence, such as understanding language, recognising patterns, and making decisions.
Large Language Model (LLM)
An AI model trained on massive text datasets that can generate, summarise, and analyse human language. Examples: GPT-4, Claude, Gemini, LLaMA.
Retrieval-Augmented Generation (RAG)
A technique that combines an LLM with a search system to ground AI responses in specific documents, case law, or statutes rather than relying solely on training data.
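The retrieve-then-generate flow can be sketched in a few lines. This is a minimal illustration, not a production design: the three documents, the `retrieve` and `build_prompt` helpers, and the word-overlap scoring are all assumptions made for the example, and the final call to an LLM is left out. Real systems typically replace the word-overlap step with a proper search index.

```python
import re

# Hypothetical mini-corpus; a real system would index case law, statutes, or firm documents.
DOCUMENTS = {
    "statute-1": "A written contract for the sale of land must be signed by the party to be charged.",
    "case-1": "The court held that an email exchange can satisfy the signature requirement.",
    "memo-1": "Office parking rules apply to all staff and visitors.",
}


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its distinct words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve(question: str, k: int = 2) -> list[str]:
    """Return the ids of the k documents sharing the most words with the question."""
    question_words = tokenize(question)
    ranked = sorted(
        DOCUMENTS,
        key=lambda doc_id: len(question_words & tokenize(DOCUMENTS[doc_id])),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question: str) -> str:
    """Assemble a prompt that asks the model to answer only from the retrieved sources."""
    sources = retrieve(question)
    context = "\n".join(f"[{doc_id}] {DOCUMENTS[doc_id]}" for doc_id in sources)
    return (
        "Answer using only the sources below and cite them by id.\n\n"
        f"Sources:\n{context}\n\nQuestion: {question}"
    )


print(build_prompt("Can an email satisfy the signature requirement for a contract?"))
```

The prompt that reaches the model contains the retrieved passages, so the answer can be checked against named sources instead of resting on the model's training data alone.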
Hallucination
When an AI model generates factually incorrect or fabricated information that appears plausible. Critical risk in legal contexts where accuracy is essential.
Prompt Engineering
The practice of crafting input instructions to get better, more accurate responses from AI models. In legal AI, this includes jurisdiction-specific and practice-area prompts.
Vector Embedding
A numerical representation of text that captures semantic meaning, enabling AI to find conceptually similar documents even when they use different words.
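As a rough illustration of how embeddings are compared, the sketch below measures cosine similarity between vectors. The three-dimensional vectors are made up for the example; real embedding models produce vectors with hundreds or thousands of dimensions, and the phrases and numbers here are assumptions, not output from any actual model.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two vectors (values near 1.0 mean similar direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up embeddings for illustration only.
embeddings = {
    "termination of lease": [0.9, 0.1, 0.2],
    "ending a tenancy": [0.8, 0.2, 0.3],
    "patent infringement": [0.1, 0.9, 0.4],
}

query = embeddings["termination of lease"]
for phrase, vector in embeddings.items():
    print(f"{phrase}: {cosine_similarity(query, vector):.3f}")
```

With these illustrative numbers, "ending a tenancy" scores much closer to "termination of lease" than "patent infringement" does, even though the two phrases share no words.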
Fine-Tuning
Additional training of a pre-trained AI model on domain-specific data (e.g., legal documents) to improve its performance on specialised tasks.
Natural Language Processing (NLP)
A branch of AI focused on enabling computers to understand, interpret, and generate human language. Core technology behind legal document analysis.
Optical Character Recognition (OCR)
Technology that converts scanned documents, PDFs, and images into machine-readable text. Essential for digitising legacy legal documents.
BYOK (Bring Your Own Key)
An architecture where law firms supply their own AI provider API keys, so the firm, rather than the software vendor, controls data routing, costs, and provider selection.
Token
The basic unit of text that LLMs process. In English, one token is roughly equivalent to ¾ of a word. Token limits determine how much text an AI can process in a single request.
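The ¾-of-a-word rule of thumb can be turned into a quick estimate. This is only a sketch: the function name `estimate_tokens` is hypothetical, and actual counts depend on the tokenizer each model uses, so treat the result as a planning figure only.

```python
import math


def estimate_tokens(text: str, words_per_token: float = 0.75) -> int:
    """Estimate the token count from the word count using the 3/4-word rule of thumb."""
    word_count = len(text.split())
    return math.ceil(word_count / words_per_token)


clause = "The tenant shall pay rent on the first day of each month"
print(estimate_tokens(clause))
```

For an exact count, the provider's own tokenizer is the only reliable source.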
Context Window
The maximum amount of text an LLM can consider at once, measured in tokens. Larger context windows allow analysis of longer documents.
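When a document exceeds the context window, one common workaround is to split it into pieces that each fit. The sketch below assumes a rough ratio of ¾ of a word per token and splits on whitespace; the function name `split_into_chunks` and the ratio are illustrative assumptions, and a real pipeline would usually split on section or paragraph boundaries and count tokens with the model's own tokenizer.

```python
def split_into_chunks(text: str, max_tokens: int, words_per_token: float = 0.75) -> list[str]:
    """Split text into word-based chunks that stay within an approximate token budget."""
    max_words = max(1, int(max_tokens * words_per_token))
    words = text.split()
    return [
        " ".join(words[start:start + max_words])
        for start in range(0, len(words), max_words)
    ]


contract = "This agreement is made between the landlord and the tenant named below"
for chunk in split_into_chunks(contract, max_tokens=8):
    print(chunk)
```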
Temperature
A parameter controlling AI response randomness. Lower temperature (0.0–0.3) produces more deterministic, consistent outputs preferred for legal work.
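In most LLM implementations, temperature works by dividing the model's raw scores (logits) before they are converted into probabilities. The sketch below shows that effect on three made-up scores; the function name and the numbers are assumptions for illustration, and providers differ in how they treat a temperature of exactly 0 (typically by always picking the top-scoring token).

```python
import math


def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Convert raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [score / temperature for score in logits]
    largest = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(value - largest) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate next words
print(softmax_with_temperature(logits, 0.2))  # low temperature: top choice dominates
print(softmax_with_temperature(logits, 1.0))  # higher temperature: choices are more evenly spread
```

At low temperature nearly all of the probability lands on the highest-scoring option, which is why repeated runs of the same prompt tend to agree with each other.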
Inference
The process of running input data through a trained AI model to generate output. Each AI query involves an inference step.
Agentic AI
AI systems that can plan and execute multi-step tasks autonomously, such as researching a legal question, drafting a memo, and filing it, with minimal human intervention.
Semantic Search
Search that understands the meaning behind queries rather than just matching keywords. Finds relevant case law even when exact terms differ from the search query.
Machine Learning (ML)
A subset of AI where systems learn from data to improve performance over time without being explicitly programmed for each task.
Data Residency
The physical or geographic location where data is stored. Important for GDPR, data sovereignty laws, and client confidentiality requirements.
Encryption at Rest
Protecting stored data by encoding it so it cannot be read without the decryption key. Standard requirement for legal data security.
Encryption in Transit
Protecting data as it moves between systems (e.g., between a lawyer's browser and the server) using protocols like TLS 1.3.
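For readers who configure client software, the sketch below shows one way to require TLS 1.3 using Python's standard `ssl` module. The helper name `make_tls13_context` is hypothetical, and the sketch assumes the local OpenSSL build supports TLS 1.3.

```python
import ssl


def make_tls13_context() -> ssl.SSLContext:
    """Build a client-side context that verifies certificates and refuses anything older than TLS 1.3."""
    context = ssl.create_default_context()  # certificate and hostname checks are on by default
    context.minimum_version = ssl.TLSVersion.TLSv1_3
    return context


context = make_tls13_context()
print(context.minimum_version)
```

A connection made with this context fails during the handshake if the server only offers an older protocol version.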
API (Application Programming Interface)
A set of protocols that allows different software systems to communicate. Legal software APIs enable integrations with document management, accounting, and court filing systems.
LEDES (Legal Electronic Data Exchange Standard)
An industry-standard format for electronic legal billing. Ensures invoice compatibility between law firms and corporate legal departments.
IOLTA (Interest on Lawyers' Trust Accounts)
A program where client funds that are nominal in amount or held only briefly are pooled in interest-bearing trust accounts, with the interest funding legal aid programs. Strict compliance rules vary by jurisdiction.
e-Discovery
The process of identifying, collecting, and producing electronically stored information (ESI) in litigation. AI-assisted review can substantially reduce document review costs.
Predictive Coding
An e-discovery technique where AI learns from human reviewer decisions to automatically classify documents as relevant or irrelevant.
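The idea of learning from reviewer decisions can be shown with a toy classifier. This sketch scores documents by word-count log-odds, in the spirit of naive Bayes; the four labeled examples and the function names are made up for illustration, and commercial e-discovery tools use their own, more sophisticated models.

```python
import math
import re
from collections import Counter


def words(text: str) -> list[str]:
    """Lowercase the text and return its words."""
    return re.findall(r"[a-z]+", text.lower())


def train(labeled_docs: list[tuple[str, bool]]) -> dict[bool, Counter]:
    """Count how often each word appears in documents reviewers marked relevant or irrelevant."""
    counts = {True: Counter(), False: Counter()}
    for text, is_relevant in labeled_docs:
        counts[is_relevant].update(words(text))
    return counts


def relevance_score(counts: dict[bool, Counter], text: str) -> float:
    """Sum smoothed log-odds of each known word; a positive score leans relevant."""
    vocabulary = set(counts[True]) | set(counts[False])
    total_relevant = sum(counts[True].values()) + len(vocabulary)
    total_irrelevant = sum(counts[False].values()) + len(vocabulary)
    score = 0.0
    for word in words(text):
        if word not in vocabulary:
            continue  # words never seen in training carry no evidence
        p_relevant = (counts[True][word] + 1) / total_relevant
        p_irrelevant = (counts[False][word] + 1) / total_irrelevant
        score += math.log(p_relevant / p_irrelevant)
    return score


# Hypothetical reviewer decisions (True = relevant).
reviewed = [
    ("merger agreement draft attached for review", True),
    ("indemnity clause in the merger agreement", True),
    ("lunch menu for friday", False),
    ("office holiday party schedule", False),
]
model = train(reviewed)
print(relevance_score(model, "revised merger agreement"))
print(relevance_score(model, "friday lunch schedule"))
```

In practice the model is retrained as reviewers code more documents, and its classifications are sampled and validated by humans before anyone relies on them.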