How Lawline Builds a Case Timeline from Raw Documents
November 20, 2024
When an attorney uploads a discovery package to Lawline, the system doesn't see a PDF; it sees a structured problem waiting to be solved.
**The Extraction Pipeline**
Every document passes through three stages: structural analysis, entity extraction, and chronological anchoring.
In the first stage, Lawline identifies the document type — contract, deposition transcript, email chain, police report — and applies the appropriate parsing heuristics. A contract is read differently than an email thread.
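As a rough illustration of type-specific dispatch, the sketch below scores a document against a few keyword cues and picks the best-matching type. The cue patterns, the `DOC_TYPE_CUES` table, and the `classify_document` function are all hypothetical; this post does not describe Lawline's actual heuristics.

```python
import re

# Hypothetical cue patterns per document type (illustrative, not Lawline's rules).
DOC_TYPE_CUES = {
    "contract": [r"\bthis agreement\b", r"\bwhereas\b", r"\bhereinafter\b"],
    "deposition_transcript": [r"\bdeposition of\b", r"^\s*Q\.", r"^\s*A\."],
    "email_chain": [r"^from:", r"^to:", r"^subject:"],
    "police_report": [r"\bincident report\b", r"\breporting officer\b", r"\bcase number\b"],
}

FLAGS = re.IGNORECASE | re.MULTILINE


def classify_document(text: str) -> str:
    """Return the type whose cues match most often, or 'unknown' if none match."""
    scores = {
        doc_type: sum(len(re.findall(pattern, text, FLAGS)) for pattern in patterns)
        for doc_type, patterns in DOC_TYPE_CUES.items()
    }
    best_type, best_score = max(scores.items(), key=lambda item: item[1])
    return best_type if best_score > 0 else "unknown"
```

Once a type is chosen, a real system would hand the document to a parser written for that type; a production classifier would likely also use layout and metadata rather than keywords alone.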
In the second stage, the system extracts named entities: people, organizations, dates, dollar amounts, and legal references. Each extraction is kept with its source location — page number, paragraph, and bounding box.
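The idea of keeping every extraction tied to its source location can be sketched as follows. This is a minimal example that assumes an upstream layout step has already produced paragraphs with page numbers and bounding boxes; the `Paragraph` and `Extraction` types and the two regular expressions are assumptions made for illustration, and real entity extraction would cover far more than dates and dollar amounts.

```python
import re
from dataclasses import dataclass

BBox = tuple[float, float, float, float]  # (x0, y0, x1, y1) in page coordinates


@dataclass(frozen=True)
class Paragraph:
    page: int
    index: int   # paragraph number on the page
    bbox: BBox   # assumed to come from an upstream layout step
    text: str


@dataclass(frozen=True)
class Extraction:
    kind: str
    value: str
    page: int
    paragraph: int
    bbox: BBox   # paragraph-level box, a simplification for this sketch


MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")

# Two illustrative patterns only: long-form US dates and dollar amounts.
PATTERNS = {
    "date": re.compile(rf"\b(?:{MONTHS}) \d{{1,2}}, \d{{4}}\b"),
    "amount": re.compile(r"\$\d+(?:,\d{3})*(?:\.\d{2})?"),
}


def extract_entities(paragraphs: list[Paragraph]) -> list[Extraction]:
    """Find entities and record where each one came from."""
    results = []
    for para in paragraphs:
        for kind, pattern in PATTERNS.items():
            for match in pattern.finditer(para.text):
                results.append(
                    Extraction(kind, match.group(), para.page, para.index, para.bbox)
                )
    return results
```

Because the location travels with the value from the moment of extraction, later stages never have to search the document again to justify an entry.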
In the third stage, all date-anchored entities are sorted and deduplicated into a master timeline. Conflicting dates (e.g., a document signed on one date but transmitted on another) are flagged for attorney review.
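A simplified version of this anchoring step might look like the sketch below. It assumes each event carries a `subject` key naming what the event is about, and it treats any subject that appears under more than one date as a candidate conflict for review. Both the `DatedEvent` type and that conflict rule are assumptions for illustration, a deliberately crude stand-in for whatever Lawline actually does.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DatedEvent:
    when: date
    subject: str       # hypothetical key, e.g. "Lease agreement"
    description: str
    source: str        # filename the event was extracted from


def build_timeline(events: list[DatedEvent]) -> tuple[list[DatedEvent], list[str]]:
    """Deduplicate and sort events; return (timeline, subjects flagged for review)."""
    seen = set()
    unique = []
    for ev in events:
        # Same date, subject, and (normalized) description counts as a duplicate.
        key = (ev.when, ev.subject, ev.description.strip().lower())
        if key not in seen:
            seen.add(key)
            unique.append(ev)

    timeline = sorted(unique, key=lambda ev: ev.when)

    # Flag subjects that appear under more than one distinct date.
    dates_by_subject: dict[str, set[date]] = {}
    for ev in timeline:
        dates_by_subject.setdefault(ev.subject, set()).add(ev.when)
    flagged = sorted(s for s, dates in dates_by_subject.items() if len(dates) > 1)

    return timeline, flagged
```

Flagging rather than resolving is the point: the code surfaces the discrepancy and leaves the judgment about which date matters to the attorney.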
**Source Linking**
Every event in a Lawline timeline carries a citation object: the document filename, page number, and the original text that produced the extraction. This is non-negotiable. An unsourced fact in legal work is worse than no fact at all.
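One way to make that rule hard to violate is to build it into the data model, so an event cannot be constructed without a valid citation. The sketch below uses the three fields named above; the class names, field names, and validation rules are assumptions for illustration, not Lawline's actual schema.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Citation:
    filename: str
    page: int          # 1-based page number
    source_text: str   # the original text that produced the extraction

    def __post_init__(self):
        if not self.filename or self.page < 1 or not self.source_text.strip():
            raise ValueError("a citation needs a filename, a page >= 1, and source text")


@dataclass(frozen=True)
class TimelineEvent:
    when: date
    summary: str
    citation: Citation

    def __post_init__(self):
        # Reject unsourced events at construction time.
        if not isinstance(self.citation, Citation):
            raise TypeError("every timeline event must carry a Citation")


event = TimelineEvent(
    when=date(2023, 5, 1),
    summary="Lease signed by both parties",
    citation=Citation("lease.pdf", 12, "Executed this 1st day of May, 2023"),
)
```

With this shape, an unsourced fact fails loudly when it is created instead of slipping quietly into a timeline.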
**Speed**
On average, Lawline builds the timeline for a 200-page discovery package in under 10 seconds. This is possible because the pipeline processes document sections in parallel rather than sequentially.
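The section-level parallelism can be sketched with Python's standard `concurrent.futures` module. Everything specific here is an assumption: the fixed 25-page section size, the worker count, and the placeholder `process_section` function, which just counts words where a real pipeline would run extraction. A thread pool keeps the example simple; CPU-bound extraction would more likely use a process pool or a separate service.

```python
from concurrent.futures import ThreadPoolExecutor

Page = tuple[int, str]  # (page number, page text)


def split_into_sections(pages: list[Page], section_size: int = 25) -> list[list[Page]]:
    """Cut the document into fixed-size runs of pages (an assumed strategy)."""
    return [pages[i:i + section_size] for i in range(0, len(pages), section_size)]


def process_section(section: list[Page]) -> list[tuple[int, int]]:
    """Placeholder for per-section extraction: count the words on each page."""
    return [(page_number, len(text.split())) for page_number, text in section]


def process_document(pages: list[Page], section_size: int = 25,
                     max_workers: int = 8) -> list[tuple[int, int]]:
    sections = split_into_sections(pages, section_size)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in input order, so the merged output stays in page order.
        per_section = list(pool.map(process_section, sections))
    return [item for section in per_section for item in section]
```

Sections are independent until the final merge, which is what makes this kind of fan-out possible; only the chronological sort needs to see all results at once.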
The goal has always been: give the attorney the review problem, not the organization problem.