Large and Complex Mobile Phone Investigation: 88% Review Volume Reduction
When a government fraud investigation required the analysis of communications across 12 mobile devices, the legal team faced a challenge that most eDiscovery platforms are not equipped to handle: massive, overlapping mobile data from multiple custodians with a critical need to surface only the most relevant evidence.
The matter involved complex financial fraud allegations. Communications were spread across iMessage, SMS, and third-party messaging applications on devices belonging to the key individuals under investigation. At nearly five million messages, the volume made traditional linear review economically untenable and practically unworkable without intelligent preprocessing.
The Scale
The investigation produced data from 12 mobile devices belonging to custodians identified as central to the matter. The raw dataset included:
- 4.9 million messages across iMessage, SMS, and third-party messaging apps
- 500,000 distinct conversations
- Communications spanning multiple years
- Overlapping conversations across custodians with significant duplication
Standard mobile eDiscovery practice would have converted the full dataset to RSMF (Relativity Short Message Format) and presented 741,000 review records to outside counsel. At standard per-document review rates, that volume represented a six-figure review cost before any substantive legal analysis began.
The Approach
StreemView processed the mobile extraction data through a multi-stage pipeline designed to identify relevant communications before RSMF conversion.
Comprehensive Data Consolidation. All 12 device extractions were ingested and normalized into a unified dataset. StreemView's pipeline handled the format variation across extraction sources, standardizing message metadata, conversation threading, and contact attribution across all devices.
Intelligent De-Duplication. The same conversation thread appearing across multiple devices — a common occurrence when custodians communicate with each other — was deduplicated to a single review record with full custodian attribution. This alone eliminated a substantial portion of the raw volume without losing any evidentiary record.
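As a rough illustration of deduplication with custodian attribution, the sketch below collapses copies of one message into a single record that remembers every device it was found on. It is a simplified, message-level example, not StreemView's implementation: the `Message` fields, the matching key, and the assumption that senders and timestamps are already normalized are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    custodian: str  # owner of the device the copy was extracted from
    sender: str     # assumed already normalized to one identifier per person
    sent_at: str    # assumed ISO 8601 UTC, comparable across devices
    body: str


def dedupe(messages):
    """Collapse copies of the same message into one record listing all custodians."""
    records = {}
    for m in messages:
        # Hypothetical matching key: sender, timestamp, whitespace-collapsed body.
        key = (m.sender, m.sent_at, " ".join(m.body.split()).casefold())
        rec = records.setdefault(
            key,
            {"sender": m.sender, "sent_at": m.sent_at, "body": m.body, "custodians": set()},
        )
        rec["custodians"].add(m.custodian)
    return list(records.values())
```

Because the custodian set travels with the merged record, removing the duplicate copy does not remove the fact that the message existed on both devices.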
Profile Normalization. StreemView identified and normalized 23 key individuals across the investigation. Mobile data presents a particular challenge here: the same person may appear under different phone numbers, contact names, or identifiers across different devices. Profile normalization ensured that communications involving the same individual were correctly attributed regardless of how they appeared in each custodian's contact data.
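A minimal sketch of the idea, assuming a hand-built alias table: each identifier seen on any device is reduced to a comparable form and then mapped to one canonical profile. The table contents, the function names, and the seven-digit phone heuristic are illustrative assumptions, not details from the investigation.

```python
import re

# Hypothetical alias table: normalized identifier -> canonical profile name.
ALIASES = {
    "+15550100": "Person A",
    "person a (work)": "Person A",
    "p. a.": "Person A",
}


def normalize_identifier(raw):
    """Reduce a phone number or contact label to a comparable form."""
    text = raw.strip()
    digits = re.sub(r"\D", "", text)
    # Treat letter-free strings with at least 7 digits as phone numbers (assumption).
    if len(digits) >= 7 and not re.search(r"[A-Za-z]", text):
        return "+" + digits
    return " ".join(text.split()).casefold()


def resolve(raw, aliases=ALIASES):
    """Map an identifier to its canonical profile; unknown identifiers pass through."""
    key = normalize_identifier(raw)
    return aliases.get(key, key)
```

With this table, `resolve("+1 (555) 0100")` and `resolve("Person A  (Work)")` both return `"Person A"`, while an identifier that is not in the table is returned in normalized form and left unmerged for a human to review.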
Keyword Filtering. Investigation-specific search terms and keyword sets were applied across the normalized, deduplicated dataset. Only conversations containing responsive terms — or providing essential context to those conversations — were flagged for RSMF conversion.
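The keyword step can be pictured with a short sketch like the one below, which flags any conversation in which at least one message hits a search term. The data shape (conversation ID mapped to a list of message bodies) and the function names are assumptions for illustration. How context-only conversations were selected is not described here, so the sketch leaves that part out.

```python
import re


def compile_terms(terms):
    """Build one case-insensitive, whole-word pattern from a list of search terms."""
    # Longest terms first so multi-word phrases win over their prefixes.
    escaped = sorted((re.escape(t) for t in terms), key=len, reverse=True)
    return re.compile(r"\b(?:" + "|".join(escaped) + r")\b", re.IGNORECASE)


def flag_conversations(conversations, terms):
    """Return IDs of conversations in which at least one message hits a term."""
    pattern = compile_terms(terms)
    return [
        conv_id
        for conv_id, bodies in conversations.items()
        if any(pattern.search(body) for body in bodies)
    ]
```

Whole-word matching keeps a term such as "wire" from hitting "rewired", which matters when short terms are run across millions of informal messages.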
Pre-RSMF Reduction. The filtering and reduction process operated on the full conversation graph before any RSMF documents were created. This meant the expensive conversion step was applied only to the subset of communications that had already been identified as potentially responsive.
The Results
After consolidation, deduplication, profile normalization, and keyword filtering, the 741,000 potential RSMF records were reduced to 86,000 — an 88% reduction in review volume.
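The headline percentage follows directly from the two record counts above. The helper name below is an illustrative choice.

```python
def reduction_pct(before, after):
    """Percentage of records removed between two counts."""
    return 100 * (before - after) / before


before, after = 741_000, 86_000
print(f"{before - after:,} records removed")             # 655,000 records removed
print(f"{reduction_pct(before, after):.1f}% reduction")  # 88.4% reduction
```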
Direct cost savings: approximately $100,000. The reduction from 741,000 to 86,000 review documents translated directly into avoided review cost: at standard outside counsel review rates, that was the saving on document-level fees alone. Once processing, hosting, and QC costs are included, the total savings were substantially higher.
Investigation efficiency. Beyond cost, the reduction in volume meant the legal team could move faster. Reviewers engaging with 86,000 focused, relevant documents rather than 741,000 raw message records spent their time on substantive analysis rather than filtering noise.
Defensible workflow. The pre-RSMF reduction process produced a documented, reproducible methodology — every filtering decision was logged and auditable. For a government investigation where the collection and processing workflow itself may be scrutinized, that defensibility was not a secondary concern.
Why Mobile Data Is Different
Mobile eDiscovery presents challenges that do not exist in enterprise messaging platforms. Enterprise Slack or Teams data is collected from a central server; mobile data is extracted from individual physical devices using forensic tools. Each extraction reflects a different device, a different OS version, a different application state.
The result is format inconsistency at scale: the same conversation may look structurally different in two custodians' device extractions. Contact normalization is manual without a purpose-built pipeline. Deduplication across devices requires conversation-level matching rather than simple hash comparison.
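To make the contrast with hash comparison concrete, here is a hedged sketch of conversation-level matching. Two threads are treated as the same conversation when their sets of normalized messages overlap heavily, even if one device is missing messages. The field names, the Jaccard measure, and the 0.8 threshold are assumptions chosen for illustration, not a description of StreemView's matching logic.

```python
def message_key(msg):
    """Device-independent key: sender, timestamp, whitespace-collapsed body."""
    return (msg["sender"], msg["sent_at"], " ".join(msg["body"].split()).casefold())


def overlap(thread_a, thread_b):
    """Jaccard similarity between the message sets of two threads."""
    a = {message_key(m) for m in thread_a}
    b = {message_key(m) for m in thread_b}
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def same_conversation(thread_a, thread_b, threshold=0.8):
    """Treat two threads as one conversation when their overlap meets the threshold."""
    return overlap(thread_a, thread_b) >= threshold
```

A byte-level hash of the two extractions would differ as soon as one device had a deleted message or a different export layout. Comparing message sets tolerates both.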
StreemView's mobile pipeline was designed for this environment — ingesting extraction formats from Cellebrite and other standard forensic tools, normalizing across device and OS variation, and applying the same pre-RSMF reduction methodology that has delivered 90%+ volume reduction across enterprise chat matters.
The 88% reduction in this investigation was not an outlier. It was the result of applying a consistent, conversation-first approach to a data type that had previously resisted it.