
Hiding Below the Surface: StreemView Uncovers 500% More Relevant Messages

December 8, 2025

An Am Law 200 firm engaged StreemView to analyze a Slack data population comprising over 700,000 messages across 5,400 distinct conversations and channels. The goal was to apply negotiated Boolean search terms in a defensible, efficient manner to identify responsive evidentiary content without overwhelming downstream review. The court had mandated a ±10-message context window around each search hit, a requirement that exposed a fundamental problem with how traditional eDiscovery tools handle modern chat data.

The Problem with 24-Hour RSMF Search

The client needed to apply negotiated Boolean and proximity search terms against a population of more than 700,000 messages scattered across thousands of channels. The conventional approach would have been to convert that data into 24-hour RSMF (Relativity Short Message Format) transcripts and run keyword searches against them.

That approach has a structural flaw that becomes critical when Boolean or proximity operators are involved. When conversations are segmented into arbitrary 24-hour blocks, AND and proximity operators can only match terms that appear within the same daily transcript. A conversation that begins on Tuesday evening and continues Wednesday morning becomes two separate documents. A search for “Project X AND settlement” will miss the hit entirely if “Project X” appears in Tuesday’s transcript and “settlement” appears in Wednesday’s.

This is not a hypothetical edge case—it is an inherent structural limitation of daily transcript segmentation. Any proximity or multi-term Boolean search applied to 24-hour RSMF data will silently miss hits that span a time boundary, with no indication that relevant content exists on the other side.
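The boundary problem can be illustrated with a minimal Python sketch. The two messages, the helper names (`matches_and`, `daily_transcripts`), and the simple substring matching are all hypothetical and chosen for illustration; they do not represent how any particular eDiscovery tool or RSMF processor works.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical conversation: Tuesday evening, continued Wednesday morning.
messages = [
    (datetime(2025, 3, 4, 21, 45), "Any update on Project X before we talk to counsel?"),
    (datetime(2025, 3, 5, 8, 10), "Yes, the settlement number came in overnight."),
]

def matches_and(text, terms):
    """True when every term appears in the text (case-insensitive AND)."""
    lowered = text.lower()
    return all(term.lower() in lowered for term in terms)

def daily_transcripts(msgs):
    """Group message bodies into one searchable document per calendar day."""
    by_day = defaultdict(list)
    for ts, body in msgs:
        by_day[ts.date()].append(body)
    return {day: " ".join(bodies) for day, bodies in by_day.items()}

terms = ["Project X", "settlement"]

# Segmented search: each daily transcript is evaluated on its own.
segmented_hits = [
    day for day, doc in daily_transcripts(messages).items()
    if matches_and(doc, terms)
]

# Conversation-level search: the whole conversation is one searchable unit.
conversation_hit = matches_and(" ".join(body for _, body in messages), terms)

print(segmented_hits)    # []
print(conversation_hit)  # True
```

Neither daily document contains both terms, so the segmented search returns nothing, while the same query over the unsegmented conversation matches.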

The StreemView Approach: Conversation-Native Search

StreemView’s Hit Window technology addresses this by operating on conversations as complete units rather than as time-segmented transcripts. Search terms are applied across the full natural span of each conversation, meaning AND and proximity operators can match across days, threads, and any temporal boundary that the conversation itself crosses.

When a search hit is identified, the Hit Window expands the result to include the ±10-message context window mandated by the court—capturing not just the matching message but the surrounding discussion that gives it meaning. The expansion follows the conversation’s natural boundaries rather than an arbitrary 24-hour cutoff, preserving the full flow of decision-making and ensuring that context is never artificially severed.
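As a rough sketch of what a ±10-message expansion might look like, the function below takes the positions of hits within a single conversation and returns the message ranges to export. The name `expand_hits`, the index-based representation, and the choice to merge overlapping windows are assumptions made for illustration, not a description of StreemView's implementation.

```python
def expand_hits(hit_indexes, conversation_length, radius=10):
    """Return merged, inclusive (start, end) index ranges covering each hit
    plus `radius` messages on either side, clamped to the conversation."""
    ranges = []
    for hit in sorted(hit_indexes):
        start = max(0, hit - radius)
        end = min(conversation_length - 1, hit + radius)
        if ranges and start <= ranges[-1][1] + 1:
            # Overlapping or adjacent windows are merged so that no message
            # is exported twice.
            ranges[-1] = (ranges[-1][0], max(ranges[-1][1], end))
        else:
            ranges.append((start, end))
    return ranges

# Hits at messages 3, 12, and 60 in a hypothetical 65-message conversation.
print(expand_hits([3, 12, 60], conversation_length=65))  # [(0, 22), (50, 64)]
```

Because the windows are computed over the conversation's own message sequence, a hit near midnight still receives its full context from both sides of the date change.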

This approach also enables a precision mechanism the 24-hour model cannot offer: scoping results to a defined hit window. Limiting results to conversations where a hit occurred within a rolling 30-day period, for instance, eliminates peripheral content that technically falls within the dataset but well outside any relevant discussion.
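The article does not spell out how the 30-day window is measured. One possible reading, sketched below purely as an assumption, is to keep only messages whose timestamps fall within 30 days of at least one hit. The function name `within_hit_window` and the sample dates are hypothetical.

```python
from datetime import datetime, timedelta

def within_hit_window(message_times, hit_times, window_days=30):
    """Keep timestamps that fall within `window_days` of at least one hit.
    This is one interpretation of a scoped hit window, not a documented rule."""
    window = timedelta(days=window_days)
    return [
        ts for ts in message_times
        if any(abs(ts - hit) <= window for hit in hit_times)
    ]

message_times = [
    datetime(2025, 1, 1),
    datetime(2025, 3, 1),
    datetime(2025, 3, 20),
    datetime(2025, 6, 1),
]
hit_times = [datetime(2025, 3, 10)]

kept = within_hit_window(message_times, hit_times)
print([ts.date().isoformat() for ts in kept])  # ['2025-03-01', '2025-03-20']
```

Under this reading, the January and June messages belong to the same long-running channel as the hit but are dropped as peripheral.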

Results: The Numbers

The side-by-side comparison between StreemView and the traditional 24-hour RSMF approach on the same dataset was stark.

Direct search hits identified:

  • StreemView: 1,206 hits across 41 conversations
  • Traditional 24-hour RSMF: 192 hits across 33 conversations

Total messages and attachments identified for export (applying the court-mandated ±10 context window):

  • StreemView: 10,563 items, roughly a 98.5% reduction from the 700k+ messages ingested
  • Traditional 24-hour RSMF: 2,012 items

Relevant message yield:

  • More than 500% more direct hits with StreemView (1,206 versus 192, roughly 6.3 times as many)
  • 81% of relevant content by message volume was missed by the traditional approach but successfully identified by StreemView
  • 76% of relevant content by RSMF count was missed by the traditional approach
  • 20% of distinct conversations with hits were missed by the traditional 24-hour approach—meaning the traditional method left the firm without knowledge of who participated in those relevant discussions
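Several of these percentages can be reproduced from the counts reported above. The short calculation below assumes that the miss rates are measured against StreemView's totals and that the traditional results are a subset of StreemView's; the article does not state either point explicitly. The 76% RSMF-count figure cannot be rechecked because the underlying counts are not given.

```python
# Counts reported in the case study.
streemview_hits, rsmf_hits = 1206, 192
streemview_items, rsmf_items = 10563, 2012
streemview_convs, rsmf_convs = 41, 33

additional_hits = streemview_hits - rsmf_hits
pct_more_hits = additional_hits / rsmf_hits * 100

# Assumption: miss rates use the StreemView totals as the denominator.
missed_by_volume = (streemview_items - rsmf_items) / streemview_items * 100
missed_convs = (streemview_convs - rsmf_convs) / streemview_convs * 100

print(additional_hits)                                 # 1014
print(f"{pct_more_hits:.0f}% more direct hits")        # 528% more direct hits
print(f"{missed_by_volume:.0f}% missed by volume")     # 81% missed by volume
print(f"{missed_convs:.1f}% of conversations missed")  # 19.5% of conversations missed
```

On these assumptions the hit increase works out to about 528%, which the headline figure rounds down to 500%, and the conversation miss rate of 19.5% rounds to the reported 20%.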

Precision gain from scoped hit window:

  • Eliminated 13,467 peripheral items that fell outside the 30-day hit window
  • Improved result-set precision by 56%
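The article does not say how the 56% figure was derived. One reading that fits the reported counts, offered here only as an assumption, is that it is the share of the unscoped result set removed by the 30-day window, where the unscoped set is the 10,563 exported items plus the 13,467 eliminated ones.

```python
eliminated_items = 13467   # peripheral items outside the 30-day hit window
exported_items = 10563     # items remaining after scoping

# Assumption: the unscoped result set is the sum of the two counts above.
unscoped_total = eliminated_items + exported_items
share_removed = eliminated_items / unscoped_total * 100

print(unscoped_total)            # 24030
print(f"{share_removed:.0f}%")   # 56%
```

If that reading is right, the figure describes a reduction in result-set volume rather than precision in the strict information-retrieval sense, which would require knowing how many of the remaining items were actually responsive.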

What the Numbers Mean

The 192 hits the traditional approach returned were not wrong—they were just profoundly incomplete. The 1,014 additional hits StreemView identified weren’t marginal or borderline responsive content. They were substantive communications about the disputed issues that the 24-hour segmentation model had placed out of reach of the firm’s own search queries.

The 81% miss rate by message volume is the figure that matters most. In practical terms, a legal team relying on traditional 24-hour RSMF search would have certified its search methodology while working with less than one-fifth of the genuinely responsive content in the dataset. The remaining 81% existed in the export. It had been processed and ingested. It simply could not be found by tools that search artificial daily segments rather than actual conversations.

The 20% of distinct conversations missed is equally significant. Missing a conversation entirely means missing not just messages but participants—losing visibility into who was involved in responsive discussions, which matters for custodian identification, litigation hold completeness, and deposition preparation.

Precision Without Sacrifice

One of the consistent concerns about conversation-based search is over-inclusion—that capturing full conversational context will expand result sets to an unmanageable size. The precision numbers here address that directly.

By scoping results to a 30-day hit window and applying the ±10-message context requirement, StreemView eliminated 13,467 peripheral items and delivered a 56% improvement in result-set precision—while simultaneously capturing 500% more direct hits than the traditional method. The firm ended up with a larger hit count and a more precise, defensible result set, not a sprawling over-inclusive one.

The roughly 98.5% reduction from 700,000+ ingested messages to 10,563 export items demonstrates that conversation-native search, applied with appropriate scoping, achieves both goals: high recall on responsive content and a steep reduction in non-responsive volume.

Strategic Takeaway

This matter illustrates why StreemView’s conversation-based search and Hit Window approach are more reliable than 24-hour segmentation for applying Boolean and proximity terms to modern text message and chat data. By breaking free of rigid 24-hour segmentation and targeting only the content that matters, augmented by a court-compliant context window, legal teams can dramatically boost hit rates, reduce review volumes, and control costs without compromising defensibility.

For litigation teams handling modern disputes, the lesson is clear: traditional eDiscovery search methodology designed for email and documents significantly under-discovers chat evidence. In high-stakes matters, the difference between finding 192 hits and finding 1,206 is not an optimization—it is the difference between a defensible, complete production and one that quietly omitted 81% of the responsive record.
