
"We highly recommend Servient for its leading-edge Predictive Review tools, lightning fast searches, superb user interface and outstanding customer support." Arthur S. Linker, Partner, Katten Muchin Rosenman LLP

Understanding Statistical Validation

Statistical validation measures the effectiveness of the Predictive Review process. Specifically, the statistical analysis measures how closely the automated document decisions agree with the decisions that would be made if the same documents were reviewed manually.

Servient uses two interrelated statistical methods in the validation process.


First, Servient implements K-Fold Cross Validation to measure the effectiveness of the learning model. To perform K-Fold Cross Validation, the reviewed documents are split at random into K subsets of equal size. Servient holds out one of the subsets and trains the learning model on the remaining documents. Servient then applies the model to the held-out set, generating an automated review call for each document in that set.

The manual review calls applicable to the held-out set are compared against the automated review calls to measure the consistency between the manual and automated review decisions. This process is repeated for each subset of documents. The results of each run are averaged to generate an overall consistency measurement for the learning model.
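The procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Servient's implementation: the function names, the default of five folds, and the toy keyword model standing in for the learning model are all hypothetical.

```python
import random
from collections import Counter

def k_fold_consistency(docs, calls, train, k=5, seed=0):
    """Average agreement between manual and automated calls over k folds."""
    if not 2 <= k <= len(docs):
        raise ValueError("k must be between 2 and the number of documents")
    indices = list(range(len(docs)))
    random.Random(seed).shuffle(indices)       # random subset membership
    folds = [indices[i::k] for i in range(k)]  # k subsets of near-equal size

    rates = []
    for held_out in folds:
        held = set(held_out)
        train_idx = [i for i in indices if i not in held]
        # Train on everything except the held-out subset.
        model = train([docs[i] for i in train_idx],
                      [calls[i] for i in train_idx])
        # Compare automated calls with manual calls on the held-out subset.
        agree = sum(model(docs[i]) == calls[i] for i in held_out)
        rates.append(agree / len(held_out))
    return sum(rates) / k                      # average over all runs

def train_keyword_model(docs, calls):
    """Toy stand-in (hypothetical): weight words by the calls they co-occur with."""
    weights = Counter()
    for text, relevant in zip(docs, calls):
        for word in set(text.lower().split()):
            weights[word] += 1 if relevant else -1
    return lambda text: sum(weights[w] for w in set(text.lower().split())) > 0

# Ten reviewed documents with manual calls (True = relevant).
docs = [f"contract amendment draft {i}" for i in range(5)] + \
       [f"lunch menu update {i}" for i in range(5)]
calls = [True] * 5 + [False] * 5
consistency = k_fold_consistency(docs, calls, train_keyword_model, k=5)
print(f"consistency rate: {consistency:.2f}")
```

Each document is held out exactly once, so every manual call contributes to the averaged consistency rate.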

The consistency rate produced by K-Fold Cross Validation serves as an indicative measure of the model's effectiveness in distinguishing between relevant and irrelevant documents. Because Servient's "active" learning technology is premised on the intelligent selection of documents to review, the set of reviewed documents provides a statistically relevant overview of the important features contained in the entire data set. Thus, K-Fold Cross Validation produces a meaningful measure of the model's effectiveness.

However, because K-Fold Cross Validation focuses only on the documents that have been reviewed, the Servient validation process also includes additional statistical sampling of the non-reviewed documents. Servient automatically generates a random sample of non-reviewed documents, sized according to the desired confidence level and acceptable error rate. Attorneys manually review the sample, the results are compared against the automated document decisions, and various quality measures are generated.
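As a rough illustration of how a sample might be sized and scored, the sketch below uses the standard sample-size formula for a proportion with a finite population correction. The page does not say which formula or quality measures Servient uses, so the formula, the function names, the defaults (95% confidence, 5% error, worst-case proportion of 0.5), and the choice of agreement, precision, and recall are all assumptions.

```python
import math
import random
from statistics import NormalDist

def sample_size(population, confidence=0.95, margin=0.05, p=0.5):
    """Documents to sample for a given confidence level and error margin."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided z-score
    n0 = z * z * p * (1 - p) / margin ** 2          # infinite-population size
    n = n0 / (1 + (n0 - 1) / population)            # finite population correction
    return min(population, math.ceil(n))

def draw_sample(doc_ids, n, seed=0):
    """Uniform random sample of non-reviewed document ids, without replacement."""
    return random.Random(seed).sample(list(doc_ids), n)

def quality_measures(manual, automated):
    """Compare manual and automated calls (True = relevant) on the sample."""
    pairs = list(zip(manual, automated))
    tp = sum(m and a for m, a in pairs)             # both say relevant
    fp = sum(a and not m for m, a in pairs)         # only the model says relevant
    fn = sum(m and not a for m, a in pairs)         # only the attorney says relevant
    return {
        "agreement": sum(m == a for m, a in pairs) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

# 100,000 non-reviewed documents, 95% confidence, 5% acceptable error.
n = sample_size(100_000)
sample_ids = draw_sample(range(100_000), n)
```

Tightening the acceptable error rate or raising the confidence level increases the number of documents attorneys must review by hand.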

The combination of K-Fold Cross Validation and statistical sampling provides a transparent analysis of the quality of the process. Predictive Review is based upon a sound, repeatable and measurable process.
