Disclosure – pulling your head out of the sand


In this article Cloisters’ barrister Paul Epstein QC comments on what disclosure actually means and what the obligations are in the Employment Tribunal. He discusses the different types of disclosure, the new CPR test and what parties need to do.


1. What is disclosure?

An obvious point, really, but sometimes little appreciated. To a purist, a party discloses a document by stating that it exists or has existed. This is conventionally done by list. Often parties dispense with lists and inspection and simply provide copy documents to the other side.

The other party has a right to inspect documents disclosed in a list, except for documents such as privileged documents or those subject to public interest immunity.

2. So, what is a document?

Another obvious point: Documents can be pieces of paper, but they can be much more than that. If documents are electronically stored, they can include emails, text messages, word processed documents, databases and also voicemail and other audio files.

The documents may be stored on servers, computers, laptops, memory sticks, mobile phones, back-up systems, the cloud and elsewhere.

3. What’s the disclosure obligation in the ET?

Rule 10 of the ET rules says that the ET has the same power to order disclosure and inspection as the County Court. But the ET rules do not specify what orders for disclosure or inspection should be made. My experience is that the ET will often order disclosure and not specify what type of disclosure that should be. This can be problematic, since if disputes about disclosure occur, the order made by the ET can give no help about what should happen. The different types of disclosure are mentioned below.

4. Standard disclosure

This is one type of disclosure. Up until 1 April 2013 it was the default order under CPR r.31.5. Possibly when an ET makes an order for disclosure and no type of disclosure is specified, this is what it has in mind.

There seem to be many misconceptions about the test for standard disclosure. The test is not Peruvian Guano or what in Berezovsky v Abramovich [2010] EWHC 2010 (Comm), Gloster J called “enhanced disclosure” (Peruvian Guano was about disclosure of documents which it is reasonable to suppose may contain information which enables a party to advance its own case or to damage that of any other party, or which leads to a train of enquiry which has either of those consequences).

Often parties argue in correspondence about whether the documents are “relevant”. But under standard disclosure the test is not relevance – “relevant is irrelevant.”

Instead, the standard disclosure test is to be found in r.31.6. Under this rule, standard disclosure requires a party to disclose only documents on which he relies and documents which adversely affect his or another party’s case or support another party’s case.

This is spelled out by Jacob L.J., in his dissenting judgment in Nichia Corporation v. Argos Limited [2007] EWCA Civ 741, CA. At [45] he said “Prior to the CPR the test under the rules was that any document “relating to any matter in question” was discoverable. The courts took a very wide view of what was covered by this. The test was laid down a long time ago when no-one had the quantities of paper they have now”.

He then quoted from Peruvian Guano.

He commented at [46]: “It is manifest that this is a much wider test than that for “standard disclosure”. I have a feeling that the legal profession has been slow to appreciate this. What is now required is that, following only a “reasonable search” (CPR 31.7(1)), the disclosing party should, before making disclosure, see and consider each document to see whether it adversely affects his own or another party’s case or supports another party’s case. It is wrong just to disclose a mass of background documents which do not really take the case one way or another. And there is a real vice in doing so: it compels the mass reading by the lawyers on the other side, and is followed usually by the importation of the documents into the whole case thereafter – hence trial bundles most of which are never looked at.”

In Shah v HSBC Private Bank (UK) Limited [2011] EWCA Civ 1154, Lewison LJ said at [25], regarding Rule 31.6, “It is noticeable that the word “relevant” does not appear in the rules. Moreover the obligation to make standard disclosure is confined “only” to the listed categories of document. While it may be convenient to use “relevant” as a shorthand for documents that must be disclosed, in cases of dispute it is important to stick with the carefully chosen wording of the rules.”

What attempt must the disclosing party make to obtain documents for standard disclosure? The answer is a reasonable search – CPR r.31.7.

5. The new CPR test for disclosure

Even though the introduction of standard disclosure in April 1999 was intended to reduce the numbers of documents disclosed, that has not happened in practice.

So, as part of the Jackson LJ reforms, a new CPR r.31.5 rule on disclosure came into effect on 1 April 2013. This rule does away with the default approach of standard disclosure in multi-track cases, and instead requires the court to identify specifically what order to make: whether an order dispensing with disclosure, or disclosure issue by issue, disclosure of documents on which the disclosing party relies plus its request for specific disclosure, Peruvian Guano, standard disclosure, or any other order.

6. So what should you do?

The Agenda For Case Management PH at London Central is typical: “6. Documents and expert evidence. 6.1 Have lists of documents been exchanged? If not, date/s for exchange.”

When you go to the ET, please do not simply agree that (or argue about whether) there should be disclosure by date [X].

Instead, identify what you are seeking, bear in mind you may wish to prepare a witness statement in support, and file and serve it in advance eg

[standard disclosure relating to all the issues]

[specific disclosure of X, Y and Z]

[all documents relating to issues A, B and C which it is reasonable to suppose may contain information which enables the claimant/respondent to advance its own case or to damage that of the other side, or which leads to a train of enquiry which has either of those consequences in the case]


7. e-disclosure[i]

e-disclosure is the term sometimes given to disclosure of electronic documents, sometimes also known as electronically stored information (“ESI”).

The problems presented by e-disclosure will vary enormously. In one ET case in which I was instructed a few years ago, disclosure was electronic only, it ran to some 200 bundles of documents, and all bundles at the hearing were electronic. It is possible to fit more than 400 pages into one electronic bundle.

In other cases, the disclosure may involve only two lever arch files and a couple more on a memory stick.

But e-disclosure can present specific problems that are worth considering, since the volume of ESI, even in small organisations, can be immense, often, as in the case of email, because of the huge quantities of documents created (including wide scale duplication) and the fact that the documents can exist in many different forms and locations, so that they are not readily accessible except at significant cost.

A further issue might be that not all forms of ESI are searchable.

Often the parties do not know where to begin their searches. In the case of email, for example, the relevant servers are often not in their possession and sometimes not even in the jurisdiction.

The resulting two central problems with ESI disclosure are “recall” and “precision”.

“Recall” means the percentage of documentation required to be disclosed that is in fact disclosed. It is a measure of the completeness of disclosure of the necessary documentation.

“Precision” is a measure of the unnecessary (or irrelevant) documents that are disclosed at the same time as the necessary (or relevant) documents. Low precision involves disclosure of a large fraction of unwanted documentation with nothing to do with the issues.
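As a rough illustration of the two measures, the following Python sketch computes recall and precision for a hypothetical disclosure exercise. The document identifiers, the figures and the function name are assumptions made purely for the example; they are not drawn from any case or rule.

```python
def recall_and_precision(disclosed, disclosable):
    """Return (recall, precision) as fractions between 0 and 1.

    disclosed   -- set of document ids actually disclosed
    disclosable -- set of document ids that ought to have been disclosed
    """
    correctly_disclosed = disclosed & disclosable
    recall = len(correctly_disclosed) / len(disclosable) if disclosable else 1.0
    precision = len(correctly_disclosed) / len(disclosed) if disclosed else 1.0
    return recall, precision


# Hypothetical figures: 10 documents ought to be disclosed and 8 of them are,
# but they arrive among 32 documents that have nothing to do with the issues.
disclosable = {f"doc{i}" for i in range(10)}
disclosed = {f"doc{i}" for i in range(8)} | {f"junk{i}" for i in range(32)}

recall, precision = recall_and_precision(disclosed, disclosable)
print(f"recall = {recall:.0%}, precision = {precision:.0%}")
```

On these assumed figures recall is high (8 of the 10 necessary documents were disclosed) but precision is low (only 8 of the 40 documents disclosed were necessary), which is the "carpet bombing" pattern.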

8. Some cautionary tales

There may be very serious consequences in terms of costs, the outcome of the litigation, and professional reputation, if there are failures by solicitors and barristers regarding disclosure. A few of the more recent examples are summarised below.

1. Costs thrown away by inadequate search

Digicel (St. Lucia) v. Cable and Wireless [2008] EWHC 2522 (Ch), Morgan J.

In this complex telecommunications dispute, D and its solicitors had carried out a vast e-disclosure exercise. 1 million documents were provided to D’s solicitors, 625,000 documents were provided by D’s solicitors to an external litigation support provider. These were further reduced electronically to 197,000 documents. These 197,000 documents were subject to manual review. Ultimately, D disclosed to C 5,000 documents consisting of about 30,000 pages, or the equivalent of 83 lever arch files. The total cost of the e-disclosure exercise was over £2 million.

C argued that the search was inadequate, and the court granted C’s applications for D to re-search.

2. Costs wasted by receiving party in considering irrelevant disclosure (aka carpet bombing by disclosing party)

Vector Investments v. Williams [2009] EWHC 3601 (TCC) Ramsey J.

In this case the disclosing party disclosed irrelevant documentation, the documentation was not disclosed in a consistent chronological order, there was duplication of documentation provided since the disclosing party passed on to the receiving party bundles provided to it by others (which duplicated each other), and the rough categorisation of the documentation was unhelpful. Ramsey J. awarded the receiving party £20,000, pursuant to Rule 44.3(6)(b), in respect of its costs wasted in dealing with that disclosure.

West African Gas Pipeline Company Limited v. Willbros Global Holdings Inc. [2012] EWHC 396 (TCC) Ramsey J.

This case – which struck fear into the legal community – was a large international construction dispute, where D had guaranteed to C the performance of certain obligations regarding construction of a pipeline. The Judge found that C’s disclosure was unsatisfactory, since it had failed properly to de-duplicate documents, there were serious inadequacies in the provision of certain categories of documentation and in the failure to provide documentation from certain custodians, and the redactions produced by a manual review were inconsistent and unsatisfactory.

On application by D he considered that there were considerable costs wasted by D in dealing with that disclosure, which would be the subject of detailed assessment. He ordered C to pay to D £135,000 on account of those costs.

3. Losing the case

Earles v. Barclays Bank plc [2009] EWHC 2500 (Merc), Birmingham District Registry, HH Simon Brown QC

This was a fairly typical commercial dispute, in which C, a customer of D, claimed that D had carried out five unauthorised transfers to or from his accounts with D, which caused him considerable loss. The central issue was whether C had given authorisation. It would have been prudent for D to have retained and disclosed phone and email records, and D’s instruction sheets which D said would show that C had given authorisation for the transfers.

In the result the court declined to draw inferences against D arising out of the non-disclosure of relevant documentation, though it considered the case-law and considered whether it was right to draw such inferences.

Al-Sweady v. Secretary of State for Defence [2009] EWHC 2387 (Admin)

This was a judicial review application of the decision by D not to hold an inquiry into the circumstances concerning whether one Iraqi soldier died whilst on the battlefield or in British custody, and whether others were unlawfully detained and tortured by the British Army before being turned over to Iraqi custody.

D defended the claim on the factual basis that the deaths took place during hostilities. If the court found as a fact that they had occurred during hostilities, the claim would fail.

The disclosure by D was so deficient that the court concluded that it could not be satisfied that D was correct to say that the events were during hostilities, and accordingly it granted the application for review against the decision not to hold an inquiry.

Rybak v Langbar International [2010] EWHC 2015 (Ch), Morgan J

Marius Rybak had had an unless order made against him to deliver up various computers for imaging, including his Mac. Forensic examination showed that Mr Rybak had used file erasure software on the Mac to delete certain files. The judge found this was intentionally done, ordered Mr Rybak’s claim and defence to be struck out, and declined to order relief from sanctions under r.3.9.

4. Professional reputation


In Al-Sweady, at [42], the court drew attention to the obligations of solicitors: “We conclude that the Secretary of State’s agents had simply failed for no good reason during that lengthy period to carry out these critically important and obviously highly relevant searches and this failure in our view constitutes a serious breach of their duty to give proper disclosure. It must not be forgotten that Salmon J. explained in Woods v. Martins Bank [1959] 1 QB 55 at page 60 that “It cannot be too clearly understood that solicitors owe a duty to the court, as officers of the court, to go through the documents disclosed by their client to make sure, as far as possible, that no relevant documents have been omitted from their client’s [list]”. This duty requires a solicitor to take steps to ensure that their client knows what documents have to be disclosed.”

9. What do I do when litigation is contemplated?

It is a mandatory requirement of PD31B, para. 7 that, as soon as litigation is contemplated, the parties’ legal representatives notify their clients of the need to preserve disclosable documents (NB, I would add that, in this connection, there are many steps that wise solicitors advise their clients to take).

The documents to be preserved include ESI which would otherwise be deleted in accordance with the document retention policy or otherwise deleted in the ordinary course of business.

In his NLJ article, in a section entitled “Court Scrutiny: Retention and Destruction”, Master Whitaker wrote that “The courts must scrutinise the parties’ document retention policies. Many disclosure disputes are about the destruction of data, often because of differing national laws in relation to the protection of personal data .… The courts, in appropriate cases, must stress the need for transparency and reasonableness in the policies businesses adopt for archiving and destroying their ESI. If the policy is clear and defensible, the amount of data retained and easily accessible should be greatly reduced and well-organised. In this context, the Court should not, other than in exceptional circumstances, contemplate authorising the resurrection of and search for deleted material.”

The message from this article is clear: disclosing parties, particularly those that may regularly find themselves involved in litigation, should put in place information governance strategies, including policies relating to retention and destruction. This will place them at an advantage if documentation is unavailable at trial, and the opposing party would otherwise invite the court to draw adverse inferences, or seek costs orders.

10. Possible general document advice to the client?

It can make sense to advise on the client’s overall information governance strategy, in order to avoid unnecessary document retention and destruction, and to ensure that information is stored sensibly and accessibly.

Master Whitaker, who chaired the working party that produced PD31B, in an article in NLJ, Vol. 162, Issue 7507, 30/3/12 gave a steer about how the courts might approach litigants who fail to adopt such a strategy:

“What is needed is better management of ESI from creation to maintenance and disposal. This puts parties in a good position to search for and produce relevant documentation, and also controls volume and reduces storage. Businesses should create information governance strategies and the courts must be less sympathetic in making orders and imposing costs sanctions when they have taken no steps in this regard. The fact that this works is demonstrated by Earles v. Barclays Bank plc [2009] EWHC 2500, in which the Judge’s criticism of the successful bank’s preparedness to produce ESI, and the costs sanction imposed, led the bank to institute a programme of internal training in respect of e-disclosure.”

11. Practical Steps

It is important to scope out what needs to be done.

CPR Practice Direction 31B has a questionnaire attached to it with handy pointers about identifying which documents are needed, over what period, who has the documents, and how to obtain them, and TeCSA/SCL/TECBAR have produced a useful protocol and supporting guidelines. This paper draws on both.

1. Locations and Nature of Key Documents/Custodians

A. General:-

The problem with ESI is that parties often do not know how much ESI they have or where it is. They might have an idea as to which server it is on or which personal computers it is on, or which back-up tape it is on, but without a great deal more information it is very difficult for them to know how much documentation will be revealed by searches of the media on which their ESI is stored, and how much it is going to cost to search it, and what the end result is going to be.

In the days of paper documentation, at an early conference solicitors would sit down with the client and relevant witnesses and attempt to identify where the documentation was stored. It would then be reviewed for possible disclosure.

However, a frequent error made by those involved in e-disclosure exercises is to omit the conference with the witnesses regarding disclosure and go straight to a conference with IT. This will often overlook relevant sources of documents, and can cause endless trouble later.

B. Location:-

Are they stored on servers, tapes, hard drives of desktops, laptops, tablets, mobile devices, removable storage devices, the cloud? Are they in this jurisdiction or elsewhere? Do special rules apply about accessing the documents that are outside the jurisdiction?

C. Nature of Key Documents:-

What are the key documents? E-mails and word processed documents may be the most important for some cases. But were those documents created in now obsolete formats, or are there other types of documents, such as audio files, images, spreadsheets, CAD documents or drawings, Facebook, LinkedIn, Twitter, or instant messaging?

D. Custodian:-

A custodian is the person who has or had control of the relevant documentation. A critical question, especially in e-disclosure, relates to the identities of those custodians in respect of whom disclosure will be given.

A disclosing party (and sometimes a receiving party) will often wish to limit the custodians whose documents are disclosed.

A useful case to support this approach is Goodale v. MOJ [2009] EWHC B41 (QB), decided by Master Whitaker.

This case involved complaints by several convicted prisoners who were opiate-dependent. On admission to prison they were either being weaned off opiates in the community by methadone, or were dependent on illicit street drugs such as heroin. All were subject in prison to a one-size-fits-all detoxification regime which they claimed caused them unnecessary pain and suffering and, in one case, death.

C and D had agreed the disclosure of paper documents. However, D had stated it would not undertake any e-disclosure.

C suggested that in the first instance there be e-disclosure in relation to four custodians. The Master agreed with that approach.

At [22] he held: “I am quite content that the four key witnesses that have been named by the Claimants are the right people whose ESI needs to be searched. Numerous other witnesses and custodians of documents have been mentioned but in a case like this, I do not think that searching ESI of all of them immediately is the right way to go about this exercise. In terms of a search one should always start with the most important people at the top of the pyramid, that is, adopt a staged or incremental approach. Very often an opposing party will get everything they want from that without having to go down the pyramid any further, often into duplicate material. If necessary we can go on to consider other documents such as minutes of meetings, etc. that may be held centrally which might show what, if any, discussions took place as to what the policy and practice of the Defendant should be.”


2. Collection of documents


Electronic documents can be collected electronically. Care should be taken that electronic documents are collected in native format and that metadata are not altered in the collection process.

Hard copy documents may be capable of being scanned and converted into electronic documents (pdf, tiff), and made searchable via optical character recognition (OCR) or they may not.

If hard copy documents are electronically converted, they are treated as just as much electronic as native ESI, as explained by Birss J at first instance in Re Atrium Training Ltd, Smailes v McNally [2013] EWHC 2882 (Ch) [56] – [58].

Hard copy documents or documents collected in non-native format will need to be “coded” ie have descriptions applied such as date, author, addressee, document type, file name, or e-mail subject line.

3. Processing of Documents

A. Date Ranges:-

After collection, the date ranges to be applied to the documents must be identified. They may differ according to the custodian.
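By way of illustration only, custodian-specific date ranges could be applied along the following lines. The custodian labels, dates and document descriptions are hypothetical, and the sketch assumes that each collected document already carries a custodian and a date.

```python
from datetime import date

# Hypothetical date ranges per custodian (inclusive start and end dates).
DATE_RANGES = {
    "manager": (date(2012, 1, 1), date(2012, 12, 31)),
    "hr_adviser": (date(2012, 6, 1), date(2013, 3, 31)),
}


def within_range(custodian, document_date, ranges=DATE_RANGES):
    """True if the document date falls inside that custodian's agreed range."""
    if custodian not in ranges:
        return False  # custodian not within scope at this stage
    start, end = ranges[custodian]
    return start <= document_date <= end


# Each entry: (custodian, document date, description).
documents = [
    ("manager", date(2012, 3, 14), "email re appraisal"),
    ("manager", date(2013, 2, 1), "email re restructure"),
    ("hr_adviser", date(2013, 2, 1), "grievance notes"),
]

in_scope = [doc for doc in documents if within_range(doc[0], doc[1])]
for custodian, document_date, description in in_scope:
    print(custodian, document_date.isoformat(), description)
```

Note that the same date (1 February 2013) is in scope for one custodian and out of scope for the other, which is the practical consequence of ranges differing by custodian.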

B. Keyword searches of ESI:-

(i) Introduction

In the early days of e-disclosure, keyword searching was about as sophisticated as processing got. The dataset had keyword searches applied, and the resulting documents were then disclosed or not disclosed as the case might be.

Things have moved on considerably since those days. Keyword searching is now just a part of the e-disclosure process, as explained below.

(ii) What does keyword searching mean?

A keyword search of ESI is a search carried out by using certain keywords, to see which documents are produced. It may be a search using keywords to produce (1) potentially disclosable documents, (2) potentially non-disclosable documents and (3) privileged documents.

(iii) What’s the problem with keyword searching?

The real problems with use of keyword searches are vividly described by one of the leading US Judges involved in e-disclosure, Magistrate Judge Andrew Peck[ii].

Lawyers are used to doing keyword searches in clean databases, such as Westlaw and Lexis, which use full sentences, full words (not abbreviations), and largely the same word to describe the same concept.

Email collections and other ESI are not clean databases. People use different words to describe the same concept; even business emails are informal, rampant with mis-spellings, abbreviations and acronyms. This will be particularly true of ESI such as SMS.

Absent co-operation, the way most lawyers engage in keyword searches is at best guesswork. The receiving party guesses which keywords might produce evidence to support its case without having much, if any, knowledge of the disclosing party’s cards (i.e. the terminology used by the disclosing party’s custodians). Indeed, the disclosing party’s own lawyers often do not know what is in the cards.

Symbols (such as @ and –) can cause problems, as can searching for numbers.

(iv) Case-law

The practical difficulties can be seen in Digicel.

The case concerned interconnection of C and D’s fixed and mobile networks. D was under a statutory obligation to make those interconnections. C’s case was that there was an unlawful conspiracy by which D (consisting of some seven or eight companies) delayed the interconnection, causing it loss, and increasing D’s own profits.

D had carried out a keyword search with ten keywords in relation to the Caribbean territories: Digicel, interconnect, interconnection, licence, liberalise, liberalisation, strategy, competing, competitor and competition.

C applied to the court for D to carry out further keyword searches. C observed that some of the ten words shared a common stem reducing the ten words to six stem words.
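C's point about common stems can be checked with a short Python sketch. The suffix list below is a crude rule chosen only to fit these ten words; it is an assumption for illustration, not the stemming method used by any e-disclosure software or referred to in the judgment.

```python
# Crude suffix list chosen only to fit the ten Digicel keywords; longest first.
SUFFIXES = ["isation", "ition", "itor", "ing", "ise", "ion"]


def crude_stem(word):
    """Strip the first matching suffix. A toy rule, not a real stemmer."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix):
            return word[: -len(suffix)]
    return word


keywords = [
    "Digicel", "interconnect", "interconnection", "licence", "liberalise",
    "liberalisation", "strategy", "competing", "competitor", "competition",
]

# Group the keywords under their stems.
stems = {}
for keyword in keywords:
    stems.setdefault(crude_stem(keyword), []).append(keyword)

for stem, members in stems.items():
    print(stem, "<-", ", ".join(members))
print(len(stems), "distinct stems")
```

On this rule the ten keywords collapse to six stems, so a search that already applies stemming gains little from listing the variants separately.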

In the draft Order attached to its application C identified an additional 34 keywords. By the end of the hearing, C had refined that to a further 19 keywords. Amongst other words, C sought a search using the words: delay, obstruct, impede and stall.

C also claimed that other means of identifying it were probably used in emails, and it might have been called “digi” or “our Irish friends”, or references to C might have involved the use of the name of individual employees.

(v) Lessons learned

On the facts of that case it was unsurprising that the Judge ordered further keyword searches after D’s unilateral decision to search using its own keywords.

The difficulties for C in identifying keywords, for example even in relation to how it was described in emails, show the importance of the early co-operation between the parties in seeking to agree keywords.

Though from the judgment alone we cannot know the detailed background to the disclosure application by C, nevertheless might C have achieved better results by proposing a staged approach by which it proposed keyword searches of a sample of the documents, to be quality checked by human review for relevance? The quality checking might have revealed that further refinements of the search terms were necessary.

This approach is envisaged by PD31B, paras. 25-27.

It is also anticipated by Goodale, a case that predates PD31B, though decided by Master Whitaker. Master Whitaker ordered that a 31 term keyword search be carried out on the ESI for the four custodians, as well as the same searches on a centrally-held system.

He contemplated that there might be a further stage in the search, since he said at [27]: “At the moment we are just staring into open space as to what the volume of the documents produced by search is going to be. I suspect that in the long run this crude search will not throw up more than a few hundred thousand documents. If that is the case, then this is a prime candidate for the application of software that providers now have, which can de-duplicate that material and render it down to a more sensible size and search it by computer to produce a manageable corpus for human review – which is of course the most expensive part of the exercise. Indeed, when it comes to review I am aware of software that will effectively score each document as to its likely relevance and which will enable a prioritisation of categories within the entire document set.”

Might C also have achieved better results by building into the process a requirement for D’s key custodians (if still available and employed by D, or if otherwise willing to co-operate) to give information about suggested search terms (such as how the project and C were known or referred to)?

(vi) Other tools to use with keyword searching

Parties may each choose to apply a list of keywords and then further consider the results. Parties may choose to carry out a test on a sample of documents to see what particular keywords throw up, and then refine their lists.

Stemming – see Digicel.

Fuzzy word searching – ke%%ord1, ke%%ord2, ke%%ord3 – where % stands for any letter, may return nonsense. There needs to be agreement about how nonsense returns will be dealt with.

Concept searching/clustering – where there is frequent use of particular words in a document.

Searching for numerals within a certain number of characters of a word eg “1234 within 5 interconnection”.

Ultimately, keyword searching is usually best treated as an iterative process, which may be followed by computer review (see below) and/or human linear review.
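Two of the tools mentioned above, wildcard ("fuzzy") searching and proximity searching, can be sketched with Python's standard regular expression module. The helper names and the sample text are hypothetical, and real review platforms have their own search syntax; this is only a minimal illustration of the idea.

```python
import re


def wildcard_to_regex(pattern):
    """Turn a pattern such as 'ke%%ord' into a regex; % matches one letter."""
    parts = ["[a-z]" if ch == "%" else re.escape(ch) for ch in pattern]
    return re.compile(r"\b" + "".join(parts) + r"\b", re.IGNORECASE)


def within_chars(term_a, term_b, max_gap):
    """Regex matching both terms, in either order, at most max_gap characters apart."""
    a, b = re.escape(term_a), re.escape(term_b)
    gap = ".{0,%d}" % max_gap
    return re.compile(f"{a}{gap}{b}|{b}{gap}{a}", re.IGNORECASE | re.DOTALL)


# Hypothetical text containing a genuine hit and a nonsense hit.
text = "Pls chase the keyword list. Ref 1234 re: interconnection; kebaord is a typo."

fuzzy = wildcard_to_regex("ke%%ord")
print(fuzzy.findall(text))          # includes the nonsense return "kebaord"

near = within_chars("1234", "interconnection", 5)
print(near.search(text) is not None)
```

The wildcard search returns the nonsense word alongside the genuine one, which is why agreement is needed on how such returns are handled.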

C. De-duplication

(i) What does this mean?

De-duplication is the automated process by which duplicates are removed from the initial documents identified.

(ii) What’s the problem?

Great care must be taken with the de-duplication process. Sometimes it will be important for exact duplicates to be removed, sometimes it will be useful for near-duplicates to be removed, sometimes it will be essential for near-duplicates to be disclosed.

(iii) What level of de-duplication?

Thought is required here. Sometimes it will be appropriate to agree that all freestanding duplicates will be removed, and all duplicate families; sometimes that duplicates will be removed but the attaching or attached document will be disclosed, for context; sometimes that the duplicate will be removed and the attaching or attached document will be disclosed unless that document does not itself come within the scope of disclosure.
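The distinction between exact duplicates and near-duplicates can be illustrated as follows. This is a minimal sketch: the normalisation rule, the 0.9 similarity threshold and the sample documents are all assumptions, and it deals only with freestanding documents, not document families.

```python
import hashlib
from difflib import SequenceMatcher


def fingerprint(text):
    """SHA-256 of the text after collapsing whitespace and case (an assumed normalisation)."""
    normalised = " ".join(text.lower().split())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def deduplicate(documents, near_threshold=0.9):
    """Drop exact duplicates and report near-duplicates for a human decision.

    documents -- list of (doc_id, text) pairs
    Returns (kept_ids, near_duplicate_pairs).
    """
    seen = set()   # fingerprints of documents already kept
    kept = []      # (doc_id, text) pairs retained
    for doc_id, text in documents:
        digest = fingerprint(text)
        if digest in seen:
            continue  # exact duplicate of an earlier document
        seen.add(digest)
        kept.append((doc_id, text))

    # Near-duplicates are flagged, not removed: they may need to be disclosed.
    near_pairs = []
    for i, (id_a, text_a) in enumerate(kept):
        for id_b, text_b in kept[i + 1:]:
            if SequenceMatcher(None, text_a, text_b).ratio() >= near_threshold:
                near_pairs.append((id_a, id_b))
    return [doc_id for doc_id, _ in kept], near_pairs


documents = [
    ("A", "Draft contract, version 1. Price is 100 per unit."),
    ("B", "Draft contract,  version 1. Price is 100 per unit."),  # same as A once normalised
    ("C", "Draft contract, version 2. Price is 120 per unit."),   # near-duplicate of A
    ("D", "Lunch on Friday?"),
]
kept_ids, near_pairs = deduplicate(documents)
print(kept_ids, near_pairs)
```

The near-duplicate pair here differs only in the version number and the price, which is exactly the kind of difference that may matter to the issues and so should not be removed automatically.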

D. Email threading

How often have you read an email, only to find it reproduced in disclosure, or even a trial bundle, many times over? Threading software identifies the end of the e-mail thread. It removes duplicate emails, and produces the entire thread just once, plus any unique emails.
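The idea can be sketched in Python as follows. The rule used here (an email is redundant if its whole text is quoted inside another email) is an assumption made for illustration; commercial threading software is likely to rely on message headers and more robust comparison. The sample emails are hypothetical.

```python
def normalise(body):
    """Collapse whitespace and strip '>' quote markers so quoted text can be compared."""
    lines = [line.lstrip("> ").strip() for line in body.splitlines()]
    return " ".join(" ".join(lines).split())


def thread_ends(emails):
    """Return ids of emails whose text is not wholly contained in another email.

    emails -- dict mapping email id to body text
    Identical emails are left to the de-duplication stage and are not removed here.
    """
    cleaned = {email_id: normalise(body) for email_id, body in emails.items()}
    keep = []
    for email_id, text in cleaned.items():
        contained = any(
            other_id != email_id and text != other_text and text in other_text
            for other_id, other_text in cleaned.items()
        )
        if not contained:
            keep.append(email_id)
    return keep


emails = {
    "e1": "Can we meet on Monday?",
    "e2": "Yes, 10am works.\n> Can we meet on Monday?",
    "e3": "See you then.\n> Yes, 10am works.\n> > Can we meet on Monday?",
    "e4": "Monday is no good for me, sorry.\n> Can we meet on Monday?",
}
print(thread_ends(emails))
```

Here e3 carries the whole of the main thread, and e4 is a unique branch, so those two are produced and e1 and e2 are not.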

E. Predictive coding

This may be the next step in the processing after the data has been subject to keyword reduction.

(i) What does this mean?

Predictive coding (also known by other names, such as technology assisted review, or TAR, or computer assisted review, and others) is the process by which technology is used to identify relevant documents, and prioritise their relevance (N.B. “relevance” is used in the Shah sense, as above). A document is coded as relevant or not. (It may also be issue tagged.)

This is usually for larger cases only.

(ii) What’s the problem?

In larger document-heavy disputes there may be such a volume of documents that there is insufficient time for a manual human review of the disclosure.

(iii) How does it work?

An experienced lawyer, with detailed knowledge of the issues in the case, is presented by the software with a selection of documents which he codes as relevant or irrelevant.

The software produces a further selection of documents, which it has coded, and the human reviewer re-codes them.

Gradually, depending on the number of times the review process is carried out, the software produces and refines a set of rules for scoring the relevance of each document.

When the human reviewer is satisfied that the predictive coding carried out by the software is sufficiently accurate, the software is then used to code the documents harvested.

Ultimately, each document is given a percentage score, based on the prediction by the software of the document’s likely relevance. It is up to the parties to determine whether there will be disclosure of documents coded as having a certain percentage range of relevance e.g. over 20%, over 60%, etc.
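The process described above can be illustrated with a toy scorer in Python. This is a simple naive Bayes word-count model written for illustration only; it is not suggested that any commercial predictive coding product works this way, and the function names, sample documents and 60% cut-off are all assumptions.

```python
import math
from collections import Counter


def tokens(text):
    return text.lower().split()


def train(coded_documents):
    """coded_documents -- list of (text, is_relevant) pairs coded by the human reviewer."""
    word_counts = {True: Counter(), False: Counter()}
    doc_counts = Counter()
    for text, is_relevant in coded_documents:
        doc_counts[is_relevant] += 1
        word_counts[is_relevant].update(tokens(text))
    vocabulary = set(word_counts[True]) | set(word_counts[False])
    return word_counts, doc_counts, vocabulary


def relevance_score(text, model):
    """Estimated probability (0 to 1) that the document is relevant."""
    word_counts, doc_counts, vocabulary = model
    total_docs = sum(doc_counts.values())
    log_likelihood = {}
    for label in (True, False):
        # Add-one smoothing so that a word unseen for one label does not zero the estimate.
        total_words = sum(word_counts[label].values()) + len(vocabulary)
        log_p = math.log((doc_counts[label] + 1) / (total_docs + 2))
        for word in tokens(text):
            if word in vocabulary:
                log_p += math.log((word_counts[label][word] + 1) / total_words)
        log_likelihood[label] = log_p
    difference = log_likelihood[False] - log_likelihood[True]
    difference = max(-700.0, min(700.0, difference))  # avoid overflow in exp
    return 1 / (1 + math.exp(difference))


# Round 1: the reviewer codes a small seed set (hypothetical documents).
seed_set = [
    ("delay the interconnection as long as possible", True),
    ("stall their interconnection request", True),
    ("canteen menu for next week", False),
    ("car park closed next week", False),
]
model = train(seed_set)

# The software scores the uncoded documents; the parties choose the cut-off.
uncoded = {
    "d1": "we should stall the interconnection",
    "d2": "menu for the car park barbecue",
}
scores = {doc_id: relevance_score(text, model) for doc_id, text in uncoded.items()}
cutoff = 0.6
above_cutoff = [doc_id for doc_id, score in scores.items() if score >= cutoff]
print(scores, above_cutoff)
```

In a further round the reviewer would correct any documents the software had mis-coded, add them to the coded set and call train again, repeating until satisfied with the accuracy.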

(iv) How accurate is it?

There have been some illuminating studies carried out regarding the recall and precision of software compared to human review.

One article concerning such studies is “Document Categorization in Legal Electronic Discovery: Computer Classification vs Manual Review”, Journal of the American Society for Information Science and Technology, 61(1): 1-11, 2010, Roitblat, Kershaw and Oot.

The article refers to a study from a 1985 pre-predictive coding experiment that instructed experienced lawyers and paralegals to use keyword searches to try to obtain relevant documentation from a collection of 40,000 documents from a San Francisco Bay Area rapid transit accident. The average recall was 20%, though the reviewers themselves thought they had retrieved 75% or more of the responsive documents.

The 1985 experimenters concluded that it was “impossibly difficult” for the human reviewers to predict the exact words, word combinations and phrases used by those documents, and they suggested human review of documents.

This is no longer practicable with some of the large datasets now encountered (in 1985, the experimenters were concerned with only 40,000 documents).

Roitblat et al, over several years from 2006, used attorneys to review and re-review whether documents were responsive to a US Department of Justice merger investigation into the takeover of MCI by Verizon. They compared the attorneys’ recall and precision with those of two computer-based systems. They concluded “On every measure the performance of the two computer systems was at least as accurate (measured against the original review) as that of a human re-review. Redoing the same review with more traditional methods as was done during the re-review had no discernible benefit.”

Another is “Technology-Assisted Review in E-discovery Can Be More Effective And Efficient Than Exhaustive Manual Review”, Richmond Journal of Law and Technology, vol. XVII, Issue 3, Grossman and Cormack.

This article offers evidence that technology-assisted processes, while more efficient than human review, can also yield results superior to those of exhaustive manual review, as measured by recall and precision. The recall of manual reviews varied from 25% to about 80%. For the technology-assisted processes it was between 67% and 86.5% (table 7, p. 37).

The conclusions were that “the average efficiency and effectiveness of the five technology-assisted reviews surpasses that of the five manual reviews.”

The human reviewers whose results were analysed were assessors from the Legal Track of the National Institute of Standards and Technology’s Text Retrieval Conference (“TREC”). Since its inception in 2006 the TREC Legal Track has had the goal of developing “search technology that meets the needs of lawyers to engage in discovery in digital document collections.”

Reasons why human reviewers may produce lower recall and precision include fatigue and boredom, as well as systematic differences (subjectivity).

An example of the inconsistent results produced by human review can be seen in the West African Gas case.

D outsourced the human review of documentation to an external provider. That provider, despite initial briefings from Herbert Smith, and escalation procedures, produced inconsistent results.

Where documents were duplicated within the disclosure, in some instances the whole document was disclosed, in others only part of it was disclosed, and in others it was not disclosed at all.

It turned out, following a further human review, that 10% of the documents previously designated as not disclosable were in fact disclosable.

This review failure was one of three bases on which costs were ordered against D.

From reading the case report, it is not possible to understand the precise circumstances in which this came about. One possibility is that the due diligence procedures before the contract was made with the external provider might have been more rigorous.

Certainly, it again shows the importance of having in place robust agreements with providers which indemnify the client/solicitor for costs and other consequences following on from failures in the disclosure process.

F. Human Review

There may be a next or other step in the processing of the documents, which is human review. If it follows keyword searching and predictive coding, the parties will have agreed or the court will have ordered that human reviewers will determine which of the documents produced by those processes are disclosable/non-disclosable/privileged.

4. Disclosure/Exchange/Inspection

Often, at least for electronic documents, parties may simply disclose the documents themselves, and dispense with a list. The same may be true for hard copy documents.

As for exchange, there are different ways this may be done. It could be left to the electronic providers for the parties to agree, or the details agreed between the parties – see eg TeCSA Guidelines A6.12ff.

Something ought to be agreed about inadvertent disclosure of privileged documents, to the effect that the other party may not use them or their contents.

Keys to the warehouse:- A possible strategy for some clients making disclosure will be the “keys to the warehouse” strategy. Some receiving parties may be anxious to secure as much documentation as possible, and this would satisfy that desire. This approach has the advantage of shifting onto the receiving party the cost of trawling through the documentation in order to identify any documents on which it wishes to rely. Needless to say, it should not be done without the agreement of the receiving party, for the reasons explained above.

12. Applications

1. Timing

Any application is best made early, before it is too late to sort out the problems.

PD31B, para. 17 provides that if at any time it becomes apparent that the parties are unable to reach agreement in relation to the disclosure of ESI, the parties should seek directions from the court at the “earliest practical date”.

This is echoed in Vector, another pre-PD31B case, where Ramsey J. said at [91]: “Secondly, there are references in the correspondence to possible applications to the court in relation to the issue of the way in which the documents were disclosed. Again I consider that if major problems arise on inspection the parties should apply to the courts so that issues are raised and dealt with at the time. Whilst I quite understand the parties are reluctant to be diverted from such activities as inspection by having to make a court application, issues can often be brought to a head and resolved by the court instead of dealing with them in lengthy correspondence between the parties.”


2. Where irrelevant documents disclosed

It can be wise, depending on the extent of the inconvenience and the wasted cost, and where the disclosure has been carried out unilaterally, for the receiving party to apply to the ET for a further refinement of the search and for costs.

3. No proper disclosure list

Sometimes a disclosing party may disclose vast amounts of unstructured ESI without any proper list.

This places the receiving party in real difficulty. Since it does not know what the ESI looked like in the first place, it can be difficult or impossible for it to make an intelligent search of the material, even if all relevant material has been disclosed.

Imagine that it does not know that a particular body was set up and met weekly to discuss a particular issue (say the issue of sex discrimination in the workplace). It cannot then know to search for that body’s minutes.

Even if it knows that such a body met weekly, it may be difficult or impossible for it to find the minutes if they are not grouped together in the disclosure in a structured way and the list is uninformative as to their existence. That may be so if, say, the minutes have been saved using different naming conventions, or if they have been saved using the same convention throughout the group’s history but the name of the group is occasionally mis-spelled.

In such circumstances, the best approach is for an application to be made for a proper list, and structured disclosure.

4. Other applications

These may be made where there has been inadequate search of media, inadequate search in relation to custodians, and for a variety of other reasons.

In all these applications it is wise to consider the staged approach contemplated by Goodale and PD31B, as well as the possibility of data sampling and further meetings between the parties to resolve issues.
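
By way of illustration of what data sampling might involve, the Python sketch below draws a random sample from the documents a party has not disclosed and, once human reviewers have coded the sample, estimates what proportion of the undisclosed material is in fact disclosable. It is a sketch only: the collection size, the sample size and the number of disclosable documents found are all hypothetical, and the margin of error uses a simple normal approximation rather than any method prescribed by the rules.

```python
import math
import random


def draw_sample(doc_ids, sample_size, seed=0):
    """Pick documents at random for human review (fixed seed so it is repeatable)."""
    rng = random.Random(seed)
    return rng.sample(doc_ids, sample_size)


def estimate_missed_rate(sample_size, disclosable_in_sample):
    """Estimate the share of disclosable documents in the undisclosed set.

    Returns the point estimate and an approximate 95% margin of error
    (normal approximation, reasonable only for fairly large samples).
    """
    p = disclosable_in_sample / sample_size
    margin = 1.96 * math.sqrt(p * (1 - p) / sample_size)
    return p, margin


# Hypothetical: ids of 50,000 documents that were not disclosed.
undisclosed = list(range(50_000))
sample = draw_sample(undisclosed, 400)  # documents sent for human review

# Suppose the reviewers find 40 of the 400 sampled documents to be disclosable.
rate, margin = estimate_missed_rate(len(sample), 40)
print(f"estimated missed rate: {rate:.1%} +/- {margin:.1%}")
```

On these assumed figures the estimate is about 10%, give or take about 3%, which would give the parties and the ET a reasoned basis for deciding whether a further search is proportionate.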

13. A litigator’s guide to buying e-disclosure services

(The contents of this section of the paper are extracts, taken with the kind permission of Mike Taylor, i Lit Limited, from a talk that he gave at the TeCSA/TECBAR Conference on 30/6/11.)

Where a firm is considering selecting external service providers (and not all will need to do so, see below) they would be well advised to pay particular attention to the external service provider’s capabilities, its size and experience, the charging methodology, and its working assumptions.

A provider should be asked whether it uses an off-the-shelf processing engine, or a proprietary application, and if the latter, how it has been tested and benchmarked.

In order to assess the level of sophistication of the provider, it is worth asking for its daily document processing capacity.

It should be asked whether it sub-contracts, and if so to whom and how it quality checks sub-contractors.

The solicitor will need to know what document types the provider is unable to process, and whether it can search and host audio files (increasingly important for clients that record telephone calls).

The solicitor will also want to know whether the provider can scan, code and OCR paper documents and add them to the ESI, its capabilities for a hosted review tool, and its capacities for printing and paginating large quantities of documents if required.

As for charging methodology, there are two broad approaches. One is “in the top” pricing, that is charging a price per gigabyte of data provided by the client. The other is “from the bottom” pricing, that is charging per gigabyte of data that is passed to the client for review after filtering and processing.

If “in the top” pricing is used, the scoping phase of the process becomes even more important, as a party should only give the absolute minimum amount of data to its provider for processing.

If “from the bottom” pricing is used, particular attention must be paid to development of the data filters to ensure that as few irrelevant documents as possible make it through to the review stage.

On top of these processing charges there are always many peripheral costs that soon add up, potentially including costs for data collection, data preparation, data processing, data manipulation, data production and data archiving.

The provider’s working assumptions are often overlooked by legal teams, on the basis that all providers provide essentially the same product. This can be a mistake, since in order to provide a quote the external provider has to make certain assumptions about the data and the filters that will be applied to it. These include the amount of data collected, the explosion rates (the size of a file once it has been de-compressed – for example, Microsoft compresses emails by a factor of ten) and the filtration rates (when the “from the bottom” charging methodology is used).
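
The effect of these working assumptions on a quote can be seen in a short worked example in Python. Every figure below is hypothetical (the volume of data, the prices per gigabyte, the explosion rate and the filtration rates), and the sketch assumes that “in the top” pricing is charged on the data as collected, before de-compression; a particular provider may charge differently.

```python
def in_the_top_cost(collected_gb, price_per_gb):
    """Charge on every gigabyte handed to the provider."""
    return collected_gb * price_per_gb


def from_the_bottom_cost(collected_gb, explosion_rate, filtration_rate, price_per_gb):
    """Charge only on the gigabytes that survive filtering.

    explosion_rate: factor by which the data grows once de-compressed.
    filtration_rate: fraction of the expanded data removed by the filters.
    """
    expanded_gb = collected_gb * explosion_rate
    passed_gb = expanded_gb * (1 - filtration_rate)
    return passed_gb * price_per_gb


collected_gb = 100  # hypothetical volume given to the provider

top = in_the_top_cost(collected_gb, price_per_gb=50)
# Same data, same explosion rate, two different assumed filtration rates.
bottom_80 = from_the_bottom_cost(collected_gb, 3, 0.80, price_per_gb=120)
bottom_90 = from_the_bottom_cost(collected_gb, 3, 0.90, price_per_gb=120)

print(f"in the top:                 £{top:,.0f}")
print(f"from the bottom (80% cut):  £{bottom_80:,.0f}")
print(f"from the bottom (90% cut):  £{bottom_90:,.0f}")
```

On these invented figures, moving the assumed filtration rate from 80% to 90% halves the “from the bottom” charge, which is why a quote should be read together with the assumptions on which it was given.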




[i] This section of the paper is the summary of the problem as described in Goodale v. MOJ [2009] EWHC B41 (QB)

[ii] The paragraphs in this section are taken from an important article “Search, Forward” in the October 2011 issue of Law Technology News.