Speech by the Deputy Head of Civil Justice: Future Visions of Justice

King’s College London Law School

Speech by The Rt. Hon Lord Justice Birss Deputy Head of Civil Justice

Future Visions of Justice

Monday 18 March 2024

Introduction

  1. Good evening everyone. Can I say what a pleasure it is to be here and can I thank Cari Hyde-Vaamonde for organising this event and the Dean, Dan Hunter, for inviting me to come here and talk to you.

  2. I am going to talk about three people: William Gibson, Annie Jump Cannon and Hedy Lamarr. You may not know who they all are but don’t worry, I will tell you. Even if you know who they are, you may not know why they are relevant to the future of justice, but don’t worry, I will try and explain that to you as well. They each represent a different facet of the question of where the future of justice lies and how we should get there.

  3. Let me start with William Gibson. He is a North American novelist who pioneered a sub-genre of science fiction called cyberpunk. He writes vividly about the interactions between human beings and computer networks. These works have quite a dystopian feel. William Gibson famously coined the term cyberspace. His relevance this evening however is for something he said which is a pithy way of making my first point, about the future of the justice system. He said:

    The future is already here – it’s just not evenly distributed.

  4. And I think that applies in spades to the future of justice. The Master of the Rolls and I speak about the Digital Justice System. I will say more about it a bit later. But one thing to appreciate is that the Digital Justice System already exists to a significant extent. We are not suggesting that a massive IT project is needed to build an entirely new IT system. Much of what is going to happen – and should happen – is already present. It is not necessarily in plain sight, but it is there and we can learn from it. Part of what we need to do is spread it more widely. Let me give you three examples:

(1) Algorithms

  1. The idea that justice might be dispensed by algorithm appears, when it is put as starkly as that, to be a concerning one. On the contrary justice needs to be nuanced, it depends on the detail and it often must depend on the humanity of the situation. Therefore, it might be said, justice should not be boiled down, by a reductive process, into an algorithm or a series of algorithms.

  2. Now I agree with those sentiments but there is more to say on this. In fact our system of justice already uses algorithms and moreover it already has coded up at least one of those algorithms into an automatic computer system.

  3. The example is a little dry, and not well known, but it is part of the Online Civil Money Claims (OCMC) system in England and Wales governed by PD51R. If a litigant brings a debt claim against a defendant, the defendant has the opportunity to admit the claim but ask for time to pay the debt. A defendant who wants time to pay needs to fill in a form to explain their means in order to show how much money they could pay per week. In the online system this form is filled in on screen. In the paper world, the way this process used to work is that a member of court staff would read the paper form in which the defendant had given the information about their financial resources. The staff member would apply a formula written on a sheet of paper and then draw up a court order directing the amount of money per week which the defendant would have to pay. The formula was designed to make sure that the amount the defendant paid was affordable. The formula was not really secret but nor was it published in any clear way.

  4. In fact it turned out that two different formulae were being used in different parts of the court system by accident. The differences were small but they did exist. In designing this bit of the online system the relevant members of the rule committee decided which formula to use as being the appropriate one, and made provision in the rules that the formula would be applied by the computer automatically. This is not all that different from the approach which happened in the paper world, because the court staff person had no discretion, they were simply required to apply the formula. In both the paper system and the online system the same safeguard was built into the rules. The safeguard was and remains that anyone dissatisfied with the order made as a result of applying the formula is entitled to apply to a judge. The judge has free discretion to reconsider the matter.

  5. When this rule was put into PD51R the formula itself was annexed to the rule. So perhaps ironically by digitising the application of the formula, by coding up the algorithm, the legal system became more transparent. The formula itself was published in the rule.

  6. Now this is about a simple mathematical algorithm and in that sense is much less complex than the algorithms used in Large Language Models and machine learning. Nevertheless it is unquestionably an algorithm applied by a machine. As far as I know this has caused no difficulty of any sort and attracted very little comment.
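To make concrete what coding up a formula of this kind involves, here is a minimal sketch in Python. The formula annexed to PD51R is not reproduced in this talk, so the arithmetic below (a fixed share of weekly disposable income) is purely an illustrative assumption, as are all the names. What the sketch tries to show is the structure described above: a published rule applied without discretion, and a safeguard under which a judge can reconsider the result.

```python
from dataclasses import dataclass

# ASSUMPTION: an invented share, for illustration only.
# It is NOT the formula annexed to PD51R.
HYPOTHETICAL_SHARE_OF_DISPOSABLE_INCOME = 0.5


@dataclass
class MeansStatement:
    """Figures the defendant enters on screen (hypothetical fields)."""
    weekly_income: float
    weekly_essential_outgoings: float


@dataclass
class InstalmentOrder:
    weekly_payment: float
    made_by: str  # "formula" or "judge"


def apply_formula(means: MeansStatement) -> InstalmentOrder:
    """Apply the published rule mechanically, with no discretion."""
    disposable = max(0.0, means.weekly_income - means.weekly_essential_outgoings)
    payment = round(disposable * HYPOTHETICAL_SHARE_OF_DISPOSABLE_INCOME, 2)
    return InstalmentOrder(weekly_payment=payment, made_by="formula")


def judge_redetermination(order: InstalmentOrder, judge_decision: float) -> InstalmentOrder:
    """The safeguard: a dissatisfied party applies to a judge,
    who is free to substitute a different figure."""
    return InstalmentOrder(weekly_payment=judge_decision, made_by="judge")


order = apply_formula(MeansStatement(weekly_income=300.0, weekly_essential_outgoings=220.0))
revised = judge_redetermination(order, judge_decision=25.0)
```

Because the rule is a few lines of arithmetic, publishing it alongside the code costs nothing, which is the transparency point made above.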

(2) Purely digital service of documents

  1. Another way in which we hope digital technology will improve the justice system of the future is in the service of documents. In civil justice we have abolished paper service for certain classes of case. In the past I’ve heard people tell me that would have been an extraordinary step. However when we took it, although there were some initial minor teething troubles of a technical nature, the process worked extremely well. It is now routine in the area in which it applies. It applies to claims for damages in the county court in which both parties are legally represented at the time the case begins. This is governed by PD51S. In the HMCTS Damages system, when a legally represented claimant begins a claim, they are required by rule to use the online system, accessed via a portal called MyHMCTS. The defendant’s legal representative is also required to have an account on MyHMCTS and receives the notification of issue of the claim via the online portal. There is no paper claim form and no service of any paper documents at all. The Particulars of Claim are made available electronically and assuming the claim is defended, so too is the Defence. Paper service has been abolished and no one seems to have noticed.

  2. Another feature of this digital system which is worth mentioning is the way in which the digital process has abolished a small cottage industry which used to exist in the paper world. In a money claim or a damages claim, after the Particulars of Claim and Defence have been exchanged the next step in the process is for both parties to fill in a Directions Questionnaire or DQ. Parties routinely failed to fill in and file their DQs on time. The cottage industry consisted of members of court staff both in the central offices of the civil justice system in Salford and Northampton and also in local courts chasing the parties for their DQs. One of the good aspects of the design of the new digital system in the OCMC and Damages systems is that the DQs are simply part of the online dialogue. The party interacting with the digital system cannot finish their interaction without filling in the DQ. So, for example, the defendant won’t be able to file their defence without also filling in their DQ. At a stroke this barrier to the smooth operation of the process of justice has been removed by good use of technology.

(3) Work allocation and the digital file

  1. My third and final example of a piece of the future which is already here is something called the work allocation tool. In the paper world when a judge in the county court needs to look at a court case to do something such as deal with an application of some kind, there is a trolley loaded up with piles of paper. One of the members of the court staff will wheel that trolley through the court building distributing paper files to the judges. In some other county courts, the judges go to the file store, where the paper files are kept in order of age, so that they can find the oldest unresolved paper applications and take them to their room to deal with them.

  2. For the cases in the reformed systems which HMCTS have already built there is a digital file. In addition to a digital file we also have a work allocation tool. The work allocation tool applies a series of simple algorithms under judicial control which allocate the paperwork to the right kind of judge. I say paperwork but of course no paper is involved. It’s just that this is the sort of work which used to be called paperwork and which does not involve a court hearing. Another phrase for it is electronic box work. It is a common component across the civil, family and tribunals IT systems, so that judges will find their electronic box work for all three in the same place.

  3. These two features – the digital file and the work allocation tool – also mean that we can take advantage of scale. Instead of being limited to a single physical court with its piles of paper, judges in a cluster of courts will be able to see the electronic box work which is outstanding and applicable to their group of courts, and deal with it. It is only feasible to operate in this way because the files are digital and the work is allocated in that way.
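The kind of simple allocation rule described above can be sketched in a few lines of Python. This is not the HMCTS tool: the routing table, the work types, the court names and the oldest-first ordering are all assumptions made for illustration. The point is only that once the file is digital, allocating box work across a cluster of courts becomes a filter and a sort.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class BoxWorkItem:
    case_ref: str
    court: str
    work_type: str
    received: date


# ASSUMPTION: hypothetical routing rules, set under judicial control.
ROUTING_RULES = {
    "application_to_extend_time": "district_judge",
    "permission_to_appeal": "circuit_judge",
}


def allocate(items, judge_type, cluster):
    """Return the outstanding items a judge of this type may deal with,
    drawn from every court in the cluster, oldest first."""
    eligible = [
        item for item in items
        if item.court in cluster and ROUTING_RULES.get(item.work_type) == judge_type
    ]
    return sorted(eligible, key=lambda item: item.received)


# Hypothetical outstanding work in three courts.
items = [
    BoxWorkItem("A1", "Leeds", "application_to_extend_time", date(2024, 3, 1)),
    BoxWorkItem("B2", "Bradford", "application_to_extend_time", date(2024, 2, 1)),
    BoxWorkItem("C3", "Truro", "application_to_extend_time", date(2024, 1, 1)),
    BoxWorkItem("D4", "Leeds", "permission_to_appeal", date(2024, 1, 15)),
]
queue = allocate(items, "district_judge", {"Leeds", "Bradford"})
```

Here a district judge in the Leeds and Bradford cluster sees the Bradford application first because it is older, which is the electronic equivalent of going to the file store and picking up the oldest file.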

Pulling these things together

  1. These are just examples of three particular aspects of what the future Digital Justice System will contain. We are learning from the benefits of digital working, and we will be able to use these ideas and many others to expand all of this. The problem really is that the future is not evenly distributed but we are trying our best to deal with that.

Annie Jump Cannon

  1. Do you know who Annie Jump Cannon was? Or Williamina Fleming?

  2. They were two of a group of exceptional women working at Harvard at the turn of the 20th century. They made major contributions to astrophysics. One of their most notable contributions was the system for stellar classification based on spectral absorption lines from gases in the atmosphere of the star. This system is still in use today. So what, you might ask? Well, that group were known as the “Harvard Computers”. They were also known as “calculators”. What this goes to show is that the word computer and, for that matter, the word calculator both used to refer to human beings. Of course the reason was that in those days that kind of calculation or computation – which today we associate entirely with machines – was carried out by individuals. Tasks which in 1900 were the exclusive domain of human thought and action are today the domain of digital systems. The language has moved on. And so the term computer today means a machine which does these tasks and not a person.

  3. The question you might ask is if that can happen to the word computer, will the same thing happen to the word lawyer or the word judge?

  4. This brings me to artificial intelligence. And before I go any further can I tell you what I think the answer is. My own view is that AI used properly has the potential to enhance the work of lawyers and judges enormously. I think it will democratise legal help for unrepresented people. I think it can and should be a force for good. And I think it will be as long as it is done properly and appropriately.

  5. Jumping ahead a little, history suggests that to argue that we do not need to worry, because machines will never be capable of doing this or that kind of information processing task, is usually a mistake. The argument needs to be that for us to have a system of justice, people must be at the heart of it. The fact that something can be done does not always mean it should be done. When one thinks about the rule of law and access to justice, a critical aspect is public trust in the legal system itself. I can’t imagine a legal system which does not have people at its heart as key representatives and decision makers.

  6. The launch of ChatGPT about 18 months ago transformed the profile of artificial intelligence generally. Machine learning AI systems were already in use in places in the justice system but it’s plain that these Large Language Models and generative AI systems, which are the basis for ChatGPT, have attracted a great deal of attention in the law. The potential impact of these systems is significant. Now this talk this evening is not a primer on technology. I am acutely aware that many of you here will know much more about the detail of machine learning than I do. However, it might be worth stating in simple terms what I believe the important things about the latest machine learning technology are, from the perspective of the law. I suggest perhaps that those in the law really need to know three things:

    a. There are now IT systems which can in effect understand English both written and spoken at a conceptual level and in context. I say “in effect” because of course these machine learning neural networks do not really understand at all. But they do create a facsimile of understanding and for many practical purposes they might as well be regarded as the equivalent of something which understands. These systems can generate new material as an output hence the expression “generative” AI.

    b. While private information cannot be entered into public ChatGPT and its equivalent chat bots, AI tools are available into which private information can safely be entered.

    c. Although these systems can hallucinate, there are many applications in which this does not happen to a significant extent or does not matter.

  7. The machines can interact with people in a natural way and they can absorb and conceptualise the contents of large collections of documents and spoken English. They can summarise material. Law firms and well-funded corporations are using this kind of AI already and are being offered systems of more and more capability. Let me give you three examples.

(1) Democratising legal advice and assistance

  1. One of the ways in which Large Language Models have the potential to transform access to justice is in their ability to give legal advice and assistance either for free, or at very little cost in circumstances which represent a significant improvement on the current state of affairs. Let me give you a striking example from the work done by Professor Margaret Hagan at Stanford. I heard her explain this work at the Civil Justice Council’s national forum in autumn 2023. Professor Hagan runs an amazing project at Stanford called the Legal Design Lab. The test that they did was to take a small group of ordinary people and give them a legal problem of the kind which ordinary people might encounter, and for which the chance of being able to obtain professional legal advice is vanishingly small.

  2. At Stanford they allowed the group of people to try to get advice to help them with their legal problem using the computer. Naturally the people googled the problem. What Professor Hagan saw was that they got accurate advice around 40% of the time using that approach. The problem with Googling of course is that the search engine provides a whole series of hits in response to the prompt and the individual who has no knowledge of the area has to choose which sites to follow up. I’m not saying that’s the only problem, but it is part of it. The test then tried the same approach with the same problem but required the people to use ChatGPT. The result was that accurate advice was provided something like 60 or 65% of the time. Now of course getting advice which is only accurate 60 or 65% of the time is not ideal and it’s certainly not what you would expect from a magic circle law firm. However, it is striking that it is significantly better than what would happen if a normal person simply used a search engine.

  3. I think this is an illustration of the fact that web searches are a poor tool to find the answer to a problem you do not know the answer to yourself, but a good tool to be reminded of an answer which you would recognise as correct when you saw it, but had forgotten. The problem for unrepresented litigants is that they don’t know what the right answer is and so they need a different kind of help, such as may be provided by a tool of this kind.

  4. I don’t want to over-interpret these Stanford results, but it does seem to me that they indicate that Large Language Models are worth investigating as they may have a role to play in the provision of advice and assistance for those with limited means. Simply improving the quality of what is available for free, even if it does not make it perfect, is a significant step forward.

  5. By the way, we should not be ashamed of the idea that many legal problems which real people encounter are such that it is not realistic to expect individual professionals to ever be able to give individualised legal advice. There was a time when the state provided state funded legal aid for civil claims but that provision has now been reduced significantly and only certain specific classes of civil claim have legal aid. Part of the point that I’m seeking to make is to show that technology may allow us to greatly improve the efficiency of legal advice and assistance provision. One could imagine an early legal advice and assistance service designed to be the first port of call for individuals with limited means, with a front end using the natural language abilities of Large Language Models.

  6. The interesting thing about the Large Language Model in this sort of example is the way in which the interaction itself is more helpful to the person. It feels like you are talking to someone who can help you. Instead of leaving you to fend for yourself, they give you an answer.

  7. Of course, simply unleashing a chat bot like ChatGPT runs the real risk of hallucination but here is where another aspect of this technology is very interesting. I gather that it is possible to use the natural language ability of these Large Language Models but in effect, point them at a closed list database of answers. The user is able to interact with the system in a natural way but the answers which the system will provide to them are constrained by reference to the content of a database which has already been ratified. We have tried out a very simple experimental version of this in the court service, as a way of giving help to court staff on the use of a particular IT system. The test we did in house was successful. Indeed it also identified inconsistencies in what we had thought was a consistent pre-existing set of help information.
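The closed-list idea in the last paragraph can be illustrated with a small Python sketch. It is not the system tried out in the court service, and it does not use a Large Language Model at all: simple word overlap stands in for the model's language ability, and the questions, answers and threshold are invented for illustration. What it shows is the constraint itself: whatever the user types, the only text that can come back is an answer which has already been ratified, or a referral.

```python
import re

# ASSUMPTION: a hypothetical ratified help database.
RATIFIED_ANSWERS = {
    "How do I reset my password?":
        "Use the 'Forgotten password' link on the sign-in page.",
    "How do I upload a document to a case?":
        "Open the case, choose 'Documents', then 'Upload'.",
}
FALLBACK = "No approved answer found; please contact the help desk."


def _words(text):
    """Lower-case word set; a crude stand-in for real language understanding."""
    return set(re.findall(r"[a-z]+", text.lower()))


def answer(query, min_overlap=0.5):
    """Return the ratified answer whose question best matches the query.
    Nothing is generated: the reply is always drawn from the closed list."""
    query_words = _words(query)
    best_score, best_answer = 0.0, FALLBACK
    for question, approved in RATIFIED_ANSWERS.items():
        question_words = _words(question)
        union = query_words | question_words
        score = len(query_words & question_words) / len(union) if union else 0.0
        if score > best_score:
            best_score, best_answer = score, approved
    return best_answer if best_score >= min_overlap else FALLBACK
```

In a real system the matching step would be far more capable, but the design choice is the same: the model chooses among approved answers rather than composing its own, so a hallucinated answer cannot reach the user.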

(2) Case summaries

  1. The second example of the use of AI in the justice system which I believe will be possible is to exploit its ability to summarise.

  2. In many parts of the justice system in England and Wales, certain kinds of cases come to a judge with a cover sheet containing a summary of what the contents are. In the administrative court, cases come to judges with a summary prepared by the lawyers working in the admin court and from my brief experience in that court, the lawyers there are excellent and the summaries that I saw were invaluable. In the Court of Appeal when we receive a new case it comes with a cover sheet in which there is a summary of what the case is about. These summaries are provided because they are extremely useful and particularly help a judge get into a case quickly. Perfect accuracy is not required and indeed does not happen. And I should say to be clear I don’t mean that as a criticism of the people providing these summaries, who are extremely good and work extremely hard. However we all make mistakes and, most importantly, that fact does not undermine the utility of these summaries.

  3. Usually, the most useful summaries are quite brief and are examples of what I call lossy compression. As you may know, in information theory there is a limit to how far you can compress a data stream. Below that limit it is impossible to compress the data any further without losing some of the information. So you can compress a data stream using lossless compression where all the data is preserved but that has a limit. Or you can compress the data stream further where you tolerate the loss of some of the information content. That is what is called lossy compression. Digital voice traffic is an example of lossy compression. Much of the higher frequency sound in speech is simply removed when that speech is digitised to be sent over a telecommunications network. That is why in the 1990s, when the GSM phone system began, there were stories that dogs did not recognise their master’s or mistress’s voice when that voice was played to the dog over a mobile telephone.

  4. Very often, the best precis of legal material involves lossy compression. Pedants often find it difficult to summarise material, because they cannot bear to lose all the detail. It is quite a skill to be able to precis in a manner which is accurate enough as required by the circumstances. However it is now quite clear that artificial intelligence systems like Large Language Models are able to summarise text in this way.

  5. What I would like is to have a system whereby every judge, when they receive a new case, is provided with a cover sheet which includes a brief summary of what it is about. Perfect accuracy is not required because the point of the summary is to accelerate the judge’s preparation of the case. The judge will then go into court better prepared. They will listen to the parties and, just as in the current world in which judges have been given a summary prepared by an assistant, by the time the judge has listened to the parties the summary is no longer relevant. Any inaccuracies it might have contained do not matter. It has still served its purpose at the beginning. Now I don’t know for sure how good a Large Language Model would be at doing this kind of thing, but from what I have seen I think it might be possible.

  6. The important thing to emphasise is that the Large Language Model here is not deciding the case, nor is it acting in a manner which is different from ways in which we already work. One could also imagine that the case summary could be provided to the parties as part of the hearing. One could imagine a case operating over a number of days in which the transcription service which AI can operate could be used to provide summaries of the day’s evidence to the parties and the judge as the case progressed. The process would be entirely transparent but it could well save a significant amount of work done by parties and judges during proceedings of that kind.
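The distinction between lossless and lossy compression drawn earlier in this section can be shown in a few lines of Python. The first half uses the standard zlib module, which is lossless. The second half uses coarse quantisation of some invented sample values as a simple lossy scheme; that is an illustrative choice and not how the GSM voice codec works.

```python
import zlib

# Lossless: the original comes back exactly, byte for byte.
text = ("the quick brown fox " * 50).encode("utf-8")
packed = zlib.compress(text)
restored = zlib.decompress(packed)

# Lossy: ASSUMPTION - hypothetical amplitude samples, quantised to
# multiples of STEP. Fine detail is thrown away and cannot be recovered.
samples = [0, 3, 7, 12, 18, 25, 33, 42]
STEP = 8


def quantise(values, step=STEP):
    """Keep only which multiple of `step` each value falls in."""
    return [value // step for value in values]


def reconstruct(codes, step=STEP):
    """Best available reconstruction from the quantised codes."""
    return [code * step for code in codes]


codes = quantise(samples)
approximation = reconstruct(codes)
worst_error = max(abs(a - b) for a, b in zip(samples, approximation))
```

The approximation keeps the overall shape of the samples while losing the detail, which is the trade a good case summary makes: accurate enough for its purpose, and much shorter.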

(3) Technology Assisted Review

  1. The final example I wish to mention is Technology Assisted Review or TAR. In fact, machine learning has been used by law firms for some years in the process of discovery or disclosure of documents. TAR involves using experienced lawyers to train the model to identify relevant documents for a given case and then using the trained model to classify the entire collection of documents (which may run to a Terabyte’s worth of data) and produce the relevant ones. I have heard some suggest using machines for this process is somehow less efficient than using people. However, it must be remembered that the normal person who will be doing the disclosure process in a large case is probably a paralegal and one might wonder whether a paralegal is a better tool for finding relevant documents than a trained machine learning system. One of the things which this kind of document system can do is retrieval from the database of documents based on not simply word searching but more conceptual or context type searching.

  2. Now one of the real problems with machine learning of course is bias but bias is a problem for people too. A well known source of bias, which judges are trained to think about and avoid as best they can, is confirmation bias. Confirmation bias is at play if someone is looking into a collection of material to find something which supports their hypothesis. It might be that an AI given the task of searching for material relevant to an issue would not exhibit confirmation bias in the same way. My point is not that these systems are perfect because they are not, but they may exhibit qualities which in some ways make them better than people in some circumstances. We certainly ought to take a look.

Pulling these things together

  1. These AI possibilities are just examples. So, whereas at Harvard a century ago a computer was a person, today even though a computer is a machine, another word for people like Annie Jump Cannon and Williamina Fleming is scientist. The progress of science is a human endeavour, the difference is that the machines have transformed both the capability and productivity of those people. So just as the term “scientist” was then and remains a reference to a person, so it seems to me that the words “lawyer” and “judge” will also remain as references to human beings. The difference will be that many legal tasks which these people undertake and oversee, will be transformed by these new legal tools.

Hedy Lamarr

  1. That takes me to the final person I want to mention to you: Hedy Lamarr. Hedy Lamarr was a Hollywood star in the 1940s. She appeared in the “Road” movies with Bob Hope. She was married to a businessman with trading links around the world. One of the problems in the Second World War, as you can imagine, was that the merchant ships in her husband’s business were under attack from submarines. The submarines were locating the merchant ships by listening to their radio signals and then triangulating their location.

  2. Hedy Lamarr had an idea based on a pianola. That is an automatic piano with a drum which turns and the little marks on the drum click different keys on the piano, so a tune is played as the drum turns. Of course each key on the piano represents a different sound frequency. Hedy Lamarr realised that you could adapt this approach to make the frequency of the radio transmitter jump from level to level, in accordance with the marks on a drum. So if one had a receiver and a transmitter with equivalent drums rotating at the same speed the radio frequency would jump from place to place but the communication channel would stay open as long as the two drums were turning at the same speed. This is called frequency hopping. It sets a standard or protocol which allows the transmitter and receiver to interoperate, without a third party being able to follow what is going on. If you used it, the submarines would not be able to identify the radio signals being transmitted because, without a copy of the drum, as soon as the frequency had jumped the submarine would not know where to turn the dial.

  3. Frequency hopping telecommunications was one of the technologies which was used in the 2G digital telecommunications standard and Hedy Lamarr was the inventor of the first patent on frequency hopping radio. It might be said therefore that Hedy Lamarr was the proprietor of the first telecommunications “Standard Essential Patent”. But since her patent was filed in the 1940s, and telecommunications standardisation only really got going in the 1990s, Hedy Lamarr’s patent had long expired before mobile telephones emerged.
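The drum idea translates naturally into code. In the Python sketch below a shared number plays the part of the drum: transmitter and receiver each use it to generate the same sequence of channels, one per time slot. The use of 88 channels (echoing the keys of a piano), the message and the key values are illustrative assumptions, and Python's ordinary random generator is used for simplicity; it is not secure enough for real communications.

```python
import random

CHANNELS = 88  # ASSUMPTION: illustrative, one channel per piano key


def hop_sequence(shared_key, length, channels=CHANNELS):
    """The 'drum': the same key always yields the same channel sequence."""
    drum = random.Random(shared_key)
    return [drum.randrange(channels) for _ in range(length)]


def transmit(message, shared_key):
    """Send each symbol on the channel the drum dictates for its time slot."""
    hops = hop_sequence(shared_key, len(message))
    return list(zip(hops, message))


def receive(signal, key):
    """Tune to the channel the receiver's own drum dictates for each slot.
    A symbol is heard only if the receiver is on the right channel."""
    hops = hop_sequence(key, len(signal))
    return "".join(
        symbol if channel == tuned else "?"
        for (channel, symbol), tuned in zip(signal, hops)
    )


signal = transmit("CONVOY AT DAWN", shared_key=1942)
heard_by_receiver = receive(signal, key=1942)
heard_by_submarine = receive(signal, key=7)  # no copy of the drum
```

The receiver with the matching key recovers the whole message, while a listener with a different key is almost always tuned to the wrong channel. Nothing here depends on who built the transmitter or the receiver; they interoperate because they follow the same protocol.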

  4. Now I think there are lessons from telecommunications standardisation which have relevance to the Digital Justice System of the future, particularly in the Pre-Action space. The mobile telephones which have taken over the world are based on data standards. The data standards allow different machines made by different private companies – Nokia, Apple, Samsung and so on – to communicate with each other, anywhere in the world. The lesson is that in order to create an interoperable digital system, the nation state does not need to build anything. One simply needs to specify the right kind of data standard.

  5. I explained this in some detail in a lecture to the Competition Lawyers Association in November 2023 but let me summarise it this evening. Thinking again about an individual without the means to afford a lawyer, very often real people’s problems are multifactorial. They may have had an injury, which means they lose their job, which means they cannot pay the rent and that may lead to family breakdown. Do they present at an employment tribunal? As a claimant in a personal injury claim in the county court? In the family court, or maybe somewhere else? Wouldn’t it be wonderful if it did not matter?

  6. If we promulgate the right data standard, then in future, it might not matter which provider of advice or assistance the individual first got in touch with. Let me explain what I mean.

  7. Today the HMCTS Reform programme has built a single database for civil, family and tribunals. Let us say we specify a single data standard which governs the ability to bring a claim into any of civil, family or tribunals. This should not be a difficult task given that all three are already on one system. One could then generalise further and promulgate the same data standard for use by any pre-action service providers: whether they are online dispute resolution (ODR) portals, or ombuds, or advice providers like Advice Now, or law centres, or law firms, or any other pre-action providers of legal advice, assistance and dispute resolution services. The route to do this already exists. The Online Procedure Rule Committee (OPRC) has the statutory authority to specify data standards in the pre-action space. That is provided for by section 24 of the Judicial Review and Courts Act 2022.

  8. Now imagine the person who has suffered an injury which led to the cascade of problems I mentioned a minute ago. Let us say they approach the Housing Ombudsman because the landlord is threatening eviction. The Housing Ombudsman’s IT system will gather information from them. This may well include information which is not only about the housing problem but also perhaps employment, personal injury and family. If the ombudsman is able to resolve the housing problem, the next thing that may be required is for the person to address the employment issue. With a common data standard, the Housing Ombudsman will be able to send the data straight to the ACAS system which is a necessary prerequisite for bringing an employment tribunal claim.

  9. Of course the Housing Ombudsman would not do that unless that is what the litigant wanted to do but assuming the litigant did wish to do that, there’s no data protection problem because this is with the consent of the person. The person does not have to tell their story over and over again, and the data can be transferred appropriately. If ultimately the matter is not resolved pre-action, then the same data standard, which also applies to the court as well as to the other providers, means that it will be a simple matter to bring the relevant kind of court case to resolve any outstanding issues.

  10. And by the way, the OPRC is starting work on this as we speak. The vision is of an interoperable pre-action space to replace the currently balkanised system, created by a true public-private partnership without the government having to build a monolithic IT system.
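What a common data standard buys can be sketched in Python. No standard has yet been specified, so the record layout, the field names, the schema label and the provider names below are all invented for illustration. The sketch shows one provider exporting a person's record, only with their consent, in an agreed format which any other provider following the same standard can read without the person retelling their story.

```python
import json
from dataclasses import dataclass, field, asdict

# ASSUMPTION: a hypothetical schema label; no real standard is implied.
SCHEMA_VERSION = "pre-action-record/0.1"


@dataclass
class PreActionRecord:
    person_ref: str
    consent_to_share: bool
    issues: dict = field(default_factory=dict)  # issue type -> facts gathered


def export_record(record, recipient):
    """Serialise the record for another provider, only with consent."""
    if not record.consent_to_share:
        raise PermissionError("the person has not consented to the transfer")
    payload = {
        "schema": SCHEMA_VERSION,
        "recipient": recipient,
        "record": asdict(record),
    }
    return json.dumps(payload, sort_keys=True)


def import_record(message):
    """Any provider following the same standard can read the message."""
    payload = json.loads(message)
    if payload["schema"] != SCHEMA_VERSION:
        raise ValueError("unrecognised data standard")
    return PreActionRecord(**payload["record"])


# The housing provider gathers the story once ...
gathered = PreActionRecord(
    person_ref="P-1",
    consent_to_share=True,
    issues={"housing": "threatened eviction", "employment": "dismissed after injury"},
)
# ... and a hypothetical employment conciliation system reads the same data.
received = import_record(export_record(gathered, recipient="employment-conciliation"))
```

Neither side needs to know how the other's IT system is built. Each only needs to implement the agreed format, which is why specifying the standard can be enough without building a single monolithic system.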

Conclusion

  1. To conclude it seems to me that the future of our justice system necessarily involves technology.

    a. Currently we have seen that useful tech solutions are not evenly distributed, as William Gibson might notice.

    b. We know that there are tasks, which we have traditionally regarded as human tasks, which almost certainly can be improved by using computers, as Annie Jump Cannon might have hoped.

    c. And we can see that innovative data standards have a role to play, not only to help avoid sinking ships as Hedy Lamarr discovered, but to improve access to justice for those with limited means.

  2. I want to finish by making one observation about those who do not find it easy to use technology. It is important to see that technology will also be useful, even for those who don’t find it straightforward. There are many people who may not have access to the Internet and mobile telephones or may find these systems difficult, not to say impossible, to use. These people are of crucial importance to our justice system and we need to make sure that they obtain the benefits of what I have described, just as much as those who can use technology directly. But I’m an optimist. Whether it is by having trusted intermediaries to help or by improving access in public spaces like libraries, there are ways in which digitally disadvantaged people will be able to obtain the benefits of the future Digital Justice System even when it is fully technically enabled. Their experience will be better, just as the experience of those who can use the technologies will be better too.

  3. Thank you very much.