{"id":248,"date":"2014-10-16T11:57:04","date_gmt":"2014-10-16T17:57:04","guid":{"rendered":"http:\/\/harrysurden.com\/wordpress\/?p=248"},"modified":"2014-10-23T11:00:15","modified_gmt":"2014-10-23T17:00:15","slug":"predicting-supreme-court-decisions-using-artificial-intelligence","status":"publish","type":"post","link":"https:\/\/www.harrysurden.com\/wordpress\/archives\/248","title":{"rendered":"Predicting Supreme Court Decisions Using Artificial Intelligence"},"content":{"rendered":"<h2>Predicting Supreme Court Outcomes\u00a0Using AI?<\/h2><p>Is it possible to predict the outcomes of legal cases &#8211; such as Supreme Court decisions &#8211; using Artificial Intelligence (AI)? \u00a0I recently had the opportunity to consider\u00a0this question at a talk that I gave entitled &#8220;<a href=\"https:\/\/www.youtube.com\/watch?v=sOLXOsiX0Qk&index=1&list=PL48E61C121CAD0E1B\">Machine Learning Within Law<\/a>&#8221; at Stanford.<\/p><p>At that talk, I discussed a very interesting new paper entitled &#8220;<a href=\"http:\/\/papers.ssrn.com\/sol3\/papers.cfm?abstract_id=2463244\">Predicting the Behavior of the Supreme Court of the United States&#8221;<\/a>\u00a0by <a href=\"http:\/\/www.katz.law.msu.edu\/\">Prof. Dan Katz (Mich. State Law),<\/a> \u00a0Data Scientist <a href=\"http:\/\/bommaritollc.com\/\">Michael Bommarito<\/a>, \u00a0and <a href=\"http:\/\/joshblackman.com\/blog\/\">Prof. Josh Blackman (South Texas Law).<\/a><\/p><p>Katz, Bommarito, and Blackman used\u00a0<a href=\"http:\/\/en.wikipedia.org\/wiki\/Machine_learning\">machine-learning AI\u00a0techniques<\/a>\u00a0to build a computer model capable of predicting the outcomes of arbitrary Supreme Court cases with an accuracy of about 70% &#8211; a strong result. 
\u00a0This post will discuss their\u00a0approach and why it was an improvement over prior research in this area.<\/p>\n<h2>Quantitative\u00a0Legal Prediction<\/h2><p>The general idea behind such approaches is\u00a0to use computer-based analysis of <em>existing<\/em>\u00a0data (e.g. data on past Supreme Court cases) in order to predict the outcome of <em>future<\/em> legal events (e.g. pending\u00a0cases). \u00a0The approach of using <em>data<\/em> to inform\u00a0legal predictions (as opposed to pure lawyerly analysis) has been largely championed by Prof. Katz &#8211; something that he has dubbed &#8220;<a href=\"http:\/\/papers.ssrn.com\/sol3\/papers.cfm?abstract_id=2187752\">Quantitative\u00a0Legal Prediction&#8221; in recent work.<\/a><\/p><p>Legal prediction is an important function that attorneys perform for clients. \u00a0Attorneys predict all sorts of things, ranging from the likely outcome of pending cases, the risk of liability, and estimates of damages, to the importance of various laws and facts to legal decision-makers. \u00a0Attorneys use a mix of legal training, problem-solving, analysis, experience, analogical reasoning, common sense, intuition and other higher-order cognitive skills to engage in sophisticated, informed\u00a0assessments of likely outcomes.<\/p><p>By contrast, the\u00a0quantitative approach takes a different tack: \u00a0using advanced algorithms to analyze data and produce data-driven predictions of legal outcomes (instead of, or in addition to, traditional legal analysis). \u00a0These data-driven predictions can provide additional information to support attorney analysis.<\/p>\n<h2><strong>Predictive Analytics: Finding Useful Patterns in Data<\/strong><\/h2><p>Outside of law, <a href=\"http:\/\/en.wikipedia.org\/wiki\/Predictive_analytics\">predictive analytics<\/a> has been widely applied to produce\u00a0automated predictions\u00a0in multiple contexts. 
\u00a0 Real-world examples of predictive analytics include: the\u00a0automated product <a href=\"http:\/\/www.amazon.com\/gp\/help\/customer\/display.html?nodeId=13316081\">recommendations made by Amazon.com<\/a>, movie <a href=\"http:\/\/techblog.netflix.com\/search\/label\/recommendations\">recommendations made by Netflix,<\/a> and the <a href=\"http:\/\/www.google.com\/insidesearch\/features\/instant\/about.html\">search terms\u00a0automatically suggested by Google<\/a>.<\/p>\n<h3><strong>Scanning Data for Patterns that Are Predictive of Future Outcomes<\/strong><\/h3><p>In general, predictive analytics approaches use\u00a0advanced computer algorithms to scan large amounts of data to detect patterns. \u00a0These patterns can often be used to make intelligent, useful predictions about never-before-seen future data. \u00a0Many of these approaches employ\u00a0&#8220;<a href=\"http:\/\/en.wikipedia.org\/wiki\/Machine_learning\">Machine Learning<\/a>&#8221; techniques to engage in prediction.\u00a0I have <a href=\"http:\/\/papers.ssrn.com\/sol3\/papers.cfm?abstract_id=2417415\">written about some of the ways that machine-learning based analytical approaches are starting to be used within law and the legal system<\/a>.<\/p><p>Broadly speaking, <a href=\"http:\/\/en.wikipedia.org\/wiki\/Machine_learning\">machine-learning<\/a>\u00a0refers to a research area studying computer systems that are able to improve their\u00a0performance\u00a0on some task over time with experience. 
\u00a0Such algorithms are specifically designed\u00a0to detect patterns in data that can highlight non-obvious relationships or that can be predictive of future outcomes (such as detecting that Netflix users who like movie X tend also to like movie Y, and concluding that since you like movie X, you&#8217;re likely to like movie Y).<\/p><p>Importantly, these\u00a0algorithms are designed to\u00a0&#8220;learn&#8221; &#8211; \u00a0in the sense that they can change their own behavior to get better at some task &#8211; like predicting movie preferences &#8211; over time by detecting new, useful\u00a0patterns within additional data. \u00a0Thus, the\u00a0general idea behind predictive legal analytics is to\u00a0examine data concerning past legal cases and use\u00a0machine learning algorithms to detect and learn\u00a0patterns that could be predictive of future case outcomes.<\/p><p>In such a\u00a0machine learning approach &#8211; called supervised learning &#8211; \u00a0we &#8220;train&#8221; the algorithm by providing\u00a0it with examples of past data that has been definitively\u00a0classified. \u00a0For example, there may be a body of existing data about Supreme Court cases along with confirmed data indicating whether the outcome was to affirm or reverse, along with other potentially predictive data, such as the lower circuit and the subject matter at issue. \u00a0Such an algorithm examines this training data to detect patterns and statistical correlations between variables and outcomes (e.g. 
9th Circuit cases being more likely to be reversed) and builds a computer model that can be predictive of future outcomes.<\/p><p>It is helpful to briefly review some earlier\u00a0research using data\u00a0analytics to predict Supreme Court outcomes in order to understand the\u00a0contribution of Katz, Bommarito, and Blackman&#8217;s paper.<\/p>\n<h2>Prior Work in Analytical Supreme Court Prediction<\/h2><p>Pioneering work in the area of quantitative legal prediction began in 2004 with a seminal\u00a0project\u00a0by <a href=\"https:\/\/www.law.upenn.edu\/cf\/faculty\/truger\/\">Prof. Ted Ruger (U Penn)<\/a>, <a href=\"http:\/\/sites.lsa.umich.edu\/admart\/\">Andrew D. Martin (now dean at U Michigan)<\/a>\u00a0and other collaborators,\u00a0<a href=\"http:\/\/scholarship.law.berkeley.edu\/cgi\/viewcontent.cgi?article=1018&context=facpubs\">employing statistical\u00a0methods to predict Supreme Court outcomes.<\/a>\u00a0 \u00a0That project pitted experts in legal prediction &#8211; law professors and attorneys &#8211; against a statistical model that had analyzed<a href=\"https:\/\/www.law.berkeley.edu\/files\/pop04.pdf\">\u00a0data about hundreds of past Supreme Court cases<\/a>.<\/p><p>Somewhat surprisingly,\u00a0the computer model significantly outperformed the experts in predictive ability. The computer model correctly\u00a0forecasted<strong>\u00a075%<\/strong> of Supreme Court outcomes, while the experts had only a <strong>59%<\/strong> success rate in predicting Supreme Court affirm\u00a0or reversal decisions. \u00a0(The computer and the experts performed roughly the same in predicting the\u00a0votes of individual justices &#8211; as opposed to the ultimate outcome &#8211; \u00a0with the computer getting 66.7% correct predictions\u00a0vs. the experts&#8217; 67.9%.)<\/p>\n<h2>Improvements by Katz, Bommarito, and Blackman (2014)<\/h2><p>The work by Ruger, Martin, et al. &#8211; while pioneering &#8211; left some room for improvement. 
\u00a0One aspect\u00a0was that their predictive model &#8211; while\u00a0highly accurate for the relatively short time frame examined (the October 2002 term)\u00a0&#8211; was thought not to\u00a0be <em>broadly generalizable<\/em> to predicting arbitrary\u00a0Supreme Court cases\u00a0across any timespan. \u00a0A primary reason was that the period of Supreme Court cases that they examined to build their models &#8211; roughly 1994 &#8211; 2000 &#8211; involved an unusually stable court. \u00a0Notably, this period exhibited no change in personnel (i.e. justices leaving the court and new justices being appointed).<\/p><p>A model that was &#8220;trained&#8221; on data from an unusually stable period of the Supreme Court, and tested on a short caseload from a period of relatively little fluctuation, might not perform as accurately when applied to a\u00a0broader or less homogeneous examination period, and might not handle changes in court composition in a robust manner.<\/p><p>Ideally, we\u00a0would want any such predictive\u00a0model\u00a0to be flexible\u00a0and generalizable enough to handle significant changes in personnel\u00a0and still be able to produce accurate predictions. Additionally, such a model\u00a0should be general\u00a0enough to predict case outcomes with a relatively consistent level of accuracy regardless of the term or period of years examined.<\/p>\n<h3>Katz, Bommarito, and Blackman: Machine Learning And Random Forests<\/h3><p>While building upon Ruger et al.&#8217;s pioneering work, Katz, Bommarito, and Blackman improved upon it\u00a0by employing a relatively new machine learning approach known as <a href=\"http:\/\/en.wikipedia.org\/wiki\/Random_forest\">&#8220;Random\u00a0Forests.&#8221;<\/a>\u00a0 \u00a0Without getting into the details, it is important to note that Random Forest approaches have been shown to be quite\u00a0robust and generalizable as compared to\u00a0other modeling approaches in contexts such as this. 
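To make the supervised train-then-predict loop described above concrete, here is a minimal toy sketch of a random-forest-style ensemble: many depth-one trees (stumps), each trained on a bootstrap resample with a randomly chosen feature, voting on an outcome. Everything here is invented for illustration &#8211; the numeric case features (a lower-circuit ID and an issue-area ID), the labels, and the stump-based simplification are not drawn from the Supreme Court Database or from the model in the paper.

```python
import random

# Hypothetical training data: each case is (lower_circuit_id, issue_area_id),
# and each label is 1 for reversed, 0 for affirmed. Purely illustrative.
X = [(9, 1), (9, 2), (9, 3), (5, 1), (5, 2), (4, 3), (9, 1), (4, 2), (5, 3), (9, 2)]
y = [1, 1, 1, 0, 0, 0, 1, 0, 0, 1]

def majority(labels):
    # Majority label, with ties going to 1 (reverse).
    return 1 if sum(labels) * 2 >= max(len(labels), 1) else 0

def train_stump(Xb, yb, rng):
    # A depth-1 tree: pick a random feature and a random observed threshold,
    # then remember the majority label on each side of the split.
    f = rng.randrange(2)
    t = rng.choice([row[f] for row in Xb])
    left = [lab for row, lab in zip(Xb, yb) if row[f] <= t]
    right = [lab for row, lab in zip(Xb, yb) if row[f] > t]
    return (f, t, majority(left), majority(right))

def train_forest(X, y, n_trees=25, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        # Bootstrap resampling of the training cases, as in a random forest.
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        forest.append(train_stump([X[i] for i in idx], [y[i] for i in idx], rng))
    return forest

def predict(forest, row):
    # Ensemble prediction: majority vote across all stumps.
    votes = [left if row[f] <= t else right for (f, t, left, right) in forest]
    return majority(votes)

forest = train_forest(X, y)
print(predict(forest, (9, 2)))  # classify a hypothetical new case
```

The real model in the paper draws on hundreds of Supreme Court Database variables per case and uses full tree-based ensemble methods rather than single-split stumps; the sketch only shows the structure of the approach: bootstrap, randomize, train many weak learners, and vote.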
\u00a0 The authors applied this algorithmic approach to examine data about past Supreme Court cases found in the <a href=\"http:\/\/scdb.wustl.edu\/\">Supreme Court Database<\/a>. \u00a0In addition to outcome (e.g. affirmed or reversed), this\u00a0database contains hundreds of variables about nearly every\u00a0Supreme Court decision of the past 60 years.<\/p><p>Recall that machine learning approaches often work by providing\u00a0an algorithm with existing data (such as data concerning past Supreme Court case outcomes and potentially predictive variables such as the lower circuit) in order to &#8220;train&#8221; it. \u00a0The algorithm looks for patterns and builds an internal computer model that can hopefully be used to provide predictions on future, never-before-seen data &#8211; such as a pending Supreme Court case.<\/p><p>Katz, Bommarito, and Blackman did this and produced a new,\u00a0robust machine-learning-based computer model that correctly forecasted ~<strong>70%<\/strong>\u00a0of Supreme Court affirm \/ reverse decisions.<\/p><p>This was a significant improvement over prior work. \u00a0Although Ruger et al.&#8217;s model had a\u00a0<strong>75%<\/strong> prediction rate on the period it was analyzed against, \u00a0Katz et al.&#8217;s model was much more\u00a0robust and generalizable.<\/p><p>The new model is able\u00a0to withstand changes in Supreme Court composition and still produce accurate results even when applied across widely variable Supreme Court terms, with varying levels of case predictability. \u00a0In other words, it is unlikely that the Ruger model &#8211; focused only on the single 2002 term &#8211; would produce a 75% rate across a 50-year range of Supreme Court jurisprudence. \u00a0By contrast,\u00a0the computer model produced by Katz et al. consistently delivered\u00a0a 70% prediction rate across nearly 8,000 cases spanning 50+ years.<\/p>\n<h3><strong>Conclusion: Prediction in Law Going Forward<\/strong><\/h3><p>Katz, Bommarito, and Blackman&#8217;s paper is an important contribution. \u00a0In the not-too-distant future, such data-driven approaches\u00a0to legal prediction are likely to become more common within law. Outside of law, data analytics and machine learning have been transforming industries ranging from medicine to finance, and it is unlikely that law will remain as comparatively untouched by such sweeping changes as it is today.<\/p><p>In future posts I will discuss machine learning within law more generally, principles for understanding what such AI techniques can, and cannot, do within law given the state of current technology, and some implications of these technological changes.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Predicting Supreme Court Outcomes\u00a0Using AI? Is it possible to predict the outcomes of legal cases &#8211; such as Supreme Court decisions &#8211; using Artificial Intelligence (AI)? 
\u00a0I recently had the opportunity to consider\u00a0this point at a talk that I gave entitled &#8220;Machine Learning Within Law&#8221; at Stanford.At that talk, I discussed a very interesting new [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[8,5,10,9],"tags":[],"_links":{"self":[{"href":"https:\/\/www.harrysurden.com\/wordpress\/wp-json\/wp\/v2\/posts\/248"}],"collection":[{"href":"https:\/\/www.harrysurden.com\/wordpress\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.harrysurden.com\/wordpress\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.harrysurden.com\/wordpress\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.harrysurden.com\/wordpress\/wp-json\/wp\/v2\/comments?post=248"}],"version-history":[{"count":10,"href":"https:\/\/www.harrysurden.com\/wordpress\/wp-json\/wp\/v2\/posts\/248\/revisions"}],"predecessor-version":[{"id":268,"href":"https:\/\/www.harrysurden.com\/wordpress\/wp-json\/wp\/v2\/posts\/248\/revisions\/268"}],"wp:attachment":[{"href":"https:\/\/www.harrysurden.com\/wordpress\/wp-json\/wp\/v2\/media?parent=248"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.harrysurden.com\/wordpress\/wp-json\/wp\/v2\/categories?post=248"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.harrysurden.com\/wordpress\/wp-json\/wp\/v2\/tags?post=248"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}