Presentation at CICL 2013: Conference on Innovation and Communications Law, 16 May 2013, Glen Arbor, Michigan.
Legal Informatics Research Today:
Implications for Legal Prediction, 3D Printing & eDiscovery
Robert Richards
Penn State University
CICL 2013: Conference on Innovation and Communications Law
Agenda
Legal Informatics: Overview
eDiscovery: Methods, Recent Research
3D Printing: How legal tech could apply
Legal Prediction: Methods, Recent Research
Legal Informatics: Definition
Legal informatics is:
(1) the study of legal information / communication systems
(2) the application of ICT (information / communication technology) to legal information
ICT
Legal Information
What is legal information?
Structured data that express:
1. Legal Rules
2. Information about Legal Rules (1st-, 2nd-, 3rd-, etc. order legal metadata)
3. Evidence: non-legal data used to support an assertion about a legal rule
What is a legal information / communication system?
A set of interrelated entities that receive, process, or output legal information
Examples:
A law office time/billing system
A database of court decisions
A statistical model predicting a legal outcome
Legal Informatics Viewpoint: 4 Levels
In a domain
Addressing an application area
From one or more sub-disciplines, by
Employing one or more methodologies
Legal Informatics: Domains
Law Practice
Courts
Legislature
Regulatory
Politics / Civic Computing
Legal Education
Business
Consumers
Legal Informatics: Application Areas
Litigation
Compliance
Planning
Interviewing / Counseling
Negotiation
Education
Governance / Policy making
Legal Informatics: Sub-Disciplines
Artificial Intelligence
Information Retrieval
Text Processing / NLP
Metadata / Knowledge Representation
Databases / Storage
Linguistics / Communication
Human-Computer Interaction / Information Behavior
Management / Sociology of Info
Legal Informatics: Methodologies
Prototyping
Statistics / Probability
Experimentation
Network Analysis
Survey Research
Case Study
Cost-Benefit Analysis
Ethnography
Interviewing
Doctrinal Analysis
Example
Much eDiscovery research involves…
Law Practice (Domain)
Litigation / Evidence (Application Area)
Information retrieval + text analysis + knowledge representation / metadata + management (Sub-Disciplines)
Prototyping + experimentation + statistical analysis + cost-benefit analysis (Methodologies)
4-Level Approach Reveals Relationships Between (Apparently) Dissimilar Research Activities
Scherer, S., Wimmer, M. A., & Markisic, S. (2013). Bridging narrative scenario texts and formal policy modeling through conceptual policy modeling. Artificial Intelligence and Law. doi:10.1007/s10506-013-9142-2
Scherer et al. (2013)
ICT Citizen’s Legal Narrative Doctrine/Rule
Scherer et al.: Public Policy Domain
Methodologies: Prototyping + Case study
Sub-Disciplines: Artificial intelligence + Linguistics + Text Analysis + Knowledge Representation
Application area: Translating non-legal language to legal concepts
Domain: Public policy (e-Participation)
Scherer et al.: Law Practice Domain
Methodologies: Prototyping + Case study
Sub-Disciplines: Artificial intelligence + Linguistics + Text Analysis + Knowledge Representation
Application area: Translating non-legal language to legal concepts
Domain: Law practice (Counseling, Interviewing)
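A minimal sketch of how the two readings of Scherer et al. could be compared level by level. The `ResearchProfile` class and the `shared_levels` helper are illustrative names assumed here, not part of any cited work; the level values are copied from the two slides above.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ResearchProfile:
    """A research activity described on the four levels (hypothetical helper)."""
    domain: frozenset
    application_areas: frozenset
    sub_disciplines: frozenset
    methodologies: frozenset


def shared_levels(a, b):
    """Return, for each level, the values that both profiles have in common."""
    return {f.name: getattr(a, f.name) & getattr(b, f.name) for f in fields(a)}


# Values shared by both readings of Scherer et al. (2013).
METHODS = frozenset({"Prototyping", "Case study"})
SUBS = frozenset({"Artificial intelligence", "Linguistics",
                  "Text analysis", "Knowledge representation"})
APPS = frozenset({"Translating non-legal language to legal concepts"})

public_policy = ResearchProfile(frozenset({"Public policy"}), APPS, SUBS, METHODS)
law_practice = ResearchProfile(frozenset({"Law practice"}), APPS, SUBS, METHODS)

overlap = shared_levels(public_policy, law_practice)
# Levels on which the two profiles have nothing in common.
differing = [level for level, common in overlap.items() if not common]
print(differing)  # ['domain']
```

Only the domain differs, which is the sense in which the 4-level approach relates apparently dissimilar research activities.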
Functions of Legal Informatics Approach
Analyze: Processes
Define: Problems
Explain: Causation
Predict: Outcomes
Functions of Legal Informatics Approach (cont’d)
Evaluate: Processes, Outcomes
Apply: Diverse approaches and methods
eDiscovery
Definition
Goals and Motivation
Models
Research Results
Predictive Coding
Future Areas of Research
eDiscovery: definition
In litigation, the request for and production of electronically stored information relevant to a claim or count
eDiscovery: Goals
Increase effectiveness of methods
Lower costs
Cost Motivation
Big Data leads to prohibitive costs for traditional relevance and privilege review
With data sets of > 10^6 objects, linear manual review and privilege review become unsustainably expensive
EDRM Model
New Models Emerging: Informatics-Based, Elaborating EDRM
EDRM Oard & Webber
Oard & Webber (2013)
[Process diagram. Labels: Production request, Collection, Responsive ESI, Production, Insight, Formulation, Acquisition, Review for Relevance, Review for Privilege, Sense-making]
© Copyright 2013 Douglas W. Oard and William Webber
TREC & EDI: Key Findings
Initial Search & Second-Step Relevance Feedback: Automated relevance ranking > Boolean query (on recall)
Interactive Evaluation: Technology-Assisted Review > Manual Review (on overall results and precision)
High Precision + High Recall are possible with certain topics
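The findings above are stated in terms of recall and precision. A short sketch of how the two measures are computed for a production set, using the standard information-retrieval definitions; the document identifiers are made up for illustration.

```python
def precision_recall(produced, relevant):
    """Precision: share of produced documents that are relevant.
    Recall: share of relevant documents that were produced."""
    produced, relevant = set(produced), set(relevant)
    true_positives = len(produced & relevant)
    precision = true_positives / len(produced) if produced else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical collection: four documents are truly relevant,
# the review produced three documents, two of them relevant.
relevant_docs = {"d1", "d2", "d3", "d4"}
produced_docs = {"d1", "d2", "d5"}

p, r = precision_recall(produced_docs, relevant_docs)
f1 = f1_score(p, r)
print(round(p, 3), round(r, 3), round(f1, 3))  # 0.667 0.5 0.571
```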
TREC Key Findings (cont’d)
Predictive coding produced high recall
But most machine learning systems could not correctly choose the sample size that maximizes precision and recall
Machine learning systems that yielded highly relevant results also yielded highly material docs
Privilege Review Remains a Key Cost Driver & Is Under-Automated (Pace & Zakaras, 2012)
Automated privilege review yielded high recall in one study (but method was not disclosed)
eDiscovery: Measurement Error
Low rates of inter-assessor agreement
Found in TREC & EDI studies
Cooperation between parties on evaluation in tech-assisted review likely to lower measurement error
This is an emerging best practice (see, e.g., Da Silva Moore)
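A sketch of two common ways to quantify inter-assessor agreement: Cohen's kappa (agreement corrected for chance) and overlap (shared relevant documents divided by documents either assessor marked relevant). These are offered as illustrations under the assumption of two assessors and binary labels; they are not necessarily the measures used in the TREC and EDI studies. The labels are made up.

```python
def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two assessors over the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both assessors must label the same non-empty item list")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    categories = set(labels_a) | set(labels_b)
    # Agreement expected if each assessor labeled at random with their own rates.
    expected = sum((labels_a.count(c) / n) * (labels_b.count(c) / n)
                   for c in categories)
    if expected == 1.0:
        return 1.0  # both assessors used a single identical label throughout
    return (observed - expected) / (1 - expected)


def overlap(labels_a, labels_b, positive=1):
    """Documents both marked relevant / documents either marked relevant."""
    rel_a = {i for i, label in enumerate(labels_a) if label == positive}
    rel_b = {i for i, label in enumerate(labels_b) if label == positive}
    union = rel_a | rel_b
    return len(rel_a & rel_b) / len(union) if union else 1.0


# Hypothetical relevance judgments (1 = relevant, 0 = not relevant).
assessor_a = [1, 1, 1, 0, 0, 0, 1, 0]
assessor_b = [1, 1, 0, 0, 0, 1, 1, 0]

kappa = cohens_kappa(assessor_a, assessor_b)
shared = overlap(assessor_a, assessor_b)
print(round(kappa, 2), round(shared, 2))  # 0.5 0.6
```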
eDiscovery: Recent Emphases (Baron, 2011)
Process Quality Standards & Best Practices: Metrics & certification (DESI IV, 2011)
Cooperation between Parties: Sedona Conference (2009)
Improved Search, including Predictive Coding: DESI V, 2013; results of TREC & EDI research
Courts are implementing all of these recommendations
eDiscovery: Recent Emphases: Sub-Disciplines
Process Quality Standards & Best Practices: Management
Cooperation between Parties: Management, Information Retrieval, Knowledge Representation
Improved Search, including Predictive Coding: Information Retrieval, Text Analysis, Knowledge Representation, Information Behavior, Management
Predictive Coding: Definition
Machine learning applied to classification of information, e.g., as responsive / non-responsive
Predictive Coding: Diverse Methods
Support Vector Machines
Latent Semantic Analysis
Naïve Bayesian Classifiers
Decision Trees
Neural Networks
Association Rule Learning
Rule Induction
Genetic Algorithms
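To make one of the listed methods concrete, here is a minimal sketch of a naive Bayesian classifier coding documents as responsive or non-responsive. It is a toy multinomial model with add-one smoothing, written from scratch for illustration; the class name `NaiveBayesCoder` and the seed documents are assumptions, and production predictive-coding tools are considerably more elaborate.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesCoder:
    """Toy multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, documents, labels):
        self.class_counts = Counter(labels)        # documents per class
        self.word_counts = defaultdict(Counter)    # word frequencies per class
        for text, label in zip(documents, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocabulary = {w for counts in self.word_counts.values() for w in counts}
        return self

    def log_scores(self, text):
        """Log prior plus summed log likelihoods for each class."""
        total_docs = sum(self.class_counts.values())
        scores = {}
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denominator = sum(counts.values()) + len(self.vocabulary)
            score = math.log(n_docs / total_docs)
            for word in text.lower().split():
                if word in self.vocabulary:  # words never seen in training are skipped
                    score += math.log((counts[word] + 1) / denominator)
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Hypothetical seed set coded by a reviewer.
seed_docs = [
    "merger pricing agreement with competitor",
    "pricing meeting notes competitor call",
    "lunch order for friday",
    "holiday party parking schedule",
]
seed_labels = ["responsive", "responsive", "non-responsive", "non-responsive"]

coder = NaiveBayesCoder().fit(seed_docs, seed_labels)
print(coder.predict("notes on competitor pricing"))  # responsive
print(coder.predict("friday lunch schedule"))        # non-responsive
```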
Predictive Coding: Courts Reading, Citing, & Applying Legal Informatics Research
Da Silva Moore v. Publicis Groupe
EORHB v. HOA Holdings
Global Aerospace Inc. v. Landow Aviation
Kleen Products v. Packaging Corp. of America
eDiscovery: Future Research Directions
Evaluation Standards & Certification
Threshold point estimates: relevance threshold, sample size threshold
Confidence level, confidence intervals
Typology of Production Requests
Electronic Discovery Institute plans 2nd study on real e-discovery materials, testing TREC conclusions, with higher ecological validity
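The sample-size, confidence-level, and confidence-interval items above can be illustrated with the textbook normal-approximation formulas for estimating a proportion, such as the share of responsive documents in a collection. This is a sketch under stated assumptions (simple random sampling, a large collection, the Wald interval); it is not a proposed evaluation standard, and the review counts are made up.

```python
import math
from statistics import NormalDist


def z_value(confidence_level):
    """Two-sided critical value of the standard normal distribution."""
    return NormalDist().inv_cdf(0.5 + confidence_level / 2)


def sample_size(margin_of_error, confidence_level=0.95, expected_proportion=0.5):
    """Documents to sample so the estimate falls within the margin of error.
    expected_proportion=0.5 is the most conservative (largest) choice."""
    z = z_value(confidence_level)
    p = expected_proportion
    return math.ceil(z ** 2 * p * (1 - p) / margin_of_error ** 2)


def wald_interval(hits, n, confidence_level=0.95):
    """Normal-approximation confidence interval for a sampled proportion."""
    p = hits / n
    half_width = z_value(confidence_level) * math.sqrt(p * (1 - p) / n)
    return max(0.0, p - half_width), min(1.0, p + half_width)


n_needed = sample_size(margin_of_error=0.05)
print(n_needed)  # 385

# Hypothetical review: 80 of 400 sampled documents were responsive.
low, high = wald_interval(hits=80, n=400)
print(round(low, 3), round(high, 3))  # 0.161 0.239
```

The Wald interval is known to behave poorly when the proportion is near 0 or 1, which is common for low-prevalence collections; other interval methods may be preferable there.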
eDiscovery: Future Research Directions (cont’d)
Measurement Error: Modeling it & correcting for it
Designing re-usable test collections
Automated privilege review: identifying effective methods; designing test collections to evaluate those methods
eDiscovery: Future Research Directions (cont’d)
Evaluating de-duplication methods
Improved privacy measures to enable experiments on real-life data sets
Apply other sub-disciplines, including Information behavior
Diversify methods, including social network analysis
More research on Early Case Assessment
3D Printing
Definition
Expected Effects
Lawyers’ Value-Add
Short-Term Application of Legal Technology
Long-Term Application of Legal Technology
3D Printing: Definition
The generation of physical objects from computer models, by a layering process
Also called Additive Manufacturing (Gibson, Rosen, & Stucker, 2010)
3D Printing: Some Expected Effects
Democratizing manufacturing
More inventors
More innovation
More infringement
More demand for legal compliance services
More demand for patent legal services
Patent Lawyers’ Value-Add for Entrepreneurs / New Inventors
Patent Search
Claim Interpretation
Currency of Information
Customization of Information to Client’s Circumstances
Strategic Advice (Law + Business)
How Might Legal Informatics Affect 3D Printing?
Legal Informatics is likely to interact with 3D Printing in two ways:
Short-Term: Unbundling of patent legal services (Mosten, 1994)
Long-Term: Automated patent search & Modeling of claim interpretations incorporated into CAD software
Unbundling of Patent Legal Services
Selling (outdated) patent search results
Selling (outdated) memoranda containing claim interpretations
Offering (remotely) updated & customized search results and counseling for an extra fee
Patent Legal Services Unbundling: 4-Levels
Domain: Business
Application Areas: Compliance, Counseling
Sub-Disciplines: Management, Information Retrieval, Knowledge Representation
Methodologies: Prototyping, Case Studies, Doctrinal Analysis, Cost-Benefit Analysis
Automated patent search & modeling of claim interpretations (Hulicki, 2013; Mulligan & Lee, 2012)
User inputs simulation/design/image of invention
CAD software analyzes input, determines domain & patent search parameters
CAD software executes patent search, retrieves relevant patents in force
CAD software analyzes claims of retrieved patents
Automated patent search & modeling of claim interpretations (cont’d)
CAD software translates claims into simulation parameters
For each simulation model, CAD software calculates probability of liability for patent infringement & possible exposure
Output displays liability probability + potential exposure
Lawyer offers (remote) legal counseling for an extra fee
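A rough sketch of the "liability probability + potential exposure" output step for one simulation model. Everything here is hypothetical: the `PatentRisk` record, the per-patent probabilities and damages figures, and the assumption that infringement findings on different patents are independent. The slides do not specify how the probabilities would be estimated.

```python
from dataclasses import dataclass
from math import prod


@dataclass
class PatentRisk:
    """Estimated risk from one retrieved patent (hypothetical record)."""
    patent_id: str                   # placeholder identifier, not a real patent
    infringement_probability: float  # assumed output of the claim-analysis step
    potential_damages: float         # assumed exposure if found liable


def liability_summary(risks):
    """Probability of liability on at least one patent (assuming independence)
    and the probability-weighted exposure across all retrieved patents."""
    p_any = 1 - prod(1 - r.infringement_probability for r in risks)
    expected_exposure = sum(r.infringement_probability * r.potential_damages
                            for r in risks)
    return p_any, expected_exposure


# Made-up figures for a single design.
risks = [
    PatentRisk("patent-A", 0.10, 200_000.0),
    PatentRisk("patent-B", 0.25, 40_000.0),
]

p_any, expected_exposure = liability_summary(risks)
print(round(p_any, 3), round(expected_exposure, 2))  # 0.325 30000.0
```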
Automated Patent Search & Modeling of Claim Interpretations: 4-Levels
Domain: Business
Application Areas: Compliance, Counseling
Sub-Disciplines: Artificial Intelligence, Information Retrieval, Knowledge Representation, Human-Computer Interaction
Methodologies: Prototyping, Statistical Modeling, Case Studies, Experimentation, Ethnography, Interviewing
Implications of Both Scenarios
More small-scale inventors/entrepreneurs will have access to legal compliance information at an affordable price
Clients can choose to pay more for higher levels of service
Reform of legal ethics rules may be required to implement either scenario
Legal Prediction
Definition
4-Level View
Temporal Dimensions
Research Results
Possible Effects
Future Research Directions
Legal Prediction: Definitions
(1) Methods for calculating the probability of the occurrence or non-occurrence of law-related events or circumstances at a point in time, on the basis of data acquired at an earlier point in time
(2) Methods for inferring law-related attributes of a population from a sample
Legal Prediction: Application Areas
Case Outcome / Litigation Management (Blackman et al., 2012; Ruger et al., 2004; Ribstein, 2012)
Imputing Default Terms in Contracts & Wills (Porat & Strahilevitz, 2013)
Legislative Bill Passage (Tauberer, 2012; Yano et al., 2012)
Legal Prediction: Application Areas (cont’d)
Document Relevance (eDiscovery, Legal research) (Katz, 2013)
Legal Spend (In-House Counsel) (Katz, 2013)
Lawyer Hiring (Law Firms) (Katz, 2013)
Legal Compliance (Clients, In-House Counsel) (Ribstein, 2012)
Legal Prediction: Sub-Disciplines
Artificial Intelligence
Information Retrieval
Metadata / Knowledge Representation
Text Processing
Legal Prediction: Diverse Methods
Bayesian Inference (McShane et al., 2012; Guimerà & Sales-Pardo, 2011)
Stochastic Block Modeling (Guimerà & Sales-Pardo, 2011)
Classification/Decision Trees (Ribstein, 2012; Ruger et al., 2004)
Crowdsourced Prediction Markets (Blackman et al., 2012; Ribstein, 2012)
Legal Prediction: Diverse Methods (cont’d)
Machine Learning (Katz, 2013)
Case-Based Reasoning (Ribstein, 2012)
Surveys (Dimmock & Gerken, 2012; Porat & Strahilevitz, 2013)
Regression, Maximum Likelihood (Dimmock & Gerken, 2012)
Legal Prediction: Model vs. Crowdsourcing
Blackman’s FantasySCOTUS vs. Martin, Ruger et al.
Complementary approaches
Legal Prediction: Three Temporal Dimensions
Synchronic: Inference from sample to parameters of a static population
Predictive coding, machine learning
Used to collect data set for model
Diachronic Future: Inference from sample at t to observations at t + 1, where t + 1 is later than today
Forward prediction (Katz)
Often performed on the data set gathered using Synchronic prediction
Diachronic Past: Retrospective prediction
Inference from sample at t to observations at t + 1, where t + 1 is earlier than today
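The "Diachronic Past" case can be sketched as a walk-forward evaluation: for each past period, fit only on earlier observations, predict that period, and score against what actually happened. The majority-outcome predictor below is a deliberately naive stand-in for a real model, and the outcome data are made up for illustration.

```python
from collections import Counter


def majority_predictor(history):
    """Predict the most frequent outcome seen so far (a naive baseline)."""
    counts = Counter(outcome for _, outcome in history)
    return counts.most_common(1)[0][0]


def retrospective_accuracy(observations, first_test_period):
    """Walk forward through past periods: use observations before t
    to predict the outcomes observed in period t, then score them."""
    periods = sorted({period for period, _ in observations})
    correct = total = 0
    for t in periods:
        if t < first_test_period:
            continue
        history = [obs for obs in observations if obs[0] < t]
        if not history:
            continue  # nothing earlier to learn from
        guess = majority_predictor(history)
        for period, outcome in observations:
            if period == t:
                correct += outcome == guess
                total += 1
    return correct / total if total else float("nan")


# Hypothetical (period, outcome) pairs, all earlier than today.
observations = [
    (2009, "reverse"), (2009, "reverse"), (2009, "affirm"),
    (2010, "reverse"), (2010, "affirm"),
    (2011, "affirm"), (2011, "reverse"), (2011, "reverse"),
]

accuracy = retrospective_accuracy(observations, first_test_period=2010)
print(accuracy)  # 0.6
```

Because every prediction uses only data from before the period being predicted, the score approximates how the model would have performed had it been run forward at the time.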
Legal Prediction: Some Research Results
Decision Tree > Domain Experts (Ruger et al.)
Crowdsourcing > Domain Experts (Blackman et al.)
Crowdsourcing = Decision Tree (Blackman et al.)
Stochastic Block Models > case-content based algorithms (Guimerà & Sales-Pardo)
Stochastic Block Models > Domain Experts (Guimerà & Sales-Pardo)
Legal Prediction: Possible Effects
Lawyer disintermediation (Katz, 2013; Ribstein, 2012)
Client empowerment (Ribstein, 2012)
Reduction in legal costs (Katz, 2013; Ribstein, 2012)
Within businesses, distribution of legal tasks to non-legal personnel (Ribstein, 2012)
Legal Prediction: Future Research Directions
Analogical reasoning: development of improved models (Katz)
Crowdsourced prediction markets for lower-level courts (Blackman et al.)
Automated prediction engines for lower-level courts (Blackman et al.)
References
Agrawal, R., Mannila, H., Srikant, R., Toivonen, H., & Verkamo, A. I. (1996). Fast discovery of association rules. Advances in Knowledge Discovery and Data Mining, 12, 307–328.
Ashley, K. D., & Brüninghaus, S. (2009). Automatically classifying case texts and predicting outcomes. Artificial Intelligence and Law, 17, 125-165. doi:10.1007/s10506-009-9077-9
Ashley, K. D., & Bridewell, W. (2010). Emerging AI & Law approaches to automating analysis and retrieval of electronically stored information in discovery proceedings. Artificial Intelligence and Law, 18, 311-320. doi:10.1007/s10506-010-9098-4
Barnett, T., Godjevac, S., Renders, J.-M., Privault, C., Schneider, J., & Wickstrom, R. (2009, June). Machine learning classification for document review. Paper presented at the DESI III Global E-Discovery/E-Disclosure Workshop: A Pre-Conference Workshop at the twelfth International Conference on Artificial Intelligence and Law, ICAIL 2009, Barcelona, Spain.
Baron, J. (2011). Law in the age of exabytes: Some further thoughts on ‘information inflation’ and current issues in e-discovery search. Richmond Journal of Law and Technology, 17(3), Article 9. Retrieved from http://jolt.richmond.edu/v17i3/article9.pdf
Blackman, J., Aft, A., & Carpenter, C. (2012). FantasySCOTUS: Crowdsourcing a prediction market for the Supreme Court. Northwestern Journal of Technology and Intellectual Property, 10(3), Article 3. Retrieved from http://scholarlycommons.law.northwestern.edu/njtip/vol10/iss3/3
Cohen, W. W. (1995). Fast effective rule induction. In Machine learning: Proceedings of the twelfth international conference, ML95.
Conrad, J. (2010). E-discovery revisited: The need for artificial intelligence beyond information retrieval. Artificial Intelligence and Law, 18, 321-345. doi:10.1007/s10506-010-9096-6
Cormack, G. V., Grossman, M. R., Hedin, B., & Oard, D. W. (2011). Overview of the TREC 2010 legal track. In The Nineteenth Text Retrieval Conference (TREC 2010) Proceedings. N.p.: NIST.
Da Silva Moore v. Publicis Groupe, 287 F.R.D. 182 (S.D.N.Y. 2012).
DESI IV (2011). [Call for papers:] ICAIL 2011 workshop on setting standards for searching electronically stored information in discovery proceedings (DESI IV Workshop), June 6, 2011, University of Pittsburgh, Pittsburgh, PA.
DESI V (2013). [Call for papers:] ICAIL 2013 workshop on standards for using predictive coding, machine learning, and other advanced search and review methods in e-discovery (DESI V workshop), June 14, 2013, Consiglio Nazionale delle Ricerche, Rome, Italy.
Dimmock, S. G., & Gerken, W. C. (2012). Predicting fraud by investment managers. Journal of Financial Economics, 105, 153-173. doi:10.1016/j.jfineco.2012.01.002
EORHB, Inc. v. HOA Holdings LLC, Civ. Ac. No. 7409-VCL (Del. Ch. Oct. 15, 2012).
Friedman, N., Geiger, D., & Goldszmidt, M. (1997). Bayesian network classifiers. Machine Learning, 29, 131-163.
Gibson, I., Rosen, D. W., & Stucker, B. (2010). Additive manufacturing technologies: Rapid prototyping to direct digital manufacturing. New York: Springer.
Global Aerospace, Inc., v. Landow Aviation, L.P., No. CL 61040 (Va. Cir., Apr. 23, 2012).
Grossman, M. R., & Cormack, G. V. (2011). Technology-assisted review in e-discovery can be more effective and more efficient than exhaustive manual review. Richmond Journal of Law and Technology, 17(3), Article 11. Retrieved from http://jolt.richmond.edu/v17i3/article11.pdf
Grossman, M. R., Cormack, G. V., Hedin, B., & Oard, D. W. (2011). Overview of the TREC 2011 legal track. In The Twentieth Text Retrieval Conference (TREC 2011) Proceedings. N.p.: NIST.
Guimerà, R., & Sales-Pardo, M. (2011). Justice blocks and predictability of U.S. Supreme Court votes. PLOS ONE, 6(11), e27188. doi:10.1371/journal.pone.0027188
Hulicki, M. (2013, May). Recent judgments of the highest court as a step towards objectification of patentability. Paper presented at CICL 2013: Conference on Innovation and Communications Law, Glen Arbor, MI.
In re Actos (Pioglitazone) Products, No. 6:11-md-2299 (M.D. La., July 27, 2012).
Joachims, T. (1998). Text categorization with support vector machines: Learning with many relevant features. In C. Nédellec & C. Rouveirol (Eds.), Proceedings of the 10th European Conference on Machine Learning (pp. 137–142).
Katz, D. M. (2013). Quantitative legal prediction—Or—How I learned to stop worrying and start preparing for the data-driven future of the legal service industry. Emory Law Journal, 62, 101-158.
Kleen Prods. LLC v. Packaging Corp. of Am., No. 10 C 5711 (N.D. Ill., Sept. 28, 2012).
LexMachina. (n.d.). About, technology. Retrieved from https://lexmachina.com/about/
Martin, A. D., & Quinn, K. M. (2002). Dynamic ideal point estimation via Markov chain Monte Carlo for the U.S. Supreme Court, 1953–1999. Political Analysis, 10, 134-153. doi:10.1093/pan/10.2.134
McShane, B. B., Watson, O. P., Baker, T., & Griffith, S. J. (2012). Predicting securities fraud settlements and amounts: A hierarchical Bayesian model of federal securities class action lawsuits. Journal of Empirical Legal Studies, 9, 482-510. doi:10.1111/j.1740-1461.2012.01260.x
Mosten, F. S. (1994). Unbundling of legal services and the family lawyer. Family Law Quarterly, 28, 421-449.
Mulligan, C., & Lee, T. B. (forthcoming). Scaling the patent system. N.Y.U. Annual Survey of American Law. Retrieved from http://www.ssrn.com/abstract=2016968
Oard, D. W., Baron, J. R., Hedin, B., Lewis, D. D., & Tomlinson, S. (2010). Evaluation of information retrieval for e-discovery. Artificial Intelligence and Law, 18, 347-386. doi:10.1007/s10506-010-9093-9
Oard, D. W., & Webber, W. (2013). Information retrieval for e-discovery. Foundations and Trends in Information Retrieval, 7, 1-141. Retrieved from http://ediscovery.umiacs.umd.edu/pub/ow12fntir.pdf
Pace, N. M., & Zakaras, L. (2012). Where the money goes: Understanding litigant expenditures for producing electronic discovery. Santa Monica, CA: Rand Institute for Civil Justice.
Porat, A., & Strahilevitz, L. J. (2013). Personalizing default rules and disclosure with big data (University of Chicago Coase-Sandor Institute for Law and Economics working paper no. 634, 2nd series). Retrieved from http://www.law.uchicago.edu/Lawecon/index.html
Quinlan, J. R. (1986). Induction of decision trees. Machine Learning, 1, 81-106.
Ribstein, L. (2012). Delawyering the corporation. Wisconsin Law Review, 2012, 305-332.
Richards, R. (2009, June). What is legal information? Paper presented at the Conference on Legal Information: Scholarship and Teaching, at the University of Colorado School of Law, Boulder, CO. Retrieved from http://legalinformatics.wordpress.com/2009/05/31/what-is-legal-information-conference-paper/
Roitblat, H. L., Kershaw, A., & Oot, P. (2010). Document categorization in legal electronic discovery: Computer classification vs. manual review. Journal of the American Society for Information Science and Technology, 61, 70-80. doi:10.1002/asi.21233
Ruger, T. W., Kim, P. T., Martin, A. D., & Quinn, K. M. (2004). The Supreme Court forecasting project: Legal and political science approaches to predicting Supreme Court decisionmaking. Columbia Law Review, 104, 1150-1210.
Scherer, S., Wimmer, M. A., & Markisic, S. (2013). Bridging narrative scenario texts and formal policy modeling through conceptual policy modeling. Artificial Intelligence and Law. doi:10.1007/s10506-013-9142-2
The Sedona Conference. (2009). Commentary on achieving quality in e-discovery. N. p.: The Sedona Conference.
Tauberer, J. (2012, December 7). Bill prognosis gets a few improvements. GovTrack Blog [web log post]. Retrieved from http://www.govtrack.us/blog/2012/12/007/bill-prognosis-gets-a-few-improvements
Webber, W. (2011, July). Re-examining the effectiveness of manual review. Paper presented at SIGIR 2011 Information Retrieval for E-Discovery (SIRE) Workshop, Beijing, China.
Yano, T., Smith, N. A., & Wilkerson, J. D. (2012, October). Textual predictors of bill survival in congressional committees. Paper presented at New Directions in Analyzing Text as Data 2012, Harvard University, Cambridge, MA. Retrieved from http://projects.iq.harvard.edu/ptr/files/yanosmithwilkersonbillsurvival.pdf