Text Mining Legal Documents for Clause Extraction
Vidler, Tony, McGarry, Kenneth and Baglee, David (2023) Text Mining Legal Documents for Clause Extraction. In: The 19th International Conference on Data Science (ICDATA'23), 24-27 Jul 2023, Las Vegas, USA.
Item Type: Conference or Workshop Item (Paper)
Abstract
Natural Language Processing (NLP) solutions for legal contracts have been the preserve of large law firms and other well-resourced industries (e.g., investment banks), which have both the volume and range of legal documents and the manpower to label the training data. Our findings suggest that it is possible to use a smaller volume of training contracts and still generate results that are within an acceptable range. Our results show that a pre-trained language model trained on just 120 training contracts can generate results that are within 10% of the same model trained on 3.3 times that volume. In conclusion, smaller law firms could benefit from machine learning NLP solutions for clause extraction.
PDF: CSCE23-vidler v4.pdf - Accepted Version (716kB)
More Information
Uncontrolled Keywords: NLP, Text Mining, Legal Clauses, Deep Learning, BERT.
Depositing User: Kenneth McGarry
Identifiers
Item ID: 16508
URI: http://sure.sunderland.ac.uk/id/eprint/16508
Official URL: https://icdatascience.org/
Catalogue record
Date Deposited: 21 Aug 2023 10:18
Last Modified: 14 Sep 2023 15:02
Author: Kenneth McGarry
Author: David Baglee
Author: Tony Vidler
University Divisions
Faculty of Technology > School of Computer Science
Subjects
Computing > Data Science
Computing > Artificial Intelligence