Unlocking the Power of Deep Learning for Clause Classification: Revolutionizing Commercial Applications
In the dynamic landscape of commercial operations, efficiency and accuracy in document processing are paramount. Traditional methods of analyzing legal clauses and contracts have often been time-consuming and prone to human error. However, with the advent of deep learning technologies, particularly in the realm of clause classification, a new era of automation and precision has emerged.
This is a fine-tuned version of "google-bert/bert-base-cased" for clause classification, trained on more than 3,200 clause examples extracted from contracts annotated by the Atticus Project [https://www.atticusprojectai.org/].
Through initiatives like the Atticus Project and ongoing advances in AI, deep learning is set to play a pivotal role in commercial document analysis, unlocking efficiency, insight, and value from the vast body of text that drives the global economy.
Real-World Applications
In practice, the integration of deep learning for clause classification extends across various industries:
Legal Services: Law firms and legal departments leverage deep learning to streamline contract review processes and extract key information efficiently.
Finance and Insurance: Deep learning models assist in analyzing complex financial agreements, identifying clauses related to risk factors, liabilities, and compliance.
Healthcare and Pharmaceuticals: Companies in highly regulated sectors use deep learning for analyzing patient contracts, supplier agreements, and regulatory documents.
test_accuracy: 88%
Labels:
"0": "Anti-Assignment",
"1": "Audit_Rights",
"2": "Cap_On_Liability",
"3": "Covenant_Not_To_Sue",
"4": "Effective_Date",
"5": "Expiration_Date",
"6": "Governing_Law",
"7": "Insurance",
"8": "License_Grant",
"9": "Non-Transferable_License",
"10": "Notice_ Period_To_Terminate_Renewal",
"11": "Parties",
"12": "Post-Termination_Services",
"13": "Renewal_Term",
"14": "Revenue/Profit_Sharing",
"15": "Uncapped_Liability",
"16": "Warranty_Duration"
Usage
To load the model, first install the transformers library in your environment:
pip install transformers
from transformers import pipeline
classifier = pipeline("text-classification", model="mauro/bert-base-uncased-finetuned-clause-type")
Pipelines are the easiest way to use a model.
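For intuition, the sketch below shows roughly what a text-classification pipeline does after the model has run: it turns one raw score (logit) per class into probabilities with a softmax and reports the highest-scoring label. This is a minimal illustration in plain Python, assuming the usual single-label setup. The names `ID2LABEL`, `softmax`, and `decode` and the logit values are hypothetical and are not taken from the model's code.

```python
import math

# Label ids copied verbatim from the "Labels" list above.
ID2LABEL = {
    0: "Anti-Assignment",
    1: "Audit_Rights",
    2: "Cap_On_Liability",
    3: "Covenant_Not_To_Sue",
    4: "Effective_Date",
    5: "Expiration_Date",
    6: "Governing_Law",
    7: "Insurance",
    8: "License_Grant",
    9: "Non-Transferable_License",
    10: "Notice_ Period_To_Terminate_Renewal",
    11: "Parties",
    12: "Post-Termination_Services",
    13: "Renewal_Term",
    14: "Revenue/Profit_Sharing",
    15: "Uncapped_Liability",
    16: "Warranty_Duration",
}


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits):
    """Return the most likely label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return {"label": ID2LABEL[best], "score": probs[best]}


# Hypothetical logits: every class at 0.0 except index 6, which is boosted.
example_logits = [0.0] * len(ID2LABEL)
example_logits[6] = 5.0
print(decode(example_logits))
```

The returned dictionary has the same `label` and `score` keys that the pipeline reports for each prediction.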
This is an example clause:
clause = """ The foregoing license shall be transferable or sublicensable by Parent Group solely
to a Permitted Party and subject to the restrictions herein with any sale or transfer of a
Parent business that utilizes the Licensed SpinCo IP If Parent enters an agreement to transfer
the License_Granted to it under this Section 3 1 in connection with any sale or transfer of a
Parent business then SpinCo and members of the SpinCo Group shall be made third party
beneficiaries under such transfer agreement to enforce breaches of the license
3 If SpinCo enters an agreement to transfer the License_Granted to it under this
Section 3 2 in connection with any sale or transfer of a SpinCo business then Parent
and members of the Parent Group shall be made third party beneficiaries under such transfer
agreement to enforce breaches of the license Such agreement shall prohibit any further
sublicensing or transfer of rights by the Permitted Party or in the case of a sale or
transfer of a Parent business the transferee or any use of the Licensed SpinCo IP outside
the scope of the License_Granted to Parent herein Such agreement shall prohibit any further
transfer of rights by such party or any use of the transferred Intellectual Property outside the
scope of the License_Granted to SpinCo herein"""
classifier(clause, return_all_scores=False)
The result will be:
[{'label': 'Non-Transferable_License', 'score': 0.989809513092041}]
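In a review workflow it can be useful to accept only confident predictions and route the rest to a human. The sketch below applies a score threshold to a result of the shape shown above. The helper name `accept_prediction` and the 0.9 default threshold are assumptions chosen for illustration, not values recommended by the model authors.

```python
def accept_prediction(results, threshold=0.9):
    """Return the predicted label if its score meets the threshold, else None."""
    top = results[0]  # single-input call returns a one-element list
    if top["score"] >= threshold:
        return top["label"]
    return None


# Result shown above for the example clause.
result = [{'label': 'Non-Transferable_License', 'score': 0.989809513092041}]

print(accept_prediction(result))                  # Non-Transferable_License
print(accept_prediction(result, threshold=0.99))  # None
```

A suitable threshold depends on the cost of a wrong label in your application and is best chosen on held-out data.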
Visualization
This step requires Matplotlib and pandas:
pip install matplotlib pandas
import matplotlib.pyplot as plt
import pandas as pd

# get the scores for every label, not only the top one
preds = classifier(clause, return_all_scores=True)
# build a DataFrame with one row per label
df = pd.DataFrame([[x['label'], x['score']] for x in preds[0]], columns=['label', 'score'])
# plot the probability distribution
plt.bar(df['label'], df['score'])
plt.xlabel('label')
plt.ylabel('score')
plt.title('Probability distribution over all clause types')
plt.xticks(rotation=90)
plt.tight_layout()  # keep the rotated tick labels inside the figure
plt.show()
The result is a bar chart showing the probability distribution over all classes.
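When a chart is more than you need, the same scores can be ranked in plain Python. The sketch below assumes the nested list shape used in the visualization code (one inner list of label/score dictionaries per input). The helper name `top_k_labels` is hypothetical, and the sample scores are made up for illustration; real values come from the classifier.

```python
def top_k_labels(preds, k=3):
    """Return the k highest-scoring entries from the all-scores output."""
    scores = preds[0]  # scores for the first (and only) input
    ranked = sorted(scores, key=lambda x: x["score"], reverse=True)
    return ranked[:k]


# Made-up scores for illustration only, not actual model output.
sample_preds = [[
    {"label": "License_Grant", "score": 0.006},
    {"label": "Non-Transferable_License", "score": 0.99},
    {"label": "Anti-Assignment", "score": 0.004},
]]

for entry in top_k_labels(sample_preds, k=2):
    print(f"{entry['label']}: {entry['score']:.3f}")
```

Looking at the runner-up labels is a quick way to spot clauses the model finds ambiguous.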
License: Apache-2.0