In previous posts, I've written about using a neural network and k-nearest neighbor to predict SCOTUS judging. These posts were based on lesson 8 from Prof. Wolfgang Alschner's fantastic course, Data Science for Lawyers. In this lesson, Prof. Alschner reviews several machine learning algorithms and explains how to use them to predict Justice Brennan's voting record.

Logistic regression is not among the algorithms Prof. Alschner discusses. I was curious about how this algorithm would perform. In this post, I apply a logistic regression model to the Justice Brennan dataset. To do this, I use the LogisticRegression() class from scikit-learn.

Logistic Regression

In a nutshell, logistic regression is a supervised machine learning algorithm that can be used for classification. In its simpler forms, it predicts the probability of a datapoint belonging to one group or another.
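To make the "probability" part concrete, here is a minimal sketch of the calculation at the heart of a two-class logistic regression: a weighted sum of the features passed through the sigmoid function. The weights, bias, and feature values below are made up for illustration; they are not taken from the Justice Brennan model.

```python
import numpy as np

def sigmoid(z):
    """Map any real-valued score to a probability between 0 and 1."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical weights and bias for a two-feature model (illustrative only)
weights = np.array([0.8, -0.5])
bias = 0.1

# One hypothetical datapoint with two already-scaled features
features = np.array([1.2, 0.4])

# Linear score, then squash it into a probability for the positive class
score = features @ weights + bias
probability = sigmoid(score)

# Classify as 1 when the probability is at least 0.5, otherwise 0
prediction = int(probability >= 0.5)
print(probability, prediction)
```

Training a model amounts to finding the weights and bias that make these probabilities match the known labels as closely as possible.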

In this post, I use a logistic regression model to predict whether Justice Brennan votes with the majority of the court across 4,746 cases.

The Dataset

You can find an overview of the dataset in my earlier post, Using Artificial Intelligence to Predict SCOTUS Judging.

Importing Dependencies

As usual, we begin by importing our dependencies.

import pandas as pd
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, LabelEncoder
from sklearn.compose import ColumnTransformer

Loading and Splitting the Dataset

The next steps in building this model are very similar to the other models I've reviewed on LitKM to date.

So, we first load the dataset from a CSV and turn it into a DataFrame.

Next, we split this dataset into training features and labels, and testing features and labels.

In lesson 8, each model is trained on Justice Brennan's voting data prior to 1980 and tested on his voting data from 1980 and onwards. So we'll split our dataset likewise.

dataset = ''
dataset = pd.read_csv(dataset)

x = dataset[['term', 'petitioner', 'respondent', 'jurisdiction', 'caseOrigin', 'caseSource', 'certReason', 'issue', 'issueArea']]

y = dataset['vote']

#Features for training
X_train = x[x['term'] < 1980]

#Features for testing
X_test = x[x['term'] > 1979]

#Labels for training (same row mask as the features, so rows stay aligned)
Y_train = y[x['term'] < 1980]

#Labels for testing
Y_test = y[x['term'] > 1979]
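One way to make sure features and labels stay aligned after a split like this is to build both from the same boolean mask and then compare their row indexes. Here is a minimal sketch on a made-up four-row table; the `toy` DataFrame and its values are hypothetical stand-ins for the real dataset.

```python
import pandas as pd

# Tiny made-up table standing in for the real dataset
toy = pd.DataFrame({
    'term': [1978, 1979, 1980, 1981],
    'issueArea': [1, 2, 3, 4],
    'vote': ['majority', 'minority', 'majority', 'majority'],
})

# One boolean mask decides which rows are training rows
is_train = toy['term'] < 1980

toy_X_train = toy.loc[is_train, ['term', 'issueArea']]
toy_X_test = toy.loc[~is_train, ['term', 'issueArea']]
toy_Y_train = toy.loc[is_train, 'vote']
toy_Y_test = toy.loc[~is_train, 'vote']

# Features and labels share the same row index, so they stay aligned
print(toy_X_train.index.equals(toy_Y_train.index))
print(toy_X_test.index.equals(toy_Y_test.index))
```

Because the mask is computed once from the term column, the split does not depend on the rows being sorted by term or on counting rows by hand.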

Normalizing the Dataset

Now we must normalize the dataset. The features consist of numbers, all of varying ranges. So we need to scale them to use the same range.

The labels consist of one of two words, either "majority" or "minority". We must convert these categorical labels into numbers. LabelEncoder assigns codes in sorted order of the label values, so "majority" becomes 0 and "minority" becomes 1.

#Scale features
columns_for_standard = ['term', 'petitioner', 'respondent', 'jurisdiction', 'caseOrigin', 'caseSource', 'certReason', 'issue', 'issueArea']

ct = ColumnTransformer([('numeric', StandardScaler(), columns_for_standard)])

X_train = ct.fit_transform(X_train) 
X_test = ct.transform(X_test)

#Convert categorical labels to numbers
le = LabelEncoder()
Y_train = le.fit_transform(Y_train.astype(str))
Y_test = le.transform(Y_test.astype(str))
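LabelEncoder assigns integer codes in sorted order of the label values, so it is worth checking the mapping rather than assuming it. Here is a small sketch on made-up data that inspects the mapping through the classes_ attribute and shows what StandardScaler does to a single column. The names `toy_labels`, `le_demo`, and `toy_feature` are hypothetical and exist only for this example.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler, LabelEncoder

# Made-up labels in the same two-word format
toy_labels = ['minority', 'majority', 'majority', 'minority']

le_demo = LabelEncoder()
encoded = le_demo.fit_transform(toy_labels)

# classes_ lists the original labels; the position of each label is its code
print(list(le_demo.classes_))
print(list(encoded))

# Made-up single feature column
toy_feature = np.array([[1960.0], [1970.0], [1980.0]])

# StandardScaler subtracts the column mean and divides by the column standard deviation
scaled = StandardScaler().fit_transform(toy_feature)
print(scaled.ravel())
```

After standard scaling, each column has a mean of 0 and a standard deviation of 1, but its values are not confined to any fixed range.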

Now we are ready to create our model and train it on the dataset.

The Logistic Regression Model

We instantiate our model using the LogisticRegression() class from scikit-learn and train it on our dataset using the fit method.

model = LogisticRegression()
model.fit(X_train, Y_train)

Now that our model is trained, let's see how well it predicts Justice Brennan's voting record based on the training data.

print(model.score(X_train, Y_train))

Not bad. Using voting data from prior to 1980 only, the model predicts Justice Brennan's voting record with approximately 82% accuracy.

But, trained on this data, how well does the model predict Justice Brennan's voting from 1980 onwards?

print(model.score(X_test, Y_test))

The model achieves approximately 60% accuracy. This is better than a coin toss, but not as good as the other algorithms I've reviewed on LitKM. Both KNN and the neural network achieved 69% accuracy.
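A coin toss is one yardstick. Another common one is a baseline that always predicts the most frequent class. Here is a minimal sketch using scikit-learn's DummyClassifier on made-up labels; the label counts are hypothetical, so the number it prints says nothing about the Justice Brennan dataset.

```python
import numpy as np
from sklearn.dummy import DummyClassifier

# Made-up encoded labels: 7 of one class, 3 of the other (illustrative only)
toy_y = np.array([0, 0, 0, 0, 0, 0, 0, 1, 1, 1])

# The baseline ignores the features, so placeholder values are enough
toy_X = np.zeros((len(toy_y), 1))

# Always predict whichever class was most frequent in the data it was fitted on
baseline = DummyClassifier(strategy='most_frequent')
baseline.fit(toy_X, toy_y)

baseline_accuracy = baseline.score(toy_X, toy_y)
print(baseline_accuracy)
```

Fitting the same baseline on the real training labels and scoring it on the testing labels would show how much of the model's accuracy comes from the features rather than from the class balance.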

When working with KNN, it occurred to me that perhaps the dataset, as normalized per the approach detailed above, was incomplete. Specifically, the features and labels were not scaled to the same range. Whereas the labels use a range of 0 to 1, the features do not.

Below I reproduce the code to scale the features to a range of 0 to 1, same as the labels, and fit a new logistic regression model to this updated dataset.

#Convert the features from numpy arrays to DataFrames in prep for min-max scaling, and ease of review
X_train = pd.DataFrame(X_train)
X_test = pd.DataFrame(X_test)

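#Apply min-max scaling to the training features
#(one global minimum and maximum across all columns)
train_min = X_train.min().min()
train_max = X_train.max().max()
X_train = (X_train - train_min) / (train_max - train_min)

#Apply min-max scaling to the testing features
#(uses the testing set's own global minimum and maximum)
test_min = X_test.min().min()
test_max = X_test.max().max()
X_test = (X_test - test_min) / (test_max - test_min)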

#Convert the labels to DataFrames from numpy arrays (because the features are now DataFrames)
Y_train = pd.DataFrame(Y_train)
Y_test = pd.DataFrame(Y_test)

#Fit a new logistic regression model with the updated dataset
model = LogisticRegression()
model.fit(X_train, Y_train.values.ravel())
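For comparison, scikit-learn also ships a MinMaxScaler class. It differs from the manual approach above in two ways: it scales each column separately, and it can be fitted on the training features only and then reused on the testing features. The sketch below uses made-up arrays in place of the real features. It is a different technique from the code above, and I have not tested whether it changes the results on the Justice Brennan dataset.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Made-up feature matrices standing in for the real training and testing features
toy_train = np.array([[0.0, 10.0], [5.0, 20.0], [10.0, 30.0]])
toy_test = np.array([[2.5, 15.0], [12.0, 40.0]])

# Default feature_range is (0, 1), applied column by column
scaler = MinMaxScaler()

# Learn each column's minimum and maximum from the training data only
toy_train_scaled = scaler.fit_transform(toy_train)

# Reuse the training minimum and maximum on the testing data
toy_test_scaled = scaler.transform(toy_test)

print(toy_train_scaled)
print(toy_test_scaled)
```

Note that testing values outside the training range land outside 0 to 1 with this approach, since the scaler only knows the training minimum and maximum.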

So, how does this new model perform?

print(model.score(X_train, Y_train))

On the training data, the model performs slightly better.

But check out the results on the testing data:

print(model.score(X_test, Y_test))

Big drop! Almost 4%.

Final Thoughts

With KNN, normalizing the features to likewise use a range of 0 to 1 boosted performance by about 1%. But for logistic regression, this same change decreased performance, and by a fairly significant amount. Candidly, at this point I have no idea why this occurred. Any ideas? I'm all ears!