
The COMPAS risk scales have been a topic of controversy in recent years, with many questioning their accuracy and fairness. The scales were designed to predict the likelihood that a defendant will re-offend if released.
One of the main concerns is that the scales have been shown to be biased against certain groups, such as African Americans and Hispanics. For example, ProPublica found that black defendants were 45% more likely than white defendants to be assigned higher risk scores (and 77% more likely for violent recidivism), even when controlling for prior crimes, age, and gender.
To address these concerns, researchers have been working to develop new risk assessment tools that balance accuracy and equity. The goal is to create models that are both effective at predicting recidivism and fair to all individuals, regardless of their background.
According to the article, the ProPublica study found that 44.9% of African American defendants who did not re-offend were classified as high risk, compared to 23.5% of white defendants. This means that a significantly larger share of African American defendants were incorrectly flagged as high-risk.
COMPAS System
The COMPAS system is a case management tool developed by Northpointe for criminal justice practitioners. It's used to make decisions about jail, sentencing, and parole.
COMPAS calculates two main risk scores: one for general recidivism and another for violent recidivism. These scores convey the probability that someone would re-offend within a certain period of time, usually two years.
The risk scores range from 1 (lowest) to 10 (highest). The size of the weight for each risk factor is determined by the strength of the item's relationship to person offense recidivism in Northpointe's study data.
The COMPAS system assesses risk through statistical algorithms and quantifies it into risk scores. These scores are computed on the basis of multiple data points, including static-historical factors and dynamic-criminogenic factors.
COMPAS has two primary risk models: General Recidivism and Violent Recidivism. The General Recidivism Risk Scale is used to predict new offenses, while the Violent Recidivism Risk Scale focuses on the probability of violent crimes.
The Violent Recidivism Risk Score is calculated by adding weighted risk factors together. These factors include history of noncompliance, education, current age, age-at-first-arrest, and history of violence.
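A minimal sketch of this weighted-sum computation follows. The five factor names come from the article, but the weights and the normalized factor values are hypothetical, since Northpointe's actual weights are proprietary:

```python
# Sketch of a weighted-sum risk score. The factor names come from the
# article; the weights and 0-1 factor values are hypothetical
# (Northpointe's actual weights are proprietary), and the raw sum would
# then be rescaled onto the 1-10 scale.
HYPOTHETICAL_WEIGHTS = {
    "history_of_noncompliance": 0.30,
    "education": 0.15,
    "current_age": -0.25,          # protective: older age lowers the score
    "age_at_first_arrest": -0.20,  # protective: later first arrest lowers it
    "history_of_violence": 0.40,
}

def violent_risk_raw(factors):
    """Weighted sum of normalized (0-1) risk factors."""
    return sum(HYPOTHETICAL_WEIGHTS[name] * value
               for name, value in factors.items())

example = {
    "history_of_noncompliance": 0.5,
    "education": 0.2,
    "current_age": 0.8,
    "age_at_first_arrest": 0.6,
    "history_of_violence": 0.1,
}
print(violent_risk_raw(example))
```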
ProPublica Study
ProPublica's 2016 analysis of COMPAS found that 23.5% of white defendants who didn't re-offend were misclassified as 'high risk' compared to 44.9% of black defendants.
False negatives showed the opposite pattern: 47.7% of white defendants who re-offended were misclassified as 'low risk', compared to 28% of black defendants.
ProPublica also found that COMPAS was correct only ~61% of the time.
Here's a breakdown of the errors:
- False positives: predicted to re-offend but didn't
- False negatives: predicted to not re-offend but did
Black defendants who did not re-offend were more often wrongly predicted to re-offend, while white defendants who did re-offend were more often wrongly predicted not to. The errors thus favored white defendants and disadvantaged black defendants.
Northpointe responded to ProPublica's study, arguing that COMPAS achieves "predictive parity": for any given risk score, white and black defendants re-offend at similar rates.
They also argued that COMPAS achieves "accuracy equity" by making similar accurate predictions for both groups.
However, ProPublica's study suggests that COMPAS fails to achieve "equalised odds" by having dissimilar false positive and false negative rates.
The controversy surrounding COMPAS highlights the complexities of fairness and accuracy in predictive models.
Accuracy, Predictive Parity & Equalised Odds
COMPAS achieves "predictive parity" by making similar predictions for both groups for any specific risk score. This means that the predicted probability of recidivism for a specific risk score is similar across different groups.
Accuracy equity is achieved by COMPAS when its predictions are similarly accurate for different groups, despite those groups having different base rates of recidivism.
Equalised odds, on the other hand, refers to similar error rates for different groups when making predictions. COMPAS fails to achieve equalised odds because it has different false positive and false negative rates for different groups.
Here's a breakdown of the differences in false positive and false negative rates:
According to ProPublica's analysis, 23.5% of whites who didn't re-offend were mis-classified as 'high risk' (score ≥ 5) versus 44.9% of blacks. On the other hand, Northpointe found that among those labeled 'high risk', 41% of whites and 37% of blacks did not re-offend.
It's worth noting that COMPAS achieves "accuracy equity" by making similar predictions about recidivism for different groups, but it fails to achieve "equalised odds" because of the differences in false positive and false negative rates.
Bias in Output due to Input Bias
The COMPAS risk scales are designed to predict recidivism, but the algorithm's output is influenced by the biases present in the input data. This is a common issue in machine learning, where the algorithm learns to replicate the existing patterns and relationships in the data, rather than breaking free from them.
The data used to train the COMPAS algorithm contains a disproportionate number of Black defendants, which affects the algorithm's predictions. In fact, the data shows that 52% of Black defendants re-offended, compared to 39% of White defendants.
The algorithm's predictions are not inherently biased, but they reflect the biases present in the data. This is known as "optimal discrimination", where the algorithm is accurate in predicting the outcomes based on the existing patterns in the data.
A key point to note is that the algorithm's predictions are not necessarily worse than the status quo. In fact, researchers argue that risk assessment tools like COMPAS can help reduce racial bias in the justice system, if used properly.
Here's a summary of the key statistics reported above:
- Recidivism base rate: 52% for Black defendants vs. 39% for White defendants
- False positive rate (didn't re-offend, labeled high risk): 44.9% vs. 23.5%
- False negative rate (re-offended, labeled low risk): 28% vs. 47.7%
These statistics highlight the differences in recidivism rates and error rates between Black and White defendants. The algorithm's predictions are influenced by these differences, which are rooted in the biases present in the data.
Metrics and Evaluation
The accuracy of COMPAS predictions was surprisingly low, coming in at just 61%, a striking figure given the tool's widespread use.
The good news is that COMPAS was similarly accurate for both groups, regardless of their demographic characteristics. On its own, though, similar overall accuracy establishes only "accuracy equity"; it does not make the tool fair or unbiased.
The predicted probability of recidivism for any specific risk score was also similar for both groups, indicating that the tool is not over- or under-predicting recidivism for certain demographics.
PPV, FPR, FNR
PPV, FPR, FNR are key metrics in evaluating fairness and accuracy of prediction models. PPV, or Positive Predictive Value, measures the probability that a person with a positive prediction will re-offend.
A high PPV means that a positive prediction usually corresponds to an actual positive outcome, such as rearrest; it measures how well the model is doing at identifying true positives.
The PPV is defined as the number of true positives divided by the sum of true positives and false positives: PPV = TP / (TP + FP). ProPublica claims that COMPAS is biased, exhibiting different False Positive Rates (FPR) and False Negative Rates (FNR) for White and Black offenders.
False Positive Rates only involve defendants who didn't reoffend, while False Negative Rates only involve defendants who did reoffend. These are different fairness criteria that might not be satisfied together.
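These metrics are simple ratios of confusion-matrix counts. A small sketch, using made-up counts chosen to reproduce the qualitative pattern ProPublica reported (similar PPV, dissimilar error rates) rather than their actual data:

```python
# PPV, FPR, and FNR as ratios of confusion-matrix counts.
def ppv(tp, fp): return tp / (tp + fp)  # P(re-offends | predicted high risk)
def fpr(fp, tn): return fp / (fp + tn)  # share of non-re-offenders flagged high risk
def fnr(fn, tp): return fn / (fn + tp)  # share of re-offenders labeled low risk

# Made-up counts for two groups, chosen so that PPV matches (0.6 for both)
# while FPR and FNR diverge -- the qualitative pattern at issue here.
groups = {
    "Group A": dict(tp=300, fp=200, fn=100, tn=400),
    "Group B": dict(tp=150, fp=100, fn=250, tn=500),
}
for name, c in groups.items():
    print(name,
          f"PPV={ppv(c['tp'], c['fp']):.2f}",
          f"FPR={fpr(c['fp'], c['tn']):.2f}",
          f"FNR={fnr(c['fn'], c['tp']):.2f}")
```

Note that the two groups agree on PPV yet disagree sharply on FPR and FNR, which is exactly why the different fairness criteria can pull in different directions.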
Mock Predictive System
A mock predictive system can be a useful tool for evaluating algorithmic fairness.
The COMPAS system has been a subject of controversy, with different fairness evaluations possible depending on the group parity standards used.
To make these issues more accessible, a toy example called SAPMOC was created.
SAPMOC is designed to illustrate the main source of the COMPAS controversies: applying statistical predictions to populations with different base rates.
A toy population P of N^P = 3000 defendants was divided into two groups, Blue and Green, with N^B = N^G = 1500 individuals each.
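The toy setup can be sketched in a few lines. The group sizes come from the article, but the base rates and classifier accuracy below are illustrative assumptions, not SAPMOC's actual figures. The point it makes: a classifier with identical error rates for both groups necessarily yields different PPVs when base rates differ (and, conversely, forcing equal PPVs would force unequal error rates):

```python
# SAPMOC-style toy population: 3000 defendants in two equal groups.
# Group sizes follow the article; the base rates and the classifier's
# accuracy below are illustrative assumptions, not the paper's figures.
def group_stats(n, base_rate, tpr=0.75, tnr=0.75):
    """PPV and FPR for a group of size n, given a classifier that is
    equally accurate (same TPR/TNR) on every group."""
    pos = n * base_rate            # members who will re-offend
    neg = n - pos
    tp, fp = pos * tpr, neg * (1 - tnr)
    tn = neg * tnr
    return tp / (tp + fp), fp / (fp + tn)

for group, p in {"Blue": 0.6, "Green": 0.3}.items():
    ppv, fpr = group_stats(1500, p)
    print(f"{group}: PPV={ppv:.3f}, FPR={fpr:.2f}")
```

Both groups see the same 25% false positive rate, yet a "high risk" label is far more likely to be correct for the higher-base-rate group.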
Algorithmic Fairness
Predictive parity requires that predictions mean the same thing across groups defined by protected or sensitive attributes, such as gender or ethnicity: among those predicted to re-offend, the actual re-offence rate should be equal for each group.
In the context of COMPAS, predictive parity aims to ensure that the predicted probability of recidivism is similar across different groups.
Accuracy equity refers to the equal treatment of different groups in a dataset when considering the overall accuracy of a predictive model.
By ensuring similar prediction accuracy for Black and White defendants, COMPAS achieves "accuracy equity".
Equalised odds refer to similar error rates for different groups when making predictions.
However, COMPAS fails to achieve "equalised odds" as it has dissimilar outcomes when the algorithm is wrong, with different false positive rates for Black and White defendants.
Northpointe argues that COMPAS demonstrates accuracy equity by achieving a similar "Positive Predictive Value" (PPV) for both groups of defendants.
The variation in false positive rates is attributed to differences in "base rates" of recidivism for different groups.
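This dependence on base rates follows directly from Bayes' rule: if two groups share the same PPV and the same true positive rate (TPR) but have different base rates p, their false positive rates must differ. A quick numeric check, where the PPV and TPR values are illustrative while the base rates are the 52% and 39% figures cited earlier:

```python
def fpr_implied(ppv, tpr, p):
    """False positive rate implied by Bayes' rule when PPV and TPR are fixed:
    PPV = TPR*p / (TPR*p + FPR*(1-p))  =>  FPR = TPR*p*(1-PPV) / (PPV*(1-p))
    where p is the group's base rate of recidivism."""
    return tpr * p * (1 - ppv) / (ppv * (1 - p))

# Illustrative PPV and TPR, held equal for both groups; the base rates
# are the 52% (Black) and 39% (White) figures from the article.
for p in (0.52, 0.39):
    print(f"base rate {p:.2f} -> implied FPR {fpr_implied(0.6, 0.7, p):.3f}")
```

The group with the higher base rate ends up with the higher false positive rate even though both groups get identical PPV and TPR.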
How Recidivism Rates Affect Outcomes
Recidivism rates are a crucial factor in determining the effectiveness of COMPAS risk scales. High recidivism rates indicate that a significant number of individuals are reoffending, which can be a sign of a flawed system.
The COMPAS study found that 59% of individuals who were deemed low-risk by the COMPAS system went on to reoffend within two years, compared with 71% of individuals who were deemed high-risk.
Recidivism rates can be influenced by various factors, including the accuracy and equity of risk assessment tools. The COMPAS system aims to address these issues by providing a more nuanced and accurate assessment of an individual's risk level.
In the COMPAS study, the system was found to be 69% accurate in predicting recidivism rates, which is a significant improvement over previous risk assessment tools. This level of accuracy can help reduce recidivism rates and improve outcomes for individuals and communities.
Three Approaches
The three approaches to building a fair and accurate COMPAS risk scale are data-driven, algorithmic, and hybrid.
The data-driven approach relies on the quality of the data used to train the model, as seen in the example of the Chicago data, which was found to be biased towards white defendants.
A well-designed data-driven approach can lead to more accurate and equitable outcomes, as demonstrated by the use of demographic data to reduce racial bias in the COMPAS model.
In contrast, the algorithmic approach focuses on the mathematical formula used to calculate the risk score, such as the logistic regression model used in the COMPAS algorithm.
However, the algorithmic approach can perpetuate existing biases if the data used to train the model is biased, as seen in the example of the COMPAS model's higher recidivism rates for black defendants.
The hybrid approach combines the strengths of both data-driven and algorithmic approaches, using techniques such as regularization and feature selection to reduce bias and improve accuracy.
This approach can lead to more accurate and equitable outcomes, as seen in the example of the fair COMPAS model, which was developed using a hybrid approach and demonstrated reduced racial bias.
Northpointe's COMPAS
Northpointe's COMPAS is a case management tool used in the criminal justice system for jail, sentencing, and parole decisions. It calculates two main risk scores, one for general recidivism and another for violent recidivism. These scores convey the probability that someone would re-offend within a certain period of time, usually two years. The risk scores range from 1 (lowest) to 10 (highest).
The Violent Recidivism Risk Score is calculated by adding weighted risk factors, including history of noncompliance, education, current age, age-at-first-arrest, and history of violence. Each factor is assigned a weight based on its strength of relationship to recidivism. The weighted items are then added together to calculate the risk score.
COMPAS uses statistical algorithms to assess risk through multiple data points, including static-historical factors and dynamic-criminogenic factors. It also takes into account answers to 137 multiple-choice questions. The General Recidivism Risk Scale predicts new offenses, while the Violent Recidivism Risk Scale focuses on the probability of violent crimes.
COMPAS scale scores are transformed into decile scores by dividing them into ten equally sized groups. Scores in deciles 1-4 are labeled "Low" risk, 5-7 are "Medium", and 8-10 are "High."
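The decile-to-label mapping described above can be sketched directly:

```python
def risk_label(decile: int) -> str:
    """Map a COMPAS decile score (1-10) to the label used in the article:
    deciles 1-4 are "Low", 5-7 are "Medium", and 8-10 are "High"."""
    if not 1 <= decile <= 10:
        raise ValueError("decile must be between 1 and 10")
    if decile <= 4:
        return "Low"
    if decile <= 7:
        return "Medium"
    return "High"

print(risk_label(3), risk_label(6), risk_label(9))  # -> Low Medium High
```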
Chicago's SAID
Chicago's SAID uses six risk factors to estimate an individual's risk of becoming a victim or possible offender in a shooting or homicide within 18 months.
The risk factors used by SAID include being a victim of a shooting incident, age during the latest arrest, and victim of an aggravated battery or assault.
Arrests for unlawful use of a weapon and violent offenses are also considered when assessing an individual's risk.
SAID's goal is to identify individuals who are at a higher risk of being involved in a shooting or homicide, which can help law enforcement and social service agencies target their resources more effectively.
Sources
- https://www.marcellodibello.com/algorithmicfairness/handout/ProPublica-Northpointe.html
- https://medium.com/@lamdaa/compas-unfair-algorithm-812702ed6a6a
- https://link.springer.com/article/10.1007/s00146-022-01441-y
- https://www.semanticscholar.org/paper/COMPAS-Risk-Scales-%3A-Demonstrating-Accuracy-Equity/cb6a2c110f9fe675799c6aefe1082bb6390fdf49
- https://mallika-chawla.medium.com/compas-case-study-investigating-algorithmic-fairness-of-predictive-policing-339fe6e5dd72