How to Predict NBA Player Shooting Ability with Logistic Regression and KNN Statistical Analysis

This solved homework applies two classification techniques, Logistic Regression and K-Nearest Neighbors (KNN), to predict the shooting ability of NBA players. It walks through the methods and data analysis used to evaluate player performance, with a focus on shooting accuracy.

Problem Statement

We use logistic regression and K-Nearest Neighbors (KNN) to predict the shooting ability of NBA players. The objective is to classify a player as a "good shooter" (coded as 1) or a "not good shooter" (coded as 0) based on five variables: Points Per Game (PTS), Three-Point Percentage (3P%), Free Throw Percentage (FT%), Rebounds per Game (REB), and Assists per Game (AST). The logistic regression shows that, among these variables, only rebounds per game is statistically significant at the 5% level. The homework also explores the limitations of other classification methods, emphasizing the importance of selecting the most accurate model for player evaluation and team decisions.

Solution:

Dependent and Independent Variables

The dependent variable is binary, with "1" representing a "good shooter" and "0" representing a "not good shooter." Our goal in building these models, including logistic regression and K-Nearest Neighbors (KNN), is to understand which variables have the most significant impact on determining a player's shooting ability.

The independent variables used to predict shooting ability include:

  • PTS (Points Per Game): Indicates a player's scoring ability.
  • 3P% (Three-Point Percentage): Measures a player's efficiency in shooting three-pointers.
  • FT% (Free Throw Percentage): Represents the accuracy of free throw shooting.
  • REB (Rebounds per Game): Measures a player's ability to secure missed shots.
  • AST (Assists per Game): Indicates the ability to create scoring opportunities for teammates.

These variables are derived from NBA 2023 data.

Predicting whether a player is a "good shooter" is valuable for coaches, scouts, and team management, as it provides insights into a player's scoring ability, shooting efficiency, and overall offensive contribution to the team. Logistic regression also indicates which variables are most strongly associated with that classification.

Logistic Regression

Table 1. Test of the null hypothesis H0: Pr(Good Shooter=1) = 0.373:

| Statistic | DF | Chi-square | Pr > Chi² |
|---|---|---|---|
| -2 Log(Likelihood) | 5 | 31.447 | <0.0001 |
| Score | 5 | 23.356 | 0.000 |
| Wald | 5 | 12.013 | 0.035 |
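
The p-values in Table 1 can be checked from the chi-square statistics and their 5 degrees of freedom. The following is a minimal sketch using only the Python standard library; the helper name `chi2_sf_odd_df` is introduced here for illustration and relies on the closed-form upper-tail probability that exists for odd degrees of freedom. It is not the software that produced the table.

```python
import math

def chi2_sf_odd_df(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with odd df."""
    if df < 1 or df % 2 == 0:
        raise ValueError("this closed form only covers odd degrees of freedom")
    # df = 1 case: P(X > x) = erfc(sqrt(x / 2))
    tail = math.erfc(math.sqrt(x / 2.0))
    # Each extra pair of degrees of freedom adds x^(j - 1/2) / (1 * 3 * ... * (2j - 1))
    term_sum = 0.0
    double_factorial = 1.0
    for j in range(1, (df - 1) // 2 + 1):
        double_factorial *= 2 * j - 1
        term_sum += x ** (j - 0.5) / double_factorial
    return tail + math.sqrt(2.0 / math.pi) * math.exp(-x / 2.0) * term_sum

# Chi-square statistics copied from Table 1 (all with DF = 5)
table_1 = {"-2 Log(Likelihood)": 31.447, "Score": 23.356, "Wald": 12.013}

for name, statistic in table_1.items():
    print(f"{name}: p = {chi2_sf_odd_df(statistic, 5):.6f}")
```

All three tests reject the null hypothesis at the 5% level, so the five predictors taken together improve on a model that assigns every player the same probability of 0.373.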

Table 2. Type II analysis (Variable Good Shooter):

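| Source | DF | Chi-square (Wald) | Pr > Wald | Chi-square (LR) | Pr > LR |
|---|---|---|---|---|---|
| PTS | 1 | 0.807 | 0.369 | 0.949 | 0.330 |
| 3P% | 1 | 0.758 | 0.384 | 0.898 | 0.343 |
| FT% | 1 | 0.962 | 0.327 | 1.053 | 0.305 |
| REB | 1 | 7.134 | 0.008 | 12.760 | 0.000 |
| AST | 1 | 2.759 | 0.097 | 3.487 | 0.062 |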

Table 3. Model parameters (Variable Good Shooter):

| Source | Value | Standard error | Wald Chi-Square | Pr > Chi² | Wald Lower bound (95%) | Wald Upper bound (95%) | Odds ratio | Odds ratio Lower bound (95%) | Odds ratio Upper bound (95%) |
|---|---|---|---|---|---|---|---|---|---|
| Intercept | 4.636 | 4.163 | 1.240 | 0.265 | -3.524 | 12.796 | | | |
| PTS | -0.159 | 0.177 | 0.807 | 0.369 | -0.505 | 0.188 | 0.853 | 0.603 | 1.206 |
| 3P% | -0.053 | 0.061 | 0.758 | 0.384 | -0.172 | 0.066 | 0.948 | 0.842 | 1.068 |
| FT% | -0.053 | 0.054 | 0.962 | 0.327 | -0.158 | 0.053 | 0.949 | 0.854 | 1.054 |
| REB | 0.800 | 0.299 | 7.134 | 0.008 | 0.213 | 1.386 | 2.225 | 1.237 | 4.000 |
| AST | -0.957 | 0.576 | 2.759 | 0.097 | -2.085 | 0.172 | 0.384 | 0.124 | 1.188 |
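
The columns of Table 3 are linked by standard formulas: the Wald chi-square is the squared ratio of the coefficient to its standard error, the 95% bounds are the coefficient plus or minus about 1.96 standard errors, and the odds ratio is the exponential of the coefficient. The sketch below recomputes these from the reported coefficients and standard errors. The function name `wald_summary` is illustrative, and small differences from the table are expected because the inputs are rounded to three decimals.

```python
import math

# (coefficient, standard error) pairs copied from Table 3
params = {
    "Intercept": (4.636, 4.163),
    "PTS": (-0.159, 0.177),
    "3P%": (-0.053, 0.061),
    "FT%": (-0.053, 0.054),
    "REB": (0.800, 0.299),
    "AST": (-0.957, 0.576),
}

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal distribution

def wald_summary(coef, se):
    """Recompute Wald statistic, p-value, 95% bounds, and odds ratio."""
    wald = (coef / se) ** 2
    # With 1 degree of freedom the chi-square upper tail reduces to erfc
    p_value = math.erfc(math.sqrt(wald / 2.0))
    lower = coef - Z_95 * se
    upper = coef + Z_95 * se
    return {
        "wald": wald,
        "p_value": p_value,
        "lower": lower,
        "upper": upper,
        "odds_ratio": math.exp(coef),
        "odds_ratio_lower": math.exp(lower),
        "odds_ratio_upper": math.exp(upper),
    }

for name, (coef, se) in params.items():
    row = wald_summary(coef, se)
    print(name, {key: round(value, 3) for key, value in row.items()})
```

An odds-ratio interval that contains 1 corresponds to a coefficient interval that contains 0, which is why only REB is significant at the 5% level.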

The logistic regression equation, derived from the coefficients, is:

Pr(Good Shooter=1) = 1 / (1 + exp(-(4.636 - 0.159*PTS - 0.053*3P% - 0.053*FT% + 0.800*REB - 0.957*AST)))
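
As an illustration of how the equation is applied, the sketch below evaluates it for a single player. The function name `prob_good_shooter` and the example player are hypothetical: the PTS, 3P%, FT%, and REB values are chosen near the sample means, the AST value is an arbitrary assumption, and percentages are entered on a 0 to 100 scale.

```python
import math

# Intercept and coefficients taken from the fitted logistic regression
INTERCEPT = 4.636
COEFFICIENTS = {
    "PTS": -0.159,
    "3P%": -0.053,
    "FT%": -0.053,
    "REB": 0.800,
    "AST": -0.957,
}

def prob_good_shooter(stats):
    """Return Pr(Good Shooter = 1) for a dict of per-player statistics."""
    linear = INTERCEPT + sum(
        COEFFICIENTS[name] * stats[name] for name in COEFFICIENTS
    )
    return 1.0 / (1.0 + math.exp(-linear))

# Hypothetical player (illustrative values, not a row from the data set)
player = {"PTS": 12.9, "3P%": 32.8, "FT%": 78.2, "REB": 4.6, "AST": 3.0}
more_rebounds = dict(player, REB=player["REB"] + 1.0)

print(f"baseline probability:        {prob_good_shooter(player):.3f}")
print(f"with one more rebound/game:  {prob_good_shooter(more_rebounds):.3f}")
```

Adding one rebound per game multiplies the odds by exp(0.800), but the change in probability depends on where the player starts on the logistic curve.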

Interpretation of the Coefficient of Predictor Variables

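  • The coefficient of PTS (-0.159, odds ratio 0.853) suggests that, holding the other variables constant, each additional point per game is associated with lower odds of being a good shooter. The effect is not statistically significant (p = 0.369), so the data only indicate that scoring more points does not necessarily make a player a good shooter.
  • The coefficient of 3P% (-0.053, odds ratio 0.948) indicates that a higher three-point shooting percentage is associated with slightly lower odds of being a good shooter, but the effect is not statistically significant (p = 0.384).
  • The coefficient of FT% (-0.053, odds ratio 0.949) suggests that a higher free throw shooting percentage is associated with slightly lower odds of being a good shooter, but the effect is not statistically significant (p = 0.327).
  • The coefficient of REB (0.800, odds ratio 2.225) means that each additional rebound per game multiplies the odds of being a good shooter by about 2.2. This is the only predictor that is significant at the 5% level (p = 0.008).
  • The coefficient of AST (-0.957, odds ratio 0.384) indicates that a higher average number of assists per game is associated with lower odds of being a good shooter. The effect is significant at the 10% level but not at the 5% level (p = 0.097).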

K-Nearest Neighbors (KNN)

Summary statistics for quantitative data/predictors:

| Variable | Observations | Obs. with missing data | Obs. without missing data | Minimum | Maximum | Mean | Std. deviation |
|---|---|---|---|---|---|---|---|
| PTS | 201 | 0 | 201 | 2.500 | 33.400 | 12.915 | 7.149 |
| 3P% | 201 | 0 | 201 | 0.000 | 55.300 | 32.809 | 10.480 |
| FT% | 201 | 0 | 201 | 31.400 | 97.400 | 78.184 | 9.209 |
| REB | 201 | 0 | 201 | 0.800 | 12.400 | 4.608 | 2.419 |
| AST | 201 | 0 | 201 | 0.300 |