3. Parameters
This section describes the parameters and methods available in the xbooster library.
3.1. xbooster.constructor - XGBoost Scorecard Constructor
3.1.1. Description
A class for generating a scorecard from a trained XGBoost model.
3.1.2. Methods
extract_leaf_weights() -> pd.DataFrame
:Extracts the leaf weights from the booster’s trees and returns a DataFrame.
Returns:
pd.DataFrame
: DataFrame containing the extracted leaf weights.
extract_decision_nodes() -> pd.DataFrame
:Extracts the split (decision) nodes from the booster’s trees and returns a DataFrame.
Returns:
pd.DataFrame
: DataFrame containing the extracted split (decision) nodes.
construct_scorecard() -> pd.DataFrame
:Constructs a scorecard based on a booster.
Returns:
pd.DataFrame
: The constructed scorecard.
create_points(pdo=50, target_points=600, target_odds=19, precision_points=0, score_type='XAddEvidence') -> pd.DataFrame
:Creates a points card from a scorecard.
Parameters:
pdo
(int, optional): The points to double the odds. Default is 50.
target_points
(int, optional): The standard scorecard points. Default is 600.
target_odds
(int, optional): The standard scorecard odds. Default is 19.
precision_points
(int, optional): The points decimal precision. Default is 0.
score_type
(str, optional): The log-odds to use for the points card. Default is ‘XAddEvidence’.
Returns:
pd.DataFrame
: The points card.
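The pdo, target_points, and target_odds parameters follow the usual points-to-double-the-odds convention from credit scoring. The sketch below shows that convention only, assuming the standard formulas factor = pdo / ln(2) and offset = target_points - factor * ln(target_odds). The helper names scaling_constants and odds_to_points are hypothetical, and how xbooster distributes points across trees for a given score_type may differ.

```python
import math

def scaling_constants(pdo=50, target_points=600, target_odds=19):
    """Return (factor, offset) under the standard PDO scaling convention (assumed)."""
    factor = pdo / math.log(2)
    offset = target_points - factor * math.log(target_odds)
    return factor, offset

def odds_to_points(odds, pdo=50, target_points=600, target_odds=19, precision_points=0):
    """Map odds to points and round to the requested decimal precision."""
    factor, offset = scaling_constants(pdo, target_points, target_odds)
    return round(offset + factor * math.log(odds), precision_points)

print(odds_to_points(19))  # 600.0: odds equal to target_odds give target_points
print(odds_to_points(38))  # 650.0: doubling the odds adds pdo points
```

With the defaults, odds of 19 map to 600 points and every doubling of the odds adds 50 points.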
predict_score(X: pd.DataFrame) -> pd.Series
:Predicts the score for a given dataset using the constructed scorecard.
Parameters:
X
(pd.DataFrame): Features of the dataset.
Returns:
pd.Series
: Predicted scores.
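Conceptually, scoring with a tree-based points card means finding the rule each row satisfies in every tree and summing the corresponding points. The sketch below illustrates that idea with plain Python. The points card layout, its column names, and the restriction to depth-1 trees are assumptions made for illustration, not the structure xbooster actually produces.

```python
# Hypothetical points card for two depth-1 trees; the layout is illustrative only.
points_card = [
    {"tree": 0, "feature": "age", "sign": "<", "split": 30, "points": 20},
    {"tree": 0, "feature": "age", "sign": ">=", "split": 30, "points": 45},
    {"tree": 1, "feature": "income", "sign": "<", "split": 50000, "points": 15},
    {"tree": 1, "feature": "income", "sign": ">=", "split": 50000, "points": 40},
]

def row_matches(rule, row):
    """Check whether a row satisfies a single split rule."""
    value = row[rule["feature"]]
    if rule["sign"] == "<":
        return value < rule["split"]
    return value >= rule["split"]

def score_row(row, card):
    """Sum the points of every rule the row satisfies (one rule per tree here)."""
    return sum(rule["points"] for rule in card if row_matches(rule, row))

applicants = [{"age": 25, "income": 60000}, {"age": 41, "income": 72000}]
scores = [score_row(row, points_card) for row in applicants]
print(scores)  # [60, 85]
```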
sql_query
(property): Returns the SQL query for deploying the scorecard.
Returns:
str
: The SQL query for deploying the scorecard.
generate_sql_query(table_name: str = "my_table") -> str
:Converts the scorecard into a SQL query.
Parameters:
table_name
(str, optional): The name of the input table in SQL. Default is "my_table".
Returns:
str
: The final SQL query for deploying the scorecard.
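One common way to deploy a points card in SQL is to emit a CASE expression per tree and add the results. The sketch below shows that pattern under stated assumptions: the card layout, the function name sketch_sql_query, and the shape of the query are all hypothetical, and the query xbooster generates is likely to look different.

```python
# Hypothetical points card; the layout is illustrative only.
card = [
    {"tree": 0, "feature": "age", "sign": "<", "split": 30, "points": 20},
    {"tree": 0, "feature": "age", "sign": ">=", "split": 30, "points": 45},
    {"tree": 1, "feature": "income", "sign": "<", "split": 50000, "points": 15},
    {"tree": 1, "feature": "income", "sign": ">=", "split": 50000, "points": 40},
]

def sketch_sql_query(card, table_name="my_table"):
    """Build one CASE block per tree and sum the blocks into a score column."""
    trees = {}
    for rule in card:
        trees.setdefault(rule["tree"], []).append(rule)

    case_blocks = []
    for tree_id in sorted(trees):
        whens = "\n".join(
            f"        WHEN {r['feature']} {r['sign']} {r['split']} THEN {r['points']}"
            for r in trees[tree_id]
        )
        case_blocks.append(f"    CASE\n{whens}\n        ELSE 0\n    END")

    score_expr = "\n    +\n".join(case_blocks)
    return f"SELECT\n    *,\n{score_expr} AS score\nFROM {table_name};"

query = sketch_sql_query(card, table_name="applicants")
print(query)
```

Keeping each tree in its own CASE block mirrors the additive structure of the scorecard and makes the generated query easy to audit tree by tree.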
3.2. xbooster.explainer - XGBoost Scorecard Explainer
This module provides functionalities for explaining XGBoost scorecards, including methods to extract split information, build interaction splits, visualize tree structures, plot feature importances, and more.
3.2.1. Methods
extract_splits_info(features: str) -> list
:Extracts split information from the DetailedSplit feature.
Inputs:
features
(str): A string containing split information.
Outputs:
Returns a list of tuples containing split information (feature, sign, value).
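To make the (feature, sign, value) output concrete, here is a small sketch that parses a split description with a regular expression. The input format (comma-separated conditions such as "age < 30.5") and the function name parse_detailed_split are assumptions; the actual DetailedSplit strings and the parsing in xbooster may differ.

```python
import re

# Matches "<feature> <sign> <number>", e.g. "age < 30.5" (assumed format).
_SPLIT_PATTERN = re.compile(
    r"(\w+)\s*(<=|>=|<|>)\s*(-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)"
)

def parse_detailed_split(features: str) -> list:
    """Return a list of (feature, sign, value) tuples found in the string."""
    return [
        (name, sign, float(value))
        for name, sign, value in _SPLIT_PATTERN.findall(features)
    ]

splits = parse_detailed_split("age < 30.5, income >= 50000")
print(splits)  # [('age', '<', 30.5), ('income', '>=', 50000.0)]
```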
build_interactions_splits(scorecard_constructor: Optional[XGBScorecardConstructor] = None, dataframe: Optional[pd.DataFrame] = None) -> pd.DataFrame
:Builds interaction splits from the XGBoost scorecard.
Inputs:
scorecard_constructor
(Optional[XGBScorecardConstructor]): The XGBoost scorecard constructor.
dataframe
(Optional[pd.DataFrame]): The dataframe containing split information.
Outputs:
Returns a pandas DataFrame containing interaction splits.
split_and_count(scorecard_constructor: Optional[XGBScorecardConstructor] = None, dataframe: Optional[pd.DataFrame] = None, label_column: Optional[str] = None) -> pd.DataFrame
:Splits the dataset and counts events for each split.
Inputs:
scorecard_constructor
(Optional[XGBScorecardConstructor]): The XGBoost scorecard constructor.
dataframe
(Optional[pd.DataFrame]): The dataframe containing features and labels.
label_column
(Optional[str]): The label column in the dataframe.
Outputs:
Returns a pandas DataFrame containing split information and event counts.
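The idea of counting events per split can be sketched without the library: apply each split condition to the rows and tally observations and events among the matches. Everything below (the row layout, the count_events helper, and the output keys) is hypothetical and only illustrates the kind of summary described above, not the columns xbooster returns.

```python
import operator

# Comparison operators supported by this sketch.
_OPS = {"<": operator.lt, ">=": operator.ge}

def count_events(rows, splits, label_column="label"):
    """For each (feature, sign, value) split, count matching rows and events."""
    summary = []
    for feature, sign, value in splits:
        matched = [r for r in rows if _OPS[sign](r[feature], value)]
        events = sum(r[label_column] for r in matched)
        summary.append({
            "split": f"{feature} {sign} {value}",
            "count": len(matched),
            "events": events,
            "non_events": len(matched) - events,
        })
    return summary

rows = [
    {"age": 25, "label": 1},
    {"age": 28, "label": 0},
    {"age": 29, "label": 1},
    {"age": 35, "label": 0},
    {"age": 52, "label": 0},
]
summary = count_events(rows, [("age", "<", 30), ("age", ">=", 30)])
for line in summary:
    print(line)
```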
plot_importance(scorecard_constructor: Optional[XGBScorecardConstructor] = None, metric: str = "Likelihood", normalize: bool = True, method: Optional[str] = None, dataframe: Optional[pd.DataFrame] = None, **kwargs: Any) -> None
:Plots the importance of features based on the XGBoost scorecard.
Inputs:
scorecard_constructor
(Optional[XGBScorecardConstructor]): The XGBoost scorecard constructor.
metric
(str): Metric to plot (“Likelihood” (default), “NegLogLikelihood”, “IV”, or “Points”).
normalize
(bool): Whether to normalize the importance values (default: True).
method
(Optional[str]): The method to use for plotting the importance (“global” or “local”).
dataframe
(Optional[pd.DataFrame]): The dataframe containing features and labels.
fontfamily
(str): The font family to use for the plot (default: “Monospace”).
fontsize
(int): The font size to use for the plot (default: 12).
dpi
(int): The DPI of the plot (default: 100).
title
(str): The title of the plot (default: “Feature Importance”).
**kwargs
(Any): Additional Matplotlib parameters.
plot_score_distribution(y_true: pd.Series = None, y_pred: pd.Series = None, n_bins: int = 25, scorecard_constructor: Optional[XGBScorecardConstructor] = None, **kwargs: Any)
:Plots the distribution of predicted scores based on actual labels.
Inputs:
y_true
(pd.Series): The true labels.
y_pred
(pd.Series): The predicted labels.
n_bins
(int): Number of bins for histogram (default: 25).
scorecard_constructor
(Optional[XGBScorecardConstructor]): The XGBoost scorecard constructor.
**kwargs
(Any): Additional Matplotlib parameters.
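A score distribution split by actual label boils down to histogram counts computed per class over shared bin edges. The sketch below computes those counts in plain Python, assuming binary labels (0 and 1) and equal-width bins between the minimum and maximum score. The function name binned_counts is hypothetical, and the binning used by the library's plot may differ.

```python
def binned_counts(y_true, y_pred, n_bins=25):
    """Count scores per equal-width bin, separately for labels 0 and 1."""
    lo, hi = min(y_pred), max(y_pred)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 when all scores are equal
    counts = {0: [0] * n_bins, 1: [0] * n_bins}
    for label, score in zip(y_true, y_pred):
        # Clamp so that the maximum score lands in the last bin.
        index = min(int((score - lo) / width), n_bins - 1)
        counts[label][index] += 1
    return counts

y_true = [0, 0, 1, 1, 0]
y_pred = [400, 450, 500, 600, 600]
counts = binned_counts(y_true, y_pred, n_bins=4)
print(counts[0])  # [1, 1, 0, 1]
print(counts[1])  # [0, 0, 1, 1]
```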
plot_local_importance(scorecard_constructor: Optional[XGBScorecardConstructor] = None, metric: str = "Likelihood", normalize: bool = True, dataframe: Optional[pd.DataFrame] = None, **kwargs: Any) -> None
:Plots the local importance of features based on the XGBoost scorecard.
Inputs:
scorecard_constructor
(Optional[XGBScorecardConstructor]): The XGBoost scorecard constructor.
metric
(str): Metric to plot (“Likelihood” (default), “NegLogLikelihood”, “IV”, or “Points”).
normalize
(bool): Whether to normalize the importance values (default: True).
dataframe
(Optional[pd.DataFrame]): The dataframe containing features and labels.
fontfamily
(str): The font family to use for the plot (default: “Arial”).
fontsize
(int): The font size to use for the plot (default: 12).
boxstyle
(str): The rounding box style to use for the plot (default: “round”).
title
(str): The title of the plot (default: “Local Feature Importance”).
**kwargs
(Any): Additional parameters to pass to the matplotlib function.
plot_tree(tree_index: int, scorecard_constructor: Optional[XGBScorecardConstructor] = None, show_info: bool = True) -> None
:Plots the tree structure.
Inputs:
tree_index
(int): Index of the tree to plot.
scorecard_constructor
(Optional[XGBScorecardConstructor]): The XGBoost scorecard constructor.
show_info
(bool): Whether to show additional information (default: True).
**kwargs
(Any): Additional Matplotlib parameters.