Model Interpretability: Feature Importance / SHAP / LIME

1. Feature importance
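
The snippets in this article use `X_train`, `X_test`, `y_train` and `feature_names` without defining them. The following is a minimal setup sketch that makes them available. The choice of scikit-learn's breast cancer dataset and the 80/20 split are assumptions made for illustration only; any tabular binary classification dataset held in a pandas DataFrame should work the same way.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

# Illustrative dataset (an assumption, not taken from the article)
data = load_breast_cancer(as_frame=True)
X, y = data.data, data.target          # X is a DataFrame, y is a Series
feature_names = list(X.columns)

# Hold out 20% of the rows for the explanation examples
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
```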

from sklearn.ensemble import RandomForestClassifier
import pandas as pd

# X_train, y_train and feature_names are assumed to exist already

# Built-in (impurity-based) feature importance of a tree model
rf = RandomForestClassifier(n_estimators=100, random_state=42)
rf.fit(X_train, y_train)
importance = pd.Series(rf.feature_importances_, index=feature_names)
print(importance.nlargest(10))

# XGBoost feature importance
import xgboost as xgb

xgb_clf = xgb.XGBClassifier().fit(X_train, y_train)
xgb.plot_importance(xgb_clf, max_num_features=10)

2. SHAP values

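import shap

# Compute SHAP values for the random forest trained above
explainer = shap.TreeExplainer(rf)
shap_values = explainer.shap_values(X_test)

# Select the values for class 1. Older SHAP releases return a list with one
# array per class; newer ones return a single (samples, features, classes) array.
sv_pos = shap_values[1] if isinstance(shap_values, list) else shap_values[:, :, 1]

# Summary plot
shap.summary_plot(sv_pos, X_test, feature_names=feature_names)

# Single-sample explanation (in a notebook, call shap.initjs() first or pass matplotlib=True)
shap.force_plot(explainer.expected_value[1], sv_pos[0], X_test.iloc[0])

# Dependence plot; replace 'feature_name' with a real column name
shap.dependence_plot('feature_name', sv_pos, X_test)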
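
To make the numbers behind these plots less opaque, here is a small sketch of what a Shapley value is, computed exactly by enumerating every feature coalition. It is not how `shap.TreeExplainer` works internally; the toy `model`, the instance `x`, and the single all-zero `baseline` are hypothetical, and real SHAP explainers average over a background dataset rather than one reference point.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, n_features):
    """Exact Shapley values by brute force (only feasible for a few features)."""
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for size in range(len(others) + 1):
            # Shapley weight for coalitions of this size: |S|! (n-|S|-1)! / n!
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of feature i when added to coalition s
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi


def model(z):
    # Hypothetical model with an interaction between features 0 and 2
    return 2.0 * z[0] + 1.0 * z[1] + 3.0 * z[0] * z[2]


x = [1.0, 2.0, 1.0]          # instance to explain (illustrative)
baseline = [0.0, 0.0, 0.0]   # reference values for "absent" features (illustrative)


def value_fn(coalition):
    # Features in the coalition take their value from x, the rest from the baseline
    z = [x[j] if j in coalition else baseline[j] for j in range(len(x))]
    return model(z)


phi = shapley_values(value_fn, len(x))
print(phi)
print(sum(phi), model(x) - model(baseline))
```

The contributions add up to the difference between the prediction for `x` and the prediction for the baseline, which is the additivity property that force plots visualise. The interaction term is split evenly between the two features involved.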

3. LIME

from lime.lime_tabular import LimeTabularExplainer

lime_explainer = LimeTabularExplainer(
    X_train.values,
    feature_names=feature_names,
    class_names=['class_0', 'class_1'],
    mode='classification',
)

# Explain a single sample
exp = lime_explainer.explain_instance(
    X_test.iloc[0].values,
    rf.predict_proba,
    num_features=10,
)
exp.show_in_notebook()
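
The idea behind LIME can be sketched in a few lines: perturb the instance, weight each perturbed sample by how close it is to the original, and fit a weighted linear model whose coefficients act as the local explanation. The code below is a simplified illustration using only the standard library, not the `lime` package's implementation (which also discretises features, selects a feature subset, and fits a ridge regression). The `black_box` function, the sampling scale, and the kernel width are all assumptions chosen for the example.

```python
import math
import random


def solve_linear_system(a, b):
    """Solve a @ w = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def local_surrogate(predict, x, n_samples=2000, scale=0.1, kernel_width=0.25, seed=0):
    """Fit a proximity-weighted linear model around x.

    Returns [intercept, w_1, ..., w_d], with features centred on x.
    """
    rng = random.Random(seed)
    d = len(x)
    xtwx = [[0.0] * (d + 1) for _ in range(d + 1)]
    xtwy = [0.0] * (d + 1)
    for _ in range(n_samples):
        # Perturb the instance with Gaussian noise
        z = [xi + rng.gauss(0.0, scale) for xi in x]
        dist2 = sum((zi - xi) ** 2 for zi, xi in zip(z, x))
        # Exponential kernel: nearby samples count more
        weight = math.exp(-dist2 / kernel_width ** 2)
        row = [1.0] + [zi - xi for zi, xi in zip(z, x)]
        y = predict(z)
        # Accumulate the weighted normal equations (X^T W X) w = X^T W y
        for i in range(d + 1):
            xtwy[i] += weight * row[i] * y
            for j in range(d + 1):
                xtwx[i][j] += weight * row[i] * row[j]
    return solve_linear_system(xtwx, xtwy)


def black_box(z):
    # Hypothetical nonlinear classifier returning P(class_1)
    return 1.0 / (1.0 + math.exp(-(3.0 * z[0] - z[1] ** 2)))


coef = local_surrogate(black_box, [0.5, 1.0])
print(coef)
```

Near the chosen point the first feature pushes the probability up and the second pushes it down, so the surrogate's coefficients should have opposite signs. This is the kind of per-feature weight that `exp.show_in_notebook()` displays.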

Summary

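| Method | Applicable models | Granularity | Computation speed |
|---|---|---|---|
| Feature importance | Tree models | Global | |
| SHAP | Any model | Global + local | |
| LIME | Any model | Local | |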