2017-08-06

I was reading an XGBoost notebook, and the xgb.plot.tree command in its example produces a picture like this:

[image: xgb.plot.tree layout in R]

However, when I run the same command, I get a picture like the one below, which contains two separate graphs that are also drawn in different colors.

[image]

Is this normal? Are the two graphs two trees?

Answer


I have the same problem. According to an issue in the xgboost GitHub repository, this could be caused by a change in the DiagrammeR library, which xgboost uses to render trees: https://github.com/dmlc/xgboost/issues/2640

Instead of modifying the "dgr_graph" object with DiagrammeR commands, I decided to create a new version of the xgb.plot.tree function that sets the font color of the nodes directly. It was enough to add the parameter fontcolor="black" to the nodes <- DiagrammeR::create_node_df line:

    library(data.table)  # `:=` and the [i, j] syntax used on the tree table
    library(magrittr)    # the %>% pipe

    xgb.plot.tree <- function(feature_names = NULL, model = NULL, n_first_tree = NULL,
                              plot_width = NULL, plot_height = NULL, ...)
    {
      if (!inherits(model, "xgb.Booster")) {
        stop("model: Has to be an object of class xgb.Booster generated by the xgb.train function.")
      }
      if (!requireNamespace("DiagrammeR", quietly = TRUE)) {
        stop("DiagrammeR package is required for xgb.plot.tree", call. = FALSE)
      }

      # One row per node of the selected trees
      allTrees <- xgb.model.dt.tree(feature_names = feature_names,
                                    model = model, n_first_tree = n_first_tree)

      # Node label, shape and fill color; leaves are drawn differently from splits
      allTrees[, `:=`(label, paste0(Feature, "\\nCover: ", Cover, "\\nGain: ", Quality))]
      allTrees[, `:=`(shape, "rectangle")][Feature == "Leaf", `:=`(shape, "oval")]
      allTrees[, `:=`(filledcolor, "Beige")][Feature == "Leaf", `:=`(filledcolor, "Khaki")]

      # fontcolor = "black" is the addition that makes the node text readable again
      nodes <- DiagrammeR::create_node_df(
        n = length(allTrees[, ID] %>% rev),
        label = allTrees[, label] %>% rev,
        style = "filled",
        color = "DimGray",
        fillcolor = allTrees[, filledcolor] %>% rev,
        shape = allTrees[, shape] %>% rev,
        data = allTrees[, Feature] %>% rev,
        fontname = "Helvetica",
        fontcolor = "black")

      # Two edges per split node: first all "Yes" branches (labelled), then all "No" branches
      edges <- DiagrammeR::create_edge_df(
        from = match(allTrees[Feature != "Leaf", c(ID)] %>% rep(2), allTrees[, ID] %>% rev),
        to = match(allTrees[Feature != "Leaf", c(Yes, No)], allTrees[, ID] %>% rev),
        label = allTrees[Feature != "Leaf", paste("<", Split)] %>%
          c(rep("", nrow(allTrees[Feature != "Leaf"]))),
        color = "DimGray", arrowsize = "1.5", arrowhead = "vee",
        fontname = "Helvetica", rel = "leading_to")

      graph <- DiagrammeR::create_graph(nodes_df = nodes, edges_df = edges)
      DiagrammeR::render_graph(graph, width = plot_width, height = plot_height)
    }
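
To make the index arithmetic in the create_edge_df call easier to follow, here is a minimal base-R sketch of the same match/rev logic. The table `tree`, its IDs, and its values are made up for illustration and only mimic a few columns of the xgb.model.dt.tree output; a plain data.frame stands in for the data.table.

```r
# Hypothetical node table for a single tree with one split (values are made up).
tree <- data.frame(
  ID      = c("0-0", "0-1", "0-2"),
  Feature = c("age", "Leaf", "Leaf"),
  Split   = c(30, NA, NA),
  Yes     = c("0-1", NA, NA),
  No      = c("0-2", NA, NA),
  stringsAsFactors = FALSE
)

# Nodes are created in reversed row order, so graph node i corresponds to rev(ID)[i].
node_ids <- rev(tree$ID)

# Only split nodes have outgoing edges.
splits <- tree[tree$Feature != "Leaf", ]

# Each split node contributes two edges: one to its "Yes" child, one to its "No" child.
from  <- match(rep(splits$ID, 2), node_ids)
to    <- match(c(splits$Yes, splits$No), node_ids)

# Only the "Yes" edges carry the "< split value" label; the "No" edges stay unlabelled.
label <- c(paste("<", splits$Split), rep("", nrow(splits)))

edges <- data.frame(from = from, to = to, label = label, stringsAsFactors = FALSE)
print(edges)
```

In this toy table the split node ends up as graph node 3, so both edges start at node 3 and point to node 2 (the "Yes" child) and node 1 (the "No" child).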

After that, all that remains is to adjust a few parameters to improve the readability of the graph. Below is the version of the function I use to display the first tree of my xgboost model.

    library(data.table)  # `:=` and the [i, j] syntax used on the tree table
    library(magrittr)    # the %>% pipe

    xgb.plot.tree <- function(feature_names = NULL, model = NULL, n_first_tree = NULL,
                              plot_width = NULL, plot_height = NULL, ...)
    {
      if (!inherits(model, "xgb.Booster")) {
        stop("model: Has to be an object of class xgb.Booster generated by the xgb.train function.")
      }
      if (!requireNamespace("DiagrammeR", quietly = TRUE)) {
        stop("DiagrammeR package is required for xgb.plot.tree", call. = FALSE)
      }

      # One row per node of the selected trees
      allTrees <- xgb.model.dt.tree(feature_names = feature_names,
                                    model = model, n_first_tree = n_first_tree)

      # Round the statistics so the node labels stay short
      allTrees$Quality <- round(allTrees$Quality, 3)
      allTrees$Cover <- round(allTrees$Cover, 3)

      # Node label, shape and fill color; leaves are drawn as "egg" shapes
      allTrees[, `:=`(label, paste0(Feature, "\\nCover: ", Cover, "\\nGain: ", Quality))]
      allTrees[, `:=`(shape, "rectangle")][Feature == "Leaf", `:=`(shape, "egg")]
      allTrees[, `:=`(filledcolor, "Beige")][Feature == "Leaf", `:=`(filledcolor, "Khaki")]

      # Wider nodes (width = 1.5) with black text
      nodes <- DiagrammeR::create_node_df(
        n = length(allTrees[, ID] %>% rev),
        label = allTrees[, label] %>% rev,
        style = "filled", width = 1.5,
        color = "DimGray",
        fillcolor = allTrees[, filledcolor] %>% rev,
        shape = allTrees[, shape] %>% rev,
        data = allTrees[, Feature] %>% rev,
        fontname = "Helvetica",
        fontcolor = "black")

      # Longer edges (minlen) and a larger edge font (fontsize)
      edges <- DiagrammeR::create_edge_df(
        from = match(allTrees[Feature != "Leaf", c(ID)] %>% rep(2), allTrees[, ID] %>% rev),
        to = match(allTrees[Feature != "Leaf", c(Yes, No)], allTrees[, ID] %>% rev),
        label = allTrees[Feature != "Leaf", paste("<", Split)] %>%
          c(rep("", nrow(allTrees[Feature != "Leaf"]))),
        color = "DimGray", arrowsize = 1, arrowhead = "vee", minlen = "5",
        fontname = "Helvetica", rel = "leading_to", fontsize = "15")

      # attr_theme = NULL disables DiagrammeR's default graph theme
      graph <- DiagrammeR::create_graph(nodes_df = nodes, edges_df = edges, attr_theme = NULL)

      # The widget must be printed explicitly, because it is not the function's return value
      print(DiagrammeR::render_graph(graph, width = plot_width, height = plot_height))

      # Return the graph object invisibly so it can be modified further
      invisible(graph)
    }
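
The node labels in this version stay short because Cover and Quality are rounded before they are pasted together. A minimal base-R sketch of that step follows; the names `feature`, `cover`, and `gain` and their values are made up for illustration.

```r
# Hypothetical values for one split node (not taken from a real model).
feature <- "age"
cover   <- 1234.56789
gain    <- 0.123456

# In R source, "\\n" is a literal backslash followed by "n";
# Graphviz interprets that two-character sequence as a line break inside the label.
label <- paste0(feature,
                "\\nCover: ", round(cover, 3),
                "\\nGain: ", round(gain, 3))

# cat() shows the string as Graphviz receives it.
cat(label, "\n")
```

Rounding to three decimals is the choice made in the function above; a different number of digits only changes how much text each node has to hold.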