ZhaoHui Tang wrote: |
We have multiple scoring methods for tree splits. Score_Method = 1 is Entropy, 3 is Bayesian with K2 prior, 4 is Bayesian Dirichlet Equivalent with uniform prior. 4 is the default setting.
|
|
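To make the difference between the entropy score and a Bayesian score concrete, here is a minimal Python sketch that scores one candidate split both ways. It uses the textbook information-gain formula and the textbook K2 marginal likelihood; whether the SQL Server implementation computes them exactly this way is an assumption, and the function names `entropy_gain` and `log_k2` are made up for illustration.

```python
from math import lgamma, log2

def entropy_gain(table):
    """Information gain in bits; table[j][k] = count of output state k in branch j."""
    def entropy(counts):
        n = sum(counts)
        return -sum(c / n * log2(c / n) for c in counts if c)
    totals = [sum(col) for col in zip(*table)]  # output counts before the split
    n = sum(totals)
    after = sum(sum(row) / n * entropy(row) for row in table if sum(row))
    return entropy(totals) - after

def log_k2(table):
    """Log K2 score: sum_j [ln (r-1)! - ln (N_j + r - 1)! + sum_k ln N_jk!]."""
    r = len(table[0])  # number of output states
    score = 0.0
    for row in table:
        score += lgamma(r) - lgamma(sum(row) + r)
        score += sum(lgamma(c + 1) for c in row)
    return score

# Hypothetical counts: 20 cases, two output states, one binary candidate split.
unsplit = [[9, 11]]
split = [[8, 2], [1, 9]]
gain = entropy_gain(split)
k2_improvement = log_k2(split) - log_k2(unsplit)
```

With these counts both scores favor the split: the information gain is positive and the log K2 score of the split table is higher than that of the unsplit node.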
Peter Kim [MS] wrote: |
As far as the split method is concerned, nothing has changed between SQL 2000 and SQL 2005. The options are still SIMPLE_BINARY (1), COMPLETE (2), and BOTH (3). However, automatic feature selection in decision trees has changed completely. In SQL 2000, feature selection of input variables was based on the entropy of each variable, without considering any correlation with the output variables. We found that feature selection based on a Bayesian split score (output vs. each input) often brings a significant improvement in accuracy, and that is what is implemented in SQL 2005. The "score method" did not change, except that ORTHOGONAL (2) was dropped in SQL 2005.
|
|
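As a rough illustration of feature selection by Bayesian score (output vs. each input), the Python sketch below ranks discrete inputs by how much they improve the log BDeu score of the output over having no input at all. This is the textbook Bayesian Dirichlet Equivalent score with a uniform prior, not the actual SQL Server code; the equivalent sample size `ess`, the helper names, and the toy data are all assumptions.

```python
from collections import Counter
from math import lgamma

def log_bdeu(table, ess=1.0):
    """Log BDeu score; table[j][k] = count of output state k for input state j."""
    q, r = len(table), len(table[0])
    a_j, a_jk = ess / q, ess / (q * r)  # uniform Dirichlet hyperparameters
    score = 0.0
    for row in table:
        score += lgamma(a_j) - lgamma(a_j + sum(row))
        score += sum(lgamma(a_jk + c) - lgamma(a_jk) for c in row)
    return score

def contingency(xs, ys):
    """Build the input-by-output count table from two equal-length columns."""
    x_states, y_states = sorted(set(xs)), sorted(set(ys))
    counts = Counter(zip(xs, ys))
    return [[counts[(x, y)] for y in y_states] for x in x_states]

def rank_inputs(inputs, output, ess=1.0):
    """Sort inputs by score improvement over the output with no input, best first."""
    baseline = log_bdeu(contingency([0] * len(output), output), ess)
    gains = {name: log_bdeu(contingency(col, output), ess) - baseline
             for name, col in inputs.items()}
    return sorted(gains.items(), key=lambda kv: kv[1], reverse=True)

# Toy data (hypothetical): one input mirrors the output, the other is unrelated.
output = [1, 1, 1, 1, 0, 0, 0, 0]
inputs = {
    "noise": [0, 1, 0, 1, 0, 1, 0, 1],
    "predictive": [1, 1, 1, 1, 0, 0, 0, 0],
}
ranked = rank_inputs(inputs, output)
```

Here `ranked` lists `predictive` first with a positive improvement, while `noise` scores below the baseline. A selection step built on this idea would keep only the top-scoring inputs, whereas a per-variable entropy measure never looks at the output and cannot tell the two inputs apart.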