Hi, could anyone kindly let me know how to prevent the Decision Trees model from using missing values as a criterion for splitting the tree?
Thanks,
hz
Decision Trees only splits on missing values if you have sparse data. Are you possibly using nested tables? If not, then a column in your table has enough NULLs correlated with your target that splitting on them yields enough information gain to cause a split. You can use NOT NULL, but this will just prevent the model from processing altogether if nulls are present.

Yes, unfortunately, there are a lot of null values. I am not sure what you mean by "use NOT NULL". I have many predictor columns, and their null values are not in sync. If "NOT NULL" is used as a filter for every predictor, there may be no cases left.

Unfortunately, there's no way in SQL 2005 to avoid splitting on a value that appears in the data. It's something for us to think about for future versions (especially the "NULL" case).
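To illustrate the point above about NULLs that are correlated with the target, here is a minimal Python sketch of how a "missing vs. present" split can carry information gain. It is not the algorithm SQL Server uses internally: Shannon entropy is only a stand-in for the product's split score, and the function names and sample data are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def missing_split_gain(values, targets):
    """Information gain from splitting rows into 'missing' vs. 'present'."""
    missing = [t for v, t in zip(values, targets) if v is None]
    present = [t for v, t in zip(values, targets) if v is not None]
    total = len(targets)
    weighted = (
        (len(missing) / total) * entropy(missing)
        + (len(present) / total) * entropy(present)
    )
    return entropy(targets) - weighted


# Hypothetical data: eight cases, balanced target.
targets = ["yes", "yes", "yes", "yes", "no", "no", "no", "no"]

# NULLs line up exactly with target == "yes".
correlated = [None, None, None, None, 5, 7, 6, 8]

# NULLs are spread evenly across both target values.
uncorrelated = [None, 3, None, 4, None, 7, None, 8]

gain_correlated = missing_split_gain(correlated, targets)
gain_uncorrelated = missing_split_gain(uncorrelated, targets)

print(f"gain when NULLs track the target: {gain_correlated:.2f}")
print(f"gain when NULLs are unrelated:    {gain_uncorrelated:.2f}")
```

In the first column, missingness alone separates the two target values, so a split on "missing" looks attractive to any gain-based criterion. In the second column, the same number of NULLs carries no information about the target and would not justify a split.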