Simple decision trees

Update (2016-02-09): It has been brought to my attention that one of the decision trees is not being generated correctly, so please don’t use these diagrams. I’m too lazy to debug this code, though.

Decision trees are commonly used in machine learning because they work surprisingly well. A simple algorithm for building a decision tree is ID3: at each node, we choose the feature X that maximizes the information gain IG(X), defined as:

\displaystyle{\begin{align}IG(X)\equiv H(Y) - H(Y|X)\end{align}}

where H is the entropy given by

\displaystyle{\begin{align}H(X) = \sum_i -p(X=i) \log p(X=i)\end{align}}

and the conditional entropy is given by

\displaystyle{\begin{align}H(Y|X) = \sum_i p(X=i) H(Y|X=i)\end{align}}
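The three quantities above translate directly into code. Here is a minimal sketch (not the code behind the demo; the function names are my own):

```python
from collections import Counter
from math import log2

def entropy(values):
    """H(X) = sum_i -p(X=i) log p(X=i), using log base 2 (bits)."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())

def conditional_entropy(y, x):
    """H(Y|X) = sum_i p(X=i) H(Y | X=i)."""
    n = len(x)
    h = 0.0
    for value, count in Counter(x).items():
        subset = [yi for yi, xi in zip(y, x) if xi == value]
        h += (count / n) * entropy(subset)
    return h

def information_gain(y, x):
    """IG(X) = H(Y) - H(Y|X)."""
    return entropy(y) - conditional_entropy(y, x)
```

For a perfectly predictive feature, e.g. `y = ['yes','yes','no','no']` and `x = ['a','a','b','b']`, the conditional entropy is 0 and the information gain equals H(Y) = 1 bit.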

Type the data here (space-delimited; the first column is the output).
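Assuming the same space-delimited layout, the input splits into labels and features like this (the example values are made up):

```python
raw = """\
yes sunny high
yes sunny high
no  rainy high
no  rainy low
"""

rows = [line.split() for line in raw.strip().splitlines()]
labels = [r[0] for r in rows]     # first column is the output
features = [r[1:] for r in rows]  # remaining columns are the input features
```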

1 ID3 Tree
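The recursion behind an ID3 tree is short: greedily split on the feature with maximum information gain, and stop at pure nodes or when the features run out. A self-contained sketch (again, an illustration, not the code that draws the diagram):

```python
from collections import Counter
from math import log2

def entropy(values):
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())

def info_gain(rows, col):
    """rows is a list of (label, features); col indexes into features."""
    labels = [lab for lab, _ in rows]
    n = len(rows)
    h_cond = 0.0
    for val, count in Counter(f[col] for _, f in rows).items():
        subset = [lab for lab, f in rows if f[col] == val]
        h_cond += (count / n) * entropy(subset)
    return entropy(labels) - h_cond

def id3(rows, cols):
    labels = [lab for lab, _ in rows]
    if len(set(labels)) == 1 or not cols:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority label
    best = max(cols, key=lambda c: info_gain(rows, c))
    branches = {}
    for val in set(f[best] for _, f in rows):
        subset = [(lab, f) for lab, f in rows if f[best] == val]
        branches[val] = id3(subset, [c for c in cols if c != best])
    return (best, branches)
```

On the toy data above, column 0 (perfectly predictive) wins the first split, giving a depth-1 tree with pure leaves.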

2 C4.5 Tree
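C4.5 differs from ID3 in its split criterion: it uses the gain ratio, information gain divided by the split information H(X), which penalizes many-valued features. A minimal sketch, reusing the same entropy definition:

```python
from collections import Counter
from math import log2

def entropy(values):
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())

def gain_ratio(y, x):
    """C4.5 criterion: IG(X) / H(X). The split information H(X) grows
    with the number of feature values, penalizing many-valued splits."""
    n = len(x)
    h_cond = sum((c / n) * entropy([yi for yi, xi in zip(y, x) if xi == v])
                 for v, c in Counter(x).items())
    ig = entropy(y) - h_cond
    split_info = entropy(x)
    return ig / split_info if split_info else 0.0
```

For `y = ['yes','yes','no','no']`, a two-valued feature `['a','a','b','b']` scores 1.0, while a four-valued feature `['a','b','c','d']` has the same information gain but scores only 0.5, because its split information is 2 bits.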

3 SVG diagram code

The SVG source is provided here so that you can save the diagrams: copy the following code into your favourite text editor and save it as an SVG file.

3.1 ID3

3.2 C4.5