#1
Just to remind myself that it's possible, here's a sine plotter clone in an animGIF. Obviously, when it's created for the web it needs to be periodic within a few hundred frames; without that implied constraint it can be more flexible. By the way, this is originally a microprogramming classic (to me, at least), e.g. this one was an example. Beyond that, its history goes way back; see Lissajous curves, for instance.
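The post only embeds the finished GIF, but as a rough idea of how such a loop can be made seamless, here is a minimal Python sketch of my own (the resolution, speed, and the Pillow-based approach are assumptions, not the original code): the phase advances by exactly one full period over the frame count, so the last frame wraps back to the first.

```python
# Minimal sketch of a seamlessly looping sine animation GIF.
# Assumptions: small demo resolution, Pillow for drawing and GIF export.
import math
from PIL import Image, ImageDraw

W, H, n_frames = 320, 120, 60
frames = []
for f in range(n_frames):
    # Phase advances by exactly 2*pi over n_frames, so the loop is seamless.
    phase = 2 * math.pi * f / n_frames
    img = Image.new("RGB", (W, H), "black")
    draw = ImageDraw.Draw(img)
    for x in range(W):
        # Three full waves across the width, shifted by the current phase.
        y = H / 2 + (H / 3) * math.sin(2 * math.pi * x / W * 3 + phase)
        draw.point((x, int(y)), fill="white")
    frames.append(img)

frames[0].save("sine.gif", save_all=True, append_images=frames[1:],
               duration=40, loop=0)  # loop=0 -> repeat forever
```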
#2
Alas, this one's a video and I have little control over its positioning, but here we can see how an xgboost forest's predictions for a binary classification problem with a noisy training set respond to more iterations (i.e. new trees), and how much of the problematic overfitting can be prevented by reducing the maximum tree depth. Of course, it's only overfitting as long as those stand-alone points, constituting a few percent of the whole training data, really are noise and not valuable information! In this sense, the max depth acts like some sort of filtering parameter. Below I show the trained model re-evaluated over the training set for depths of 1, 2, 3, and 4. Obviously, someone with a slightly mathematical mindset could then ask: why don't we allow fractional depth values? After all, that's the kind of parameterization most common image processing filters allow (sharpness, blur amount, etc.), so the limitation is hard to miss.
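Since the video isn't reproduced here, a hedged sketch of the kind of experiment described (my own toy setup, not the original one; the data generator, noise level, and hyperparameters are all assumptions) could look like this: a noisy 2-D binary problem, one model per max_depth value, each re-evaluated on its own training set.

```python
# Sketch: how max_depth limits how much label noise the forest can "learn".
# Assumptions: a toy quadrant rule as ground truth, ~5% flipped labels, 200 trees.
import numpy as np
from xgboost import XGBClassifier

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(2000, 2))
y_clean = (X[:, 0] * X[:, 1] > 0).astype(int)     # toy decision rule
noise = rng.random(len(y_clean)) < 0.05           # ~5% of labels flipped as "noise"
y = np.where(noise, 1 - y_clean, y_clean)

for depth in (1, 2, 3, 4):
    model = XGBClassifier(max_depth=depth, n_estimators=200, learning_rate=0.3)
    model.fit(X, y)
    # Re-evaluate on the training set itself: deeper trees chase the noise harder.
    train_acc = model.score(X, y)
    print(f"max_depth={depth}: training accuracy {train_acc:.3f}")
```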
My thought there is that max_depth accomplishes more than one thing, namely at least: 1. it poses an expectation about what can effectively be considered a significant cluster of true values, and 2. it controls the complexity of the rules (of the form if (v_{i_1} and v_{i_2} and ... and v_{i_max_depth}) return 1). This can be both good and bad, but a) some separation is worth a thought, e.g. filtering as a preprocessing step, and b) in practice the preferred values may vary by location, so it can be a very long story...
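To make point 2 concrete, here is a tiny illustration (again my own sketch, with assumed toy data): a tree of depth d can only reach a leaf through at most d threshold comparisons, so each of its rules is an AND of at most d conditions, which is exactly the rule shape above.

```python
# Sketch: dump a single shallow xgboost tree to see its rules as AND-chains.
# Assumptions: same toy quadrant data as before, one tree of depth 2.
import numpy as np
from xgboost import XGBClassifier

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(500, 2))
y = (X[:, 0] * X[:, 1] > 0).astype(int)

model = XGBClassifier(max_depth=2, n_estimators=1)  # a single shallow tree
model.fit(X, y)
# Each printed leaf is reached through at most max_depth comparisons,
# i.e. a rule like "if f0 < t1 and f1 < t2 then score s".
print(model.get_booster().get_dump()[0])
```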
... and that's the intro for now; I have some plans to return to this one in the future:
[To be continued ...]
Sources:
#1
#2