Machine learning algorithms generally operate in feature space

Feature space: what is it?
  - Like the test

Mapping problems into feature space
  - What features describe your problem well?

Terms for supervised machine learning:
  - supervised
  - examples
  - training set
  - test set
  - feature vector

Let's make up an algorithm! The detect_best_feature algorithm:
  - Pick the best feature (in the training set)
  - Classify based on that feature (in the test set)
  - (a minimal sketch appears after this outline)

A couple of actual algorithms use that idea:
  - Decision trees: follow it with a decision on a different feature
  - Naive Bayes: average over many features, with weights

Finding a separating plane in feature space: graphical classification
  - How many dimensions?
  - Does it work in whatever dimensionality we have?
  - What if we make more features?
  - Then we can map the problem into a high-dimensional feature space!
  - (a separating-plane sketch appears after this outline)

"Kernel"
  - It's got rather a lot of meanings!
  - Generally speaking, it contains something only exposed through a carefully crafted interface
    - the OS kernel
    - the kernels I'm about to talk about

Kernel methods in AI:
  - Generally, these take a couple of examples as input
  - ...and produce a similarity value as output
  - So, what if the kernel took care of calculating distance in a high-dimensional feature space?
  - SVM has a couple of common kernels for this (sketched below)
  - The kernel takes an example; it might not even need a feature vector!

OK, applying this to images:
  - We have to find features somehow
  - Pixels can be features, with these issues:
    - scale
    - rotation
    - translation
    - brightness and contrast
    - excessive or inadequate resolution
  - We could use something totally different:
    - Classify "forest at night" pictures
    - Color binning: each bin is a feature in a normalized feature vector (sketched below)
  - For a thing in a picture: how about radial distances?
    - Stars vs. circles
    - Distribution of radii (binned), for rotational invariance (sketched below)

How do you choose what features? Feature selection algorithms!
  - Use entropy on each feature, discard useless ones (training set only)
  - Keep the best X features
  - (a sketch appears after this outline)
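
The made-up detect_best_feature idea can be written down in a few lines. This is a minimal sketch, not anyone's production code: it assumes numeric features and 0/1 labels, scores every feature with a one-feature threshold rule (a decision stump) on the training set, then classifies the test set with the winning rule. The toy data is invented purely for illustration.

```python
# Sketch of detect_best_feature: pick the single best feature on the training
# set, then classify the test set using only that feature.

def stump_accuracy(xs, ys, threshold):
    """Accuracy of the rule 'predict 1 if x >= threshold else 0' on (xs, ys)."""
    correct = sum(1 for x, y in zip(xs, ys) if (1 if x >= threshold else 0) == y)
    return correct / len(ys)

def detect_best_feature(train_X, train_y):
    """Return (feature_index, threshold) of the best single-feature rule."""
    best = (0, 0.0, -1.0)                    # (feature index, threshold, accuracy)
    for f in range(len(train_X[0])):
        xs = [row[f] for row in train_X]
        for t in sorted(set(xs)):            # try each observed value as a threshold
            acc = stump_accuracy(xs, train_y, t)
            if acc > best[2]:
                best = (f, t, acc)
    return best[0], best[1]

def classify(test_X, feature, threshold):
    """Classify test examples using only the chosen feature."""
    return [1 if row[feature] >= threshold else 0 for row in test_X]

if __name__ == "__main__":
    # toy training set: feature 1 is informative, feature 0 is noise
    train_X = [[3.1, 0.2], [2.9, 0.3], [3.0, 0.9], [3.2, 1.1]]
    train_y = [0, 0, 1, 1]
    f, t = detect_best_feature(train_X, train_y)
    print("best feature:", f, "threshold:", t)
    print("predictions:", classify([[3.0, 0.1], [3.0, 1.0]], f, t))
```

A decision tree keeps going from here (split again on another feature inside each branch), while Naive Bayes combines evidence from all the features instead of trusting just one.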
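For "finding a separating plane", the outline doesn't name an algorithm, so here is one common choice as an illustration: the perceptron update rule, which learns a hyperplane w·x + b = 0 and runs unchanged in any number of dimensions. The 2-D points, learning rate, and epoch count are made-up assumptions.

```python
# Sketch of learning a separating plane with the perceptron rule.

def train_perceptron(X, y, epochs=20, lr=0.1):
    """y must be +1 / -1. Returns weights w and bias b of the separating plane."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            activation = sum(wj * xj for wj, xj in zip(w, xi)) + b
            if yi * activation <= 0:          # misclassified: nudge the plane toward xi
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b

# toy, linearly separable 2-D data
X = [[1.0, 1.0], [1.5, 2.0], [4.0, 4.5], [5.0, 4.0]]
y = [-1, -1, 1, 1]
w, b = train_perceptron(X, y)
print(w, b)
print([1 if sum(wj * xj for wj, xj in zip(w, xi)) + b > 0 else -1 for xi in X])
```

If the classes are not separable in the original dimensions, adding more features (or letting a kernel do that implicitly) is what makes a plane like this possible.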
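To make "a kernel takes a couple of examples and produces a similarity value" concrete, here is a sketch of the two standard SVM kernels the outline alludes to: polynomial and RBF. Both behave like dot products in a much higher-dimensional feature space without ever constructing that space; the parameter values (degree, c, gamma) are illustrative, not tuned.

```python
# Sketch of kernels as similarity functions between two examples.
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def polynomial_kernel(a, b, degree=3, c=1.0):
    """Equivalent to a dot product over all monomials up to 'degree'."""
    return (dot(a, b) + c) ** degree

def rbf_kernel(a, b, gamma=0.5):
    """Similarity that decays with squared distance; the implicit feature space is infinite-dimensional."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * sq_dist)

x1, x2, x3 = [1.0, 2.0], [1.1, 1.9], [5.0, -3.0]
print(rbf_kernel(x1, x2), rbf_kernel(x1, x3))   # near 1.0 vs. near 0.0
print(polynomial_kernel(x1, x2))
```

Nothing in this interface forces the inputs to be feature vectors: a kernel that compares raw strings or images directly would plug into the same two-examples-in, similarity-out slot.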
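The color-binning feature for the "forest at night" example might look like the following sketch. It assumes an image is simply a list of (r, g, b) tuples (real image loading is out of scope), and the choice of 4 bins per channel is arbitrary; the result is a normalized histogram whose bins are the features.

```python
# Sketch of color binning: coarse RGB histogram as a normalized feature vector.

def color_histogram(pixels, bins_per_channel=4):
    """Normalized feature vector with bins_per_channel**3 color bins."""
    step = 256 // bins_per_channel
    counts = [0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        index = ((r // step) * bins_per_channel + (g // step)) * bins_per_channel + (b // step)
        counts[index] += 1
    total = float(len(pixels))
    return [c / total for c in counts]

# toy "images": dark greenish pixels vs. bright daylight pixels
night_forest = [(10, 40, 10), (5, 35, 8), (12, 50, 15)]
daylight = [(200, 220, 240), (180, 210, 230), (210, 230, 250)]
print(color_histogram(night_forest)[:8])
print(color_histogram(daylight)[-8:])
```

Because the histogram ignores where each pixel sits, it is unaffected by scale, rotation, and translation, which is exactly what the raw-pixel features struggled with.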
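The radial-distance feature for telling stars from circles can be sketched like this: measure each outline point's distance from the shape's centroid and bin those distances into a normalized histogram. Rotating the shape reorders the points but not the distances, which is where the rotational invariance comes from. The synthetic circle and star below are assumptions for illustration.

```python
# Sketch of a binned radius distribution as a rotation-invariant shape feature.
import math

def radius_histogram(points, n_bins=8):
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    radii = [math.hypot(x - cx, y - cy) for x, y in points]
    r_max = max(radii) or 1.0
    counts = [0] * n_bins
    for r in radii:
        counts[min(int(r / r_max * n_bins), n_bins - 1)] += 1
    return [c / len(points) for c in counts]

# a circle: every point sits at the same radius, so one bin gets everything
circle = [(math.cos(i * math.pi / 18), math.sin(i * math.pi / 18)) for i in range(36)]

# a star: points alternate between an inner and an outer radius
star = []
for i in range(10):
    r = 1.0 if i % 2 else 0.4
    a = i * math.pi / 5
    star.append((r * math.cos(a), r * math.sin(a)))

print("circle:", radius_histogram(circle))
print("star:  ", radius_histogram(star))
```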
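Finally, a sketch of entropy-based feature selection as described in the last section of the outline: score each feature by its information gain on the training labels (training set only), then keep the best X. The binary toy features and the keep_best=2 choice are illustrative assumptions.

```python
# Sketch of feature selection by information gain (entropy reduction).
import math
from collections import Counter

def entropy(labels):
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(feature_values, labels):
    """Label entropy minus the weighted entropy after splitting on the feature."""
    total = len(labels)
    remainder = 0.0
    for v in set(feature_values):
        subset = [y for x, y in zip(feature_values, labels) if x == v]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder

def select_features(train_X, train_y, keep_best):
    gains = []
    for f in range(len(train_X[0])):
        column = [row[f] for row in train_X]
        gains.append((information_gain(column, train_y), f))
    gains.sort(reverse=True)
    return [f for _, f in gains[:keep_best]]

train_X = [[1, 0, 1], [1, 1, 1], [0, 0, 1], [0, 0, 1]]  # feature 2 is constant (useless)
train_y = [1, 1, 0, 0]
print(select_features(train_X, train_y, keep_best=2))   # keeps features 0 and 1, drops 2
```

Scoring on the training set only matters: if the test set leaked into the selection step, the accuracy measured later would be optimistically biased.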