
	The problem(s):
		+ determinism
			+ cold-start performance measurements have 
			  -high- jitter (Jan Kara etc.)
			+ a 5-10% improvement disappears into the noise
			+ re-doing 100s of cold runs is very slow
		+ user-space
			+ no tools to trace implicit I/O,
			  e.g. page faults from touching mmaps
		+ kernel-space
			+ re-compile, re-boot, re-run N times
				+ is it faster?
			+ repeat the cycle
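
	A minimal sketch of the "implicit I/O" problem above, assuming a
	plain file mapped read-only. The helper names (touch_pages,
	make_demo_file) are hypothetical and only for illustration. The
	loop never calls read() on the file data; any disk I/O happens as
	a side effect of page faults, which is why syscall-level tracing
	does not show it.

```python
import mmap
import os
import tempfile

PAGE = mmap.PAGESIZE


def touch_pages(path):
    """Read one byte from every page of a file through a mapping.

    No read() is issued for the file data. The first touch of a
    non-resident page is serviced by a page fault instead.
    """
    touched = 0
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            for offset in range(0, size, PAGE):
                _ = m[offset]  # plain memory access, possible implicit I/O
                touched += 1
    return touched


def make_demo_file(pages):
    """Create a temporary file of `pages` whole pages (hypothetical helper)."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(b"x" * (pages * PAGE))
    return path
```

	Note: a freshly written file is normally still in the page cache,
	so this demo shows the access pattern, not a cold-start cost.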

	IOGrind:
		+ visibility into both kernel & user-space
			+ deterministic:
				+ 2 phases:
					+ capture
					+ simulate / visualise
			+ 2 capture modes:
				+ user-space:
					+ full-stack information
				+ kernel-space:
					+ working set information
				+ every page touch & I/O event logged
			+ simulation:
				+ kernel
					+ prototype new VM tweaks,
					  readahead, disk layouts etc.
				+ user-space
					+ I/O latency broken down by
					  code-paths, files, modules
					+ working set identification &
					  optimisation.
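
	The capture / simulate split can be sketched roughly as below.
	This is not IOGrind's code or trace format: the (file, page) trace
	tuples, the LRU cache model, the naive readahead and the fixed
	miss_cost_ms are all assumptions made for illustration. The point
	is that one captured trace can be replayed under different
	policies, and the same trace always gives the same answer.

```python
from collections import OrderedDict, defaultdict


def simulate(trace, cache_pages, readahead=0, miss_cost_ms=8.0):
    """Replay (file, page) touches through a simple LRU page cache model.

    Returns (total_misses, {file: modelled latency in ms}).
    All parameters are modelling assumptions, not measured values.
    """
    cache = OrderedDict()  # (file, page) keys kept in LRU order
    latency = defaultdict(float)
    misses = 0

    def insert(key):
        cache[key] = None
        cache.move_to_end(key)
        while len(cache) > cache_pages:
            cache.popitem(last=False)  # evict the least recently used page

    for file, page in trace:
        key = (file, page)
        if key in cache:
            cache.move_to_end(key)  # hit: refresh LRU position
            continue
        misses += 1
        latency[file] += miss_cost_ms  # charge the miss to the file touched
        insert(key)
        for extra in range(1, readahead + 1):
            insert((file, page + extra))  # naive sequential readahead
    return misses, dict(latency)


# Hypothetical captured trace: file names are made up for the example.
TRACE = [
    ("libfoo.so", 0), ("libfoo.so", 1), ("libfoo.so", 2), ("libfoo.so", 3),
    ("app.bin", 0), ("libfoo.so", 0),
]
```

	Replaying TRACE with cache_pages=16 models 5 misses with
	readahead=0 and 2 misses with readahead=3, with no re-boot and
	no re-run needed to compare the two.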

