Rationale for a Large Text Compression Benchmark (2009)