I wonder if there are any HTML parsers that use SSE4.2 instructions.
It's an interesting idea, but I found this blog post unenlightening. Does anyone have links to other writeups?
Clang does a very similar thing in its preprocessor (scanning for newlines, if my memory serves correctly).
Can HTTP parsing be done on a GPU?
Using SSE4.2 or newer is trivial. The same goes for fast hashes with the crc32 intrinsic. Mostly the compiler does it for you (-march=native).
But you cannot assume everybody has such a CPU. Hence you'd either need to compile your own binary (a la MacPorts, Gentoo, Perl, ...) or do run-time checks for the CPU feature and switch to the fast version only when it is available.
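A rough sketch of the run-time check, assuming GCC or Clang on x86 (it relies on `__builtin_cpu_supports` and the `target` function attribute, both compiler-specific). The function names `crc32c_sw`, `crc32c_hw` and `crc32c` are made up for illustration. Note that the crc32 instruction computes CRC-32C (Castagnoli), not the zlib CRC-32, so the portable fallback has to use the same polynomial.

```c
#include <stddef.h>
#include <stdint.h>

/* Only attempt the SSE4.2 path on x86 with GCC/Clang (assumption of this sketch). */
#if (defined(__x86_64__) || defined(__i386__)) && (defined(__GNUC__) || defined(__clang__))
#include <nmmintrin.h>
#define HAVE_X86_DISPATCH 1
#else
#define HAVE_X86_DISPATCH 0
#endif

/* Portable bitwise CRC-32C (reflected polynomial 0x82F63B78). Slow but works everywhere. */
static uint32_t crc32c_sw(const unsigned char *p, size_t n) {
    uint32_t crc = 0xFFFFFFFFu;
    while (n--) {
        crc ^= *p++;
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return ~crc;
}

#if HAVE_X86_DISPATCH
/* Compiled with SSE4.2 enabled for this one function only, so the rest of
   the binary still runs on older CPUs. */
__attribute__((target("sse4.2")))
static uint32_t crc32c_hw(const unsigned char *p, size_t n) {
    uint32_t crc = 0xFFFFFFFFu;
    while (n--)
        crc = _mm_crc32_u8(crc, *p++);
    return ~crc;
}
#endif

/* Run-time switch: use the hardware instruction when the CPU reports SSE4.2,
   otherwise fall back to the portable version. */
static uint32_t crc32c(const unsigned char *p, size_t n) {
#if HAVE_X86_DISPATCH
    if (__builtin_cpu_supports("sse4.2"))
        return crc32c_hw(p, n);
#endif
    return crc32c_sw(p, n);
}
```

Calling `crc32c((const unsigned char *)"123456789", 9)` should give 0xE3069283, the standard CRC-32C check value, on either path. A real implementation would process 8 bytes at a time with `_mm_crc32_u64` and cache the feature check, but the dispatch structure stays the same.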