Very nice! I actually used the featured example from Mark Richardson's class notes on Principal Component Analysis (http://people.maths.ox.ac.uk/richardsonm/SignalProcPCA.pdf) in teaching. It was astounding how clear it was to some people and how unclear to others.
I did a singular value decomposition on a data set similar to the one Richardson used (except with international data). The original post here looks at the projection to country coordinates, asking which axes describe the primary differences between countries. My students had no problem with that -- Wales and Northern Ireland are most different, in your example, and 'give' the first principal axis. But then I continued and did the same with the foods, as Richardson did (look at Figure 4 in the linked file). Students concluded in large numbers that people just don't like fresh fruit and do like fresh potatoes. Hm. They didn't conclude that people don't like Wales and do like Northern Ireland; they accurately saw it as an axis. But once we were talking about food instead of countries, students read a food's projection onto the eigenspace as indicating some percentage of approval.
How could we visually display both parts of this principal component analysis to combat this prejudice that sometimes leads us to read left to right as worse to better?
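One way to show both halves at once is a biplot: plot the country scores as points and the food loadings as arrows on the same PC1/PC2 axes, so each food reads as a direction rather than a position on a worse-to-better scale. A minimal sketch of the idea (the numbers below are random placeholders, not Richardson's table, and the names are only illustrative):

```python
import numpy as np
import matplotlib.pyplot as plt

# Placeholder data standing in for the UK food-consumption table:
# rows = countries (observations), columns = food categories (variables).
rng = np.random.default_rng(0)
countries = ["England", "Wales", "Scotland", "N. Ireland"]
foods = ["Fresh potatoes", "Fresh fruit", "Cheese", "Alcoholic drinks"]
X = rng.normal(size=(len(countries), len(foods)))

# Center the columns, then take the SVD: X_centered = U S Vt.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

scores = U * S      # country coordinates on the principal axes
loadings = Vt.T     # food directions (one row per food)

# Biplot: countries as points, foods as arrows, both on PC1/PC2.
fig, ax = plt.subplots()
ax.scatter(scores[:, 0], scores[:, 1])
for name, (x, y) in zip(countries, scores[:, :2]):
    ax.annotate(name, (x, y))
for name, (dx, dy) in zip(foods, loadings[:, :2]):
    ax.arrow(0, 0, dx, dy, head_width=0.05, color="gray")
    ax.annotate(name, (dx, dy))
ax.set_xlabel("PC1")
ax.set_ylabel("PC2")
plt.show()
```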
How different is linear regression from PCA? I understand the procedures and methods are completely different, but wouldn't linear regression also give the same solution on these data sets?
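In general they don't agree: least-squares regression singles out one column as the response and minimizes vertical distances to the fitted line, while PCA treats all columns symmetrically and minimizes perpendicular distances, so the two directions only coincide for noiseless data. A quick sketch with made-up 2-D data:

```python
import numpy as np

# Toy 2-D data, purely for illustration.
rng = np.random.default_rng(1)
x = rng.normal(size=200)
y = 0.5 * x + rng.normal(scale=0.8, size=200)

# Ordinary least squares slope: minimizes vertical distances to the line.
ols_slope = np.polyfit(x, y, 1)[0]

# First principal component: the direction minimizing *orthogonal* distances.
X = np.column_stack([x - x.mean(), y - y.mean()])
_, _, Vt = np.linalg.svd(X, full_matrices=False)
pc1 = Vt[0]
pca_slope = pc1[1] / pc1[0]

print(f"OLS slope: {ols_slope:.3f}")
print(f"PC1 slope: {pca_slope:.3f}")  # typically steeper than the OLS slope here
```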
This is truly fantastic.
Excuse me for being daft, but how do you transform back into 'what does this mean'?
For instance, in ex 3, we see that N. Ireland is an outlier. It wasn't obvious to me that the cause was potatoes and fruit.
How does PCA help you with the fundamental meaning?
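The interpretive step usually comes from the loadings: each principal axis is a weighted combination of the original variables, so looking at the largest positive and negative weights tells you which foods pull a country toward either end of that axis. A rough sketch, assuming a pandas DataFrame `df` with countries as rows and food categories as columns (the helper and names here are hypothetical, not from the post):

```python
import numpy as np
import pandas as pd

def top_loadings(df: pd.DataFrame, component: int = 0, n: int = 3):
    """Return the variables with the largest |weight| on one principal axis."""
    Xc = df.values - df.values.mean(axis=0)          # center each column
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    weights = pd.Series(Vt[component], index=df.columns)
    order = weights.abs().sort_values(ascending=False).index
    return weights.reindex(order).head(n)

# e.g. top_loadings(df) might show "Fresh potatoes" with a large weight of one
# sign and "Fresh fruit" with a large weight of the other, which is what pulls
# N. Ireland away from the other countries along PC1.
```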
A webapp for doing SVD/PCA:
I think the 3D chart requires a WebGL plug-in... Where can I download one for Chrome 40.x?
PCA is a pretty okay method for dimensionality reduction. Latent Dirichlet allocation is pretty good too. It depends on what you're trying to do and how the data is distributed in N-dimensional space.
This is great. The only thing I'm missing is an explanation of the various methods of rotating the principal axes (varimax, oblimin, etc.).
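For reference, varimax is an orthogonal rotation that tries to make each rotated axis load heavily on only a few variables. A common iterative formulation looks roughly like this (a sketch of the standard algorithm, not anything from the post):

```python
import numpy as np

def varimax(loadings, tol=1e-6, max_iter=100):
    """Orthogonal varimax rotation of a (variables x factors) loading matrix."""
    L = np.asarray(loadings, dtype=float)
    p, k = L.shape
    R = np.eye(k)
    var = 0.0
    for _ in range(max_iter):
        Lr = L @ R
        u, s, vt = np.linalg.svd(
            L.T @ (Lr**3 - Lr @ np.diag((Lr**2).sum(axis=0)) / p)
        )
        R = u @ vt
        new_var = s.sum()
        if new_var < var * (1 + tol):
            break
        var = new_var
    return L @ R
```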
Hope I'm not late to the party... Does anyone have any implementations to recommend in Python?
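scikit-learn's PCA is the usual starting point, and numpy's `linalg.svd` works if you want to roll it by hand. A minimal example with placeholder data:

```python
import numpy as np
from sklearn.decomposition import PCA

# Any (n_samples x n_features) array works; random placeholder data here.
X = np.random.default_rng(2).normal(size=(100, 5))

pca = PCA(n_components=2)
scores = pca.fit_transform(X)          # coordinates of each sample on PC1/PC2
print(pca.explained_variance_ratio_)   # share of variance each axis captures
print(pca.components_)                 # loadings: one row per component
```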
Strange font, particularly the 's'. It's hard to read. Anyone else having the same experience?
Pretty nice page, but it doesn't say much about PCA itself. "Visualizing PCA output" would be a more appropriate post title.
Seeing how each visualization adjusts as I change the original dataset is so useful. The technique reminds me of Bret Victor's amazing work.
Ladder of Abstraction Essay: http://worrydream.com/#!2/LadderOfAbstraction
Stop Drawing Dead Fish Video: https://vimeo.com/64895205
This is awesome, thanks for sharing!