There's a second half of a two-hour video on YouTube which talks about creating embeddings using some pre-transforms followed by SVD with some distance shenanigans:
https://www.youtube.com/watch?v=Z6s7PrfJlQ0&t=3084s
It's 4 years old and seems to be a bit of a hidden gem. Someone even pipes up at 1:26 to say "This is really cool. Is this written up somewhere?"
[snapshot of the code shown]
CPU times: user 3min 5s, sys: 20.2 s, total: 3min 25s
Wall time: 1min 26s
That's Leland McInnes, author of UMAP, the widely used dimension reduction tool.
I know; I mentioned his name in a post last week. Figured doing so again might seem a bit fanboy-ish. I am kind of a fan, but mostly a fan of good explanations. He's just self-selecting for the group.
To the authors: Please expand your acronyms at least once! I had to stop reading to figure out what "KSVD" stands for.
Learning what it stands for* wasn't particularly helpful in this case, but defining the term would've kept me on your page.
*K-Singular Value Decomposition
Strongly agree. I even searched to make sure I wasn't missing it. I mean, yeah, "SVD" is likely singular value decomposition, but in this context you have other acronyms bouncing around your head (like support vector machine; just need to get rid of the M).
I'm surprised the authors just completely abandon the standard first-use notation for acronyms.
KSVD Algorithm:
https://legacy.sites.fas.harvard.edu/~cs278/papers/ksvd.pdf
k-SVD algorithm: https://en.wikipedia.org/wiki/K-SVD
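For anyone who wants the shape of the algorithm without wading through the paper: K-SVD alternates a sparse-coding step (e.g. orthogonal matching pursuit) with a per-atom dictionary update via a rank-1 SVD of the residual. Here's a minimal, unoptimized NumPy sketch (my own toy version, not the paper's reference code):

```python
import numpy as np

def omp(D, x, k):
    """Orthogonal Matching Pursuit: greedily pick up to k atoms of D to fit x."""
    residual = x.copy()
    idx = []
    for _ in range(k):
        # Pick the atom most correlated with the current residual.
        idx.append(int(np.argmax(np.abs(D.T @ residual))))
        # Refit coefficients over all selected atoms, then update residual.
        coef, *_ = np.linalg.lstsq(D[:, idx], x, rcond=None)
        residual = x - D[:, idx] @ coef
    code = np.zeros(D.shape[1])
    code[idx] = coef
    return code

def ksvd(X, n_atoms, k, n_iter=10, seed=0):
    """X: (n_features, n_samples). Returns dictionary D and sparse codes C."""
    rng = np.random.default_rng(seed)
    D = rng.standard_normal((X.shape[0], n_atoms))
    D /= np.linalg.norm(D, axis=0)  # unit-norm atoms
    for _ in range(n_iter):
        # Sparse-coding step: code each sample with at most k atoms.
        C = np.column_stack([omp(D, x, k) for x in X.T])
        # Dictionary update: refit each atom with a rank-1 SVD of the
        # residual restricted to the samples that actually use that atom.
        for j in range(n_atoms):
            users = np.nonzero(C[j])[0]
            if users.size == 0:
                continue
            E = X[:, users] - D @ C[:, users] + np.outer(D[:, j], C[j, users])
            U, s, Vt = np.linalg.svd(E, full_matrices=False)
            D[:, j] = U[:, 0]
            C[j, users] = s[0] * Vt[0]
    return D, C

# Tiny demo: code 50 random 10-dim samples with at most 3 of 15 atoms.
X = np.random.default_rng(1).standard_normal((10, 50))
D, C = ksvd(X, n_atoms=15, k=3, n_iter=5)
print(D.shape, C.shape, (C != 0).sum(axis=0).max())
```

The dictionary update only touches the already-nonzero entries of each code, so the per-sample sparsity bound from the OMP step is preserved across iterations.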
This is great, and very relevant to some problems I've been sketching out on whiteboards lately. Exceptionally well timed.
Basically find the primary eigenvectors.
It's not, though...
In sparse coding, you're generally using an over-complete set of vectors which decompose the data into sparse activations.
So, if you have a dataset of hundred-dimensional vectors, you want to find a set of "basis" vectors such that each data vector is well described as a combination of ~4 of them.
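To make that concrete, here's a sketch using scikit-learn's DictionaryLearning on synthetic toy data (all the numbers are made up for illustration; this isn't from the talk). We learn an over-complete dictionary, more atoms than ambient dimensions, and force each sample to be coded with at most 4 atoms via OMP:

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning

rng = np.random.default_rng(0)

# Toy data: 300 samples in 20 dimensions, each built from a sparse
# combination of a few hidden "atoms" (purely illustrative setup).
hidden_atoms = rng.standard_normal((25, 20))
codes = np.zeros((300, 25))
for row in codes:
    row[rng.choice(25, size=3, replace=False)] = rng.standard_normal(3)
X = codes @ hidden_atoms

# Learn an over-complete dictionary (40 atoms > 20 dimensions) and
# require each sample to use at most 4 atoms (sparse coding via OMP).
learner = DictionaryLearning(
    n_components=40,
    transform_algorithm="omp",
    transform_n_nonzero_coefs=4,
    max_iter=30,
    random_state=0,
)
sparse_codes = learner.fit_transform(X)

# Every row of sparse_codes now has at most 4 non-zero activations.
print(sparse_codes.shape)  # (300, 40)
print((sparse_codes != 0).sum(axis=1).max())
```

Contrast with PCA: instead of a small orthogonal basis where every sample uses every component, you get a large redundant dictionary where each sample activates only a handful of atoms.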