Dictionary Learning

If you have read any papers on compressed sensing or sparse approximation, you will most likely have run into the statement 'you have a signal y which is sparse in a given basis/dictionary'. The question is just: who gave you that basis/dictionary :D? In dictionary learning the task is exactly this. You have a bunch of signals, let's say K of them, stored as the columns of a matrix Y, and you want to find a dictionary D with N atoms such that each of your signals can be approximated using only S atoms. So you want to find a factorisation of the matrix Y into the dictionary D and coefficients X, with S nonzeros in every column, such that the error E is small.

Y = D*X + E, where Y is (d x K), D is (d x N), X is (N x K) and E is (d x K), with d the dimension of the signals.
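To make the shapes concrete, here is a minimal sketch in Python/NumPy that generates synthetic signals following this model. The sizes, the Gaussian random dictionary, the unit-norm atoms and the noise level are all assumptions chosen for illustration; none of them come from the text above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions): signal dimension d, N atoms, K signals, sparsity S.
d, N, K, S = 16, 32, 100, 3

# Dictionary D (d x N): random atoms, normalised to unit norm (an assumed convention).
D = rng.standard_normal((d, N))
D /= np.linalg.norm(D, axis=0)

# Coefficients X (N x K): S nonzeros in every column, on a random support.
X = np.zeros((N, K))
for k in range(K):
    support = rng.choice(N, size=S, replace=False)
    X[support, k] = rng.standard_normal(S)

# Small error term E (d x K) and the resulting signal matrix Y (d x K).
E = 0.01 * rng.standard_normal((d, K))
Y = D @ X + E

print(Y.shape, D.shape, X.shape, E.shape)
```

In dictionary learning you only get to see Y; both D and X have to be recovered from it.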

Then you can vary the question: how many atoms do I need for a given sparsity level and error? Which sparsity level can I achieve, given the number of atoms and the error? You can also become more philosophical: how is the coherence of the atoms related to this, and how can we characterise the usefulness of an atom?
As mentioned before, these are all hard problems! Why? They are hard to treat theoretically because they are so non-linear, and they are even hard to treat numerically, because at some point you need to find sparse approximations to calculate the error, which is time-consuming and gives a result that depends on the algorithm you used.
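To illustrate the numerical point, here is a rough sketch of how the error for a candidate dictionary could be evaluated. It uses orthogonal matching pursuit (OMP) as the sparse approximation algorithm; that is just one possible choice, swapped in here for illustration, and a different algorithm may well give a different error for the same dictionary. The function names `omp` and `approximation_error` are hypothetical, and the atom selection assumes unit-norm atoms.

```python
import numpy as np

def omp(D, y, S):
    """Greedy S-sparse approximation of y in D (orthogonal matching pursuit).

    Assumes the columns (atoms) of D have unit norm.
    """
    residual = y.astype(float)
    support = []
    coeffs = np.zeros(0)
    for _ in range(S):
        # Pick the atom most correlated with the current residual.
        correlations = np.abs(D.T @ residual)
        if support:
            correlations[support] = 0.0  # never pick the same atom twice
        support.append(int(np.argmax(correlations)))
        # Re-fit all selected atoms by least squares, then update the residual.
        coeffs, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coeffs
    x = np.zeros(D.shape[1])
    x[support] = coeffs
    return x

def approximation_error(D, Y, S):
    """Frobenius norm of E = Y - D*X, with X computed column by column via OMP."""
    X = np.column_stack([omp(D, Y[:, k], S) for k in range(Y.shape[1])])
    return np.linalg.norm(Y - D @ X)
```

Every evaluation of the error costs one sparse approximation per signal, so K of them, and a learning procedure has to repeat this each time the dictionary changes.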