N. Ailon, R. Jaiswal, and C. Monteleoni, Streaming k-means approximation, Advances in Neural Information Processing Systems (NIPS), pp.10-18, 2009.

D. Aloise, A. Deshpande, P. Hansen, and P. Popat, NP-hardness of Euclidean sum-of-squares clustering, Machine Learning, vol.75, pp.245-248, 2009.

J. Anderson, M. Belkin, N. Goyal, L. Rademacher, and J. Voss, The more, the merrier: the blessing of dimensionality for learning large Gaussian mixtures, Conference on Learning Theory, pp.1135-1164, 2014.

C. Andrieu and A. Doucet, Online expectation-maximization type algorithms for parameter estimation in general state space models, 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, vol.6, p.69, 2003.

R. Arora, A. Cotter, K. Livescu, and N. Srebro, Stochastic optimization for PCA and PLS, 50th Annual Allerton Conference on Communication, Control, and Computing (Allerton), pp.861-868, 2012.

D. Arthur and S. Vassilvitskii, k-means++: The Advantages of Careful Seeding, ACM-SIAM symposium on Discrete algorithms, pp.1027-1035, 2007.

F. Bach, On the equivalence between kernel quadrature rules and random feature expansions, Journal of Machine Learning Research, vol.18, issue.21, pp.1-38, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01118276

A. Balsubramani, S. Dasgupta, and Y. Freund, The fast convergence of incremental PCA, Advances in Neural Information Processing Systems, vol.26, pp.3174-3182, 2013.

R. Baraniuk, Compressive sensing, IEEE Signal Processing Magazine, vol.24, issue.4, pp.118-121, 2007.
URL : https://hal.archives-ouvertes.fr/hal-00452261

R. Baraniuk, M. Davenport, R. A. DeVore, and M. B. Wakin, A simple proof of the restricted isometry property for random matrices, Constr. Approx, vol.28, issue.3, pp.253-263, 2008.

M. Belkin and K. Sinha, Polynomial learning of distribution families, IEEE 51st Annual Symposium on Foundations of Computer Science, p.16, 2010.

K. Bertin, E. Le Pennec, and V. Rivoirard, Adaptive Dantzig density estimation, vol.47, pp.43-74, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00634423

A. Bietti and J. Mairal, On the Inductive Bias of Neural Tangent Kernels, Advances in Neural Information Processing Systems (NIPS), pp.1-23, 2019.
URL : https://hal.archives-ouvertes.fr/hal-02144221

M. Binkowski, D. J. Sutherland, M. Arbel, and A. Gretton, Demystifying MMD GANs, pp.1-30, 2018.

G. Blanchard, O. Bousquet, and L. Zwald, Statistical properties of kernel principal component analysis, Machine Learning, vol.66, pp.259-294, 2007.

A. Bourrier, M. Davies, T. Peleg, P. Perez, and R. Gribonval, Fundamental performance limits for ideal decoders in high-dimensional linear inverse problems, IEEE Transactions on Information Theory, vol.60, issue.12, pp.7928-7946, 2014.
URL : https://hal.archives-ouvertes.fr/hal-00908358

E. Candès, T. Strohmer, and V. Voroninski, PhaseLift: Exact and Stable Signal Recovery from Magnitude Measurements via Convex Programming, Comm. Pure Appl. Math, vol.66, issue.8, pp.1241-1274, 2013.

E. J. Candès, The restricted isometry property and its implications for compressed sensing, Comptes Rendus Mathematique, vol.346, issue.9, pp.589-592, 2008.

E. J. Candès, J. Romberg, and T. Tao, Stable Signal Recovery from Incomplete and Inaccurate Measurements, Comm. Pure Appl. Math, vol.59, pp.1207-1223, 2006.

O. Cappé and E. Moulines, Online EM Algorithm for Latent Data Models, Journal of the Royal Statistical Society, vol.71, issue.3, pp.593-613, 2009.

M. Carrasco and J. Florens, Generalization of GMM to a continuum of moment conditions, Econometric Theory, 2000.

M. Carrasco and J. Florens, Efficient GMM estimation using the empirical characteristic function, vol.140, 2002.

M. Carrasco and J. Florens, On The Asymptotic Efficiency Of GMM, Econometric Theory, vol.30, issue.02, pp.372-406, 2014.

A. Cohen, W. Dahmen, and R. DeVore, Compressed sensing and best k-term approximation, J. Amer. Math. Soc, vol.22, issue.1, 2009.

G. Cormode and M. Hadjieleftheriou, Methods for finding frequent items in data streams, The VLDB Journal, vol.19, issue.1, pp.3-20, 2009.

G. Cormode and S. Muthukrishnan, An improved data stream summary: the count-min sketch and its applications, J. Algorithms, vol.55, issue.1, pp.58-75, 2005.

G. Cormode, M. Garofalakis, P. J. Haas, and C. Jermaine, Synopses for Massive Data: Samples, Histograms, Wavelets, Sketches, Foundations and Trends in Databases, vol.4, pp.1-294, 2011.

T. M. Cover and J. A. Thomas, Elements of Information Theory, 1991.

S. Dirksen, Dimensionality reduction with subgaussian matrices: a unified theory, Foundations of Computational Mathematics, vol.16, issue.5, pp.1367-1396, 2016.

D. L. Donoho, Compressed sensing, IEEE Trans. Information Theory, vol.52, issue.4, pp.1289-1306, 2006.
URL : https://hal.archives-ouvertes.fr/inria-00369486

J. C. Duchi, M. I. Jordan, and M. J. Wainwright, Privacy Aware Learning, Journal of the ACM, vol.61, issue.6, 2014.

R. M. Dudley, Real Analysis and Probability, 2002.

A. Eftekhari and M. B. Wakin, New analysis of manifold embeddings and signal recovery from compressive measurements, Applied and Computational Harmonic Analysis, vol.39, issue.1, pp.67-109, 2015.

J. B. Estrach, A. Szlam, and Y. LeCun, Signal recovery from Pooling Representations, ICML, 2014.

K. Fan, On a Theorem of Weyl Concerning Eigenvalues of Linear Transformations I, Proc. Nat. Acad. Sci, vol.35, issue.11, pp.652-655, 1949.

D. Feldman and M. Langberg, A unified framework for approximating and clustering data, Proceedings of the forty-third annual ACM symposium on Theory of computing, pp.569-578, 2011.

D. Feldman, M. Monemizadeh, C. Sohler, and D. P. Woodruff, Coresets and Sketches for High Dimensional Subspace Approximation Problems, vol.1, pp.630-649, 2010.

D. Feldman, M. Faulkner, and A. Krause, Scalable Training of Mixture Models via Coresets, Proceedings of Neural Information Processing Systems, pp.1-9, 2011.

A. Feuerverger and R. A. Mureika, The Empirical Characteristic Function and Its Applications, Annals of Statistics, vol.5, issue.1, pp.88-97, 1977.

S. Foucart and H. Rauhut, A Mathematical Introduction to Compressive Sensing, 2012.

G. Frahling and C. Sohler, A fast k-means implementation using coresets, Proceedings of the twenty-second annual symposium on Computational geometry (SoCG), vol.18, pp.605-625, 2005.

M. Gabrié, A. Manoel, C. Luneau, J. Barbier, N. Macris et al., Entropy and mutual information in models of deep neural networks, Advances in Neural Information Processing Systems (NIPS), 2018.

M. R. Garey, D. S. Johnson, and H. S. Witsenhausen, The complexity of the generalized Lloyd-Max problem, IEEE Trans. Inf. Theory, vol.28, issue.2, pp.255-256, 1982.

M. Ghashami, D. Perry, and J. M. Phillips, Streaming Kernel Principal Component Analysis, International Conference on Artificial Intelligence and Statistics, vol.41, pp.1-16, 2016.

A. C. Gilbert, Y. Kotidis, S. Muthukrishnan, and M. J. Strauss, How to summarize the universe: dynamic maintenance of quantiles, VLDB '02: Proceedings of the 28th international conference on Very Large Data Bases, pp.454-465, 2002.

A. C. Gilbert, Y. Zhang, K. Lee, Y. Zhang, and H. Lee, Towards understanding the invertibility of convolutional neural networks, Proceedings of the 26th International Joint Conference on Artificial Intelligence, vol.17, pp.1703-1710, 2017.

R. Giryes, G. Sapiro, and A. M. Bronstein, Deep Neural Networks with Random Gaussian Weights: A Universal Classification Strategy?, IEEE Trans. Signal Processing, 2016.

A. Gretton, K. M. Borgwardt, M. J. Rasch, B. Schölkopf, and A. J. Smola, A Kernel Method for the Two-Sample Problem, Advances in Neural Information Processing Systems (NIPS), pp.513-520, 2007.

R. Gribonval, G. Blanchard, N. Keriven, and Y. Traonmilin, Statistical Learning Guarantees for Compressive Clustering and Compressive Mixture Modeling, 2020.
URL : https://hal.archives-ouvertes.fr/hal-02536818

S. Guha, N. Mishra, R. Motwani, and L. O'Callaghan, Clustering Data Streams, 2000.

A. R. Hall, Generalized method of moments, 2005.

P. R. Halmos, Measure theory, vol.18, 2013.

S. Har-Peled and S. Mazumdar, Coresets for k-Means and k-Median Clustering and their Applications, Proceedings of the thirty-sixth annual ACM symposium on Theory of computing, pp.291-300, 2004.

D. Hsu and S. M. Kakade, Learning mixtures of spherical Gaussians: moment methods and spectral decompositions, Conference on Innovations in Theoretical Computer Science, 2013.

A. Jacot, F. Gabriel, and C. Hongler, Neural Tangent Kernel: Convergence and Generalization in Neural Networks, Advances in Neural Information Processing Systems (NIPS), 2018.
URL : https://hal.archives-ouvertes.fr/hal-01824549

M. Kabanava, R. Kueng, H. Rauhut, and U. Terstiege, Stable low-rank matrix recovery via null space properties, Information and Inference, vol.5, issue.4, pp.405-441, 2016.

N. Keriven, A. Bourrier, R. Gribonval, and P. Pérez, Sketching for Large-Scale Learning of Mixture Models, IEEE International Conference on Acoustics, Speech and Signal Processing, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01208027

N. Keriven, A. Bourrier, R. Gribonval, and P. Pérez, Sketching for Large-Scale Learning of Mixture Models, pp.1-50, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01208027

H. J. Landau, Moments in mathematics, 1987.

C. Levrard, Fast rates for empirical vector quantization, Electronic Journal of Statistics, vol.7, issue.0, pp.1716-1746, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00664068

Y. Li, K. Swersky, and R. Zemel, Generative Moment Matching Networks, Proceedings of The 32nd International Conference on Machine Learning, vol.37, pp.1718-1727, 2015.

M. Lucic, M. Faulkner, A. Krause, and D. Feldman, Training Mixture Models at Scale via Coresets, 2017.

J. Mairal, F. Bach, J. Ponce, and G. Sapiro, Online Learning for Matrix Factorization and Sparse Coding, Journal of Machine Learning Research, vol.11, issue.1, pp.19-60, 2010.
URL : https://hal.archives-ouvertes.fr/inria-00408716

P. Massart, Concentration Inequalities and Model Selection, Lecture Notes in Mathematics, vol.1896, 2007.

W. K. Newey and D. McFadden, Large sample estimation and hypothesis testing, Handbook of Econometrics, vol.4, pp.2111-2245, 1994.

I. Pinelis, An approach to inequalities for the distributions of infinite-dimensional martingales, Probability in Banach Spaces, 8, Proceedings of the 8th International Conference, vol.30, pp.128-134, 1992.

G. Puy, M. E. Davies, and R. Gribonval, Recipes for Stable Linear Embeddings From Hilbert Spaces to R^m, IEEE Trans. Information Theory, vol.63, issue.4, pp.2171-2187, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01203614

A. Rahimi and B. Recht, Random Features for Large Scale Kernel Machines, Advances in Neural Information Processing Systems (NIPS), pp.1-8, 2007.

A. Rahimi and B. Recht, Weighted sums of random kitchen sinks: Replacing minimization with randomization in learning, Advances in Neural Information Processing Systems (NIPS), vol.1, pp.1-8, 2009.

M. Reiß and M. Wahl, Non-asymptotic upper bounds for the reconstruction error of PCA, 2016.

A. Rudi, R. Camoriano, and L. Rosasco, Less is more: Nyström computational regularization, Advances in Neural Information Processing Systems, pp.1657-1665, 2015.

V. Schellekens, A. Chatalic, F. Houssiau, Y. de Montjoye, L. Jacques et al., Differentially Private Compressive k-Means, ICASSP 2019 - 44th International Conference on Acoustics, Speech, and Signal Processing, pp.7933-7937, 2019.
URL : https://hal.archives-ouvertes.fr/hal-02060208

J. Shawe-Taylor, C. K. Williams, N. Cristianini, and J. Kandola, On the eigenspectrum of the Gram matrix and the generalization error of kernel-PCA, IEEE Transactions on Information Theory, vol.51, issue.7, pp.2510-2522, 2005.

R. Shwartz-Ziv and N. Tishby, Opening the Black Box of Deep Neural Networks via Information, 2017.

A. J. Smola, A. Gretton, L. Song, and B. Schölkopf, A Hilbert Space Embedding for Distributions, International Conference on Algorithmic Learning Theory, pp.13-31, 2007.

B. K. Sriperumbudur and Z. Szabó, Optimal Rates for Random Fourier Features, NIPS, 2015.

B. K. Sriperumbudur, A. Gretton, K. Fukumizu, B. Schölkopf, and G. R. Lanckriet, Hilbert space embeddings and metrics on probability measures, The Journal of Machine Learning Research, vol.11, pp.1517-1561, 2010.

N. Thaper, S. Guha, P. Indyk, and N. Koudas, Dynamic multidimensional histograms, SIGMOD '02: Proceedings of the 2002 ACM SIGMOD international conference on Management of data, pp.428-439, 2002.