Taiji Suzuki (鈴木大慈)
Professor

Department of Mathematical Informatics
Graduate School of Information Science and Technology
The University of Tokyo

Center for Advanced Intelligence Project, RIKEN, Tokyo, Japan

Room: 352, Faculty of Engineering Building 6 (map)
Postal Address: Hongo 7-3-1, Bunkyo-ku, Tokyo 113-8656, JAPAN
Phone: +81-3-5841-6921
E-mail: e-mail

Topics

  • Online Asian Machine Learning School (OAMLS) 2021 "Deep Learning Theory and Optimization" slide: slide.
  • MLSS 2015 "Stochastic optimization" slides: slide1, slide2, slide3

Research Interests

I am interested in machine learning and statistics, especially in the following research topics; a small illustrative sketch of one of them follows the list.
  • Statistical learning theory
    • Deep learning
    • Kernel method
    • Nonparametric convergence analysis
  • Optimization
    • Stochastic optimization
    • Optimization for deep learning
  • Information geometry
    • Prior selection
    • Objective Bayes
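
As a small illustration of one of these topics, stochastic optimization, the sketch below runs plain stochastic gradient descent (SGD) with a decaying 1/sqrt(t) step size on a synthetic least-squares problem. It is a generic textbook example written for this page (the data, step-size choice, and all names in it are illustrative assumptions), not code from any of the papers listed below.

    # Minimal SGD sketch on least squares; illustrative only.
    import numpy as np

    rng = np.random.default_rng(0)
    n, d = 1000, 5
    X = rng.normal(size=(n, d))                 # synthetic design matrix
    w_true = rng.normal(size=d)                 # ground-truth parameter
    y = X @ w_true + 0.1 * rng.normal(size=n)   # noisy responses

    w = np.zeros(d)
    for t in range(1, 10001):
        i = rng.integers(n)                     # sample one example uniformly
        grad = (X[i] @ w - y[i]) * X[i]         # gradient of 0.5*(x_i^T w - y_i)^2
        w -= grad / np.sqrt(t)                  # decaying step size eta_t = 1/sqrt(t)

    print(np.linalg.norm(w - w_true))           # estimation error of the SGD iterate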

CV

Publications and Presentations

Journal papers (Refereed):
  1. Atsushi Nitanda, Denny Wu, and Taiji Suzuki: Particle Dual Averaging: Optimization of Mean Field Neural Network with Global Convergence Rate Analysis. Journal of Statistical Mechanics: Theory and Experiment. 114010, 2022. DOI: 10.1088/1742-5468/ac98a8.
  2. Chihiro Watanabe, and Taiji Suzuki: Deep Two-Way Matrix Reordering for Relational Data Analysis. Neural Networks, Vol. 146, pp. 303--315, 2022. doi: https://doi.org/10.1016/j.neunet.2021.11.028. arXiv:2103.14203.
  3. Chihiro Watanabe, and Taiji Suzuki: Selective Inference for Latent Block Models. The Electronic Journal of Statistics. 15(1): 3137--3183 (2021). doi: https://doi.org/10.1214/21-EJS1853. arXiv:2005.13273.
  4. Atsushi Nitanda, Tomoya Murata, Taiji Suzuki: Sharp Characterization of Optimal Minibatch Size for Stochastic Finite Sum Convex Optimization. Knowledge and Information Systems (KAIS), 63:2513--2539, 2021.
  5. Ami Takahashi and Taiji Suzuki: Bayesian optimization for estimating the maximum tolerated dose in Phase I clinical trials. Contemporary Clinical Trials Communications, 21:100753, 2021. doi: 10.1016/j.conctc.2021.100753.
  6. Ami Takahashi and Taiji Suzuki: Bayesian optimization design for finding a maximum tolerated dose combination in Phase I clinical trials. The International Journal of Biostatistics, 2021. doi: 10.1515/ijb-2020-0147.
  7. Kazuma Tsuji, Taiji Suzuki: Estimation error analysis of deep learning on the regression problem on the variable exponent Besov space. The Electronic Journal of Statistics. 15(1): 1869--1908 (2021). DOI: 10.1214/21-EJS1828. (arXiv:2009.11285)
  8. Ami Takahashi and Taiji Suzuki: Bayesian optimization design for dose-finding based on toxicity and efficacy outcomes in phase I/II clinical trials. Pharmaceutical Statistics, 2020. doi: https://doi.org/10.1002/pst.2085.
  9. Chihiro Watanabe and Taiji Suzuki: Goodness-of-fit Test for Latent Block Models. Computational Statistics and Data Analysis, Volume 154, 107090, 2021. arXiv:1906.03886.
  10. Shaogao Lv, Zengyan Fan, Heng Lian, Taiji Suzuki, and Kenji Fukumizu: A reproducing kernel Hilbert space approach to high dimensional partially varying coefficient model. Computational Statistics and Data Analysis, Volume 152, December 2020, 107039.
  11. Masaaki Takada, Taiji Suzuki, and Hironori Fujisawa: Independently Interpretable Lasso for Generalized Linear Models. Neural Computation, Volume 32, Issue 6, May 13, 2020, 1168--1221. (arXiv:1711.01796).
  12. Satoshi Hayakawa and Taiji Suzuki: On the minimax optimality and superiority of deep neural network learning over sparse parameter spaces. Neural Networks, Volume 123, March 2020, pp. 343--361. (arXiv version, arXiv:1905.09195).
  13. Taiji Suzuki: Fast Learning Rate of Non-Sparse Multiple Kernel Learning and Optimal Regularization Strategies. Electronic Journal of Statistics, Volume 12, Number 2 (2018), 2141--2192. doi:10.1214/18-EJS1399.
  14. Yuichi Mori and Taiji Suzuki: Generalized ridge estimator and model selection criteria in multivariate linear regression. Journal of Multivariate Analysis, volume 165, pages 243--261, May 2018. arXiv:1603.09458.
  15. Song Liu, Taiji Suzuki, Raissa Relator, Jun Sese, Masashi Sugiyama, and Kenji Fukumizu: Support Consistency of Direct Sparse-Change Learning in Markov Networks. The Annals of Statistics, vol. 45, no. 3, 959--990, 2017. DOI: 10.1214/16-AOS1470.
  16. Song Liu, Kenji Fukumizu and Taiji Suzuki: Learning Sparse Structural Changes in High-dimensional Markov Networks: A Review on Methodologies and Theories. Behaviormetrika, 44(1):265--286, 2017. DOI: 10.1007/s41237-017-0014-z.
  17. Yoshito Hirata, Kai Morino, Taiji Suzuki, Qian Guo, Hiroshi Fukuhara, and Kazuyuki Aihara: System Identification and Parameter Estimation in Mathematical Medicine: Examples Demonstrated for Prostate Cancer. Quantitative Biology, 2016, 4(1): 13--19. DOI: 10.1007/s40484-016-0059-0.
  18. Taiji Suzuki: Stochastic Alternating Direction Method of Multipliers for Structured Regularization. Journal of Japan Society of Computational Statistics, 28(2015), 105--124
  19. Taiji Suzuki, and Kazuyuki Aihara: Nonlinear System Identification for Prostate Cancer and Optimality of Intermittent Androgen Suppression Therapy. Mathematical Biosciences, vol. 245, issue 1, pp. 40--48, 2013.
  20. Taiji Suzuki, and Masashi Sugiyama: Fast learning rate of multiple kernel learning: trade-off between sparsity and smoothness. The Annals of Statistics, vol. 41, number 3, pp. 1381-1405, 2013. (arXiv version, arXiv:1203.0565)
  21. Taiji Suzuki: Improvement of Multiple Kernel Learning using Adaptively Weighted Regularization. JSIAM Letters, vol. 5, pp. 49--52, 2013.
  22. Masashi Sugiyama, Takafumi Kanamori, Taiji Suzuki, M. C. du Plessis, Song Liu, Ichiro Takeuchi: Density Difference Estimation. Neural Computation, 25(10): 2734--2775, 2013.
  23. Makoto Yamada, Taiji Suzuki, Takafumi Kanamori, Hirotaka Hachiya, Masashi Sugiyama, Relative Density-Ratio Estimation for Robust Distribution Comparison. Neural Computation, vol. 25, number 5, pp. 1324--1370, 2013.
  24. Takafumi Kanamori, Taiji Suzuki, and Masashi Sugiyama: Computational complexity of kernel-based density-ratio estimation: A condition number analysis. Machine Learning, vol. 90, pp. 431-460, 2013.
  25. Taiji Suzuki, and Masashi Sugiyama: Sufficient dimension reduction via squared-loss mutual information estimation. Neural Computation, vol. 25, pp. 725-758, 2013. (software (matlab))
  26. Takafumi Kanamori, Taiji Suzuki, and Masashi Sugiyama: Statistical analysis of kernel-based least-squares density-ratio estimation. Machine Learning, vol. 86, Issue 3, pp. 335-367, 2012.
  27. Takafumi Kanamori, Taiji Suzuki, and Masashi Sugiyama: f-divergence estimation and two-sample homogeneity test under semiparametric density-ratio models. IEEE Transactions on Information Theory, Vol. 58, Issue 2, pp. 708-720, 2012.
  28. Masashi Sugiyama, Taiji Suzuki, and Takafumi Kanamori: Density ratio matching under the Bregman divergence: A unified framework of density ratio estimation. Annals of the Institute of Statistical Mathematics, vol. 11, pp. 1--36, 2011.
  29. Taiji Suzuki and Ryota Tomioka: SpicyMKL: A Fast Algorithm for Multiple Kernel Learning with Thousands of Kernels. Machine Learning, vol. 85, issue 1, pp. 77--108, 2011. (arXiv:0909.5026, METR, slide (pptm, pdf) in one-day workshop at ISM, software)
  30. Masashi Sugiyama, Taiji Suzuki, Yuta Itoh, Takafumi Kanamori, and Manabu Kimura: Least-Squares Two-Sample Test. Neural Networks, vol.24, no.7, pp.735--751, 2011.
  31. Ryota Tomioka, Taiji Suzuki, and Masashi Sugiyama: Super-Linear Convergence of Dual Augmented Lagrangian Algorithm for Sparse Learning. Journal of Machine Learning Research, 12(May):1537--1586, 2011. (arXiv:0911.4046)
  32. Masashi Sugiyama, Makoto Yamada, Paul von Bunau, Taiji Suzuki, Takafumi Kanamori, and Motoaki Kawanabe: Direct density-ratio estimation with dimensionality reduction via least-squares hetero-distributional subspace search. Neural Networks, vol.24, no.2, pp.183-198, 2011.
  33. Taiji Suzuki, Nicholas Bruchovsky, and Kazuyuki Aihara: Piecewise Affine Systems Modelling for Optimizing Hormonal Therapy of Prostate Cancer. Philosophical Transactions of the Royal Society A, 368 (2010), 5045--5059.
  34. Taiji Suzuki, and Masashi Sugiyama: Least-squares Independent Component Analysis. Neural Computation, 23(1) (2011), 284--301. (software)
  35. Masashi Sugiyama, and Taiji Suzuki: Least-squares independence test. IEICE Transactions on Information and Systems, vol.E94-D, no.6, pp.1333-1336, 2011.
  36. Takafumi Kanamori, Taiji Suzuki, and Masashi Sugiyama: Theoretical analysis of density ratio estimation. IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, vol.E93-A, no.4, pp.787--798, 2010.
  37. Masashi Sugiyama, Ichiro Takeuchi, Takafumi Kanamori, Taiji Suzuki, Hirotaka Hachiya, and Daisuke Okanohara: Least-squares conditional density estimation. IEICE Transactions on Information and Systems, vol.E93-D, no.3, pp.583-594, 2010.
  38. Masashi Sugiyama, Takafumi Kanamori, Taiji Suzuki, Shohei Hido, Jun Sese, Ichiro Takeuchi, and Liwei Wang: A density-ratio framework for statistical data processing. IPSJ Transactions on Computer Vision and Applications, 1 (2009), 183--208.
  39. Taiji Suzuki, Masashi Sugiyama, Takafumi Kanamori, and Jun Sese: Mutual information estimation reveals global associations between stimuli and biological processes. BMC Bioinformatics, 10(Suppl 1):S52, 2009.
  40. Masashi Sugiyama, Taiji Suzuki, Shinichi Nakajima, Hisashi Kashima, Paul von Bunau, and Motoaki Kawanabe: Direct importance estimation for covariate shift adaptation. Annals of the Institute of Statistical Mathematics, 60(4) (2008), 699--746.
  41. Taiji Suzuki, and Fumiyasu Komaki: On prior selection and covariate shift of $\beta$-Bayesian prediction under $\alpha$-divergence risk. Communications in Statistics --- Theory and Methods, 39(8) (2010), 1655--1673.
  42. Akimichi Takemura, and Taiji Suzuki: Game-Theoretic Derivation of Discrete Distributions and Discrete Pricing Formulas. Journal of Japan Statistical Society, 37 (1) (2006), 87--104.
  43. Taiji Suzuki, Satoshi Aoki, and Kazuo Murota: Use of primal-dual technique in the network algorithm for two-way contingency tables. Japan Journal of Industrial and Applied Mathematics, 22 (1) (2005), 133--145. (Errata)
International Conference papers (Refereed):
  1. Yuka Hashimoto, Sho Sonoda, Isao Ishikawa, Atsushi Nitanda, Taiji Suzuki: Koopman-based Generalization Bound: New Aspect for Full-rank Weights. The Twelfth International Conference on Learning Representations (ICLR2024), accepted.
  2. Wei Huang, Ye Shi, Zhongyi Cai, Taiji Suzuki: Understanding Convergence and Generalization in Federated Learning through Feature Learning Theory. The Twelfth International Conference on Learning Representations (ICLR2024), accepted.
  3. Keita Suzuki, Taiji Suzuki: Optimal criterion for feature learning of two-layer linear neural network in high dimensional interpolation regime. The Twelfth International Conference on Learning Representations (ICLR2024), accepted.
  4. Yuto Nishimura, Taiji Suzuki: Minimax optimality of convolutional neural networks for infinite dimensional input-output problems and separation from kernel methods. The Twelfth International Conference on Learning Representations (ICLR2024), accepted.
  5. Atsushi Nitanda, Kazusato Oko, Taiji Suzuki, Denny Wu: Anisotropy helps: improved statistical and computational complexity of the mean-field Langevin dynamics under structured data. The Twelfth International Conference on Learning Representations (ICLR2024), accepted.
  6. Juno Kim, Kakei Yamamoto, Kazusato Oko, Zhuoran Yang, Taiji Suzuki: Symmetric Mean-field Langevin Dynamics for Distributional Minimax Problems. The Twelfth International Conference on Learning Representations (ICLR2024), accepted.
  7. Taiji Suzuki, Denny Wu, Atsushi Nitanda: Convergence of mean-field Langevin dynamics: Time and space discretization, stochastic gradient, and variance reduction. Thirty-seventh Conference on Neural Information Processing Systems (NeurIPS2023), accepted. (arXiv:2306.07221)
  8. Alireza Mousavi-Hosseini, Denny Wu, Taiji Suzuki, Murat Erdogdu: Gradient-Based Feature Learning under Structured Data. Thirty-seventh Conference on Neural Information Processing Systems (NeurIPS2023), accepted. (arXiv:2309.03843)
  9. Jimmy Ba, Murat Erdogdu, Taiji Suzuki, Zhichao Wang, Denny Wu: Learning in the Presence of Low-dimensional Structure: A Spiked Random Matrix Perspective. Thirty-seventh Conference on Neural Information Processing Systems (NeurIPS2023), accepted.
  10. Taiji Suzuki, Denny Wu, Kazusato Oko, Atsushi Nitanda: Feature learning via mean-field Langevin dynamics: classifying sparse parities and beyond. Thirty-seventh Conference on Neural Information Processing Systems (NeurIPS2023), accepted.
  11. Shuhei Nitta, Taiji Suzuki, Albert Rodríguez Mulet, Atsushi Yaguchi, and Ryusuke Hirai: Scalable Federated Learning for Clients with Different Input Image Sizes and Numbers of Output Categories. Proceedings of the 22nd IEEE International Conference on Machine Learning and Applications (ICMLA2023), 2023.
  12. Kazusato Oko, Shunta Akiyama, Taiji Suzuki: Diffusion Models are Minimax Optimal Distribution Estimators. Proceedings of the 40th International Conference on Machine Learning (ICML2023), PMLR 202:26517--26582, 2023. (also presented at the ICLR2023 workshop ME-FoMo 2023) arXiv:2303.01861.
  13. Tomoya Murata, Taiji Suzuki: DIFF2: Differential Private Optimization via Gradient Differences for Nonconvex Distributed Learning. Proceedings of the 40th International Conference on Machine Learning (ICML2023), PMLR 202:25523--25548, 2023. arXiv:2302.03884.
  14. Atsushi Nitanda, Kazusato Oko, Denny Wu, Nobuhito Takenouchi, Taiji Suzuki: Primal and Dual Analysis of Entropic Fictitious Play for Finite-sum Problems. Proceedings of the 40th International Conference on Machine Learning (ICML2023), PMLR 202:26266--26282, 2023. arXiv:2303.02957.
  15. Shokichi Takakura, Taiji Suzuki: Approximation and Estimation Ability of Transformers for Sequence-to-Sequence Functions with Infinite Dimensional Input. Proceedings of the 40th International Conference on Machine Learning (ICML2023), PMLR 202:33416--33447, 2023.
  16. Atsushi Suzuki, Atsushi Nitanda, Taiji Suzuki, Jing Wang, Feng Tian, Kenji Yamanishi. Tight and fast generalization error bound of graph embedding in metric space. Proceedings of the 40th International Conference on Machine Learning, PMLR 202:33268--33284, 2023.
  17. Hiroaki Kingetsu, Kenichi Kobayashi, Taiji Suzuki: Neural Network Module Decomposition and Recomposition with Superimposed Masks. 2023 International Joint Conference on Neural Networks (IJCNN), accepted.
  18. Shunta Akiyama, Taiji Suzuki: Excess Risk of Two-Layer ReLU Neural Networks in Teacher-Student Settings and its Superiority to Kernel Methods. The Eleventh International Conference on Learning Representations (ICLR2023). arXiv:2205.14818.
  19. Taiji Suzuki, Atsushi Nitanda, Denny Wu: Uniform-in-time propagation of chaos for the mean field gradient Langevin dynamics. The Eleventh International Conference on Learning Representations (ICLR2023).
  20. Kazusato Oko, Shunta Akiyama, Tomoya Murata, Taiji Suzuki: Versatile Single-Loop Method for Gradient Estimator: First and Second Order Optimality, and its Application to Federated Learning. Accepted by OPT2022 (14th International OPT Workshop on Optimization for Machine Learning at NeurIPS2022). arXiv:2209.00361.
  21. Kishan Wimalawarne, Taiji Suzuki: Layer-wise Adaptive Graph Convolution Networks Using Generalized Pagerank. Proceedings of The 14th Asian Conference on Machine Learning (ACML2022), PMLR 189:1117--1132, 2023. arXiv:2108.10636.
  22. Naoki Nishikawa, Taiji Suzuki, Atsushi Nitanda, Denny Wu: Two-layer neural network on infinite dimensional data: global optimization guarantee in the mean-field regime. Advances in Neural Information Processing Systems 35 (NeurIPS 2022), pp.32612--32623, 2022.
  23. Jimmy Ba, Murat A. Erdogdu, Taiji Suzuki, Zhichao Wang, Denny Wu, Greg Yang: High-dimensional Asymptotics of Feature Learning: How One Gradient Step Improves the Representation. Advances in Neural Information Processing Systems 35 (NeurIPS 2022), pp.37932--37946, 2022. arXiv:2205.01445.
  24. Yuri Kinoshita, Taiji Suzuki: Improved Convergence Rate of Stochastic Gradient Langevin Dynamics with Variance Reduction and its Application to Optimization. Advances in Neural Information Processing Systems 35 (NeurIPS 2022), pp.19022--19034, 2022. arXiv:2203.16217.
  25. Tomoya Murata, Taiji Suzuki: Escaping Saddle Points with Bias-Variance Reduced Local Perturbed SGD for Communication Efficient Nonconvex Distributed Learning. Advances in Neural Information Processing Systems 35 (NeurIPS 2022), pp.5039--5051, 2022. arXiv:2202.06083.
  26. Chenyuan Xu, Kosuke Haruki, Taiji Suzuki, Masahiro Ozawa, Kazuki Uematsu, Ryuji Sakai: Data-Parallel Momentum Diagonal Empirical Fisher (DP-MDEF): Adaptive Gradient Method is Affected by Hessian Approximation and Multi-Class Data. 2022 IEEE International Conference on Machine Learning and Applications, pp.1397--1404. DOI: 10.1109/ICMLA55696.2022.00221.
  27. Hiroaki Mikami, Kenji Fukumizu, Shogo Murai, Shuji Suzuki, Yuta Kikuchi, Taiji Suzuki, Shin-ichi Maeda, Kohei Hayashi: A Scaling Law for Synthetic-to-Real Transfer: How Much Is Your Pre-training Effective? Proceedings of Machine Learning and Knowledge Discovery in Databases (ECML-PKDD 2022), Part III. Springer Lecture Notes in Computer Science, 13715, pp.477--492, 2022. DOI: https://doi.org/10.1007/978-3-031-26409-2_29. arXiv:2108.11018.
  28. Boris Muzellec, Kanji Sato, Mathurin Massias, Taiji Suzuki: Dimension-free convergence rates for gradient Langevin dynamics in RKHS. Proceedings of Thirty Fifth Conference on Learning Theory (COLT2022), PMLR 178:1356--1420, 2022. arXiv:2003.00306.
  29. Kengo Machida, Kuniaki Uto, Koichi Shinoda, Taiji Suzuki: MSR-DARTS: Minimum Stable Rank of Differentiable Architecture Search. 2022 International Joint Conference on Neural Networks (IJCNN). DOI: 10.1109/ijcnn55064.2022.9892751. arXiv:2009.09209.
  30. Sho Okumoto and Taiji Suzuki: Learnability of convolutional neural networks for infinite dimensional input via mixed and anisotropic smoothness. The Tenth International Conference on Learning Representations (ICLR2022), spotlight presentation.
  31. Kazusato Oko, Taiji Suzuki, Atsushi Nitanda, and Denny Wu: Particle Stochastic Dual Coordinate Ascent: Exponential convergent algorithm for mean field neural network optimization. The Tenth International Conference on Learning Representations (ICLR2022).
  32. Jimmy Ba, Murat A Erdogdu, Marzyeh Ghassemi, Shengyang Sun, Taiji Suzuki, Denny Wu, and Tianzong Zhang: Understanding the Variance Collapse of SVGD in High Dimensions. The Tenth International Conference on Learning Representations (ICLR2022).
  33. Atsushi Nitanda, Denny Wu, Taiji Suzuki: Convex Analysis of the Mean Field Langevin Dynamics. 25th International Conference on Artificial Intelligence and Statistics (AISTATS2022), Proceedings of Machine Learning Research, 151:9741--9757, 2022. arXiv:2201.10469.
  34. Chihiro Watanabe, Taiji Suzuki: AutoLL: Automatic Linear Layout of Graphs based on Deep Neural Network. IEEE Symposium Series on Computational Intelligence (SSCI 2021), DOI: 10.1109/SSCI50451.2021.9659893. arXiv:2108.02431
  35. Atsushi Nitanda, Denny Wu, Taiji Suzuki: Particle Dual Averaging: Optimization of Mean Field Neural Networks with Global Convergence Rate Analysis. Advances in Neural Information Processing Systems 34 (NeurIPS 2021), 34:19608--19621, 2021. arXiv:2012.15477.
  36. Taiji Suzuki, Atsushi Nitanda: Deep learning is adaptive to intrinsic dimensionality of model smoothness in anisotropic Besov space. Advances in Neural Information Processing Systems 34 (NeurIPS 2021), 34:3609--3621, 2021 (spotlight, 3% of all submissions). arXiv:1910.12799.
  37. Stefano Massaroli, Michael Poli, Sho Sonoda, Taiji Suzuki, Jinkyoo Park, Atsushi Yamashita, Hajime Asama: Differentiable Multiple Shooting Layers. Advances in Neural Information Processing Systems 34 (NeurIPS 2021), 34:16532--16544, 2021. arXiv:2106.03885.
  38. Shunta Akiyama, Taiji Suzuki: On Learnability via Gradient Method for Two-Layer ReLU Neural Networks in Teacher-Student Setting. ICML2021, PMLR 139:152--162, 2021.
  39. Akira Nakagawa, Keizo Kato, Taiji Suzuki: Quantitative Understanding of VAE as a Non-linearly Scaled Isometric Embedding. ICML2021, PMLR 139:7916--7926, 2021.
  40. Tomoya Murata, Taiji Suzuki: Bias-Variance Reduced Local SGD for Less Heterogeneous Federated Learning. ICML2021, PMLR 139:7872--7881, 2021. arXiv:2102.03198.
  41. Atsushi Yaguchi, Taiji Suzuki, Shuhei Nitta, Yukinobu Sakata, Akiyuki Tanizawa: Decomposable-Net: Scalable Low-Rank Compression for Neural Networks. IJCAI-2021, Main Track, Pages 3249--3256, 2021. DOI: https://doi.org/10.24963/ijcai.2021/447. arXiv:1910.13141.
  42. Shingo Yashima, Atsushi Nitanda, Taiji Suzuki: Exponential Convergence Rates of Classification Errors on Learning with SGD and Random Features. AISTATS2021, PMLR 130:1954--1962, 2021. arXiv:1911.05350.
  43. Tomoya Murata, and Taiji Suzuki: Gradient Descent in RKHS with Importance Labeling. AISTATS2021, PMLR 130:1981--1989, 2021. arXiv:2006.10925.
  44. Taiji Suzuki, Shunta Akiyama: Benefit of deep learning with non-convex noisy gradient descent: Provable excess risk bound and superiority to kernel methods. ICLR2021 (selected as spotlight). (arXiv version: arXiv:2012.03224). Presentation slide.
  45. Shun-ichi Amari, Jimmy Ba, Roger Grosse, Xuechen Li, Atsushi Nitanda, Taiji Suzuki, Denny Wu, Ji Xu: When Does Preconditioning Help or Hurt Generalization? ICLR2021. (arXiv version: arXiv:2006.10732).
  46. Atsushi Nitanda, and Taiji Suzuki: Optimal Rates for Averaged Stochastic Gradient Descent under Neural Tangent Kernel Regime. ICLR2021 (selected as oral presentation and won an outstanding paper award (8 papers out of 860 accepted papers, 2997 submitted papers)). (arXiv version: arXiv:2006.12297).
  47. Taiji Suzuki: Generalization bound of globally optimal non-convex neural network training: Transportation map estimation by infinite dimensional Langevin dynamics. Advances in Neural Information Processing Systems 33 (NeurIPS 2020), pp.19224--19237, 2020. (selected as spotlight). arXiv:2007.05824. Presentation slide.
  48. Kenta Oono, and Taiji Suzuki: Optimization and Generalization Analysis of Transduction through Gradient Boosting and Application to Multi-scale Graph Neural Networks. Advances in Neural Information Processing Systems 33 (NeurIPS 2020), pp.18917--18930, 2020. arXiv:2006.08550.
  49. Laurent Dillard, Yosuke Shinya, Taiji Suzuki: Domain Adaptation Regularization for Spectral Pruning. BMVC2020 (British Machine Vision Conference 2020), 2020.
  50. Taiji Suzuki, Hiroshi Abe, Tomoya Murata, Shingo Horiuchi, Kotaro Ito, Tokuma Wachi, So Hirai, Masatoshi Yukishima, Tomoaki Nishimura: Spectral pruning: Compressing deep neural networks via spectral analysis and its generalization error. IJCAI-PRICAI 2020, pp. 2839--2846. (long version: arXiv:1808.08558).
  51. Atsushi Nitanda, Taiji Suzuki: Functional Gradient Boosting for Learning Residual-like Networks with Statistical Guarantees. AISTATS2020, Proceedings of Machine Learning Research, 108:2981--2991, 2020.
  52. Jingling Li, Yanchao Sun, Ziyin Liu, Taiji Suzuki and Furong Huang: Understanding of Generalization in Deep Learning via Tensor Methods. AISTATS2020, Proceedings of Machine Learning Research, 108:504--515, 2020. Presented also in ICML2019 Workshop "Understanding and Improving Generalization in Deep Learning."
  53. Jimmy Ba, Murat Erdogdu, Taiji Suzuki, Denny Wu, Tianzong Zhang: Generalization of Two-layer Neural Networks: An Asymptotic Viewpoint. ICLR2020, selected as spotlight.
  54. Kenta Oono and Taiji Suzuki: Graph Neural Networks Exponentially Lose Expressive Power for Node Classification. ICLR2020, selected as spotlight. arXiv:1905.10947.
  55. Taiji Suzuki, Hiroshi Abe, Tomoaki Nishimura: Compression based bound for non-compressed network: unified generalization error analysis of large compressible deep neural network. ICLR2020, selected as spotlight. (slide). arXiv:1909.11274.
  56. Yosuke Shinya, Edgar Simo-Serra, and Taiji Suzuki: Understanding the Effects of Pre-training for Object Detectors via Eigenspectrum. ICCV2019, Neural Architects Workshop (selected for the shortlist of strongest papers). arXiv:1909.04021.
  57. Atsushi Nitanda, Tomoya Murata, and Taiji Suzuki: Sharp Characterization of Optimal Minibatch Size for Stochastic Finite Sum Convex Optimization. ICDM2019, 2019 IEEE International Conference on Data Mining (ICDM), Beijing, China, 2019, pp. 488--497. Regular paper, nominated as Best Paper Candidate.
  58. Kenta Oono and Taiji Suzuki: Approximation and Non-parametric Estimation of ResNet-type Convolutional Neural Networks. ICML2019, Proceedings of Machine Learning Research, 97:4922--4931, 2019. (arXiv:1903.10047).
  59. Jingling Li, Yanchao Sun, Ziyin Liu, Taiji Suzuki and Furong Huang: Understanding of Generalization in Deep Learning via Tensor Methods. ICML2019 Workshop "Understanding and Improving Generalization in Deep Learning."
  60. Heishiro Kanagawa, Hayato Kobayashi, Nobuyuki Shimizu, Yukihiro Tagami, and Taiji Suzuki: Cross-domain Recommendation via Deep Domain Adaptation. 41st European Conference on Information Retrieval (ECIR2019), Advances in Information Retrieval, pp. 20--29, 2019. arXiv:1803.03018.
  61. Atsushi Nitanda, Taiji Suzuki: Stochastic Gradient Descent with Exponential Convergence Rates of Expected Classification Errors. AISTATS2019, Proceedings of Machine Learning Research, PMLR 89:1417-1426, 2019. arXiv:1806.05438.
  62. Taiji Suzuki: Adaptivity of deep ReLU network for learning in Besov and mixed smooth Besov spaces: optimal rate and curse of dimensionality. The 7th International Conference on Learning Representations (ICLR2019), accepted. [The proof of Proposition 4 can be found here (provided by Satoshi Hayakawa, who pointed out the technical flaw).] (arXiv:1810.08033).
  63. Kazuo Yonekura, Hitoshi Hattori, and Taiji Suzuki: Short-term local weather forecast using dense weather station by deep neural network. In Proceedings of 2018 IEEE International Conference on Big Data (Big Data), pp.10--13, 2018. DOI: 10.1109/BigData.2018.8622195.
  64. Tomoya Murata, and Taiji Suzuki: Sample Efficient Stochastic Gradient Iterative Hard Thresholding Method for Stochastic Sparse Linear Regression with Limited Attribute Observation. Advances in Neural Information Processing Systems 31 (NeurIPS2018), pp.5312--5321, 2018. arXiv:1809.01765.
  65. Atsushi Yaguchi, Taiji Suzuki, Wataru Asano, Shuhei Nitta, Yukinobu Sakata, Akiyuki Tanizawa: Adam Induces Implicit Weight Sparsity in Rectifier Neural Networks. In Proceedings of IEEE 17th International Conference on Machine Learning and Applications (ICMLA 2018), pp.17--20, 2018. DOI: 10.1109/ICMLA.2018.00054.
  66. Atsushi Nitanda and Taiji Suzuki: Functional gradient boosting based on residual network perception. ICML2018, Proceedings of the 35th International Conference on Machine Learning, 80:3819--3828, 2018. arXiv:1802.09031.
  67. Atsushi Nitanda and Taiji Suzuki: Gradient Layer: Enhancing the Convergence of Adversarial Training for Generative Models. AISTATS2018, Proceedings of Machine Learning Research, 84:454--463, 2018. arXiv:1801.02227.
  68. Masaaki Takada, Taiji Suzuki, and Hironori Fujisawa: Independently Interpretable Lasso: A New Regularizer for Sparse Regression with Uncorrelated Variables. AISTATS2018, Proceedings of Machine Learning Research, 84:1008--1016, 2018. arXiv:1711.01796.
  69. Taiji Suzuki: Fast generalization error bound of deep learning from a kernel perspective. AISTATS2018, Proceedings of Machine Learning Research, 84:1397--1406, 2018. arXiv:1705.10182.
  70. Song Liu, Akiko Takeda, Taiji Suzuki and Kenji Fukumizu: Trimmed Density Ratio Estimation. NIPS2017, 4518--4528, 2017. arXiv:1703.03216.
  71. Tomoya Murata and Taiji Suzuki: Doubly Accelerated Stochastic Variance Reduced Dual Averaging Method for Regularized Empirical Risk Minimization. NIPS2017, 608--617, 2017. arXiv:1703.00439.
  72. Atsushi Nitanda and Taiji Suzuki: Stochastic Difference of Convex Algorithm and its Application to Training Deep Boltzmann Machines. The 20th International Conference on Artificial Intelligence and Statistics (AISTATS2017), Proceedings of Machine Learning Research, 54:470--478, 2017.
  73. Taiji Suzuki, Heishiro Kanagawa, Hayato Kobayashi, Nobuyuki Shimizu, and Yukihiro Tagami: Minimax Optimal Alternating Minimization for Kernel Nonparametric Tensor Learning. The 30th Annual Conference on Neural Information Processing Systems (NIPS2016), pp. 3783-3791, 2016.
  74. Heishiro Kanagawa, Taiji Suzuki, Hayato Kobayashi, Nobuyuki Shimizu, and Yukihiro Tagami: Gaussian process nonparametric tensor estimator and its minimax optimality. Proceedings of The 33rd International Conference on Machine Learning, pp. 1632–1641, 2016.
  75. Song Liu, Taiji Suzuki, Masashi Sugiyama, and Kenji Fukumizu: Structure Learning of Partitioned Markov Networks. International Conference on Machine Learning (ICML2016), Proceedings of The 33rd International Conference on Machine Learning, pp. 439–448, 2016.
  76. Taiji Suzuki and Heishiro Kanagawa: Bayes method for low rank tensor estimation. International Meeting on “High-Dimensional Data Driven Science” (HD3-2015), December 14-17, 2015, Kyoto, Japan. Oral presentation. Journal of Physics: Conference Series, 699(1), 012020, 2016.
  77. Taiji Suzuki: Convergence rate of Bayesian tensor estimator and its minimax optimality. The 32nd International Conference on Machine Learning (ICML2015), JMLR Workshop and Conference Proceedings 37:pp. 1273--1282, 2015.
  78. Satoshi Hara, Tetsuro Morimura, Toshihiro Takahashi, Hiroki Yanagisawa, Taiji Suzuki: A Consistent Method for Graph Based Anomaly Localization. The 18th International Conference on Artificial Intelligence and Statistics (AISTATS2015), JMLR Workshop and Conference Proceedings 38:333--341, 2015.
  79. Song Liu, Taiji Suzuki, and Masashi Sugiyama: Support Consistency of Direct Sparse-Change Learning in Markov Networks. The Twenty-Ninth AAAI Conference on Artificial Intelligence (AAAI2015), 2015. (arXiv:1407.0581).
  80. Taiji Suzuki: Stochastic Dual Coordinate Ascent with Alternating Direction Method of Multipliers. International Conference on Machine Learning (ICML2014), JMLR Workshop and Conference Proceedings 32(1):736--744, 2014. supplementary. (arXiv version: arXiv:1311.0622) This paper was also presented at OPT2013, the NIPS workshop "Optimization for Machine Learning". Source code (Matlab).
  81. Ryota Tomioka, and Taiji Suzuki: Convex Tensor Decomposition via Structured Schatten Norm Regularization. Advances in Neural Information Processing Systems (NIPS2013), 1331--1339, 2013.
  82. Taiji Suzuki: Dual Averaging and Proximal Gradient Descent for Online Alternating Direction Multiplier Method. International Conference on Machine Learning (ICML2013), 2013, JMLR Workshop and Conference Proceedings 28(1): 392--400, 2013. Source code (Matlab).
  83. Masashi Sugiyama, Takafumi Kanamori, Taiji Suzuki, Marthinus du Plessis, Song Liu, and Ichiro Takeuchi: Density-Difference Estimation. Advances in Neural Information Processing Systems (NIPS2012), 692--700, 2012.
  84. Taiji Suzuki: PAC-Bayesian Bound for Gaussian Process Regression and Multiple Kernel Additive Model. Conference on Learning Theory (COLT2012), 2012, JMLR Workshop and Conference Proceedings 23: 8.1--8.20, 2012. (slide)
  85. Takafumi Kanamori, Akiko Takeda and Taiji Suzuki: A Conjugate Property between Loss Functions and Uncertainty Sets in Classification Problems. Conference on Learning Theory (COLT2012), 2012, JMLR Workshop and Conference Proceedings 23: 29.1--29.23, 2012.
  86. Taiji Suzuki and Masashi Sugiyama: Fast Learning Rate of Multiple Kernel Learning: Trade-off between Sparsity and Smoothness. Fifteenth International Conference on Artificial Intelligence and Statistics (AISTATS2012), (selected as oral presentation). JMLR Workshop and Conference Proceedings 22: 1152--1183, 2012. (long version, arXiv:1203.0565)
  87. Taiji Suzuki: Unifying Framework for Fast Learning Rate of Non-Sparse Multiple Kernel Learning. Advances in Neural Information Processing Systems 24 (NIPS2011). pp.1575--1583. (long version, arXiv:1111.3781)
  88. Ryota Tomioka, Taiji Suzuki, Kohei Hayashi and Hisashi Kashima: Statistical Performance of Convex Tensor Decomposition. Advances in Neural Information Processing Systems 24 (NIPS2011). pp.972--980.
  89. Makoto Yamada, Taiji Suzuki, Takafumi Kanamori, Hirotaka Hachiya and Masashi Sugiyama: Relative Density-Ratio Estimation for Robust Distribution Comparison. Advances in Neural Information Processing Systems 24 (NIPS2011). pp.594--602. (long version, arXiv:1106.4729)
  90. Ryota Tomioka and Taiji Suzuki: Regularization Strategies and Empirical Bayesian Learning for MKL. NIPS2010 Workshop: New Directions in Multiple Kernel Learning, 2010. (arXiv:1011.3090)
  91. Ryota Tomioka, Taiji Suzuki, Masashi Sugiyama and Hisashi Kashima: A Fast Augmented Lagrangian Algorithm for Learning Low-Rank Matrices. The 27th International Conference on Machine Learning (ICML2010). pp.1087--1094. (pdf)
  92. Taiji Suzuki and Masashi Sugiyama: Sufficient dimension reduction via squared-loss mutual information estimation. Thirteenth International Conference on Artificial Intelligence and Statistics (AISTATS2010). JMLR Workshop and Conference Proceedings 9: pp.781--788, 2010. (pdf)
  93. Masashi Sugiyama, Ichiro Takeuchi, Takafumi Kanamori, Taiji Suzuki, Hirotaka Hachiya, and Daisuke Okanohara: Conditional density estimation via least-squares density ratio estimation. Thirteenth International Conference on Artificial Intelligence and Statistics (AISTATS2010). JMLR Workshop and Conference Proceedings 9: pp.804--811, 2010. (pdf)
  94. Masashi Sugiyama, Satoshi Hara, Paul von Bunau, Taiji Suzuki, Takafumi Kanamori, and Motoaki Kawanabe: Direct density ratio estimation with dimensionality reduction. 2010 SIAM International Conference on Data Mining (SDM2010). pp.595--606. (pdf)
  95. Ryota Tomioka and Taiji Suzuki: Sparsity-accuracy trade-off in MKL. NIPS 2009 Workshop: Understanding Multiple Kernel Learning Methods, Whistler, Canada. (T. Suzuki presented) (arXiv:1001.2615)
  96. Ryota Tomioka, Taiji Suzuki, and Masashi Sugiyama: Super-Linear Convergence of Dual Augmented Lagrangian Algorithm for Sparse Learning. NIPS 2009 Workshop: Optimization for Machine Learning, Whistler, Canada. (arXiv:0911.4046)
  97. Taiji Suzuki, Masashi Sugiyama, and Toshiyuki Tanaka: Mutual information approximation via maximum likelihood estimation of density ratio. 2009 IEEE International Symposium on Information Theory (ISIT2009). pp.463--467, Seoul, Korea, 2009.
  98. Taiji Suzuki, and Masashi Sugiyama: Estimating Squared-loss Mutual Information for Independent Component Analysis. ICA 2009. Paraty, Brazil, 2009. Lecture Notes in Computer Science, Vol. 5441, pp.130--137, Berlin, Springer, 2009.
  99. Taiji Suzuki, Masashi Sugiyama, Takafumi Kanamori and Jun Sese: Mutual information estimation reveals global associations between stimuli and biological processes. In Proceedings of the Seventh Asia Pacific Bioinformatics Conference (APBC 2009), Beijing, China, 2009.
  100. Taiji Suzuki, Masashi Sugiyama, Jun Sese, and Takafumi Kanamori: Approximating mutual information by maximum likelihood density ratio estimation. In Proceedings of the 3rd workshop on new challenges for feature selection in data mining and knowledge discovery (FSDM2008), JMLR workshop and conference proceedings, Vol. 4, pp.5--20, 2008.
  101. Taiji Suzuki, Masashi Sugiyama, Jun Sese, and Takafumi Kanamori: A least-squares approach to mutual information estimation with application in variable selection. In Proceedings of the 3rd workshop on new challenges for feature selection in data mining and knowledge discovery (FSDM2008). Antwerp, Belgium, 2008.
  102. Taiji Suzuki, Takamasa Koshizen, Kazuyuki Aihara and Hiroshi Tsujino: Learning to estimate user interest utilizing the variational Bayes estimator. Intelligent Systems Design and Applications (ISDA) 2005, 94--99. Wroclaw, Poland, September 2005.
  103. Tetsuya Hoya, Gen Hori, Havagim Bakardjian, Tomoaki Nishimura, Taiji Suzuki, Yoichi Miyawaki, Arao Funase, and Jianting Cao: Classification of Single Trial EEG Signals by a Combined Principal + Independent Component Analysis and Probabilistic Neural Network Approach. Proc. ICA2003, pp. 197-202. Nara, Japan, January 2003.
Book:
  • Masashi Sugiyama, Taiji Suzuki, & Takafumi Kanamori: Density Ratio Estimation in Machine Learning. Cambridge University Press, 2012.
Technical Report:
  • Taiji Suzuki, Ryota Tomioka, Masashi Sugiyama: Fast Convergence Rate of Multiple Kernel Learning with Elastic-net Regularization. arXiv:1103.0431. (slide in Japanese)
  • Taiji Suzuki, Ryota Tomioka, and Masashi Sugiyama: Sharp Convergence Rate and Support Consistency of Multiple Kernel Learning with Sparse and Dense Regularization. arXiv:1103.5201.
  • Taiji Suzuki: Fast Learning Rate of lp-MKL and its Minimax Optimality. arXiv:1103.5202.
  • Taiji Suzuki, and Ryota Tomioka: SpicyMKL. arXiv:0909.5026, METR. (slide (pptm, pdf) in one-day workshop at ISM, software)
  • Taiji Suzuki, Satoshi Aoki and Kazuo Murota: Use of primal-dual technique in the network algorithm for two-way contingency tables. METR 2004-28, Department of Mathematical Informatics, University of Tokyo, May 2004. (pdf) (Errata)
  • Akimichi Takemura and Taiji Suzuki: Game Theoretic Derivation of Discrete Distributions and Discrete Pricing Formulas. METR2005-25, Department of Mathematical Informatics, University of Tokyo, September 2005. (pdf)
Article in Book:
  • Ryota Tomioka, Taiji Suzuki, & Masashi Sugiyama: Augmented Lagrangian methods for learning, selecting, and combining features. In S. Sra, S. Nowozin, and S. J. Wright (Eds.), Optimization for Machine Learning, MIT Press, Cambridge, MA, USA, 2011.
  • Masashi Sugiyama, Taiji Suzuki, & Takafumi Kanamori: Density ratio estimation: A comprehensive review. In Statistical Experiment and Its Related Topics, Research Institute for Mathematical Sciences Kokyuroku, no.1703, pp.10-31, 2010. (Presented at Research Institute for Mathematical Sciences Workshop on Statistical Experiment and Its Related Topics, Kyoto, Japan, Mar. 8-10, 2010)
  • Ryota Tomioka, Taiji Suzuki, & Masashi Sugiyama: Optimization algorithms for sparse regularization and multiple kernel learning and their applications to image recognition. Image Lab, vol.21, no.4, pp.5-11, 2010.
Award:
  • Atsushi Nitanda, and Taiji Suzuki: ICLR2021, Outstanding paper award. "Optimal Rates for Averaged Stochastic Gradient Descent under Neural Tangent Kernel Regime."
  • Atsushi Nitanda, Tomoya Murata, and Taiji Suzuki: ICDM '19 Best Paper Candidate for KAIS Publication, IEEE International Conference on Data Mining (ICDM), 2019.
  • Taiji Suzuki: The Japan Society for Industrial and Applied Mathematics, Best paper award 2016. Improvement of Multiple Kernel Learning using Adaptively Weighted Regularization.
  • Taiji Suzuki: IBISML (Information-Based Induction Sciences and Machine Learning), Best Paper Award 2012 (FY2012 IBISML Workshop Award). Dual Averaging and Proximal Gradient Descent for Online Alternating Direction Multiplier Method.
  • Taiji Suzuki: Dean's Award, Graduate School of Information Science and Technology, The University of Tokyo, 2009.
  • Taiji Suzuki: Dean's Award, Graduate School of Information Science and Technology, The University of Tokyo, 2006.
  • MIRU Excellent Paper Award, Meeting on Image Recognition and Understanding 2008 (MIRU2008), 2008. "Direct Importance Estimation - A New Versatile Tool for Statistical Pattern Recognition," Masashi Sugiyama (Tokyo Institute of Technology), Takafumi Kanamori (Nagoya University), Taiji Suzuki (University of Tokyo), Shohei Hido (IBM Research), Jun Sese (Ochanomizu University), Ichiro Takeuchi (Mie University), and Liwei Wang (Peking University).
Domestic Conference and Meeting:
    list (in Japanese).

Miscellaneous materials