# Kernel Klub

The Spatial Statistics and Kernel Klub meets Wednesdays from 1-2 in Chauvenet Hall room 143 and virtually to discuss spatial statistics and kernel methods.

See below for a schedule and speakers.

## Spring 2023

January 25 | Alex Vidal, Mines Title: An Optimal Transport Approach to Continuous Normalizing Flows Abstract: A normalizing flow (NF) is a mapping that transforms a chosen probability distribution to a normal distribution. Such flows are a common technique used for data generation and density estimation in machine learning and data science. The density estimate obtained with a NF requires a change of variables formula that involves the computation of the Jacobian determinant of the NF transformation. In order to tractably compute this determinant, continuous normalizing flows (CNF) estimate the mapping and its Jacobian determinant using a neural ODE. Optimal transport (OT) theory has been successfully used to assist in finding CNFs by formulating them as OT problems with a soft penalty for enforcing the standard normal distribution as a target measure. A drawback of OT-based CNFs is the addition of a hyperparameter, alpha, that controls the strength of the soft penalty and requires significant tuning. We present an algorithm to solve OT-based CNF without the need for tuning alpha. Instead of tuning alpha, we repeatedly solve the optimization problem for a fixed alpha effectively performing a JKO update with a time-step alpha. Hence we obtain a "divide and conquer" algorithm by repeatedly solving simpler problems instead of solving a potentially harder problem with large alpha. |

February 1 | Luis Tenorio, Mines Dr. Tenorio will continue the discussion on the maximum mean discrepancy (MMD) metric that Alex brought up at last week's meeting. In addition, he will present a probabilistic perspective on the paper "Injective Hilbert Space Embeddings of Probability Measures" by B. K. Sriperumbudur, A. Gretton, K. Fukumizu, G. Lanckriet, and B. Schölkopf. |

February 8 | No Scheduled Meeting |

February 15 | Zach Grey Title: Separable Shape Tensors: Generative Modeling of Discrete Planar Curves Abstract: We'll discuss a novel approach to dimension reduction and representation of shape boundaries using sequences of discrete landmarks sampled from curves. The advantage of our interpretation is decoupling affine-style deformations of landmarks over a submanifold and a product submanifold principally of the Grassmannian. As an "analytic" generative model of geometries, our separable data-driven representation offers: (i) a rich set of novel planar deformations not previously captured in the data, (ii) an improved parameter domain for inferential statistics, (iii) robust interpolation of curves for novel surface parametrization, and (iv) a notion of consistent deformation over distinct shapes. Some motivating applications include the design of next-generation offshore wind turbine blades in collaboration with the National Renewable Energy Laboratory (NREL) and two-sample hypothesis testing with empirical distributions of grain boundaries in material microstructure measurements. |

February 22 | Mike Wakin Title: When Randomness Helps in Undersampling Abstract: Signals cannot always be sampled at their full desired resolution. In this talk, we explore the benefits of randomly subsampling a signal’s frequency spectrum. Whereas uniform subsampling introduces structural artifacts in the time series, random subsampling introduces a type of noise whose behavior we quantify. This analysis brings together concepts such as frequency analysis, aliasing, and convolution with probability and statistics. Building on the idea of spectral subsampling, we also propose a compressive approach for estimating a Green’s function in seismic interferometry, a task that typically requires cross-correlating very long time series that are difficult to gather, store, and transmit in resource-constrained scenarios. This is joint work with Roel Snieder and Justin Jayne. |

March 1 | |

March 8 | |

March 15 | |

March 29 | |

April 5 | Samy Wu Fung, Mines Title: Stable Diffusion |

April 12 | |

April 19 | Jake Rezac Title: Applications of sparse optimization to inverse scattering and wireless communications Abstract: We discuss generalizations of a classical signal processing technique and apply them to two inverse problems: estimating the angle-of-arrival of a radio wave impinging on an array of antennas and estimating the location and shape of an inhomogeneity from scattered acoustic waves. We discuss this signal processing technique, the MUSIC algorithm (Multiple Signal Classification), as a method to relate unknowns-of-interest to the range of a matrix containing measurements of a relevant physical quantity in each of these applications. We build two new algorithms on this relationship, one which improves computational efficiency and one which heavily reduces measurement requirements. Each new algorithm is based on a technique from sparse optimization. We demonstrate these techniques on measured and simulated examples. |

April 26 | Michael Ivanitskiy and Ibrohim Nosirov, Mines Title: Sparks of Artificial General Intelligence: Early experiments with GPT-4 |

May 3 |
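
For readers new to the maximum mean discrepancy (MMD) discussed in the February 1 session, the following is a minimal, dependency-free Python sketch of the biased (V-statistic) estimate of the squared MMD between two samples. The Gaussian kernel, the bandwidth, the toy data, and all function names are illustrative assumptions and are not taken from the talk or the paper.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel k(x, y) = exp(-||x - y||^2 / (2 * bandwidth^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth**2))


def mean_kernel(xs, ys, bandwidth):
    """Average of k(x, y) over all pairs with x in xs and y in ys."""
    total = sum(gaussian_kernel(x, y, bandwidth) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two samples."""
    return (
        mean_kernel(xs, xs, bandwidth)
        + mean_kernel(ys, ys, bandwidth)
        - 2.0 * mean_kernel(xs, ys, bandwidth)
    )


def sample_normal(rng, n, mean):
    """Draw n points from a 2-D normal with identity covariance (toy data)."""
    return [(rng.gauss(mean, 1.0), rng.gauss(mean, 1.0)) for _ in range(n)]


rng = random.Random(0)
same = mmd_squared(sample_normal(rng, 100, 0.0), sample_normal(rng, 100, 0.0))
shifted = mmd_squared(sample_normal(rng, 100, 0.0), sample_normal(rng, 100, 2.0))
print(f"same distribution: {same:.4f}, shifted mean: {shifted:.4f}")
```

The estimate should be close to zero when both samples come from the same distribution and clearly larger when the means differ. In practice the bandwidth would need to be chosen with some care; the value 1.0 here is only a placeholder.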

## Spring 2022

##### Spring 2022 Schedule

February 4 | Doug Nychka, Mines Title: Fast prediction of Kriging estimates to a grid and why one would care |

February 11 | Samy Wu Fung, Mines Title: A Review of Stochastic Gradient Descent and its Application in Deep Learning |

February 18 | Greg Fasshauer, Mines Title: Double Descent Abstract: The big picture idea for the success of modern deep learning algorithms proposed by Belkin and others is that the use of over-parametrized models allows us to extend the traditional “bias-variance trade-off” to a more general setting, which opens the door for models that generalize better (while still retaining high training accuracy). In some ways, reproducing kernel Hilbert spaces appear to be a good place for us to study this “theory.” |

February 25 | Luis Tenorio, Mines Title: Double Descent in Linear Least-Squares Abstract: An extension of last week's talk given by Dr. Greg Fasshauer. Here the focus will be to derive some of the results of the paper by Hastie et al. and to understand why the double descent follows from the properties of random matrices. |

March 4 | Luis Tenorio, Mines Title: Random Matrix Theory This will be a follow-up to his talk last week on “Double Descent in Least-Squares.” |

March 11 | Samy Wu Fung, Mines Title: A GAN-based approach for solving high-dimensional mean field games Abstract: We present an alternating population and agent control neural network for solving stochastic mean field games (MFGs). Our algorithm is geared toward high-dimensional instances of MFGs that are beyond reach with existing solution methods. We achieve this in two steps. First, we take advantage of the underlying variational primal-dual structure that MFGs exhibit and phrase it as a convex-concave saddle point problem. Second, we parameterize the value and density functions by two neural networks, respectively. By phrasing the problem in this manner, solving the MFG can be interpreted as a special case of training a generative adversarial network (GAN). We show the potential of our method on up to 100-dimensional MFG problems. |

March 18 | Dr. Monique Chyba, University of Hawaii Title: Fuel-minimal rendezvous missions with a large population of temporarily captured orbiters Abstract: Control theory is concerned with achieving a prescribed goal for systems whose behavior can be influenced by directly controlling some inputs. Moreover, in most cases we wish to govern the system in an efficient way. This is known as optimal control. Due to the nature of this definition, optimal control is found everywhere, including, for instance, in exploring the ocean with autonomous underwater vehicles as well as in understanding and developing medical protocols or morphogenesis. A main objective of this presentation is to use tools from optimal control theory to assess the feasibility of space missions to a new population of near Earth asteroids which temporarily orbit Earth, called temporarily captured orbiters. Rendezvous missions to a large random sample from a database of over 16,000 simulated temporarily captured orbiters have been designed using an indirect method based on the maximum principle. The main contribution of the work presented here is to overcome the difficulty in initializing the algorithm with the construction of the so-called cloud of extremals. |

April 1 | Maggie Bailey, Mines Title: Adapting conditional simulation using circulant embedding for irregularly spaced spatial data. Abstract: Computing an ensemble of random fields using conditional simulation is an ideal method for retrieving accurate estimates of a field conditioned on available data and for quantifying the uncertainty of these realizations. Methods for generating random realizations, however, are computationally demanding, especially when the estimates are conditioned on numerous observed data and for large domains. In this talk, a new, approximate conditional simulation approach is applied that builds on circulant embedding (CE), a fast method for simulating stationary Gaussian processes. The standard CE is restricted to simulating stationary Gaussian processes (possibly anisotropic) on regularly spaced grids. We explore two possible algorithms, namely local Kriging and approximate grid embedding, that extend CE for irregularly spaced data points. We establish the accuracy of these methods to be suitable for practical inference and the speedup in computation allows for generating conditional fields close to an interactive time frame. The methods are motivated by the U.S. Geological Survey's software ShakeMap, which provides near real-time maps of shaking intensity after the occurrence of a significant earthquake. An example for the 2019 event in Ridgecrest, California is used to illustrate our method. |

April 8 | Ibrohim Nosirov, Mines Title: Randomized Numerical Linear Algebra. Abstract: Probabilistic, or randomized, algorithms have proven themselves particularly useful in numerical linear algebra, thanks largely to their speed and scalability. We briefly survey the theoretical foundations of randomized algorithms for common linear algebra computations, as presented in a monograph by P.G. Martinsson and Joel Tropp. More specifically, we consider trace estimation by sampling, Schatten p-norm estimation by sampling, and matrix approximation by sampling. We also briefly discuss methods for kernel matrix estimation in the context of machine learning. |

April 15 | Michael Ivanitskiy, Mines Title: Transformer Networks and their Application to Reinforcement Learning Abstract: This talk will be broken into approximately two parts - the first half will introduce transformer neural nets and the attention head mechanism. The second half will cover using them in RL, both as decision transformers and as world models in model-based RL. |

April 22 | Meng Jia, Mines Title: Neural processes where Gaussian processes meet neural networks. Abstract: A Gaussian process defines a distribution over possible functions that is updated by new data. It is a probabilistic model that provides both predictions and the corresponding uncertainties. However, its applications are limited by its expensive computations. On the other hand, a neural network is an accurate and efficient model for learning the relationship between inputs and outputs but does not naturally quantify the uncertainty of its predictions. Neural processes combine the benefits of both models by using a neural network to approximate a Gaussian process. In my talk, I’ll first introduce the theory and then show an application to well log interpolation, the main project of my internship last summer. |

April 29 | No Meeting |

May 6 | Dr. Nicholas Fisher, Minnesota State University Title: Multiquadric Quasi-Interpolation Schemes for Time-Fractional Diffusion. Abstract: In this introductory talk we outline some of the basic concepts of fractional calculus including the definition of fractional derivatives and their approximation. We then investigate the time-fractional diffusion equation and propose novel multiquadric quasi-interpolation schemes for the numerical solution of this equation. The talk will conclude with an overview of areas of application of kernel-based approximation methods to fractional differential equations. |
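
As a small illustration of "trace estimation by sampling" from the April 8 abstract, the sketch below implements the Girard-Hutchinson estimator in plain Python: the trace of a matrix A is approximated by averaging z^T A z over random sign vectors z. Treating this as the variant covered in the talk is an assumption; the function names, the sample count, and the test matrix are hypothetical.

```python
import random


def matvec(matrix, vector):
    """Dense matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * v for a, v in zip(row, vector)) for row in matrix]


def hutchinson_trace(matrix, num_samples=2000, seed=0):
    """Estimate trace(A) as the average of z^T A z over random sign vectors z."""
    rng = random.Random(seed)
    n = len(matrix)
    total = 0.0
    for _ in range(num_samples):
        # Rademacher probe vector: each entry is +1 or -1 with equal probability.
        z = [rng.choice((-1.0, 1.0)) for _ in range(n)]
        az = matvec(matrix, z)
        total += sum(zi * ai for zi, ai in zip(z, az))
    return total / num_samples


# Small symmetric test matrix (illustrative values only).
a = [[4.0, 1.0, 0.5], [1.0, 3.0, 0.2], [0.5, 0.2, 2.0]]
exact = sum(a[i][i] for i in range(len(a)))
estimate = hutchinson_trace(a)
print(f"exact trace: {exact}, estimate: {estimate:.3f}")
```

The estimator only needs matrix-vector products, which is why it is attractive when A is available implicitly (for example as a kernel matrix that is too large to form). For a 3-by-3 matrix it is of course pointless; the example is only meant to show the mechanics.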

## Fall 2021

##### Fall 2021 Schedule

September 17 | Doug Nychka, Mines Title: Connecting three kernels: Smoothers, Spatial Process Estimators and Splines Abstract: Kriging is a ubiquitous, nonparametric regression method used in geostatistics and in uncertainty quantification for estimating curves and surfaces. The lack of a statistical large sample theory for these very useful methods is a contrast to the well-developed mathematical analysis of kernel smoothers. Everyone likes kernel smoothers (1) – why do Kriging? This talk outlines an approach to understand the mathematical properties of Kriging. It may come as a surprise that the Kriging estimate, normally derived as the best linear unbiased estimator for a Gaussian process, is also the solution of a particular penalized least squares problem. From this observation Kriging estimators can also be interpreted as generalized smoothing splines where the roughness penalty is determined by the covariance function (2) of the spatial process. Generalized splines can be approximated by a kernel (3) estimator and this allows one to close the loop. Under some circumstances Kriging can be approximated by a kernel smoother. The kernel is complicated, however, so just do Kriging! Blood flow is governed by the incompressible Navier-Stokes equations, a set of non-linear equations that are regarded as computationally expensive to solve. Since the blood coagulation process happens over the time scale of tens of minutes, parallelization techniques are necessary to minimize overall computation time. We will present a method for decomposing the H-shaped extravascular injury domain so that the Navier-Stokes equations can be solved in parallel on multiple cores using distributed memory. |

September 24 | No Meeting |

October 1 | John Paige, Norwegian University of Science and Technology (NTNU) Title: Bayesian Multiresolution Modeling of Georeferenced Data: An Extension of 'LatticeKrig' Abstract: The 'LatticeKrig' (LK) model is a spatial model that is often used for modeling multiresolution spatial data with flexible covariance structures. An extension to LK under a Bayesian framework is proposed that uses integrated nested Laplace approximations (INLA). The extension enables the spatial analysis of non-Gaussian responses in latent Gaussian models, joint spatial modeling with structured and unstructured random effects, and native support for multithreaded parallel likelihood computation. The proposed extended LatticeKrig (ELK) model uses a reparameterization of LK so that the parameters and prior selection are intuitive and interpretable. Priors can be used to make inference robust by penalizing more complex models, and integration over model parameters allows for posterior uncertainty estimates that account for uncertainty in covariance parameters. Through a simulation study with a realistic mix of short and long scale dependence, the ability of ELK and LK to model multiple spatial correlation scales is demonstrated along with the ability to improve pointwise and areal predictions both near and far from observations. The predictions of ELK are compared to those of LK for a set of 188,717 LiDAR forest canopy height observations in Bonanza Creek Experimental Forest in Alaska, where the considered ELK models include a nonlinear covariate effect as a first order random walk, and use a gamma response model, neither of which would be possible under LK. ELK achieves better central predictions and uncertainty characterization (according to the interval score), while also achieving faster computation time due in part to its using 8 threads for parallel computation, relative to an LK model assuming Gaussian responses and linear covariate effects. Lastly, ELK's ability to account for non-Gaussian responses is leveraged when predicting population averages of prevalence of secondary education completion for women in Kenya collected in the 2014 Kenya demographic health survey. ELK has a modest reduction in spatial oversmoothing and improvement in prediction relative to a Matérn model. |

October 8 | No Meeting |

October 15 | No Meeting |

October 22 | David Kozak Title: Statistical dependence, Kernels, and the Hilbert-Schmidt Independence Criterion. Abstract: Statistical dependence quantifies the amount of information that one random variable contains about another. Quantifying dependence structure in practice using only observations of the random variables is a notoriously challenging problem that is still unresolved. In this talk we give a brief introduction to statistical dependence, providing several equivalent definitions and including a brief discussion about its relevance for diverse applications. We introduce the Hilbert-Schmidt Independence Criterion (HSIC), an elegant framework that allows for hypothesis tests of dependence that are simple to implement and come with theoretical guarantees. We explore some issues with the HSIC and will finish by discussing possible alternatives. |

October 29 | *2-3pm Will Kleiber, University of Colorado Title: The Basis Graphical Lasso Abstract: Many modern spatial models express the stochastic variation component as a basis expansion with random coefficients. Low rank models, approximate spectral decompositions, multiresolution representations, stochastic partial differential equations, and empirical orthogonal functions all fall within this basic framework. Given a particular basis, stochastic dependence relies on flexible modeling of the coefficients. Under a Gaussianity assumption, we propose a graphical model family for the stochastic coefficients by parameterizing the precision matrix. Sparsity in the precision matrix is encouraged using a penalized likelihood framework—we term this approach the basis graphical lasso. Computations follow from a majorization-minimization (MM) approach, a byproduct of which is a connection to the standard graphical lasso. The result is a flexible nonstationary spatial model that is adaptable to very large datasets with multiple realizations. We generalize the idea to multivariate processes and show that the framework flexibly handles dozens to hundreds of nonstationary spatial variables simultaneously. The method is illustrated on multiple datasets from statistical climatology, and yields appropriate and interpretable estimates of second-order structures. |

November 5 | Matt Picklo, Mines |

November 12 | Josh Keller, Colorado State University Title: Inferential Challenges with Spatial Data in (Air Pollution) Epidemiology Abstract: Many large-scale epidemiological studies investigate relationships between spatial and spatiotemporal exposures and adverse health outcomes. However, the spatiotemporal nature of these exposures can lead to inferential challenges including measurement error and unmeasured spatial confounding. Spatiotemporal prediction of exposures induces errors that can be correlated across space and lead to bias in point estimates and standard errors of estimated health effects. Unmeasured factors that vary spatially and impact health can further cause confounding bias that is difficult to diagnose. In this talk, I will present methods for addressing both of these challenges in analyses of regional and national cohort studies of air pollution exposure and birth, cardiovascular, and respiratory health outcomes. The limitations of these correction approaches highlight important aspects of study design that can mitigate the effects of measurement error and unmeasured spatial confounding on inference. |

November 19 | Snigdhansu (Ansu) Chatterjee, University of Minnesota Title: Bayesian equation selection for discovery of stochastic dynamical systems Abstract: Natural processes like Earth's climate adhere to laws of science that are often describable as systems of (partial, stochastic) differential equations, or stochastic dynamical systems. We present a Bayesian framework for discovering such dynamical systems under assumptions that align with real-life scenarios, including the availability of relatively sparse data. Further, we discuss computational strategies that are critical in teasing out the important details about the dynamical system. This gives a complete Bayesian pathway for model identification via a variable selection paradigm and parameter estimation of the corresponding model using only the observed data. We present detailed computations and analysis of the Lorenz-96, Lorenz-63, and the Ornstein-Uhlenbeck system using the Bayesian framework we propose. The proposed framework can be used for evaluation of the quality of climate models, and to identify the terms and relations that are significant in the Earth's climate system but may be missing or considered insignificant in a model. |

December 3 | Laura Albrecht, Mines |
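
To make the Hilbert-Schmidt Independence Criterion from the October 22 abstract concrete, here is a minimal Python sketch of the biased empirical HSIC, trace(K H L H) / n^2, where K and L are Gram matrices of the two samples and H = I - (1/n) 1 1^T is the centering matrix. The Gaussian kernel, the bandwidth, the 1/n^2 normalization (some references use 1/(n-1)^2), the toy data, and all function names are assumptions made for illustration and are not taken from the talk.

```python
import math
import random


def gram_matrix(values, bandwidth=1.0):
    """Gaussian-kernel Gram matrix for a list of scalar observations."""
    return [
        [math.exp(-((a - b) ** 2) / (2.0 * bandwidth**2)) for b in values]
        for a in values
    ]


def center(gram):
    """Double-center a Gram matrix, i.e. compute H K H with H = I - (1/n) 1 1^T."""
    n = len(gram)
    row_means = [sum(row) / n for row in gram]
    col_means = [sum(gram[i][j] for i in range(n)) / n for j in range(n)]
    grand_mean = sum(row_means) / n
    return [
        [gram[i][j] - row_means[i] - col_means[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def hsic(xs, ys, bandwidth=1.0):
    """Biased empirical HSIC: trace(K H L H) / n^2 = sum_ij (H K H)_ij L_ij / n^2."""
    n = len(xs)
    k_centered = center(gram_matrix(xs, bandwidth))
    l_gram = gram_matrix(ys, bandwidth)
    total = sum(k_centered[i][j] * l_gram[i][j] for i in range(n) for j in range(n))
    return total / n**2


rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(150)]
independent = [rng.gauss(0.0, 1.0) for _ in xs]
# y = x^2 plus small noise: uncorrelated with x, but strongly dependent on it.
dependent = [x**2 + 0.1 * rng.gauss(0.0, 1.0) for x in xs]

hsic_independent = hsic(xs, independent)
hsic_dependent = hsic(xs, dependent)
print(f"independent: {hsic_independent:.4f}, dependent: {hsic_dependent:.4f}")
```

The quadratic example is chosen because the linear correlation between x and x^2 is zero for a symmetric distribution, so a correlation-based check would miss the dependence. Turning the statistic into a hypothesis test would additionally require a null distribution, for instance from permuting one of the samples, which this sketch omits.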