# Kernel Klub

The Spatial Statistics and Kernel Klub meets Fridays from 1-2 in Chauvenet Hall room 143 and virtually to discuss spatial statistics and kernel methods.

See below for a schedule and speakers.

## Spring 2022

February 4 | Doug Nychka, Mines Title: Fast prediction of Kriging estimates to a grid and why one would care |

February 11 | Samy Wu Fung, Mines Title: A Review of Stochastic Gradient Descent and its Application in Deep Learning |

February 18 | Greg Fasshauer, Mines Title: Double Descent Abstract: The big picture idea for the success of modern deep learning algorithms proposed by Belkin and others is that the use of over-parametrized models allows us to extend the traditional “bias-variance trade-off” to a more general setting, which opens the door for models that generalize better (while still retaining high training accuracy). In some ways, reproducing kernel Hilbert spaces appear to be a good place for us to study this “theory.” |

February 25 | Luis Tenorio, Mines Title: Double Descent in Linear Least-Squares Abstract: An extension of last week's talk given by Dr. Greg Fasshauer. Here the focus will be to derive some of the results of the paper by Hastie et al. and to understand why the double descent follows from the properties of random matrices. |

March 4 | Luis Tenorio, Mines Title: Random Matrix Theory Abstract: This will be a follow-up to his talk last week on “Double Descent in Least-Squares.” |

March 11 | Samy Wu Fung, Mines Title: A GAN-based approach for solving high-dimensional mean field games Abstract: We present an alternating population and agent control neural network for solving stochastic mean field games (MFGs). Our algorithm is geared toward high-dimensional instances of MFGs that are beyond reach with existing solution methods. We achieve this in two steps. First, we take advantage of the underlying variational primal-dual structure that MFGs exhibit and phrase it as a convex-concave saddle point problem. Second, we parameterize the value and density functions by two neural networks, respectively. By phrasing the problem in this manner, solving the MFG can be interpreted as a special case of training a generative adversarial network (GAN). We show the potential of our method on up to 100-dimensional MFG problems. |

March 18 | Dr. Monique Chyba, University of Hawaii Title: Fuel-minimal rendezvous missions with a large population of temporarily captured orbiters Abstract: Control theory is concerned with achieving a prescribed goal for systems whose behavior can be influenced by directly controlling some inputs. Moreover, in most cases we wish to govern the system in an efficient way. This is known as optimal control. Due to the nature of this definition, optimal control is to be found everywhere, including for instance exploring the ocean with autonomous underwater vehicles as well as understanding and developing medical protocols or morphogenesis. A main objective of this presentation is to use tools from optimal control theory to assess the feasibility of space missions to a new population of near-Earth asteroids which temporarily orbit Earth, called temporarily captured orbiters. Rendezvous missions to a large random sample from a database of over 16,000 simulated temporarily captured orbiters have been designed using an indirect method based on the maximum principle. The main contribution of the work presented here is to overcome the difficulty in initializing the algorithm with the construction of the so-called cloud of extremals. |

April 1 | Maggie Bailey, Mines Title: Adapting conditional simulation using circulant embedding for irregularly spaced spatial data. Abstract: Computing an ensemble of random fields using conditional simulation is an ideal method for retrieving accurate estimates of a field conditioned on available data and for quantifying the uncertainty of these realizations. Methods for generating random realizations, however, are computationally demanding, especially when the estimates are conditioned on numerous observed data and for large domains. In this talk, a new, approximate conditional simulation approach is applied that builds on circulant embedding (CE), a fast method for simulating stationary Gaussian processes. The standard CE is restricted to simulating stationary Gaussian processes (possibly anisotropic) on regularly spaced grids. We explore two possible algorithms, namely local Kriging and approximate grid embedding, that extend CE for irregularly spaced data points. We establish the accuracy of these methods to be suitable for practical inference and the speedup in computation allows for generating conditional fields close to an interactive time frame. The methods are motivated by the U.S. Geological Survey's software ShakeMap, which provides near real-time maps of shaking intensity after the occurrence of a significant earthquake. An example for the 2019 event in Ridgecrest, California is used to illustrate our method. |

April 8 | Ibrohim Nosirov, Mines Title: Randomized Numerical Linear Algebra. Abstract: Probabilistic, or randomized, algorithms have proven themselves particularly useful in numerical linear algebra, thanks largely to their speed and scalability. We briefly survey the theoretical foundations of randomized algorithms for common linear algebra computations, as presented in a monograph by P.G. Martinsson and Joel Tropp. More specifically, we consider trace estimation by sampling, Schatten p-norm estimation by sampling, and matrix approximation by sampling. We also briefly discuss methods for kernel matrix estimation in the context of machine learning. |

April 15 | Michael Ivanitskiy, Mines Title: Transformer Networks and their Application to Reinforcement Learning Abstract: This talk will be broken into approximately two parts - the first half will introduce transformer neural nets and the attention head mechanism. The second half will cover using them in RL, both as decision transformers and as world models in model-based RL. |

April 22 | Meng Jia, Mines Title: Neural processes where Gaussian processes meet neural networks. Abstract: A Gaussian process defines a distribution over possible functions that is updated by new data. It’s a probabilistic model that provides both predictions and the corresponding uncertainties. However, its applications are limited by its expensive computations. On the other hand, a neural network is an accurate and efficient model for learning the relationship between inputs and outputs, but it does not naturally quantify the uncertainty of its predictions. Neural processes combine the benefits of both models by using a neural network to approximate a Gaussian process. In my talk, I’ll first introduce the theory and then show an application to well-log interpolation, the major work of my internship last summer. |

April 29 | No Meeting |

May 6 | Dr. Nicholas Fisher, Minnesota State University Title: Multiquadric Quasi-Interpolation Schemes for Time-Fractional Diffusion. Abstract: In this introductory talk we outline some of the basic concepts of fractional calculus including the definition of fractional derivatives and their approximation. We then investigate the time-fractional diffusion equation and propose novel multiquadric quasi-interpolation schemes for the numerical solution of this equation. The talk will conclude with an overview of areas of application of kernel-based approximation methods to fractional differential equations. |

## Fall 2021


September 17 | Doug Nychka, Mines Title: Connecting three kernels: Smoothers, Spatial Process Estimators and Splines Abstract: Kriging is a ubiquitous, nonparametric regression method used in geostatistics and in uncertainty quantification for estimating curves and surfaces. The lack of a statistical large sample theory for these very useful methods is a contrast to the well-developed mathematical analysis of kernel smoothers. Everyone likes kernel smoothers (1) – why do Kriging? This talk outlines an approach to understand the mathematical properties of Kriging. It may come as a surprise that the Kriging estimate, normally derived as the best linear unbiased estimator for a Gaussian process, is also the solution of a particular penalized least squares problem. From this observation Kriging estimators can also be interpreted as generalized smoothing splines where the roughness penalty is determined by the covariance function (2) of the spatial process. Generalized splines can be approximated by a kernel (3) estimator and this allows one to close the loop. Under some circumstances Kriging can be approximated by a kernel smoother. The kernel is complicated, however, so just do Kriging! Blood flow is governed by the incompressible Navier-Stokes equations, a set of non-linear equations that are regarded as computationally expensive to solve. Since the blood coagulation process happens over the time scale of tens of minutes, parallelization techniques are necessary to minimize overall computation time. We will present a method for decomposing the H-shaped extravascular injury domain so that the Navier-Stokes equations can be solved in parallel on multiple cores using distributed memory. |

September 24 | No Meeting |

October 1 | John Paige, Norwegian University of Science and Technology (NTNU) Title: Bayesian Multiresolution Modeling of Georeferenced Data: An Extension of 'LatticeKrig' Abstract: The 'LatticeKrig' (LK) model is a spatial model that is often used for modeling multiresolution spatial data with flexible covariance structures. An extension to LK under a Bayesian framework is proposed that uses integrated nested Laplace approximations (INLA). The extension enables the spatial analysis of non-Gaussian responses in latent Gaussian models, joint spatial modeling with structured and unstructured random effects, and native support for multithreaded parallel likelihood computation. The proposed extended LatticeKrig (ELK) model uses a reparameterization of LK so that the parameters and prior selection are intuitive and interpretable. Priors can be used to make inference robust by penalizing more complex models, and integration over model parameters allows for posterior uncertainty estimates that account for uncertainty in covariance parameters. Through a simulation study with a realistic mix of short and long scale dependence, the ability of ELK and LK to model multiple spatial correlation scales is demonstrated along with the ability to improve pointwise and areal predictions both near and far from observations. The predictions of ELK are compared to those of LK for a set of 188,717 LiDAR forest canopy height observations in Bonanza Creek Experimental Forest in Alaska, where the considered ELK models include a nonlinear covariate effect as a first order random walk, and use a gamma response model, neither of which would be possible under LK. ELK achieves better central predictions and uncertainty characterization (according to the interval score), while also achieving faster computation time due in part to its using 8 threads for parallel computation, relative to an LK model assuming Gaussian responses and linear covariate effects.
Lastly, ELK's ability to account for non-Gaussian responses is leveraged when predicting population averages of prevalence of secondary education completion for women in Kenya, using data collected in the 2014 Kenya Demographic and Health Survey. ELK has a modest reduction in spatial oversmoothing and improvement in prediction relative to a Matérn model. |

October 8 | No Meeting |

October 15 | No Meeting |

October 22 | David Kozak Title: Statistical dependence, Kernels, and the Hilbert-Schmidt Independence Criterion. Abstract: Statistical dependence quantifies the amount of information that one random variable contains about another. Quantifying dependence structure in practice using only observations of the random variables is a notoriously challenging problem that is still unresolved. In this talk we give a brief introduction to statistical dependence, providing several equivalent definitions and including a brief discussion about its relevance for diverse applications. We introduce the Hilbert-Schmidt Independence Criterion (HSIC), an elegant framework that allows for hypothesis tests of dependence that are simple to implement and come with theoretical guarantees. We explore some issues with the HSIC and will finish by discussing possible alternatives. |

October 29 | *2-3pm Will Kleiber, University of Colorado Title: The Basis Graphical Lasso Abstract: Many modern spatial models express the stochastic variation component as a basis expansion with random coefficients. Low rank models, approximate spectral decompositions, multiresolution representations, stochastic partial differential equations, and empirical orthogonal functions all fall within this basic framework. Given a particular basis, stochastic dependence relies on flexible modeling of the coefficients. Under a Gaussianity assumption, we propose a graphical model family for the stochastic coefficients by parameterizing the precision matrix. Sparsity in the precision matrix is encouraged using a penalized likelihood framework—we term this approach the basis graphical lasso. Computations follow from a majorization-minimization (MM) approach, a byproduct of which is a connection to the standard graphical lasso. The result is a flexible nonstationary spatial model that is adaptable to very large datasets with multiple realizations. We generalize the idea to multivariate processes and show that the framework flexibly handles dozens to hundreds of nonstationary spatial variables simultaneously. The method is illustrated on multiple datasets from statistical climatology, and yields appropriate and interpretable estimates of second-order structures. |

November 5 | Matt Picklo, Mines |

November 12 | Josh Keller, Colorado State University Title: Inferential Challenges with Spatial Data in (Air Pollution) Epidemiology Abstract: Many large-scale epidemiological studies investigate relationships between spatial and spatiotemporal exposures and adverse health outcomes. However, the spatiotemporal nature of these exposures can lead to inferential challenges including measurement error and unmeasured spatial confounding. Spatiotemporal prediction of exposures induces errors that can be correlated across space and lead to bias in point estimates and standard errors of estimated health effects. Unmeasured factors that vary spatially and impact health can further cause confounding bias that is difficult to diagnose. In this talk, I will present methods for addressing both of these challenges in analyses of regional and national cohort studies of air pollution exposure and birth, cardiovascular, and respiratory health outcomes. The limitations of these correction approaches highlight important aspects of study design that can mitigate the effects of measurement error and unmeasured spatial confounding on inference. |

November 19 | Snigdhansu (Ansu) Chatterjee, University of Minnesota Title: Bayesian equation selection for discovery of stochastic dynamical systems Abstract: Natural processes like Earth's climate adhere to laws of science that are often describable as systems of (partial, stochastic) differential equations, or stochastic dynamical systems. We present a Bayesian framework for discovering such dynamical systems under assumptions that align with real-life scenarios, including the availability of relatively sparse data. Further, we discuss computational strategies that are critical in teasing out the important details about the dynamical system. This gives a complete Bayesian pathway for model identification via a variable selection paradigm and parameter estimation of the corresponding model using only the observed data. We present detailed computations and analysis of the Lorenz-96, Lorenz-63, and Ornstein-Uhlenbeck systems using the Bayesian framework we propose. The proposed framework can be used for evaluation of the quality of climate models, and to identify the terms and relations that are significant in the Earth's climate system but may be missing or considered insignificant in a model. |

December 3 | Laura Albrecht, Mines |