What is it about?
Very large spatial data sets present problems for statistical modeling, such as regression, and for prediction, such as making maps. The problem arises because fitting these models requires inverting large matrices, which is very time-consuming even on large, fast computers. In this article, we show a fast and simple way to work with these large data sets while still obtaining valid regression models. We can also make maps in which the uncertainty about the predictions is valid. These methods are available as open-source software in an R package called spmodel.
Featured Image: Photo by Markus Spiske on Unsplash
Why is it important?
From satellite images to automated instruments, data sets keep getting larger and larger. Simple methods can be used to summarize these data, but understanding relationships among variables and making predictions involves uncertainty. Quantifying that uncertainty requires a statistical model that provides valid estimates and predictions, so that we can make decisions in the face of risk. It is therefore important to advance statistical methods for big spatial data sets, as this paper does.
Perspectives
This article grew out of an interesting example: I wanted a method for big spatial data on stream networks, and that need motivated this development. The method works with any type of spatial data, and the ideas are easily extended to spatio-temporal models too.
Dr. Jay M. Ver Hoef
Alaska Fisheries Science Center, NOAA Fisheries
Spatial indexing provides a complete set of tools for model fitting and prediction of large spatial data and is readily available in the spmodel R package via the local argument to splm() (model fitting) and predict() (prediction).
Michael Dumelle
United States Environmental Protection Agency
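To make that workflow concrete, here is a minimal sketch of fitting and predicting with spatial indexing through the local argument in spmodel. The data frame big_data, its columns (response, covariate, x, y), and the prediction locations new_locations are hypothetical placeholders rather than data from the paper; see the spmodel documentation for the full set of local options.

# Minimal sketch (hypothetical data and column names) of the big-data workflow:
# spatial indexing is requested through the `local` argument.
library(spmodel)

# Fit a spatial linear model; local = TRUE turns on the big-data
# (indexing and partitioning) approach with default settings.
fit <- splm(
  response ~ covariate,
  data = big_data,
  spcov_type = "exponential",
  xcoord = x,
  ycoord = y,
  local = TRUE
)
summary(fit)

# Predict at new locations, again using the local (big-data) approach;
# the standard errors quantify prediction uncertainty for mapping.
preds <- predict(fit, newdata = new_locations, local = TRUE, se.fit = TRUE)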
Read the Original
This page is a summary of: Indexing and partitioning the spatial linear model for large data sets, PLOS ONE, November 2023, DOI: 10.1371/journal.pone.0291906.