
The sobol package generates low‑discrepancy Sobol sequences and is designed from the ground up for parameter space exploration.

This tutorial will take you from the simplest possible usage to advanced reproducibility and parallel workflows.

Installation

# From GitHub
devtools::install_github("alrobles/sobol")

1. The quickest path: sobol_design()

Imagine you are tuning a machine‑learning model with three hyperparameters:

  • learning rate in [0.0001, 0.1]
  • momentum in [0, 0.99]
  • dropout in [0, 0.5]

You want to explore the space with 200 well‑spread points. sobol_design() returns a data frame ready to be fed into your objective function.

design <- sobol_design(
  lower = c(learning_rate = 0.0001, momentum = 0.00, dropout = 0.0),
  upper = c(learning_rate = 0.1000, momentum = 0.99, dropout = 0.5),
  nseq   = 200
)

head(design)
#>   learning_rate   momentum    dropout
#> 1   0.001270703 0.29003906 0.36914062
#> 2   0.051220703 0.78503906 0.11914062
#> 3   0.076195703 0.04253906 0.49414062
#> 4   0.026245703 0.53753906 0.24414062
#> 5   0.038733203 0.90878906 0.30664062
#> 6   0.088683203 0.41378906 0.05664062

Points are in the exact ranges you specified:

summary(design)
#>  learning_rate          momentum           dropout         
#>  Min.   :0.0004902   Min.   :0.003867   Min.   :0.0009766  
#>  1st Qu.:0.0252701   1st Qu.:0.249917   1st Qu.:0.1252441  
#>  Median :0.0500500   Median :0.495967   Median :0.2495117  
#>  Mean   :0.0498549   Mean   :0.496779   Mean   :0.2491016  
#>  3rd Qu.:0.0748299   3rd Qu.:0.742017   3rd Qu.:0.3737793  
#>  Max.   :0.0996098   Max.   :0.988066   Max.   :0.4980469

The design is deterministic and space‑filling – already a big improvement over simple random or grid search.

# Use the design directly inside your optimisation loop
# (my_model() is a placeholder for your own objective function)
results <- purrr::pmap_dbl(design, ~ my_model(lr = ..1, mom = ..2, drop = ..3))
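
If you prefer to stay in base R, the same loop can be sketched as below. This is only an illustration: toy_model() and evaluate_design() are made-up names, not part of sobol, and the three-row data frame stands in for the output of sobol_design().

```r
# Hypothetical objective (lower is better). Replace with your own model.
toy_model <- function(lr, mom, drop) {
  (log10(lr) + 2)^2 + (mom - 0.9)^2 + (drop - 0.2)^2
}

# Evaluate an objective on every row of a design and attach the scores.
# The column order of `design` must match the argument order of `objective`.
evaluate_design <- function(design, objective) {
  scores <- vapply(
    seq_len(nrow(design)),
    function(i) do.call(objective, unname(as.list(design[i, , drop = FALSE]))),
    numeric(1)
  )
  cbind(design, score = scores)
}

# Small stand-in for a design produced by sobol_design()
demo_design <- data.frame(
  learning_rate = c(0.001, 0.010, 0.100),
  momentum      = c(0.50, 0.90, 0.10),
  dropout       = c(0.40, 0.20, 0.00)
)

scored <- evaluate_design(demo_design, toy_model)
best   <- scored[which.min(scored$score), ]  # row with the lowest score
```

Keeping the scores next to the parameters makes it easy to sort, filter, or plot the results afterwards.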

2. What makes sobol_design special?

Behind the scenes it calls sobol_points() to generate a Sobol sequence in the unit cube and then scales each column to your bounds.

  • Low discrepancy – points are more evenly distributed than random.
  • Reproducible – you’ll get the exact same design every time.
  • No wasted points – every evaluation adds information about the shape of your function.
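
The scaling step mentioned above is a per-column affine map from the unit interval to your bounds. The sketch below shows that map in base R. scale_to_bounds() is an illustrative helper, not a function from the package, and whether sobol_design() also skips or reorders initial points is not covered here, so treat it as an illustration of the scaling only.

```r
# Map points from the unit cube [0, 1) to per-column [lower, upper) ranges.
# `unit` is a numeric matrix with one column per parameter.
scale_to_bounds <- function(unit, lower, upper) {
  stopifnot(ncol(unit) == length(lower), length(lower) == length(upper))
  width  <- upper - lower
  scaled <- sweep(sweep(unit, 2, width, `*`), 2, lower, `+`)
  colnames(scaled) <- names(lower)
  as.data.frame(scaled)
}

# Hand-written unit-cube points (filled column by column)
unit <- matrix(c(0.00, 0.50, 0.25,
                 0.00, 0.50, 0.75), ncol = 2)

out <- scale_to_bounds(unit,
                       lower = c(momentum = 0.00, dropout = 0.0),
                       upper = c(momentum = 0.99, dropout = 0.5))
```

A unit value of 0.5 lands in the middle of each range, here 0.495 for momentum and 0.25 for dropout.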

Plot two of the dimensions to judge the uniformity by eye, ideally next to a random Latin hypercube of the same size:

# Scatter plot of learning rate against momentum for the Sobol design
plot(design$learning_rate, design$momentum, col = "steelblue",
     main = "Sobol design (200 points)")
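
To put a number on that visual impression, here is a one-dimensional sketch in base R. In the standard unscrambled construction, the first coordinate of a Sobol sequence is the base-2 van der Corput sequence, so van_der_corput() below (an illustrative helper, not a sobol function) stands in for a single Sobol column. The check counts how many of the first 64 points fall into each of 64 equal bins and does the same for uniform random draws.

```r
# Base-2 van der Corput sequence: reverse the binary digits of 0, 1, 2, ...
# around the binary point.
van_der_corput <- function(n) {
  vapply(seq_len(n) - 1L, function(i) {
    x <- 0
    weight <- 0.5
    while (i > 0L) {
      x <- x + weight * (i %% 2L)
      i <- i %/% 2L
      weight <- weight / 2
    }
    x
  }, numeric(1))
}

# Number of points falling in each of `bins` equal-width bins of [0, 1)
bin_counts <- function(x, bins) tabulate(floor(x * bins) + 1L, nbins = bins)

n <- 64L
low_disc <- bin_counts(van_der_corput(n), bins = n)

set.seed(1)  # only the random comparison needs a seed
random <- bin_counts(runif(n), bins = n)

all(low_disc == 1L)  # the low-discrepancy points fill every bin exactly once
sum(random == 0)     # number of bins the random draws left empty
```

The low-discrepancy points occupy each bin once, while random draws typically pile up in some bins and miss others.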

3. Going one level deeper: raw points

If you already have your own scaling logic, or need the raw [0,1) points, use sobol_points() directly.

raw <- sobol_points(n = 512, dim = 4)
dim(raw)           # 512 rows, 4 columns
#> [1] 512   4
range(raw)         # values in [0, 1)
#> [1] 0.0000000 0.9980469

sobol_points() accepts an optional skip argument that lets you start from an arbitrary index – perfect for parallel workers (see below).

4. Incremental generation with sobol_generator()

Sometimes you don’t know in advance how many points you’ll need. Maybe you want to evaluate a few, check convergence, then generate more. That’s where the stateful generator shines.

gen <- sobol_generator(dimensions = 3)

# Generate one point
sobol_next(gen)
#> [1] 0 0 0

# Generate a batch of 50
batch <- sobol_next_n(gen, n = 50)
dim(batch)  # 50 x 3
#> [1] 50  3

# What’s the current index?
sobol_index(gen)
#> [1] 51

You can also jump to any position:

sobol_skip_to(gen, 1000)
sobol_index(gen)
#> [1] 1000

This is the key to parallel and restart‑friendly workflows.
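
The evaluate, check, extend pattern described at the start of this section could look roughly like the sketch below. It is deliberately generic: sample_until_stable() and make_batch_source() are illustrative names, and the batch source is a one-dimensional stand-in written in base R so the example runs without the package. With sobol loaded, the batch source would presumably be function(n) sobol_next_n(gen, n = n). The stopping rule (the running mean moves by less than tol) is a naive choice for illustration, not a recommendation.

```r
# Stand-in batch source: 1-D base-2 van der Corput points, returned as a
# one-column matrix and continuing where the previous batch stopped.
make_batch_source <- function() {
  index <- 0L
  radical_inverse <- function(i) {
    x <- 0
    weight <- 0.5
    while (i > 0L) {
      x <- x + weight * (i %% 2L)
      i <- i %/% 2L
      weight <- weight / 2
    }
    x
  }
  function(n) {
    idx <- index + seq_len(n) - 1L
    index <<- index + n
    matrix(vapply(idx, radical_inverse, numeric(1)), ncol = 1)
  }
}

# Request batches until the running mean of f changes by less than `tol`
# between two consecutive batches, or until `max_batches` is reached.
sample_until_stable <- function(next_batch, f, batch_size = 64L,
                                tol = 1e-3, max_batches = 50L) {
  values   <- numeric(0)
  previous <- NA_real_
  for (b in seq_len(max_batches)) {
    pts     <- next_batch(batch_size)
    values  <- c(values, apply(pts, 1, f))
    current <- mean(values)
    if (!is.na(previous) && abs(current - previous) < tol) break
    previous <- current
  }
  list(estimate = current, n_points = length(values))
}

# Estimate the mean of x^2 on [0, 1); the exact value is 1/3
result <- sample_until_stable(make_batch_source(), function(x) x[1]^2)
```

Because the source keeps its own position, each batch extends the sequence instead of repeating it.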

5. Reproducibility and parallel optimisation

All sequences are deterministic, so two calls with the same arguments always return identical results:

a <- sobol_design(lower = c(p = 0), upper = c(p = 1), nseq = 32)
b <- sobol_design(lower = c(p = 0), upper = c(p = 1), nseq = 32)
identical(a, b)   # TRUE
#> [1] TRUE

To distribute work across multiple cores or machines, assign each a non‑overlapping skip interval.

  • Worker 1: points 0 – 999
  • Worker 2: points 1000 – 1999
  • Worker 3: points 2000 – 2999

# Worker 1
w1 <- sobol_design(lower = c(lr = 0.0001, mom = 0, drop = 0),
                   upper = c(lr = 0.1,    mom = 0.99, drop = 0.5),
                   nseq = 1000)  # implicitly starts at 0

# Worker 2 (needs raw points + skip to 1000)
raw2 <- sobol_points(n = 1000, dim = 3, skip = 1000)
# Then scale raw2 manually, or use sobol_design in the future with a skip argument

(A skip argument for sobol_design() is under consideration – once available, parallel designs become one‑liners.)
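
Until then, the per-worker offsets can be computed with a few lines of base R. partition_indices() is an illustrative helper, not part of sobol. It only works out the skip and n values, which each worker would then pass to sobol_points(n = ..., dim = ..., skip = ...) as in the Worker 2 example above.

```r
# Split `n_total` consecutive sequence indices into contiguous,
# non-overlapping blocks, one per worker.
partition_indices <- function(n_total, n_workers) {
  base  <- n_total %/% n_workers
  extra <- n_total %% n_workers
  # The first `extra` workers take one additional point each
  n     <- base + (seq_len(n_workers) <= extra)
  # Each block starts where the previous one ended
  skip  <- cumsum(c(0, n))[seq_len(n_workers)]
  data.frame(worker = seq_len(n_workers), skip = skip, n = n)
}

plan   <- partition_indices(n_total = 3000, n_workers = 3)
uneven <- partition_indices(n_total = 10, n_workers = 3)
```

For 3000 points and three workers this reproduces the offsets 0, 1000 and 2000 listed above. When the total does not divide evenly, the leftover points go to the first workers.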

6. Advanced: chaining generators for adaptive sampling

A generator can be “rewound” at any time to re‑evaluate a segment:

gen <- sobol_generator(dimensions = 2)
first_10 <- sobol_next_n(gen, n = 10)

# Oops, need to re‑evaluate the first 10 with different parameters
sobol_skip_to(gen, 0)
replicated <- sobol_next_n(gen, n = 10)
identical(first_10, replicated)  # TRUE
#> [1] TRUE
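
The same mechanism supports restarts after an interruption: store the current index, then skip back to it in a fresh session. The sketch below covers only the bookkeeping, in base R. save_checkpoint() and load_checkpoint() are illustrative helpers rather than package functions, and the sobol calls appear as comments because they rely on the generator API shown earlier.

```r
# Persist the position in the sequence so a later session can resume from it.
save_checkpoint <- function(index, path) {
  saveRDS(list(index = index, saved_at = Sys.time()), path)
  invisible(path)
}

# Read the stored index, or start from 0 when no checkpoint exists yet.
load_checkpoint <- function(path) {
  if (!file.exists(path)) return(0)
  readRDS(path)$index
}

path <- tempfile(fileext = ".rds")

fresh_start <- load_checkpoint(path)  # nothing saved yet
save_checkpoint(250, path)            # e.g. the value of sobol_index(gen)
resume_at <- load_checkpoint(path)

# With the package loaded, the restart would then presumably be:
#   gen <- sobol_generator(dimensions = 2)
#   sobol_skip_to(gen, resume_at)
```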

7. Performance notes

The C++ engine is heavily optimised. Even 1 000 000 points in 10 dimensions complete in under a second on modern hardware, freeing you to spend time on your actual model.

Precomputed tables cover the first 1000 dimensions, so initialisation there is effectively instant. For extremely high dimensions (>1000) the engine falls back to runtime generation, which is still fast but makes initialisation slightly slower.

Next steps

# Clean up
rm(design, raw, gen, a, b, first_10, replicated)

Acknowledgements

The sobol_design() function in this package was inspired by the sobol_design() function from the pomp package by Aaron A. King et al., an R package for statistical inference using partially observed Markov processes.

While the interface and purpose are similar, sobol is a ground-up reimplementation: the core algorithm is written from scratch in C++17 and exposed to R via Rcpp, with no shared code from pomp. We gratefully acknowledge Aaron King’s project as the original source of inspiration for the design of this interface.

References

  • Sobol, I.M. (1967). “On the distribution of points in a cube and the approximate evaluation of integrals”
  • Joe, S. and Kuo, F. Y. (2008). “Constructing Sobol sequences with better two-dimensional projections”

See Also

  • Core C++ Library - The underlying C++ implementation
  • Examples: inst/examples/usage_examples.R

That’s all you need to start exploring your parameter space smarter and faster.
Welcome to sobol!