SingleRandomBatches

Move Type: Basic Complexity: O(objects × containers) with batched evaluation and early termination

Evaluate destination containers in random batches with parallel processing. Combines speed (early exit) with multi-threading. Use for very large problems where full exploration is too slow.

Overview

SingleRandomBatches evaluates single object moves but processes destination containers in random batches, stopping as soon as a batch contains an improving move. This move type is designed for parallelization - each batch of destinations can be evaluated concurrently.

Use when:

Problem is very large (100K+ objects, 1K+ containers)
Have multiple CPU cores available
Need faster convergence than Single
Can benefit from randomization and parallelism

Avoid when:

Problem is small/medium (use Single or SingleFast)
Single-threaded execution
Need deterministic results
Already using sampling approaches

Quick Example

Python
C++

# Batched random evaluation with parallelism
solver.addSolver(
    SolverSpecs(
        localSearchSolverSpec=LocalSearchSolverSpec(
            moveTypeList=[
                SingleRandomBatchesMoveTypeSpec(
                    randomContainerBatchSize=10,  # Evaluate 10 containers per batch
                ),
            ]
        )
    )
)

void quickExample() {
  // Batched random evaluation with parallelism
  auto executor = std::make_shared<folly::CPUThreadPoolExecutor>(4);
  ProblemSolver solver(executor, "example", "test");
  solver.setObjectName("task");
  solver.setContainerName("host");

  LocalSearchSolverSpec solverSpec;

  SingleRandomBatchesMoveTypeSpec randomBatchesSpec;
  randomBatchesSpec.randomContainerBatchSize() =
      10; // Evaluate 10 containers per batch

  solverSpec.moveTypeList()->emplace_back();
  solverSpec.moveTypeList()->back().singleRandomBatchesMoveTypeSpec() =
      randomBatchesSpec;

  solver.addSolver(solverSpec);

Parameters

Parameter	Type	Required	Default	Description
`randomContainerBatchSize`	int32	No	10	Number of destination containers to process in each batch

Parameter Details

randomContainerBatchSize:

Purpose: Controls parallelization granularity and early exit frequency
Small values (5-10): Exit earlier, less parallelism, faster response
Large values (50-100): More parallelism, better move quality, more work
Recommendation: Start with default (10), increase for better multi-core utilization

How It Works

Given a hot container (the container contributing most to the objective):

Select object: Pick object from hot container
Shuffle containers: Randomly order all destination containers
Create batch: Take next randomContainerBatchSize containers
Evaluate in parallel: Test moving object to each container in batch (multi-threaded)
Check for improvement: If any move in batch improves objective → apply best from batch and STOP
Next batch: If no improvement, take next batch and repeat
Next object: If all batches exhausted, try next object

Visual Example

Problem: 1,000 containers, batch size = 10

Object 1:
  Batch 1 (containers 1-10):    [evaluate in parallel] → No improvement
  Batch 2 (containers 11-20):   [evaluate in parallel] → Improvement found! ✓
  → Apply best move from Batch 2, STOP

Evaluated only 20 destinations instead of 1,000 (50x speedup!)

Batching Strategy

Traditional Single:
  obj1 → [c1, c2, c3, ..., c1000] (sequential, 1000 evaluations)

SingleRandomBatches (batch=10):
  obj1 → Batch 1: [c847, c23, c901, c456, c12, c789, c234, c567, c890, c345]
         ↓ Evaluate in parallel (multi-threaded)
         ↓ Found improvement → STOP (10 evaluations only!)

Complexity

Best case: O(batch_size) - First batch has improvement

Average case: O(objects × containers / k) where k = improvement frequency

Worst case: O(objects × containers) - No improvements found

Where:

batch_size = randomContainerBatchSize
k = number of batches before finding improvement

Example - Very large problem:

Hot container: 100,000 objects
System: 10,000 containers
Batch size: 10
Typical run: 1-10 batches per object = 10-100 destinations evaluated
Full Single would evaluate: 100,000 × 10,000 = 1 billion moves

Usage Patterns

Very Large Scale

For problems too large for regular Single:

Python
C++

# Very large problem: 100K objects, 10K containers
# Small batches for fast early exit
solver.addSolver(
    SolverSpecs(
        localSearchSolverSpec=LocalSearchSolverSpec(
            moveTypeList=[
                SingleRandomBatchesMoveTypeSpec(
                    randomContainerBatchSize=10,  # Quick early exit
                ),
            ]
        )
    )
)

void largeScale() {
  // Very large problem: 100K objects, 10K containers
  // Small batches for fast early exit
  auto executor = std::make_shared<folly::CPUThreadPoolExecutor>(4);
  ProblemSolver solver(executor, "example", "test");

  LocalSearchSolverSpec solverSpec;

  SingleRandomBatchesMoveTypeSpec randomBatchesSpec;
  randomBatchesSpec.randomContainerBatchSize() = 10; // Quick early exit

  solverSpec.moveTypeList()->emplace_back();
  solverSpec.moveTypeList()->back().singleRandomBatchesMoveTypeSpec() =
      randomBatchesSpec;

  solver.addSolver(solverSpec);

Tuning Batch Size

Adjust batch size for your workload:

Python
C++

# Small batch: faster early exit, less parallelism
solver.addSolver(
    SolverSpecs(
        localSearchSolverSpec=LocalSearchSolverSpec(
            moveTypeList=[
                SingleRandomBatchesMoveTypeSpec(
                    randomContainerBatchSize=5,  # Exit after 5 evaluations
                ),
            ]
        )
    )
)

# Large batch: more parallelism, better moves
solver.addSolver(
    SolverSpecs(
        localSearchSolverSpec=LocalSearchSolverSpec(
            moveTypeList=[
                SingleRandomBatchesMoveTypeSpec(
                    randomContainerBatchSize=50,  # Utilize more cores
                ),
            ]
        )
    )
)

void batchTuning() {
  // Adjust batch size for your workload
  auto executor = std::make_shared<folly::CPUThreadPoolExecutor>(4);
  ProblemSolver solver(executor, "example", "test");

  // Small batch: faster early exit, less parallelism
  LocalSearchSolverSpec stage1;

  SingleRandomBatchesMoveTypeSpec smallBatchSpec;
  smallBatchSpec.randomContainerBatchSize() = 5; // Exit after 5 evaluations

  stage1.moveTypeList()->emplace_back();
  stage1.moveTypeList()->back().singleRandomBatchesMoveTypeSpec() =
      smallBatchSpec;

  solver.addSolver(stage1);

  // Large batch: more parallelism, better moves
  LocalSearchSolverSpec stage2;

  SingleRandomBatchesMoveTypeSpec largeBatchSpec;
  largeBatchSpec.randomContainerBatchSize() = 50; // Utilize more cores

  stage2.moveTypeList()->emplace_back();
  stage2.moveTypeList()->back().singleRandomBatchesMoveTypeSpec() =
      largeBatchSpec;

  solver.addSolver(stage2);

Multi-Core Optimization

Maximize parallelism on multi-core systems:

Python
C++

# Large batch size to utilize 16 cores
solver.addSolver(
    SolverSpecs(
        localSearchSolverSpec=LocalSearchSolverSpec(
            moveTypeList=[
                SingleRandomBatchesMoveTypeSpec(
                    randomContainerBatchSize=32,  # 2x number of cores
                ),
            ]
        )
    )
)

void multicore() {
  // Large batch size to utilize 16 cores
  auto executor = std::make_shared<folly::CPUThreadPoolExecutor>(4);
  ProblemSolver solver(executor, "example", "test");

  LocalSearchSolverSpec solverSpec;

  SingleRandomBatchesMoveTypeSpec randomBatchesSpec;
  randomBatchesSpec.randomContainerBatchSize() = 32; // 2x number of cores

  solverSpec.moveTypeList()->emplace_back();
  solverSpec.moveTypeList()->back().singleRandomBatchesMoveTypeSpec() =
      randomBatchesSpec;

  solver.addSolver(solverSpec);

Combined Strategy

Use with other move types for balance:

Python
C++

# Stage 1: Fast improvement with batches
# Stage 2: Refinement with full Single
solver.addSolver(
    SolverSpecs(
        localSearchSolverSpec=LocalSearchSolverSpec(
            moveTypeList=[
                SingleRandomBatchesMoveTypeSpec(
                    randomContainerBatchSize=10,  # Quick wins
                ),
                SingleMoveTypeSpec(),  # Thorough refinement
            ]
        )
    )
)

void combined() {
  // Stage 1: Fast improvement with batches
  // Stage 2: Refinement with full Single
  auto executor = std::make_shared<folly::CPUThreadPoolExecutor>(4);
  ProblemSolver solver(executor, "example", "test");

  LocalSearchSolverSpec solverSpec;

  // Add SingleRandomBatches
  SingleRandomBatchesMoveTypeSpec randomBatchesSpec;
  randomBatchesSpec.randomContainerBatchSize() = 10; // Quick wins

  solverSpec.moveTypeList()->emplace_back();
  solverSpec.moveTypeList()->back().singleRandomBatchesMoveTypeSpec() =
      randomBatchesSpec;

  // Add Single for refinement
  solverSpec.moveTypeList()->emplace_back();
  solverSpec.moveTypeList()->back().singleMoveTypeSpec() = SingleMoveTypeSpec{};

  solver.addSolver(solverSpec);

Performance Characteristics

Speedup Analysis

Batch Size	Parallelism	Early Exit Frequency	Quality	Use Case
5	Low	High (every 5)	Lower	Fast response, few cores
10 (default)	Medium	Medium (every 10)	Medium	Balanced
50	High	Low (every 50)	Higher	Many cores, quality focus
100	Very High	Very Low (every 100)	Highest	Maximum parallelism

Multi-Threading Benefit

Assumes N CPU cores available:

Cores	Batch Size	Parallel Speedup	Total Speedup vs Single
1	10	1x	~5-10x (early exit only)
4	10	~4x	~20-40x
8	10	~8x	~40-80x
16	50	~16x	~100-200x

Note: Speedup depends on objective function evaluation cost and improvement frequency.

Comparison with Variants

Move Type	Parallelism	Early Exit	Randomization	Use Case
Single	No	No	No	Best quality
SingleFast	No	Yes (per object)	No	Balanced
SingleGreedy	No	Yes (per move)	No	Fastest single-thread
SingleRandomBatches	Yes	Yes (per batch)	Yes	Very large + multi-core
SingleRandomStratified	No	No	Yes (stratified)	Large with strata

Decision tree:

Small problem (<10K objects) → Single or SingleFast
Medium problem (10K-100K objects) → SingleFast
Large + single-threaded → SingleGreedy
Large + multi-core → SingleRandomBatches
Stratified structure → SingleRandomStratified

Troubleshooting

Problem: Not seeing parallelism speedup

Diagnosis: Objective evaluation too fast or batch size too small

Solutions:

Increase randomContainerBatchSize (try 50-100)
Check if objective evaluation is CPU-bound
Verify multi-threading is enabled
Profile to find bottlenecks

Problem: Poor solution quality

Diagnosis: Exiting too early, missing better moves due to randomization

Solutions:

Increase batch size for more exploration per batch
Run multiple times with different random seeds
Combine with Single for final refinement
Consider SingleRandomStratified for structured sampling

Problem: Non-deterministic results

Diagnosis: Random shuffling produces different results each run

Solutions:

Set randomSeed in LocalSearchSolverSpec
Accept variance as trade-off for speed
Run multiple times and pick best result
Use deterministic move types if reproducibility critical

Problem: Still too slow

Diagnosis: Problem too large even with batching

Solutions:

Reduce batch size for faster early exit
Enable SingleGreedy for even faster convergence
Use destination filtering (destinationsToExplore)
Consider sampling approaches (SwapSampled for swaps)
Pre-process to reduce problem size

Choosing Batch Size

General guidelines:

Batch size = min(
  number of CPU cores × 2,
  containers / 100
)

Examples:

8 cores, 1000 containers → batch = min(16, 10) = 10 ✓
16 cores, 10000 containers → batch = min(32, 100) = 32
4 cores, 100 containers → batch = min(8, 1) = 1 (use SingleFast instead)

Tuning approach:

Start with default (10)
Profile CPU utilization
If CPUs underutilized → increase batch size
If quality poor → increase batch size
If too slow to exit → decrease batch size

Single move variants:

Single - Full exploration (no batching)
SingleFast - Early exit per object
SingleGreedy - Early exit per move

Randomized alternatives:

SingleRandomStratified - Stratified sampling
SingleColdestStratified - Coldest-first sampling
SingleRandomObjectStratified - Object-level stratification

Complementary:

Use SingleRandomBatches for initial improvement
Follow with Single for final refinement
Combine with Swap for capacity constraints

Source Code

Thrift definition: interface/thrift/SolverSpecs.thrift:529
Implementation: solver/moves/SingleRandomBatchesMoveType.h
Tests: solver/moves/tests/

Next Steps

Try SingleRandomStratified for stratified sampling
Learn about Single for full exploration
Review Performance Guide for parallelization
See Move Types Overview for choosing move types

Overview​

Quick Example​

Parameters​

Parameter Details​

How It Works​

Visual Example​

Batching Strategy​

Complexity​

Usage Patterns​

Very Large Scale​

Tuning Batch Size​

Multi-Core Optimization​

Combined Strategy​

Performance Characteristics​

Speedup Analysis​

Multi-Threading Benefit​

Comparison with Variants​

Troubleshooting​

Problem: Not seeing parallelism speedup​

Problem: Poor solution quality​

Problem: Non-deterministic results​

Problem: Still too slow​

Choosing Batch Size​

Related Move Types​

Source Code​

Next Steps​

Overview

Quick Example

Parameters

Parameter Details

How It Works

Visual Example

Batching Strategy

Complexity

Usage Patterns

Very Large Scale

Tuning Batch Size

Multi-Core Optimization

Combined Strategy

Performance Characteristics

Speedup Analysis

Multi-Threading Benefit

Comparison with Variants

Troubleshooting

Problem: Not seeing parallelism speedup

Problem: Poor solution quality

Problem: Non-deterministic results

Problem: Still too slow

Choosing Batch Size

Related Move Types

Source Code

Next Steps