Automatic batch size reduction
When validating on long utterances with the large model, the encoder may run out of memory even with a batch size of 1.
State resets are implemented by splitting one utterance into a batch
of smaller utterances, even when --val_batch_size=1
.
This creates an opportunity to reduce the VRAM usage
further, by processing the ‘batch’ created from one long utterance in smaller
batches, instead of all at once.
The validation script will automatically reduce the batch size if the number
of inputs to the encoder is greater than --max_inputs_per_batch
. The default
value of --max_inputs_per_batch
is 1e7, which was calibrated to let the
large model validate on a 2-hour-long utterance on an 11 GB GPU.
Note that this option can’t reduce memory usage on a long utterance if state resets is turned off, since the batch size can’t go below 1.
You may wish to reduce the default --max_inputs_per_batch
if you have a smaller GPU/longer utterances.
Increasing the default is probably unnecessary, since validation on an 8 x A100 (80GB)
system
is not slowed down by the default --max_inputs_per_batch
.