I'm taking the fast.ai course, and in "Lesson 2 - SGD" it says:
Mini-batch: a random bunch of points that you use to update your weights
And it also says that gradient descent uses mini-batches.
What is a mini-batch? What's the difference between a mini-batch and a regular batch?
Both are approaches to gradient descent. In batch gradient descent you process the entire training set in each iteration, whereas in mini-batch gradient descent you process only a small subset of the training set in each iteration.
Also compare stochastic gradient descent, where you process a single example from the training set in each iteration.
Another way to look at it: they are all instances of the same approach to gradient descent with a batch size of m and a training set of size n. For stochastic gradient descent, m = 1. For batch gradient descent, m = n. For mini-batch gradient descent, m = b with 1 < b < n, where b is typically small compared to n.
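As a rough sketch (not from the lesson), all three variants can be written as the same loop where only the batch size changes. The linear-regression MSE loss, learning rate, and epoch count here are just illustrative assumptions:

```python
import numpy as np

def gradient_descent(X, y, batch_size, lr=0.1, n_epochs=100):
    """Fit linear weights by gradient descent; batch_size picks the variant."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_epochs):
        idx = np.random.permutation(n)              # shuffle once per epoch
        for start in range(0, n, batch_size):
            batch = idx[start:start + batch_size]   # pick the next (mini-)batch
            Xb, yb = X[batch], y[batch]
            grad = 2 * Xb.T @ (Xb @ w - yb) / len(batch)  # MSE gradient on the batch
            w -= lr * grad                          # one weight update per batch
    return w

# batch_size = n      -> batch gradient descent
# batch_size = 1      -> stochastic gradient descent
# batch_size = b < n  -> mini-batch gradient descent
```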
Mini-batch adds the question of determining the right size for b, but finding the right b may greatly improve your results.