When should I do data scaling and sampling (since my data is imbalanced)? Should I scale the data first and then sample?
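You probably want to standardize/scale your independent variables after sampling/splitting. Fit the scaler on the training data only and then apply it to the test data, so that no information from the test set leaks into the scaling parameters.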
If you use Python, scikit-learn.org has a few examples that might address your issue a little better. Here's an example that demonstrates the importance of feature scaling.
Here's another one that includes stratified sampling.
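
For what it's worth, here is a minimal sketch of that order (split, then sample, then scale) using scikit-learn. The synthetic dataset, the 90/10 class ratio, the 25% test size, and the choice of simple random oversampling are all assumptions made for illustration; they are not taken from the question, and other resampling strategies would slot into step 2 the same way.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.utils import resample

# Synthetic imbalanced data (roughly 90% class 0, 10% class 1); illustrative only.
X, y = make_classification(
    n_samples=1000, n_features=5, weights=[0.9, 0.1], random_state=0
)

# 1. Split first, stratifying so train and test keep the same class ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# 2. Resample the training set only: oversample the minority class
#    (with replacement) until it matches the majority class in size.
minority = y_train == 1
n_majority = int((~minority).sum())
X_min_up, y_min_up = resample(
    X_train[minority], y_train[minority],
    replace=True, n_samples=n_majority, random_state=0,
)
X_bal = np.vstack([X_train[~minority], X_min_up])
y_bal = np.concatenate([y_train[~minority], y_min_up])

# 3. Scale last: fit the scaler on the (resampled) training data only,
#    then reuse the same fitted scaler on the untouched test set.
scaler = StandardScaler().fit(X_bal)
X_bal_scaled = scaler.transform(X_bal)
X_test_scaled = scaler.transform(X_test)
```

The test set is never resampled and never used to fit the scaler, so it still reflects the original class balance and stays unseen until evaluation.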