I know this is a simple question, but I'm thrown off by some additions in linear algebra problems being treated as constant time vs. linear time. In this case, I'm interested in summing m vectors of length n elementwise. I think the complexity is either O(n) or O(mn). I'm confused because a total of (m-1)n additions need to take place if we add the vectors two at a time, but there are only n separate addition problems (one per element). What's the right way of seeing it? Does it matter whether we look at it from the perspective of algorithms or of hardware implementation?
Summing up m vectors of length n elementwise requires (m-1) * n additions, leading to an algorithmic complexity of O(mn).
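To make that count concrete, here is a minimal sketch in Python (the function name and looping style are just for illustration): the outer loop runs over the m-1 remaining vectors and the inner loop over their n elements, so exactly (m-1)*n scalar additions are performed.

```python
def sum_vectors(vectors):
    """Elementwise sum of m vectors, each of length n."""
    n = len(vectors[0])
    total = list(vectors[0])      # copy of the first vector: no additions yet
    for vec in vectors[1:]:       # m - 1 remaining vectors
        for i in range(n):        # n scalar additions per vector
            total[i] += vec[i]
    return total                  # (m - 1) * n additions in total -> O(mn)

print(sum_vectors([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))  # [12, 15, 18]
```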
However, the Big-O complexity of an algorithm always depends on how you frame the problem: you choose which variables you care about. For example, imagine an application that always adds vectors with, say, 10 elements. In that case n is the constant 10, so from the perspective of analyzing that application the complexity is just O(m).
Regarding hardware implementation: yes, how the algorithm is executed can affect the effective (wall-clock) time complexity. Vectorized SIMD/GPU execution can make it behave closer to O(m) steps if all n elements of each addition are processed in parallel, and a parallel (tree) reduction over the m vectors brings the time down to O(n log m). The total work remains O(mn); what parallel hardware changes is how much of that work happens at once.
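As a sketch of the reduction structure (written sequentially here, not an actual parallel implementation), a pairwise tree reduction still performs (m-1)*n additions, but organizes them into ceil(log2 m) rounds; if the vector additions within each round ran concurrently, the elapsed time would be O(n log m).

```python
def tree_sum(vectors):
    """Pairwise (tree) reduction: ceil(log2 m) rounds, same O(mn) total work."""
    vecs = [list(v) for v in vectors]
    while len(vecs) > 1:
        nxt = []
        for j in range(0, len(vecs) - 1, 2):
            a, b = vecs[j], vecs[j + 1]
            nxt.append([x + y for x, y in zip(a, b)])  # one vector addition: n scalar adds
        if len(vecs) % 2 == 1:                         # odd vector out is carried forward
            nxt.append(vecs[-1])
        vecs = nxt                                     # each round halves the number of vectors
    return vecs[0]

print(tree_sum([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))  # [12, 15, 18]
```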