From my understanding, the Turing model is made up of I/O data, a CPU, and a "program". The "program" is used to configure how the CPU will handle the input data in order to produce the output data. If we change the program, then the CPU will handle the input differently, and we get a different output.
The von Neumann model logically combines the I/O and the program into one.... OK, but practically what difference does that make when designing a computer? Why is von Neumann's model notable when it seems like it's just a modification of the Turing model? Does it mean that modern computers like smartphones are von Neumann computers (and not Turing computers) because they have the program and I/O integrated? Are old-school game consoles considered Turing computers (and not von Neumann computers) because the program (cartridge) can be physically separate from the CPU and I/O data? Someone please give me some insight.
All real-world commercial CPU designs are von Neumann architecture, or a minor variation of it: Harvard architecture, where program memory is separate from data memory. Harvard means the program has its own address space, connected to physically separate memory that's often non-volatile and read-only, or at least slow to program. True Harvard is usually only found in microcontrollers intended for embedded systems, which don't need the ability to run new programs at any time.
von Neumann vs. Harvard is the distinction that allows console cartridges with "the program" in read-only memory.
Note that John von Neumann was a pretty cool guy who has a lot of stuff named after him. "von Neumann" in one context might be contrasting with something different than "von Neumann" in another context. e.g. von Neumann vs. a totally different model of computation (like cellular automata, e.g. Conway's Game of Life) is more about serial stored-program computers vs. inherently-parallel stuff, not the Harvard vs. von Neumann distinction within stored-program computers. See Is C++ considered a Von Neumann programming language?
Harvard machines still store the program in the same sort of way as von Neumann: as bytes in memory that the CPU fetches. (Although to be fair, a universal Turing machine also reads its program from the tape before starting to run. But once running, the internal representation could be anything. Real-life x86 CPUs like that have existed, e.g. Transmeta Crusoe, which internally did dynamic recompilation of x86 machine code into code for a VLIW CPU. Or the modern Sandy Bridge and Zen families, which have decoded-uop caches instead of re-decoding the x86 machine code every time. But those are just caches, and the CPU can re-fetch x86 machine code during execution for new, changed, or evicted-from-cache areas.)
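To make "the program is just bytes in memory that the CPU fetches" concrete, here's a minimal sketch of a stored-program machine in Python. The 2-word instruction encoding and the opcodes are made up for illustration, not any real ISA. The von Neumann part is that the fetch reads from the same flat `mem` array that holds the data; a Harvard variant would be identical except that the fetch would index a physically separate program array.

```python
# Toy stored-program (von Neumann) machine: code and data share one flat memory.
# Hypothetical 2-word instructions [opcode, operand]; not any real ISA.

LOAD, ADD, STORE, HALT = 1, 2, 3, 0

mem = [0] * 64

# Program at address 0: mem[42] = mem[40] + mem[41]
program = [
    LOAD,  40,   # acc = mem[40]
    ADD,   41,   # acc += mem[41]
    STORE, 42,   # mem[42] = acc
    HALT,  0,
]
mem[:len(program)] = program   # the program is just words in memory
mem[40], mem[41] = 7, 35       # the data lives in the same memory

pc, acc = 0, 0
while True:
    op, arg = mem[pc], mem[pc + 1]   # fetch from the same memory as data
    pc += 2                          # (a Harvard machine would fetch from a
    if op == LOAD:                   #  separate program memory instead)
        acc = mem[arg]
    elif op == ADD:
        acc += mem[arg]
    elif op == STORE:
        mem[arg] = acc
    elif op == HALT:
        break

print(mem[42])   # 42
```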
"Turing machine" implies non-random-access memory, moving along a tape or equivalent. Building a finite Turing machine in hardware would mean your memory was a giant shift register or something (plausible but worse density than normal DRAM or SRAM), or a physical tape or drum memory that moved under program control instead of at constant speed (probably terrible).
You could build a practical (finite) Turing Machine in real life, but it would be hard to program and incredibly slow. I doubt any commercial design has ever been like that. It's just so obviously bad for performance as well as ease of programming, and even for the O(f(n)) time complexity of computation: reaching a tape cell k positions away costs on the order of k steps, where random-access memory would cost one.
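Here's what "moving along a tape" means in code: a minimal Turing machine simulator in Python (the transition table is made up for this example). The head reads only the cell underneath it and moves at most one cell per step, so there's no random access; getting to a cell k positions away costs k steps.

```python
# Toy single-tape Turing machine: increment a binary number written LSB-first.
# The head can only move one cell per step; there is no random access.

# delta[(state, symbol)] = (new_symbol, move, new_state); made up for this sketch
delta = {
    ("carry", "1"): ("0", +1, "carry"),   # 1 + carry -> 0, keep propagating carry
    ("carry", "0"): ("1", 0, "halt"),     # 0 + carry -> 1, done
    ("carry", "_"): ("1", 0, "halt"),     # ran off the end: extend the number
}

tape = list("111_")        # binary 111 (LSB first) = 7, '_' is a blank cell
head, state = 0, "carry"

while state != "halt":
    symbol = tape[head]
    new_symbol, move, state = delta[(state, symbol)]
    tape[head] = new_symbol
    head += move           # sequential movement only: -1, 0, or +1 per step

print("".join(tape))       # 0001 (LSB first) = 8
```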
Note that complexity analysis depends on the model of computation: we usually assume random access and serial computation like adding up all the operations being done. If we had computational memory where you could ask groups of memory cells to increment themselves in parallel, or find the min between it and a group of neighbours, you could maybe achieve a lower complexity bound for some things.
That serial computation bottleneck is inherent in having all computation happen in "the CPU" which can only do 1 thing at a time. (Or in practice, ~4 things at a time with superscalar pipelined execution, or another factor of 4 to 16 on top of that with SIMD. But those are constant factors that don't scale with problem sizes. Great in practice but don't affect the complexity class of a problem like finding the min over an unsorted array.)
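As a toy illustration of why those constant factors don't change the complexity class: counting "steps" in Python (no real SIMD here, just a model of it), a 4-wide version finds the min of an unsorted array in 4x fewer steps, but the step count still grows linearly with n.

```python
# Counting "steps" for finding the min: 1 element per step vs. 4 per step.
# The 4-wide version models SIMD-ish hardware: a constant factor, still O(n).

def min_scalar(a):
    steps, m = 0, a[0]
    for x in a:
        m = min(m, x)
        steps += 1
    return m, steps

def min_4wide(a):
    steps, m = 0, a[0]
    for i in range(0, len(a), 4):
        m = min(m, *a[i:i+4])   # one "step" handles up to 4 elements
        steps += 1
    return m, steps

data = list(range(1000, 0, -1))
print(min_scalar(data))   # (1, 1000)
print(min_4wide(data))    # (1, 250)  -- 4x fewer steps, but still scales with n
```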
(The "von Neumann bottleneck" can be narrowly defined as needing to actually fetch data and program from external memory, significantly mitigated by caches, including split L1d / L1i caches over unified L2 (aka modified Harvard). So the serial model of computation imposed by having a CPU read an instruction stream might not fall under that heading. Related: Does the Harvard architecture have the von Neumann bottleneck?)
Usually we only talk about ideal Turing Machines with unbounded tape for theoretical CS purposes, because they suck so much to program for complex real-world tasks. Anything that's computable at all is computable on a Turing Machine, but not efficiently. (Assuming the unprovable(?) Church-Turing thesis to be true, and any other formal caveats.)
When you do want to talk about efficiency with more formalism than counting operations in pseudocode, you need a formal abstract model of computation that's similar to real-world computers. An example of this is the abstract CS Random Access Machine (wikipedia).
With its program separate from the RAM, it's a Harvard architecture.
With the program in the RAM where it can be self-modifying, it's a RASP (Random Access Stored Program), and an idealized Von Neumann machine.
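As a minimal sketch of the RASP idea (same toy style as above, with a made-up 2-word instruction encoding, not any real ISA): because the program sits in the same memory that STORE writes to, the program can rewrite one of its own upcoming operands. On a Harvard/RAM machine the instruction stream would live in a separate address space, the store couldn't reach it, and this example would print 0 instead.

```python
# Toy RASP (Random Access Stored Program) machine: because the program is
# ordinary data in the same memory, a STORE can rewrite a later instruction.
# Hypothetical 2-word instructions [opcode, operand]; made up for this sketch.

SET, STORE, OUT, HALT = 1, 2, 3, 0   # SET: acc = imm, STORE: mem[arg] = acc, OUT: print acc

mem = [
    SET,   99,   # addr 0: acc = 99
    STORE, 5,    # addr 2: mem[5] = acc  -- addr 5 is the operand of the next SET!
    SET,   0,    # addr 4: acc = <operand at addr 5>, which is now 99, not 0
    OUT,   0,    # addr 6: print acc
    HALT,  0,    # addr 8
]

pc, acc = 0, 0
while True:
    op, arg = mem[pc], mem[pc + 1]
    pc += 2
    if op == SET:
        acc = arg
    elif op == STORE:
        mem[arg] = acc       # on a Harvard machine this could never reach the code
    elif op == OUT:
        print(acc)           # prints 99: the program modified itself
    elif op == HALT:
        break
```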
So no, von Neumann vs. Turing isn't real-world vs. theoretical unbounded; both types could be built in real life, and abstract models of both types do exist and are used. It's easy to make that mistake because you'll never see a real-life Turing machine used for anything, and normally discussion of von Neumann architectures centres on real-life hardware.
Abstract RAM machines notably are useful for analyzing parallel algorithms. (PRAM = Parallel RAM model). Nobody in their right mind wants parallel Turing machines shuttling around on the same tape. And PRAM models real CPUs well enough to prove stuff about algorithms being wait-free, lock-free, or obstruction-free (wiki).
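For a taste of why the PRAM model is nice to reason about, here's a sketch of a parallel tree reduction for min (plain Python, with each round's "processors" simulated by a list comprehension rather than actual parallel hardware): with about n/2 processors active per round, the reduction finishes in O(log n) rounds, instead of the O(n) serial steps a single CPU needs.

```python
# PRAM-style tree reduction for min, simulated sequentially.
# Each "round", processor i combines elements 2*i and 2*i+1; with enough
# processors every round is one parallel time-step, so the depth is O(log n).

import math

def pram_min(a):
    cells = list(a)
    rounds = 0
    while len(cells) > 1:
        # all of these pairwise mins would happen simultaneously on a PRAM
        cells = [min(cells[i], cells[i + 1]) if i + 1 < len(cells) else cells[i]
                 for i in range(0, len(cells), 2)]
        rounds += 1
    return cells[0], rounds

data = list(range(1000, 0, -1))
m, rounds = pram_min(data)
print(m, rounds, math.ceil(math.log2(len(data))))   # 1 10 10
```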