[SOLVED] How many steps does the Hack computer's fetch-execute cycle take?


I have completed the first part of the From Nand to Tetris course (thanks MadOverlord and everyone!) but still can't figure out how many steps the Hack CPU's fetch-execute cycle takes.

I read the book But How Do It Know?, which describes an 8-bit computer with shared instruction and data memory. Its design features a simplified finite-state machine called a stepper, which takes six steps per fetch-execute cycle: three for fetching the instruction and incrementing the PC/IAR counter, and three for executing either a logic/arithmetic or a load/store instruction.

In Nand2Tetris we have a Harvard architecture with simultaneous fetching of instructions (from ROM) and data (from RAM). Execution is said to be faster than in the von Neumann design, but I'm not sure how many clock cycles it takes. I'm trying to establish this to help me understand the von Neumann bottleneck (but that is another question).

In Hack Assembly (the assembly language for the Nand2Tetris Hack computer), a single data store operation can be made up of two instructions, an A-instruction followed by a C-instruction, for example:

@some_address
M=M+1

which means, in my understanding, one fetch step for @some_address and one execute step for M=M+1. Is this correct? Or does each of the instructions take two steps? Or something else entirely?

I tried to think of it in terms of register access and clock cycles, but with what I managed to understand from the course, things became too muddy.

I know @some_address selects both an instruction and a data location, so M=M+1 operates on the previously selected data and also stores the result in memory. This is all I could figure out. Does it mean I have two steps? Or four, or something else?

P.S.: Come to think of it, is this question a duplicate of this one?

Solution

In the simulator, each instruction takes one cycle, and everything needed to carry out that instruction happens instantaneously. The actual details of how this would happen in a real implementation are much more complex and implementation-dependent.
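
To make the one-cycle-per-instruction model concrete, here is a minimal Python sketch of how a simulator might count cycles for the two-instruction sequence from the question. It is not the course's actual CPU emulator: the `run` function, the `SOME_ADDRESS` value of 16, and the decision to support only `@value` and `M=M+1` are assumptions made for illustration.

```python
# Minimal sketch of the simulator's timing model: one instruction per cycle.
# Assumptions (not from the course tools): only "@value" and "M=M+1" are
# modelled, and SOME_ADDRESS is an arbitrary RAM address picked for the demo.

SOME_ADDRESS = 16


def run(program, ram):
    """Execute a list of Hack assembly strings; return the cycles used."""
    a = 0        # A register
    pc = 0       # program counter, an index into the instruction ROM
    cycles = 0
    while pc < len(program):
        instruction = program[pc]  # fetch from ROM
        if instruction.startswith("@"):
            # A-instruction: load a constant into the A register.
            a = int(instruction[1:])
        elif instruction == "M=M+1":
            # C-instruction: read RAM[A], add 1, write the result back.
            # Values are kept to 16 bits, matching the Hack word size.
            ram[a] = (ram.get(a, 0) + 1) & 0xFFFF
        else:
            raise ValueError(f"instruction not modelled: {instruction}")
        pc += 1
        cycles += 1  # fetch and execute both complete within this one cycle
    return cycles


ram = {SOME_ADDRESS: 41}
program = [f"@{SOME_ADDRESS}", "M=M+1"]
cycles = run(program, ram)
print(cycles, ram[SOME_ADDRESS])  # prints: 2 42
```

Under this model the `@some_address` / `M=M+1` pair costs two cycles in total, one per instruction, with no separate fetch and execute steps visible to the programmer.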

I've actually been working, in fits and starts, on an implementation of the Hack CPU in physical relays. In this implementation, each instruction consists of 10 clock "ticks" that step the machine through the process. You can read more about it here: https://github.com/RJWoodhead/Relay2Tetris/blob/master/Design.md

Please note that this is a work-in-progress and subject to revision, but it should give you a decent idea of how things might work in the real world.