c++ assembly intel processor

How does an assembler convert assembly to machine code?


I know this has been asked many times, but I am looking for a simple interpretation.

Let's say I have some assembly code that a C++ compiler generated.

Now the assembler kicks in and has to transform the assembly code into machine code.

Question 1) Will the C++ assembler compiler look at a table in which each assembly instruction has a corresponding machine code instruction?

Question 2) If the C++ program runs on an Intel processor, then the assembler needs to look at the table published by Intel, right? After all, the C++ program ultimately runs on the Intel processor.

Question 3) If I am right about question 2, then how is it possible that a program written in C++ can run both on a computer that uses an Intel processor and on one that uses an AMD processor?


Solution

  • Please try to limit your questions to one question per question. Nevertheless, let me try to answer them.

    Question 1

    An “assembly compiler” is called an “assembler.” Assembly is assembled, not compiled. And the assembler is not specific to C++. It is specific to the architecture and can only be used to assemble assembly programs for that architecture.

    Yes, assemblers are usually implemented by having a large table mapping instruction mnemonics to the operation codes (opcodes) they correspond to. This table also tells the assembler what operands the instruction takes and how the operands are encoded. There can be multiple entries for the same mnemonic if the mnemonic corresponds to multiple instructions.
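To make the table idea concrete, here is a minimal sketch of a table-driven encoder for a handful of x86-64 instructions. It is not how any particular assembler is implemented: the names `Entry`, `OperandKind`, `kTable`, `register_number`, `parse_imm8`, and `assemble` are made up for this illustration, the table is a tiny hand-picked excerpt, and only the eight legacy 64-bit registers and 8-bit immediates are handled. The opcode bytes themselves are the documented encodings. It needs C++17.

```cpp
#include <cassert>
#include <charconv>
#include <cstdint>
#include <optional>
#include <string_view>
#include <system_error>
#include <vector>

// What kind of operand an instruction form takes (toy subset).
enum class OperandKind { None, Reg64, Imm8 };

// One table row: mnemonic + operand kind -> opcode bytes.
struct Entry {
    std::string_view mnemonic;
    OperandKind kind;
    std::vector<std::uint8_t> opcode;
};

// Hypothetical excerpt; a real x86-64 table has thousands of rows.
// "push" appears twice because one mnemonic covers several instruction forms.
const std::vector<Entry> kTable = {
    {"nop",     OperandKind::None,  {0x90}},
    {"ret",     OperandKind::None,  {0xC3}},
    {"syscall", OperandKind::None,  {0x0F, 0x05}},
    {"push",    OperandKind::Reg64, {0x50}},  // register number is added to the opcode
    {"pop",     OperandKind::Reg64, {0x58}},  // register number is added to the opcode
    {"push",    OperandKind::Imm8,  {0x6A}},  // followed by one immediate byte
};

// Register numbers of the eight legacy 64-bit registers (no REX prefix needed).
std::optional<std::uint8_t> register_number(std::string_view name) {
    constexpr std::string_view kNames[] = {"rax", "rcx", "rdx", "rbx",
                                           "rsp", "rbp", "rsi", "rdi"};
    for (std::uint8_t i = 0; i < 8; ++i)
        if (kNames[i] == name) return i;
    return std::nullopt;
}

// Parses a decimal literal that fits into a signed 8-bit immediate.
std::optional<std::int8_t> parse_imm8(std::string_view text) {
    int value = 0;
    const char* end = text.data() + text.size();
    auto [ptr, ec] = std::from_chars(text.data(), end, value);
    if (ec != std::errc{} || ptr != end) return std::nullopt;
    if (value < -128 || value > 127) return std::nullopt;
    return static_cast<std::int8_t>(value);
}

// Encodes one instruction with zero or one operand, or returns nullopt
// if the operand is not understood or no table row matches.
std::optional<std::vector<std::uint8_t>> assemble(std::string_view mnemonic,
                                                  std::string_view operand = {}) {
    // Classify the operand first, then search the table for a matching row.
    OperandKind kind = OperandKind::None;
    std::uint8_t extra = 0;
    if (!operand.empty()) {
        if (auto reg = register_number(operand)) {
            kind = OperandKind::Reg64;
            extra = *reg;
        } else if (auto imm = parse_imm8(operand)) {
            kind = OperandKind::Imm8;
            extra = static_cast<std::uint8_t>(*imm);
        } else {
            return std::nullopt;
        }
    }
    for (const Entry& entry : kTable) {
        if (entry.mnemonic != mnemonic || entry.kind != kind) continue;
        assert(!entry.opcode.empty());
        std::vector<std::uint8_t> bytes = entry.opcode;
        if (kind == OperandKind::Reg64)
            bytes.back() = static_cast<std::uint8_t>(bytes.back() + extra);
        if (kind == OperandKind::Imm8)
            bytes.push_back(extra);
        return bytes;
    }
    return std::nullopt;
}
```

With this sketch, `assemble("push", "rbx")` yields the single byte `0x53`, while `assemble("push", "16")` selects the other `push` row and yields `0x6A 0x10`. A real assembler additionally handles prefixes, ModR/M and SIB bytes, labels, and relocations, which is where the pre- and postprocessing steps come in.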

    It is, however, not a requirement to do it this way. Assemblers may choose different approaches or combine tables with pre- and postprocessing steps.

    Question 2

    This is correct. Processor vendors generally provide documentation for their processors in which all instructions and their encodings are listed. For Intel, this information can be found in the Intel 64 and IA-32 Architectures Software Developer's Manuals. Note that while the processor vendor provides such specifications, it is the job of the assembler author to translate these documents into tables for use by the assembler. This has traditionally been done by hand, but more recently people have started generating the tables automatically from the manuals.

    Question 3

    Both Intel and AMD produce processors of the amd64 architecture (also called x86-64, IA-32e, Intel 64, EM64T, and other names). So a program written for an Intel processor generally also runs on an AMD processor.

    Note that there are tiny differences between Intel's and AMD's implementation of this architecture. Your compiler is aware of them and won't generate code that can behave differently between the two.

    There are also various instruction set extensions available on some but not all amd64 processors. Programs using these will only run on processors that have these instruction set extensions. However, unless you specifically tell your compiler to make use of such extensions, it won't use any of them and your code will run on amd64 processors of any vendor.
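If a program does want to use an extension where it is available, one common pattern is to check for it at run time and fall back to baseline code otherwise. The following is a minimal sketch, assuming GCC or Clang on an x86 target: `__builtin_cpu_init` and `__builtin_cpu_supports` are builtins of those compilers, while `has_avx2` and `pick_implementation` are hypothetical names introduced here. On other compilers or architectures the sketch simply reports that the extension is unavailable.

```cpp
#include <string_view>

// Returns true if the processor running this code reports AVX2 support.
// Assumption: GCC or Clang targeting x86; elsewhere this returns false.
bool has_avx2() {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();                    // initialize the CPU feature data
    return __builtin_cpu_supports("avx2");   // argument must be a string literal
#else
    return false;
#endif
}

// Hypothetical dispatcher: names the code path a program would take.
std::string_view pick_implementation() {
    return has_avx2() ? "avx2" : "baseline";
}
```

A program would call `pick_implementation()` (or branch on `has_avx2()` directly) once at startup and then run either the extension-specific routine or the baseline one, so the same binary works on amd64 processors with and without the extension.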