
Does a high-level language support all the assembly languages for all hardware?


I know machine language and assembly are specific to the hardware, and different hardware involves different machine and assembly code, so higher-level languages were invented to solve these problems. It might be very basic, but I want to know: does a high-level language have to be translated to each assembly language to support its related hardware?


Solution

  • A high-level language either has an interpreter (typically written in portable C), or it has a compiler which outputs assembly or machine-code (essentially equivalent). These days, compilers for various high-level languages are often front-ends to gcc or LLVM, to take advantage of the optimization and code-generation capabilities of those tools.

    So to make software run on a given platform, you need a C compiler that can make binaries for that platform. This lets you build an interpreter, or directly build binaries for the target platform. C, by an accident of history, is the primary language for highly-portable software development.
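
    As a rough illustration of "compiler outputs assembly": the same C source can be fed to a compiler built for each target, and asking for assembly output makes the per-target step visible. The cross-compiler name below is the one Debian packages use and is only an example of what might be installed.

    ```c
    /* portable.c -- the same source file for every target. */
    int triple(int x)
    {
        return 3 * x;
    }

    /* Requesting assembly output shows the hardware-specific step:
     *
     *   gcc -S portable.c                    # x86-64 assembly on a typical PC
     *   aarch64-linux-gnu-gcc -S portable.c  # AArch64 assembly, if that
     *                                        # cross-compiler is installed
     *
     * The C source stays the same; only the generated assembly differs.
     */
    ```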

    Some languages have a self-hosting compiler. For example, the Free Pascal compiler is implemented in Free Pascal, and thus needs to be ported separately. Fortran has an f2c "compiler" which translates Fortran to C, to be compiled by a C compiler. (gfortran is part of the GNU Compiler Collection (gcc), though, so f2c is not widely needed.)
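
    To make the source-to-source idea concrete, here is a hedged sketch of what an f2c-style translation looks like: Fortran passes arguments by reference, so they arrive in C as pointers, and the external name conventionally gains a trailing underscore. Real f2c output also pulls in its own typedefs from f2c.h; that detail is simplified away here.

    ```c
    /* Fortran source (add.f):
     *
     *       INTEGER FUNCTION ADD(A, B)
     *       INTEGER A, B
     *       ADD = A + B
     *       END
     */

    /* Simplified C in the style a Fortran-to-C translator emits: arguments
     * passed by reference, trailing underscore on the symbol name. */
    int add_(int *a, int *b)
    {
        return *a + *b;
    }
    ```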

    Note that different OSes on the same hardware often have different ABIs (Application Binary Interfaces). A Windows binary runs on the same hardware as an x86-64 Linux binary, but makes different system calls. An x86-64 FreeBSD binary makes very similar system calls, and needs only a very lightweight translation layer to run on a Linux kernel.
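
    A small, Linux-specific sketch of why the ABI matters even on identical hardware: the program below bypasses the C library and issues the raw Linux write system call, whose number (1 on x86-64 Linux) and calling convention are part of that OS's ABI. A Windows program on the same CPU would go through a completely different interface such as WriteFile.

    ```c
    #define _GNU_SOURCE       /* for syscall() in glibc */
    #include <unistd.h>       /* syscall() */
    #include <sys/syscall.h>  /* SYS_write */

    int main(void)
    {
        const char msg[] = "hello from a raw Linux system call\n";
        /* SYS_write is 1 in the x86-64 Linux ABI; other OSes on the same
         * hardware number and invoke their system calls differently. */
        syscall(SYS_write, 1, msg, sizeof msg - 1);
        return 0;
    }
    ```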

    Some interpreters (Oracle / OpenJDK Java, Python, and some others) have optimizations for specific platforms. For example, when running on an x86 or x86-64 system, a good JVM will just-in-time (JIT) compile the Java bytecode to native machine code as it's running. On platforms where it doesn't have a JIT engine, it falls back to normal interpreting. This allows much higher performance than a traditional interpreter on platforms where the optimization work has been done, but still keeps everything portable.
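
    A very rough sketch of the "portable fallback" side of that trade-off: a plain C bytecode dispatch loop like the one below runs anywhere a C compiler exists, and a platform-specific JIT (not shown) can replace it with native code generation on targets where someone has done that work. The bytecode format here is invented purely for illustration.

    ```c
    #include <stdio.h>

    enum { OP_PUSH, OP_ADD, OP_PRINT, OP_HALT };

    /* Portable fallback: walk the bytecode one opcode at a time. */
    static void interpret(const int *pc)
    {
        int stack[64], sp = 0;
        for (;;) {
            switch (*pc++) {
            case OP_PUSH:  stack[sp++] = *pc++;              break;
            case OP_ADD:   sp--; stack[sp - 1] += stack[sp]; break;
            case OP_PRINT: printf("%d\n", stack[--sp]);      break;
            case OP_HALT:  return;
            }
        }
    }

    int main(void)
    {
        /* Equivalent of "print 2 + 3" in the invented bytecode. */
        const int program[] = { OP_PUSH, 2, OP_PUSH, 3,
                                OP_ADD, OP_PRINT, OP_HALT };
        interpret(program);
        return 0;
    }
    ```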

    A good port to a new platform requires porting the code-generation engines to the new target. Also, some C software will need its #ifdefs tweaked to pick the right branch for the new target, or even have some new code written if it didn't previously support every combination of endianness and type sizes.
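
    The kind of per-target conditional meant here looks roughly like the following; the macros tested are the usual GCC/Clang predefines, and the helper is just one way of writing the code so that endianness and integer widths stop mattering.

    ```c
    #include <stdio.h>
    #include <stdint.h>

    /* Read a 32-bit big-endian value from a byte buffer in a way that works
     * regardless of the host's byte order or the width of 'int'. */
    static uint32_t read_be32(const unsigned char *p)
    {
        return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
             | ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
    }

    int main(void)
    {
        const unsigned char wire[4] = { 0x00, 0x00, 0x01, 0x2c };

    #if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
        puts("little-endian target: byte-swapping paths would be selected");
    #else
        puts("big-endian (or unknown) target");
    #endif
        printf("value on the wire: %u\n", (unsigned)read_be32(wire));  /* 300 */
        return 0;
    }
    ```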


    Let's take a Linux distribution like Debian as an example of a huge collection of software written in many different languages.

    First you'd build gcc as a cross-compiler (one that runs on your normal system but generates binaries for the target system). Then you'd write Linux drivers for any different hardware in the new platform, and anything needed for the bootloader to load a Linux kernel.
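
    A tiny (hypothetical) sanity check once a cross-compiler exists: the architecture macros below are standard GCC/Clang predefines, and they reflect the target the compiler generates code for, not the machine the compiler runs on, which is exactly the property a cross-toolchain needs.

    ```c
    #include <stdio.h>

    int main(void)
    {
    #if defined(__x86_64__)
        puts("compiled for x86-64");
    #elif defined(__aarch64__)
        puts("compiled for AArch64");
    #else
        puts("compiled for some other target");
    #endif
        return 0;
    }
    ```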

    Once you've built enough binaries to boot Linux on the new hardware and run gcc, the new port is self-hosting, and bootstrapping a full environment with compilers and interpreters for all the high-level languages can begin in earnest.

    I'm omitting a lot of details because there's a 30k character limit on answers, and I don't feel like hitting it. There's some discussion in the comments on Guffa's answer that paints a much less rosy picture than this ideal-world description, in which every language interpreter in question has a platform-independent portability fallback.